Rust: using `String` and `&str` in your APIs

My toy document database has to deal with a lot of strings. While rewriting it in Rust, I have found that strings are more complicated in Rust than in Go.

Rust has two main string types (and a few more for good measure, but they don’t come up so often). Coming from Go, this was something that kept bugging me. I didn’t have a good handle on which to use. In Go, you pretty much always pass around Go’s standard string type, which is an immutable sequence of bytes that conventionally holds UTF-8 text (a Go rune is a Unicode code point).

Rust, however, has two string types, String and &str, that one has to repeatedly choose between. Or so it felt to me. Though it is my first non-trivial Rust project, I wanted at least somewhat idiomatic APIs for my toy document database. So I went looking for guidance. Were there rules of thumb that I could rely on, or was seemingly every function definition going to need deep thought to choose one or the other?

Frankly, it’s clear I’m not the first person to ask this question! And, fortunately, there do appear to be straightforward patterns that cover most situations. This post pulls together a summary of the guidelines I found. I link to my sources, which give a much more complete coverage of the subject and are therefore very worthy of your attention too 🙂.

As noted, each article I link to contains a decent chunk of other info that’s worth reading. For example, I’m not going to talk here about what a String or a &str actually is. Follow the links to get this bit of background if you need it. Follow them anyway; they were useful beyond the immediate patterns I was after.

The first guideline I found comes from Understanding when to use String vs str - help - The Rust Programming Language Forum. Strictly, Rule 2 comes from the LogRocket blog, but I put it here to keep things together. These cover most scenarios, so let’s put them in a box!

The basic “String or &str” rules

Use &str for function arguments when they are read-only.
Use &mut String for arguments that are modified in-place.
Use String for string-like return values.

Reference:

Generally, you should use &str in function arguments and returns when you can use &str, but you can’t always. In particular, any function which creates a new string that did not previously exist must return String rather than &str, because in order to continue existing past the function returning, the string needs to be owned by the return value.
The way I would suggest looking at it is: use &str when you know there is an owner of the string already, and they will hold still for you to borrow it as long as you need it. If there is no existing owner, or if the owner has its own business that is incompatible with you borrowing it, then you need to use String.
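To make the rules concrete, here is a minimal sketch with one function per rule, plus one that returns a &str borrowed from its input. The function names (count_words, append_suffix, make_key, first_word) are mine, made up for illustration; they don’t come from either source.

```rust
// Rule 1: the argument is only read, so borrow it as &str.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

// Rule 2: the argument is modified in place, so take &mut String.
fn append_suffix(id: &mut String, suffix: &str) {
    id.push('-');
    id.push_str(suffix);
}

// Rule 3: this creates a string that did not exist before, so nobody
// owns it yet and the function has to return an owned String.
fn make_key(collection: &str, id: &str) -> String {
    format!("{}/{}", collection, id)
}

// Returning &str is fine when the result borrows from an existing owner,
// here the caller's `text`.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}
```

The last function is the case the forum quote describes: the caller already owns the string and holds still while we borrow a piece of it.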

Next I read Understanding String and &str in Rust - LogRocket Blog. This expanded my understanding as it explains why using &str in function arguments produces ergonomic APIs: the compiler gives a helping hand.

&String can be “deref-coerced” into &str. This means that the compiler will automatically convert arguments passed of type &String into &str for you. Because of this, the caller of your API doesn’t have to worry whether they are passing exactly the right type of string, and they avoid boilerplate conversion code. Nice!
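A small sketch of what that looks like from the caller’s side. The shout and demo functions are hypothetical names I picked for the example.

```rust
// A read-only string argument, so the parameter type is &str.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

// Calls `shout` with three different kinds of string argument.
fn demo() -> (String, String, String) {
    let owned: String = String::from("hello");
    let literal: &str = "hello";
    (
        shout(&owned),       // &String is deref-coerced to &str
        shout(literal),      // already a &str, passed as-is
        shout(&owned[0..2]), // a slice of a String is also a &str
    )
}
```

None of the three call sites needs an explicit conversion such as as_str().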

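It also notes what a constant string should look like in your code (the explicit 'static lifetime is optional here, because const items get it by default when it is left out):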

const CONST_STRING: &'static str = "some constant string";

Finally, I dug a bit deeper into the ergonomics and efficiencies of APIs that use strings.

In this vein, Creating a Rust function that accepts String or &str is a good guide to creating a usable API that also adheres to the Rust tenet of being efficient. In this case, that means avoiding needless memory allocations, which can be a significant source of slowness in string-heavy code.

This article also stands out for being the oldest I link to: the others are dated within the last couple of years, whereas this one is from 2015. Given the speed at which Rust develops, perhaps this means the specifics of the technique are out of date. Even if they are, I still found the Into technique eye-opening in my Rust journey: Rust appears to have a zillion little affordances like this, and I’m only really starting to comprehend the power they give its type system. (See also the “deref-coercing” above).

Our example is that we want to write a function that can take ownership of a String, so that we can safely keep it around in a struct:

struct Person {
    name: String,
}

impl Person {
    fn new(name: WHAT_GOES_HERE) -> Person {
        Person { name: AND_HERE }
    }
}

When deciding on our types, we want to avoid both mussing up the call site with unneeded code and needlessly allocating memory just so our API looks tidy:

If we accept a String, the caller must do to_string() if they have a &str. We don’t want this because it’s tedious and clutters their code.
But if we instead accept a &str, then inside the function we will need to create a new string using to_string() so we can own it (because &str means our function is only borrowing the string). This is bad because, if the caller could have given us ownership of the string it passes to our new function, we would have allocated a new string for nothing.
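Side by side, the two options look roughly like this. This is my own sketch rather than code from the article, and the constructor names from_owned and from_borrowed are hypothetical, chosen only so both variants can live on one type.

```rust
struct Person {
    name: String,
}

impl Person {
    // Option 1: accept a String. No hidden allocation, but a caller
    // holding a &str has to write to_string() at the call site.
    fn from_owned(name: String) -> Person {
        Person { name }
    }

    // Option 2: accept a &str. Tidy call sites, but this always allocates,
    // even when the caller had a String it could have handed over.
    fn from_borrowed(name: &str) -> Person {
        Person { name: name.to_string() }
    }
}
```

With the first, a call looks like Person::from_owned("Ada".to_string()). With the second, Person::from_borrowed(&owned) copies owned even if the caller never uses it again.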

It turns out that by relying on the Into trait, we can write code that will only create a new copy of the string if we are given a &str:

struct Person {
    name: String,
}

impl Person {
    fn new<S: Into<String>>(name: S) -> Person {
        Person { name: name.into() }
    }
}

In this code:

If name is a &str then into() will clone it, allocating memory for a new String. We have to accept that we need to allocate new memory for the copy, because we need to take ownership of the string.
However, if name is a String, into() is a no-op and we get the string without creating a copy. The caller gets to decide whether to give us the original or a clone.

By passing a String, the user of our API gets to decide whether to allocate the new string. On the other hand, passing an &str is often convenient, and we support that seamlessly too, internally creating the new String when we need it. Finally, we also handle any other type that implements the Into<String> trait, which is a nice side-bonus.
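One way to convince yourself of this is to compare buffer addresses before and after the call. The sketch below repeats the Person definition so it stands alone; the two helper functions are hypothetical names of mine, and the pointer comparison is just my way of observing the behaviour, not something from the article.

```rust
struct Person {
    name: String,
}

impl Person {
    fn new<S: Into<String>>(name: S) -> Person {
        Person { name: name.into() }
    }
}

// Returns true if building a Person from an owned String reused the
// caller's heap buffer instead of copying it.
fn owned_name_is_moved() -> bool {
    let name = String::from("Ada");
    let original_buffer = name.as_ptr();
    let person = Person::new(name); // `name` is moved in, not copied
    person.name.as_ptr() == original_buffer
}

// Returns true if building a Person from a &str produced a fresh buffer.
fn borrowed_name_is_copied() -> bool {
    let literal: &str = "Grace";
    let person = Person::new(literal); // into() allocates a new String here
    person.name.as_ptr() != literal.as_ptr()
}
```

Both helpers should return true: the owned String keeps its original buffer, while the &str is copied into a new one.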

(This approach is improved still further in From &str to Cow. There we learn how to avoid allocation even in the &str case if the compiler can prove the &str can be borrowed for long enough. If you are still reading at this point, I definitely recommend hopping over there to read that one.)
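For a flavour of the idea, here is a minimal sketch using std::borrow::Cow. It is my own guess at the shape of the technique, not code taken from the linked article, and the is_borrowed helper is a hypothetical name. The trade-off is that Person gains a lifetime parameter.

```rust
use std::borrow::Cow;

// The name is either borrowed from the caller or owned by the struct.
struct Person<'a> {
    name: Cow<'a, str>,
}

impl<'a> Person<'a> {
    // Both &'a str and String convert into Cow<'a, str>.
    fn new<S: Into<Cow<'a, str>>>(name: S) -> Person<'a> {
        Person { name: name.into() }
    }
}

// Reports whether the Person is borrowing its name rather than owning it.
fn is_borrowed(person: &Person<'_>) -> bool {
    matches!(person.name, Cow::Borrowed(_))
}
```

Here Person::new("Ada") stores a borrowed name without allocating, while Person::new(String::from("Grace")) takes ownership of the String it is given.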

So there we have it. I think I know more about what’s going on with these two types now. But, even better, I can write functions that take strings in ways that will be familiar to other Rust users, and that help communicate what the function will do with those strings.

Aim achieved 💪.

(Whether my APIs will be any good is a different story 😂).
