Rust: using String and &str in your APIs
My toy document database has to deal with a lot of strings. While rewriting it in Rust, I have found that Rust is more complicated than Go in this regard.
Rust has two main string types (and a few more for good measure, but they don’t
come up so often). Coming from Go, this was something that kept bugging me. I
didn’t have a good handle on which to use. In Go, you pretty much always pass
around Go’s standard string type, which is a read-only slice of bytes that
conventionally holds UTF-8 text (a Go rune is a Unicode code point).
Rust, however, has two string types, String and &str, that one has to
repeatedly choose between. Or so it felt like to me. Though it is my first
non-trivial Rust project, I wanted at least somewhat idiomatic APIs for my toy
document database. So I went looking for guidance. Were there rules of thumb
that I could rely on, or was seemingly every function definition going to need
deep thought to choose one or the other?
Frankly, it’s clear I’m not the first person to ask this question! And, fortunately, there do appear to be straightforward patterns that cover most situations. This post pulls together a summary of the guidelines I found. I link to my sources, which give a much more complete coverage of the subject and are therefore very worthy of your attention too 🙂.
As noted, each article I link to contains a decent chunk of other info that’s
worth reading. For example, I’m not going to talk here about what a String or
a &str actually is. Follow the links to get this bit of background if you need
it. Follow them anyway; they were useful beyond the immediate patterns I was
after.
The first guideline I found comes from Understanding when to use String vs str - help - The Rust Programming Language Forum. Strictly, Rule 2 comes from the LogRocket blog, but I put it here to keep things together. These cover most scenarios, so let’s put them in a box!
The basic “String or &str” rules
- Use &str for function arguments when they are read-only.
- Use &mut String for arguments that are modified in-place.
- Use String for string-like return values.
Reference:
Generally, you should use &str in function arguments and returns when you can use &str, but you can’t always. In particular, any function which creates a new string that did not previously exist must return String rather than &str, because in order to continue existing past the function returning, the string needs to be owned by the return value.
The way I would suggest looking at it is: use &str when you know there is an owner of the string already, and they will hold still for you to borrow it as long as you need it. If there is no existing owner, or if the owner has its own business that is incompatible with you borrowing it, then you need to use String.
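To make the rules in the box concrete, here is a minimal sketch with one function per rule, plus one that returns a borrowed &str because the result is just a slice of the caller's data. The function names are hypothetical, made up for illustration; they don't come from any of the linked articles.

```rust
// All function names here are hypothetical, chosen only for illustration.

/// Read-only access: borrow the text as `&str`.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

/// In-place modification: take `&mut String`.
fn append_suffix(text: &mut String, suffix: &str) {
    text.push_str(suffix);
}

/// A newly created string has no existing owner, so return `String`.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

/// The result is a slice of data the caller already owns, so `&str` works.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}
```

The last function is the “there is an owner already” case from the quote: the returned slice borrows from the argument, so nothing new needs to be owned.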
Next I read Understanding String and &str in Rust - LogRocket Blog. This
expanded my understanding as it explains why using &str in function arguments
produces ergonomic APIs: the compiler gives a helping hand.
&String can be “deref-coerced” into &str. This means that the compiler will
automatically convert arguments passed of type &String into &str for you.
Because of this, the caller of your API doesn’t have to worry whether they are
passing exactly the right type of string, and they avoid boilerplate
conversion code. Nice!
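Here is a small sketch of what that looks like at the call site. The function names byte_len and call_sites are hypothetical; the point is only that a single &str parameter accepts both a literal and a borrowed String.

```rust
/// Hypothetical read-only function that takes `&str`.
fn byte_len(text: &str) -> usize {
    text.len()
}

/// Calls `byte_len` with a literal and with a borrowed `String`.
fn call_sites() -> (usize, usize) {
    let owned: String = String::from("hello");
    (
        byte_len("hello"), // a string literal is already a `&str`
        byte_len(&owned),  // `&String` is deref-coerced to `&str`
    )
}
```

Neither call needs an explicit conversion such as as_str().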
It also notes what a constant string should look like in your code:
const CONST_STRING: &'static str = "some constant string";
Finally, I dug a bit deeper into the ergonomics and efficiencies of APIs that use strings.
In this vein, Creating a Rust function that accepts String or &str is a good guide to creating a usable API that also adheres to the Rust tenet of being efficient. In this case, that means avoiding needless memory allocations, which is good because allocations are a common source of program slowness.
This article also stands out for being the oldest I link to: the others are
dated within the last couple of years, whereas this one is from 2015. Given the
speed at which Rust develops, perhaps this means the specifics of the technique
are out of date. Even if they are, I still found the Into technique
eye-opening in my Rust journey: Rust appears to have a zillion little affordances
like this, and I’m only really starting to comprehend the power they give its
type system. (See also the “deref-coercing” above.)
Our example is that we want to write a function that can take ownership of a
String, so that we can safely keep it around in a struct:
struct Person {
    name: String,
}
impl Person {
    fn new(name: WHAT_GOES_HERE) -> Person {
        Person { name: AND_HERE }
    }
}
When deciding on our types, we want to avoid both mussing up the call site with unneeded code and needlessly allocating memory just so our API looks tidy:
- If we accept a String, the caller must do to_string() if they have a &str. We don’t want this because it’s tedious and clutters their code.
- But if we instead accept a &str, then inside the function we will need to create a new string using to_string() so we can own it (because &str means our function is only borrowing the reference). This is bad because, if the caller can give us ownership of the string it passes to our new function, then we can avoid allocating a new string.
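To make the trade-off concrete, here is a rough sketch of the two options side by side. The constructor names from_owned and from_borrowed are hypothetical, used only so both variants can live on the same struct.

```rust
struct Person {
    name: String,
}

impl Person {
    /// Option 1 (hypothetical name): accept `String`. No hidden allocation,
    /// but a caller holding a `&str` must convert it themselves.
    fn from_owned(name: String) -> Person {
        Person { name }
    }

    /// Option 2 (hypothetical name): accept `&str`. Tidy call sites, but this
    /// always allocates, even if the caller had a `String` to hand over.
    fn from_borrowed(name: &str) -> Person {
        Person { name: name.to_string() }
    }
}
```

With option 1 a caller writes Person::from_owned("Ada".to_string()); with option 2 a caller who already owns a String still pays for a second allocation.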
It turns out that by relying on the Into trait, we can write code that will
only create a new copy of the string if we are given a &str:
struct Person {
    name: String,
}
impl Person {
    fn new<S: Into<String>>(name: S) -> Person {
        Person { name: name.into() }
    }
}
In this code:
- If name is a &str then into() will clone it, allocating memory for a new String. We have to accept that we need to allocate new memory for the copy, because we need to take ownership of the string.
- However, if name is a String, into() is a no-op and we get the string without creating a copy. The caller gets to decide whether to give us the original or a clone.
By passing a String, the user of our API gets to decide whether to allocate
the new string. On the other hand, passing an &str is often convenient, and we
support that seamlessly too, internally creating the new String when we need
it. Finally, we also handle any other type that implements the Into<String>
trait, which is a nice side-bonus.
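As a sanity check on the “no copy” claim, here is a small sketch that repeats the struct and adds a hypothetical helper, moved_without_copy, which compares the heap buffer address before and after handing a String to new. It is an illustration I put together, not something from the linked article.

```rust
struct Person {
    name: String,
}

impl Person {
    fn new<S: Into<String>>(name: S) -> Person {
        Person { name: name.into() }
    }
}

/// Hypothetical helper: true if the `Person` ends up holding the very same
/// heap buffer that the caller's `String` pointed at.
fn moved_without_copy() -> bool {
    let owned = String::from("Grace");
    let original_buffer = owned.as_ptr();
    let person = Person::new(owned); // `owned` is moved, not cloned
    person.name.as_ptr() == original_buffer
}
```

Callers can write Person::new("Ada") or Person::new(some_string), and if they want to keep their own copy they make that explicit with Person::new(some_string.clone()).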
(This approach is improved still further in
From &str to Cow. There we learn how
to avoid allocation even in the &str
case if the compiler can prove the &str
can be borrowed for long enough. If you are still reading at this point, I
definitely recommend hopping over there to read that one.)
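For a flavour of the Cow idea, here is a minimal sketch of my own, which is not necessarily the code from that article. It assumes the struct is allowed to carry a lifetime parameter, so it can hold either a borrowed or an owned name.

```rust
use std::borrow::Cow;

/// Assumption: `Person` gains a lifetime so it may borrow its name.
struct Person<'a> {
    name: Cow<'a, str>,
}

impl<'a> Person<'a> {
    /// A `&'a str` becomes `Cow::Borrowed` (no allocation);
    /// a `String` becomes `Cow::Owned` (moved in, no copy).
    fn new<S: Into<Cow<'a, str>>>(name: S) -> Person<'a> {
        Person { name: name.into() }
    }
}
```

The cost is that Person can no longer outlive a borrowed name, which is the “borrowed for long enough” condition mentioned above.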
So there we have it. I think I know more about what’s going on with these two types now. But, even better, I can write functions that take strings in ways that will be familiar to other Rust users, and that help communicate what the function will do with those strings.
Aim achieved 💪.
(Whether my APIs will be any good is a different story 😂).