Rust extension traits, greppability and IDEs

Traits are a central feature of Rust, critical for its implementation of
polymorphism; traits are used for both static (by serving as bounds for generic
parameters) and dynamic (by having trait objects to serve as interfaces)
polymorphism.

This post assumes some familiarity with traits and discusses only a specific
aspect of them – how extension traits affect code readability. To learn the
basics of traits in Rust, the official book is a good starting point.

Extension traits

This Rust RFC provides a good, short definition of extension traits:

Extension traits are a programming pattern that makes it possible to add
methods to an existing type outside of the crate defining that type.

For example, here’s a trait with a single method:

trait Magic {
    fn magic_num(&self) -> usize;
}

We can now implement the Magic trait for our types:

struct Foobar {
    name: String,
}

impl Magic for Foobar {
    fn magic_num(&self) -> usize {
        if self.name.is_empty() { 2 } else { 33 }
    }
}

Now a Foobar can be passed wherever a Magic is expected. Foobar is a
custom type, but what’s really interesting is that we can also implement
Magic for any other type, including types that we did not define. Let’s
implement it for bool:

impl Magic for bool {
    fn magic_num(&self) -> usize {
        if *self { 3 } else { 54 }
    }
}

We can now write code like true.magic_num() and it will work! We’ve added
a method to a built-in Rust type. Obviously, we can also implement this trait
for types in the standard library; e.g.:

impl<T> Magic for Vec<T> {
    fn magic_num(&self) -> usize {
        if self.is_empty() { 10 } else { 5 }
    }
}
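To tie the snippets above together, here is a self-contained sketch that also shows the two kinds of polymorphism mentioned at the top of the post. The helper functions double_magic and total_magic are hypothetical names I made up for illustration; they are not part of any library.

```rust
trait Magic {
    fn magic_num(&self) -> usize;
}

struct Foobar {
    name: String,
}

impl Magic for Foobar {
    fn magic_num(&self) -> usize {
        if self.name.is_empty() { 2 } else { 33 }
    }
}

// Extending a built-in type we did not define.
impl Magic for bool {
    fn magic_num(&self) -> usize {
        if *self { 3 } else { 54 }
    }
}

// Extending a standard library type we did not define.
impl<T> Magic for Vec<T> {
    fn magic_num(&self) -> usize {
        if self.is_empty() { 10 } else { 5 }
    }
}

// Hypothetical helper: static polymorphism, Magic used as a generic bound.
fn double_magic<M: Magic>(m: &M) -> usize {
    2 * m.magic_num()
}

// Hypothetical helper: dynamic polymorphism, Magic used as a trait object.
fn total_magic(items: &[&dyn Magic]) -> usize {
    items.iter().map(|m| m.magic_num()).sum()
}

fn main() {
    let fb = Foobar { name: String::from("joe") };
    let empty: Vec<i32> = Vec::new();

    println!("{}", double_magic(&fb)); // 2 * 33 = 66
    println!("{}", true.magic_num()); // 3
    println!("{}", total_magic(&[&fb, &false, &empty])); // 33 + 54 + 10 = 97
}
```

Note that the bool and Vec implementations are only allowed because Magic is defined in our own crate; Rust does not let us implement a foreign trait for a foreign type.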

Extension traits in the wild

Extension traits aren’t just a fringe feature; they are widely used in the Rust
ecosystem.

One example is the popular serde crate, which includes code that serializes
and deserializes data structures in multiple formats. One of the traits
serde provides is serde::Serialize; once we import this trait and pick a
concrete serializer (here, the one from the companion serde_json crate), we
can do stuff like [1]:

let mut serializer = serde_json::Serializer::new(std::io::stdout());
185.serialize(&mut serializer).unwrap();

Importing serde::Serialize is critical for this code to work, even though we
don’t refer to Serialize anywhere in our code explicitly. Rust requires a
trait to be in scope before its methods can be called on existing types;
otherwise it would be hard to avoid naming collisions when multiple traits
from different crates provide methods with the same name.
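Here is a minimal sketch of that scoping rule, using only the standard library. The modules crate_a and crate_b are hypothetical stand-ins for two external crates, and the Loud and Polite traits are invented for illustration; both add a greet method to str.

```rust
mod crate_a {
    // Hypothetical extension trait from "crate A".
    pub trait Loud {
        fn greet(&self) -> String;
    }

    impl Loud for str {
        fn greet(&self) -> String {
            format!("HEY {}!", self.to_uppercase())
        }
    }
}

mod crate_b {
    // Hypothetical extension trait from "crate B" with the same method name.
    pub trait Polite {
        fn greet(&self) -> String;
    }

    impl Polite for str {
        fn greet(&self) -> String {
            format!("Good day, {}.", self)
        }
    }
}

// Only Loud is brought into scope, so `.greet()` below resolves to it.
// Importing crate_b::Polite as well would make the method call ambiguous
// and the compiler would reject it.
use crate_a::Loud;

fn loud_greeting(name: &str) -> String {
    name.greet()
}

// A trait that is not imported can still be reached by naming it explicitly.
fn polite_greeting(name: &str) -> String {
    crate_b::Polite::greet(name)
}

fn main() {
    println!("{}", loud_greeting("bob")); // HEY BOB!
    println!("{}", polite_greeting("bob")); // Good day, bob.
}
```

Removing the use line makes name.greet() fail to compile, which is exactly the situation with serde::Serialize above.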

Another example is the byteorder crate, which helps encode numbers into
buffers with explicit length and endianness. To write some numbers into a vector
byte-by-byte, we have to import the relevant trait and enum first, and then
we can call the newly-added methods directly on a vector:

use byteorder::{LittleEndian, WriteBytesExt};

// …

let mut wv = vec![];
wv.write_u16::<LittleEndian>(259).unwrap();
wv.write_u16::<LittleEndian>(517).unwrap();

The write_u16 method is part of the WriteBytesExt trait, and it’s
implemented on a Vec<u8> by the byteorder crate. To be more precise, it’s
automatically implemented on any type that implements the std::io::Write trait.
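This "implemented on anything that implements Write" trick is usually done with a blanket implementation. The following is a rough sketch of the pattern using only the standard library; WriteU16LeExt and write_u16_le are names I invented for this example, and byteorder's actual trait is richer (it is generic over endianness, for one).

```rust
use std::io::{self, Write};

// Hypothetical extension trait, loosely modeled on the idea behind
// WriteBytesExt. The default method is built on top of Write::write_all.
trait WriteU16LeExt: Write {
    fn write_u16_le(&mut self, n: u16) -> io::Result<()> {
        self.write_all(&n.to_le_bytes())
    }
}

// Blanket implementation: every type that implements Write gets the
// extension method for free, including types from other crates.
impl<W: Write + ?Sized> WriteU16LeExt for W {}

// Hypothetical helper that encodes values into a byte vector.
fn encode(values: &[u16]) -> io::Result<Vec<u8>> {
    let mut buf: Vec<u8> = Vec::new();
    for &v in values {
        buf.write_u16_le(v)?;
    }
    Ok(buf)
}

fn main() -> io::Result<()> {
    // 259 = 0x0103 and 517 = 0x0205, stored least significant byte first.
    let bytes = encode(&[259, 517])?;
    println!("{:?}", bytes); // [3, 1, 5, 2]
    Ok(())
}
```

Because the implementation is attached to a bound rather than to a concrete type, there is no line anywhere that mentions Vec together with write_u16_le, which makes the method even harder to find with a textual search.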

Finally, let’s look at rayon – a library for simplified data-parallelism. It
provides magical iterators that have the same functionality as iter but
compute their results in parallel, leveraging multiple CPU cores. The rayon
documentation recommends importing the traits the crate injects as follows:

It is recommended that you import all of these traits at once by adding
use rayon::prelude::* at the top of each module that uses Rayon methods.

Having imported it thus, we can proceed to use Rayon as follows:

let exps = vec![2, 4, 6, 12, 24];
let pows_of_two: Vec<_> = exps.par_iter().map(|n| 2_u64.pow(*n)).collect();

Note the par_iter, which replaces a regular iter. It’s been magically
implemented on a vector, as well as a bunch of other types that support
iteration.

On greppability and code readability

All these uses of extension traits are pretty cool and useful, no doubt. But
that’s not the main point of my post. What I really want to discuss is how the
general approach relates to code readability, which is in my mind one of the
most important aspects of programming we should all be thinking about.

This Rust technique fails the greppability test; it’s not a word I made up –
google it! If it’s not immediately apparent, greppability means the ability to
explore a code base using textual search tools like grep, git grep,
ripgrep, pss or what have you.

Suppose you encounter this piece of code in a project you’re exploring:

let mut wv = vec![];
wv.write_u16::<LittleEndian>(259).unwrap();

“Interesting”, you think, “I didn’t know that Vec has a write_u16
method”. You quickly check the documentation – indeed, it doesn’t! So where is
it coming from? You grep the project… nothing. It’s nowhere in the
imports. You examine the imports one by one, and notice the:

use byteorder::{LittleEndian, WriteBytesExt};

“Aha!”, you say, “this imports LittleEndian, so maybe this has to do with
the byteorder crate”. You check the documentation of that crate and indeed,
you find the write_u16 method there; phew.

With par_iter you’re less lucky. Nothing in imports will catch your eye,
unless you’re already familiar with the rayon crate. If you’re not, then
use rayon::prelude::* won’t ring much of a bell in relation to par_iter.

Of course, you can just google the symbol and you’ll find it. Or maybe
you don’t even understand what the problem is, because your IDE is perfectly
familiar with these symbols and will gladly pop up their documentation when you
hover over them.

IDEs and language servers

These days we have free, powerful and fast IDEs that make all of this a
non-issue (looking at Visual Studio Code, of course). Coupled with smart
language servers, these IDEs are as familiar with your code as the compiler;
the language servers typically run a full front-end sequence on the code, ending
up with type-checked ASTs cross-referenced with symbol tables that let them
understand where each symbol is coming from, its type and so on. For Rust the
language server is RLS, for Go it’s gopls; all popular languages have them these
days [2].

It’s entirely possible that using a language like Rust without a sophisticated
IDE is madness, and I’m somewhat stuck in the past. But I have to say, I do
lament the loss of greppability. There’s something very universal about being
able to understand a project using only grep and the official documentation.

In fact, for some languages it’s likely that this has been the case for a long
while already. Who in their right mind has the courage to tackle a Java project
without an IDE? It’s just that this wasn’t always the case for systems
programming languages, and Rust going this way makes me slightly sad. Or maybe
I’m just too indoctrinated in Go at this point, where all symbol access happens
as package.Symbol, packages are imported explicitly and there is no magic
name injection anywhere (almost certainly by design).
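For what it’s worth, Rust does let you spell out the trait at the call site, which gets closer to the package.Symbol feel. It’s rarely written this way in practice, so treat the following as a sketch of the option rather than a convention. I’m using std::io::Write here to keep the example free of external crates; method_style and qualified_style are hypothetical function names.

```rust
// Method-call style: the trait must be in scope, and nothing at the call
// site says where write_all comes from.
fn method_style() -> Vec<u8> {
    use std::io::Write;

    let mut buf: Vec<u8> = Vec::new();
    buf.write_all(b"hi").unwrap();
    buf
}

// Path style: no import is needed, and the trait name appears right next
// to the method, so a textual search for either one lands here.
fn qualified_style() -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::new();
    std::io::Write::write_all(&mut buf, b"hi").unwrap();
    buf
}

fn main() {
    println!("{:?}", method_style()); // [104, 105]
    println!("{:?}", qualified_style()); // [104, 105]
}
```

Both functions do the same thing; the second just trades brevity for a call site that names its trait.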

I can’t exactly put my finger on why this is bothering me; perhaps I’m just
yelling at clouds here. While I’m at it, I should finally write that post
about printf-based debugging…

[1]
Note that it could be simpler to use serde_json’s to_string function
here, but I opted for the explicit serializer because I wanted to show
how we invoke a new method on an integer literal.

[2]
Apparently, not all tooling has access to sophisticated language servers;
for example, as far as I can tell GitHub source analysis won’t be able to
find where write_u16 is coming from, and the same is true of
Sourcegraph.
