---
title: "Another one bites the Rust"
subtitle: "Lessons learned from my first Rust project"
date: 2019-08-25
draft: false
author: Wilfried
tags: [dev, programming, rust]
---

# Rust, the (re)discovery 🗺️

Back to **2014**, the year I wrote my first _Rust_ program! Wow, that was even **before** version 1.0!

At my school, it was all about _C_, and 10 years of _C_ gets quite boring (because, yes, I started programming way before engineering school). I wanted some fresh air, and even if I really liked _Python_ back then, I'm more into **statically typed** programming languages.

I can't remember exactly what my first contact with _Rust_ was, maybe a blog post or a reddit post, but my reaction was something like:

![Brain Explosion (gif)](https://media.giphy.com/media/xT0xeJpnrWC4XWblEk/giphy.gif)

Now, let's dig into some of the reasons why _Rust_ blows my mind.

_Rust_ is a programming language focused on safety and concurrency. It's basically a modern replacement for C++, with a multi-paradigm approach to systems programming. The language was created by Graydon Hoare, is sponsored by Mozilla, and is used for the _Servo_ browser engine, parts of which are now embedded in _Mozilla Firefox_.

The language tries to be as **memory safe** as possible:

- no null pointers
- no dangling pointers
- no data races
- values have to be initialized before they are used
- no need to free allocated data

Back in 2014, the first thing that came to my mind was:

> Yeah, yeah, just yet another garbage collected language

And I was wrong! _Rust_ uses other mechanisms such as _ownership_, _lifetimes_, _borrowing_, and the ultimate **borrow checker** to ensure that memory is safe and freed only when the data is no longer used anywhere else in the program (_i.e._ when its _lifetime_ is over). These memory management concepts ensure _safety_ **AND** _speed_, since there is no garbage collection overhead.

For me, as a young hardcore C programmer[^1], this was literally heaven.
Less struggling with `calloc`, `malloc`, `free` and `valgrind`, thank god, _Mozilla_! Bless me!

**But** I dropped it in 2015. Why? Because at the time it was far from perfect. Struggling with all the new concepts was quite disturbing; add cryptic compilation errors to that and you open a gate to _brainhell_. My younger self was not confident enough to learn, and there was no community to help me understand the things that were unclear to me.

Five years later, my programming skills are clearly not at the same level as before. Learning and writing a lot of _Golang_ and _Javascript_, and playing with _Elixir_ and _Haskell_, completely changed how I manipulate and visualize code on a day-to-day basis.

It was time to **give** _Rust_ **another chance**.

# Fediwatcher 📊

In order to practice my _Rust_ skills, I wanted to build a concrete and useful _(at least for me)_ project. Inspired by the work of **href** on [fediverse.network](https://fediverse.network), my main idea was to build a small app that fetches metrics from various instances of the **fediverse** and pushes them into an [InfluxDB](https://influxdata.com) time series database.

**Fediwatcher** was born!

The code is available in a [github repo](https://github.com/papey/fediwatcher) associated with my [github account](https://github.com/papey).

If you're interested, check out the [Fediwatcher public instance](https://metrics.papey.fr).

Ok, ok, enough self-promotion. Now that you get the main idea, let's go for the technical part and all the lessons learned writing this small project!

# Cargo, compiling & building 🏗️

`cargo` is the **standard** _Rust_ package manager, created and maintained by the _Rust_ project. It is the tool used by all rustaceans, and that's a good thing. If you don't know it yet, I'm also a gopher, and package management with `go` is a big pile of shit. In 2019, leaving package management to the community is, I think, the biggest mistake you can make when creating a new programming language.
_So go for cargo!_

`cargo` is used for:

- downloading app dependencies
- compiling app dependencies
- compiling your project
- running your project
- running tests on your project
- publishing your project to [crates.io](https://crates.io)

All the information lives in the `Cargo.toml` file. Not needing a `makefile` makes my brain happy, and having a common way to test code without external packages is straightforward and a strong invitation to test your code.

All this standard tooling also means that the default docker images contain everything you need to set up _Continuous Integration_, without having to maintain specific container images for a specific test suite or whatever. Less time building stuff means more productive time to code. With _Rust_, batteries are included.

About compiling, spoiler alert, compiling a _Rust_ project **IS SLOW**:

{{< highlight sh >}}
cargo clean
time cargo build
   Compiling autocfg v0.1.4
   Compiling libc v0.2.58
   Compiling arrayvec v0.4.10
   Compiling spin v0.5.0
   Compiling proc-macro2 v0.4.30
   [...]
   Compiling reqwest v0.9.18
   Compiling fediwatcher v0.1.0 (/Users/wilfried/code/github/fediwatcher)
    Finished dev [unoptimized + debuginfo] target(s) in 4m 25s
cargo build  571,39s user 50,53s system 233% cpu 4:25,95 total
{{< / highlight >}}

This is quite surprising if you compare it to a fast-compiling language like `go`, but that's fair because the compiler has to check a bunch of things related to memory safety. Without a garbage collector, runtime speed and memory safety come at a price, and this price is **compile time**.

But I really think it's not a weakness for _Rust_, because `cargo` caching is amazing: after the first compilation, iterations are pretty fast, so it's not a real issue.

When it comes to building a _Docker_ image, I learned a nice tip to optimize container image builds with a clever use of layers. Here is the tip!
{{< highlight dockerfile "linenos=table" >}}
# New empty project
RUN USER=root cargo new --bin fediwatcher
WORKDIR /fediwatcher

# Fetch deps list
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml

# Step to build a default hello world project.
# Since Cargo.lock and Cargo.toml are present,
# all deps will be downloaded and cached inside this upper layer
RUN cargo build --release
RUN rm src/*.rs

# Now, copy source code
COPY ./src ./src

# Build the real project
RUN rm ./target/release/deps/fediwatcher*
RUN cargo build --release
{{< / highlight >}}

Remember that dependencies are less volatile than code. With containers, this means fetching dependencies as early as possible and copying code later!

In the _Rust_ case, the first thing to do is to create an empty project using `cargo new`. This creates a default project with a basic hello world in the `main.rs` file.

After that, copy everything related to dependencies (the `Cargo.toml` and `Cargo.lock` files) and trigger a build. In this image layer, all the deps will be downloaded and compiled.

Now that there is a layer containing all the dependencies, copy the real source code and then compile the real project. With this technique, the dependencies layer is cached and reused in later builds. Believe me, this is a time saver!

Not lost yet? Good, because there is more, so take a deep breath and let's dig into some _Rust_ features.

# Flow control 🛂

_Rust_ takes inspiration from various programming languages, mainly _C++_, an imperative language, but it also has a lot of features that are typical of _functional programming_. I have already written some _Haskell_ (because **Xmonad** ftw) and some _Elixir_, but I don't feel very confident with functional programming yet. I find this mixed salad called multi-paradigm programming very convenient for understanding and trying out a functional way of thinking.

The most prominent functional feature of Rust is the `match` statement.
To me, this is the most beautiful and clean way to handle multiple paths inside a program. For imperative programmers out there, a `match` is like a `switch case` on steroids. To illustrate, let's look at a simple example[^2].

{{< highlight rust >}}
let number = 2;

println!("Tell me something about {}", number);

match number {
    // Match a single value
    1 => println!("One!"),
    // Match several values
    2 | 3 | 5 | 7 | 11 => println!("This is a prime"),
    // Match an inclusive range
    13..=19 => println!("A teen"),
    // Whatever
    _ => println!("Ain't special"),
}
{{< / highlight >}}

Here, all the cases are matched, but what if I removed the last branch?

{{< highlight txt >}}
help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms
{{< / highlight >}}

See? _Rust_ bluntly points out the missing stuff, and that's why it's a pleasant language to use.

A `match` statement can also be used to _destructure_ a variable, a common pattern in _functional_ programming. Destructuring is a process used to break a structure into multiple independent variables. This is also useful when you need only a part of a structure, making your code more comprehensible and readable[^3].

{{< highlight rust >}}
struct Foo {
    x: (u32, u32),
    y: u32,
}

// Try changing the values in the struct to see what happens
let foo = Foo { x: (1, 2), y: 3 };

match foo {
    Foo { x: (1, b), y } => println!("First of x is 1, b = {}, y = {} ", b, y),

    // you can destructure structs and rename the variables,
    // the order is not important
    Foo { y: 2, x: i } => println!("y is 2, i = {:?}", i),

    // and you can also ignore some variables:
    Foo { y, .. } => println!("y = {}, we don't care about x", y),
    // this will give an error: pattern does not mention field `x`
    //Foo { y } => println!("y = {}", y);
}
{{< / highlight >}}

With `match`, I made my first step into the _functional programming_ way of thinking.
The second one was iterators, function chaining and closures, the perfect combo! The idea is to chain functions and pass the output of one as the input of the next. Chaining small, narrowly scoped functions makes code more readable, more testable and more reliable. As always, an example!

{{< highlight rust >}}
let iterator = [1,2,3,4,5].iter();

// fold, is also known as reduce, in other languages
let sum = iterator.fold(0, |total, next| total + next);

println!("{}", sum);
{{< / highlight >}}

The first line creates an `iterator`, a structure used to perform tasks on a sequence of items. Then `fold`, a method available on iterators, is used to sum up all the items and produce a single final value: the sum. As a parameter, we pass a `closure` (a function defined on the fly) taking two arguments, `total` and `next`. The `total` variable holds the running total and `next` is the next value from the iterator to add to `total`.

A non-functional alternative to the code shown above would be something like:

{{< highlight rust >}}
let collection = [1,2,3,4,5];

let mut sum = 0;

for elem in collection.iter() {
    sum += elem;
}

println!("{}", sum);
{{< / highlight >}}

With more complex data and more operations, removing for loops and chaining functions using `map`, `filter` or `fold` really makes code cleaner and easier to understand. You only see the important stuff, there is no distraction, and code without boilerplate lines is less error-prone.

Flow control is a large domain and it includes error handling. In _Rust_ there are two generic types for this: `Option`, used to describe the possibility of _absence_, and `Result`, used as a superset of `Option` to handle the possibility of errors.
Here is the definition of an `Option`:

{{< highlight rust >}}
enum Option<T> {
    None,
    Some(T),
}
{{< / highlight >}}

Where `None` means "no value" and `Some(T)` means "some value (of type `T`)".

An `Option` is useful if you, for example, search for a file that may not exist:

{{< highlight rust >}}
let file = "not.exists";

// `find` is assumed to be a helper returning an Option
match find(file, '.') {
    None => println!("File not found."),
    Some(_) => println!("File found : {}", &file),
}
{{< / highlight >}}

If you need an explicit error to handle, go for `Result`:

{{< highlight rust >}}
enum Result<T, E> {
    Ok(T),
    Err(E),
}
{{< / highlight >}}

Where `Ok(T)` means "everything is good for the value (of type `T`)" and `Err(E)` means "an error (of type `E`) occurred".

To conclude, it's possible to define an `Option` like this:

{{< highlight rust >}}
type Option<T> = Result<T, ()>;
{{< / highlight >}}

"An `Option` is a `Result` with an empty `Err` value". Q.E.D!

At this point of my journey (re)discovering _Rust_, I was super happy with all these new concepts. As a gopher, I know how crappy error handling can be in other languages, so a clean and standard way to handle errors? Count me in.

So, what about composing functions that need error handling? Ahah! Let's go:

{{< highlight rust "linenos=table, hl_lines=32-40 43-45 48-50" >}}
// An example using music bands

// Allow dead code, for learning purpose
#![allow(dead_code)]

#[derive(Debug)]
enum Bands {
    AAL,
    Alcest,
    Sabaton,
}

// But does it djent ?
fn does_it_djent(b: Bands) -> Option<Bands> {
    match b {
        // Only Animals As Leaders djents
        Bands::AAL => Some(b),
        _ => None,
    }
}

// Do I like it ?
fn likes(b: Bands) -> Option<Bands> {
    // No, I do not like Sabaton
    match b {
        Bands::Sabaton => None,
        _ => Some(b),
    }
}

// Do it djent and do I like it ? the match version !
fn match_likes_djent(b: Bands) -> Option<Bands> {
    match does_it_djent(b) {
        Some(b) => match likes(b) {
            Some(b) => Some(b),
            None => None,
        },
        None => None,
    }
}

// Do it djent and do I like it ? the map version !
fn map_likes_djent(b: Bands) -> Option<Option<Bands>> {
    does_it_djent(b).map(likes)
}

// Do it djents and do I like it ? the and_then version !
fn and_then_likes_djent(b: Bands) -> Option<Bands> {
    does_it_djent(b).and_then(likes)
}

fn main() {
    let aal = Bands::AAL;

    match and_then_likes_djent(aal) {
        Some(b) => println!("I like {:?} and it djents", b),
        None => println!("Hurgh, this band doesn't even djent !"),
    }
}
{{< / highlight >}}

On a first try, the basic solution is to use a series of `match` statements (line 32). With two functions, that's ok, but with 3 or more, this will be a pain in the ass to read.

Searching for a cleaner way of handling stuff that returns an `Option`, I found the associated `map` method. **BUT** using `map` with something that also returns an `Option` leads to (function definition on line 43): **an Option of an Option!**

![Facepalm](https://upload.wikimedia.org/wikipedia/commons/3/3b/Paris_Tuileries_Garden_Facepalm_statue.jpg)

Is everything doomed? No! Because there is the godsend `and_then` method (function starting on line 48). Basically, `and_then` ensures that we keep a "flat" structure and do not wrap an already existing `Option` in another `Option`.

_Lesson learned_: if you have to deal with a lot of `Option`s or `Result`s, use `and_then`.

Last but not least, I also want to write about the `?` operator for error handling. Since _Rust_ version 1.13, this operator removes a lot of boilerplate and redundant code.

Before 1.13, error handling would probably look like this:

{{< highlight rust >}}
use std::fs::File;
use std::io::{self, Read};

fn read_from_file() -> Result<String, io::Error> {
    let f = File::open("sample.txt");
    let mut s = String::new();

    let mut f = match f {
        Ok(f) => f,
        Err(e) => return Err(e),
    };

    match f.read_to_string(&mut s) {
        Ok(_) => Ok(s),
        Err(e) => Err(e),
    }
}
{{< / highlight >}}

With 1.13 and later:

{{< highlight rust >}}
use std::fs::File;
use std::io::{self, Read};

fn read_from_file() -> Result<String, io::Error> {
    let mut s = String::new();
    let mut f = File::open("sample.txt")?;

    f.read_to_string(&mut s)?;

    Ok(s)
}
{{< / highlight >}}

Nice and clean!
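
To watch the early return of `?` in action without touching the filesystem, here is a small sketch. The `add_str` helper is made up for illustration and is not part of Fediwatcher: each `?` either unwraps the parsed number or returns the parse error to the caller right away.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse two decimal strings and add them.
// Each `?` returns early with the error if parsing fails.
fn add_str(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    // Both inputs are valid numbers, so we get the Ok branch
    match add_str("40", "2") {
        Ok(sum) => println!("sum = {}", sum),
        Err(e) => println!("error: {}", e),
    }

    // The second input is not a number, so the second `?` returns the error
    match add_str("40", "two") {
        Ok(sum) => println!("sum = {}", sum),
        Err(e) => println!("error: {}", e),
    }
}
```

The function body only describes the happy path; the error plumbing is handled entirely by `?` and the return type.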
Before `?`, _Rust_ had a macro named `try!` playing the same role, but chaining calls with it leads to unreadable and ugly code:

{{< highlight rust >}}
try!(try!(try!(foo()).bar()).baz())
{{< / highlight >}}

To conclude, there is a lot of stuff here to make code easy to understand and maintain. Flow control using `match` and function combinators may seem odd at the beginning, but after some practice and experiments I find it quite pleasant to use.

But there is (again) more. Fasten your seatbelt, the next section will blow your mind.

# Ownership, borrowing, lifetimes 🤯

To be clear, my first 10 hours of _Rust_ coding just smashed my brain because of these three concepts. They are quite hard to understand at first, because they change the way we write and understand programs. With time, practice and help from the compiler, the mist is replaced by beautiful sunlight.

There are plenty of other blog posts, tutorials and lessons about lifetimes, ownership and borrowing. I will add my brick to this wall, with my own understanding of it.

Let's start with **ownership**. _Rust_ memory management is based on this concept. Every resource (variables, objects...) is **owned** by a block of code. At the end of this block, resources are destroyed. This is the standard, predictable, reproducible behavior of _Rust_.

For small stuff, that's easy to understand:

{{< highlight rust "linenos=table, hl_lines=11">}}
fn main() {
    // create a block, or scope
    {
        // resource creation
        let i = 42;
        println!("{}", i);
        // i is destroyed by the compiler, and you have nothing else to do
    }

    // fail, because i does not exist anymore
    println!("{}", i);
}
{{< / highlight >}}

Compiling this piece of code will throw an error:

{{< highlight txt >}}
error[E0425]: cannot find value `i` in this scope
  --> src/main.rs:11:20
   |
11 |     println!("{}", i);
   |                    ^ not found in this scope
{{< / highlight >}}

To remove the error, just delete line 11. Ok, cool!
But what if I want to pass a resource to another block or even a function?

{{< highlight rust "linenos=table" >}}
fn priprint(val: i32) {
    println!("{}", val);
}

fn main() {
    let i = 42;

    priprint(i);

    println!("{}", i);
}
{{< / highlight >}}

Here, this piece of code works because _Rust_ copies the value of `i` into `val` when calling the `priprint` function. Primitive types such as integers work this way because they implement the `Copy` trait, but if you want to pass, for example, a struct, _Rust_ will **move** the resource to the function. By **moving** a resource, you **transfer** ownership to the receiver. So in the example below, `priprint` is responsible for the destruction of the struct passed to it.

{{< highlight rust "linenos=table, hl_lines=12" >}}
struct Number {
    value: i32
}

fn priprint(n: Number) {
    println!("{}", n.value);
}

fn main() {
    let n = Number{ value: 42 };

    priprint(n);

    println!("{}", n.value);
}
{{< / highlight >}}

When compiling, _Rust_ will not be happy:

{{< highlight txt >}}
error[E0382]: borrow of moved value: `n`
  --> src/main.rs:14:20
   |
10 |     let n = Number{ value: 42 };
   |         - move occurs because `n` has type `Number`, which does not implement the `Copy` trait
11 |
12 |     priprint(n);
   |              - value moved here
13 |
14 |     println!("{}", n.value);
   |                    ^^^^^^^ value borrowed here after move
{{< / highlight >}}

After **ownership** comes **borrowing**. With **borrowing**, a _Rust_ program is able to hold multiple references, or _pointers_, to the same resource. Passing a reference to another block tells this block: here is a **borrow** (mutable or immutable), do what you want with it but do not destroy it at the end of your scope. To pass references, or **borrows**, add the `&` operator to the `priprint` parameter and to the argument at the call site.

{{< highlight rust "linenos=table, hl_lines=5 12" >}}
struct Number {
    value: i32
}

fn priprint(n: &Number) {
    println!("{}", n.value);
}

fn main() {
    let n = Number{ value: 42 };

    priprint(&n);

    println!("{}", n.value);
}
{{< / highlight >}}

Seems cool, no?
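
The borrow in `priprint` is immutable. A mutable borrow works the same way with `&mut`; here is a minimal sketch, where `increment` is a name made up for illustration. The function is allowed to modify the struct, but ownership stays in `main`.

```rust
struct Number {
    value: i32,
}

// Hypothetical helper: takes a mutable borrow and modifies the value in place.
fn increment(n: &mut Number) {
    n.value += 1;
}

fn main() {
    // The binding must be declared `mut` to hand out a mutable borrow
    let mut n = Number { value: 41 };

    increment(&mut n);

    // n is still owned by main, so it can be used after the call
    println!("{}", n.value);
}
```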
If a friend of mine borrows my car, I hope he will not return it in pieces.

Now, **lifetimes**! _Rust_ resources always have a **lifetime** associated with them. This means that resources are accessible, or "live", from the moment you declare them until the moment they are dropped. If you're familiar with other programming languages, think about **extensible scopes**. To me, **extensible scopes** means that **scopes** can be moved from one block of code to another.

Simple, huh? But things get complicated if you add references to the mix. Why? Because references also have a **lifetime**, and this **lifetime**, called the **associated lifetime**, can be shorter than the **lifetime** of the resource the reference points to. Can this **associated lifetime** be longer? No! Because we want to access valid data! In most cases, the _Rust_ compiler is able to infer how **lifetimes** are related. If not, it will explicitly ask you to annotate your code with **lifetime specifiers**. Digging into this subject would take a whole article, and I don't feel confident enough with **lifetimes** yet to explain them in detail.

This is clearly the hardest part of learning _Rust_. If you don't understand what you're doing at the beginning, that's not a real problem. Don't give up: read, try and experiment. The reward is worth it.

![No idea](https://i.kym-cdn.com/entries/icons/original/000/008/342/ihave.jpg)

# What's next? 🔭

Thanks to _Rust_ and my little project, I learned a bunch of new programming concepts.

_Rust_ is a beautiful language. The first time I used it, many years ago, it was a bit odd to understand. Today, with more programming experience, I really understand why it matters.

To me, 2019 will be the _Rust_ year. A lot of _Rust_ projects are popping up on Github, and that's a good sign of how the language is starting to gain popularity. Backed by Mozilla and the community, I really believe it's the go-to language for the next 10 years.
Of course, _Golang_ is also part of this spectrum of new generation languages, but the two complement each other with different ways of thinking and programming. That's clear to me: I will continue to write _Go_ **AND** _Rust_ programs.

Now, I need to go deeper. On one hand, by adding new features to **Fediwatcher**, I want to experiment with concurrency and see how it compares to _Golang_. On the other hand, I'm really, really interested in **web assembly**, and I think _Rust_ is a good bet to start exploring this new open field.

Last but not least, all these skills will allow me to continue my contributions to [Plume](https://github.com/Plume-org/Plume), a federated blogging application written in _Rust_ and based on ActivityPub.

Let's go^Wrust!

[^1]: I am not a hardcore C programmer anymore, because of _Golang_ and _Rust_, of course.
[^2]: Taken and adapted from [Rust documentation](https://doc.rust-lang.org/rust-by-example/flow_control/match.html)
[^3]: Taken from [Rust documentation](https://doc.rust-lang.org/rust-by-example/flow_control/match/destructuring/destructure_structures.html)