From 559536db5938235a85940ad9bc919c5cbd6d28fe Mon Sep 17 00:00:00 2001
From: Wilfried OLLIVIER
Date: Sun, 25 Aug 2019 14:50:58 +0200
Subject: [PATCH] Add 03-bites-the-rust article

---
 content/post/03-bites-the-rust.md | 668 ++++++++++++++++++++++++++++++
 1 file changed, 668 insertions(+)
 create mode 100644 content/post/03-bites-the-rust.md

diff --git a/content/post/03-bites-the-rust.md b/content/post/03-bites-the-rust.md
new file mode 100644
index 0000000..3eec28c
--- /dev/null
+++ b/content/post/03-bites-the-rust.md
@@ -0,0 +1,668 @@
+---
+title: "Another one bites the Rust"
+subtitle: "Lessons learned from my first Rust project"
+date: 2019-08-25
+draft: false
+tags: [dev, programming, rust]
+---
+
+# Rust, the (re)discovery 🗺️
+
+Back in **2014**, the year I wrote my first _Rust_ program ! Wow, that was
+even **before** version 1.0 !
+
+At my school, it was all about _C_, and 10 years of _C_ is quite boring
+(because, yes, I started programming way before my engineering school). I
+wanted some fresh air, and even if I really liked _Python_ at that time,
+I'm more into **statically typed** programming languages.
+
+I can't remember exactly what my first contact with _Rust_ was, maybe a blog
+post or a reddit post, and my reaction was something like :
+
+![Brain Explosion (gif)](https://media.giphy.com/media/xT0xeJpnrWC4XWblEk/giphy.gif)
+
+Now, let's dig into some reasons why _Rust_ blows my mind.
+
+_Rust_ is a programming language focused on safety and concurrency. It's
+basically the modern replacement of C++, plus a multi-paradigm approach to
+systems programming. This language was created and designed by
+Graydon Hoare at Mozilla and is used for the _Servo_ browser engine, parts of
+which are now embedded in _Mozilla Firefox_.
+
+This system tries to be as **memory safe** as possible :
+
+- no null pointers
+- no dangling pointers
+- no data races
+- values have to be initialized and used
+- no need to free allocated data
+
+Back in 2014, the first thing that came to my mind was :
+
+    Yeah, yeah, just yet another garbage collected language
+
+And I was wrong ! _Rust_ uses other mechanisms such as _ownership_,
+_lifetimes_, _borrowing_, and the ultimate **borrow checker** to ensure that
+memory is safe and freed only when data will not be used anywhere else in the
+program (_ie_ : when the _lifetime_ is over). These new memory management concepts
+ensure _safety_ **AND** _speed_ since there is no overhead generated by a
+garbage collector.
+
+For me, as a young hardcore C programmer[^1], this was literally heaven. Less struggling
+with `calloc`, `malloc`, `free` and `valgrind`, thank god _Mozilla_ ! Bless me !
+
+**But**, I dropped it in 2015. Why ? Because it was, at that time, far from perfect.
+Struggling with all the new concepts was quite disturbing, add to that cryptic compilation
+errors and you open a gate to _brainhell_. My younger self was not confident enough to learn
+and there was no community to help me understand things that were unclear to me.
+
+Five years later, my programming skills are clearly not at the same level as
+before. Learning and writing a lot of _Golang_ and _Javascript_ stuff, playing
+with _Elixir_ and _Haskell_, completely changed how I manipulate and how I
+visualize code on a day to day basis. It was time to **give another** chance to _Rust_.
+
+# Fediwatcher 📊
+
+In order to practice my _Rust_ skills, I wanted to build a concrete and
+useful _(at least for me)_ project.
+
+Inspired by the work of **href** on
+[fediverse.network](https://fediverse.network) my main idea was to build a
+small app to fetch metrics from various instances of the **fediverse** and
+push them into an [InfluxDB](https://influxdata.com) timeseries database.
+
+**Fediwatcher** was born !
+
+The code is available on a [github repo](https://github.com/papey/fediwatcher)
+associated with my [github account](https://github.com/papey).
+
+If you're interested, check out the [Fediwatcher public instance](https://metrics.papey.fr).
+
+Ok, ok, enough personal promotion, now that you get the main idea, let's go for
+the technical part and all the lessons learned writing this small project !
+
+# Cargo, compiling & building 🏗️
+
+`cargo` is the _Rust_ **standard** package manager created and maintained by
+the _Rust_ project. This is the tool used by all rustaceans and that's a good
+thing. If you don't know it yet, I'm also a gopher, and package management
+with `go` is a big pile of shit. In 2019, leaving package management to the
+community is, I think, the biggest mistake you can make when creating a new
+programming language. _So go for cargo !_
+
+`cargo` is used for :
+
+- downloading app dependencies
+- compiling app dependencies
+- compiling your project
+- running your project
+- running tests on your project
+- publishing your project to [crates.io](https://crates.io)
+
+All the information is contained in the `Cargo.toml` file. No need for a
+`makefile` makes my brain happy, and having a common way to test code without
+external packages is pretty straightforward and a strong invitation to test
+your code.
+
+All the standard stuff also means that default docker images contain all you
+need to set up a _Continuous Integration_ pipeline without the need to maintain specific
+container images for a specific test suite or whatever. Less time building
+stuff means more productive time to code. With _Rust_, all the batteries are
+included.
+
+About compiling, spoiler alert, _Rust_ project compiling **IS SLOW** :
+
+{{< highlight sh >}}
+cargo clean
+time cargo build
+Compiling autocfg v0.1.4
+Compiling libc v0.2.58
+Compiling arrayvec v0.4.10
+Compiling spin v0.5.0
+Compiling proc-macro2 v0.4.30
+[...]
+Compiling reqwest v0.9.18
+Compiling fediwatcher v0.1.0 (/Users/wilfried/code/github/fediwatcher)
+Finished dev [unoptimized + debuginfo] target(s) in 4m 25s
+cargo build 571,39s user 50,53s system 233% cpu 4:25,95 total
+{{< / highlight >}}
+
+This is quite surprising if you compare it to a fast compiling language like `go`,
+but that's fair because the compiler has to check a bunch of things related to
+memory safety. With no garbage collector, speed at runtime and memory safety,
+you have to pay a price and this price is the **compile time**.
+
+But I really think it's not a weakness for _Rust_ because `cargo` caching is
+amazing and after the first compilation, iterations are pretty fast so it's
+not a real issue.
+
+When it comes to building a _Docker_ image, I learned a nice tip to optimize
+container image building with a clever use of layers. Here is the tip !
+
+{{< highlight dockerfile "linenos=table" >}}
+
+# New empty project
+
+RUN USER=root cargo new --bin fediwatcher
+WORKDIR /fediwatcher
+
+# Fetch deps list
+
+COPY ./Cargo.lock ./Cargo.lock
+COPY ./Cargo.toml ./Cargo.toml
+
+# Step to build a default hello world project.
+# Since Cargo.lock and Cargo.toml are present,
+# all deps will be downloaded and cached inside this upper layer
+
+RUN cargo build --release
+RUN rm src/*.rs
+
+# Now, copy source code
+
+COPY ./src ./src
+
+# Build the real project
+
+RUN rm ./target/release/deps/fediwatcher*
+RUN cargo build --release
+{{< / highlight >}}
+
+Remember that dependencies are less volatile than code, and with containers
+this means get dependencies as soon as possible and copy code later ! In the
+_Rust_ case, the first thing to do is to create an empty project using `cargo new`.
+This will create a default project with a basic hello world in the
+`main.rs` file.
+
+After that, copy everything related to dependencies
+(`Cargo.toml` and `Cargo.lock` files) and trigger a build. In this image
+layer, all the deps will be downloaded and compiled.
+
+Now that there is a layer
+containing all the dependencies, copy the real source code and then compile
+the real project. With this technique, the dependencies layer will be cached and reused
+in later builds. Believe me, this is a time saver !
+
+Not lost yet ? Good, because there is more, so take a deep breath and let's go
+digging into some _Rust_ features.
+
+# Flow control 🛂
+
+_Rust_ takes inspiration from various programming languages, mainly _C++_, an
+imperative language, but there are also a lot of features that are typical of
+_functional programming_. I already wrote some _Haskell_ (because
+**Xmonad** ftw) and some _Elixir_ but I don't feel very confident with
+functional programming yet.
+
+I find this mixed salad, called multi-paradigm
+programming, very convenient to understand and try some functional ways of
+thinking.
+
+The topmost functional feature of Rust is the `match` statement. To me,
+this is the most beautiful and clean way to handle multiple paths inside a
+program. For imperative programmers out there, a `match` is like a `switch case` on steroids.
+To illustrate, let's look at a simple example[^2].
+
+{{< highlight rust >}}
+let number = 2;
+
+println!("Tell me something about {}", number);
+
+match number {
+    // Match a single value
+    1 => println!("One!"),
+    // Match several values
+    2 | 3 | 5 | 7 | 11 => println!("This is a prime"),
+    // Match an inclusive range
+    13..=19 => println!("A teen"),
+    // Whatever
+    _ => println!("Ain't special"),
+}
+{{< / highlight >}}
+
+Here, all the cases are matched, but what if I removed the last branch ?
+
+{{< highlight txt >}}
+help: ensure that all possible cases are being handled, possibly by adding
+wildcards or more match arms
+{{< / highlight >}}
+
+See ? _Rust_ violently pointing out missing stuff, and that's why it's a
+pleasant language to use.
+
+A `match` statement can also be used to _destructure_ a variable, a common
+pattern in _functional_ programming.
+Destructuring is a process used to
+break a structure into multiple independent variables. This can also be
+useful when you need only a part of a structure, making your code more
+comprehensible and readable[^3].
+
+{{< highlight rust >}}
+struct Foo {
+    x: (u32, u32),
+    y: u32,
+}
+
+// Try changing the values in the struct to see what happens
+let foo = Foo { x: (1, 2), y: 3 };
+
+match foo {
+    Foo { x: (1, b), y } => println!("First of x is 1, b = {}, y = {} ", b, y),
+
+    // you can destructure structs and rename the variables,
+    // the order is not important
+    Foo { y: 2, x: i } => println!("y is 2, i = {:?}", i),
+
+    // and you can also ignore some variables:
+    Foo { y, .. } => println!("y = {}, we don't care about x", y),
+    // this will give an error: pattern does not mention field `x`
+    //Foo { y } => println!("y = {}", y);
+}
+{{< / highlight >}}
+
+With `match`, I made my first step into the _functional programming_ way
+of thinking. The second one was iterators, function chaining and
+closures, the perfect combo ! The idea is to chain functions and pass input
+and output from one to another. Chaining small-scope functions makes
+code more readable, more testable and more reliable. As always, an example !
+
+{{< highlight rust >}}
+let iterator = [1,2,3,4,5].iter();
+
+// fold, is also known as reduce, in other languages
+let sum = iterator.fold(0, |total, next| total + next);
+
+println!("{}", sum);
+{{< / highlight >}}
+
+The first line is used to create an `iterator`, a structure used to perform
+tasks on a sequence of items. Later on, `fold`, a specific method associated with
+iterators, is used to sum up all items inside the iterator and produce
+a single final value : the sum. As a parameter, we pass a `closure` (a
+function defined on the fly) with a `total` and a `next` argument. The
+`total` variable is used to store the running total and `next` is the
+next value inside the iterator to add to `total`.
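
To make the accumulation visible, here is a minimal sketch of the same `fold`, with each step printed. The `sum_with_fold` helper is a name made up for this illustration; `fold` and `sum` are the standard iterator methods.

```rust
// Hypothetical helper for this sketch: sum a slice with fold.
// The accumulator starts at 0 and each item is added to it in turn.
fn sum_with_fold(items: &[i32]) -> i32 {
    items.iter().fold(0, |total, next| total + next)
}

fn main() {
    let items: [i32; 5] = [1, 2, 3, 4, 5];

    // Same closure as above, printing the accumulator at every step
    let traced: i32 = items.iter().fold(0, |total, next| {
        println!("total = {}, next = {}", total, next);
        total + next
    });

    // sum() is the ready-made shortcut for this particular fold
    let shortcut: i32 = items.iter().sum();

    assert_eq!(traced, sum_with_fold(&items));
    assert_eq!(traced, shortcut);
    println!("{}", traced);
}
```

For a plain sum, `sum()` is the shorter choice; `fold` earns its place once the accumulator is something richer than a number.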
+
+A non-functional alternative to the `fold` version
+above would be something like :
+
+{{< highlight rust >}}
+let collection = [1,2,3,4,5];
+
+let mut sum = 0;
+
+for elem in collection.iter() {
+    sum += elem;
+}
+
+println!("{}", sum);
+{{< / highlight >}}
+
+With more complex data and more operations, removing for loops and chaining
+functions using `map`, `filter` or `fold` really makes code cleaner and easier
+to understand. You just get the important stuff, there is no distraction, and
+code without boilerplate lines is less error-prone.
+
+Flow control is a large domain and it includes error handling. In _Rust_
+there are two generic types for this : `Option`, used to describe the
+possibility of _absence_, and `Result`, used as a superset of `Option` to handle
+the possibility of errors.
+
+Here is the definition of an `Option` :
+
+{{< highlight rust >}}
+enum Option<T> {
+    None,
+    Some(T),
+}
+{{< / highlight >}}
+
+Where `None` means "no value" and `Some(T)` means "some value (of type `T`)"
+
+An `Option` is useful if you, for example, search for a file that may not exist
+
+{{< highlight rust >}}
+let file = "not.exists";
+
+// `find` stands for a hypothetical lookup helper returning an `Option`
+match find(file, '.') {
+    None => println!("File not found."),
+    Some(_) => println!("File found : {}", file),
+}
+{{< / highlight >}}
+
+If you need an explicit error to handle, go for `Result` :
+
+{{< highlight rust >}}
+enum Result<T, E> {
+    Ok(T),
+    Err(E),
+}
+{{< / highlight >}}
+
+Where `Ok(T)` means "everything is good for the value (of type `T`)" and
+`Err(E)` means "an error (of type `E`) occurred". To conclude, it's possible to
+define an `Option` like this :
+
+{{< highlight rust >}}
+type Option<T> = Result<T, ()>;
+{{< / highlight >}}
+
+"An `Option` is a `Result` with an empty `Err` value". Q.E.D !
+
+At this point of my journey (re)discovering _Rust_ I was super happy with all
+these new concepts. As a gopher, I know how crappy error handling can be in
+other languages, so a clean and standard way to handle errors, count me in.
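
The link between `Option` and `Result` also shows up in the standard library, which converts one into the other. Below is a minimal sketch, assuming two hypothetical helpers, `extension_start` and `extension_start_or_err`, that are not part of Fediwatcher; `str::find`, `Option::ok_or` and `Result::ok` are standard methods.

```rust
// Hypothetical helper: position of the first '.' in a file name, if any.
fn extension_start(name: &str) -> Option<usize> {
    name.find('.')
}

// Same lookup, but the absence is turned into an explicit error message.
fn extension_start_or_err(name: &str) -> Result<usize, String> {
    extension_start(name).ok_or(format!("no extension in {}", name))
}

fn main() {
    // Option -> Result with ok_or: None becomes the given error
    println!("{:?}", extension_start_or_err("sample.txt"));
    println!("{:?}", extension_start_or_err("Makefile"));

    // Result -> Option with ok(): the error value is dropped
    println!("{:?}", extension_start_or_err("Makefile").ok());
}
```

Going from `Option` to `Result` means choosing an error value, while going back with `ok()` throws it away, which is the "empty `Err`" idea seen from the other side.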
+
+So, what about composing functions that need error handling ? Ahah ! Let's
+go :
+
+{{< highlight rust "linenos=table, hl_lines=32-40 43-45 48-50" >}}
+// An example using music bands
+
+// Allow dead code, for learning purpose
+#![allow(dead_code)]
+
+#[derive(Debug)]
+enum Bands {
+    AAL,
+    Alcest,
+    Sabaton,
+}
+
+// But does it djent ?
+fn does_it_djent(b: Bands) -> Option<Bands> {
+    match b {
+        // Only Animals As Leaders djents
+        Bands::AAL => Some(b),
+        _ => None,
+    }
+}
+
+// Do I like it ?
+fn likes(b: Bands) -> Option<Bands> {
+    // No, I do not like Sabaton
+    match b {
+        Bands::Sabaton => None,
+        _ => Some(b),
+    }
+}
+
+// Does it djent and do I like it ? the match version !
+fn match_likes_djent(b: Bands) -> Option<Bands> {
+    match does_it_djent(b) {
+        Some(b) => match likes(b) {
+            Some(b) => Some(b),
+            None => None,
+        },
+        None => None,
+    }
+}
+
+// Does it djent and do I like it ? the map version !
+fn map_likes_djent(b: Bands) -> Option<Option<Bands>> {
+    does_it_djent(b).map(|b| likes(b))
+}
+
+// Does it djent and do I like it ? the and_then version !
+fn and_then_likes_djent(b: Bands) -> Option<Bands> {
+    does_it_djent(b).and_then(|b| likes(b))
+}
+
+fn main() {
+    let aal = Bands::AAL;
+
+    match and_then_likes_djent(aal) {
+        Some(b) => println!("I like {:?} and it djents", b),
+        None => println!("Hurgh, this band doesn't even djent !"),
+    }
+}
+{{< / highlight >}}
+
+On a first try, the basic solution is to use a series of `match` statements
+(line 32). With two functions, that's ok, but with 3 or more, this
+will be a pain in the ass to read. Searching for a cleaner way of handling
+stuff that returns an `Option`, I found the associated `map` method. **BUT**
+using `map` with something that also returns an `Option` leads to (function
+definition on line 43) :
+
+**an Option of an Option !**
+
+![Facepalm](https://upload.wikimedia.org/wikipedia/commons/3/3b/Paris_Tuileries_Garden_Facepalm_statue.jpg)
+
+Is everything doomed ? No !
+Because there is the godsend `and_then` method
+(function starting on line 48). Basically, `and_then` ensures that we keep a
+"flat" structure and do not add an `Option` wrapping to an already existing
+`Option`. _Lesson learned_ : if you have to deal with a lot of `Option`s or
+`Result`s, use `and_then`.
+
+Last but not least, I also want to write about the `?` operator for error
+handling. Since _Rust_ version 1.13, this new operator removes a lot of
+boilerplate and redundant code.
+
+Before 1.13, error handling would probably look like this :
+
+{{< highlight rust >}}
+use std::fs::File;
+use std::io::{self, Read};
+
+fn read_from_file() -> Result<String, io::Error> {
+    let f = File::open("sample.txt");
+    let mut s = String::new();
+
+    let mut f = match f {
+        Ok(f) => f,
+        Err(e) => return Err(e),
+    };
+
+
+    match f.read_to_string(&mut s) {
+        Ok(_) => Ok(s),
+        Err(e) => Err(e),
+    }
+}
+{{< / highlight >}}
+
+With 1.13 and later,
+
+{{< highlight rust >}}
+use std::fs::File;
+use std::io::{self, Read};
+
+fn read_from_file() -> Result<String, io::Error> {
+    let mut s = String::new();
+    let mut f = File::open("sample.txt")?;
+
+    f.read_to_string(&mut s)?;
+
+    Ok(s)
+}
+{{< / highlight >}}
+
+Nice and clean ! Before `?`, _Rust_ offered a macro named `try!`, used like
+the `?` operator, but chaining calls led to unreadable and ugly code :
+
+{{< highlight rust >}}
+try!(try!(try!(foo()).bar()).baz())
+{{< / highlight >}}
+
+To conclude, there is a lot of stuff here to make code easy to understand
+and maintain. Flow control using match and function combinations may seem
+odd at the beginning but after some practice and experiments I find it quite
+pleasant to use. But there is (again) more, fasten your seatbelt, the next
+section will blow your mind.
+
+# Ownership, borrowing, lifetimes 🤯
+
+To be clear, the first 10 hours of _Rust_ coding just smashed my brain because
+of these three concepts. They are quite hard to understand at first, because
+they change the way we make and understand programs. With time, practice and
+compiler help, the mist is replaced by beautiful sunlight.
+There are plenty
+of other blog posts, tutorials and lessons about lifetimes, ownership and
+borrowing. I will add my brick to this wall, with my own understanding of it.
+
+Let's start with **ownership**. _Rust_ memory management is based on this
+concept. Every resource (variables, objects...) is **owned** by a block of
+code. At the end of this block, resources are destroyed. This is the standard,
+predictable, reproducible behavior of _Rust_. For small stuff,
+that's easy to understand :
+
+{{< highlight rust "linenos=table, hl_lines=11">}}
+fn main() {
+    // create a block, or scope
+    {
+        // resource creation
+        let i = 42;
+        println!("{}", i);
+        // i is destroyed by the compiler, and you have nothing else to do
+    }
+
+    // fail, because i does not exist anymore
+    println!("{}", i);
+
+}
+{{< / highlight >}}
+
+Compiling this piece of code will throw an error :
+
+{{< highlight txt >}}
+error[E0425]: cannot find value `i` in this scope
+  --> src/main.rs:11:20
+   |
+11 |     println!("{}", i);
+   |                    ^ not found in this scope
+{{< / highlight >}}
+
+To remove the error, just delete line 11.
+
+Ok, cool ! But what if I want to pass a resource to another block or even a
+function ?
+
+{{< highlight rust "linenos=table" >}}
+fn priprint(val: i32) {
+    println!("{}", val);
+}
+
+fn main() {
+    let i = 42;
+
+    priprint(i);
+
+    println!("{}", i);
+
+}
+{{< / highlight >}}
+
+Here, this piece of code works because _Rust_ copies the value of `i` into
+`val` when calling the `priprint` function. All primitive types in _Rust_
+work this way (they implement the `Copy` trait), but, if you want to pass, for
+example, a struct, _Rust_ will **move** the resource to the function. By
+**moving** a resource, you **transfer** ownership to the receiver. So in the
+example below `priprint` will be responsible for the destruction of the struct
+passed to it.
+
+{{< highlight rust "linenos=table, hl_lines=12" >}}
+struct Number {
+    value: i32
+}
+
+fn priprint(n: Number) {
+    println!("{}", n.value);
+}
+
+fn main() {
+    let n = Number{ value: 42 };
+
+    priprint(n);
+
+    println!("{}", n.value);
+
+}
+{{< / highlight >}}
+
+When compiling, _Rust_ will not be happy :
+
+{{< highlight txt >}}
+error[E0382]: borrow of moved value: `n`
+  --> src/main.rs:14:20
+   |
+10 |     let n = Number{ value: 42 };
+   |         - move occurs because `n` has type `Number`, which does not implement the `Copy` trait
+11 |
+12 |     priprint(n);
+   |              - value moved here
+13 |
+14 |     println!("{}", n.value);
+   |                    ^^^^^^^ value borrowed here after move
+{{< / highlight >}}
+
+After **ownership** comes **borrowing**. With **borrowing** our _Rust_
+program is able to have multiple references or _pointers_. Passing a
+reference to another block tells this block : here is a **borrow** (mutable
+or immutable), do what you want with it but do not destroy it at the end of your
+scope. To pass references, or **borrows**, add the `&` operator to the `priprint`
+argument and parameter.
+
+{{< highlight rust "linenos=table, hl_lines=5 12" >}}
+struct Number {
+    value: i32
+}
+
+fn priprint(n: &Number) {
+    println!("{}", n.value);
+}
+
+fn main() {
+    let n = Number{ value: 42 };
+
+    priprint(&n);
+
+    println!("{}", n.value);
+
+}
+{{< / highlight >}}
+
+Seems cool, no ? If a friend of mine borrows my car, I hope he will not
+return it in pieces.
+
+Now, **lifetimes** ! _Rust_ resources always have a **lifetime** associated
+with them. This means that resources are accessible or "live" from the moment you
+declare them until the moment they are dropped. If you're familiar with other
+programming languages, think about **extensible scopes**. To me **extensible
+scopes** means that **scopes** can be moved from one block of code to another. Simple, huh ? But
+things get complicated if you add references in the mix. Why ?
+Because
+references also have a **lifetime**, and this **lifetime**, called the **associated
+lifetime**, can be shorter than the **lifetime** of the data pointed to by the reference. Can
+this **associated lifetime** be longer ? No ! Because we want to access valid
+data ! In most cases, the _Rust_ compiler is able to guess how **lifetimes** are
+related. If not, it will explicitly ask you to annotate your code with
+**lifetime specifiers**. To dig into this subject, a whole article is necessary and
+I don't feel confident enough with **lifetimes** yet to explain them
+in detail. This is clearly the hardest part when you are learning _Rust_. If
+you don't understand what you're doing at the beginning, that's not a real problem.
+Don't give up, read, try and experiment, the reward is worth it.
+
+![No idea](https://i.kym-cdn.com/entries/icons/original/000/008/342/ihave.jpg)
+
+# What's next ? 🔭
+
+Thanks to _Rust_ and my little project, I learned a bunch of new concepts
+related to programming.
+
+_Rust_ is a beautiful language. The first time I used it, many years ago, it
+was a bit odd to understand. Today, with more programming experience, I
+really understand why it matters. To me, 2019 will be the _Rust_ year. A lot
+of _Rust_ projects pop up on Github, and that's a good sign of how the
+language is starting to gain popularity. Backed by Mozilla and the
+community, I really believe that it's the go-to language for the next 10
+years. Of course, _Golang_ is also in this spectrum of new generation
+languages but they complement each other with various ways of thinking and
+programming. That's clear to me, I will continue to make _Go_ **AND** _Rust_
+programs.
+
+Now, I need to go deeper. On one hand, by adding new features to
+**Fediwatcher** I want to experiment with concurrency and see how it
+compares to _Golang_.
+
+On the other hand, I'm really, really interested in **web assembly** and I
+think _Rust_ is a good bet to start exploring this new open field.
+Last but not
+least, all these skills will allow me to continue my contributions to
+[Plume](https://github.com/Plume-org/Plume), a _Rust_ federated blogging
+application, based on ActivityPub.
+
+Let's go^Wrust !
+
+[^1]: I am not a hardcore C programmer anymore, because of _Golang_ and _Rust_, of course.
+[^2]: Taken and adapted from [Rust documentation](https://doc.rust-lang.org/rust-by-example/flow_control/match.html)
+[^3]: Taken from [Rust documentation](https://doc.rust-lang.org/rust-by-example/flow_control/match/destructuring/destructure_structures.html)