TechMeUp - Rust

An introduction to the Rust language

Anthony Baire

September 20, 2018



Licensed under Creative Commons Attribution-NonCommercial-NoDerivs 3.0 France

Summary

  1. Introduction

  2. Language basics

  3. Traits

  4. Memory management

  5. Other nice features

Part 1. Introduction

Rust Features

fn main()
{
    println!("Hello World!");
}

Rust Timeline

2006: personal project started by Graydon Hoare (Mozilla)

2009: Mozilla starts sponsoring Rust

2011: rustc compiles itself

2013: Servo project started (web browser engine)

15 May 2015: Rust 1.0 released (Rust 2015)

2016: Quantum project started (integration of Servo components into Firefox)

14 Nov 2017: Firefox 57 (first release using Rust code)

9 May 2018: Firefox 60 (Quantum CSS merged into Firefox)

later in 2018: Rust 2018 (first major release since 1.0)

Rust Influences (1/2)

(from the Rust book)

Rust Influences (2/2)

(from the Rust book)

Rust's focus on reliability (1/3)

Memory Safety

Note: memory safety is overridable using unsafe{} blocks

(*) mostly

Rust's focus on reliability (2/3)

Clarity

(*) mostly

Rust's focus on reliability (3/3)

Miscellaneous

Part 2. Language basics

Built-in types & literals

let var : i32 = 4567i32;    // signed int  (32-bit)
let var : u8  = 74;         // unsigned int (8-bit)
let var : usize = 1;        // unsigned int (pointer-sized)

let var : f32 = 3.14;       // float (single precision)
let var : f64 = 3.14;       // float (double precision)

let var : bool = true;      // boolean

let var : () = ();          // unit (no value)

let var : char = 'é';       // character (unicode)
let var : u8  = b'A';       // character (8-bit)

let var : &str  =  "ñéíó";             // unicode string slice
let var : &[u8] = b"Hello";            // slice of 8-bit unsigned ints
let var : (i32, bool) = (78, true);    // tuple 
let var : [u8;6]      = [1,1,2,3,5,8]; // fixed-size array
let var = 8i32..11;                    // half-open range -> [8 9 10]
let var = 8i32..=11;                   // closed range    -> [8 9 10 11]

Struct types

// struct
struct MyStruct {
    pub a: u32,     // public field
    pub b: bool,    // public field
    c: char,        // private field
}

// tuple struct
struct MyTuple(char, i32, bool);



let s : MyStruct = MyStruct{a: 27, c: 'ñ', b: true};
assert_eq!(s.c, 'ñ');

let t : MyTuple  = MyTuple('é', 0, false);
assert_eq!(t.0, 'é');

Algebraic types (enums)

// definition
#[derive(Debug, PartialEq)]         // required by assert_eq!/assert_ne! below
enum Message {
    Quit,                           // variant with no data
    Move { x: i32, y: i32 },        // containing a struct
    Write(String),                  // containing a single data
    ChangeColor(i32, i32, i32),     // containing a tuple
}



let msg1 : Message = Message::Quit;
let msg2 : Message = Message::Move{x: 47, y:78};
let msg3 : Message = Message::ChangeColor(0, 0, 0);

assert_eq!(msg1, Message::Quit);
assert_ne!(msg2, Message::Quit);

Control flow

if condition {
    ...
} else {
    ...
}

while condition {
    ...
}

for var in iterator {
    ...
}

loop {      // infinite loop
    ...
}

Pattern matching

let val : i32 = foo();
match val {
    0           => { println!("val is 0");                      },
    4 | 7 | 8   => { println!("val is 4, 7 or 8");              },
    v if v > 0  => { println!("positive value: val == {}", v);  },
    _           => { println!("other value");                   },
}


// deconstruct enums
enum MyEnum { None, Int(i32), Bool(bool) }

let val : MyEnum = foo();
match val {
    MyEnum::None    => { println!("no value");              },
    MyEnum::Bool(b) => { println!("boolean value: {}", b);  },
    MyEnum::Int(i)  => { println!("integer value: {}", i);  },
}

References

let mut var : i32 = 12;

{
    let im : &i32 = &var;         // immutable reference

    assert_eq!(*im, 12);          // dereferencing with `*`
    assert!((*im).is_positive()); // idem
    assert!(im.is_positive());    // automatic deref on method call
}

{
    let mu : &mut i32 = &mut var; // mutable reference

    *mu = 4;                      // dereferencing with `*`
}
assert_eq!(var, 4);

Pointers (memory-unsafe!)

let mut var : i32 = 12;

let ip = &var     as *const i32;   // immutable pointer
let mp = &mut var as *mut   i32;   // mutable pointer

println!("ip={:?} mp={:?}", ip, mp); // => "ip=0x7ffd2ea0954c mp=0x...

// dereferencing a pointer is forbidden...
//
*ip;   // error[E0133]: dereference of raw pointer

// ...except within an `unsafe` block
unsafe {
    assert_eq!(*ip, 12);
    *mp = 45;
}
assert_eq!(var, 45);

Purposes: interfacing with foreign code & implementing optimisations

Functions

// arguments passing
fn foo(by_value: i32,  by_ref: &i32,  by_mutable_ref: &mut i32)
{
}

// return value
fn bar() -> i32
{
    27
}
fn baz() -> i32
{
    return 13;
    ...
}
assert_eq!(bar(), 27);
assert_eq!(baz(), 13);

Note: no overloading, no var args

Closures

struct MyStruct { val: i32 }

let mut a = MyStruct{ val: 12 };
{
    // borrow closure -> var `a` is captured by reference  (Fn/FnMut)
    let mut clo = | b: i32, c: i32 | {
        a.val += b + c;
    };
    clo(4, 3);
}
assert_eq!(a.val, 19);
{
    // move closure -> var `a` is moved into the closure (captured by value)
    let clo = move | b: i32 | -> i32 {
        return a.val + b;
    };

    assert_eq!(clo(9), 28);
    a;  // error[E0382]: use of moved value: `a`
}
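The capture mode (borrow or `move`) is separate from which closure trait a function accepts. The sketch below shows the three closure traits used as bounds. The helper names `call_fn`, `call_fn_mut` and `call_fn_once` are hypothetical and only serve the illustration.

```rust
// Hypothetical helpers: each one accepts a different closure trait.

// `Fn`: may be called many times and only reads its captures.
fn call_fn<F: Fn(i32) -> i32>(f: F) -> i32 {
    f(1) + f(2)
}

// `FnMut`: may be called many times and may mutate its captures.
fn call_fn_mut<F: FnMut(i32)>(mut f: F) {
    f(1);
    f(2);
}

// `FnOnce`: may be called only once, since it may consume its captures.
fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let offset = 10;
    assert_eq!(call_fn(|x| x + offset), 23);    // (1 + 10) + (2 + 10)

    let mut total = 0;
    call_fn_mut(|x| total += x);                // mutates `total` through a borrow
    assert_eq!(total, 3);

    let name = String::from("rust");
    assert_eq!(call_fn_once(move || name), "rust"); // `name` is moved out by the call
}
```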

Macros

// macro definition (declarative macro, `macro_rules!` syntax)
macro_rules! sum {
    ()                          => {0};
    ($a:expr)                   => {$a};
    ($a:expr, $($tail:expr),+)  => { $a + sum!($($tail),+) };
}

// macro instantiation
assert_eq!(sum!(),           0);
assert_eq!(sum!(98),        98);
assert_eq!(sum!(4,7,9,8),   28);

// Note: can use either MACRO!(), MACRO![] or MACRO!{}
sum!(2, 4);
sum!{2, 4};
sum![2, 4];

Methods

User-defined types (struct/enum) may implement methods

struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // associated function, i.e. no `self` (there is no formal constructor)
    fn new(x: f64, y: f64) -> Point {
        Point{x: x, y: y}
    }
    fn distance(&self) -> f64 {
        (self.x*self.x + self.y*self.y).sqrt()
    }
    fn translate(&mut self, x: f64, y: f64) {
        self.x += x;
        self.y += y;
    }
}
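A short usage sketch for the `Point` type above. The definition is repeated so that the example is self-contained, and the coordinates are chosen so that the distances are exact.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // associated function used as a constructor
    fn new(x: f64, y: f64) -> Point {
        Point { x, y }
    }
    // borrows the point immutably
    fn distance(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
    // borrows the point mutably
    fn translate(&mut self, x: f64, y: f64) {
        self.x += x;
        self.y += y;
    }
}

fn main() {
    let mut p = Point::new(3.0, 4.0);   // called on the type
    assert_eq!(p.distance(), 5.0);      // called on a value (`&self`)

    p.translate(3.0, 4.0);              // requires `p` to be declared `mut`
    assert_eq!(p.distance(), 10.0);
}
```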

Part 3. Traits

Rust traits

Rust does not provide classes

Rust's type system provides ad-hoc polymorphism using traits (similar to Haskell's type classes)

 

Definition

A trait defines a list of methods and/or type definitions

Methods may have a default implementation

Traits are stateless (no attributes)

 

Implementation

A trait may be implemented for any type (including built-in types), provided that either the trait or the type is defined in the current crate

Basic example

trait Description {                     // definition
    fn description(&self) -> &str;
}

impl Description for i32 {              // implementation for i32
    fn description(&self) -> &str {
        "this is an integer"
    }
}

impl Description for bool {             // implementation for bool
    fn description(&self) -> &str {
        "this is a boolean"
    }
}

// usage
println!("{}", 1024i32.description());  // -> this is an integer
println!("{}", false.description());    // -> this is a boolean
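As noted in the definition, trait methods may have a default implementation. The sketch below extends the `Description` trait with a hypothetical provided method `shout` (not part of the original example). `i32` inherits the default while `bool` overrides it.

```rust
trait Description {
    // required method
    fn description(&self) -> &str;

    // provided method (hypothetical): default implementation built on the required one
    fn shout(&self) -> String {
        self.description().to_uppercase()
    }
}

impl Description for i32 {
    fn description(&self) -> &str {
        "this is an integer"
    }
    // `shout` is inherited from the trait
}

impl Description for bool {
    fn description(&self) -> &str {
        "this is a boolean"
    }
    // the default implementation is overridden
    fn shout(&self) -> String {
        String::from("BOOLEAN!")
    }
}

fn main() {
    println!("{}", 1024i32.shout());    // -> THIS IS AN INTEGER
    println!("{}", false.shout());      // -> BOOLEAN!
}
```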

Example: std::iter::Iterator

trait Iterator {
    ////////////// required methods & types //////////////
    // (*must* be implemented by types implementing the trait)
    type Item;   // type  of values produced by this iterator

    fn next(&mut self) -> Option<Self::Item>;  // return next value

    ////////////// provided methods & types //////////////
    // (*may* be overridden by types implementing the trait)

    // return the nth element
    fn nth(&mut self, mut n: usize) -> Option<Self::Item> {
        for x in self {
            if n == 0 { return Some(x) }
            n -= 1;
        }
        None
    }
    ...
}

Example: Iterator implementation

impl str {
    fn chars(&self) -> Chars<'_> {
        ...
    }
    ...
}


// the iterator borrows the string it walks over
struct Chars<'a> { ... }

impl<'a> Iterator for Chars<'a> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        ...
    }
}
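The same pattern applies to user-defined types. Here is a minimal sketch with a hypothetical `Countdown` iterator: only `next` is written by hand, and provided methods such as `nth` and `collect` come with the trait.

```rust
// Hypothetical iterator yielding `remaining`, `remaining - 1`, ..., 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    // the only required method
    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining + 1)
        }
    }
}

fn main() {
    // usable in a `for` loop
    for v in (Countdown { remaining: 2 }) {
        println!("{}", v);
    }

    // provided methods are available without extra code
    let values: Vec<u32> = Countdown { remaining: 3 }.collect();
    assert_eq!(values, vec![3, 2, 1]);
    assert_eq!(Countdown { remaining: 5 }.nth(1), Some(4));
}
```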

std traits

std traits: formatting

// default format {}
pub trait Display {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error>;
}

// debug format {:?}
pub trait Debug {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error>;
}

// `str` implements `Display` and `Debug`
let s = "Hello World!";
println!("Display: {}",   s); // -> Display: Hello World!
println!("Debug:   {:?}", s); // -> Debug:   "Hello World!"

// tuples implement only `Debug`
let t = (true, "hello");
println!("Display: {}", t); // error[E0277]: Display is not implemented
println!("Debug: {:?}", t); // -> Debug: (true, "hello")
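For user-defined types, `Debug` can be derived while `Display` has to be written by hand. A minimal sketch with a hypothetical `Point` type:

```rust
use std::fmt;

#[derive(Debug)]                // `{:?}` support is generated by the compiler
struct Point {
    x: i32,
    y: i32,
}

// `{}` support must be implemented manually
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("Display: {}", p);     // -> Display: (1, 2)
    println!("Debug:   {:?}", p);   // -> Debug:   Point { x: 1, y: 2 }
}
```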

std traits: destructor

pub trait Drop {
    fn drop(&mut self);
}

.drop() is called deterministically when the object goes out of scope

RAII (Resource Acquisition Is Initialisation)

Rust does not provide try { ... } finally { ... } constructs ➔ must use the RAII design pattern instead

let mutex = std::sync::Mutex::new(());

{
    let guard = mutex.lock().unwrap();
    // mutex is locked

} // guard goes out of scope -> guard.drop() is called

// mutex is unlocked
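The same pattern can be applied to user-defined resources. The sketch below uses a hypothetical `Guard` type that records its acquisition and release in a shared log. The log is wrapped in a `RefCell` so that both guards can write to it. It also shows that values are dropped in reverse order of declaration.

```rust
use std::cell::RefCell;

// Hypothetical guard: logs when it is created and when it is dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Guard<'a> {
    fn new(name: &'static str, log: &'a RefCell<Vec<String>>) -> Self {
        log.borrow_mut().push(format!("acquire {}", name));
        Guard { name, log }
    }
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard::new("a", &log);
        let _b = Guard::new("b", &log);
        log.borrow_mut().push(String::from("work"));
    } // `_b` is dropped first, then `_a`
    log.into_inner()
}

fn main() {
    for line in run() {
        println!("{}", line);
    }
}
```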

std traits: conversions

// implemented in the destination type (preferred)
pub trait From<T> {
    fn from(value: T) -> Self;
}

// implemented in the source type
pub trait Into<T> {
    fn into(self) -> T;
}
// default `Into` implementation (uses From<T>)
impl<T, U> Into<U> for T where U: From<T>
{
    fn into(self) -> U {
        U::from(self)
    }
}

// example: fill a uint8 vector with a utf8 string
// (this works because `Vec<u8>` implements `From<&str>`)
let vec : Vec<u8> = "hello ñéíó".into();
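Implementing `From` on a user-defined type gives the matching `Into` for free through the blanket implementation above. A minimal sketch with two hypothetical temperature types:

```rust
// Hypothetical newtypes, for illustration only.
struct Celsius(f64);
struct Fahrenheit(f64);

// implemented on the destination type
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

fn main() {
    // explicit call to `from`
    let boiling = Fahrenheit::from(Celsius(100.0));
    assert_eq!(boiling.0, 212.0);

    // `into` is provided automatically; the target type must be known
    let freezing: Fahrenheit = Celsius(0.0).into();
    assert_eq!(freezing.0, 32.0);
}
```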

derivable traits (Don't Repeat Yourself!)

#[derive(Debug, Eq, PartialEq, Ord, PartialOrd, Hash, Default)]
struct S {
    a: i32,
    b: bool,
}
let s = S{a: 4, b: true};

println!("{:?}", s);                        // Debug
// -> prints 'S { a: 4, b: true }'

assert!(s            == S{a: 4, b: true});  // Eq, PartialEq
assert!(S::default() == S{a: 0, b: false}); // Default, Eq, PartialEq
assert!(s > S{a: 2, b: true});              // Ord, PartialOrd
let hash = std::collections::HashMap::<S, String>::new(); // Hash, Eq, PartialEq

Trait bounds (on functions)

Generic functions/types/traits can set constraints (trait bounds) on their parameters

// verbose syntax
fn describe<T>(item: &T)
    where T: Description
{
    println!("{}", item.description())
}
// compact syntax
fn describe<T: Description>(item: &T) {
    println!("{}", item.description())
}

// a generic may set multiple bounds
fn describe<T>(item: &T)
    where T: Description + Display
{
    println!("{}: {}", item.description(), item);
}

Trait bounds (on types, impl and traits)

// hash map provided by the rust std lib
pub struct HashMap<K, V, S = RandomState> {
    ...
}

impl<K: Hash + Eq, V> HashMap<K, V, RandomState>
{
    pub fn insert(&mut self, k: K, v: V) -> Option<V> {
        ...
    }
}


// a type that implements the PartialOrd trait (ordering operators: <
// <= > >=) must also implement PartialEq (equality operator: == !=)
pub trait PartialOrd<T = Self>: PartialEq<T>
{
    ...
}

Monomorphization vs. dynamic dispatch

Rust supports both monomorphization (like C++ templates) and dynamic dispatch (usual polymorphism)

Monomorphization is the preferred method

// monomorphization
// (function is compiled multiple times, once for each type)
fn mono<T: Trait>(param: &T)
{
    ...
}

// dynamic dispatch (based on a virtual table)
fn poly(param: &dyn Trait)
{
    ...
}
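A runnable sketch of both dispatch modes, using a hypothetical `Shape` trait. One case where dynamic dispatch is needed is a collection that mixes several concrete types.

```rust
// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}
impl Shape for Rect {
    fn area(&self) -> f64 { self.0 * self.1 }
}

// monomorphization: one copy of the function is compiled per concrete `T`
fn area_mono<T: Shape>(shape: &T) -> f64 {
    shape.area()
}

// dynamic dispatch: a single function, the method is looked up in a vtable
fn area_dyn(shape: &dyn Shape) -> f64 {
    shape.area()
}

// a heterogeneous collection has to store trait objects
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    assert_eq!(area_mono(&Square(2.0)), 4.0);
    assert_eq!(area_dyn(&Rect(2.0, 3.0)), 6.0);

    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    assert_eq!(total_area(&shapes), 10.0);
}
```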

Part 4. Memory management

Stack vs. Heap

Stack allocation is used for:

let i : u32 = 18;

 

Heap allocation is used in:

let b : Box<u32> = Box::new(18);
assert_eq!(*b, 18);

Moveability, Copy and Clone

clone

pub trait Clone {
    fn clone(&self) -> Self;
}

copy vs. move


#[derive(Clone, Copy)]
struct C{val: i32}      // C is copyable

#[derive(Clone)]
struct M{val: i32}      // M is moveable

// copyable values
let mut c1 = C{val: 5};
let c2 = c1;            // this is a copy
c1.val = 7;
assert_eq!(c1.val, 7);  // c1 and c2 are two instances of C
assert_eq!(c2.val, 5);  // with different values

// moveable values
let m1 = M{val: 5};
let m2 = m1;            // this is a move
assert_eq!(m2.val, 5);

m1.val;   // error[E0382]: use of moved value: `m1.val`

moveable really means move in memory

struct A(u32);

impl A {
    fn print_addr(&self)    { println!("addr {:p}", self); }
}

impl Drop for A {
    fn drop(&mut self)      { println!("drop {:p}", self); }
}

{
    let x = A(11);  x.print_addr();     // -> "addr 0x7ffff5884794"
    let y = x;      y.print_addr();     // -> "addr 0x7ffff5884798"
    let z = y;      z.print_addr();     // -> "addr 0x7ffff588479c"
}   // z is dropped                        -> "drop 0x7ffff588479c"

move happens in function calls too

struct A(u32);

fn process(a: A)     // Note that `a` is not a reference
{
    println!("A -> {}", a.0);
}



let x = A(11);

process(x);     // function call "consumes" the value of `x`

x.0;            // error[E0382]: use of moved value: `x.0`
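When the caller needs to keep the value, two options are to pass a reference or to pass an explicit clone. A minimal sketch, assuming `A` derives `Clone` and adding a hypothetical borrowing variant `inspect`:

```rust
#[derive(Clone, Debug, PartialEq)]
struct A(u32);

// takes ownership: the argument is moved into the function
fn process(a: A) -> u32 {
    a.0 * 2
}

// hypothetical variant that only borrows its argument
fn inspect(a: &A) -> u32 {
    a.0 * 2
}

fn main() {
    let x = A(11);

    assert_eq!(inspect(&x), 22);        // borrow: `x` is still usable
    assert_eq!(process(x.clone()), 22); // the clone is moved, `x` is kept
    assert_eq!(x, A(11));

    assert_eq!(process(x), 22);         // `x` itself is moved here
    // any further use of `x` would be rejected by the compiler
}
```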

immutability is not const

struct MyStruct {
    val: i32
}

let i = MyStruct{val: 12};     // immutable variable

i.val = 0;      // error[E0594]: cannot assign to field `i.val`
                //               of immutable binding 

let mut m = i;  // consume `i` and move it to `m` which is mutable
m.val = 14;

i.val;          // error[E0382]: use of moved value: `i.val`

Rust ownership model

Ownership

Borrowing


let s = String::from("hello");

{
    // create a reference
    let r = &s;

    // attempt to move `s` to another var
    let s2 = s;     // error[E0505]: cannot move out of `s`
                    //               because it is borrowed

    // attempt to drop `s`
    drop(s);        // error[E0505]: cannot move out of `s`
                    //               because it is borrowed

    println!("{}", r);  // `r` is still used here, so `s` is borrowed up to this point
}   // reference `r` goes out of scope

// no reference -> we are now allowed to move object `s`
let s3 = s;

Mutable borrow is exclusive

let mut s = String::from("hello");
{
    let m = &mut s;     // create a mutable reference
    *m += " world";     // mutate through ref m

    let m2 = &mut s;    // error[E0499]: cannot borrow `s` as mutable
                        // more than once at a time
    s += "bad";         // error[E0499]: cannot borrow `s` as mutable
                        // more than once at a time
    let i  = &s;        // error[E0502]: cannot borrow `s` as immutable
                        // because it is also borrowed as mutable

    let m3 = &mut *m;   // reborrow reference m -> m3
    m.clear();          // error[E0499]: cannot borrow `*m` ...
    *m3 += "!";         // `m3` is still in use here

}   // references `m` and `m3` go out of scope
assert_eq!(s, "hello world!");

Immutable borrow is shared

let mut s = String::from("hello");
{
    let i1 = &s;    // create an immutable reference
    let i2 = &s;    // create another immutable reference

    // both references and the original object are readable
    assert_eq!(s,  "hello");

    // but `s` cannot be mutated while the references are still in use
    s += "bad";     // error[E0502]: cannot borrow `s` as mutable
                    // because it is also borrowed as immutable

    assert_eq!(i1, "hello");    // last use of `i1`
    assert_eq!(i2, "hello");    // last use of `i2`
}
s += " world";      // `s` can now be mutated

Returning references

Borrowing works across function calls, but the compiler may need hints

For example, this code will not compile:

// error[E0106]: missing lifetime specifier
fn longest(x: &str, y: &str) -> &str
{
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
// error[E0106]: missing lifetime specifier
fn push(acc: &mut String, value: &str) -> &mut String {
    *acc += value;
    acc
}

-> compiler is unable to infer the lifetime of the results

Lifetime annotations

// 'a is a lifetime parameter
//
// -> the returned reference borrows inputs `x` and `y`
//
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str
{
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// -> the returned reference borrows inputs `acc` only
fn push<'a, 'b>(acc: &'a mut String, value: &'b str) -> &'a mut String
{
    *acc += value;
    acc
}
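A usage sketch for the two annotated functions above (repeated here so that the example is self-contained). Because the result of `push` is tied to `acc` only, it may outlive the `value` argument.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn push<'a, 'b>(acc: &'a mut String, value: &'b str) -> &'a mut String {
    *acc += value;
    acc
}

fn main() {
    // the result of `longest` borrows both inputs
    assert_eq!(longest("borrow", "checker"), "checker");

    let mut acc = String::from("hello");
    let result;
    {
        let suffix = String::from(" world");
        result = push(&mut acc, &suffix);
    } // `suffix` is dropped here; `result` stays valid since it only borrows `acc`

    result.push('!');
    assert_eq!(acc, "hello world!");
}
```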

Lifetime elision

Lifetime parameters may be omitted in specific cases

// only one reference in the inputs
fn substr(string: &str, index: usize, len: usize) -> &str
{
    &string[index..(index+len)]
}
// in methods `&self` or `&mut self` is used by default
impl Push for String {
    fn push(&mut self, value: &str) -> &mut String {
        *self += value;
        self
    }
}
// unused input lifetimes may be elided
fn push<'a>(acc: &'a mut String, value: &str) -> &'a mut String
{
    *acc += value;  acc
}

Storing references in structured types

// member references must be annotated with a lifetime parameter
struct Slice<'a>
{
    original:    &'a str,
    index_first: usize,
    index_last:  usize,
}


let mut s = String::from("Hello World!");

let sl = Slice{ original: &s, index_first: 6, index_last: 11 };

// `s` is now borrowed by `sl`
//
s += "bad"; // error[E0502]: cannot borrow `s` as mutable because
            //               it is also borrowed as immutable

println!("{}", sl.original);    // `sl` is still in use here

Reference storage constraints (1/2)

A struct storing references cannot outlive the values it borrows (in practice, it often lives within the stack frame where the reference is created)

-> typically only usable for short-lived objects meant to inspect/manipulate another object

pub trait SliceExt {
    type Item;

    fn iter(&self) -> Iter<Self::Item>;
}

pub struct Iter<'a, T: 'a> {
    ptr: *const T,
    end: *const T,
    _marker: marker::PhantomData<&'a T>,
}

Reference storage constraints (2/2)

A struct cannot store a reference to itself

struct SubStr<'a> {
    buf:        String,
    sub_str:    &'a str,
}
let h = String::from("hello");
let s = SubStr{
    sub_str:    &h[2..4],
    buf:        h           // error[E0505]: cannot move out of `h`
};                          //               because it is borrowed

// alternate solution
struct SubStr {
    buf:        String,
    start:      usize,
    stop:       usize,
}
impl SubStr {
    fn sub_str(&self) -> &str { &self.buf[self.start..self.stop] }
}
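A short usage sketch of this alternate solution. Since the struct owns its buffer and only stores indices, it has no lifetime parameter and can be moved freely.

```rust
struct SubStr {
    buf:   String,
    start: usize,
    stop:  usize,
}

impl SubStr {
    // the slice is computed on demand and borrows `self`
    fn sub_str(&self) -> &str {
        &self.buf[self.start..self.stop]
    }
}

fn main() {
    let s = SubStr { buf: String::from("hello"), start: 2, stop: 4 };
    assert_eq!(s.sub_str(), "ll");

    // moving the struct is allowed: no reference into `buf` is stored
    let moved = s;
    assert_eq!(moved.sub_str(), "ll");
}
```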

Constraints on data layout

Purely safe types can only have a tree-like structure

Cross references (eg: doubly-linked list, graph) require:
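A minimal sketch of one way to express a back-reference in safe code, assuming reference counting from the standard library (`Rc` for owning links, `Weak` for the non-owning parent link) combined with `RefCell`. The `Node` type and the helpers `node`, `adopt` and `parent_name` are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node with a link back to its parent.
struct Node {
    name:     String,
    parent:   RefCell<Weak<Node>>,      // non-owning: avoids a reference cycle
    children: RefCell<Vec<Rc<Node>>>,   // owning links
}

fn node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name:     name.to_string(),
        parent:   RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// link `child` under `parent` in both directions
fn adopt(parent: &Rc<Node>, child: &Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

// follow the weak link; returns None if there is no (living) parent
fn parent_name(n: &Node) -> Option<String> {
    n.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let root = node("root");
    let leaf = node("leaf");
    adopt(&root, &leaf);

    assert_eq!(parent_name(&leaf), Some(String::from("root")));
    assert_eq!(parent_name(&root), None);
    assert_eq!(Rc::strong_count(&leaf), 2);     // `leaf` + `root.children`
    assert_eq!(Rc::strong_count(&root), 1);     // the weak link is not counted
}
```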

Design pattern: interior mutability

Rust's borrowing rules require shared objects to be immutable

Mutability is implementable with std primitives Cell and RefCell

// std::cell::Cell can only be mutated by value
pub struct Cell<T> { ... }
impl<T> Cell<T> {
    pub fn new(val: T) -> Self  { ...}

    ...

    pub fn set(&self, val: T)   { ... }
    pub fn get(&self) -> T      { ... }     // only when T: Copy
}

Mutability example with Cell (1/2)

// example: counter with name
struct Counter {
    pub name:   String,
    value:      Cell<u32>,
}
impl Counter {
    fn new(name: &str) -> Self {
        Counter{ name: name.into(), value: Cell::new(0) }
    }
    fn value(&self) -> u32 {
        self.value.get()
    }
    fn increment(&self) {
        self.value.set(self.value.get() + 1);
    }
}

Mutability example with Cell (2/2)

// create a Counter and store it into a smart pointer
let ctr1 : Rc<Counter> = Counter::new("foo").into();

// clone this smart pointer
let ctr2 = ctr1.clone();



assert_eq!(ctr1.value(), 0);

ctr1.increment();
assert_eq!(ctr1.value(), 1);

ctr2.increment();
assert_eq!(ctr1.value(), 2);

Mutability with RefCell

Cell has zero overhead but allows only replacing the inner value

RefCell allows taking a reference to mutate the inner value
(with a little overhead for storing the borrowed state)

pub struct RefCell<T> { ... }

impl<T> RefCell<T> {
    pub fn new(val: T) -> Self  { ... }
    ...
    pub fn borrow(&self)     -> Ref<T>      { ... }
    pub fn borrow_mut(&self) -> RefMut<T>   { ... }
}
impl<'b, T> Deref for Ref<'b, T>
{
    type Target = T;
    fn deref(&self) -> &T { ... }
}

Mutability example with RefCell

// example: counter with name
struct Counter {
    pub name:   String,
    value:      RefCell<u32>,
}
impl Counter {
    fn new(name: &str) -> Self {
        Counter{ name: name.into(), value: RefCell::new(0) }
    }
    fn value(&self) -> u32 {
        *self.value.borrow()
    }
    fn increment(&self) {
        *self.value.borrow_mut() += 1;
    }
}

RefCell usage

let c : RefCell<i32> = 0.into();
{
    // can be immutably borrowed multiple times 
    let ref1 = c.borrow();
    let ref2 = c.borrow();

    // mutable borrow fails
    let mut1 = c.borrow_mut(); // panic!("already borrowed")
}
{
    // can be mutably borrowed only once
    let mut1 = c.borrow_mut();
    
    // other borrows fail
    let ref1 = c.borrow();      // panic!("already mutably borrowed")
    let mut2 = c.borrow_mut();  // panic!("already borrowed")
}
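When a panic is not acceptable, `RefCell` also provides the non-panicking `try_borrow` and `try_borrow_mut`, which return a `Result`. A minimal sketch with a hypothetical helper `try_increment`:

```rust
use std::cell::RefCell;

// Hypothetical helper: increments the value if no other borrow is active,
// and reports whether the increment took place.
fn try_increment(c: &RefCell<i32>) -> bool {
    match c.try_borrow_mut() {
        Ok(mut value) => {
            *value += 1;
            true
        }
        Err(_) => false,    // already borrowed: nothing is modified
    }
}

fn main() {
    let c = RefCell::new(0);

    assert!(try_increment(&c));         // no active borrow -> succeeds

    {
        let _reader = c.borrow();       // shared borrow held in this scope
        assert!(!try_increment(&c));    // mutable borrow is refused, no panic
    }

    assert_eq!(*c.borrow(), 1);
}
```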

Interior mutability with Mutex

pub struct Mutex<T> { ... }

impl<T> Mutex<T> {
    pub fn new(val: T) -> Self  { ...}
    ...
    pub fn lock(&self)     ->    LockResult<MutexGuard<T>>  { ... }
    pub fn try_lock(&self) -> TryLockResult<MutexGuard<T>>  { ... }
}

impl<'mutex, T> Deref for MutexGuard<'mutex, T>
{
    type Target = T;
    fn deref(&self) -> &T { ... }
}
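A minimal sketch of a `Mutex` shared between threads, assuming the usual pairing with `Arc` for shared ownership. The function `parallel_count` is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: `threads` threads each increment a shared counter
// `increments` times; the final value is returned.
fn parallel_count(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);     // one handle per thread
            thread::spawn(move || {
                for _ in 0..increments {
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                } // the guard is dropped at each iteration -> mutex unlocked
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(parallel_count(4, 1000), 4000);
}
```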

Fearless concurrency

Thread-safety of types is inferred by the compiler, they may have:

 

Inter-thread communication may rely on:
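A minimal sketch of message passing, assuming the standard library channels (`std::sync::mpsc`) as the communication primitive. The function `sum_of_squares` is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: one thread per value in 1..=n sends its square
// through a channel; the receiver adds everything up.
fn sum_of_squares(n: u64) -> u64 {
    let (tx, rx) = mpsc::channel();

    let workers: Vec<_> = (1..=n)
        .map(|i| {
            let tx = tx.clone();                // each thread owns its sender
            thread::spawn(move || {
                tx.send(i * i).unwrap();
            })
        })
        .collect();

    drop(tx);   // drop the original sender so that `rx.iter()` can terminate

    for worker in workers {
        worker.join().unwrap();
    }

    // iterates until every sender has been dropped
    let total: u64 = rx.iter().sum();
    total
}

fn main() {
    assert_eq!(sum_of_squares(3), 14);  // 1 + 4 + 9
}
```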

Part 5. Other nice features