An introduction to the Rust language
Anthony Baire
September 20, 2018
Licensed under Creative Commons Attribution-NonCommercial-NoDerivs 3.0 France
Introduction
Language basics
Traits
Memory management
Other nice features
fn main() { println!("Hello World!"); }
2006: personal project started by Graydon Hoare (Mozilla)
2009: Mozilla starts sponsoring Rust
2011: rustc compiles itself
2013: Servo project started (web browser engine)
15 May 2015: Rust 1.0 released (Rust 2015)
2016: Quantum project started (integration of Servo components into Firefox)
14 Nov 2017: Firefox 57 (first release using Rust code)
9 May 2018: Firefox 60 (Quantum CSS merged into Firefox)
later in 2018: Rust 2018 (first major release since 1.0)
(from the Rust book)
Memory Safety
Note: memory safety is overridable using unsafe{} blocks
(*) mostly
Clarity
panic!, unimplemented!, unreachable!
_
(*) mostly
Miscellaneous
(panic! if out of bound)
assert!() (always checked)
debug_assert!() (checked in Debug builds only)

let var : i32 = 4567i32;             // signed int (32-bit)
let var : u8 = 74;                   // unsigned int (8-bit)
let var : usize = 1;                 // unsigned int (pointer-sized)
let var : f32 = 3.14;                // float (single precision)
let var : f64 = 3.14;                // float (double precision)
let var : bool = true;               // boolean
let var : () = ();                   // unit (no value)
let var : char = 'é';                // character (unicode)
let var : u8 = b'A';                 // character (8-bit)
let var : &str = "ñéíó";             // unicode string slice
let var : &[u8] = b"Hello";          // slice of 8-bit unsigned ints
let var : (i32, bool) = (78, true);  // tuple
let var : [u8;6] = [1,1,2,3,5,8];    // fixed-size array
let var = 8i32..11;                  // half-open range -> [8 9 10]
let var = 8i32..=11;                 // closed range -> [8 9 10 11]
// struct
struct MyStruct {
    pub a: u32,   // public field
    pub b: bool,  // public field
    c: char,      // private field
}

// tuple struct
struct MyTuple(char, i32, bool);

let s : MyStruct = MyStruct{a: 27, c: 'ñ', b: true};
assert_eq!{s.c, 'ñ'};

let t : MyTuple = MyTuple('é', 0, false);
assert_eq!{t.0, 'é'};
// definition
// (Debug and PartialEq are required by assert_eq!/assert_ne!)
#[derive(Debug, PartialEq)]
enum Message {
    Quit,                        // variant with no data
    Move { x: i32, y: i32 },     // containing a struct
    Write(String),               // containing a single value
    ChangeColor(i32, i32, i32),  // containing a tuple
}

let msg1 : Message = Message::Quit;
let msg2 : Message = Message::Move{x: 47, y:78};
let msg3 : Message = Message::ChangeColor(0, 0, 0);

assert_eq!(msg1, Message::Quit);
assert_ne!(msg2, Message::Quit);
if condition {
    ...
} else {
    ...
}

while condition {
    ...
}

for var in iterator {
    ...
}

loop {  // infinite loop
    ...
}
let val : i32 = foo();
match val {
    0 => { println!("val is 0"); },
    4 | 7 | 8 => { println!("val is 4, 7 or 8"); },
    v if v > 0 => { println!("positive value: val == {}", v); },
    _ => { println!("other value"); },
}

// deconstruct enums
enum MyEnum { None, Int(i32), Bool(bool) }

let val : MyEnum = foo();
match val {
    MyEnum::None => { println!("no value"); },
    MyEnum::Bool(b) => { println!("boolean value: {}", b); },
    MyEnum::Int(i) => { println!("integer value: {}", i); },
}
let mut var : i32 = 12;
{
    let im : &i32 = &var;           // immutable reference
    assert_eq!(*im, 12);            // dereferencing with `*`
    assert!((*im).is_positive());   // idem
    assert!(im.is_positive());      // automatic deref on method call
}
{
    let mu : &mut i32 = &mut var;   // mutable reference
    *mu = 4;                        // dereferencing with `*`
}
assert_eq!(var, 4);
let mut var : i32 = 12;
let ip = &var as *const i32;      // immutable pointer
let mp = &mut var as *mut i32;    // mutable pointer
println!("ip={:?} mp={:?}", ip, mp);
// => "ip=0x7ffd2ea0954c mp=0x...

// dereferencing a pointer is forbidden...
// *ip;   // error[E0133]: dereference of raw pointer

// ...except within an `unsafe` block
unsafe {
    assert_eq!(*ip, 12);
    *mp = 45;
}
assert_eq!(var, 45);
Purposes: interfacing with foreign code & implementing optimisations
// arguments passing
fn foo(by_value: i32, by_ref: &i32, by_mutable_ref: &mut i32) {
}

// return value
fn bar() -> i32 {
    27
}
fn baz() -> i32 {
    return 13;
    ...
}
assert_eq!(bar(), 27);
assert_eq!(baz(), 13);
Note: no overloading, no var args
struct MyStruct { val: i32 }
let mut a = MyStruct{ val: 12 };
{
    // borrow closure -> var `a` is captured by reference
    // (this closure mutates `a`, so it implements FnMut)
    let mut clo = | b: i32, c: i32 | { a.val += b + c; };
    clo(4, 3);
}
assert_eq!(a.val, 19);
{
    // move closure -> var `a` is captured by value
    // (`a` is moved into the closure and is no longer usable outside)
    let clo = move | b: i32 | -> i32 { return a.val + b; };
    assert_eq!(clo(9), 28);
    a;  // error[E0382]: use of moved value: `a`
}
// macro definition (Note: Rust 2018 will introduce a new syntax)
macro_rules! sum {
    () => {0};
    ($a:expr) => {$a};
    ($a:expr, $($tail:expr),+) => { $a + sum!($($tail),+) };
}

// macro instantiation
assert_eq!(sum!(), 0);
assert_eq!(sum!(98), 98);
assert_eq!(sum!(4,7,9,8), 28);

// Note: can use either MACRO!(), MACRO![] or MACRO!{}
sum!(2, 4);
sum!{2, 4};
sum![2, 4];
User-defined types (struct/enum) may implement methods
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // static method (there is no formal constructor)
    fn new(x: f64, y: f64) -> Point {
        Point{x: x, y: y}
    }

    fn distance(&self) -> f64 {
        (self.x*self.x + self.y*self.y).sqrt()
    }

    fn translate(&mut self, x: f64, y: f64) {
        self.x += x;
        self.y += y;
    }
}
std traits
Rust does not provide classes
Rust's type system provides ad-hoc polymorphism using traits (similar to Haskell's typeclasses)
A trait defines a list of methods and/or type definitions
Methods may have a default implementation
Traits are stateless (no attributes)
Any trait may be implemented for any type (including built-in types)
trait Description {                  // definition
    fn description(&self) -> &str;
}

impl Description for i32 {           // implementation for i32
    fn description(&self) -> &str { "this is an integer" }
}

impl Description for bool {          // implementation for bool
    fn description(&self) -> &str { "this is a boolean" }
}

// usage
println!("{}", 1024i32.description());  // -> this is an integer
println!("{}", false.description());    // -> this is a boolean
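The `Description` example only has a required method. As a minimal sketch of a method with a default implementation, assuming a hypothetical trait `Shape` and hypothetical types `Square` and `Unit` (none of them come from the slides):

```rust
// Hypothetical trait, for illustration only.
trait Shape {
    // required method: every implementor must provide it
    fn area(&self) -> f64;

    // provided method: has a default implementation, may be overridden
    fn describe(&self) -> String {
        format!("shape with area {}", self.area())
    }
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    // `describe` is not redefined: the default implementation is used
}

struct Unit;

impl Shape for Unit {
    fn area(&self) -> f64 {
        1.0
    }
    // overrides the default implementation
    fn describe(&self) -> String {
        String::from("unit shape")
    }
}

fn main() {
    println!("{}", Square(2.0).describe());
    println!("{}", Unit.describe());
}
```

The trait itself stores no data: the default method can only rely on the other methods of the trait.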
std::iter::Iterator
trait Iterator {
    ////////////// required methods & types //////////////
    // (*must* be implemented by types using the trait)

    type Item;  // type of values produced by this iterator

    fn next(&mut self) -> Option<Self::Item>;  // return next value

    ////////////// provided methods & types //////////////
    // (*may* be implemented by types using the trait)

    // return the nth element
    fn nth(&mut self, mut n: usize) -> Option<Self::Item> {
        for x in self {
            if n == 0 { return Some(x) }
            n -= 1;
        }
        None
    }
    ...
}
Iterator implementation

impl str {  // `String` dereferences to `str`, so `.chars()` works on both
    fn chars(&self) -> Chars { ... }
    ...
}

struct Chars { ... }

impl Iterator for Chars {
    type Item = char;
    fn next(&mut self) -> Option<char> { ... }
}
std traits
Eq PartialEq<T>
Ord PartialOrd<T>
Add<T> Sub<T> Mul<T> Div<T> Rem<T> Neg BitAnd<T> BitOr<T> BitXor<T> Not Shl<T> Shr<T> AddAssign<T> SubAssign<T> ...
Hash
Iterator
Index<I> IndexMut<I>
Drop
Default
Error
Display Debug Binary Octal LowerHex UpperHex Pointer ...
From<T> Into<T> TryFrom<T> TryInto<T> FromStr ToString AsRef<T> AsMut<T> Deref
Copy Clone
Sync Send
Any
Sized
std traits: formatting

// default format {}
pub trait Display {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error>;
}

// debug format {:?}
pub trait Debug {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error>;
}

// `str` implements `Display` and `Debug`
let s = "Hello World!";
println!("Display: {}", s);   // -> Display: Hello World!
println!("Debug: {:?}", s);   // -> Debug: "Hello World!"

// tuples implement only `Debug`
let t = (true, "hello");
println!("Display: {}", t);   // error[E0277]: Display is not implemented
println!("Debug: {:?}", t);   // -> Debug: (true, "hello")
std traits: destructor

pub trait Drop {
    fn drop(&mut self);
}

.drop() is deterministically called when the object goes out of scope
RAII (Resource Acquisition Is Initialisation)
Rust does not provide try { ... } finally { ... } constructs ➔ must use the RAII design pattern instead
let mutex = std::sync::Mutex::new(());
{
    let guard = mutex.lock().unwrap();
    // mutex is locked
}   // guard goes out of scope -> guard.drop() is called
    // mutex is unlocked
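As a rough sketch of how RAII stands in for `finally`, the hypothetical `Guard` type below (the names `Guard`, `guarded` and `run` are assumptions made for this illustration) records a "release" event in its destructor, which also runs on an early return:

```rust
// Hypothetical guard: logs when it is acquired and when it is released.
struct Guard<'a> {
    log: &'a mut Vec<String>,
}

impl<'a> Guard<'a> {
    fn new(log: &'a mut Vec<String>) -> Guard<'a> {
        log.push("acquire".to_string());
        Guard { log }
    }

    fn work(&mut self) {
        self.log.push("work".to_string());
    }
}

impl<'a> Drop for Guard<'a> {
    // called automatically when the guard goes out of scope
    fn drop(&mut self) {
        self.log.push("release".to_string());
    }
}

fn guarded(log: &mut Vec<String>, fail: bool) -> Result<(), String> {
    let mut guard = Guard::new(log);
    if fail {
        // early exit: `guard` is still dropped here
        return Err("early exit".to_string());
    }
    guard.work();
    Ok(())
}   // normal exit: `guard` is dropped here

fn run(fail: bool) -> Vec<String> {
    let mut log = Vec::new();
    let _ = guarded(&mut log, fail);
    log
}

fn main() {
    println!("{:?}", run(false));
    println!("{:?}", run(true));
}
```

Whichever path leaves `guarded`, the release step runs exactly once, without any explicit cleanup code at the exit points.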
std traits: conversions

// implemented in the destination type (preferred)
pub trait From<T> {
    fn from(value: T) -> Self;
}

// implemented in the source type
pub trait Into<T> {
    fn into(self) -> T;
}

// default `Into` implementation (uses From<T>)
impl<T, U> Into<U> for T where U: From<T> {
    fn into(self) -> U {
        U::from(self)
    }
}

// example: fill a uint8 vector with a utf8 string
// (this works because `Vec<u8>` implements `From<&str>`)
let vec : Vec<u8> = "hello ñéíó".into();
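A small sketch of the same mechanism for user-defined types, assuming two hypothetical wrapper types `Celsius` and `Fahrenheit` (not part of the slides). Only `From` is implemented; `.into()` comes for free through the blanket implementation shown above:

```rust
// Hypothetical unit types, for illustration only.
struct Celsius(f64);
struct Fahrenheit(f64);

// Conversion implemented in the destination type
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

fn main() {
    // explicit call to `from`
    let boiling = Fahrenheit::from(Celsius(100.0));

    // `into` is provided automatically; the target type must be known
    let freezing: Fahrenheit = Celsius(0.0).into();

    println!("{} {}", boiling.0, freezing.0);
}
```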
use std::collections::HashMap;

#[derive(Debug, Eq, PartialEq, Ord, PartialOrd, Hash, Default)]
struct S {
    a: i32,
    b: bool,
}

let s = S{a: 4, b: true};

println!("{:?}", s);                         // Debug
// -> prints 'S { a: 4, b: true }'

assert!(s == S{a: 4, b: true});              // Eq, PartialEq
assert!(S::default() == S{a: 0, b: false});  // Default, Eq, PartialEq
assert!(s > S{a: 2, b: true});               // Ord, PartialOrd

let hash = HashMap::<S, String>::new();      // Hash, Eq, PartialEq
Generic functions/types/traits can set constraints (trait bounds) on their parameters
use std::fmt::Display;

// verbose syntax
fn describe<T>(item: &T) where T: Description {
    println!("{}", item.description())
}

// compact syntax
fn describe<T: Description>(item: &T) {
    println!("{}", item.description())
}

// a generic may set multiple bounds
fn describe<T>(item: &T) where T: Description + Display {
    println!("{}: {}", item.description(), item);
}
// hash map provided by the rust std lib
pub struct HashMap<K, V, S = RandomState> { ... }

impl<K: Hash + Eq, V> HashMap<K, V, RandomState> {
    pub fn insert(&mut self, k: K, v: V) -> Option<V> { ... }
}

// a type that implements the PartialOrd trait (ordering operators:
// < <= > >=) must also implement PartialEq (equality operators: == !=)
pub trait PartialOrd<T = Self>: PartialEq<T> { ... }
Rust supports both monomorphization (like C++ templates) and dynamic dispatch (usual polymorphism)
Monomorphization is the preferred method
// monomorphization
// (function is compiled multiple times, once for each type)
fn mono<T: Trait>(param: &T) { ... }

// dynamic dispatch (based on a virtual table)
// (trait objects are written `dyn Trait` since Rust 1.27)
fn poly(param: &dyn Trait) { ... }
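To make the two dispatch modes concrete, here is a minimal sketch assuming a hypothetical trait `Speak` with two hypothetical implementors `Dog` and `Cat`. Both functions call the same method; only the way the call is resolved differs:

```rust
// Hypothetical trait and types, for illustration only.
trait Speak {
    fn speak(&self) -> String;
}

struct Dog;
struct Cat;

impl Speak for Dog {
    fn speak(&self) -> String { "woof".to_string() }
}

impl Speak for Cat {
    fn speak(&self) -> String { "meow".to_string() }
}

// static dispatch: one copy of `mono` is compiled per concrete type
fn mono<T: Speak>(param: &T) -> String {
    param.speak()
}

// dynamic dispatch: a single `poly`, the method is looked up at run time
fn poly(param: &dyn Speak) -> String {
    param.speak()
}

fn main() {
    println!("{}", mono(&Dog));
    println!("{}", mono(&Cat));

    // a heterogeneous collection needs trait objects
    let animals: Vec<Box<dyn Speak>> = vec![Box::new(Dog), Box::new(Cat)];
    for animal in &animals {
        println!("{}", poly(animal.as_ref()));
    }
}
```

Dynamic dispatch is the usual choice when values of different concrete types must be stored together, as in the vector above.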
Stack allocation is used for:
local variables (let)

let i : u32 = 18;
Heap allocation is used in:
collections (Vec<T>, LinkedList<T>, HashMap<K,T>, ...)
Box<T>: unique pointer
Rc<T>: reference-counting shared pointer (non thread-safe)
Arc<T>: atomic reference-counting shared pointer (thread-safe)

let b : Box<u32> = Box::new(18);
assert_eq!(*b, 18);
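The difference between the shared pointers can be sketched as follows. The helper names `rc_counts` and `arc_sum` are assumptions made for this illustration; `Rc::strong_count`, `Rc::clone`, `Arc::clone` and `thread::spawn` are standard library calls:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Returns the reference count while two handles exist, then after one is dropped.
fn rc_counts() -> (usize, usize) {
    let r1 = Rc::new(String::from("shared"));
    let r2 = Rc::clone(&r1);          // same heap object, counter incremented
    let with_two = Rc::strong_count(&r1);
    drop(r2);                         // counter decremented
    let with_one = Rc::strong_count(&r1);
    (with_two, with_one)
}   // last handle dropped -> the String is freed

// `Arc` can cross a thread boundary (an `Rc` here would not compile).
fn arc_sum() -> i32 {
    let a1 = Arc::new(vec![1, 2, 3]);
    let a2 = Arc::clone(&a1);
    let handle = thread::spawn(move || a2.iter().sum::<i32>());
    handle.join().unwrap()
}

fn main() {
    println!("{:?}", rc_counts());
    println!("{}", arc_sum());
}
```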
Copy and Clone
.clone() (Clone trait)

pub trait Clone {
    fn clone(&self) -> Self;
}

#[derive(Copy)]
only for small objects (when copying the object is a cheap operation)

#[derive(Clone, Copy)]
struct C{val: i32}   // C is copyable

#[derive(Clone)]
struct M{val: i32}   // M is moveable

// copyable values
let mut c1 = C{val: 5};
let c2 = c1;             // this is a copy
c1.val = 7;
assert_eq!(c1.val, 7);   // c1 and c2 are two instances of C
assert_eq!(c2.val, 5);   // with different values

// moveable values
let m1 = M{val: 5};
let m2 = m1;             // this is a move
assert_eq!(m2.val, 5);
m1.val;                  // error[E0382]: use of moved value: `m1.val`
struct A(u32);

impl A {
    fn print_addr(&self) {
        println!("addr {:p}", self);
    }
}

impl Drop for A {
    fn drop(&mut self) {
        println!("drop {:p}", self);
    }
}

{
    let x = A(11);
    x.print_addr();   // -> "addr 0x7ffff5884794"
    let y = x;
    y.print_addr();   // -> "addr 0x7ffff5884798"
    let z = y;
    z.print_addr();   // -> "addr 0x7ffff588479c"
}   // z is dropped -> "drop 0x7ffff588479c"
.drop() is called only once

struct A(u32);

fn process(a: A)   // Note that `a` is not a reference
{
    println!("A -> {}", a.0);
}

let x = A(11);
process(x);   // function call "consumes" the value of `x`
x.0;          // error[E0382]: use of moved value: `x.0`
const
const value can never be changed

struct MyStruct { val: i32 }

let i = MyStruct{val: 12};   // immutable variable
i.val = 0;                   // error[E0594]: cannot assign to field `i.val`
                             //               of immutable binding

let mut m = i;   // consume `i` and move it to `m` which is mutable
m.val = 14;
i.val;           // error[E0382]: use of moved value: `i.val`
let s = String::from("hello");
{
    // create a reference
    let r = &s;

    // attempt to move `s` to another var
    let s2 = s;   // error[E0505]: cannot move out of `s`
                  //               because it is borrowed

    // attempt to drop `s`
    drop(s);      // error[E0505]: cannot move out of `s`
                  //               because it is borrowed
}   // reference `r` goes out of scope

// no reference -> we are now allowed to move object `s`
let s3 = s;
let mut s = String::from("hello");
{
    let m = &mut s;     // create a mutable reference
    *m += " world";     // mutate through ref m

    let m2 = &mut s;    // error[E0499]: cannot borrow `s` as mutable
                        //               more than once at a time
    s += "bad";         // error[E0499]: cannot borrow `s` as mutable
                        //               more than once at a time
    let i = &s;         // error[E0502]: cannot borrow `s` as immutable
                        //               because it is also borrowed as mutable

    let m3 = &mut*m;    // borrow reference m -> m3
    *m3 += "!";
    m.clear();          // error[E0499]: cannot borrow `*m` ...
}   // references `m` and `m3` go out of scope
assert_eq!(s, "hello world!");
let mut s = String::from("hello");
{
    let i1 = &s;   // create an immutable reference
    let i2 = &s;   // create another immutable reference

    // both references and the original object are readable
    assert_eq!(s, "hello");
    assert_eq!(i1, "hello");
    assert_eq!(i2, "hello");

    // but cannot mutate s
    s += "bad";    // error[E0502]: cannot borrow `s` as mutable
                   //               because it is also borrowed as immutable
}
s += " world";     // `s` can now be mutated
Borrowing works across function calls, but the compiler may need hints
For example, this code will not compile:
// error[E0106]: missing lifetime specifier
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() { x } else { y }
}

// error[E0106]: missing lifetime specifier
fn push(acc: &mut String, value: &str) -> &mut String {
    *acc += value;
    acc
}
-> compiler is unable to infer the lifetime of the results
// 'a is a lifetime parameter
//
// -> the returned reference borrows inputs `x` and `y`
//
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

// -> the returned reference borrows input `acc` only
fn push<'a, 'b>(acc: &'a mut String, value: &'b str) -> &'a mut String {
    *acc += value;
    acc
}
Lifetime parameters may be omitted in specific cases
// only one reference in the inputs
fn substr(string: &str, index: usize, len: usize) -> &str {
    &string[index..(index+len)]
}

// in methods `&self` or `&mut self` is used by default
impl Push for String {
    fn push(&mut self, value: &str) -> &mut String {
        *self += value;
        self
    }
}

// unused input lifetimes may be elided
fn push<'a>(acc: &'a mut String, value: &str) -> &'a mut String {
    *acc += value;
    acc
}
// member references must be annotated with a lifetime parameter
struct Slice<'a> {
    original: &'a str,
    index_first: usize,
    index_last: usize,
}

let mut s = String::from("Hello World!");
let sl = Slice{ original: &s, index_first: 6, index_last: 11 };

// `s` is now borrowed by `sl`
// s += "bad";   // error[E0502]: cannot borrow `s` as mutable because
//               //               it is also borrowed as immutable
A struct storing references can only live within the scope (stack frame) where the reference is created
-> typically only usable for short-lived objects meant to inspect/manipulate another object
pub trait SliceExt {
    fn iter(&self) -> Iter<Self::Item>;
}

pub struct Iter<'a, T: 'a> {
    ptr: *const T,
    end: *const T,
    _marker: marker::PhantomData<&'a T>,
}
A struct cannot store a reference to itself
struct SubStr<'a> {
    buf: String,
    sub_str: &'a str,
}

let h = String::from("hello");
let s = SubStr{
    sub_str: &h[2..4],
    buf: h            // error[E0505]: cannot move out of `h`
};                    //               because it is borrowed

// alternate solution
struct SubStr {
    buf: String,
    start: usize,
    stop: usize,
}

impl SubStr {
    fn sub_str(&self) -> &str {
        &self.buf[self.start..self.stop]
    }
}
Purely safe types can only have a tree-like structure
Cross references (eg: doubly-linked list, graph) require:
std primitives: Rc<T>, Arc<T>
LinkedList<T>, ...
Rust's borrowing rules require shared objects to be immutable
Mutability is implementable with std primitives Cell and RefCell
// std::cell::Cell can only be mutated by value
pub struct Cell<T> { ... }

impl<T> Cell<T> {
    pub fn new(val: T) -> Self { ... }
    ...
    pub fn set(&self, val: T) { ... }
    pub fn get(&self) -> T { ... }
}
Cell (1/2)

// example: counter with name
struct Counter {
    pub name: String,
    value: Cell<u32>,
}

impl Counter {
    fn new(name: &str) -> Self {
        Counter{ name: name.into(), value: 0.into() }
    }

    fn value(&self) -> u32 {
        self.value.get()
    }

    fn increment(&self) {
        self.value.set(self.value.get() + 1);
    }
}
Cell (2/2)

// create a Counter and store it into a smart pointer
let ctr1 : Rc<Counter> = Counter::new("foo").into();

// clone this smart pointer
let ctr2 = ctr1.clone();

assert_eq!(ctr1.value(), 0);
ctr1.increment();
assert_eq!(ctr1.value(), 1);
ctr2.increment();
assert_eq!(ctr1.value(), 2);
Cell has zero overhead but allows only replacing the inner value
RefCell allows taking a reference to mutate the inner value (with a little overhead for storing the borrowed state)
pub struct RefCell<T> { ... }

impl<T> RefCell<T> {
    pub fn new(val: T) -> Self { ... }
    ...
    pub fn borrow(&self) -> Ref<T> { ... }
    pub fn borrow_mut(&self) -> RefMut<T> { ... }
}

impl<'b, T> Deref for Ref<'b, T> {
    type Target = T;
    fn deref(&self) -> &T { ... }
}
RefCell

// example: counter with name
struct Counter {
    pub name: String,
    value: RefCell<u32>,
}

impl Counter {
    fn new(name: &str) -> Self {
        Counter{ name: name.into(), value: 0.into() }
    }

    fn value(&self) -> u32 {
        *self.value.borrow()
    }

    fn increment(&self) {
        *self.value.borrow_mut() += 1;
    }
}
RefCell usage

let c : RefCell<i32> = 0.into();
{
    // can be immutably borrowed multiple times
    let ref1 = c.borrow();
    let ref2 = c.borrow();

    // mutable borrow fails
    let mut1 = c.borrow_mut();   // panic!("already borrowed")
}
{
    // can be mutably borrowed only once
    let mut1 = c.borrow_mut();

    // other borrows fail
    let ref1 = c.borrow();       // panic!("already mutably borrowed")
    let mut2 = c.borrow_mut();   // panic!("already borrowed")
}
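When a panic is not acceptable, the standard library also offers the non-panicking `try_borrow()` and `try_borrow_mut()`, which return a `Result`. A minimal sketch, where the helper name `try_increment` is an assumption made for this illustration:

```rust
use std::cell::RefCell;

// Increments the value if the cell is not currently borrowed.
// Returns false instead of panicking when the borrow fails.
fn try_increment(c: &RefCell<i32>) -> bool {
    match c.try_borrow_mut() {
        Ok(mut value) => {
            *value += 1;
            true
        }
        Err(_) => false,
    }
}

fn main() {
    let c = RefCell::new(0);
    {
        let _reader = c.borrow();        // an immutable borrow is active
        assert!(!try_increment(&c));     // mutable borrow is refused
    }   // `_reader` goes out of scope
    assert!(try_increment(&c));          // no active borrow -> succeeds
    assert_eq!(*c.borrow(), 1);
}
```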
Mutex
Cell and RefCell are not thread-safe (cannot be shared between threads)
Mutex provides thread-safe interior mutability

pub struct Mutex<T> { ... }

impl<T> Mutex<T> {
    pub fn new(val: T) -> Self { ... }
    ...
    pub fn lock(&self) -> LockResult<MutexGuard<T>> { ... }
    pub fn try_lock(&self) -> TryLockResult<MutexGuard<T>> { ... }
}

impl<'mutex, T> Deref for MutexGuard<'mutex, T> {
    type Target = T;
    fn deref(&self) -> &T { ... }
}
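A minimal sketch of a counter shared between threads, combining `Arc` (shared ownership) with `Mutex` (interior mutability). The function name `parallel_count` and its parameters are assumptions made for this illustration:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `threads` threads that each increment a shared counter
// `per_thread` times, then returns the final value.
fn parallel_count(threads: usize, per_thread: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // each thread owns its own handle on the shared counter
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                }   // guard dropped -> mutex unlocked
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

Replacing `Mutex<u32>` with `RefCell<u32>` (or `Arc` with `Rc`) makes the program fail to compile, because those types cannot be shared between threads.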
Thread-safety of types is inferred by the compiler. They may have:
the Sync trait if they are usable concurrently by multiple threads
the Send trait if they can be sent across threads
Inter-thread communication may rely on:
Arc<T>
Mutex<T>
mpsc::channel<T>
Send types may be transferred
.send(), sender cannot keep a reference on sent data
Result<T,E>, Error and ?
async + await