Understanding Rust Loops
When you come from another language to learn Rust, some things may come across the same, but many things will be very different. Writing Rust loops can have the appearance of familiarity, but behind the scenes, Rust is translating those loops to its own syntax. If you learn that style for yourself, you will find Rust loops to be far more capable and useful in your day-to-day programming.
Rust works with a form of pattern matching to account for what may possibly result from each step of your loop. This ensures that you don’t need to write as many tests; Rust gives you several guarantees about the types and boundaries of your conditions being iterated over. This helps your tests to focus on more relevant things, as well as makes your tests more sensible.
Let’s have a look at how Rust works with loops.
Revealing a For Loop on a Range
A for loop may at first look as it would in many other languages. You tell it to use a variable for each step of a numbered range and simply use that value within the loop.
    for value in 1..10 {
        println!("The value is {}", value);
    }
This will print out nine lines, each starting with “The value is” followed by a number from 1 to 9. When writing code like that, I would simply accept that the language was written to work in this way. But if you look at how many other loops are written in Rust, you’ll notice they seem very different. It’s as if they’re designed to operate in a different way.
So let’s dig into how this for loop is implemented and see why that isn’t so.
The first piece to look at is the range syntax used: 1..10. This creates an instance of the struct Range, with 1 assigned to the struct's start field and 10 to its end field.
    pub struct Range<Idx> {
        pub start: Idx,
        pub end: Idx,
    }
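To make the link between the syntax and the struct concrete, here is a small sketch that builds the same range both ways. The helper name make_ranges is made up for this illustration; Range, its public fields, and contains come from the standard library.

```rust
use std::ops::Range;

// Build the same range twice: once with the `..` syntax, once by naming the fields.
fn make_ranges() -> (Range<i32>, Range<i32>) {
    let sugar = 1..10;
    let explicit = Range { start: 1, end: 10 };
    (sugar, explicit)
}

fn main() {
    let (sugar, explicit) = make_ranges();
    // Both values are the same struct type, so they can be compared directly.
    println!("equal: {}", sugar == explicit);
    println!("start = {}, end = {}", sugar.start, sugar.end);
    // The end is exclusive: 9 is in the range, 10 is not.
    println!("contains 9: {}", sugar.contains(&9));
    println!("contains 10: {}", sugar.contains(&10));
}
```

The exclusive end is why the loop over 1..10 stops at 9.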
The for loop then calls the IntoIterator::into_iter method on that range, which takes ownership of the range and returns an iterator over its items. Iterators in Rust have one requirement: a next method, which you define when you implement the Iterator trait for your type. For Range, the implementation of Iterator looks roughly like:
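    // Excerpt from an older standard library release; `Step` and `add_usize`
    // are internal, unstable APIs, so this does not compile on its own.
    impl<A: Step> Iterator for ops::Range<A> {
        type Item = A;

        #[inline]
        fn next(&mut self) -> Option<A> {
            if self.start < self.end {
                if let Some(mut n) = self.start.add_usize(1) {
                    mem::swap(&mut n, &mut self.start);
                    Some(n)
                } else {
                    None
                }
            } else {
                None
            }
        }
    }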
This code is a little more verbose than it needs to be for this post; parts of its implementation are there to help compiler optimization.
In essence, it does a start-to-end boundary check, then tries to add 1 to our start value, if that’s possible given number type limitations (i.e., the maximum for the kind of number used has not been reached). These checks are written in a way that helps LLVM, the compiler backend, optimize the loop. If it can add 1, it uses a memory swap to update the start value.
And last, it returns either a Some(value) or a None, where the value in Some is the number start held before it was advanced. The Some and None results are what most loops we write will pattern match against; both are variants of the enum type Option<_>.
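The Some and None results can be observed by calling next by hand. The sketch below uses a shorter range, 1..4, to keep the output small; drain_manually is a hypothetical helper name, not part of any library.

```rust
// Call `next` a fixed number of times on the range 1..4 and record each result.
fn drain_manually(calls: usize) -> Vec<Option<i32>> {
    let mut range = 1..4;
    let mut seen = Vec::new();
    for _ in 0..calls {
        seen.push(range.next());
    }
    seen
}

fn main() {
    // Five calls on a three-item range: three Some values, then None twice.
    for step in drain_manually(5) {
        println!("{:?}", step);
    }
}
```

Once start has caught up with end, every further call keeps returning None.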
You can see one case of pattern matching used above with if let. When using match or let, you can perform the pattern matching known as destructuring assignment.
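    // structuring assignment
    let number = Some(1);

    // destructuring assignment
    // A bare `let Some(result) = number;` is rejected because the pattern
    // could fail to match, so let-else (Rust 1.65+) supplies the fallback.
    let Some(result) = number else {
        panic!("number was None");
    };
    println!("The result is {}", result); // 1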
Placing if just before let makes the destructuring assignment conditional: the code block runs only if the pattern matches.
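    // The annotation is needed here: nothing else tells the compiler
    // what type the Option would hold.
    let number: Option<i32> = None;
    if let Some(value) = number {
        println!("This doesn't run. You won't see this. {}", value);
    }

    let number = Some(12);
    if let Some(value) = number {
        println!("This prints! The value is {}", value);
    }
    // This prints! The value is 12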
You can add an else after an if let if you want to handle two possible paths (just as in the Iterator example above). What is perhaps more common is to use match for cases when you have two or more result paths to work with.
    let x = Some(7);
    match x {
        Some(v) => println!("A value has been produced. It's {}", v),
        None => println!("No value."),
    }
    // A value has been produced. It's 7
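For comparison, here is the same two-path decision written once with if let and else and once with match. This is only a sketch; the function names describe_if_let and describe_match are invented for the example.

```rust
// Describe an optional number using `if let` with an `else` branch.
fn describe_if_let(input: Option<i32>) -> String {
    if let Some(v) = input {
        format!("value {}", v)
    } else {
        String::from("no value")
    }
}

// The same decision with `match`; the compiler checks that every variant is handled.
fn describe_match(input: Option<i32>) -> String {
    match input {
        Some(v) => format!("value {}", v),
        None => String::from("no value"),
    }
}

fn main() {
    for input in vec![Some(7), None] {
        println!("{} / {}", describe_if_let(input), describe_match(input));
    }
}
```

With match, leaving out the None arm is a compile error, which is one reason it tends to be preferred once there are several paths.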
Now back on the subject of the for loop: we’ve covered that it creates a Range and that into_iter is used to produce an iterator, which yields an Option<_> value each time the next method is called. An Option<Thing> is either Some(Thing) or, once the end of the loop is reached, None.
In this case, the Thing in Option is the numeric type i32, which is Rust's default for integer literals when nothing else constrains the type.
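The mechanism described so far can be written out by hand. The following is a rough approximation of what the for loop does, not the compiler's exact expansion, and collect_by_hand is a name chosen only for this sketch.

```rust
// Hand-written approximation of `for value in 1..10 { out.push(value) }`.
fn collect_by_hand() -> Vec<i32> {
    let mut out = Vec::new();
    // Step 1: turn the range into an iterator (this takes ownership of it).
    let mut iterator = IntoIterator::into_iter(1..10);
    // Step 2: keep calling `next` and match on the Option it returns.
    loop {
        match iterator.next() {
            Some(value) => out.push(value),
            None => break,
        }
    }
    out
}

fn main() {
    for value in collect_by_hand() {
        println!("The value is {}", value);
    }
}
```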
Ownership
The block of code that prints each of the values does so from the destructured result of Some(value). Since into_iter consumes the range, taking ownership of it, the range will not be available after the loop. Assigning it to a variable and trying to use it later won’t work in this case.
    let range = 1..10;
    for value in range {
        println!("The value is {}", value);
    }

    println!("{:?}", range);
This produces an error:
    error[E0382]: use of moved value: `range`
     --> src/main.rs:7:20
      |
    3 |   for value in range {
      |                ----- value moved here
    ...
    7 |   println!("{:?}", range);
      |                    ^^^^^ value used here after move
      |
      = note: move occurs because `range` has type `std::ops::Range<i32>`, which does not implement the `Copy` trait
When looping over a collection, you have a few different ways to choose ownership of the items used.
iter(): iterates over &T (a borrowed reference to the item)
iter_mut(): iterates over &mut T (an editable borrowed reference to the item)
into_iter(): iterates over T (takes/consumes ownership)
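The three choices listed above can be sketched side by side on a Vec. The function names here are made up for illustration; iter, iter_mut, and into_iter are the standard methods.

```rust
// iter(): read-only borrow; the data is untouched and still usable afterward.
fn sum_borrowed(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values.iter() {
        total += *v;
    }
    total
}

// iter_mut(): mutable borrow; each item can be changed in place.
fn double_in_place(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

// into_iter(): takes ownership; the vector is consumed by the loop.
fn into_strings(values: Vec<i32>) -> Vec<String> {
    let mut out = Vec::new();
    for v in values.into_iter() {
        out.push(v.to_string());
    }
    out
}

fn main() {
    let mut values = vec![1, 2, 3];
    println!("sum = {}", sum_borrowed(&values));
    double_in_place(&mut values);
    println!("doubled = {:?}", values);
    let strings = into_strings(values);
    // `values` has been moved and cannot be used past this point.
    println!("strings = {:?}", strings);
}
```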
Here we’ll look at a while loop that uses the let destructuring feature and only borrows the items as we run through the loop, leaving the collection available afterward.
    let range = (1..4).collect::<Vec<usize>>();
    let mut range_iterator = range.iter();

    while let Some(value) = range_iterator.next() {
        println!("The value is {}", value);
    }

    println!("{:?}", range);
And this outputs:
    The value is 1
    The value is 2
    The value is 3
    [1, 2, 3]
The last line is the debug output we asked for on the last line of code, and we still had access to the vector named range because we used iter() to borrow it.
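A plain for loop can borrow in the same way if it is given a reference to the vector instead of the vector itself. A minimal sketch, where total is a hypothetical helper added only to have something to check:

```rust
// Sum a vector with a `for` loop over a reference.
// Looping over `&Vec<usize>` yields `&usize` items, just like `values.iter()`.
fn total(values: &Vec<usize>) -> usize {
    let mut sum = 0;
    for value in values {
        sum += *value;
    }
    sum
}

fn main() {
    let range = (1..4).collect::<Vec<usize>>();
    for value in &range {
        println!("The value is {}", value);
    }
    // Still available: the loop only borrowed the vector.
    println!("{:?} sums to {}", range, total(&range));
}
```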
Typical Loops in Rust
One very common use of loops in Rust is the loop/match combination. It may look something like this:
    let values = vec![1, 2, 3, 4, 5];
    let mut iterator = values.into_iter();

    loop {
        match iterator.next() {
            Some(number) => print!("{}", number),
            None => break,
        }
    }
    // 12345
Rust’s type system will make sure the match accounts for all the possible outputs that the next method produces from your iterator. When the iterator reaches the end of the collection, next produces a None, and the match runs the arm that breaks out of the loop.
match can handle more advanced pattern matching.
    // Starts at 2 so that the catch-all arm only ever sees primes.
    let values = vec![2, 3, 4, 5, 6, 7];
    let mut iterator = values.into_iter();

    loop {
        match iterator.next() {
            Some(4) | Some(6) => { println!("Not prime!"); },
            Some(number) => { println!("Prime! {}", number); },
            None => { break },
        }
    }
And with guards:
    // Starts at 2 so that the catch-all arm only ever sees primes.
    let values = vec![2, 3, 4, 5, 6, 7];
    let mut iterator = values.into_iter();

    loop {
        match iterator.next() {
            Some(number) if number == 4 || number == 6 => {
                println!("Not prime {}", number);
            },
            Some(number) => { println!("Prime! {}", number); },
            None => { break },
        }
    }
And match also has bindings:
    let values = 1..8;

    for value in values {
        match value {
            // `..=` is the inclusive range pattern; the older `...` form is no longer accepted.
            num @ 1..=3 => println!("Lower Range: {}", num),
            num @ 4..=6 => println!("Upper Range: {}", num),
            _ => println!("Not in range."),
        }
    }
And the above outputs:
    Lower Range: 1
    Lower Range: 2
    Lower Range: 3
    Upper Range: 4
    Upper Range: 5
    Upper Range: 6
    Not in range.
There are many methods on the Iterator trait that can help you do many of the things you want to do while iterating over a collection [link]. With them, you can daisy-chain methods like so:
    // `contents` holds the file text; `ResumeKey` is a type from my app.
    let mut db: Vec<ResumeKey> = vec![];
    contents
        .split("\n\n")
        .map(|s| s.to_string())
        .filter(|s| !s.is_empty())
        .for_each(|item| {
            let rk = ResumeKey::try_from(item);
            if let Ok(key) = rk {
                db.push(key);
            }
        });
This is some code from my app I used for processing text from a resume key text file. The contents are the text from the file, and entries are separated by blank lines (double newlines). From there, I just use try_from to generate a ResumeKey from each section in the file. If the result pattern matches as an Ok() (a variant of Result), then I go ahead and push it into my dataset of resume keys.
As you get more familiar with Rust, it’s likely you’ll use the daisy-chained style, as it often comes across as more readable. Even so, the same pattern matching system will still happen behind the scenes where it is applicable.
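Since ResumeKey is specific to that app, here is a self-contained sketch of the same chaining style using only standard library pieces. Entry, parse_entry, parse_all, and the "name: score" text format are all assumptions invented for this example; filter_map and collect stand in for the for_each and push used above.

```rust
// Hypothetical record type standing in for the app-specific `ResumeKey`.
#[derive(Debug, PartialEq)]
struct Entry {
    name: String,
    score: u32,
}

// Parse one "name: score" block; any malformed block yields None.
fn parse_entry(block: &str) -> Option<Entry> {
    let (name, score) = block.split_once(':')?;
    let score = score.trim().parse::<u32>().ok()?;
    Some(Entry { name: name.trim().to_string(), score })
}

// Split on blank lines, drop empty blocks, and keep only the blocks that parse.
fn parse_all(contents: &str) -> Vec<Entry> {
    contents
        .split("\n\n")
        .filter(|block| !block.trim().is_empty())
        .filter_map(parse_entry)
        .collect()
}

fn main() {
    let contents = "rust: 5\n\nnot an entry\n\n\n\ngo: 3";
    for entry in parse_all(contents) {
        println!("{:?}", entry);
    }
}
```

Collecting straight into a Vec avoids the mutable db variable, though pushing inside for_each works just as well.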
Creating an Iterator
To create an iterator, we need to implement the trait Iterator with the method next(&mut self) for our object. It needs to be &mut self because the iterator has to update its own state, so that the next time next is called, it continues from where the previous call left off. The template for Iterator is, in essence, as follows.
    pub trait Iterator {
        type Item;
        fn next(&mut self) -> Option<Self::Item>;
    }
The type keyword used here declares an associated type: a placeholder for the type of item we’ll be iterating over. Since this is just a trait template, the concrete type hasn’t been set yet, only the name Item, which each implementation fills in.
    struct Pairs {
        pairs: Vec<(usize, usize)>,
    }

    impl Iterator for Pairs {
        type Item = (usize, usize);

        fn next(&mut self) -> Option<Self::Item> {
            self.pairs.pop()
        }
    }

    fn main() {
        let set = Pairs { pairs: vec![(1, 2), (2, 3), (4, 5)] };

        for pair in set {
            println!("Pair: {:?}", pair);
        }
    }
The above will output the pairs in reverse order, because pop removes items from the end of the vector:
    Pair: (4, 5)
    Pair: (2, 3)
    Pair: (1, 2)
When we implement the Iterator trait for our type, we automatically get the into_iter method, which lets it work in for loops. Since into_iter takes ownership of the value, a next that removes items from it, as pop does here, is an appropriate implementation.
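Implementing next also unlocks the provided methods of the Iterator trait, such as filter, map, and collect. A minimal sketch with a hypothetical Countdown type (not from the standard library) shows this:

```rust
// A hypothetical counter that yields n, n-1, ..., 1 and then stops.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

// Only `next` was written by hand; `filter`, `map`, and `collect` come from the trait.
fn even_squares(from: u32) -> Vec<u32> {
    Countdown { remaining: from }
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .collect()
}

fn main() {
    let countdown = Countdown { remaining: 3 };
    for n in countdown {
        println!("{}", n);
    }
    println!("{:?}", even_squares(6));
}
```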
If you would like to learn how to implement the iter and iter_mut versions as well, I highly recommend looking at the well-documented source code for vec_deque. In there, you will find that the structs Iter, IterMut, and IntoIter all have the trait Iterator implemented for them for VecDeque, each with its own behavior in the next function.
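As a much simpler illustration than VecDeque's real implementation, a borrowing iter for a Pairs type like the one shown earlier might be sketched as follows. PairsIter and its fields are hypothetical names; the idea is only that a separate struct holds a reference to the data plus a position.

```rust
struct Pairs {
    pairs: Vec<(usize, usize)>,
}

// Hypothetical borrowing iterator: a reference to the data plus the current position.
struct PairsIter<'a> {
    source: &'a Pairs,
    index: usize,
}

impl Pairs {
    // Named `iter` by convention; it borrows `self` instead of consuming it.
    fn iter(&self) -> PairsIter<'_> {
        PairsIter { source: self, index: 0 }
    }
}

impl<'a> Iterator for PairsIter<'a> {
    type Item = &'a (usize, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // `get` returns None once the index runs past the end.
        let item = self.source.pairs.get(self.index)?;
        self.index += 1;
        Some(item)
    }
}

fn main() {
    let set = Pairs { pairs: vec![(1, 2), (2, 3), (4, 5)] };
    for pair in set.iter() {
        println!("Pair: {:?}", pair);
    }
    // `set` was only borrowed, so it can be iterated again.
    println!("Count: {}", set.iter().count());
}
```

Because this version walks forward by index instead of popping, the pairs come out in their original order and the vector is left intact.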
Summary
Rust has a fantastic system that manages your types and ownership very well, and this shines in loops as well. When you otherwise would worry about types and bounds with other languages, Rust takes a load off your mind with compile time checks and points you in the right direction.
I have found Rust removes most of the need for TDD when working with Rust-to-Rust code. Where TDD and testing really come into play is any time you work with something outside the ecosystem. Then all the normal practice guidelines apply.
Rust will spoil you with its helpful compile-time error messages, and your loops will scarcely ever run into the typical issues found in other languages.
Published on Web Code Geeks with permission by Daniel P. Clark, partner at our WCG program. See the original article here: Understanding Rust Loops. Opinions expressed by Web Code Geeks contributors are their own.