The Simple Yet Powerful Ruby Enumerable Module
Here is a simple requirement: Find the positive numbers in an array. There are most likely hundreds of solutions, but here is one of them:
    positives = []
    for i in [-5, 10, 0, 15, -2]
      positives << i if i.positive?
    end
    p positives # => [10, 15]
It works and it’s simple, but it feels clunky (and very un-Ruby-like, if you ask me), especially when you have to write code like this over and over. When I first came to Ruby, I was used to writing code this way. But when I discovered I could express the same logic with the following code, I never wanted to go back:
    p [-5, 10, 0, 15, -2].select { |i| i.positive? } # => [10, 15]

    # which can be shortened even further to:
    p [-5, 10, 0, 15, -2].select(&:positive?) # => [10, 15]
I was hooked on the language. It felt like I was transforming my data from its initial state to its end state in one fluid motion. There were no temporary variables or if statements clouding the readability of my code.
It turns out that many of the methods I found the most useful for transforming and filtering data came from one place: the Enumerable module. It’s one of the things that brought a smile to my face as I was first learning Ruby. It felt like such a breeze to be able to filter and mold my data into whatever structure I wanted.
In this article, we’re going back to the basics by taking a deeper look at this module. We’ll do so by recreating parts of it ourselves. We are going to recreate the map and reduce methods, talk about some functional programming concepts, and touch on what the Enumerator class is and how it relates to the Enumerable module.
First Steps: Enumerable Basics
This is most likely review for everyone here, so we won’t spend much time on it. When we call the map method, Ruby will iterate over some piece of data (an Array for example), passing each value to a function which will transform it in some way. We’ll end up with a new Array comprised of all the transformed values. Our old data is left alone in the process; the result is something new.
Let’s say we have a Rails app, and we want the names of some of our users (pretend you don’t know about the pluck method):

    names = User.limit(3).map { |user| user.name }
    # => ["name1", "name2", "name3"]
Even though Ruby is an object-oriented programming language, this style comes from the functional programming camp. map is what is known as a higher-order function: a function (map in this case) that takes another function as an argument (the block which transforms the data). Ruby adds a bit of a twist in that the function is called a block and comes after the regular argument list, but this is mostly just syntax. The idea is the same.
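To make the higher-order function idea a little more concrete, here is a small sketch that hands the same transformation to map in two forms: as a block, and as a lambda stored in a constant. The NAMES array and the SHOUT lambda are made up for illustration.

```ruby
# Hypothetical data and helper, used only for illustration.
NAMES = ["ada", "grace", "linus"].freeze

# A lambda is a function stored in a variable (here, a constant).
SHOUT = ->(name) { name.upcase + "!" }

# 1. Pass the function as a block.
with_block = NAMES.map { |name| name.upcase + "!" }

# 2. Pass an existing lambda; the & turns it into the block argument.
with_lambda = NAMES.map(&SHOUT)

p with_block   # => ["ADA!", "GRACE!", "LINUS!"]
p with_lambda  # => ["ADA!", "GRACE!", "LINUS!"]
p NAMES        # => ["ada", "grace", "linus"] (the original array is untouched)
```

Either way, map itself never needs to know what the transformation does; it only calls whatever function it was given once per value.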
Here’s another method that’s included in the Enumerable module: reduce. In this case, we’ll take an array and join the values together with a comma. Yes, there is a join method for this, but the point is to show that join could have been implemented with reduce underneath. The idea with reduce is that you have a collection of values and you want to combine them in some way to produce some other value, which in this case will be a String.

    ["First", "Middle", "Last"].reduce { |str, val| "#{str}, #{val}" }
    # => "First, Middle, Last"
The Enumerable module includes all kinds of methods, such as all? (telling you whether something is true of every value), find (returning the first value that passes a test you provide), and partition (which splits your values into two groups based on some condition). I highly recommend reading through all of the methods available; other than understanding how objects work in Ruby, the Enumerable methods are probably the next most important area to feel comfortable with.
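As a quick sketch of the three methods just mentioned, here they are applied to the array of numbers from the introduction. The NUMBERS constant is only a name chosen for this example.

```ruby
# The same numbers used at the start of the article.
NUMBERS = [-5, 10, 0, 15, -2].freeze

# all? is true only if the block is truthy for every value.
p NUMBERS.all? { |n| n.positive? }  # => false

# find returns the first value for which the block is truthy.
p NUMBERS.find { |n| n.positive? }  # => 10

# partition returns two arrays: values that passed, then values that did not.
positives, others = NUMBERS.partition { |n| n.positive? }
p positives  # => [10, 15]
p others     # => [-5, 0, -2]
```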
There are generally two ways to call an Enumerable method:
- By passing a block like we’ve seen above
- By passing a symbol with an & character before it, otherwise known as Symbol#to_proc
The second form is useful when all you’re going to do is call a single method on each of the values. The one caveat is that you can’t pass any arguments to the method the symbol names.
    [-1,0,1].select { |num| num.positive? } # => [1]
    [-1,0,1].select(&:positive?) # => [1]
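The caveat about arguments is easier to see side by side. In this sketch (the WORDS constant is hypothetical), upcase takes no arguments so the symbol form works, while start_with? needs one, so we fall back to a block.

```ruby
WORDS = ["apple", "kiwi", "banana"].freeze

# &:upcase works because upcase needs no arguments.
p WORDS.map(&:upcase)  # => ["APPLE", "KIWI", "BANANA"]

# Calling to_proc by hand shows what the & is doing for us:
# the resulting proc calls the named method on whatever it receives.
p :upcase.to_proc.call("kiwi")  # => "KIWI"

# start_with? needs an argument, so there is nowhere to put it in the
# symbol form. A regular block is required here.
p WORDS.select { |word| word.start_with?("b") }  # => ["banana"]
```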
How to Make Something Enumerable
It isn’t just Array and Hash which include functionality found in the Enumerable module; it can be any class you want. There are two basic steps to give your class all of this functionality:
- Include the line include Enumerable inside of your class.
- Implement an each method which yields each value to a block of code.
Here’s an example with a class called Meal. A meal is made up of three servings: an appetizer, an entree, and a dessert.

    class Meal
      include Enumerable

      attr_accessor :appetizer, :entree, :dessert

      def initialize(appetizer, entree, dessert)
        self.appetizer = appetizer
        self.entree = entree
        self.dessert = dessert
      end

      def each
        yield appetizer
        yield entree
        yield dessert
      end
    end

    class Serving
      attr_accessor :name, :ingredients

      def initialize(name, ingredients)
        self.name = name
        self.ingredients = ingredients
      end

      def <=>(other)
        ingredients.size <=> other.ingredients.size
      end
    end

    meal = Meal.new(
      Serving.new("bruschetta", ["bread", "tomatoes", "basil"]),
      Serving.new("lasagna", ["ground beef", "tomatoes", "cheese", "pasta"]),
      Serving.new("cookie", ["flour", "sugar", "butter"])
    )
Now that our Meal class has included the Enumerable module and we have implemented the each method, we can call any method from the Enumerable module such as map:

    p meal.map { |serving| serving.ingredients.size } # => [3, 4, 3]
If you want to use methods such as max, min, or sort, you will have to implement the <=> method on the values that are yielded to the block (here, the Serving class); it is used to compare two values. In this case, I chose to compare based on the number of ingredients (a somewhat arbitrary decision).

    p meal.max
    # => #<Serving:0x007fefe10eb1d0 @name="lasagna", @ingredients=["ground beef", "tomatoes", "cheese", "pasta"]>
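Here is a compact, self-contained sketch of the same idea. Dish is a hypothetical stand-in for Serving, built with Struct to keep the example short. It also shows that max_by and sort_by are an alternative when you would rather supply the comparison key at the call site than define <=> on the class.

```ruby
# Hypothetical stand-in for the Serving class, compared by ingredient count.
Dish = Struct.new(:name, :ingredients) do
  def <=>(other)
    ingredients.size <=> other.ingredients.size
  end
end

DISHES = [
  Dish.new("lasagna", %w[beef tomatoes cheese pasta]),
  Dish.new("toast", %w[bread]),
  Dish.new("cookie", %w[flour sugar butter])
].freeze

# min and sort rely on the <=> defined above.
p DISHES.min.name          # => "toast"
p DISHES.sort.map(&:name)  # => ["toast", "cookie", "lasagna"]

# max_by takes its comparison key from the block instead, so it does not
# use Dish#<=> at all (here the key is the length of the name).
p DISHES.max_by { |dish| dish.name.length }.name  # => "lasagna"
```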
Let’s Recreate the Enumerable Module
Now that we’ve taken a look at how to use the methods found within Enumerable
, let’s attempt to make our own version of the module to see what is involved. Understanding how map
and reduce
work under the hood can help us truly understand what is happening when we use them in our code.
The first method we’ll recreate is the reduce
method. As you’ll see later, it can be used by pretty much every other method.
    module MyEnumerable
      def simple_reduce(acc)
        each do |value|
          acc = yield(acc, value)
        end
        acc
      end
    end
Next we’ll include it inside of the Array class and give it a try. Array already implements the each method so we can rely upon it.

    class Array
      include MyEnumerable
    end

    p [1,2,3,4].simple_reduce(0) { |total, num| total + num } # => 10
The way reduce works is by starting off with some initial value (in this case 0) and passing an accumulator to your block of code along with each value. It is your job to do something with the accumulator to come up with a “new” value, which is then passed in the next time the block is yielded to, and so on. Finally, once you’ve gone through all of the values, your accumulator value is returned. In this case, we’ve used the reduce method to add up all the numbers in our array.
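To watch the accumulator change from one step to the next, here is a small sketch using Ruby’s built-in reduce. The trace_sum helper is hypothetical; it just records what the block sees each time it is called.

```ruby
# Hypothetical helper: sums the numbers and records every step along the way.
def trace_sum(numbers)
  steps = []
  total = numbers.reduce(0) do |acc, num|
    steps << "#{acc} + #{num} = #{acc + num}"
    acc + num  # the block's return value becomes the next accumulator
  end
  [total, steps]
end

total, steps = trace_sum([1, 2, 3, 4])
puts steps
# 0 + 1 = 1
# 1 + 2 = 3
# 3 + 3 = 6
# 6 + 4 = 10
p total  # => 10
```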
You’ll have noticed earlier that we could call most Enumerable methods in two ways: by passing either a block of code or a symbol. We don’t have that functionality yet, so let’s implement it. We will accomplish this by using the send method to dynamically dispatch a method call.
    module MyEnumerable
      def my_reduce(acc, operator = nil, &block)
        raise ArgumentError, 'both operator and actual block given' if operator && block
        raise ArgumentError, 'either operator or block must be given' unless operator || block

        # if no block, create a lambda from the operator (symbol)
        block ||= ->(acc, value) { acc.send(operator, value) }

        each do |value|
          acc = block.call(acc, value)
        end
        acc
      end
    end

    p [1,2,3,4].my_reduce(0) { |total, num| total + num } # => 10
    p [1,2,3,4].my_reduce(0, :+) # => 10
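If send is unfamiliar, this short sketch shows it on its own. It takes a method name as a symbol, plus any arguments, and calls that method on the receiver. The apply_operator helper is hypothetical and exists only to show the operator being chosen at runtime.

```ruby
# send calls the method named by the symbol, passing along the other arguments.
p 1.send(:+, 2)         # => 3, the same as 1 + 2
p "ruby".send(:upcase)  # => "RUBY"

# Hypothetical helper: the operator is just data until send dispatches it.
def apply_operator(operator, left, right)
  left.send(operator, right)
end

p apply_operator(:+, 6, 4)  # => 10
p apply_operator(:*, 6, 4)  # => 24

# Ruby's built-in reduce accepts a symbol in the same way.
p [1, 2, 3, 4].reduce(0, :+)  # => 10
```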
Let’s implement our own map method now, relying on some of the work we did in the reduce method.

    module MyEnumerable
      def my_map(operator = nil, &block)
        raise ArgumentError, 'both operator and actual block given' if operator && block
        raise ArgumentError, 'either operator or block must be given' unless operator || block

        # if no block, create a lambda from the operator (symbol)
        block ||= ->(value) { value.send(operator) }

        my_reduce([]) do |arr, value|
          arr << block.call(value)
        end
      end
    end
We call the my_reduce method, setting its initial value to an empty Array. Then for each value, we call the block and append its result to the new array that we are building.
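The same idea can be tried without our MyEnumerable module at all. This sketch builds a map on top of Ruby’s built-in reduce; map_with_reduce is a hypothetical name, and it accepts anything that responds to reduce.

```ruby
# Hypothetical helper: map expressed in terms of the built-in reduce.
def map_with_reduce(collection)
  collection.reduce([]) do |arr, value|
    # << returns the array itself, so it becomes the next accumulator.
    arr << yield(value)
  end
end

p map_with_reduce([1, 2, 3]) { |n| n * 2 }  # => [2, 4, 6]
p map_with_reduce(1..3) { |n| n * n }       # => [1, 4, 9]
```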
We didn’t have to use the reduce method. In fact, if we look at how Ruby itself implements the map method, you’ll notice that it doesn’t. I’ve added comments to walk through what is happening. I should note that this version actually comes from the Array class and not the Enumerable module. It’s very similar but is more specialized/efficient just for arrays. It seems more straightforward, so for illustration purposes, I think it’s a better method to look at.

    static VALUE
    rb_ary_collect(VALUE ary)
    {
        // declare variables to store the loop index and a new array
        long i;
        VALUE collect;

        // a macro which will return an `Enumerator` if no block is passed
        RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);

        // create a new array with capacity for as many values as the one we are mapping
        collect = rb_ary_new2(RARRAY_LEN(ary));

        // iterate the length of the array
        for (i = 0; i < RARRAY_LEN(ary); i++) {
            // push onto the new array the value which the block returns
            rb_ary_push(collect, rb_yield(RARRAY_AREF(ary, i)));
        }

        // return our new array
        return collect;
    }
Just for fun, here are two more methods: select and reject.

    module MyEnumerable
      def my_select(&block)
        my_reduce([]) do |arr, value|
          arr << value if block.call(value)
          arr
        end
      end

      def my_reject(&block)
        my_reduce([]) do |arr, value|
          arr << value unless block.call(value)
          arr
        end
      end
    end
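One detail in these two methods is easy to miss: the bare arr on the last line of each block. The sketch below, written against Ruby’s built-in reduce with hypothetical helper names, shows what goes wrong without it. When the condition fails, the if modifier makes the block evaluate to nil, and nil then becomes the accumulator.

```ruby
# Hypothetical helper: select built on reduce, returning the accumulator each time.
def select_with_reduce(collection)
  collection.reduce([]) do |arr, value|
    arr << value if yield(value)
    arr  # always hand the accumulator back
  end
end

# Hypothetical broken variant: the accumulator is not returned.
def broken_select(collection)
  collection.reduce([]) do |arr, value|
    arr << value if yield(value)  # evaluates to nil when the test fails
  end
end

p select_with_reduce([1, 2, 3, 4]) { |n| n.even? }  # => [2, 4]

begin
  broken_select([1, 2, 3, 4]) { |n| n.even? }
rescue NoMethodError => e
  # 1 is odd, so the accumulator became nil; appending 2 to nil then fails.
  puts "broken version failed: #{e.class}"
end
# broken version failed: NoMethodError
```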
Creating a Simple Enumerable Binary Tree
Let’s create a very simple binary tree that has implemented the Enumerable module. It’ll end up looking very similar to the one that Mike Perham showed on his blog, but I wanted to add a small extra step about having it optionally return an Enumerator. For more information on enumerators, I recommend a short video by Avdi Grimm where he provides some examples.
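A binary tree consists of a series of nodes. Each Node has a value, a left branch, and a right branch. The usual convention puts smaller values on the left and larger ones on the right; note that the add method below does the opposite (larger values go left, smaller values go right), which is why the traversal output further down comes out in descending order.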
    class Node
      include Comparable
      include Enumerable

      attr_accessor :left, :value, :right

      def initialize(value)
        self.value = value
      end

      def each(&block)
        return to_enum(:each) unless block

        left.each(&block) if left
        block.call(value)
        right.each(&block) if right
      end

      def add(other)
        other = Node.new(other) unless other.is_a? Node

        if other > self
          if left
            left.add(other)
          else
            self.left = other
          end
        elsif other < self
          if right
            right.add(other)
          else
            self.right = other
          end
        end
      end

      def <=>(other)
        value <=> other.value
      end

      # to_s must return a String, so convert the value
      def to_s
        value.to_s
      end
    end
In our each method, we first check to see if a block was passed to our method or not. If it was not, we call the to_enum method provided by the Kernel module, which will create a new instance of an Enumerator. This allows us to iterate forward using the next method, to peek forward (without moving the internal position), or to rewind back to the beginning.
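Those three methods can be tried on any Enumerator, not just the one from our tree. In this sketch, the walk helper is hypothetical; it records what next, peek, and rewind do to an enumerator obtained by calling each on an Array without a block.

```ruby
# Hypothetical helper: exercises next, peek, and rewind on any enumerator.
def walk(enum)
  seen = []
  seen << enum.next  # returns the first value and advances
  seen << enum.peek  # looks at the second value without advancing
  seen << enum.next  # returns the second value and advances
  enum.rewind        # moves back to the beginning
  seen << enum.next  # the first value again
  seen
end

# Calling each without a block returns an Enumerator.
p walk([10, 20, 30].each)  # => [10, 20, 20, 10]
```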
Interestingly, Enumerator includes the Enumerable module, so you still have all of the same functionality.

    Enumerator.ancestors # => [Enumerator, Enumerable, Object, Kernel, BasicObject]
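In practice this means Enumerable methods can be chained straight onto an enumerator. A short sketch, using a hypothetical LETTERS constant:

```ruby
LETTERS = %w[a b c].freeze

enum = LETTERS.each          # no block, so this is an Enumerator
p enum.is_a?(Enumerable)     # => true
p enum.map(&:upcase)         # => ["A", "B", "C"]

# each_with_index without a block also returns an Enumerator,
# so map can be called on it and receives both value and index.
p LETTERS.each_with_index.map { |letter, i| "#{i}:#{letter}" }
# => ["0:a", "1:b", "2:c"]

# Enumerator#with_index lets map count from an offset of our choosing.
p LETTERS.map.with_index(1) { |letter, i| "#{i}. #{letter}" }
# => ["1. a", "2. b", "3. c"]
```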
If we do pass a block to the each method, we’ll first call each on the left node (if it exists), yield/call the block for the node’s value, and then call each on the right side of the node (if it exists).
    # Creating a root node and adding some children
    root = Node.new(10)
    root.add(5)
    root.add(15)
    root.add(20)
    root.add(1)

    p root.map { |item| item } # => [20, 15, 10, 5, 1]
    p root.max # => 20
    p root.sort # => [1, 5, 10, 15, 20]
And here is its use as an enumerator:
    enum = root.each
    puts enum.class # => Enumerator
    puts enum.next # => 20
    puts enum.next # => 15
Conclusion
In this article, we’ve only scratched the surface of just how powerful and useful the methods found within the Enumerable module are. Go and explore!
We looked at some of the most used methods, namely map and reduce (sometimes also called collect and inject). We also looked at how we can give our own class the ability to use these methods by including the Enumerable module and implementing the each method.
Finally, we actually reimplemented a few of these methods ourselves to see how things are done behind the scenes. There is no magic here, just what was previously unknown. In the last section, we looked at how to make a binary tree Enumerable, also giving it the ability to return an Enumerator for when each is called without a block.
Reference: The Simple Yet Powerful Ruby Enumerable Module, from our WCG partner Florian Motlik at the Codeship Blog.