The Simple Yet Powerful Ruby Enumerable Module

By Leigh Halliday. March 24th, 2016. Last Updated: March 16th, 2016

Here is a simple requirement: Find the positive numbers in an array. There are most likely hundreds of solutions, but here is one of them:

positives = []
for i in [-5, 10, 0, 15, -2]
  positives << i if i.positive?
end
p positives
# => [10, 15]

It works and it’s simple, but it feels clunky (and very un-Ruby-like, if you ask me), especially when you have to write code like this repeatedly. When I first came to Ruby, I was used to writing code this way. But when I discovered I could express the same logic with the following code, I never wanted to go back:

p [-5, 10, 0, 15, -2].select { |i| i.positive? }
# => [10, 15]
# which can be shortened even further to:
p [-5, 10, 0, 15, -2].select(&:positive?)
# => [10, 15]

I was hooked on the language. It felt like I was transforming my data from its initial state to its end state in one fluid motion. There were no temporary variables or if statements clouding the readability of my code.

It turns out that many of the methods I found the most useful for transforming and filtering data came from one place: the Enumerable module. It’s one of the things that brought a smile to my face as I was first learning Ruby. It felt like such a breeze to be able to filter and mold my data into whatever structure I wanted.

In this article, we’re going back to the basics by taking a deeper look at this module. We’ll do so by recreating parts of it ourselves. We are going to recreate the map and reduce methods, talk about some functional programming concepts, and touch on what the Enumerator class is and how it relates to the Enumerable module.

First Steps: Enumerable Basics

This is most likely review for everyone here, so we won’t spend much time on it. When we call the map method, Ruby will iterate over some piece of data (an Array for example), passing each value to a function which will transform it in some way. We’ll end up with a new Array comprised of all the transformed values. Our old data is left alone in the process; the result is something new.

Let’s say we have a Rails app, and we want the names of some of our users (pretend you don’t know about the pluck method):

names = User.limit(3).map { |user| user.name }
# => ["name1", "name2", "name3"]
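
For readers without a Rails app at hand, here is a minimal plain-Ruby sketch of the same idea. The names array is made up for illustration; the point is that map returns a new Array and leaves the original untouched.

```ruby
# Hypothetical sample data, standing in for the User records above.
names = ["ada", "grace", "linus"]

# map builds a brand new Array from the block's return values.
capitalized = names.map { |name| name.capitalize }

p capitalized
# => ["Ada", "Grace", "Linus"]

# The original Array is unchanged.
p names
# => ["ada", "grace", "linus"]
```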

Even though Ruby is an object-oriented programming language, this style comes from the functional programming camp. map is what is called a higher-order function: the basic idea is that you have one function (map in this case) that takes another function as its argument (the block which transforms the data). Ruby adds a bit of a twist in that the function is called a block and it comes after the regular argument list, but this is mostly just syntax. The idea is the same.
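
To make the "function that takes a function" idea a little more concrete, here is a small sketch. The apply_twice helper and the double lambda are hypothetical names invented for this example; they are not part of Ruby or Enumerable.

```ruby
# A lambda is a function stored in a variable.
double = ->(n) { n * 2 }

# apply_twice is a made-up higher-order function: it receives
# another function (fn) as an ordinary argument and calls it.
def apply_twice(value, fn)
  fn.call(fn.call(value))
end

p apply_twice(3, double)
# => 12

# The & prefix hands the lambda to map in place of a literal block.
p [1, 2, 3].map(&double)
# => [2, 4, 6]
```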

Here’s another method that’s included in the Enumerable module: reduce. In this case, we’ll take an array and join the values together with a comma. Yes, there is a join method for this, but the point is to show that, under the hood, join could have been implemented with reduce. The idea with reduce is that you have a collection of values and you want to combine them in some way to produce a single other value, which in this case will be a String.

["First", "Middle", "Last"].reduce { |str, val| "#{str}, #{val}" }
# => "First, Middle, Last"

There are all kinds of methods included in the Enumerable module, such as all? (which tells you whether something is true of every value), find (which returns the first value that passes a test you provide), and partition (which splits your values into two groups based on some condition). I highly recommend reading through all of the methods available; other than understanding how objects work in Ruby, these methods are probably the next most important area to feel comfortable with.
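
As a quick illustration of the three methods just mentioned, here is a short sketch using a made-up array of numbers:

```ruby
# Hypothetical sample data.
numbers = [3, 8, 12, 5]

# all? is true only if the block is truthy for every value.
p numbers.all? { |n| n > 2 }
# => true

# find returns the first value for which the block is truthy.
p numbers.find { |n| n > 5 }
# => 8

# partition returns two arrays: values that passed, then values that failed.
evens, odds = numbers.partition { |n| n.even? }
p evens
# => [8, 12]
p odds
# => [3, 5]
```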

There are generally two ways to call an Enumerable method:

By passing a block like we’ve seen above
By passing a symbol with an & character before it, otherwise known as Symbol#to_proc

The second form is useful when all you’re going to do is call a single method on each of the values. The one caveat is that you can’t pass any arguments to the method that Symbol#to_proc calls for you.

[-1,0,1].select { |num| num.positive? }
# => [1]
[-1,0,1].select(&:positive?)
# => [1]
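
The caveat about arguments is easiest to see side by side. In this small sketch (the numbers are arbitrary), the symbol form can only call to_s with no arguments, so converting to binary requires a full block:

```ruby
numbers = [1, 2, 3]

# Symbol#to_proc calls to_s on each value with no arguments.
p numbers.map(&:to_s)
# => ["1", "2", "3"]

# To pass an argument (here, base 2), fall back to a regular block.
p numbers.map { |num| num.to_s(2) }
# => ["1", "10", "11"]
```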

How to Make Something Enumerable

It isn’t just Array and Hash which include functionality found in the Enumerable module — it can be any class you want. There are two basic steps to give your class all of this functionality:

Include the line include Enumerable inside of your class.
Implement an each method which yields each value to a block of code.

Here’s an example with a class called Meal. A meal is made up of three servings: an appetizer, an entree, and a dessert.

class Meal
  include Enumerable

  attr_accessor :appetizer, :entree, :dessert

  def initialize(appetizer, entree, dessert)
    self.appetizer = appetizer
    self.entree = entree
    self.dessert = dessert
  end

  def each
    yield appetizer
    yield entree
    yield dessert
  end
end

class Serving
  attr_accessor :name, :ingredients

  def initialize(name, ingredients)
    self.name = name
    self.ingredients = ingredients
  end

  def <=>(other)
    ingredients.size <=> other.ingredients.size
  end
end

meal = Meal.new(
  Serving.new("bruschetta", ["bread", "tomatoes", "basil"]),
  Serving.new("lasagna", ["ground beef", "tomatoes", "cheese", "pasta"]),
  Serving.new("cookie", ["flour", "sugar", "butter"])
)

Now that our Meal class has included the Enumerable module and we have implemented the each method, we can call any method from the Enumerable module such as map:

p meal.map { |serving| serving.ingredients.size }
# => [3, 4, 3]

If you want to use methods such as max, min, or sort, you will also have to implement the <=> method, which compares two values, on the objects that are yielded to the block (here, the Serving class). In this case, I chose to compare based on the number of ingredients (a somewhat arbitrary decision).

p meal.max
#<Serving:0x007fefe10eb1d0 @name="lasagna", @ingredients=["ground beef", "tomatoes", "cheese", "pasta"]>
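
Here is a self-contained sketch of the same comparison idea. The Dish class is a hypothetical stand-in for Serving, and the data is made up. It also shows max_by, which takes a block and so lets you pick a different ordering without touching <=>.

```ruby
# Dish is a hypothetical stand-in for the Serving class above.
class Dish
  attr_reader :name, :ingredients

  def initialize(name, ingredients)
    @name = name
    @ingredients = ingredients
  end

  # Compare dishes by their number of ingredients.
  def <=>(other)
    ingredients.size <=> other.ingredients.size
  end
end

dishes = [
  Dish.new("lasagna", %w[beef tomatoes cheese pasta]),
  Dish.new("toast", %w[bread butter]),
  Dish.new("cookie", %w[flour sugar butter])
]

# min and sort both rely on <=>.
p dishes.min.name
# => "toast"
p dishes.sort.map(&:name)
# => ["toast", "cookie", "lasagna"]

# max_by ignores <=> on Dish and compares the block's return values instead.
p dishes.max_by { |dish| dish.name.length }.name
# => "lasagna"
```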

Let’s Recreate the Enumerable Module

Now that we’ve taken a look at how to use the methods found within Enumerable, let’s attempt to make our own version of the module to see what is involved. Understanding how map and reduce work under the hood can help us truly understand what is happening when we use them in our code.

The first method we’ll recreate is the reduce method. As you’ll see later, pretty much every other method can be built on top of it.

module MyEnumerable
  def simple_reduce(acc)
    each do |value|
      acc = yield(acc, value)
    end
    acc
  end
end

Next we’ll include it inside of the Array class and give it a try. Array already implements the each method so we can rely upon it.

class Array
  include MyEnumerable
end

p [1,2,3,4].simple_reduce(0) { |total, num| total + num }
# => 10

The way reduce works is by starting off with some initial value (in this case 0) and passing an accumulator variable to your block of code for each value. It is your job to do something with the accumulator to come up with a “new” value, which is then passed in the second time the block is yielded to, and so on. Finally, once you’ve gone through all of the values, your accumulator value is returned. In this case, we’ve used the reduce method to add up all the numbers in our array.
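
To watch the accumulator move through each step, here is a small trace. It uses Ruby’s built-in reduce rather than our simple_reduce so that it runs on its own; the steps array exists only to record what happens.

```ruby
steps = []

total = [1, 2, 3, 4].reduce(0) do |acc, num|
  # Record the accumulator before and after this step.
  steps << "#{acc} + #{num} = #{acc + num}"
  # The block's return value becomes acc on the next iteration.
  acc + num
end

puts steps
# 0 + 1 = 1
# 1 + 2 = 3
# 3 + 3 = 6
# 6 + 4 = 10

p total
# => 10
```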

You’ll have noticed earlier that we could call most Enumerable methods two ways, by either passing a block of code or a symbol. We don’t have that functionality yet, so let’s implement it. We will accomplish this by using the send method to dynamically dispatch a method call.

module MyEnumerable
  def my_reduce(acc, operator = nil, &block)
    raise ArgumentError, 'both operator and actual block given' if operator && block
    raise ArgumentError, 'either operator or block must be given' unless operator || block

    # if no block, create a lambda from the operator (symbol)
    block = block || -> (acc, value) { acc.send(operator, value) }

    each do |value|
      acc = block.call(acc, value)
    end
    acc
  end
end

p [1,2,3,4].my_reduce(0) { |total, num| total + num }
# => 10
p [1,2,3,4].my_reduce(0, :+)
# => 10

Let’s implement our own map method now, relying on some of the work we did in the reduce method.

module MyEnumerable
  def my_map(operator = nil, &block)
    raise ArgumentError, 'both operator and actual block given' if operator && block
    raise ArgumentError, 'either operator or block must be given' unless operator || block

    # if no block, create a lambda from the operator (symbol)
    block = block || -> (value) { value.send(operator) }

    # build the result by appending each transformed value to the accumulator
    my_reduce([]) do |arr, value|
      arr << block.call(value)
    end
  end
end

We call the my_reduce method, setting its initial value to an empty Array. Then for each value, we call the block and append the transformed result to the new array that we are building.
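
The same pattern can be tried with Ruby’s built-in reduce, which this sketch uses so that it runs without the MyEnumerable module. It works because Array#<< returns the array itself, so the block’s return value is always the accumulator.

```ruby
# Array#<< returns the receiver, which is what makes it usable
# as the last expression of a reduce block.
arr = []
p (arr << 1).equal?(arr)
# => true

# map expressed with the built-in reduce and an empty Array as the start value.
squares = [1, 2, 3].reduce([]) { |acc, value| acc << value * value }
p squares
# => [1, 4, 9]

# The symbol variant boils down to send with the method name.
strings = [1, 2, 3].reduce([]) { |acc, value| acc << value.send(:to_s) }
p strings
# => ["1", "2", "3"]
```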

We didn’t have to use the reduce method. In fact, if we look at how Ruby itself implements the map method, you’ll notice that it doesn’t. I’ve added comments to walk through what is happening. I should note that this version actually comes from the Array class and not the Enumerable module. It’s very similar but is more specialized/efficient just for arrays. It seems more straightforward, so for illustration purposes, I think it’s a better method to look at.

static VALUE
rb_ary_collect(VALUE ary)
{
  // declare the loop index and a variable to hold the new array
  long i;
  VALUE collect;

  // a macro which will return an `Enumerator` if no block is passed
  RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);

  // assign a new array equal to the size of the one we are mapping
  collect = rb_ary_new2(RARRAY_LEN(ary));

  // iterate the length of the array
  for (i = 0; i < RARRAY_LEN(ary); i++) {
    // push onto the new array the value which the block returns
    rb_ary_push(collect, rb_yield(RARRAY_AREF(ary, i)));
  }

  // return our new array
  return collect;
}
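
For comparison, here is a rough Ruby rendition of that loop. It is only a sketch of the same steps and not how Ruby itself is written; index_map is a hypothetical name chosen for this example.

```ruby
# index_map is a made-up method that mirrors the steps of the C function.
def index_map(array)
  # Return an Enumerator if no block is passed.
  return array.to_enum(:map) unless block_given?

  result = []
  i = 0
  # Iterate the length of the array by index.
  while i < array.length
    # Push onto the new array the value which the block returns.
    result.push(yield(array[i]))
    i += 1
  end
  result
end

p index_map([1, 2, 3]) { |n| n * 10 }
# => [10, 20, 30]
p index_map([1, 2, 3]).class
# => Enumerator
```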

Just for fun, here are two more methods: select and reject.

module MyEnumerable
  def my_select(&block)
    my_reduce([]) do |arr, value|
      arr << value if block.call(value)
      # always return the accumulator, even when nothing was appended
      arr
    end
  end

  def my_reject(&block)
    my_reduce([]) do |arr, value|
      arr << value unless block.call(value)
      arr
    end
  end
end
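
The extra arr line at the end of each block matters. This sketch uses the built-in reduce so it runs on its own, and shows what happens when that line is left out: a conditional append evaluates to nil when the condition is false, and nil then becomes the accumulator.

```ruby
numbers = [-5, 10, 0, 15, -2]

# With the trailing arr, the accumulator is always the array.
positives = numbers.reduce([]) do |arr, value|
  arr << value if value.positive?
  arr
end
p positives
# => [10, 15]

# Without it, the block returns nil for -5, and the next step
# tries to call << on nil.
outcome = begin
  numbers.reduce([]) { |arr, value| arr << value if value.positive? }
rescue NoMethodError => e
  "failed with #{e.class}"
end
puts outcome
# failed with NoMethodError
```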

Creating a Simple Enumerable Binary Tree

Let’s create a very simple binary tree that has implemented the Enumerable module. It’ll end up looking very similar to the one that Mike Perham showed on his blog, but I wanted to add a small extra step about having it optionally return an Enumerator. For more information on enumerators, I recommend a short video by Avdi Grimm where he provides some examples.

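A binary tree consists of a series of nodes. Each Node has a value and then a left branch and a right branch. In the implementation below, values which are greater than the current value go on the left and smaller values go on the right (the reverse of the more common convention), so walking the tree from left to right yields the values in descending order.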

class Node
  include Comparable
  include Enumerable
  attr_accessor :left, :value, :right

  def initialize(value)
    self.value = value
  end

  def each(&block)
    return to_enum(:each) unless block
    left.each(&block) if left
    block.call(value)
    right.each(&block) if right
  end

  def add(other)
    other = Node.new(other) unless other.is_a? Node
    if other > self
      if left
        left.add(other)
      else
        self.left = other
      end
    elsif other < self
      if right
        right.add(other)
      else
        self.right = other
      end
    end
  end

  def <=>(other)
    value <=> other.value
  end

  def to_s
    value.to_s
  end
end

In our each method, we first check to see if a block was passed to our method or not. If it was not, we call the to_enum method provided by the Kernel module, which will create a new instance of an Enumerator. This allows us to iterate forward using the next method, to peek forward (without moving the internal position), or to rewind back to the beginning.
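
Here is a short sketch of those three operations. It uses the Enumerator that Array#each returns when called without a block, so it runs independently of the Node class.

```ruby
enum = [10, 20, 30].each

p enum.next
# => 10

# peek looks at the upcoming value without advancing.
p enum.peek
# => 20
p enum.next
# => 20

# rewind goes back to the beginning.
enum.rewind
p enum.next
# => 10

# Consume the remaining values, then go one step too far.
enum.next
enum.next
message = begin
  enum.next
rescue StopIteration
  "no more values"
end
puts message
# no more values
```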

Interestingly, Enumerator includes the Enumerable module, so you still have all of the same functionality.

Enumerator.ancestors
# => [Enumerator, Enumerable, Object, Kernel, BasicObject]
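
In practice this means an Enumerator can be chained with the usual methods. A small sketch, using a made-up array of letters:

```ruby
enum = %w[a b c].each

p enum.is_a?(Enumerable)
# => true

# with_index is an Enumerator method; map comes from Enumerable.
labelled = enum.with_index(1).map { |letter, index| "#{index}:#{letter}" }
p labelled
# => ["1:a", "2:b", "3:c"]

# Filtering works on the Enumerator just as it does on an Array.
p enum.select { |letter| letter > "a" }
# => ["b", "c"]
```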

If we do pass a block to the each method, we’ll first call each on the left node (if it exists), yield/call the block for the node’s value, and then call each on the right side of the node (if it exists).

# Creating a root node and adding some children
root = Node.new(10)
root.add(5)
root.add(15)
root.add(20)
root.add(1)

p root.map { |item| item }
# => [20, 15, 10, 5, 1]
p root.max
# => 20
p root.sort
# => [1, 5, 10, 15, 20]

And here is its use as an enumerator:

enum = root.each
puts enum.class
# => Enumerator
puts enum.next
# => 20
puts enum.next
# => 15

Conclusion

We’ve only scratched the surface in this article about just how powerful and useful the methods found within the Enumerable module are. Go and explore!

In this article, we looked at some of the most used methods, namely map and reduce (sometimes also called collect and inject). We also looked at how we can give our own class the ability to use these methods by including the Enumerable module and implementing the each method.

Finally, we actually reimplemented a few of these methods ourselves to see how things are done behind the scenes. There is no magic here, just what was previously unknown. In the last section, we looked at how to make a binary tree Enumerable, also giving it the ability to return an Enumerator for when each is called without a block.

Reference:

The Simple Yet Powerful Ruby Enumerable Module from our WCG partner Florian Motlik at the Codeship blog.