Statefulness in a Stateless Language: Elixir

Micah Woods, January 19th, 2016 (Last Updated: January 10th, 2016)


Elixir is blazing fast and highly concurrent. It’s functional, but its syntax is simple and easy to read. The language evolved out of the Ruby community and took many of Ruby’s core values with it. It’s optimized for developer happiness, and testing is a first-class citizen.

When approaching a new language, it’s important to go back to the basics. One of the first data structures developers learn is the stack.

Stacks are relatively easy to think about in imperative or object-oriented languages but can be much harder to reason about in functional languages. For example, here’s a simple implementation of a stack in Ruby:

class Stack
  def initialize
    @memory = []
  end

  def size
    memory.size
  end

  def push(item)
    memory.push(item)
  end

  def pop
    memory.pop
  end

  private
  attr_reader :memory
end

Because Ruby is class-based, it’s easy to encapsulate behavior and state in an object. Elixir, however, has only modules and functions. A list can be used as a stack, but because all data in Elixir is immutable, every operation returns a new list, and the variable must be rebound to the result on every call.

stack = []
stack = [ 1 | stack]   # push
[head | stack] = stack # pop
Enum.count(stack)      # size
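
To make the immutability point concrete, here is a minimal sketch (the variable names are only illustrative). It shows that prepending to a list builds a new list and leaves the original binding untouched, which is why the result has to be bound to a name.

```elixir
# Prepending builds a new list; `original` is not modified.
original = [2, 3]
pushed = [1 | original]

IO.inspect(original) # prints [2, 3]
IO.inspect(pushed)   # prints [1, 2, 3]

# "Popping" is just pattern matching; `pushed` itself is unchanged.
[top | rest] = pushed
IO.inspect({top, rest}) # prints {1, [2, 3]}
IO.inspect(pushed)      # prints [1, 2, 3]
```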

These behaviors (for lack of a better word) can be placed in a module. This makes the code a little easier to reason about (it’s all in the same place).

defmodule Stack do

  def size(stack) do
    Enum.count(stack)
  end

  def pop(stack) do
    [last_in | rest] = stack
    {last_in, rest}
  end

  def push(stack, item) do
    [item | stack]
  end

end

stack = []
stack = Stack.push(stack, 1)
{item, stack} = Stack.pop(stack)
Stack.size(stack)

Code like this can be difficult to work with, though, because the caller has to carry the stack through every call. Often in an application, state is necessary, or it at least makes the code easier. Luckily, Elixir can manage state by using recursion and processes.

Recursion

The following uses recursion as a form of looping. It also keeps track of the current count using recursion:

defmodule Counter do
  def count_by_one(count) do
    IO.puts count 
    count_by_one(count + 1)
  end
end

This code is an infinite loop that increases the count each iteration. It’s important that the last call of the function is a recursive function call. When the last call is recursive, Elixir will perform tail call optimization. This means that the current function call on the stack will be replaced by the new recursive function call, which prevents stack overflow errors.
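
A bounded variant is easier to experiment with than an infinite loop. The sketch below assumes a hypothetical BoundedCounter module (not part of the original example) that stops at a limit. Because the recursive call is the last expression, a million iterations run without the call stack growing.

```elixir
defmodule BoundedCounter do
  # Hypothetical module: counts up from `count` and returns the
  # value at which it stops.
  def count_to(count, limit) when count >= limit, do: count

  def count_to(count, limit) do
    # The recursive call is the last expression, so it is a tail call.
    count_to(count + 1, limit)
  end
end

IO.puts(BoundedCounter.count_to(0, 1_000_000)) # prints 1000000
```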

If that’s a little deep, don’t worry about it. Here’s what you need to know:

Functions that do NOT end with a pure recursive function call are NOT tail call optimized.

defmodule Factorial do  
  def of(0), do: 1
  def of(n) when n > 0 do
    # Not tail call optimized
    # because recursion needs to
    # occur before multiplication
    n * of(n - 1)
  end
end

Functions that end with a pure recursive function call ARE tail call optimized.

defmodule Factorial do  
  def of(0), do: 1
  def of(n), do: of(n, 1)
  def of(1, acc), do: acc 
  def of(n, acc) when n > 1 do
    # Tail call optimized
    # because recursion is the
    # last calculation
    of(n - 1, acc * n)
  end
end

Don’t be scared of the multiple function definitions. Elixir uses pattern matching to execute the correct function. This has a couple of benefits:

performance
maintainability

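That’s right; pattern matching removes branching statements (if/else/unless) from the function body, and the compiler turns the clauses into efficient matching code. Note that the clause is still selected at runtime, based on the arguments of each call. Elixir is dynamically typed, so this is not static dispatch decided at compile time.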
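
As a small illustration of clauses replacing branching, consider the sketch below. StackInfo and describe/1 are hypothetical names chosen for this example. Each clause matches one shape of the list, and the first clause that matches the argument is the one that runs.

```elixir
defmodule StackInfo do
  # Hypothetical helper: one clause per shape of the list,
  # with no if/else in any function body.
  def describe([]), do: "empty"
  def describe([top]), do: "one item: #{inspect(top)}"
  def describe([top | _rest]), do: "top item: #{inspect(top)}"
end

IO.puts(StackInfo.describe([]))        # prints empty
IO.puts(StackInfo.describe([:a]))      # prints one item: :a
IO.puts(StackInfo.describe([1, 2, 3])) # prints top item: 1
```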

Processes

Elixir is an extremely concurrent programming language. A typical Elixir application will have hundreds or even thousands of concurrent processes running. These processes are like ultra lightweight threads, but don’t worry: there is no mutex to manage here! It’s super simple to start a process.

iex(1)> spawn fn ->
...(1)>   :timer.sleep(1000) 
...(1)>   IO.puts "LONG RUNNING PROCESS"
...(1)> end
#PID<0.95.0>
LONG RUNNING PROCESS

But as easy as that is to type, it’s rare to find code like this in production. Processes are used primarily to maintain state, much like objects in object-oriented languages such as Ruby.

Processes can communicate with each other by sending and receiving messages. Here’s an example of a process sending a message to the current process or self.

iex(1)> current = self()
#PID<0.57.0>
iex(2)> send(current, :hello_world)
:hello_world

Messages are added to a message queue and handled in order. The following example uses receive to dequeue the first message sent. The receive block then pattern matches on the message to see what it needs to execute.

iex(3)> receive do
...(3)>   :hello_world -> IO.puts "hello from process"
...(3)> end
hello from process
:ok
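
The same two primitives also work across processes. The following sketch, with an illustrative message shape of {:greeting, text}, spawns a process that sends a message back to its parent. The after clause is an added safeguard so that receive cannot block forever if nothing arrives.

```elixir
parent = self()

# The spawned process sends a tagged message back to its parent.
spawn(fn -> send(parent, {:greeting, "hello from child"}) end)

reply =
  receive do
    {:greeting, text} -> text
  after
    # Give up after one second instead of waiting indefinitely.
    1_000 -> :timeout
  end

IO.inspect(reply) # prints "hello from child"
```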

Notice that receive only runs once. In order to continue dequeueing messages, the program must recursively loop. And in order to recursively loop without a stack overflow, tail call optimization must occur.

defmodule HelloProcess do
  def loop do
    receive do
      :hello -> IO.puts "hello from another process"
      whatever -> IO.puts "don't know about #{whatever}"
    end
    loop() # tail call optimized
  end
end

other = spawn HelloProcess, :loop, []
send other, :hello # prints "hello from another process"
send other, :blarg # prints "don't know about blarg"

Armed with this knowledge, the receive loop can be used to maintain state:

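defmodule TrackingState do
  def loop(state \\ []) do
    # Bind the result of receive: a variable rebound inside
    # a clause is not visible outside of that clause
    state =
      receive do
        {:push, item} -> [item | state]
        _whatever -> state # unknown messages leave the state as-is
      end
    IO.inspect(state)
    loop(state)
  end
end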

other = spawn TrackingState, :loop, []
send other, {:push, 1} # prints [1]
send other, {:push, 2} # prints [2, 1]

Each iteration of the loop is called with the new state. Patiently, receive waits for a message. Once the message is processed, loop is called once again with the new state. This is enough to implement a stack.
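
Printing the state is fine for a demo, but a caller usually wants the value back. Here is a minimal sketch of that idea, assuming a hypothetical StateLoop module and a {:get, caller} message that are not part of the original example. The loop replies to whoever asked and then continues with the same state.

```elixir
defmodule StateLoop do
  # Hypothetical variant of the loop that can report its state.
  def loop(state) do
    receive do
      {:push, item} ->
        loop([item | state])

      {:get, caller} ->
        send(caller, {:state, state})
        loop(state)
    end
  end
end

pid = spawn(StateLoop, :loop, [[]])
send(pid, {:push, 1})
send(pid, {:push, 2})
send(pid, {:get, self()})

current =
  receive do
    {:state, state} -> state
  after
    1_000 -> :timeout
  end

IO.inspect(current) # prints [2, 1]
```

Messages sent from one process to another arrive in the order they were sent, so both pushes are handled before the {:get, caller} request.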

Stack Implementation

Mix is an amazing tool that ships with Elixir. Mix has many uses, but for these examples, I’ll only use it to create our application and run the tests. For more information on mix, check out the docs.

Use mix to create a new application:

$ mix new stack
$ cd stack

And run the tests:

$ mix test

Just by typing mix new app_name, an application with a testing harness was generated. Not only was the file lib/stack.ex created, but test/stack_test.exs was created as well.

Stack implementations usually consist of size, push, and pop functions/methods. The following tests were added to test/stack_test.exs:

defmodule StackTest do
  use ExUnit.Case

  test "size is zero when empty" do
    {:ok, pid} = Stack.start_link
    assert Stack.size(pid) == 0
  end

  test "push adds to the stack" do
    {:ok, pid} = Stack.start_link
    Stack.push pid, :foo
    assert Stack.size(pid) == 1
  end

  test "pop removes one from the stack" do
    {:ok, pid} = Stack.start_link
    Stack.push(pid, :bar)
    Stack.push(pid, :foo)
    assert Stack.pop(pid) == :foo
    assert Stack.size(pid) == 1
  end
end

The initializer function was named start_link. This is the usual convention for a function that starts a linked process. Links work in both directions: when either of two linked processes exits abnormally, the other is brought down as well, so a crash in the spawned process also kills the process that created it. This is also easy to implement. Use spawn_link just like the spawn examples already covered:

spawn_link fn -> IO.puts "Long running process" end

spawn_link ModuleName, :function, ["args", "list"]
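
To observe a link in action without taking down the current process, exits can be trapped. The sketch below is only an illustration: Process.flag(:trap_exit, true) turns the exit signal into a regular {:EXIT, pid, reason} message, and the :boom reason is made up for this example.

```elixir
# Trap exits so the exit signal arrives as a message instead of
# killing this process.
Process.flag(:trap_exit, true)

pid = spawn_link(fn -> exit(:boom) end)

result =
  receive do
    {:EXIT, ^pid, reason} -> {:linked_process_exited, reason}
  after
    1_000 -> :timeout
  end

IO.inspect(result) # prints {:linked_process_exited, :boom}
```

Without the trap_exit flag, the same abnormal exit would terminate the calling process too, which is the behavior start_link relies on.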

Here is a naive first pass at the stack implementation:

defmodule Stack do
  def start_link do
    pid = spawn_link(__MODULE__, :loop, [[]])
    {:ok, pid}
  end

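  def loop(stack) do
    # Each clause returns the next state; a variable rebound
    # inside a clause is not visible outside of it
    new_stack =
      receive do
        {:size, sender} ->
          send(sender, {:ok, Enum.count(stack)})
          stack
        {:push, item} ->
          [item | stack]
        {:pop, sender} ->
          [item | rest] = stack
          send(sender, {:ok, item})
          rest
      end
    loop(new_stack)
  end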

  def size(pid) do
    send pid, {:size, self()}
    receive do {:ok, size} -> size end
  end

  def push(pid, item) do
    send pid, {:push, item}
  end

  def pop(pid) do
    send pid, {:pop, self()}
    receive do {:ok, item} -> item end
  end
end

The start_link/0 function creates a linked process. It does so by spawning the recursive loop function with an empty list as its initial state. __MODULE__ references the current module, which makes for easy refactoring.

The loop/1 function uses receive to wait for messages. When the message {:size, sender} or {:pop, sender} is received, the loop function sends a message back to the sender in the form of {:ok, answer}. On receiving the message {:push, item}, it adds the item to the top (or front or head) of the stack but does not reply.

Functions size/1 and pop/1 work very similarly. Both send a message to the stack process from the current process and wait (using receive) for the stack process to answer.

On the other hand, the push/2 function sends a message containing the item to be added to the top (or front or head) of the stack but does not wait for a reply.

The tests are green. Ship it! Just kidding, time to refactor.

GenServers

The above implementation seems daunting when compared to a Ruby solution. Several concepts — like tail call optimization, process communication, and recursion — need to be understood before coming to a solution. One could argue that concepts like classes, instance variables, and message passing must be understood to create a Ruby stack. But it’s impossible to deny that the solution is almost twice as much code.

Thankfully, Elixir has an abstraction called GenServer (short for Generic Server). A GenServer’s goal is to abstract the receive loop, which makes the code cleaner and more manageable.

Once a GenServer process has been created (using start_link/2), messages can be sent using call/3 and cast/2. The former waits for a reply and returns it to the calling function; the latter does not wait. On the server side, these messages are handled by the handle_call/3 and handle_cast/2 callbacks.
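
Before moving to the stack, the difference between call and cast can be seen in a smaller sketch. The Tally module and its function names are hypothetical and exist only for this illustration. increment/1 uses cast and returns immediately, while value/1 uses call and waits for the reply.

```elixir
defmodule Tally do
  use GenServer

  # Client API (hypothetical names)
  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid), do: GenServer.cast(pid, :increment) # does not wait
  def value(pid), do: GenServer.call(pid, :value)         # waits for a reply

  # Server callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

{:ok, pid} = Tally.start_link(0)
Tally.increment(pid)
Tally.increment(pid)
IO.inspect(Tally.value(pid)) # prints 2
```

Because messages from a single sender are processed in order, both casts are handled before the call is answered.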

There is a lot more functionality you can employ with a GenServer; you might want to check out Elixir’s Getting Started Guide as well as the docs, which implement a stack very similar to the one below:

defmodule Stack do
  use GenServer

  def start_link do
    GenServer.start_link __MODULE__, []
  end

  def size(pid) do
    GenServer.call pid, :size
  end

  def push(pid, item) do
    GenServer.cast pid, {:push, item}
  end

  def pop(pid) do
    GenServer.call pid, :pop
  end

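  ####
  # GenServer callbacks

  # The second argument of start_link becomes the initial state
  def init(stack) do
    {:ok, stack}
  end

  # Clauses of the same callback are grouped together
  def handle_call(:size, _from, stack) do
    {:reply, Enum.count(stack), stack}
  end

  def handle_call(:pop, _from, [item | rest]) do
    {:reply, item, rest}
  end

  def handle_cast({:push, item}, stack) do
    {:noreply, [item | stack]}
  end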
end

The tests are still green, and this implementation is much easier to maintain and reason about.

Agents

However, for very simple processes that are used to maintain simple state, like a stack, there is an even easier abstraction: the Agent.

defmodule Stack do

  def start_link do
    Agent.start_link fn -> [] end
  end

  def size(pid) do
    Agent.get pid, fn stack -> Enum.count(stack) end
  end

  def push(pid, item) do
    Agent.update pid, fn stack -> [item | stack] end
  end

  def pop(pid) do
    Agent.get_and_update pid, fn [item | last] ->
      {item, last}
    end
  end
end

Running the tests, everything is still green. This final implementation feels good and is similar in lines of code to the Ruby solution.
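
One case the implementations in this article do not handle is popping an empty stack, where the [item | rest] pattern fails to match and the process crashes. As a hedged sketch of one possible approach (SafeStack and its return values are hypothetical, not part of the original design), the function passed to Agent.get_and_update can have a clause for the empty list:

```elixir
defmodule SafeStack do
  # Hypothetical variant of the Agent-based stack whose pop
  # does not crash on an empty stack.
  def start_link do
    Agent.start_link(fn -> [] end)
  end

  def push(pid, item) do
    Agent.update(pid, fn stack -> [item | stack] end)
  end

  def pop(pid) do
    Agent.get_and_update(pid, fn
      [] -> {:empty, []}
      [item | rest] -> {{:ok, item}, rest}
    end)
  end
end

{:ok, pid} = SafeStack.start_link()
SafeStack.push(pid, :foo)
IO.inspect(SafeStack.pop(pid)) # prints {:ok, :foo}
IO.inspect(SafeStack.pop(pid)) # prints :empty
```

Whether to return a tagged value like this or to let the process crash is a design choice; the tests shown earlier expect pop to return the bare item.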

Conclusion

Elixir is extremely performant and fun. However, some of the concepts are difficult to reason about when coming from an imperative or object-oriented language. Elixir is functional and stateless, and its data is immutable. But when it’s necessary to keep track of state, Elixir’s got your back with recursion and processes.

Reference:

Statefulness in a Stateless Language: Elixir from our WCG partner Florian Motlik at the Codeship Blog.