Statefulness in a Stateless Language: Elixir
Elixir is blazing fast and highly concurrent. It’s functional, but its syntax is simple and easy to read. The language evolved out of the Ruby community and took many of Ruby’s core values with it. It’s optimized for developer happiness, and testing is a first-class citizen.
When approaching a new language, it’s important to go back to the basics. One of the first data structures developers learn is the stack.
Stacks are relatively easy to think about in imperative or object-oriented languages but can be much harder to reason about in functional languages. For example, here’s a simple implementation of a stack in Ruby:
class Stack def initialize @memory = [] end def size memory.size end def push(item) memory.push(item) end def pop memory.pop end private attr_reader :memory end
Because Ruby is classical, it’s easy to encapsulate behaviors and state. Elixir however only has functions. A list can be used as a stack, but because all data is immutable in Elixir, the variable must be reassigned on every call.
stack = [] stack = [ 1 | stack] # push [head | stack] = stack # pop Enum.count(stack) # size
These behaviors (for lack of a better word) can be placed in a module. This makes the code a little easier to reason about (it’s all in the same place).
defmodule Stack do def size(stack) do Enum.count(stack) end def pop(stack) do [last_in | rest] = stack {last_in, rest} end def push(stack, item) do [item | stack] end end stack = [] stack = Stack.push(stack, 1) {item, stack} = Stack.pop(stack) Stack.size(stack)
Code like this can be difficult though. Often in an application, state is necessary, or it at least makes the code easier. Luckily, Elixir can manage state by using recursion and processes.
Recursion
The following uses recursion as a form of looping. It also keeps track of the current count using recursion:
defmodule Counter do def count_by_one(count) do IO.puts count count_by_one(count + 1) end end
This code is an infinite loop that increases the count each iteration. It’s important that the last call of the function is a recursive function call. When the last call is recursive, Elixir will perform tail call optimization. This means that the current function call on the stack will be replaced by the new recursive function call, which prevents stack overflow errors.
If that’s a little deep, don’t worry about it. Here’s what you need to know:
- Functions that do NOT end with a pure recursive function call are NOT tail call optimized.
defmodule Factorial do def of(0), do: 1 def of(n) when n > 0 do # Not tail call optimized # because recursion needs to # occur before multiplication n * of(n - 1) end end
- Functions that end with a pure recursive function call ARE tail call optimized.
defmodule Factorial do def of(0), do: 1 def of(n), do: of(n, 1) def of(1, acc), do: acc def of(n, acc) when n > 1 do # Tail call optimized # because recursion is the # last calculation of(n - 1, acc * n) end end
Don’t be scared of the multiple function definitions. Elixir uses pattern matching to execute the correct function. This has a couple of benefits:
- performance
- maintainability
That’s right; pattern matching allows for static dispatch and removes branching statements (if/else/unless) from the code. Static dispatch just means that a functions calls are decided at compile time.
Processes
Elixir is an extremely concurrent programming language. A typical Elixir application will have hundreds or even thousands of concurrent processes running. These processess are like ultra lightweight threads, but don’t worry — no mutex to manage here! It’s super simple to start a process.
iex(1)> spawn fn -> ...(1)> :timer.sleep(1000) ...(1)> IO.puts "LONG RUNNING PROCESS" ...(1)> end #PID<0.95.0> LONG RUNNING PROCESS
But as easy as that is to type, it’s rare to find code like this in production. Processes are used primarily to maintain state, much like objects in object-oriented languages such as Ruby.
Processes can communicate with each other by sending and receiving messages. Here’s an example of a process sending a message to the current process or self
.
iex(1)> current = self #PID<0.57.0> iex(2)> send(current, :hello_world) :hello_world
Messages are added to a message queue and handled in order. The following example uses receive
to dequeue the first message sent. The receive
block then pattern matches on the message to see what it needs to execute.
iex(3)> receive do ...(3)> :hello_world -> IO.puts "hello from process" ...(3)> end hello from process :ok
Notice that receive
only runs once. In order to continue dequeueing messages, the program must recursively loop. And in order to recursively loop without a stack overflow, tail call optimization must occur.
defmodule HelloProcess do def loop do receive do :hello -> IO.puts "hello from another process" whatever -> IO.puts "don't know about #{whatever}" end loop # tail call optimized end end other = spawn HelloProcess, :loop, [] send other, :hello # prints "hello from another process" send other, :blarg # prints "don't know about blarg"
Armed with this knowledge, the receive loop can be used to maintain state:
defmodule TrackingState do def loop(state \\ []) do receive do {:push, item} -> state = [item | state] whatever -> {:error, "#{whatever} is not a valid"} end IO.inspect(state) loop(state) end end other = spawn TrackingState, :loop, [] send other, {:push, 1} # prints [1] send other, {:push, 2} # prints [2,1]
Each iteration of the loop is called with the new state. Patiently, receive
waits for a message. Once the message is processed, loop is called once again with the new state. This is enough to implement a stack.
Stack Implementation
Mix is an amazing tool that ships with Elixir. Mix has many uses, but for these examples, I’ll only use it to create our application and run the tests. For more information on mix, check out the docs.
Use mix to create a new application:
$ mix new stack $ cd stack
And run the tests:
$ mix test
Just by typing mix new app_name
, an application with a testing harness was generated. Not only was a file lib/stack.ex
created, but test/stack.exs
was created also.
Stack implementations usually consist of size
, push
, and pop
functions/methods. The following tests were added to test/stack.exs
:
defmodule StackTest do use ExUnit.Case test "size is zero when empty" do {:ok, pid} = Stack.start_link assert Stack.size(pid) == 0 end test "push adds to the stack" do {:ok, pid} = Stack.start_link Stack.push pid, :foo assert Stack.size(pid) == 1 end test "pop removes one from the stack" do {:ok, pid} = Stack.start_link Stack.push(pid, :bar) Stack.push(pid, :foo) assert Stack.pop(pid) == :foo assert Stack.size(pid) == 1 end end
The initializer function was named start_link
. This is the usual convention when creating a function that starts a linked process. A linked process means that when the spawned process has an error, it kills the process that created it as well. This is also easy to implement: Use it just like the spawn
examples already covered:
spawn_link fn -> IO.puts "Long running process" end spawn_link ModuleName, :function, ["args", "list"]
Here is a naive first pass at the stack implementation:
defmodule Stack do def start_link do pid = spawn_link(__MODULE__, :loop, [[]]) {:ok, pid} end def loop(stack) do receive do {:size, sender} -> send(sender, {:ok, Enum.count(stack)}) {:push, item} -> stack = [item | stack] {:pop, sender} -> [item | stack] = stack send(sender, {:ok, item}) end loop(stack) end def size(pid) do send pid, {:size, self} receive do {:ok, size} -> size end end def push(pid, item) do send pid, {:push, item} end def pop(pid) do send pid, {:pop, self} receive do {:ok, item} -> item end end end
The start_link\0
method creates a linked process. It does so by calling the recursive loop
function and sets its initial state to an empty list. The __MODULE__
references the current module, which makes for easy refactoring.
The loop\1
function uses the receive
keyword to wait for messages. When the message {:size, sender}
or {:pop, sender}
is received, the loop function sends a message back to the sender in the form of {:ok, answer}
. On receiving the message {:push, item}
, it adds the item to the top (or front or head) of the stack but does not reply.
Functions size\1
and pop\1
work very similarly. Both functions send a message to the stack process from the current and wait (using receive
) for the stack process to answer.
On the other hand, the push\2
function sends a message and the item to be added to the top (or front or head) of the stack but does not wait for a reply.
The tests are green. Ship it! Just kidding, time to refactor.
GenServers
The above implementation seems daunting when compared to a Ruby solution. Several concepts — like tail call optimization, process communication, and recursion — need to be understood before coming to a solution. One could argue that concepts like classes, instance variables, and message passing must be understood to create a Ruby stack. But it’s impossible to deny that the solution is almost twice as much code.
Thankfully, Elixir has an abstraction called GenServer (short for Generic Server). A GenServer’s goal is to abstract the receive
loop, which makes the code cleaner and more manageable.
Once a GenServer process has been created (using start_link\2
), messages can be sent using call\3
and cast\2
. The former expects a reply to return to the calling function and the latter does not. This can be managed using the handle_call\3
and handle_cast\2
callbacks.
There is a lot more functionality you can employ with a GenServer; you might want to check out Elixir’s Getting Started Guide as well as the docs, which implement a stack very similar to the one below:
defmodule Stack do use GenServer def start_link do GenServer.start_link __MODULE__, [] end def size(pid) do GenServer.call pid, :size end def push(pid, item) do GenServer.cast pid, {:push, item} end def pop(pid) do GenServer.call pid, :pop end #### # Genserver implementation def handle_call(:size, _from, stack) do {:reply, Enum.count(stack), stack} end def handle_cast({:push, item}, stack) do {:noreply, [item | stack]} end def handle_call(:pop, _from, [item | rest]) do {:reply, item, rest} end end
Agents
The tests are still green, and this implementation is much easier to maintain and reason about.
However, for very simple processes that are used to maintain simple state, like a stack, there is an even easier abstraction: the Agent.
defmodule Stack do def start_link do Agent.start_link fn -> [] end end def size(pid) do Agent.get pid, fn stack -> Enum.count(stack) end end def push(pid, item) do Agent.update pid, fn stack -> [item | stack] end end def pop(pid) do Agent.get_and_update pid, fn [item | last] -> {item, last} end end end
Running the tests, everything is still green. This final implementation feels good and is similar in lines of code to the Ruby solution.
Conclusion
Elixir is extremely performant and fun. However, some of the concepts are difficult to reason about when coming from an imperative or object-oriented language. Elixir is functional, stateless, and data is immutable. But when it’s necessary to keep track of state, Elixir’s got your back by using recursion and processes.
Reference: | Statefulness in a Stateless Language: Elixir from our WCG partner Florian Motlik at the Codeship Blog blog. |