Exploring the Structure of Ruby Gems

Leigh Halliday | May 12th, 2016 | Last Updated: May 5th, 2016


When creating Rails apps, especially ones that don’t diverge too far from a typical/standard one, we don’t have to think very often about how to structure our files or what goes where. Models go in the models folder, Controllers in the controllers folder, etc.

But what about all of those gems we include in our Gemfile? How are they structured? Do they also have a standard way to organize their code?

In this article, we’ll be looking at how Ruby gems are typically structured. We’ll look at where to put our code, how to namespace it, and how to make sure it is all wired up correctly. Through this process, we’ll take a look at four popular Ruby gems to see how they’ve done it.

Loading Code

Before we look at how code inside of a gem is structured, let’s look at the different mechanisms Ruby gives us to load code.

Before you can use code, it must be loaded into memory. Ruby has a special global variable called $LOADED_FEATURES that lists the files it has already loaded. A random Rails app I just looked at had almost 2,500 entries.
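To see this bookkeeping in action, here is a minimal sketch. The file name and the constant inside it are made up for the demonstration, and the file is written to a temporary directory so nothing else on your machine is involved.

```ruby
require "tmpdir"

# Hypothetical throwaway file, created only for this demonstration.
dir  = File.realpath(Dir.mktmpdir("loaded_features_demo"))
path = File.join(dir, "loaded_features_demo_greeting.rb")
File.write(path, "LOADED_FEATURES_DEMO_GREETING = 'hello'\n")

first_call  = require(path) # true: the file was read and evaluated
second_call = require(path) # false: Ruby already knows the file and skips it

puts first_call
puts second_call
puts $LOADED_FEATURES.include?(path)
puts "#{$LOADED_FEATURES.size} files loaded so far"
```

The second call returning false is the practical effect of $LOADED_FEATURES: requiring the same file twice does not evaluate it twice.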

To get Ruby to load files, we’ll need to use one of three different methods that come with the language. They are require, require_relative, and autoload. All three methods are part of the Kernel module.

require, when given a name that is not an absolute path, searches for the file in the directories listed in $LOAD_PATH, a registry of all the places code can be loaded from automatically for a specific app.
require_relative is similar to require, but it resolves the path relative to the file that calls it.
autoload doesn’t load the files right away but registers a file to be loaded the first time a module is used: autoload(:MyModule, "/usr/local/lib/modules/my_module.rb").
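The difference between the first two is easier to see side by side. The sketch below is only an illustration: the LoadDemo module and both file names are hypothetical, and the files are generated in a temporary directory. The entry file is found through $LOAD_PATH, and it pulls in its neighbour with require_relative.

```ruby
require "tmpdir"

# Hypothetical file names, created in a temporary directory for this sketch.
dir = File.realpath(Dir.mktmpdir("load_demo"))

# Helper file: loaded relative to the file that asks for it.
File.write(File.join(dir, "load_demo_helper.rb"), <<~RUBY)
  module LoadDemo
    HELPER_LOADED = true
  end
RUBY

# Entry file: found through $LOAD_PATH.
File.write(File.join(dir, "load_demo_main.rb"), <<~RUBY)
  require_relative "load_demo_helper"

  module LoadDemo
    MAIN_LOADED = true
  end
RUBY

# Without the directory on $LOAD_PATH, a bare name cannot be resolved.
begin
  require "load_demo_main"
rescue LoadError => e
  puts "not found yet: #{e.message}"
end

# Once the directory is registered, the same call succeeds.
$LOAD_PATH.unshift(dir)
require "load_demo_main"

puts LoadDemo::MAIN_LOADED
puts LoadDemo::HELPER_LOADED
```

Note that the helper never needed to be on $LOAD_PATH by name; require_relative found it because it sits next to the file that asked for it.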

Matz says that autoload will be deprecated and probably won’t survive Ruby 3.0. However, I have a hard time seeing it being removed outright, judging by its use in some of the most popular gems, such as rack, which is second only to the rake gem in total downloads.

Why don’t I have to require in Rails?

In Rails apps, you rarely have to load files yourself. This is because gems found in your Gemfile are automatically required, and Rails itself handles resolving and loading files for you. Justin Weiss has a great article on how Rails handles loading all of the gems found in your Gemfile.

Organizing Code in a Gem

Let’s take a look at how code is organized in a Ruby gem. For demonstration purposes, I’ve created a simple gem to do something that is completely unnecessary in Ruby: left and right padding of a string!

It’s unnecessary in Ruby because, unlike JavaScript, the language itself can do this with the rjust and ljust methods of the String class.

Here is how the code is organized:

├── Gemfile
├── Gemfile.lock
├── README.md
├── lib
│   ├── padder
│   │   ├── center.rb
│   │   ├── left.rb
│   │   ├── right.rb
│   │   └── version.rb
│   └── padder.rb
├── padder.gemspec
└── spec
    ├── center_spec.rb
    ├── left_spec.rb
    ├── right_spec.rb
    └── spec_helper.rb

Where does the code live?

In the majority of cases, the library’s code lives inside of the lib folder. This path is added by default to the $LOAD_PATH when the gem is activated. You can override this by setting the require_paths option in the padder.gemspec file. It is left out of this gem because I am going with the default.
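For reference, here is roughly what setting that option would look like. This is a sketch rather than the actual padder.gemspec: the version, summary, author, and file list are placeholder values, and the spec is assigned to a constant only so it can be inspected afterwards (a real gemspec file simply ends with the Gem::Specification.new block).

```ruby
require "rubygems"

# Illustrative gemspec; every value below is a placeholder assumption.
PADDER_SPEC_SKETCH = Gem::Specification.new do |spec|
  spec.name    = "padder"
  spec.version = "0.1.0"
  spec.summary = "Left, right, and center padding for strings"
  spec.authors = ["Example Author"]
  spec.files   = Dir["lib/**/*.rb"]

  # Optional: "lib" is already the default, so this line can be omitted.
  spec.require_paths = ["lib"]
end

puts PADDER_SPEC_SKETCH.require_paths.inspect
```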

Namespacing your gem

Your gem should have a unique namespace (module) in which all of its code lives. This is to avoid namespace collisions and interference with other gems.

You’d normally declare this module in a file with the same name as the gem just inside of the lib folder. For this gem, there’s a padder.rb file. This is important because Bundler, which Rails uses to load the gems in your Gemfile, requires a file with the same name as the gem by default.

# lib/padder.rb
module Padder
  require 'padder/version'
  require 'padder/left'
  require 'padder/right'
  require 'padder/center'
end

Another job of this file is to require (or autoload) the other files needed for the library to perform its job. These other files are generally found within a folder with the same name as the module/gem.

I wanted to see what’s happening with the $LOADED_FEATURES array. Inside of the left_spec.rb file, I put in some code to look at this array before and after the require method is called to ensure that things are working as expected.

before = $LOADED_FEATURES.dup
require 'padder'
($LOADED_FEATURES - before).each { |str| puts str }

Which resulted in a difference of:

/Users/leigh/padder/lib/padder/version.rb
/Users/leigh/padder/lib/padder/left.rb
/Users/leigh/padder/lib/padder/right.rb
/Users/leigh/padder/lib/padder/center.rb
/Users/leigh/padder/lib/padder.rb

And taking a look at the $LOAD_PATH array, filtering out entries which come from other gems, I get this:

/Users/leigh/padder/spec
/Users/leigh/padder/lib

If we take a look at one of the classes found in this gem, the Padder::Left class, it looks like this:

module Padder
  class Left
    def self.pad(str, length, char)
      str = str.to_s
      return str if str.length >= length
      "#{char * length}#{str}".slice(-length, length)
    end
  end
end
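Here is a quick usage sketch. The class is repeated so the snippet runs on its own, and the sample inputs are my own picks rather than anything taken from the gem’s specs.

```ruby
module Padder
  class Left
    def self.pad(str, length, char)
      str = str.to_s
      return str if str.length >= length
      "#{char * length}#{str}".slice(-length, length)
    end
  end
end

puts Padder::Left.pad("5", 3, "0")     # prints "005"
puts Padder::Left.pad("hello", 3, "0") # prints "hello" (already long enough)
puts Padder::Left.pad(42, 5, "*")      # prints "***42" (non-strings go through to_s)

# The built-in String#rjust gives the same result for a single-character pad.
puts "5".rjust(3, "0")                 # prints "005"
```

The trick in the last line of pad is to build a string that is certainly long enough and then keep only the final length characters.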

Looking at the lib Folder

Many extremely popular projects also start with a single file in the lib folder, which matches the name of the gem and also the namespace/module that all of the code will live in. Looking at how popular gems have organized their code is a great way to learn new concepts and techniques.

Sidekiq

Sidekiq only has a single file named sidekiq.rb in this folder. It requires all of its helper classes and dependencies, sets up defaults, and provides a series of helper methods which have to do with logging and configuration.

If you look closely, you’ll even find a bit of an Easter egg with this method:

def self.❨╯°□°❩╯︵┻━┻
  puts "Calm down, yo."
end

Capybara

Capybara uses this first file not only to require dependencies, much like Sidekiq does, but also to define a series of empty classes which it uses for exceptions.

module Capybara
  # ...
  class CapybaraError < StandardError; end
  class DriverNotFoundError < CapybaraError; end
  class FrozenInTime < CapybaraError; end
  class ElementNotFound < CapybaraError; end
  # ...
end
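One likely reason for giving a gem its own base error class is that callers can rescue everything the gem raises in one place. The sketch below applies the same pattern to the padder gem; the error classes and the validate! helper are hypothetical and do not exist in the gem shown earlier.

```ruby
module Padder
  # Hypothetical error classes; the padder gem in this article does not define them.
  class Error < StandardError; end
  class InvalidLengthError < Error; end
  class InvalidPadCharacterError < Error; end

  # Hypothetical validation helper, only here so there is something to raise.
  def self.validate!(length, char)
    raise InvalidLengthError, "length must not be negative" if length.negative?
    raise InvalidPadCharacterError, "pad character must not be empty" if char.to_s.empty?
    true
  end
end

begin
  Padder.validate!(-1, "0")
rescue Padder::Error => e
  # Rescuing the base class catches every subclass as well.
  puts "#{e.class}: #{e.message}"
end
```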

Rack

Rack uses this space to define things such as the constants used within the library. But instead of also requiring the files with the require method, this library has chosen to use the autoload method. This lazy-loads the necessary files when the module/class is referenced for the first time.

module Rack
  # ...
  GET     = 'GET'.freeze
  POST    = 'POST'.freeze
  PUT     = 'PUT'.freeze
  PATCH   = 'PATCH'.freeze
  # ...
  autoload :Builder, "rack/builder"
  autoload :BodyProxy, "rack/body_proxy"
  autoload :Cascade, "rack/cascade"
  # ...
end
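The lazy part is easy to verify for yourself. This is a small sketch, not Rack’s code: the AutoloadDemo module and its file are hypothetical and written to a temporary directory. Module#autoload? reports the registered path while the constant is still waiting to be loaded, and nil once it has been.

```ruby
require "tmpdir"

# Hypothetical file, written to a temporary directory for this sketch.
dir  = File.realpath(Dir.mktmpdir("autoload_demo"))
path = File.join(dir, "autoload_demo_builder.rb")
File.write(path, <<~RUBY)
  module AutoloadDemo
    class Builder; end
  end
RUBY

module AutoloadDemo; end

# Register the file; nothing is read from disk yet.
AutoloadDemo.autoload(:Builder, path)

puts AutoloadDemo.autoload?(:Builder).inspect # the registered path
puts $LOADED_FEATURES.include?(path)

# The first reference to the constant triggers the load.
builder_class = AutoloadDemo::Builder
puts builder_class

puts AutoloadDemo.autoload?(:Builder).inspect # nil, the constant is now defined
puts $LOADED_FEATURES.include?(path)
```

The trade-off is that a missing or broken file only shows up when the constant is first used, not when the gem itself is required.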

Rake

Rake has a very tame initial file that simply defines the Rake module along with its version and requires the other files used by the library.

module Rake
  VERSION = '11.1.2'
end
# ...
require 'rake/linked_list'
require 'rake/cpu_counter'
require 'rake/scope'
# ...

Deeper into the lib

Once we enter the lib/rack, lib/rake, lib/capybara, and lib/sidekiq folders, we run into a series of files. These generally each contain a single class (all living under the main module) or a further set of folders that will divide the gem’s code up into more specialized modules to handle specific portions of the libraries’ functionality.

Conclusion

Today we looked at a few of the most popular Ruby gems to see how they have chosen to organize their code. As you can see, they are all generally done the same way. Some have chosen to have modules loaded via the autoload method, while others have used require to load the files at the beginning.

One thing is the same though: The gem is responsible for loading its own code, and you as the user of the gem don’t have to worry about it for the most part.

Reference:

Exploring the Structure of Ruby Gems from our WCG partner Leigh Halliday at the Codeship Blog.