Exploring the Structure of Ruby Gems
When creating Rails apps, especially ones that don’t diverge too far from a typical/standard one, we don’t have to think very often about how to structure our files or what goes where. Models go in the models folder, Controllers in the controllers folder, etc.
But what about all of those gems we include in our Gemfile? How are they structured? Do they also have a standard way to organize their code?
In this article, we’ll be looking at how Ruby gems are typically structured. We’ll look at where to put our code, how to namespace it, and how to make sure it is all wired up together correctly. Along the way, we’ll take a look at four popular Ruby gems to see how they’ve done it.
Loading Code
Before we look at how code inside of a gem is structured, let’s look at the different mechanisms Ruby gives us to load code.
Before you can use code, it must be loaded into memory. Ruby has a special global variable called $LOADED_FEATURES that lets us see which files it has already looked at to load their code into memory. A random Rails app I just looked at had almost 2,500 entries.
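To make this concrete, here is a minimal sketch of watching $LOADED_FEATURES change. The helper name newly_loaded_features and the throwaway file are assumptions made up for this illustration; only require and $LOADED_FEATURES come from Ruby itself.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the entries added to $LOADED_FEATURES
# while the given block runs.
def newly_loaded_features
  before = $LOADED_FEATURES.dup
  yield
  $LOADED_FEATURES - before
end

# Write a throwaway file so the demo does not depend on any installed gem.
demo_dir  = Dir.mktmpdir
demo_file = File.join(demo_dir, 'loaded_features_demo.rb')
File.write(demo_file, "module LoadedFeaturesDemo; end\n")

# The first require loads the file and records its path.
puts newly_loaded_features { require demo_file }

# The second require is skipped because the path is already recorded.
puts newly_loaded_features { require demo_file }.empty?
```

Because require consults this array before doing any work, requiring the same file twice is cheap and safe.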
To get Ruby to load files, we’ll need to use one of three different methods that come with the language. They are require, require_relative, and autoload. All three methods are part of the Kernel module.
require with a relative path looks for a file within the directories listed in the $LOAD_PATH, a registry of all the places code can be loaded from automatically for a specific app.
require_relative is similar to require, but it looks within folders relative to the file it is being called from.
autoload doesn’t load the file right away but registers it to be loaded the first time a module is used: autoload(:MyModule, "/usr/local/lib/modules/my_module.rb").
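The difference between the first two methods can be sketched as follows. Everything here is hypothetical: the LoaderDemoMain and LoaderDemoHelper modules and their files are written to a temporary directory purely for the demonstration.

```ruby
require 'tmpdir'

# Hypothetical throwaway library, written to a temp directory for the demo.
demo_lib = Dir.mktmpdir

File.write(File.join(demo_lib, 'loader_demo_helper.rb'), <<~RUBY)
  module LoaderDemoHelper
    GREETING = 'hello'
  end
RUBY

File.write(File.join(demo_lib, 'loader_demo_main.rb'), <<~RUBY)
  # Resolved relative to this file's own directory, not via $LOAD_PATH.
  require_relative 'loader_demo_helper'

  module LoaderDemoMain
    def self.greet
      LoaderDemoHelper::GREETING
    end
  end
RUBY

# require searches each directory in $LOAD_PATH for the given name,
# so the directory has to be registered first.
$LOAD_PATH.unshift(demo_lib)
require 'loader_demo_main'

puts LoaderDemoMain.greet # prints "hello"
```

Inside a gem, require_relative keeps working even if the gem's folder is not on the $LOAD_PATH, while require depends on that registration.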
Matz has said that autoload will be deprecated and probably won’t survive Ruby 3.0. However, I have a hard time seeing how it could be removed outright, judging by its use in some of the most popular gems, such as rack, which is second only to rake in total downloads.
Why don’t I have to require in Rails?
In Rails apps, you rarely have to load files yourself. This is because gems found in your Gemfile are automatically required, and Rails itself handles resolving and loading files for you. Justin Weiss has a great article on how Rails handles loading all of the gems found in your Gemfile.
Organizing Code in a Gem
Let’s take a look at how code is organized in a Ruby gem. For demonstration purposes, I’ve created a simple gem to do something that is completely unnecessary in Ruby: left and right padding of a string!
It’s unnecessary in Ruby because, unlike JavaScript, the language itself can do this with the rjust and ljust methods of the String class.
Here is how the code is organized:
├── Gemfile
├── Gemfile.lock
├── README.md
├── lib
│   ├── padder
│   │   ├── center.rb
│   │   ├── left.rb
│   │   ├── right.rb
│   │   └── version.rb
│   └── padder.rb
├── padder.gemspec
└── spec
    ├── center_spec.rb
    ├── left_spec.rb
    ├── right_spec.rb
    └── spec_helper.rb
Where does the code live?
The library’s code lives inside of the lib folder in the majority of cases. This path is added by default to the $LOAD_PATH when the gem is activated. You can override this by setting the require_paths option in the padder.gemspec file. It is left out of this gem because I am going with the default.
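As a rough illustration of that default, the sketch below builds a specification in memory rather than reading a real padder.gemspec. The padder_spec helper, the version, the summary, the author, and the 'src' folder are all placeholders invented for this example.

```ruby
require 'rubygems'

# Hypothetical helper that builds an in-memory spec for the demo.
# A real padder.gemspec would also list files, a license, and so on.
def padder_spec(require_paths = nil)
  Gem::Specification.new do |s|
    s.name    = 'padder'
    s.version = '0.1.0'                    # placeholder version
    s.summary = 'Left and right string padding'
    s.authors = ['Example Author']         # placeholder author
    # Only override the default when a value is passed in.
    s.require_paths = require_paths if require_paths
  end
end

p padder_spec.require_paths          # ["lib"], the default
p padder_spec(['src']).require_paths # ["src"], overridden
```

Whatever ends up in require_paths is what gets added to the $LOAD_PATH when the gem is activated.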
Namespacing your gem
Your gem should have a unique namespace (module) in which all of its code lives. This is to avoid namespace collisions and interference with other gems.
You’d normally declare this module in a file with the same name as the gem, just inside of the lib folder. For this gem, there’s a padder.rb file. This is important because, when the gem is used in a Rails app, Bundler by default requires the file named after the gem, and that file is the entry point for loading the rest of your code.
# lib/padder.rb
module Padder
  require 'padder/version'
  require 'padder/left'
  require 'padder/right'
  require 'padder/center'
end
Another job of this file is to require (or autoload) the other files needed for the library to perform its job. These other files are generally found within a folder with the same name as the module/gem.
I wanted to see what’s happening with the $LOADED_FEATURES array. Inside of the left_spec.rb file, I put in some code to look at this array before and after the require method is called to ensure that things are working as expected.
before = $LOADED_FEATURES.dup
require 'padder'
($LOADED_FEATURES - before).each { |str| puts str }
Which resulted in a difference of:
/Users/leigh/padder/lib/padder/version.rb
/Users/leigh/padder/lib/padder/left.rb
/Users/leigh/padder/lib/padder/right.rb
/Users/leigh/padder/lib/padder/center.rb
/Users/leigh/padder/lib/padder.rb
And taking a look at the $LOAD_PATH array, filtering out entries which come from other gems, I get this:
/Users/leigh/padder/spec
/Users/leigh/padder/lib
If we take a look at one of the classes found in this gem, the Padder::Left class, it looks like this:
module Padder
  class Left
    def self.pad(str, length, char)
      str = str.to_s
      return str if str.length >= length
      "#{char * length}#{str}".slice(-length, length)
    end
  end
end
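For a sense of how the class is used, here is a small sketch. Padder::Left is copied from above so the example runs on its own; Padder::Right is a guess at a mirror-image implementation, since the gem's actual right.rb is not shown here.

```ruby
module Padder
  # Copied from the class above so this example is self-contained.
  class Left
    def self.pad(str, length, char)
      str = str.to_s
      return str if str.length >= length
      "#{char * length}#{str}".slice(-length, length)
    end
  end

  # Assumption: a mirror image of Left, not the gem's real code.
  class Right
    def self.pad(str, length, char)
      str = str.to_s
      return str if str.length >= length
      # Append the padding, then keep only the first `length` characters.
      "#{str}#{char * length}".slice(0, length)
    end
  end
end

puts Padder::Left.pad('42', 5, '0')  # prints 00042
puts Padder::Right.pad('42', 5, '.') # prints 42...

# Same result as the built-in String#rjust.
puts Padder::Left.pad('42', 5, '0') == '42'.rjust(5, '0') # prints true
```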
Looking at the lib Folder
Many extremely popular projects also start with a single file in the lib folder, which matches the name of the gem and also the namespace/module that all of the code will live in. Looking at how popular gems have organized their code is a great way to learn new concepts and techniques.
Sidekiq
Sidekiq only has a single file named sidekiq.rb in this folder. It does the job of requiring all of its helper classes and dependencies, setting up defaults, and providing a series of helper methods which have to do with logging and configuration.
If you look closely, you’ll even find a bit of an Easter egg with this method:
def self.❨╯°□°❩╯︵┻━┻
  puts "Calm down, yo."
end
Capybara
Capybara uses this first file not only to require dependencies, much as Sidekiq does, but also to define a series of empty classes which it uses for exceptions.
module Capybara
  # ...
  class CapybaraError < StandardError; end
  class DriverNotFoundError < CapybaraError; end
  class FrozenInTime < CapybaraError; end
  class ElementNotFound < CapybaraError; end
  # ...
end
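The benefit of a shared base class is that callers can rescue everything a gem raises in one place. The sketch below applies the same pattern to the padder gem; the error classes and the validate! method are hypothetical and exist only for this illustration.

```ruby
module Padder
  # Hypothetical base class: every error the gem raises inherits from it.
  class Error < StandardError; end
  class InvalidLengthError < Error; end
  class InvalidCharError < Error; end

  # Hypothetical validation helper used only for this demo.
  def self.validate!(length, char)
    unless length.is_a?(Integer) && length.positive?
      raise InvalidLengthError, 'length must be a positive integer'
    end
    raise InvalidCharError, 'char must not be empty' if char.to_s.empty?
    true
  end
end

begin
  Padder.validate!(-1, '*')
rescue Padder::Error => e
  # Rescuing the base class catches any of the gem's specific errors.
  puts "#{e.class}: #{e.message}"
end
```

Empty subclasses cost nothing to define, yet they let users choose between handling one specific failure and handling them all.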
Rack
Rack uses this space to define things such as the constants used within the library. But instead of requiring the files with the require method, this library has chosen to use the autoload method. This lazy-loads the necessary files when the module/class is referenced for the first time.
module Rack
  # ...
  GET = 'GET'.freeze
  POST = 'POST'.freeze
  PUT = 'PUT'.freeze
  PATCH = 'PATCH'.freeze
  # ...
  autoload :Builder, "rack/builder"
  autoload :BodyProxy, "rack/body_proxy"
  autoload :Cascade, "rack/cascade"
  # ...
end
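The lazy behavior can be observed with Module#autoload?, which returns the registered path while a constant is still waiting to be loaded. This is a minimal sketch that does not use Rack itself: the LazyGemDemo module and its temporary file are invented for the demo.

```ruby
require 'tmpdir'

# Hypothetical file holding a class that should only load on first use.
lazy_dir  = Dir.mktmpdir
lazy_file = File.join(lazy_dir, 'lazy_part.rb')
File.write(lazy_file, <<~RUBY)
  module LazyGemDemo
    class LazyPart
      def self.ready?
        true
      end
    end
  end
RUBY

module LazyGemDemo
end

# Register the file without loading it.
LazyGemDemo.autoload(:LazyPart, lazy_file)

# Still pending: autoload? returns the registered path.
p LazyGemDemo.autoload?(:LazyPart)

# The first reference to the constant triggers the actual load.
p LazyGemDemo::LazyPart.ready?

# Once loaded, the constant is no longer pending, so this is nil.
p LazyGemDemo.autoload?(:LazyPart)
```

For a library with many optional parts, this means users only pay the load cost for the pieces they actually touch.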
Rake
Rake has a very tame initial file that simply defines the Rake module along with its version and requires the other files used by the library.
module Rake
  VERSION = '11.1.2'
end
# ...
require 'rake/linked_list'
require 'rake/cpu_counter'
require 'rake/scope'
# ...
Deeper into the lib
Once we enter the lib/rack, lib/rake, lib/capybara, and lib/sidekiq folders, we run into a series of files. These generally each contain a single class (all living under the main module) or a further set of folders that will divide the gem’s code up into more specialized modules to handle specific portions of the libraries’ functionality.
Conclusion
Today we looked at a few of the most popular Ruby gems to see how they have chosen to organize their code. As you can see, they are all generally done the same way. Some have chosen to have modules loaded via the autoload method, while others have used require to load the files at the beginning.
One thing is the same though: The gem is responsible for loading its own code, and you as the user of the gem don’t have to worry about it for the most part.
Reference: Exploring the Structure of Ruby Gems, from our WCG partner Leigh Halliday at the Codeship Blog.