Managing Private Dependencies with Bundler
Bundler is a great resource for managing dependencies in your Ruby projects. It helps verify compatible versions between each of your gem dependencies as well as create a version lock file. This guarantees that everyone who uses that same project will be working with the same gem versions that worked for you.
Specifying Dependencies with Gemfiles
The Gemfile is a place where you may specify each of your dependencies.
source 'https://rubygems.org'
ruby '2.3.1'
gem 'rails', '~> 4.1.16'
gem 'json', '~> 1.8.3'
Typically the gems specified here are openly available and require no credentials to download. You may also point an individual gem at a git repository instead, which lets you use your own open-source code, a fork of an open-source project, or the project’s git repository directly.
group :production do
  gem 'rails_log_stdout', github: 'heroku/rails_log_stdout'
end
Avoid breaking changes
In the first example in the previous section, the ~> given with the gem method specifies that all numbers except the last must match exactly, while the last number is the minimum required value for the project. In other words, ~> is a version helper for declaring that it’s okay to install small fix updates: '~> 4.1.16' accepts 4.1.16 and any later 4.1.x release, but not 4.2.0.
This fits with semantic versioning (semver), which by design gives the numbers the following meanings:
- The first number is the major version. When it changes, it signals some form of project completeness and/or breaking changes.
- The next number is the minor version. It normally allows for new features, bug fixes, and deprecation messages, but generally should not break your code base with changes.
- The last number is the patch version. This is specifically for non-breaking fixes or amendments.
But keep in mind that these meanings are entirely up to the project developers; they may choose how they interpret their own versioning semantics. Rails, for example, has added a fourth number for urgent security updates.
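To make the ~> rule concrete, here is a minimal sketch using the Gem::Requirement and Gem::Version classes that ship with RubyGems. The helper name fix_update_allowed? is made up for this illustration, and the sample version numbers are arbitrary.

```ruby
require 'rubygems' # normally loaded already; required here for completeness

# The same constraint given for rails in the first Gemfile example.
RAILS_CONSTRAINT = Gem::Requirement.new('~> 4.1.16')

# Hypothetical helper: does `version` satisfy the constraint?
def fix_update_allowed?(version)
  RAILS_CONSTRAINT.satisfied_by?(Gem::Version.new(version))
end

puts fix_update_allowed?('4.1.16') # the minimum itself is accepted
puts fix_update_allowed?('4.1.20') # later patch releases are accepted
puts fix_update_allowed?('4.1.15') # anything below the minimum is rejected
puts fix_update_allowed?('4.2.0')  # a minor version bump is rejected

# A four-part version string simply carries one more segment.
p Gem::Version.new('4.2.11.1').segments
```

Running a candidate version through a check like this is a quick way to confirm what a constraint will and won’t let Bundler install before you commit to it.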
If the project is young or volatile, you may want to specify an exact version with gem 'json', '1.8.3'. Bundler also allows other comparison operators, such as less-than or greater-than, to set the minimum and maximum versions permitted. These should be used with caution; the whole point of semantic versioning is to help avoid breaking changes, and it’s hard to know what changes may occur across versions.
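The difference between an exact pin and a minimum/maximum range can be checked the same way RubyGems evaluates it. This is only a sketch: version_permitted? is a hypothetical helper, and the range '>= 1.8.3', '< 2.0' is an example constraint chosen for illustration.

```ruby
require 'rubygems' # normally loaded already; required here for completeness

# Hypothetical helper: check a version string against one or more constraints,
# written the same way they would appear after the gem name in a Gemfile.
def version_permitted?(version, *constraints)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# Exact pin, as in: gem 'json', '1.8.3'
puts version_permitted?('1.8.3', '1.8.3') # accepted
puts version_permitted?('1.8.6', '1.8.3') # rejected, even though it is only a patch ahead

# Minimum and maximum, as in: gem 'json', '>= 1.8.3', '< 2.0'
puts version_permitted?('1.9.0', '>= 1.8.3', '< 2.0') # accepted, a minor bump gets through
puts version_permitted?('2.0.0', '>= 1.8.3', '< 2.0') # rejected at the upper bound
```

Note how the range lets a new minor version through, which is exactly the kind of update that deserves caution when a project does not follow semver strictly.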
With git repositories, you may specify a commit reference to restrict which version of the code is included.
gem 'json_pure', github: 'flori/json', ref: '7347860'
You don’t have to include the entire commit hash. Just the first several characters will do, as long as they identify the commit uniquely.
By keeping a stricter dependency set, you’ll still be able to update manually when you’re ready. It’s also easier to know what might have caused your code to break. Without semantic versioning, and with gems constantly updating, you run into situations where so many things have changed that it’s hard to determine what went wrong and where.
Managing Private Gems
When building a large project, it’s often a good thing to separate out parts of the code and logic into smaller pieces. Some of that is nice to publish as open-source code and release gems that others can benefit from. But you may also have proprietary code that should remain private and would be better managed separated into its own code base.
You can create a separate private git repository and give it multiple gem directories. This is perhaps a simpler and safer way to manage multiple private gems than using submodules. Submodules are a way of embedding a separate git repository inside the current one. They add a lot of complexity and a much higher potential for mistakes. I highly recommend a separate private repository instead of submodules, with each gem having its own directory to keep things simple. You may then add them in your Gemfile like so:
git "https://github.com/company/private-repo.git" do
  gem 'be_excellent', ref: '83623'
  gem 'to_eachother', ref: '8886c'
end
Here each gem lives in its own folder, and each folder contains a gemspec file whose name matches the gem. The ref given for each gem is the git commit at which you want that gem pinned.
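As a rough sketch of what one of those folders might contain, a minimal be_excellent/be_excellent.gemspec could look like the following. The version, summary, author, and lib/ file layout are placeholders assumed for illustration; only the gem name comes from the Gemfile example.

```ruby
# be_excellent/be_excellent.gemspec (hypothetical contents)
#
# Assumed repository layout:
#   private-repo/
#     be_excellent/be_excellent.gemspec
#     be_excellent/lib/be_excellent.rb
#     to_eachother/to_eachother.gemspec
#     to_eachother/lib/to_eachother.rb
spec = Gem::Specification.new do |s|
  s.name          = 'be_excellent'        # must match the name used in the Gemfile
  s.version       = '0.1.0'               # placeholder version
  s.summary       = 'Private gem example' # placeholder summary
  s.authors       = ['Your Company']      # placeholder author
  s.files         = Dir['lib/**/*.rb']    # assumes code lives under lib/
  s.require_paths = ['lib']
end
```

The assignment to spec is only there so the object can be inspected; in a real gemspec file the Gem::Specification.new call is usually the last expression on its own.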
Once you’ve added a private repository, git will require a valid user to access it. If you don’t prepare this on your production server, the build will error out.
When using GitHub, I recommend setting up an additional GitHub account that has access only to the private gem repository. Here’s why: to read the private repository with an access token, the token must be granted the user’s full read and write privileges. GitHub does not, as of this writing, have a read-only access token.
To set up access to your private repository, follow these steps:
- Create an account specifically for the private repository.
- Add that account to the repository.
- In that new account’s GitHub settings, select Personal access tokens.
- Select Generate new token.
- Fill in the description Account for server to pull private repo.
- Check the box that says repo to give full access permission to the new access token.
- Click “Generate token.”
- Copy the token it gives you.
- Once you have the access token, assign it to an environment variable in your production environment:
BUNDLE_GITHUB__COM=your_really_long_key_here:x-oauth-basic
- Be sure to include the :x-oauth-basic at the end of the access token.
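The variable name follows Bundler’s convention for per-host settings: the host is upper-cased and each dot becomes a double underscore. The sketch below derives the name and performs a simple pre-flight check you could run on the server before bundling. Both helper names are hypothetical and not part of Bundler.

```ruby
# Derive the environment variable Bundler consults for a given host.
# Hypothetical helper that covers hosts made of letters, digits, and dots.
def bundler_env_key(host)
  "BUNDLE_#{host.upcase.gsub('.', '__')}"
end

# Hypothetical pre-flight check: is a token present, and does the value end
# with the required ':x-oauth-basic' suffix? `env` defaults to the real
# environment but accepts any Hash-like object, which keeps it easy to test.
def github_credentials_ready?(env = ENV)
  value = env[bundler_env_key('github.com')].to_s
  token, suffix = value.split(':', 2)
  !token.to_s.empty? && suffix == 'x-oauth-basic'
end

puts bundler_env_key('github.com')
warn 'GitHub credentials for Bundler are missing or malformed' unless github_credentials_ready?
```

A check like this fails fast with a clear message instead of leaving you to decode an authentication error in the middle of a build.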
Once you’ve set this environment variable on your production server, you should be able to build your production environment successfully with it, accessing GitHub as the new user you’ve created to access the private repository.
Be sure you’re running an updated version of Bundler. It’s been updated to sanitize URLs so as not to disclose private GitHub credentials anywhere (logs, lock file, stdout). With the environment variable mentioned earlier, this should not be an issue, but if you reference the environment variable from your Gemfile, older versions would display the credentials in their output.
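To illustrate what sanitizing a URL means here, the following sketch strips embedded credentials from a repository URL before it is printed. This is not Bundler’s actual implementation, only an illustration; strip_credentials is a hypothetical helper and the token in the example is a placeholder.

```ruby
# Hypothetical helper: remove a "user:password@" section from a URL so it can
# be logged safely. URLs without credentials are returned unchanged.
def strip_credentials(url)
  url.sub(%r{\A([a-z][a-z0-9+.-]*://)[^/@]*@}i, '\1')
end

unsafe = 'https://placeholder_token:x-oauth-basic@github.com/company/private-repo.git'
puts strip_credentials(unsafe)
```

The same idea applies to anything of your own that echoes repository URLs, such as deploy scripts or CI logs.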
Summary
With a little bit of work, you can get your private gems integrated and more easily maintain your code base. Having a project grow into a monolith can be quite the headache. Separating things out according to different concerns can help add clarity to your code base.
Of course, remember to be cautious when dealing with credentials. Don’t use your main GitHub account for access tokens, since granting read permission also requires granting full write permission to ALL of your projects.
Now, maintaining and switching Gemfile references for your private gems is relatively easy. But one complexity that comes from this is continuous integration tests for each of the subdirectories. Typical CI test suites will grab the main repo and its root test directory, but with many gem subdirectories each having their own test folder, you’ll likely need a more robust solution to remedy this. I look forward to seeing a solution to subdivided test suites. If this is too big of an issue for you to overlook, you may try to have each gem be its own private repo and add the server’s GitHub user account to each of them.
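In the meantime, one possible stopgap is a small script at the repository root that finds every gem directory and runs its tests in turn. This is a sketch under assumptions: the helper names are made up, and the test command you pass in (for instance bundle exec rake test) must be something each gem actually supports.

```ruby
# Find every immediate subdirectory of `root` that contains a gemspec file.
def gem_directories(root)
  Dir.glob(File.join(root, '*', '*.gemspec'))
     .map { |path| File.dirname(path) }
     .uniq
     .sort
end

# Run `command` (an array of program and arguments) inside each gem directory.
# Returns a hash mapping gem directory name to true (passed) or false (failed).
def run_suites(root, command)
  gem_directories(root).each_with_object({}) do |dir, results|
    results[File.basename(dir)] = system(*command, chdir: dir) == true
  end
end

# Example invocation from the repository root, assuming every gem defines a
# `rake test` task (an assumption of this sketch):
#   results = run_suites(Dir.pwd, %w[bundle exec rake test])
#   exit(results.values.all? ? 0 : 1)
```

Exiting non-zero when any suite fails lets a typical CI service treat the whole repository as a single build while still exercising each gem’s own tests.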
Keeping your code private and credentials safe are key concerns for companies. Managing credentials is still a big topic of discussion, but one point of consensus is that credentials should not be stored in the code base.
Environment variables seem to be a step in the right direction, but these sensitive credentials should still be encrypted when possible. As time goes on, I expect that the developer community will hash out these issues, and we’ll all be more secure for it. For now, this is the most secure private GitHub/Bundler solution that I’m aware of, and I hope you’ve found it helpful.
Reference: Managing Private Dependencies with Bundler from our WCG partner Daniel P. Clark at the Codeship Blog.