How to Find Ruby Code Smells with Reek
Your Ruby code smells. And it’s okay — mine does, too! Maybe it has some long methods with too many parameters, a few classes that try to do too much, an uncommunicative name here or there.
No codebase is perfect, but it’s worthwhile to be aware of its deficiencies and refactorings that could improve the state of things — so let’s look at a few example code smells, potential ways to fix them and a tool that helps find them in our applications.
Duplicate Method
One of the easiest to notice (and simplest to fix) code smells is duplicate method: a situation when a given method call is made two or more times in the same context.
Let’s assume we have a class that represents an office which has a method assessing whether the current weather helps concentrate on programming tasks. To assess the weather, the Office
class collaborates with a weather source and asks it for current weather at the office’s city:
class Office def weather_assessment if @weather_source.weather_at(@city).skies == :clear 'Too sunny' elsif @weather_source.weather_at(@city).temp > 20 'Too hot' else 'Perfect for coding' end end end
In the above implementation, the Office#weather_assessment
method makes two separate WeatherSource#weather_at
calls (probably resulting in firing synchronous requests over the network), each time passing the same argument. While there might be cases where this is warranted — the method’s results are volatile or explicitly calling it twice significantly improves readability — in the vast majority of cases making a single call and storing its result locally is a better solution:
class Office def weather_assessment weather = @weather_source.weather_at(@city) if weather.skies == :clear 'Too sunny' elsif weather.temp > 20 'Too hot' else 'Perfect for coding' end end end
Data Clump
Let’s create a few more methods for our Office
class — this time ones that use a navigation source collaborator to calculate distance, directions, and whether one needs a passport to get to the office’s geographical coordinates given a starting latitude and longitude:
class Office def directions(source_lat, source_lon) @nav_source.directions(from: [source_lat, source_lon], to: [@lat, @lon]) end def distance(source_lat, source_lon) @nav_source.distance(from: [source_lat, source_lon], to: [@lat, @lon]) end def needs_passport?(source_lat, source_lon) @nav_source.borders?(from: [source_lat, source_lon], to: [@lat, @lon]) end end
Notice how all of these methods take the same pair of parameters? This smell is called data clump, and — except for times when the arrangement is accidental — it usually means there’s a higher-level object missing from the implementation. In this case such an object could be a Location
struct representing the given pair of latitude/longitude coordinates:
Location = Struct.new(:lat, :lon) class Office def directions(source_location) @nav_source.directions(from: source_location, to: @location) end def distance(source_location) @nav_source.distance(from: source_location, to: @location) end def needs_passport?(source_location) @nav_source.borders?(from: source_location, to: @location) end end
Repeated Conditional
Let’s assume that (in our remote-friendly organization) we don’t exactly know the location of some of our offices. In this case we could be tempted to guard the NavSource
calls with a conditional check:
class Office def directions(source_location) if @location @nav_source.directions(from: source_location, to: @location) end end def distance(source_location) if @location @nav_source.distance(from: source_location, to: @location) end end def needs_passport?(source_location) if @location @nav_source.borders?(from: source_location, to: @location) end end end
This smell is called repeated conditional – doing the same check over and over again suggests that there might be a better solution. Unless such a set of checks has exceptional readability, refactoring usually yields a better solution –but whether the refactoring should consist of introducing a separate LocatedOffice
class or teaching the NavSource
about handling NullLocations
usually highly depends on the codebase in question.
Boolean/Control Parameter
Let’s now add a predicate that says whether the developers at the given Office
program in Ruby — and let’s make it parameterizable so the caller can choose whether this should be a strict question (‘only Ruby developers’) or a loose one (‘Ruby developers among others’).
class Office def ruby_developers?(strict = true) languages = @employees.map(&:primary_language).uniq if strict languages == ['Ruby'] else languages.include?('Ruby') end end end
This kind of code is called the boolean parameter smell (a case of the more general control parameter smell); the caller knows exactly which code path the method will take and controls its execution from the outside. The typical refactorings are to either split the parameterized predicate into dedicated methods or to split the class into more classes, each of which would implement one of the code paths.
In this case it seems separate methods work quite well:
class Office def only_ruby_developers? languages == ['Ruby'] end def ruby_developers? languages.include?('Ruby') end private def languages @employees.map(&:primary_language).uniq end end
Feature Envy
A code smell that’s a bit more problematic to refactor, but happens quite often, is feature envy.
Let’s add an Office#good_fit?
method that assesses whether a given office is a good fit for a given employee (passed in an argument); we consider the office a good fit when the employee is sociable or likes the city:
class Office def good_fit?(employee) employee.sociable? || employee.likes?(@city) end end
Notice how this method is much more interesting in interrogating its parameter; if it weren’t for the reference to the @city
instance variable it would be a utility function — a method that operates solely on its arguments and can be moved anywhere in the system without any change to its body.
Refactoring feature envy smells is more involved, but the general notion is the same: There is a chance that this method would work better if it was implemented in the context of its parameter:
class Office def good_fit?(employee) employee.would_like_to_work?(@city) end end class Employee def would_like_to_work?(city) sociable? || likes?(city) end end
Finding Smells with Reek
Now that we know a few common code smells, how can we find them in our own Ruby codebases? This is where Reek comes in — Reek is a code smell detector that can analyze Ruby files and pinpoint the smells, and you can think about it as a RuboCop for your architecture and code quality.
Reek’s usage is quite simple:
$ gem install reek $ reek office.rb office.rb -- 8 warnings: [1]:Office has no descriptive comment (IrresponsibleModule) [12, 18, 24]:Office takes parameters [source_lat, source_lon] to 3 methods (DataClump) [13, 19, 25]:Office tests @lat && @lon at least 3 times (RepeatedConditional) [40, 40]:Office#good_fit? refers to employee more than self (maybe move it to another class?) (FeatureEnvy) [30]:Office#ruby_developers? has boolean parameter 'strict' (BooleanParameter) [32]:Office#ruby_developers? is controlled by argument strict (ControlParameter) [33, 35]:Office#ruby_developers? refers to languages more than self (maybe move it to another class?) (FeatureEnvy) [3, 5]:Office#weather_assessment calls @weather_source.weather_at(@city) 2 times (DuplicateMethodCall)
Notice how Reek reports all of the code smells discussed above, as well as suggests that the original implementation of Office#ruby_developers?
also smells of feature envy; this suggests that maybe the languages
collection is the object that should implement the predicate. Reek also by default flags any classes and modules that lack code comments as irresponsible.
Reek can also be used on whole directories and comes with a configurable Rake task and a set of RSpec matchers, so it’s easy to integrate it into your tests or make your CI server flag any newly introduced code smells.
!New Call-to-action
Configuring Reek
Reek’s opinions about what constitutes a code smell are by necessity very subjective; while it’s relatively easy to make blanket decisions and enforce a certain syntax style, opinions on whether a given piece of code is a smell and whether it should be refactored usually vary much more and often have to be contrasted with the readability and blunt obviousness of the original, ‘smelly’ versions.
Fortunately, this can be addressed via Reek’s high configurability — both on the project-wide level and when it comes to particular code pieces.
A project-wide (and, for a bit more fine-grained cases, directory-wide) configuration can be added by creating a YAML file ending with .reek
. For example, the default configuration of the data clump smell is to aggressively flag even the smallest clumps as soon as they’re repeated:
DataClump: max_copies: 2 min_clump_size: 2
Conversely, a method is considered to have a long parameter list when it has four or more of them — except for constructors, which can take up to five without being flagged:
LongParameterList: max_params: 3 overrides: initialize: max_params: 5
Likewise, a method is considered to have too many statements when it has more than five of them — except for constructors, which are a bit harder to split in practice:
TooManyStatements: max_statements: 5 exclude: - initialize
The exclude setting can be tuned as needed — from just a single case of AClass#some_method
to a regular expression matching a set of methods, like /meth/
.
At the other end of the configuration applicability any method can be decorated with a set of magic comments, altering Reek’s flagging on a per-method basis — e.g., when we’re okay with a method call being made twice in a particular case:
# :reek:DuplicateMethodCall: { max_calls: 2 } def weather_assessment if @weather_source.weather_at(@city).skies == :clear 'Too sunny' elsif @weather_source.weather_at(@city).temp > 20 'Too hot' else 'Perfect for coding' end end
Such a comment will both have a higher chance to be moved along with the method when it’s refactored and keeps guarding the implementation from adding a third call (at which point the method gets smelly again).
Whither Now?
Now that you know how to find — and, potentially, refactor — smells in your code it’s super important to remember that smells do not necessarily mean that the code is bad; to quote the venerable Content Creation Wiki:
Note that a code smell is a hint that something might be wrong, not a certainty. A perfectly good idiom may be considered a code smell because it’s often misused, or because there’s a simpler alternative that works in most cases. Calling something a code smell is not an attack; it’s simply a sign that a closer look is warranted.
While it might be tempting to go all out and try to fix each and every code smell until Reek no longer points anything out, this is not the point. To quote the Wiki again:
Rules are for the guidance of the wise and the obedience of fools. There are two major approaches to programming:
- Pragmatic: code smells should be considered on a case by case basis
- Purist: all code smells should be avoided, no exceptions
…and therefore a code smell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist.
I highly recommend investigating your code smells and refactoring the ones that are indeed problematic — but also learning how to accept the ones that are not; both experiences will let you level up as a developer. You can read about more code smells (and potential refactorings) in Reek’s docs.
(If you enjoyed this post, check out my Ruby Smells talk from this year’s RubyConf Portugal, which shows more smells and refactorings and elaborates on Reek.)
Reference: | How to Find Ruby Code Smells with Reek from our WCG partner Florian Motlik at the Codeship Blog blog. |