GraphQL and Performance in Rails
We looked previously at getting set up with GraphQL on Rails. We defined some queries, some mutations, and had a good time doing so! But what if I told you that with only a few hundred records in the database, it’s possible to write a query that brings our server grinding to a halt?
In this article, we’ll look at three ways to avoid performance issues with GraphQL in your Rails app, and then at a tool to help monitor which queries are being executed against your GraphQL API.
Avoiding N+1 Queries
An N+1 query happens when, for example, you want to show the owner’s name for five rentals and end up running six queries: one to find the rentals, and another five to find the owners one by one. This is triggered by sending the following query:
query {
  rentals {
    id
    owner {
      name
    }
  }
}
Here we are grabbing the rentals plus their owners’ names. If this were a REST endpoint, you’d know ahead of time that you were returning the rentals plus owners, so you’d probably add some eager loading to your query.
Unfortunately this won’t work here because we don’t know what the client is going to ask for. We don’t want to always eager-load the owner, because what if the client doesn’t ask for them? Remember that GraphQL solves the issue of under- and over-fetching, so let’s not reintroduce that into our code by guessing what the client will ask for.
If we look at our logs, they will look something like:
Rental Load (0.9ms) SELECT "rentals".* FROM "rentals"
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 120], ["LIMIT", 1]]
User Load (0.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 116], ["LIMIT", 1]]
User Load (0.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 116], ["LIMIT", 1]]
User Load (0.1ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 117], ["LIMIT", 1]]
User Load (0.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 114], ["LIMIT", 1]]
There is a gem provided by Shopify called graphql-batch that solves exactly this problem. It essentially allows us to batch up all of the user IDs and perform a single query to find them all at the same time. What we’re going to do is modify the owner field’s resolver to look like this:
# app/graphql/types/rental_type.rb
field :owner, Types::UserType do
  resolve -> (obj, args, context) {
    RecordLoader.for(User).load(obj.user_id)
  }
end
In addition to adding the line use GraphQL::Batch to our schema file, we also need to create the RecordLoader class:
# app/graphql/record_loader.rb
class RecordLoader < GraphQL::Batch::Loader
  def initialize(model)
    @model = model
  end

  def perform(ids)
    @model.where(id: ids).each { |record| fulfill(record.id, record) }
    ids.each { |id| fulfill(id, nil) unless fulfilled?(id) }
  end
end
You’ll see that we now get the following queries…much better!
Rental Load (0.5ms) SELECT "rentals".* FROM "rentals" ORDER BY "rentals"."id" DESC LIMIT $1 [["LIMIT", 20]]
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" IN (124, 125, 115, 120, 122, 121, 117, 118, 112, 113, 111, 119, 123)
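To see why batching helps, here is a minimal plain-Ruby sketch of the idea, with no Rails or graphql-batch involved. The FakeUserTable class, its query counter, and the sample names are all hypothetical stand-ins for the database. The real gem defers loads with promises, which this sketch does not try to reproduce.

```ruby
# Hypothetical in-memory stand-in for the users table that counts "queries".
class FakeUserTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # { id => name }
    @query_count = 0
  end

  # One lookup per call, like SELECT ... WHERE id = $1 LIMIT 1
  def find(id)
    @query_count += 1
    @rows[id]
  end

  # One lookup for many ids, like SELECT ... WHERE id IN (...)
  def where_ids(ids)
    @query_count += 1
    @rows.select { |id, _name| ids.include?(id) }
  end
end

USERS = { 114 => "Ana", 116 => "Ben", 117 => "Cy", 120 => "Di" }.freeze
RENTAL_OWNER_IDS = [120, 116, 116, 117, 114].freeze # owner ids of five rentals

# Naive: look up each owner as soon as it is requested.
def naive_owner_names(table, owner_ids)
  owner_ids.map { |id| table.find(id) }
end

# Batched: collect the ids first, load once, then hand the results back.
def batched_owner_names(table, owner_ids)
  found = table.where_ids(owner_ids.uniq)
  owner_ids.map { |id| found[id] } # nil for ids that were not found
end

naive_table = FakeUserTable.new(USERS)
naive_owner_names(naive_table, RENTAL_OWNER_IDS)
puts "naive lookups:   #{naive_table.query_count}"

batched_table = FakeUserTable.new(USERS)
batched_owner_names(batched_table, RENTAL_OWNER_IDS)
puts "batched lookups: #{batched_table.query_count}"
```

Both approaches return the same names in the same order. Only the number of lookups against the table differs.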
This works great for belongs_to relationships, but I had a hard time with has_many. For example, if you wanted the bookings for each rental, it would again produce N+1 queries. Luckily I found a great gist by theorygeek that solves this, and it turns out that someone turned it into a gem! Add enable_preloading to the schema file to be able to use this functionality.
Now, for the bookings field for example, we could write it like this:
# app/graphql/types/rental_type.rb
field :bookings, !types[Types::BookingType] do
  preload :bookings
  resolve -> (obj, args, ctx) { obj.bookings }
end
We can now throw a query like this at our API without the server even flinching:
query {
  rentals {
    id
    owner {
      name
    }
    bookings {
      guest {
        name
      }
    }
  }
}
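The has_many case follows the same idea: load the bookings for all listed rentals in one go, then group them by rental. Below is a rough plain-Ruby sketch of that grouping step. The Booking struct and the sample rows are hypothetical, and the array filter stands in for a single WHERE rental_id IN (...) query. It is not how the preloading gem is implemented.

```ruby
# Hypothetical booking row; in the app this would be an ActiveRecord model.
Booking = Struct.new(:id, :rental_id)

ALL_BOOKINGS = [
  Booking.new(1, 10),
  Booking.new(2, 10),
  Booking.new(3, 11)
].freeze

# Stand-in for one query: SELECT * FROM bookings WHERE rental_id IN (...)
def bookings_for_rentals(rental_ids)
  ALL_BOOKINGS.select { |booking| rental_ids.include?(booking.rental_id) }
end

# Returns { rental_id => [bookings] }, with an empty array for rentals
# that have no bookings, so every rental resolves without another lookup.
def preload_bookings(rental_ids)
  grouped = bookings_for_rentals(rental_ids).group_by(&:rental_id)
  rental_ids.to_h { |id| [id, grouped.fetch(id, [])] }
end

preload_bookings([10, 11, 12]).each do |rental_id, bookings|
  puts "rental #{rental_id}: bookings #{bookings.map(&:id).inspect}"
end
```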
Avoiding Overly Complex Queries
If we take the example from above, where the query begins to get nested deeper and deeper, we may get unintended results in terms of server performance. Let’s take the example below:
query {
  rentals {
    id
    bookings {
      id
      guest {
        id
        bookings {
          id
        }
      }
    }
  }
}
It has started to become circular and could theoretically continue that way forever. With only a few hundred test records in my database, I had 5s response times. When I went one level deeper in the query, I had 50s response times. To be fair, though, this was before I implemented the N+1 fixes mentioned above.
In terms of complexity, this query goes five levels deep. You might want to implement a rule stating that no query can go beyond four levels deep. It’ll be up to you to analyze which queries your users need to perform, but it’s good to have some sort of limit in place to avoid bringing your server down, and then expand it as needed.
I picked four levels deep because it would allow for a fairly typical scenario of:
- A post/book
- With comments/reviews
- With names of the commenters/reviewers
This is managed easily with the graphql gem by setting max_depth in your schema class:
# app/graphql/landbnb_schema.rb
LandbnbSchema = GraphQL::Schema.define do
  max_depth 4 # adjust as required
  use GraphQL::Batch
  enable_preloading

  mutation(Types::MutationType)
  query(Types::QueryType)
end
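To make the depth count concrete, here is a small plain-Ruby sketch that measures how deep a selection goes. The nested Hash is a hypothetical representation of the circular query shown earlier (each field maps to its sub-selections, and a leaf maps to an empty Hash). It only illustrates the counting and is not how the graphql gem analyzes queries internally.

```ruby
# Hypothetical representation: field name => nested selections ({} for a leaf).
CIRCULAR_QUERY = {
  rentals: {
    id: {},
    bookings: {
      id: {},
      guest: {
        id: {},
        bookings: { id: {} }
      }
    }
  }
}.freeze

# Depth of the deepest field in the selection; an empty selection has depth 0.
def selection_depth(selections)
  return 0 if selections.empty?

  1 + selections.values.map { |child| selection_depth(child) }.max
end

# Mirrors the idea of a max depth rule: accept only queries within the limit.
def within_depth?(selections, max_depth)
  selection_depth(selections) <= max_depth
end

puts "depth: #{selection_depth(CIRCULAR_QUERY)}"
puts "allowed with a limit of 4? #{within_depth?(CIRCULAR_QUERY, 4)}"
```

With this way of counting, the circular query comes out at five levels, so a limit of four would reject it while still allowing a rentals, bookings, guest, name style of query.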
Applying Timeouts
Another way to guard your GraphQL API would be to implement a global timeout. The code will try to resolve as many fields as it can within the allotted time. This is another great way to have a safeguard in place just in case there is a wild query you didn’t account for.
# added to bottom of app/graphql/landbnb_schema.rb
LandbnbSchema.middleware << GraphQL::Schema::TimeoutMiddleware.new(max_seconds: 2) do |err, query|
  Rails.logger.info("GraphQL Timeout: #{query.query_string}")
end
In testing, I set the timeout to 0.02 seconds to trigger the error result, which looks like:
{
  "data": {
    "rentals": [
      null
    ]
  },
  "errors": [
    {
      "message": "Timeout on Rental.id",
      "locations": [
        {
          "line": 3,
          "column": 5
        }
      ],
      "path": [
        "rentals",
        0,
        "id"
      ]
    }
  ]
}
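The behaviour of resolving whatever fits in the time budget and reporting the rest as errors can be sketched in plain Ruby. Everything here (resolve_with_timeout, the injectable clock, the field lambdas) is hypothetical and only mimics the shape of the response above. It is not the gem's middleware.

```ruby
# Resolves fields in order until the time budget is spent. Fields reached
# after the deadline come back as nil with a matching error entry.
# `clock` is injectable so the behaviour can be demonstrated deterministically.
def resolve_with_timeout(fields, max_seconds:,
                         clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
  deadline = clock.call + max_seconds
  data = {}
  errors = []

  fields.each do |name, resolver|
    if clock.call > deadline
      data[name] = nil
      errors << { "message" => "Timeout on #{name}", "path" => [name.to_s] }
    else
      data[name] = resolver.call
    end
  end

  { "data" => data, "errors" => errors }
end

# A fake clock that advances by one second every time it is read.
ticks = 0
fake_clock = -> { ticks += 1 }

fields = {
  id: -> { 1 },
  name: -> { "Cabin" },
  owner: -> { "Di" }
}

result = resolve_with_timeout(fields, max_seconds: 2, clock: fake_clock)
puts result.inspect
```

With the fake clock, the first two fields resolve inside the two-second budget and the third is reported as a timeout, so the client still receives partial data alongside the error.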
Monitoring
So how do you get visibility into the performance of your GraphQL API? For that, I’d recommend a great tool called Apollo Optics. It’s easy to get set up and running with Rails and provides visibility into what queries are being run, which fields are being used, and how long each operation takes. This service is free up to 1,000,000 requests per month.
The first step is to create an account, and once you do that you’ll be given an API key. If you store it under the name OPTICS_API_KEY, it will get picked up automatically. After installing the optics-agent gem, you’ll want to add a config initializer file named optics_agent.rb which looks like:
optics_agent = OpticsAgent::Agent.new
# replace the schema class with yours
optics_agent.configure { schema LandbnbSchema }
Rails.application.config.middleware.use optics_agent.rack_middleware
And lastly we’ll need to modify the GraphqlController to set our context to:
context = {
  current_user: current_user,
  optics_agent: request.env[:optics_agent].with_document(query)
}
Within a couple of minutes, you should see queries streaming into the dashboard, providing you with detailed reports and graphs of your GraphQL queries.
Conclusion
We’ve looked at three ways to keep unwieldy GraphQL queries from bringing down the server: avoiding N+1 queries, limiting how deeply queries can nest, and applying a timeout to query execution. Lastly, we took a look at a GraphQL monitoring tool called Apollo Optics.
Reference: GraphQL and Performance in Rails from our WCG partner Leigh Halliday at the Codeship Blog.