Dealing with N+1 in GraphQL (Part 1)
When people first see GraphQL, their first question is often: what about N+1 problems?
This is a legitimate question. At Product Hunt, as in every GraphQL-powered project, we run into the same issue.
For example, let's say we have the following query:
query {
  posts(date: "today") {
    id
    name
    slug
    topics {
      id
      name
      slug
    }
  }
}
And the following type definition for a post:
class Graph::Types::PostType < GraphQL::Schema::Object
  field :id, ID, null: false
  field :name, String, null: false
  field :slug, String, null: false
  field :topics, [Graph::Types::TopicType], null: false
end
This would result in the following queries:
SELECT * FROM posts WHERE DATE(featured_at) = {date}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
-- ...
That is a lot of N+1 SQL queries.
Not only do we run additional queries, but when several posts share the same topics, we load those topics multiple times.
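To make the cost concrete, here is a minimal plain-Ruby sketch of the naive resolution. It does not touch a real database: `FakeDb`, its data, and its method names are hypothetical stand-ins that only count lookups, assuming three posts that partly share topics.

```ruby
# Hypothetical in-memory stand-in for the database; it only counts lookups.
class FakeDb
  POST_TOPIC_IDS = { 1 => [10, 11], 2 => [10], 3 => [10, 12] }.freeze
  TOPIC_NAMES = { 10 => "Tech", 11 => "Design", 12 => "Games" }.freeze

  attr_reader :query_count, :topic_rows_loaded

  def initialize
    @query_count = 0
    @topic_rows_loaded = 0
  end

  # Stands in for: SELECT * FROM posts WHERE ...
  def post_ids
    @query_count += 1
    POST_TOPIC_IDS.keys
  end

  # Stands in for: SELECT * FROM topics JOIN post_topics WHERE post_id = ...
  def topics_for(post_id)
    @query_count += 1
    names = POST_TOPIC_IDS.fetch(post_id).map { |id| TOPIC_NAMES.fetch(id) }
    @topic_rows_loaded += names.size
    names
  end
end

db = FakeDb.new
# Naive resolution: one topics lookup per post, like the unbatched field.
topics_by_post = db.post_ids.to_h { |post_id| [post_id, db.topics_for(post_id)] }
distinct_topics = topics_by_post.values.flatten.uniq

puts "queries: #{db.query_count}"
puts "topic rows loaded: #{db.topic_rows_loaded}, distinct topics: #{distinct_topics.size}"
```

With three posts this issues four lookups (one for the posts plus one per post) and loads five topic rows even though only three distinct topics exist.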
The best solution I have found to this problem so far is the GraphQL::Batch gem from Shopify.
Adding the gem to your project is simple.
class Graph::Schema < GraphQL::Schema
  # ...
  use GraphQL::Batch
  # ...
end
The way batching works: instead of executing the query immediately, the resolver hands the record to an instance of GraphQL::Batch::Loader and returns a promise instead of the data. The same loader instance is used for every occurrence of a given field leaf. After all fields for a given GraphQL query leaf are collected, the perform method of the loader is called with all requested records. perform must then load the requested data and fulfill it for each record.
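The idea can be sketched in plain Ruby. This is a toy model, not GraphQL::Batch's actual implementation: `ToyLoader`, `LazyValue`, `perform_calls`, and the sample data are hypothetical, and the loader is cached per class here only to keep the sketch short (the real gem manages loader instances for you during query execution).

```ruby
# Toy batch loader in plain Ruby. NOT the GraphQL::Batch implementation.
class ToyLoader
  # Placeholder handed back to the caller; resolved only when `sync` is called.
  class LazyValue
    def initialize(loader, key)
      @loader = loader
      @key = key
    end

    def sync
      @loader.resolve(@key)
    end
  end

  # `.for` returns the same instance on every call,
  # so all field resolutions queue into one shared list.
  def self.for
    @instance ||= new
  end

  attr_reader :perform_calls

  def initialize
    @queue = []
    @results = {}
    @perform_calls = 0
  end

  # Queue the key and return a placeholder instead of loading now.
  def load(key)
    @queue << key unless @queue.include?(key) || @results.key?(key)
    LazyValue.new(self, key)
  end

  # Run `perform` once for everything queued so far, then read the result.
  def resolve(key)
    unless @queue.empty?
      keys = @queue
      @queue = []
      @perform_calls += 1
      perform(keys)
    end
    @results.fetch(key)
  end

  private

  def fulfill(key, value)
    @results[key] = value
  end
end

class TopicsLoader < ToyLoader
  TOPICS_BY_POST = { 1 => %w[tech design], 2 => %w[tech], 3 => %w[games] }.freeze

  # Called once with every queued post id; stands in for a single IN (...) query.
  def perform(post_ids)
    post_ids.each { |id| fulfill(id, TOPICS_BY_POST.fetch(id)) }
  end
end

# Each "field" asks for its topics but gets a placeholder back.
lazy_values = [1, 2, 3].map { |post_id| TopicsLoader.for.load(post_id) }

# Values are materialized only after all fields have been collected.
topics = lazy_values.map(&:sync)
puts topics.inspect
puts "perform calls: #{TopicsLoader.for.perform_calls}"
```

The first `sync` triggers a single `perform` call for all three queued posts; the remaining placeholders just read the already-fulfilled results.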
Here is how we are going to solve the N+1 for topics with GraphQL::Batch.
First, we change the post type to use a new resolver.
class Graph::Types::PostType < GraphQL::Schema::Object
  # ...
  field :topics, [Graph::Types::TopicType], null: false, function: Graph::Resolvers::Posts::TopicsResolver.new
end
Define the resolver itself:
class Graph::Resolvers::Posts::TopicsResolver < GraphQL::Function
  def call(post, _args, _ctx)
    # `.for` makes sure we get the same loader instance
    # for all leaves, so we can group the data
    loader = TopicsLoader.for
    # adds the post to the list of posts to be loaded and
    # returns a promise, not the actual topics;
    # the promise gets resolved into topics afterward
    loader.load(post)
  end

  # Loaders are a mechanism to postpone loading
  # until we have all posts in the list
  class TopicsLoader < GraphQL::Batch::Loader
    # perform is called with all the posts
    def perform(posts)
      # this is the built-in Active Record mechanism to
      # preload associations into a group of records;
      # associations are loaded with the minimum number of queries,
      # and if several posts have the same topics, they are loaded once
      # (on Rails 7+, use Preloader.new(records: posts, associations: :topics).call)
      ::ActiveRecord::Associations::Preloader.new.preload(posts, :topics)
      posts.each do |post|
        # fulfills the promise for every post in the list with its topics
        fulfill post, post.topics
      end
    end
  end
end
Now we have far fewer queries:
SELECT * FROM posts WHERE DATE(featured_at) = {date}
SELECT * FROM topics JOIN post_topics WHERE post_id IN ({post_ids})
Also, each topic is loaded only once.
Since loading associations is a very common problem, I have created a generic AssociationLoader. It can be found in this gist. Using it, the topics field can be declared like this:
class Graph::Types::PostType < GraphQL::Schema::Object
  # ...
  field :topics, [Graph::Types::TopicType], null: false, function: Graph::Resolvers::AssociationResolver.new(:topics)
end