/ ruby

Dealing with N+1 in GraphQL (Part 1)

When people see GraphQL their first question - what about N+1 problems.

This is a legit question. In Product Hunt as in every GraphQL powered project, we have the same issues.

For example, let's say we have the following query:

query {
  posts(date: 'today') {
    id
    name
    slug
    topics {
      id
      name
      slug
    }
  }
}

And the following type definition for a post:

class Graph::Types::PostType < GraphQL::Schema::Object
  field :id, ID, null: false
  field :name, String, null: false
  field :slug, String, null: false
  field :topics, [Graph::Types::TopicType], null: false
end

This would result the following queries:

SELECT * FROM posts WHERE DATE(featured_at) = {date}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
SELECT * FROM topics JOIN post_topics WHERE post_id = {post_id}
-- ...

All those N+1s SQL queries ?

We don't only have additional queries but when a couple of posts have the same topics we load those multiple times.

The best solution, I have found to this problem so far was to use GraphQL::Batch gem from Shopify.

Adding the gem to your project is simple.

class Graph::Schema << GraphQL::Schema
  # ...

  use GraphQL::Batch

  # ...
end

The way batching works: Instead of executing the query immediately, it returns an instance of GraphQL::Batch::Loader. The same instance of a loader object is returned for a given field leaf. After all fields for a given GraphQL query leaf are collected, then the perform method of the loader is called with all requested records. preform then must load and return for each record the requested data.

Here is how are going to solve the N+1 for topics with GraphQL::Batch.

First, we change the post type to use a new resolver.

class Graph::Types::PostType < GraphQL::Schema::Object
  # ...

  field :topics, [Graph::Types::TopicType], null: false, function: Graph::Resolvers::Posts::TopicsResolver.new
end

Define the resolver itself:

class Graph::Resolvers::Posts::TopicsResolver < GraphQL::Function
  def call(post, _args, _ctx)
    # `.for` makes sure we return the same loader instance
    # so all leaves, so we can group data
    loader = TopicsLoader.for
    # adds a post to the list of posts to be loaded
    loader.load(post)
    # returns the loader, not the actual topics
    # this gets transformed into topics afterward
    loader
  end

  # Loaders represent promises and mechanism to 
  # postpone loading until we have all posts in the list
  class TopicsLoader < GraphQL::Batch::Loader
    # perform called with all the posts
    def perform(posts)
      # this is the built-in active record mechanism to 
      # preload associations into a group of records
      # association are loaded with the minimum amount of queries
      # if a couple of posts have same topics they would be loaded once
      ::ActiveRecord::Associations::Preloader.new.preload(posts, :topics)
      
      posts.each do |post|
        # returns topics for every post in the list
        fulfill post, post.topics
      end
    end
  end
end

Now we have a lot less queries ?

SELECT * FROM posts WHERE DATE(featured_at) = {date}
SELECT * FROM topics JOIN post_topics WHERE post_id IN ({post_ids})

Also, each topic is loaded only once ?

Since loading associations is a very typical problem, I have created a generic AssociationLoader. It can be found in this gist.

class Graph::Types::PostType < GraphQL::Schema::Object
  # ...

  field :topics, [Graph::Types::TopicType], null: false, function: Graph::Resolvers::AssociationResolver.new(:topics)
end