Designing a scalable GraphQL schema
Gintas Kovalevskis
June 30, 2022
The schema is the central concept of any GraphQL API, but getting the schema design right can be tough, especially if it’s your first time. Is it future-proof? Does it follow best practices?
In this article, I will share the key lessons I learned over the past years designing and implementing GraphQL for our NordSec checkout infrastructure. Hopefully, this will make it easier for you to design a great GraphQL schema that is easy to use and scales well for the future.
1. Design your schema based on business needs
It can be tempting to generate a GraphQL schema based on your data model or DB schema to save time. However, this approach has one major disadvantage: your schema directly depends on your implementation details.
This is a problem because any change to your data model will cause your schema to change, possibly breaking things for your GraphQL users. This may make it very hard to maintain and scale your GraphQL application.
If you’re simply looking for a more user-friendly way to run queries against a database, then this can be a viable approach. But unless that’s the case, you should design your schema based on your business use cases. A schema that doesn’t directly depend on your data model is much easier to scale, and a schema based on your business needs is much easier for clients to consume and for you to reason about.
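To make the contrast concrete, here is a minimal sketch. Every type and field name in it is hypothetical and invented for illustration; none of it comes from a real schema. The first type mirrors a database table, while the rest are modeled around a checkout use case.

```graphql
# Hypothetical type generated from a database table: column names,
# foreign keys, and numeric status codes leak into the API.
type OrdersRow {
  order_id: Int!
  cust_fk: Int!
  status_cd: Int!
  total_cents: Int!
}

# Hypothetical types designed around the business use case instead.
type Order {
  id: ID!
  customer: Customer!
  status: OrderStatus!
  totalPrice: Int!
}

type Customer {
  id: ID!
  email: String!
}

enum OrderStatus {
  PENDING
  PAID
  CANCELLED
}
```

If the orders table is later split or its columns are renamed, only the resolvers behind Order would need to change, while clients keep querying the same fields.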
Keep your business cases in mind when using best practices/patterns developed by the community, as well. For example, consider a pagination pattern based on connections and edges. This is a good pattern if your goal is to implement pagination that can only navigate to the next or previous page. But what if your business needs more traditional pagination based on page numbers? In that case, the mentioned pattern won’t work for you.
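The difference between the two pagination styles can be sketched roughly as follows. The Product type, the query names, and the default page size are assumptions made for this example only.

```graphql
# Hypothetical item type used only for this example.
type Product {
  id: ID!
  name: String!
}

# Cursor-based pagination (connections and edges): clients move
# forward or backward relative to a cursor.
type ProductConnection {
  edges: [ProductEdge!]!
  pageInfo: PageInfo!
}

type ProductEdge {
  cursor: String!
  node: Product!
}

type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
  startCursor: String
  endCursor: String
}

# Page-number pagination: clients can jump straight to any page.
type ProductPage {
  nodes: [Product!]!
  page: Int!
  pageSize: Int!
  totalPages: Int!
}

type Query {
  productsConnection(first: Int, after: String): ProductConnection!
  productsPage(page: Int! = 1, pageSize: Int! = 20): ProductPage!
}
```

Neither shape is better in general. Which one fits depends on whether your UI needs "load more" behavior or numbered pages.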
2. Be specific with your naming
“There are only two hard things in Computer Science: cache invalidation and naming things.”
— Phil Karlton
Naming things is hard, and it’s especially true when we are talking about GraphQL. Once you’ve named your query/field and published it, there’s no going back - it’s probably already being used by clients whose applications would break if you changed the name. To avoid this, try to be specific and avoid abstract names. Abstract names make your queries less readable and may take up names that would suit future use cases better.
Let’s consider this Address type:
    type Address {
      country: String!
    }

Notice the country field. Is it clear what kind of data this represents? Is it a country name or a country code? To make it worse, what if we want to add another field that returns a Country type containing both the country name and code? We can’t call that field country, because the name is already taken by the String field.
This issue could easily be avoided if we gave our field a more specific name:
    type Address {
      countryCode: String!
    }

Keep in mind that similar naming collisions also apply to object types. Consider a Status type. What kind of status is it, which object type does it belong to, and what if you want to have two separate Status types? Naming it something specific like OrderStatus eliminates the issue.
3. Nest your object types when possible
Consider the previous example with countryCode. What if we also want to add the country name? Are we going to add another countryName field? Sure, that might work, but it’s poor design because it clutters our Address type with unrelated country data. Not only that, but we can’t really reuse the country fields in any other types.
Instead of cramming all the fields into a single type, it’s a good idea to nest related fields in their own types. In this case, we can just define a new type Country:

    type Country {
      code: String!
    }

    type Address {
      country: Country!
    }
This makes the Address type much cleaner. It will only contain data that is related to an actual address (not a country). We can also reuse the Country type in other types, and implementation is now much easier because we can have separate resolvers just for country-related data.
Apply the single-responsibility principle here - is the type responsible for representing a specific field, or should it belong to a different type? Don’t be afraid to introduce new nested types if they only contain a single field. You will be grateful when you have to introduce new fields in the future.
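The payoff shows up when the country name is eventually needed. In the sketch below, the name field is the addition discussed above, while TaxRate and its fields are hypothetical and only there to show reuse.

```graphql
type Country {
  code: String!
  # Added later. Address does not change, and existing queries keep working.
  name: String!
}

type Address {
  country: Country!
}

# Hypothetical second type that reuses Country instead of duplicating
# countryCode/countryName fields.
type TaxRate {
  country: Country!
  percentage: Float!
}
```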
4. Keep your interfaces small
Interfaces are a great way to introduce polymorphic data. However, it’s easy to fall into a trap and overload your interfaces with fields. When your schema scales up, you may experience a mismatch between the features your interfaces define and the features your types actually implement. To prevent this problem, it’s a good idea to keep your base interfaces small and introduce new features/fields through additional feature interfaces.
Take a look at this example:
    type Query {
      plans: [Plan!]!
    }

    interface Plan {
      id: ID!
      period: Period!
      price: Int!
    }

    interface VaryingPlan {
      variation: String!
    }

    type NordVPNPlan implements Plan {
      id: ID!
      period: Period!
      price: Int!
    }

    type NordPassPlan implements Plan & VaryingPlan {
      id: ID!
      period: Period!
      price: Int!
      variation: String!
    }
Here, the Plan interface has 3 fields (id, period, and price) that are required to describe any plan. NordVPNPlan and NordPassPlan both implement this interface, but NordPassPlan has an additional feature - it can have a variation. Instead of forcing the variation field on all plan types, we have a feature interface that is only implemented by NordPassPlan. We can query such plans like this:
    {
      plans {
        id
        price
        ... on VaryingPlan {
          variation
        }
      }
    }
Because our initial Plan interface is so small, we have the flexibility to introduce new Plan types that have their own unique features.
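As a sketch of that flexibility, a new plan type with its own feature could be added like this. TrialPlan, BundlePlan, and trialDays are hypothetical names invented for the example. Plan and Period are assumed to be defined as in the schema above (the article never shows the definition of Period).

```graphql
# Hypothetical feature interface for plans that offer a free trial.
interface TrialPlan {
  trialDays: Int!
}

# Hypothetical new plan type: it implements the small base interface
# plus only the feature interface it actually supports.
type BundlePlan implements Plan & TrialPlan {
  id: ID!
  period: Period!
  price: Int!
  trialDays: Int!
}
```

Existing queries keep working unchanged, and clients that care about the new feature can opt in with one more fragment:

```graphql
{
  plans {
    id
    price
    ... on VaryingPlan {
      variation
    }
    ... on TrialPlan {
      trialDays
    }
  }
}
```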
5. Wrap your payload
To make your queries more flexible, consider wrapping your inputs and responses in payload objects. Let’s take the previous query that returns plans:

    type Query {
      plans: [Plan!]!
    }

This works fine as long as we only want to return plans. But what if, in the future, we also want to return the total plan count or introduce pagination? We would have to refactor the query (or introduce a new one) just to add new metadata fields. This can be addressed by wrapping plans in an object type:

    type Query {
      plans: PlanListPayload
    }

    type PlanListPayload {
      nodes: [Plan!]!
    }
Now, whenever we want to extend the plans output with additional metadata, we can add it to PlanListPayload without any breaking changes:
    type PlanListPayload {
      nodes: [Plan!]!
      totalCount: Int!
    }
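The same idea can be applied to inputs. The following is only a sketch: PlanListInput, its fields, and the default values are assumptions, not part of the schema described in this article. Because the argument is a single input object, new optional filters or paging options can be added later without changing the signature of plans.

```graphql
# Hypothetical input wrapper. New optional fields can be added here
# later without breaking existing clients.
input PlanListInput {
  page: Int = 1
  pageSize: Int = 20
}

type Query {
  plans(input: PlanListInput): PlanListPayload
}
```

A client could then request the metadata and the plans in one query:

```graphql
{
  plans(input: { page: 2 }) {
    totalCount
    nodes {
      id
      price
    }
  }
}
```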
Final note
Designing a GraphQL schema can be a daunting task, and changing and evolving business requirements won’t make it any easier. It’s impossible to predict the future and design a perfect schema, but we can take steps to make our schema more future-proof.
Hopefully, these tips will make it easier to design your GraphQL API. If you’re ever in doubt, just check how others do it - GitHub and GitLab have great schema examples.
Finally, take it slow:
“Don’t try to model your entire business domain in one sitting. Rather, build only the part of the schema that you need for one scenario at a time. By gradually expanding the schema, you will get validation and feedback more frequently to steer you toward building the right solution.”