Modeling GraphQL Mutations
GraphQL mutations provide a flexible way for a client to modify data on the server. On the surface, it is a very simple concept: syntactically, a GraphQL mutation resembles a normal query, with one small difference: top-level mutation fields are never executed in parallel. This guarantee ensures that each sibling mutation field always operates on the most up-to-date version of the data.
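The serial guarantee can be illustrated with a small, hypothetical mutation document (the field names here are made up for illustration):

```graphql
# The server resolves `first` to completion before starting `second`,
# so `second` always observes the result of `first`.
mutation {
  first: changeStockQuantity(sku: "ABC-1", quantity: 5) {
    quantity
  }
  second: changeStockQuantity(sku: "ABC-1", quantity: 10) {
    quantity
  }
}
```

A normal query with two sibling fields carries no such ordering guarantee; the executor is free to resolve them in parallel.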
Although you can put arbitrary logic in mutation field resolvers, top-level fields are typically used to perform some kind of side effect, like updating data in the database, sending an email, etc. So it comes as no surprise that these field names usually start with a verb, like create, delete, update, send, etc.
This flexibility of GraphQL mutations raises a question: when you build a new GraphQL API, what design principles should you follow when designing the mutations? Should it resemble a simple CRUD (Create, Read, Update, Delete) API where all updates operate on anemic data structures, or are there other, possibly more robust, design principles that you can employ? It turns out that this question is not as trivial as it seems.
In this article, I would like to explore one possible way to approach this challenge, with examples from our own GraphQL API.
Recently, a series of articles was published on this topic:
I found them very inspiring and I highly recommend reading them all. They also inspired me to share our approach to modeling GraphQL mutations. I will reference the relevant posts where you can get more information on a specific topic.
A story of 2 models
At commercetools, our API is heavily influenced by the principles of CQRS (Command Query Responsibility Segregation) and DDD (Domain-Driven Design). CQRS might sound scary at first, but in essence it is quite a simple concept: it separates the read and write data models. GraphQL makes a strong distinction between input and output object types, which must be defined separately. This makes it perfectly compatible with CQRS. In many cases these types might look quite similar, but they are not necessarily the same.
This brings a lot of flexibility: when we model mutation fields and their arguments (write model), we are no longer constrained by the shape and structure of the output types (read model). Given this flexibility, we decided to model all mutations as “commands” and “update actions”.
An example of a command would be createProduct, updateDiscountCode, mergeAnonymousCart, etc. In the GraphQL schema, a command is represented as a top-level mutation field. An update action represents a more fine-grained change to a specific aggregate root (you can think of an "aggregate root" as an entity with a globally unique ID; more on that below). All update actions are represented as GraphQL input types.
To make it more concrete, let's have a look at an example of such an aggregate root that we expose via the GraphQL API: DiscountCode.
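A simplified sketch of the relevant schema definitions (the names and scalars below are illustrative and abridged, not the exact commercetools schema):

```graphql
# Read model: the output object type for the aggregate root.
type DiscountCode {
  id: String!
  version: Long!      # assumes a custom Long scalar
  code: String!
  name(locale: Locale): String
  isActive: Boolean!
}

# Write model: the draft used to create a discount code ...
input DiscountCodeDraft {
  code: String!
  name: [LocalizedStringItemInput!]
  isActive: Boolean = true
}

# ... and the commands exposed as top-level mutation fields.
type Mutation {
  createDiscountCode(draft: DiscountCodeDraft!): DiscountCode
  updateDiscountCode(id: String!, version: Long!,
                     actions: [DiscountCodeUpdateAction!]!): DiscountCode
  deleteDiscountCode(id: String!, version: Long!): DiscountCode
}
```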
In this example, DiscountCodeDraft, DiscountCodeUpdateAction, and the other *Draft input types represent our write model and only loosely correspond to the actual read model (the DiscountCode output object type).
This design goes hand in hand with the ideas described in the "Anemic Mutations" article.
Update actions and polymorphic input types
Modeling every possible change as an update action (a separate input object) has a lot of advantages. With this approach we have a lot of control over the domain constraints and are able to represent and enforce them with the help of the GraphQL type system.
Let's look at the update actions of the DiscountCode aggregate root.
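A sketch of what this update action input might look like (abridged; the field and type names are assumptions):

```graphql
# One optional field per supported update action. Note that there is
# deliberately no `setCode` field: the code is immutable after creation.
input DiscountCodeUpdateAction {
  changeIsActive: ChangeDiscountCodeIsActive
  setName: SetDiscountCodeName
  setCustomType: SetDiscountCodeCustomType
  # ... further update actions ...
}

input SetDiscountCodeCustomType {
  typeId: String                 # reference the Type by ID ...
  typeKey: String                # ... or by key ...
  type: ResourceIdentifierInput  # ... or by a full resource identifier
  fields: [CustomFieldInput!]
}
```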
We are enforcing several domain constraints here. For example, there is no setCode update action (after its creation, the string code is immutable). When updating custom fields (with the setCustomType update action), we are also able to provide several convenience fields to identify the related Type (another aggregate root) with either typeId, typeKey, or a full ResourceIdentifier. It would be hard to provide this functionality if we had chosen an anemic data model. Moreover, this gives us very good documentation and code auto-completion in GraphiQL for all commands and update actions.
Here is an example of how a client might update an existing DiscountCode:
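For illustration, such a request might look roughly like this (the identifiers and action payloads are made up):

```graphql
mutation {
  updateDiscountCode(
    id: "discount-code-id"
    version: 4    # the last version the client has seen
    actions: [
      { setName: { name: [{ locale: "en", value: "Summer sale" }] } }
      { changeIsActive: { isActive: true } }
    ]
  ) {
    id
    version       # bumped by the server on success
    name(locale: "en")
    isActive
  }
}
```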
As you can see, you can provide multiple update actions at once. You must also provide the last seen version of the discount code object. These aspects will become important when we discuss transactional boundaries and optimistic concurrency control.
You probably also noticed that the current form allows you to provide several input fields in a single update action, e.g. {setName: …, changeIsActive: …}. This goes back to the way we represent polymorphic input types. In our Scala backend, all these update actions are modeled as a type hierarchy:
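A hedged Scala sketch of such a hierarchy (the type and field names are illustrative, not the actual commercetools sources):

```scala
// Illustrative supporting types for the write model.
final case class LocalizedString(values: Map[String, String])
final case class ResourceIdentifier(id: Option[String], key: Option[String])

// Each update action is one case of a sealed hierarchy, so the compiler
// can check that pattern matches over actions are exhaustive.
sealed trait DiscountCodeUpdateAction
final case class ChangeIsActive(isActive: Boolean) extends DiscountCodeUpdateAction
final case class SetName(name: Option[LocalizedString]) extends DiscountCodeUpdateAction
final case class SetCustomType(
    typeId: Option[String],
    typeKey: Option[String],
    typeResourceIdentifier: Option[ResourceIdentifier]
) extends DiscountCodeUpdateAction
```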
Unfortunately, the GraphQL spec does not support polymorphic input types at the moment. To address this limitation, we model them as an additional input object, DiscountCodeUpdateAction, that contains all of the available update action fields. To enforce the "single input field per update action" constraint, we implemented a simple run-time validation on the backend side.
All the ideas I just described correlate nicely with the "Static-Friendly Mutations" and "Batch Updates" posts, so definitely check them out for more information.
As a nice side effect of this model, we are also able to go one step further and internally represent all data changes as events. In our API backend, commands and update actions are validated, then they execute the relevant business logic and are persisted as a set of events (one command might produce multiple events).
This approach is called Event Sourcing. In practice, we have a set of background listeners that react to all incoming events (with a few queues in between) and either perform additional business logic or propagate and accumulate the data in views. One such view is the read model you get as the result of a successful mutation.
Transactional boundary
“We will argue below why we conclude that atomic transactions cannot span entities. The programmer must always stick to the data contained inside a single entity for each transaction. This restriction is true for entities within the same application and for entities within different applications.”
— Pat Helland
As Pat Helland described in his whitepaper "Life beyond Distributed Transactions", it is hard to achieve atomic transactional behaviour that spans multiple entities while still maintaining a massively scalable system. For this reason, we provide a quite narrow transactional boundary that is limited to a single aggregate root.
In other words, all update actions in the list are applied as a single atomic operation: either all of them succeed and get persisted, or none of them are applied. There is no such guarantee for commands, though: there are no transactional guarantees between multiple top-level mutation fields.
Also keep in mind the difference between nullable and non-null mutation field types. According to the GraphQL error propagation rules, if an error is raised during the resolution of a non-null field, the whole enclosing object is considered unresolved and the error bubbles up. Since mutation fields are always resolved sequentially, this gives us a mechanism to control how query execution should react to a failed mutation field.
In our case we decided to make all top-level mutation fields nullable. This means that if one mutation field fails for some reason, then subsequent sibling mutation fields will still execute.
An alternative would be to disallow this behaviour and make all fields non-null; in this case, query execution stops at the first failed mutation field. Yet another alternative would be to let the client decide the error-handling behaviour by providing both field variations in the schema.
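Sketched in schema form (the OrFail variant is hypothetical, not something our API exposes):

```graphql
type Mutation {
  # Nullable result: on failure this field resolves to null, an error is
  # recorded, and subsequent sibling mutation fields still execute.
  updateDiscountCode(id: String!, version: Long!,
                     actions: [DiscountCodeUpdateAction!]!): DiscountCode

  # Hypothetical non-null variant: on failure the error bubbles up and,
  # since mutation fields run serially, the remaining siblings are skipped.
  updateDiscountCodeOrFail(id: String!, version: Long!,
                           actions: [DiscountCodeUpdateAction!]!): DiscountCode!
}
```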
Optimistic concurrency control
When it comes to mutations, especially on mission-critical data, it is important to be aware of inherent concurrency.
Let’s imagine a simple GraphQL client that:
- Loads stock information
- Makes an analysis of the stock entry and then makes a decision of whether to increment or decrement the quantity
- Sends appropriate mutation to the server
It is important to realize that between steps 1 and 3, another, unrelated GraphQL client might do the same and update the stock entry while the first client is still making its decision (see diagram below). This means that the decision might no longer be valid. In this scenario, the client should be able to detect such conflicts and react to them (for instance, by retrying steps 1 to 3).
By this point you have probably noticed the version argument, which is present in the update and delete commands. The aggregate root version provides a mechanism for a client to detect such scenarios.
In this diagram, when a client performs a mutation, it must specify the last version it has seen of the stock entry. If the server can verify that it is still the most recent version, the mutation succeeds. If the stock entry was mutated by some other client in the meantime, the server sees a more recent version and fails the mutation, thus allowing the client to detect the concurrent modification of the data.
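As a sketch (the mutation and action names here are made up), a stale client request and its outcome might look like this:

```graphql
# Client A last saw version 3, but client B has since bumped the stock
# entry to version 4. Client A's mutation is therefore rejected:
mutation {
  updateStockEntry(
    id: "stock-1"
    version: 3    # stale: the server is already at version 4
    actions: [{ changeQuantity: { quantity: 9 } }]
  ) {
    version
  }
}
# The field fails with a concurrent-modification error; the client can
# refetch the entry, re-evaluate its decision, and retry with version 4.
```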
Without this mechanism, a client would be unaware of external concurrent changes and would always override the server data with potentially outdated information. I would also like to point out that while this mechanism is quite helpful, it is not always necessary. In some cases a client just wants to override the data on the server, regardless of its current state and version. To cover these scenarios, you might consider making version optional.
Thanks for reading, and I hope you find this article useful! If you are still struggling with modeling your GraphQL mutations, give this approach a try: it provides a scalable, maintainable and future-proof solution. Also, feel free to drop a comment below and let us know what you think!