If you’re part of a small engineering team working on the first release of an application, at some point the team will have to discuss and choose the app’s backend architecture. That choice shapes where the team will succeed and where it will struggle as it scales. The chosen architecture affects:
Quality & Maintainability
Ease of adding new features
Variable vs fixed infrastructure costs
Ramping up new engineers
Communication between peers and other stakeholders
These are a few of the many second- and third-order effects of choosing an app’s first architecture. At Courier, we chose an event-driven architecture (EDA) backed by AWS so that our backend could scale along many of the vectors listed. The next few sections cover what EDA is at a high level, why EDA fits Courier’s engineering needs, and the benefits and struggles we’ve experienced in the first year.
Event-driven architecture is a choice to design software around events. An event typically represents a change of state that occurred in the past. A part of the app dispatches an event for each state change, and one or more reactions follow. Reactions include updating the view in a read store, invalidating a cache, sending a notification, exposing the data to consumers via a webhook, or triggering another business process. Event-driven architectures promote a highly decoupled system because once a component dispatches an event, it doesn’t need to know what happens afterwards. This also lets engineers work independently on the code that reacts to the event. There is still coupling, though, at the event itself. If the shape of the event changes destructively, every system that reacts to it will need to change to process it correctly.
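To make the dispatch-and-react idea concrete, here is a minimal in-process sketch. It is not Courier’s implementation: the `EventBus` class, the `Event` type, the `message.sent` event name, and both handlers are hypothetical names chosen for illustration, and a real system would dispatch through a stream or queue rather than a function call.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """A fact about something that already happened (hypothetical shape)."""
    type: str
    payload: dict


class EventBus:
    """Maps an event type to any number of independent handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def dispatch(self, event: Event) -> int:
        # The dispatcher does not know or care what each handler does.
        handlers = self._handlers.get(event.type, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Two unrelated reactions to the same event.
read_store: dict = {}
cache: dict = {"msg-1": "stale rendered view"}


def update_read_store(event: Event) -> None:
    read_store[event.payload["message_id"]] = event.payload["status"]


def invalidate_cache(event: Event) -> None:
    cache.pop(event.payload["message_id"], None)


bus = EventBus()
bus.subscribe("message.sent", update_read_store)
bus.subscribe("message.sent", invalidate_cache)

handled = bus.dispatch(
    Event("message.sent", {"message_id": "msg-1", "status": "SENT"})
)
```

Both handlers read `message_id` from the payload, which is where the coupling sits: renaming that field would break every subscriber even though none of them knows about the others.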
Courier’s main focus is to help customers quickly design notifications and deliver them through our send endpoint. Behind the endpoint is a set of processes that takes the notification and determines what recipients will ultimately receive. At each step of the process, we need to raise events that other parts of our app subscribe to and handle. These handlers include rendering read-only views of a message, hydrating our logs, and moving the notification from a transient state to an end state.
The process described above maps closely onto what EDA offers: each step produces a state change that other parts of the app need to react to. Another reason we chose EDA is that AWS provides up to two event streams off of DynamoDB, which gives us an easy mechanism to raise events and push them into other AWS services such as Kinesis, where we can set up one or more observers.
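As a rough sketch of that mechanism, the function below turns a DynamoDB Streams batch into plain domain events that could then be forwarded to observers. The record layout (`Records`, `eventName`, `dynamodb.NewImage`, and typed attributes such as `{"S": "..."}`) is the standard stream record format, but the attribute names (`messageId`, `status`, `retries`) and the `message.<status>` naming scheme are assumptions made for this example, not Courier’s actual schema. Forwarding to Kinesis is left out.

```python
from typing import Any, Dict, List


def unwrap(attribute: Dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed attribute, e.g. {"S": "x"}, to a plain value."""
    type_tag, value = next(iter(attribute.items()))
    if type_tag == "N":
        # DynamoDB serializes numbers as strings.
        return float(value) if "." in value else int(value)
    return value  # strings, booleans, etc. pass through in this sketch


def to_domain_events(stream_event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map inserted or modified items to domain events; skip removals."""
    events = []
    for record in stream_event.get("Records", []):
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        image = record["dynamodb"].get("NewImage", {})
        item = {name: unwrap(attr) for name, attr in image.items()}
        # Hypothetical naming scheme: event type derived from the item status.
        events.append({"type": f"message.{item['status'].lower()}", "item": item})
    return events


# A hand-written batch in the stream record format (values are made up).
sample_batch = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "NewImage": {
                    "messageId": {"S": "msg-1"},
                    "status": {"S": "ENQUEUED"},
                    "retries": {"N": "0"},
                }
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"OldImage": {"messageId": {"S": "msg-0"}}},
        },
    ]
}

domain_events = to_domain_events(sample_batch)
```

In a deployed setup a function like this would typically run as the stream’s handler and write its output to a downstream service. It is kept as a pure function here so it can be exercised without any AWS dependencies.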
By choosing EDA, the team at Courier gained many benefits, but we also ran into a few challenges.
Some of the benefits include:
We have very focused functions that respond to events, whether they are HTTP events or ones we’ve raised through streams. We can usually open a function and understand within seconds what it does, with or without comments.
It has been easy to add more observers to an event. There are few collisions, and existing functions stay closed to modification.
With Datadog in place, we can monitor each handler, see how often it is invoked, and estimate the cost it generates.
New engineers can start with one or two handlers, which lets them specialize in one part of the stack and learn how it performs.
Some of the struggles we had were:
The surface area of the backend becomes more of a web, with only parts of it known to any one engineer. We have to continuously document and diagram to have a full picture of the architecture.
This is also why onboarding new engineers was challenging. Changing an event correctly and safely requires knowing every handler that consumes it.
With the AWS stack specifically, we ran into constraints when designing our functions, particularly the limit on the number of resources a CloudFormation stack can have. The initial simplicity of the setup has definitely become more complicated over time, and we have to be wary of how we scale within the AWS ecosystem.
Because our product delivers messages in response to customer events, it made sense for us to choose EDA. We believe the architecture will scale with us as our product adds more features. Even with our early struggles, the benefits of our choice outweigh the drawbacks. Part of choosing the right architecture is matching what your business needs, not only in the short term but in the long term too.
If you find our architecture and what we do here in engineering interesting, reach out to us! We're always looking for great engineering talent.