So your CTO has just handed you a project to revamp or build your product’s notification system. It seemed like a simple and straightforward project, but you started doing research and realized that not only is the process pretty complicated, there’s not a lot of information online on how to do it. After all, companies like LinkedIn, Uber, and Slack have large teams of over 25 employees working just on notifications. But smaller companies don’t have that luxury - so how can you meet the same level of quality with a team of one?
This can certainly be overwhelming, which is why we’ve created a blog post series to guide you through building the best notification system for your company. This is the first post in this series, and we’re introducing you to the essential user requirements for both developers and nontechnical users of your notification system.
Before building a notification system, it’s crucial to know the requirements of the fellow developers and non-technical teammates who will create the notifications for your end users. Understanding these teammates’ personas will help you build a more effective product with a better user experience.
A notification system is a collection of services (templates, provider integrations, routing logic, preferences, logging, etc.) that make it possible to quickly and easily create clear and direct communication between an app and its users. That communication generally spans many channels, including email, SMS, and push notifications, so the app can reach each user with the best possible user experience.
A well-built notification system removes complexity from the process of creating each notification, which allows for a consistent experience across products and teams. This also provides a centralized hub for notifications across the organization, thus making monitoring and analytics more accessible.
Depending on your company’s product, a notification system can work with different use cases. You can use a central notification system to alert your end-users about an incoming request, send messages about actions taken, inform end-users about product updates and upgrades or promotions, or even deploy account management notifications.
A developer needs to understand the framework of the notification system so they can integrate it into other parts of the application or software. They are the ones who end up wiring up notifications for the myriad of applicable use cases, so it’s important to build the system with them in mind.
Reliability makes it possible to avoid dropped messages. Even if many messages are coming in simultaneously or the system is at peak load, message delivery should be guaranteed. While there could be delays at peak load, you should be confident you aren’t losing messages.
The system should also retry failed deliveries: if an attempt to send a message over the network fails, the system should try again rather than drop the message.
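One common way to implement retries is exponential backoff with a little random jitter. The following is a minimal sketch, not a definitive implementation: `DeliveryError`, `send_with_retry`, and the default limits are hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Callable


class DeliveryError(Exception):
    """Hypothetical error raised when one delivery attempt fails."""


def send_with_retry(
    send: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it succeeds and return the number of attempts used.

    Between attempts, wait base_delay * 2**(attempt - 1) seconds plus up to
    10% jitter. The last error is re-raised once the attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return attempt
        except DeliveryError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In a real system, retries like these usually sit behind a durable queue so that messages also survive process restarts.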
An organization will need to send varying volumes of notifications at various times, and the developer using the API should not need to bother with auto-scaling the infrastructure. For example, an organization needs to send many notifications during a flash sale and far fewer during low-volume periods. The system should be able to scale up and down resource-efficiently as the volume of notifications changes and as the organization grows.
In the absence of a central notification system, inconsistency among channels like SMS, email, or push notifications is likely to become an issue, whether the notifications come from fellow developers, customer success, marketing, or another team.
With a central system, you can change the provider behind a channel (for example, moving email from AWS SES to Twilio SendGrid) inside the notification service without changing the application code in any other product. The notification channels and providers are abstracted and centralized instead of having the code sprinkled all over the application codebase. So if your company stops doing business with a particular provider, it can switch to a new provider in a few hours without impacting the products that send notifications.
The uniform interface makes development simpler for each product and team by abstracting the providers behind each channel (email, push, SMS, etc.), so the developer can easily switch between providers without rewriting the code.
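As a rough sketch of such a uniform interface, application code can depend on a small provider protocol rather than on any one vendor. Every class and method name here is hypothetical, and the two provider classes are stand-ins that do not call any real vendor API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Message:
    recipient: str
    subject: str
    body: str


class EmailProvider(Protocol):
    """Anything with a matching `send` method can act as an email provider."""

    def send(self, message: Message) -> str: ...


class SesProvider:
    """Stand-in for an AWS SES integration (no real API call is made)."""

    def send(self, message: Message) -> str:
        return f"ses:{message.recipient}"


class SendGridProvider:
    """Stand-in for a SendGrid integration (no real API call is made)."""

    def send(self, message: Message) -> str:
        return f"sendgrid:{message.recipient}"


class NotificationService:
    """Products call this class and never import a provider directly."""

    def __init__(self, email_provider: EmailProvider) -> None:
        self._email_provider = email_provider

    def send_email(self, recipient: str, subject: str, body: str) -> str:
        return self._email_provider.send(Message(recipient, subject, body))
```

Switching providers then means constructing `NotificationService` with a different provider object. The code that calls `send_email` stays the same.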
For other developers to use your notification system, you have to provide good documentation that explains how to use it. Internal documentation is an integral part of any system: it educates users and gives them a reference for how the product works. When building a platform for other developers, great documentation gives them the tools to resolve support-related issues on their own.
Good documentation for a notification system should include an easy guide that helps developers get started. It should also provide a comprehensive reference for all the operations in the notification system. A developer integrating the notification system should not have to guess which operations and parameters the system supports.
The documentation should also be easily discoverable. Even without knowing exactly where to look, the developer should be able to search for what they want and find it easily. Documentation should be accurate, consistent, available on demand, and up to date. It should include examples, code samples, screenshots, and tutorials for more context on the system.
Technical users need to send notifications programmatically, so a notification system must have APIs for submitting notifications for delivery. These APIs must be intuitive and usable from various platforms so that the notification system does not constrain the implementation of other systems. Users should be able to call the APIs from any programming language or platform, and the API documentation should be available on demand.
Notifications communicate with a target audience, so system operators want to measure the performance of the notification system (and the impact of the notifications themselves) and collect information that can help the organization design its notifications better.
The data the notification service collects is valuable but raw, and further analysis is what turns it into insights for your organization. Such analyses are often done in other systems, where engagement events are correlated with other data to build a better picture of user behavior.
A notification system should support data exports in both human-readable and machine-readable forms. It can also integrate with data warehousing tools and export to them directly.
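As an illustration, the same events could be exported as CSV (which opens in a spreadsheet) and as JSON Lines (which is easy for other programs to ingest). This is only a sketch: the field names and function names are assumptions, not a prescribed schema.

```python
import csv
import io
import json

# Hypothetical event fields, for illustration only.
FIELDS = ["notification_id", "channel", "status"]


def export_csv(events: list[dict]) -> str:
    """Render events as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue()


def export_jsonl(events: list[dict]) -> str:
    """Render events as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(event, sort_keys=True) + "\n" for event in events)
```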
Engagement with a notification is an essential metric for businesses, so the notification system should track it. Engagement tracking is usually done by tracking link clicks and push notification opens.
For links, the notification system rewrites the links in each notification to go through itself. Each visit to a rewritten link logs an event in the system and then redirects to the original link, which lets the system track clicks. For push notifications, the SDKs on the clients notice when a user opens a notification and record it.
Information about the run-time behavior of the system is important for keeping the system running. Latency metrics and throughput metrics help to understand delays in the subsystems and the rate of notifications delivered. Queue length, together with service time and wait time (both latency metrics), can be used to estimate delays and optimize the system further. These metrics also help with capacity planning.
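The link-rewriting step can be sketched as two small functions: one wraps the original URL in a tracking URL, and the other logs the click and recovers the destination for the redirect. The tracking domain and the query parameter names below are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical redirect endpoint operated by the notification system.
TRACKING_BASE = "https://notify.example.com/r"


def wrap_link(original_url: str, notification_id: str) -> str:
    """Rewrite a link so the click passes through the tracking endpoint."""
    query = urlencode({"n": notification_id, "u": original_url})
    return f"{TRACKING_BASE}?{query}"


def handle_click(tracking_url: str, click_log: list) -> str:
    """Log the click, then return the original URL to redirect the user to."""
    params = parse_qs(urlparse(tracking_url).query)
    notification_id = params["n"][0]
    original_url = params["u"][0]
    click_log.append(notification_id)
    # Caution: redirecting to an unchecked URL creates an open redirect.
    # A production system would sign the link or look the target up by ID.
    return original_url
```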
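As a back-of-the-envelope illustration of how these metrics combine, the sketch below estimates how long a newly queued notification waits and how many notifications per second the workers can process. It assumes identical workers and a roughly constant service time, so treat the results as rough estimates. The function names are hypothetical.

```python
def estimated_wait_seconds(queue_length: int, service_time: float, workers: int) -> float:
    """Approximate wait for a message joining the back of the queue.

    Everything ahead of it must be processed first, spread across the workers.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return queue_length * service_time / workers


def max_throughput_per_second(service_time: float, workers: int) -> float:
    """Upper bound on notifications processed per second."""
    if service_time <= 0:
        raise ValueError("service_time must be positive")
    return workers / service_time
```

For example, with 1,000 queued notifications, a service time of 0.25 seconds, and 10 workers, the estimated wait is 25 seconds and the maximum throughput is 40 notifications per second.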
When things go wrong, the system should let users gain insight into those issues, and technical logs provide that insight. For example, the logs might show that notifications were successfully submitted to the system but that the downstream provider did not deliver them. The users then know that the fault lies with the provider, not with the notification system.
Technical users should see detailed technical logs for errors that occur when a notification is not sent. For example, the developer should be able to see that a message sent through SendGrid failed with an HTTP 401 error because the API key is invalid. Technical logs also show other vital details about the system’s operations, which helps operators diagnose problems when they occur.
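A structured log entry makes this kind of diagnosis easier to automate. The sketch below shows one possible shape for such an entry. The field names and helper functions are assumptions for illustration, not the format of any particular product.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DeliveryLogEntry:
    notification_id: str
    channel: str
    provider: str
    status: str                        # e.g. "delivered" or "failed"
    http_status: Optional[int] = None  # status code returned by the provider
    error: Optional[str] = None        # provider's error message, if any


def to_log_line(entry: DeliveryLogEntry) -> str:
    """Serialize the entry as one JSON object per log line."""
    return json.dumps(asdict(entry), sort_keys=True)


def failed_at_provider(entry: DeliveryLogEntry) -> bool:
    """True when the provider answered the request with an HTTP error status."""
    return (
        entry.status == "failed"
        and entry.http_status is not None
        and entry.http_status >= 400
    )
```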
A test environment allows the developer to simulate sending notifications safely. It is useful in continuous integration or staging environments where you need to run test code without sending notifications to actual customers or end-users.
Supporting a test environment enables rapid application development and also gives confidence to the programmers. The programmers can write tests close enough to the notification system’s workings rather than mocking out the system in their tests.
A test environment also allows developers to experiment with different parameters and operations and see the results without impacting customers. Without this, every interaction with the notification system is potentially dangerous, as it may send a notification to a customer. A system that does not support a test environment slows development because developers have to wait until production to try things out.
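One simple way to support this is an environment switch that captures notifications instead of delivering them. The following is a minimal sketch. `NotificationClient`, its `environment` flag, and the `outbox` list are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class NotificationClient:
    # Defaults to "test" so nothing reaches real users by accident.
    environment: str = "test"
    outbox: list = field(default_factory=list)

    def send(self, recipient: str, body: str) -> str:
        if self.environment == "test":
            # Capture the notification so tests can inspect it.
            self.outbox.append((recipient, body))
            return "captured"
        return self._deliver(recipient, body)

    def _deliver(self, recipient: str, body: str) -> str:
        # Placeholder: a real client would call a provider here.
        raise NotImplementedError("wire up a real provider here")
```

A test can then call `send` and assert on the contents of `outbox`, which exercises the same code path as production up to the point of delivery.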
If your company spans multiple products or brands, the notification system should be able to send notifications to each brand’s customers and change the branding and logos on the fly. White labeling also makes it much easier to bring new brands onto the system: a newly acquired company can retain its brand image while switching to the existing company’s notification system. For example, Twilio, Segment, and SendGrid (all owned by the same company) might want to send notifications for all three products through one system and change the branding and colors on the spot, depending on which product’s users receive the notification.
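White labeling can be sketched as one shared template rendered with per-brand settings. The brand names, URLs, and colors below are made up for illustration.

```python
import html
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Brand:
    name: str
    logo_url: str
    primary_color: str


# Hypothetical brands; a real system would load these from configuration.
BRANDS = {
    "acme": Brand("Acme", "https://cdn.example.com/acme.png", "#0055ff"),
    "globex": Brand("Globex", "https://cdn.example.com/globex.png", "#00aa55"),
}

EMAIL_TEMPLATE = Template(
    '<div style="border-top: 4px solid $color">'
    '<img src="$logo" alt="$brand logo">'
    "<p>$body</p><p>The $brand team</p></div>"
)


def render_email(brand_key: str, body: str) -> str:
    """Render the shared template with the chosen brand's logo and colors."""
    brand = BRANDS[brand_key]
    return EMAIL_TEMPLATE.substitute(
        color=brand.primary_color,
        logo=brand.logo_url,
        brand=brand.name,
        body=html.escape(body),  # escape user-supplied text before embedding
    )
```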
Non-technical users are the ones who only need a smooth user interface and user experience to use the notification system. Designers, content editors, customer success and support, and your marketing team do not interface with code, so you have to build to suit their needs as well. Let’s look at some of the requirements for a non-technical user.
Usability is a non-negotiable requirement because your users need it to create and send notifications seamlessly. Ensure the interface is user-friendly so they can explore the system quickly. It should also not require a lot of onboarding and training to understand the system.
The users should be able to carry out their intended tasks efficiently (in the shortest sequence of steps possible). To achieve decent usability, choose task-based user interfaces over generic ones. Task-based interfaces are interfaces designed with a particular user action in mind. For example, logging: A customer support representative needs to find why a user is not receiving a notification and needs to be able to search for notifications to that user by email in the logs. The interface must:
Designing a notification is the most important capability for a non-technical user who will not be handling any code. Creating content and efficiently designing its layout and branding play a central role in the way a customer success manager, for example, might use your notification system. The manager needs to be able to swap in a new logo or update text within an email or SMS without engineering going through a sprint cycle.
In addition to creating a great UX for notification design, templates can help make it faster and simpler to design the notifications. The template tooling could include a drag-and-drop editor for changing content on the fly without redeploying code. A highly usable system should also provide ready-made templates and the ability for the user to create new ones and customize them.
Non-technical users need to see some logs, although not as detailed as technical logs. Each log entry should contain information such as:
The notification system should also log notifications sent by other systems through API calls, not just those sent by human users. Besides logging the notifications it sends, the system should also log changes to access permissions. This is essential in notification systems that have role-based access control.
It is important to know when a new user has permission to access the system. The notification system should log the particular permissions granted, the user that granted the permissions, and the user that received the permissions. The system should also log an event when it revokes a user’s access. Permissions granted to machine users, such as API keys and service accounts, fall under this category. These logs provide insight for non-technical users to understand their use of the system over time.
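A permission audit log along these lines could be sketched as an append-only list of events. The class and field names are hypothetical, and a real system would persist the events rather than keep them in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PermissionEvent:
    action: str      # "grant" or "revoke"
    permission: str  # the permission that changed
    actor: str       # the user who made the change
    subject: str     # the user, API key, or service account affected
    at: datetime     # when the change happened (UTC)


class AuditLog:
    """Append-only, in-memory record of permission changes."""

    def __init__(self) -> None:
        self._events: list[PermissionEvent] = []

    def record(self, action: str, permission: str, actor: str, subject: str) -> PermissionEvent:
        if action not in ("grant", "revoke"):
            raise ValueError("action must be 'grant' or 'revoke'")
        event = PermissionEvent(action, permission, actor, subject, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history_for(self, subject: str) -> list[PermissionEvent]:
        """Return every permission change affecting the subject, oldest first."""
        return [event for event in self._events if event.subject == subject]
```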
Role-based access control is a system that grants permissions based on roles defined in the system. It makes it easier to manage access across an organization and tailor such access to the roles of employees and departments.
For example, suppose we want to enforce the following rules:
With RBAC, you can create three roles (notification-sender, designer, and role-editor), one for each rule. Users who hold a role can perform the actions it allows, and users who don’t hold it cannot. Build the notification system so that it is easy for small teams to use and can scale to larger teams and organizations as their needs grow.
One important part of designing an RBAC system is the ability to compose larger roles from smaller roles. For example, it lets you create roles that delegate smaller functions to subordinates while granting more permissions to team leaders.
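Role composition can be sketched with two small tables: one maps roles to their own permissions, and the other lists the smaller roles that a larger role includes. The role names notification-sender, designer, and role-editor come from the example above. The permission strings and the `team-lead` role are hypothetical.

```python
# Permissions granted directly by each role (permission names are made up).
ROLE_PERMISSIONS = {
    "notification-sender": {"notification:send"},
    "designer": {"template:edit"},
    "role-editor": {"role:edit"},
}

# Larger roles composed from smaller ones (team-lead is a hypothetical role).
ROLE_INCLUDES = {
    "team-lead": {"notification-sender", "designer"},
}


def permissions_for(role: str, _seen=None) -> set[str]:
    """Collect a role's own permissions plus those of every role it includes."""
    seen = set() if _seen is None else _seen
    if role in seen:  # guard against cycles in ROLE_INCLUDES
        return set()
    seen.add(role)
    permissions = set(ROLE_PERMISSIONS.get(role, set()))
    for included in ROLE_INCLUDES.get(role, set()):
        permissions |= permissions_for(included, seen)
    return permissions


def can(user_roles: list[str], permission: str) -> bool:
    """Check whether any of the user's roles grants the permission."""
    return any(permission in permissions_for(role) for role in user_roles)
```

Here a team lead can send notifications and edit templates but cannot edit roles, while a designer can only edit templates.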
Non-technical users need to see the collected data in an easily digestible way. The system should present data in a format that’s easy to understand so even non-technical users can grasp key insights at a glance.
The system also needs to provide various views of the same information: aggregate statistics over different periods, various visualizations of the data, etc. Users should be able to answer the most common questions, such as which notification channel performs best for each type of message, by looking at a dashboard.
Understanding the needs of the different personas that will use your notification system is foundational to building one, because it ensures that your hard work meets the needs it was commissioned for. Understanding the needs of not only your fellow developers but also non-technical team members from customer success, marketing, etc. will make your work more accessible and scalable.
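Behind a dashboard like this sits a simple aggregation. As a sketch, the function below computes the click rate per channel from raw events. The event shape (a `channel` string and a `clicked` flag) is an assumption for illustration.

```python
from collections import defaultdict


def click_rate_by_channel(events: list[dict]) -> dict[str, float]:
    """Return the fraction of sent notifications that were clicked, per channel."""
    sent = defaultdict(int)
    clicked = defaultdict(int)
    for event in events:
        sent[event["channel"]] += 1
        if event["clicked"]:
            clicked[event["channel"]] += 1
    return {channel: clicked[channel] / sent[channel] for channel in sent}
```

Grouping the same events by message type as well as by channel would answer the question of which channel performs best for each type of message.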
Speaking of scaling, scalability and reliability are necessary to make sure your notification system stays current and does not require more rebuilds in the near future. Scaling reliably can be hard, but the next article in this series will explain how you can do it without sacrificing throughput for maximum reliability. We will delve more into the complexities of building a notification system and continue to provide a comprehensive guide along the way. To stay in the loop about the upcoming content, subscribe below or follow us @trycourier!