sidenote

Back-end Developers Work Hard To Make Nothing Happen


6/12/24 - Dallin Urness

Background


I recently deployed a migration to Sidenote, converting the schema I used for notifications to a new, more flexible version. Sidenote is a browser extension that lets you converse on any page of the internet. At the end of the week-long migration, it felt strange and a bit ironic that users saw absolutely no improvement or change once the project was complete. Back-end development tends to consist of a lot of tasks like this, where users don’t notice the difference, but those tasks often play a pivotal role in the future of a tech company.

So if nothing happens… then what are back-end developers doing?

A codebase is not static. It is constantly changing with improved architecture, regular maintenance, and the addition of new features. Most users will notice when new features are added, like new buttons and pages to visit. However, architectural changes and regular maintenance are done best when users don’t notice that anything has changed at all.

Common Developments That Users Don’t See

Infrastructure scaling


Web servers

As the number of customers increases, users will start to see negative side effects, such as long load times and dropped connections. One common cause is that a single web server can handle a finite number of requests per second. To increase that number, a load balancer can be added to distribute traffic across multiple web servers, multiplying the number of handled requests per second by the number of servers added. These servers can also be bogged down by long-running or computationally expensive tasks. It is common to move those workloads onto their own dedicated servers that can be scaled independently. Breaking out work in this way is known as a microservice architecture, and it effectively mitigates scaling issues with web servers. However, this scaling by itself often just passes the problem to the next part of the stack.
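The load-balancing idea above can be sketched in a few lines. This is a minimal round-robin policy with hypothetical server names; real load balancers (nginx, an AWS ALB, etc.) implement this and far more sophisticated policies for you.

```python
from itertools import cycle

# Hypothetical pool of identical web servers behind a load balancer.
servers = ["web-1", "web-2", "web-3"]
rotation = cycle(servers)  # simple round-robin policy

def route() -> str:
    """Return the server that should handle the next incoming request."""
    return next(rotation)

# Six requests are spread evenly across the three servers, so the pool
# handles roughly three times the requests per second of a single server.
assignments = [route() for _ in range(6)]
```

Each server sees only its share of the traffic, which is why adding servers multiplies the total requests per second the pool can absorb.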

Databases

Databases suffer from the same problem as the other servers: they can only run so many queries at once. Different kinds of databases have different limitations, but regardless of which kind is used, there will be limits. A common first mitigation is to add a cache, which saves the results of queries and returns the same value for subsequent requests asking for the same data. This stops those requests from querying the database at all, taking load off the database and significantly decreasing response times. The cache could live in the memory of the web server, or in a separate in-memory database like Redis. Most likely, though, an external cache is the better choice, so that the multiple web server instances can share it, increasing the chances of a cache hit and saving memory.

Another solution is to scale up the number of database instances. In the case of a PostgreSQL database, read replicas can be created to share the load of read-only queries, leaving the primary (or write) instance to focus on writes. This division of work significantly increases the number of queries per second that the previously single-instance database cluster can handle.

Why not just start out with an infrastructure that can handle the load?

The simple answer is that it is expensive and time-consuming. Each web server, database instance, and external cache requires its own memory, compute, and storage that will need to be paid for. Even more expensive than the basic machine cost is the developer cost to set the infrastructure up and maintain it. Engineers are an expensive resource, and having them spend time on something unnecessary for the stage the company is at will burn cash very quickly without any real benefit. Assessing business growth and creating a plan that defines what resources will be needed, and when, is the best path for a cost-effective company.

Maintenance


But it works right now… what is there to maintain?

Software needs maintenance in order to continue functioning effectively. There are many reasons for this, but maintenance most often takes the form of fixing security vulnerabilities or improving efficiency. While that may sound a bit alarming, it can be compared to keeping up with a house. Maybe a window lock has broken, or the sink was installed with pipes that are too small and larger ones need to be fitted so they don’t back up. These issues certainly could be left in place, but doing so could result in a disaster somewhere down the road.

Application Changes


Hindsight is 20/20

When designing the application-level logic of code, it is important to spend a significant amount of time going through what exactly the product is, what features it will support, and what additions to that product are likely going to happen in the future. The better product requirements are ironed out before building the system, the less will need to be changed later. I say “less will need to be changed later”, because there WILL be changes. This could be caused by something as drastic as a product pivot or as simple as a misunderstanding when the product requirements were defined.

Migrations

Making application changes often requires some sort of migration from the old definitions or functionality to the new ones. In the case of data in a database, we may need to change the shape of the stored schema. There are several ways to do this, the easiest of which is to just make the change. Assuming we don’t want to lose data, this may look like copying all of the data from one table into a new table with the new schema, reorganizing each row to fit the new schema along the way. This causes some downtime while each row is copied, mutated, and written to the new table. Soon after, the application code that interfaces with the database also needs to be updated so that it reads from the new table instead of the old one.

If we want to improve the user experience during the migration, we put more effort into having no downtime. This may look like creating the new table and writing application code that interfaces with both the new and old tables at the same time. Once this is in place, a migration can be run lazily to move data from the old table to the new one. Finally, the old table and migration code are removed, leaving only the logic that checks the new table. Which approach to take depends on whether it is more important to reduce customer impact or to reduce the time developers spend on the migration.
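The lazy, zero-downtime approach can be sketched as a dual-read that migrates rows as they are touched. Dicts stand in for the old and new tables, and the schemas (a single `name` field split into `first_name`/`last_name`) are hypothetical:

```python
old_table = {1: {"name": "Ada Lovelace"}}   # old schema: one combined field
new_table: dict[int, dict] = {}             # new schema: split fields

def to_new_schema(row: dict) -> dict:
    """Reshape an old-schema row into the new schema."""
    first, last = row["name"].split(" ", 1)
    return {"first_name": first, "last_name": last}

def read_user(user_id: int) -> dict:
    """Dual-read: prefer the new table, lazily migrating rows from the old one."""
    if user_id in new_table:
        return new_table[user_id]
    migrated = to_new_schema(old_table.pop(user_id))  # move, don't copy
    new_table[user_id] = migrated
    return migrated
```

Every read leaves the row in the new table, so the old table empties out over time; a background script can sweep up whatever is never read, after which `to_new_schema` and the fallback branch can be deleted.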

My Migration

A problem born of speed


In my case, I wanted to simplify and add flexibility to the way I was managing notifications. When someone responds to or interacts with a comment you wrote, Sidenote sends you a notification letting you know so that you can respond accordingly. During the rush of trying to get to release, the original design for notifications was a bit short-sighted. However, when it comes to software, sometimes you need to get things working to figure out where you want to go before you make them pretty.

My original schema for notifications went something like:

{
  id,
  recipient_id,
  seen,
  title,
  type,
  details: { // this field is dynamic depending on the notification type
    note_id,
    truncated_text,
  }
}


The idea was to store the common fields shared by the various types of notifications, and then keep a flexible JSON object in the “details” field to hold the data specific to each type. Then, when I wanted to add a new notification, I just needed to add a new notification type and insert the associated data into the flexible JSON field. On the client side, I would define how to interpret and display each different notification type to the user.
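To picture the client side of that design: every notification type needs its own display logic, keyed on `type`. The renderers and type names below are hypothetical:

```python
def render_reply(details: dict) -> str:
    """Hypothetical display logic for a 'reply' notification."""
    return f'New reply: "{details["truncated_text"]}"'

def render_mention(details: dict) -> str:
    """Hypothetical display logic for a 'mention' notification."""
    return f'You were mentioned: "{details["truncated_text"]}"'

# Every new notification type means another entry here,
# and therefore another client redeploy.
renderers = {"reply": render_reply, "mention": render_mention}

def render(notification: dict) -> str:
    """Dispatch to the type-specific renderer."""
    return renderers[notification["type"]](notification["details"])
```

The growing `renderers` table is the cost of the flexible `details` field: the data is generic, but the interpretation is hard-coded per type on the client.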

My foresight was not 20/20

After creating the first few notification types, I started to realize that there was a very consistent pattern for how a notification is displayed, regardless of its type. All I really needed to do was define the full set of display options a notification could use, store those options on each notification, and have every notification share one generic client component that renders according to the stored options. This would allow me to add new notification types and modify the wording of specific notifications dynamically, without redeploying the browser extension.

The new generalized schema for notifications looked something like:

{
  id,
  recipient_id,
  seen,
  type,
  details: { // this field now has the same shape for every notification type
    note_id,
    header_sender,
    header_text,
    body_text,
    associated_link,
  }
}


The need to prioritize user experience

But now… the problem. Changing the schema means changing how the data is stored in the database, how notifications are created before being stored, and how they are displayed on the client, all while trying not to interrupt current users! The Sidenote application is a browser extension, which normally takes about a day to be approved, and new versions roll out to users over the course of around three hours. This means that if we make the desired change in the database first, users will see errors until their client gets the new update, since the old client does not know how to display the new schema. However, we can’t just wait to deploy the database changes until after the clients have all been updated, because then the NEW version of the client won’t understand how to display the old schema still stored in the database before migration.

Solution


To have a smooth migration, I decided to break the changes into three deployments. The first allowed the database to store both the new and old schemas, and added the generic display component to the client so it could render the new schema. However, this deployment did not actually create any notifications using the new schema, avoiding the ordering issue described above.
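The client change in that first deployment boils down to a dispatch that handles both schemas at once. This is a sketch with hypothetical rendering logic; the presence of the old top-level `title` field is used here to tell the schemas apart:

```python
def render_old(n: dict) -> str:
    """Per-type logic kept around for notifications still in the old schema."""
    return f'{n["title"]}: "{n["details"]["truncated_text"]}"'

def render_new(n: dict) -> str:
    """One generic component driven entirely by the stored display options."""
    d = n["details"]
    return f'{d["header_sender"]} {d["header_text"]}: "{d["body_text"]}"'

def render(n: dict) -> str:
    # The old schema had a top-level "title"; its absence marks the new schema.
    return render_old(n) if "title" in n else render_new(n)
```

Because this client understands both shapes, the database can be migrated underneath it in any order without users seeing errors.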

The second deployment contained the code changes that created notifications with the new schema, putting the first new-schema notifications into the database. At that point the database held two different kinds of notifications, and the client had components for displaying both. I then ran a migration script that went through the database and converted the remaining old-schema notifications to the new schema.
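The heart of that migration script is a per-row transform from the old schema to the generalized one. The exact field mapping below is a hypothetical example, including the assumptions that the old `title` becomes `header_text`, that the old schema stored no sender, and an invented link format:

```python
def migrate_notification(old: dict) -> dict:
    """Convert one old-schema notification into the generalized schema."""
    return {
        "id": old["id"],
        "recipient_id": old["recipient_id"],
        "seen": old["seen"],
        "type": old["type"],
        "details": {
            "note_id": old["details"]["note_id"],
            "header_sender": None,  # assumed: old schema never stored a sender
            "header_text": old["title"],
            "body_text": old["details"]["truncated_text"],
            "associated_link": f'/notes/{old["details"]["note_id"]}',  # invented format
        },
    }
```

The script itself is then just a loop: read each old-schema row, write `migrate_notification(row)` back, ideally in batches so the database is never locked for long.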

The final deployment removed the code that created, mutated, or displayed data in the original schema. Each of these steps was performed a day apart, to make sure that all of the users’ clients had the new version before moving on to the next step in the process.

Update your apps!!!

What is crazy is that, depending on where the application lives, this process becomes even more involved. Mobile apps, for example, have a longer approval process, and it is much more common for users to skip updating to new versions, so both need to be accounted for when deciding how to perform an architectural change. More often than one would think, teams choose not to go through with the full migration at all. Instead, devs either live with the consequences of leaving things in their current state, or make the changes internally with a translator that maps the new schema to the old versions that still need to be supported.
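That translator option is just an adapter that downgrades the new schema for clients that never updated. A sketch, using the same hypothetical field mapping as before:

```python
def to_legacy(new: dict) -> dict:
    """Downgrade a new-schema notification for an old client version."""
    d = new["details"]
    return {
        "id": new["id"],
        "recipient_id": new["recipient_id"],
        "seen": new["seen"],
        "title": d["header_text"],  # old clients expect a top-level title
        "type": new["type"],
        "details": {"note_id": d["note_id"], "truncated_text": d["body_text"]},
    }
```

The API would call `to_legacy` only when a request identifies itself as coming from an old client version, letting the database hold a single schema while both client generations keep working.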

Nothing is Something


Back-end engineers are constantly reviewing the existing system and planning how to improve it as time goes on. This includes scaling the infrastructure, maintaining the existing codebase, and preparing stored data for feature development. Developers need to gauge the right time for the company to make these changes to avoid impacting their (hopefully) growing user base. When that time comes, and with the right planning and execution, customers will never notice that anything has changed.