Build Your Own Uptime Monitoring Service
This challenge is to build your own uptime monitoring service. There are many such services, and if you work for a company that runs an Internet-facing property, your company probably uses one of the well-known ones such as Pingdom, Pulsetic, Uptime or Uptime Robot.
Essentially, they monitor your website or API endpoint and alert you when it is down so your on-call engineers can restore it ASAP. Some also alert you if the latency exceeds a threshold, and some can monitor its availability from multiple geographical locations.
Many will provide historic records of uptime, latency and alerts - useful for tracking your SLIs and SLAs.
The Challenge - Building An Uptime Monitoring Service
For this coding challenge we will be building an uptime monitoring service. It will allow one or more users to:
- Enter one or more URLs to be monitored.
- Configure the frequency of monitoring.
- Define an action to take when the monitored URL is either:
  - Too slow to respond with a full page, or
  - Not responding at all.
- View historical uptime and response-time data.
You could build this as a simple command-line tool just for your own use, or you could build a full-blown web-based service hosted in the cloud. If you choose the latter, don’t forget that AWS offers a free tier, Azure offers a similar set of services that are free (with limits) for 12 months, and Google Cloud offers 20+ free products for all customers (with limits) and $300 in free credits (at the time of writing). Other cloud providers are also available.
Step Zero
In this step you decide which programming language and IDE you’re going to use, and you get yourself set up with a nice new project for your uptime monitor. You might like to mix a couple of programming languages for this coding challenge: one for the backend and monitoring functionality and another for the user interface.
Step 1
In this step your goal is to allow the user to enter one or more URLs to be monitored and to set a frequency for monitoring. You will then need to store this in a database.
Depending on the solution you decide to build that could be as simple as a flat file or as complex as a cloud hosted SQL/NoSQL database. You could use any of those, but they will all have trade-offs.
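As a starting point, here is a minimal sketch of the storage side using SQLite from Python's standard library. The table name `monitors`, its columns, and the helper functions `open_db`, `add_monitor` and `list_monitors` are all assumptions made for illustration; your own schema will depend on the trade-offs you choose.

```python
import sqlite3

# Hypothetical schema: one row per monitored URL with its check interval.
SCHEMA = """
CREATE TABLE IF NOT EXISTS monitors (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE,
    interval_seconds INTEGER NOT NULL CHECK (interval_seconds > 0)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_monitor(conn, url, interval_seconds):
    """Add a URL to monitor, or update its interval if it is already stored."""
    with conn:  # commits on success, rolls back on error
        updated = conn.execute(
            "UPDATE monitors SET interval_seconds = ? WHERE url = ?",
            (interval_seconds, url),
        )
        if updated.rowcount == 0:
            conn.execute(
                "INSERT INTO monitors (url, interval_seconds) VALUES (?, ?)",
                (url, interval_seconds),
            )


def list_monitors(conn):
    """Return (url, interval_seconds) pairs ordered by URL."""
    rows = conn.execute(
        "SELECT url, interval_seconds FROM monitors ORDER BY url"
    )
    return rows.fetchall()
```

Passing a file path such as `open_db("monitors.db")` instead of the in-memory default would persist the data between runs.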
Step 2
In this step your goal is to send an HTTP HEAD request to the URLs at the specified frequency, storing the result and relevant data for each request in the database. When designing your database, consider how you’ll want to use this time series data.
You could use an HTTP GET request, but a HEAD is all you need to check whether the site is up and responding to requests. If you don’t know the difference, now would be a great time to learn.
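One way a single check could look, sketched with Python's standard `urllib`. The function name `check_head`, the shape of the returned record, and the rule that any status below 400 counts as "up" are assumptions for illustration, not requirements of the challenge.

```python
import http.client
import time
import urllib.error
import urllib.request


def check_head(url, timeout=10.0):
    """Send one HEAD request and return a record describing the outcome."""
    request = urllib.request.Request(url, method="HEAD")
    status = None
    error = None
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx or 5xx status.
        status = exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # DNS failure, refused connection, timeout, malformed reply, etc.
        error = str(exc)
    elapsed_ms = (time.monotonic() - started) * 1000
    return {
        "url": url,
        "checked_at": time.time(),  # Unix timestamp of the check
        "status": status,
        "up": status is not None and status < 400,  # assumed definition of "up"
        "elapsed_ms": elapsed_ms,
        "error": error,
    }
```

Each returned record could be written as one row of your time series table. Note that `urlopen` follows redirects by default, so the recorded status is that of the final response.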
Step 3
In this step your goal is to allow the user to view the historic data for the requests. Your UI should allow the user to select one or more of the URLs they are monitoring and specify a date range. They should then be able to see a graph of the uptime, and ideally the round-trip time of the response.
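Before drawing a graph you will need to reduce the stored checks for a date range to a few numbers. A minimal sketch follows, assuming each stored check is a tuple of `(checked_at, up, elapsed_ms)`; that tuple shape and the function name `uptime_summary` are hypothetical.

```python
def uptime_summary(checks, start, end):
    """Summarise the checks whose timestamp satisfies start <= t < end.

    checks is an iterable of (checked_at, up, elapsed_ms) tuples, where
    checked_at is comparable with start and end (for example datetimes).
    Returns None when no check falls inside the range.
    """
    in_range = [check for check in checks if start <= check[0] < end]
    if not in_range:
        return None
    # Only successful checks contribute to the response-time average.
    up_times = [elapsed for _, up, elapsed in in_range if up]
    return {
        "checks": len(in_range),
        "uptime_percent": 100.0 * len(up_times) / len(in_range),
        "mean_elapsed_ms": sum(up_times) / len(up_times) if up_times else None,
    }
```

Calling this once per hour or per day across the selected range gives a series of points that can be plotted directly.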
Step 4
In this step your goal is to allow the user to specify the use of a GET request and to then have additional data about the request and response stored and available for display via a graph. This should include time to first byte (TTFB) and the time to get the whole response.
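A rough sketch of how the two timings might be captured with `urllib` is shown below. It treats the moment `urlopen` returns (status line and headers received) as an approximation of TTFB, so the figure includes DNS lookup, connection setup and any redirects. The function name `timed_get` and the record fields are assumptions for illustration.

```python
import http.client
import time
import urllib.error
import urllib.request


def timed_get(url, timeout=10.0):
    """GET a URL, recording approximate TTFB and total download time."""
    result = {
        "url": url,
        "status": None,
        "ttfb_ms": None,
        "total_ms": None,
        "body_bytes": None,
        "error": None,
    }
    started = time.monotonic()
    try:
        try:
            response = urllib.request.urlopen(url, timeout=timeout)
            result["status"] = response.status
        except urllib.error.HTTPError as exc:
            # 4xx/5xx: the error object still carries a readable body.
            response = exc
            result["status"] = exc.code
        # Headers have arrived; the body has not been read yet.
        result["ttfb_ms"] = (time.monotonic() - started) * 1000
        with response:
            body = response.read()
        result["total_ms"] = (time.monotonic() - started) * 1000
        result["body_bytes"] = len(body)
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        result["error"] = str(exc)
    return result
```

A more precise TTFB would need lower-level socket timing or a dedicated HTTP client; this version is intended only to show where the two measurements sit relative to each other.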
Step 5
In this step your goal is to allow the user to view the new data via the UI. Ideally they should also be able to access the data either via an API (if you went for a web-based solution) or export it if you built it as a command-line or desktop tool.
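For the export option, CSV is probably the simplest format to start with. The sketch below assumes each check is stored as a dictionary with the keys listed in `FIELDS`; those field names and the function `export_csv` are hypothetical.

```python
import csv

# Assumed column order for the export; extra keys in a record are ignored.
FIELDS = ["url", "checked_at", "status", "up", "elapsed_ms"]


def export_csv(records, out):
    """Write check records (dictionaries) as CSV to a text file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, for example `open("checks.csv", "w", newline="")`.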
Step 6
In this step your goal is to allow the user to configure alerts. They should be able to request alerts via email or via a webhook. They should be able to specify that the alert happens when the site is unresponsive for N attempts where N is a configurable number.
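The "N failed attempts" rule can be sketched as a small counter per URL. In this illustration the class name `FailureAlerter` and the `notify` callback are hypothetical; the callback is where you would send the email or POST to the webhook. It assumes you want exactly one alert per outage, fired on the Nth consecutive failure.

```python
class FailureAlerter:
    """Fire one alert when a URL has failed `threshold` checks in a row."""

    def __init__(self, threshold, notify):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.notify = notify  # called as notify(url, consecutive_failures)
        self._failures = {}  # url -> current run of consecutive failures

    def record(self, url, up):
        """Record one check result; return True if an alert was sent."""
        if up:
            # Any success ends the outage and resets the counter.
            self._failures[url] = 0
            return False
        count = self._failures.get(url, 0) + 1
        self._failures[url] = count
        if count == self.threshold:
            # Alert only when the threshold is first reached, so a long
            # outage does not produce an alert on every later check.
            self.notify(url, count)
            return True
        return False
```

You would call `record` after each check, for example `alerter.record(url, result_is_up)`. You may also want a separate "recovered" notification when a success follows an alert; that is left out here.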
Going Further
Much like the URL Shortener Coding Challenge, you can extend this uptime monitoring solution with signup, authentication and a payment service, and you have your own fully blown SaaS business! Beyond that, many uptime services will also monitor web pages for specific text or SSL certificates for expiry.
Help Others by Sharing Your Solutions!
If you think your solution is an example other developers can learn from, please share it: put it on GitHub, GitLab or elsewhere. Then let me know - send me a message on the Discord Server, via Twitter or LinkedIn, or just post about it there and tag me. Alternatively, please add a link to it in the Coding Challenges Shared Solutions GitHub repo.
Get The Challenges By Email
If you would like to receive the coding challenges by email, you can subscribe to the weekly newsletter on Substack here: