As part of the SRE (Site Reliability Engineering) team, we were tasked with easing
the load of requests that our API services receive during a surge in traffic.
Previously, we just used the built-in rate limit plugin of our proxy gateway
(Kong + Nginx) or wrote a custom one to meet our requirements. This time we wrote
a service that our custom plugin calls. The downside is that we have now
introduced extra latency (network + runtime), but the upsides outweigh the downsides.
The plugin is a lot simpler to write since it only needs to call the service. No more
installing third-party plugins to interface with the DB or cache. While Lua (the
default scripting language of Kong/Nginx) is proven to be fast, the complexity
of the requirements might slow the server down.
Changes to the plugin involve packaging, uploading and restarting the proxy
server. There is always an inherent risk in changing the proxy server. Uncaught
exceptions might take the whole proxy down, and redeploying takes considerable
time. With most of the logic offloaded to a separate service, the plugin becomes
a dumb plugin that only blocks or allows requests to pass through based on the
response it gets from the service. And if the service is down, it defaults to letting
the request through.
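
The fail-open behavior described above can be sketched in a few lines. This is only an illustration written in Rust (the real plugin runs as Lua inside Kong), and the names `Decision`, `ServiceError` and `decide` are hypothetical, not taken from our codebase.

```rust
/// What the gateway plugin does with a request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Block,
}

/// Hypothetical ways the call to the rate limit service can fail.
#[derive(Debug)]
pub enum ServiceError {
    Timeout,
    Unreachable,
    BadStatus(u16),
}

/// Fail-open policy: only an explicit answer from the service can block
/// a request. Any failure to get an answer lets the request through.
pub fn decide(service_reply: Result<Decision, ServiceError>) -> Decision {
    match service_reply {
        // The service answered: do exactly what it says.
        Ok(decision) => decision,
        // The service is down, slow, or misbehaving: allow the request.
        Err(_) => Decision::Allow,
    }
}
```

With this shape, `decide(Ok(Decision::Block))` blocks the request, while `decide(Err(ServiceError::Timeout))` allows it, so an outage of the service degrades to "no rate limiting" rather than "no traffic".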
We wrote the new service in Rust. Why Rust? Well, it's 2023 and why not? Joking
aside, Rust was chosen largely based on the rumors that it's highly efficient and
fast. We don't just take everyone's word for it, so we went and tried it to see if all
the buzz and hype is well deserved.
What is Rust? Rust is a systems programming language that was originally developed at Mozilla Research with the goal of creating a safe, concurrent, and fast language. It's known for its speed, efficiency, and ability to provide fine-grained control over
low-level details, making it a good choice for writing high-performance,
memory-safe systems software.
We were nervous because this was our first time writing anything in Rust, and it's a
daunting task to take on something this big head on. Rust is known to be a good
alternative for systems programming, but we are writing a web service. Systems programming mostly needs far fewer libraries; people usually write everything they need themselves. A web service, on the other hand, needs tons of external libraries depending on how many other services it needs to interface with.
Finding the libraries we needed and putting them together into a
working application was a challenge we faced early on. Being new to Rust, we had
no idea which libraries were the most stable.
First, we searched for the recommended web frameworks. It came down to Actix and Axum, Actix being older and more mature and Axum relatively new. We employed the most scientific way to decide which framework to choose: quickly glossing over each framework's landing page. We didn't do any PoC. Science! But seriously, we checked a few benchmarks from all over the internet, and the two seem to be neck and neck in terms of performance.
Actix's website presented a very nice and incredibly magical routing mechanism.
We're sure we could get something up and running in a very short time. It employs
Sinatra-like route attributes (decorator-style macros) to map URLs to handlers and is
flexible enough to handle a wide variety of responses. Axum, on the other hand,
straight up showed us a piece of code with all the wiring required for routing and
responding to a request.
We could easily have gone with either one, but we went with Axum. Axum is from the same people who made Tokio, the asynchronous runtime that most Rust developers use. Axum uses Tokio underneath, so they probably know how to make it fast.
The Rate Limit service accesses the database to get the rules for the request
URLs it's watching. We use SQLx, which is not an ORM. It lets us write raw SQL queries
and execute them against the DB. Rust already has a popular ORM, Diesel, but
Diesel is not natively asynchronous. SQLx being asynchronous makes it a good fit with the web framework we have chosen.
The rest of the stack is Postgres (as the source of truth) and Redis (for tracking usage). The rules are managed through a CRUD UI, and updates to the DB are reflected in Redis in real time. This is the only time the DB is accessed. For processing incoming requests, everything is matched against Redis, as Redis offers much faster reads and writes.
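
To make "tracking usage" a bit more concrete, here is a minimal sketch of one common approach, a fixed-window counter. It is an assumption for illustration only: this post does not cover which algorithm the service actually uses, the `Rule` and `Limiter` types are hypothetical, and a plain `HashMap` stands in for Redis so the example needs nothing outside the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a rule for one watched URL path.
#[derive(Debug, Clone)]
pub struct Rule {
    pub limit: u64,       // max requests allowed per window
    pub window_secs: u64, // window length in seconds
}

/// In-memory stand-in for Redis: rules plus per-window request counters.
#[derive(Default)]
pub struct Limiter {
    rules: HashMap<String, Rule>,
    counters: HashMap<(String, u64), u64>,
}

impl Limiter {
    pub fn new() -> Self {
        Self::default()
    }

    /// Mirrors a rule change made in the source of truth.
    pub fn set_rule(&mut self, path: &str, rule: Rule) {
        self.rules.insert(path.to_string(), rule);
    }

    /// Counts one request and returns true if it is within the limit.
    /// `now_secs` is the current Unix time, passed in to keep this testable.
    pub fn check(&mut self, path: &str, now_secs: u64) -> bool {
        let (limit, window_secs) = match self.rules.get(path) {
            Some(rule) => (rule.limit, rule.window_secs.max(1)), // avoid divide by zero
            None => return true, // paths without a rule are not limited
        };
        // All requests in the same window share one counter.
        let window = now_secs / window_secs;
        // Old windows are never evicted here; a real store would expire them.
        let count = self.counters.entry((path.to_string(), window)).or_insert(0);
        *count += 1;
        *count <= limit
    }
}
```

For example, with a rule of 2 requests per 60 seconds on a path, the first two calls to `check` inside the same minute return `true`, the third returns `false`, and the first call in the next minute returns `true` again. Fixed windows are simple but allow bursts around window boundaries, which is one reason other schemes such as sliding windows or token buckets exist.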
There is one other piece of software that made learning and writing Rust at the same time easier than it was when we first heard about the language: Visual Studio Code, specifically with the rust-analyzer extension. The syntax highlighting, quick fixes, linting and autocompletion saved us a lot of trips to docs.rs.
So how does it perform? It's hard to say since we didn't previously have anything in place that does exactly what it does, but it runs on only 2 nodes of 128 vCPU units and 256MB memory in AWS and is able to comfortably check all incoming requests to our API without going down.