Practical Networked Applications in Rust, Part 2: Networked Key-Value Store
Welcome to the second installment in my series on taking the Practical Networked Applications in Rust course, kindly provided by PingCAP, in which you develop a networked and multithreaded/asynchronous key-value store in the amazing Rust language. You can find my previous post in this series here.
In that first post, I implemented the course module covering the fundamental key-value store functionality, based on the Bitcask algorithm, which only allowed for local use on your own machine. In the second module of the coursework, I add networking functionality, splitting the application into a client/server architecture so that clients can connect to servers across the network.
You can find all the code for this module of the course here.
Client/Server Architecture
The first fundamental change I had to make to the application as part of this course module was to split it into two executables, kvs-client and kvs-server. In practice, this is done by adding two source files, src/bin/kvs-client.rs and src/bin/kvs-server.rs, as well as two corresponding entries to Cargo.toml:
[[bin]]
name = "kvs-server"
test = false
doctest = false
[[bin]]
name = "kvs-client"
test = false
doctest = false
This makes Cargo build the two executables kvs-client and kvs-server.
In addition to the source files backing the two executables, the project must contain a library component holding both the shared code and the client- and server-specific code. The main file of the library component, from which the other modules get imported, is src/lib.rs.
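For a rough idea of the layout, the library's main file might look something like the following. The module names here are my own guesses for illustration, not necessarily the actual structure of the project:

// src/lib.rs -- the library component shared by the two binaries.
// Module names below are illustrative placeholders.
pub mod common;  // the Command/Response types making up the wire protocol
pub mod engine;  // the storage engines (kvs and, later, sled)
pub mod error;   // shared error types

pub use common::{Command, CommandType, ErrorType, Response};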
The Client
The client executable supports the same sub-commands as the program developed in the first module, with the difference that each now takes an option called --address, which controls which server to connect to. The default is a server running on localhost, port 4000. Pretty cool that we can now control a key-value store server on a different network host!
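To give a feel for how this looks on the client side, here is a minimal argument-parsing sketch using the clap crate (only the get sub-command is shown, and the exact structure is an assumption on my part rather than the course solution):

use clap::{App, Arg, SubCommand};

fn main() {
    // Every sub-command accepts an --address option; the client defaults to
    // connecting to 127.0.0.1:4000 when the user doesn't specify one.
    let matches = App::new("kvs-client")
        .subcommand(
            SubCommand::with_name("get")
                .arg(Arg::with_name("KEY").required(true))
                .arg(
                    Arg::with_name("address")
                        .long("address")
                        .takes_value(true)
                        .default_value("127.0.0.1:4000"),
                ),
        )
        .get_matches();

    if let Some(m) = matches.subcommand_matches("get") {
        let key = m.value_of("KEY").unwrap();
        let addr = m.value_of("address").unwrap();
        // ... connect to `addr` and issue a get for `key` ...
        println!("get {} from {}", key, addr);
    }
}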
Network Protocol
As implied by the above, the different sub-commands of the client, for example set, are now sent across the network to be interpreted by the server. Since the course doesn’t stipulate any protocol for client/server communication beyond using TCP, I had to design my own. I chose to define a simple command structure, serialized with Bincode and sent across the wire to the server:
/// Type of server command.
#[derive(Serialize, Deserialize, Debug)]
pub enum CommandType {
    Get,
    Set,
    Remove,
}

/// Server command.
#[derive(Serialize, Deserialize, Debug)]
pub struct Command {
    pub r#type: CommandType,
    pub key: String,
    pub value: String,
}
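To make the usage concrete, here is a small sketch of how the client might build and Bincode-serialize such a command before sending it (the helper function is hypothetical; it simply exercises the types defined above):

// Hypothetical helper: build a `set` command and serialize it to bytes.
fn encode_set(key: &str, value: &str) -> Vec<u8> {
    let cmd = Command {
        r#type: CommandType::Set,
        key: key.to_string(),
        value: value.to_string(),
    };
    bincode::serialize(&cmd).expect("serialization failed")
}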
The server responds to commands with another self-designed structure, which also gets serialized with Bincode before being sent over the wire:
/// Error type of server response.
#[derive(Serialize, Deserialize, Debug)]
pub enum ErrorType {
    NonExistentKey,
    Unknown,
}

/// Server response.
#[derive(Serialize, Deserialize, Debug)]
pub struct Response {
    pub value: Option<String>,
    pub error: Option<ErrorType>,
    pub error_message: Option<String>,
}
The server may indicate that an error occurred by including appropriate information in the response.
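For example, the client might translate a deserialized Response into a result along these lines (a sketch; the exact mapping, and the hypothetical interpret function, are my own):

// Hypothetical helper: map a server Response into a client-side result.
fn interpret(resp: Response) -> Result<Option<String>, String> {
    match resp.error {
        None => Ok(resp.value),
        Some(ErrorType::NonExistentKey) => Err("Key not found".to_string()),
        Some(ErrorType::Unknown) => Err(resp.error_message.unwrap_or_default()),
    }
}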
TCP Networking
The TCP networking itself, i.e. sending data to the server and receiving data back, is made quite simple by the Rust standard library. More specifically, I use the std::net::TcpStream type for this.
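A minimal sketch of the request/response round trip with TcpStream could look like this. The framing detail of shutting down the write half so the peer knows the message is complete is my own choice for illustration, not necessarily how the course code does it:

use std::io::{Read, Write};
use std::net::{Shutdown, TcpStream};

// Hypothetical helper: connect, send a Bincode-serialized command, and read
// the raw response bytes back. Error handling is simplified.
fn send_command(addr: &str, cmd: &Command) -> std::io::Result<Vec<u8>> {
    let mut stream = TcpStream::connect(addr)?;
    let bytes = bincode::serialize(cmd).expect("serialization failed");
    stream.write_all(&bytes)?;
    // Close our write half so the server's read loop sees end-of-stream.
    stream.shutdown(Shutdown::Write)?;
    let mut response = Vec::new();
    stream.read_to_end(&mut response)?;
    Ok(response)
}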
The Server
The server is a more interesting topic than the client, because it sports two different storage engines: in addition to the Bitcask-inspired engine from the previous module, the course stipulates supporting the popular embedded Rust key-value store Sled. To allow choosing between the two, the server executable takes an --engine flag. If the user doesn’t specify an engine, we either detect the one previously used or fall back to our hand-rolled alternative (named kvs). Picking a different engine than the one previously used is considered an error.
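One way to realize this engine selection is sketched below. The idea of recording the chosen engine in a small marker file inside the data directory is an assumption of mine; the real implementation may detect it differently:

use std::fs;
use std::path::Path;

// Return the engine recorded by a previous run, if any (assumes a plain-text
// marker file named "engine" in the data directory).
fn previous_engine(data_dir: &Path) -> Option<String> {
    fs::read_to_string(data_dir.join("engine"))
        .ok()
        .map(|s| s.trim().to_string())
}

// Decide which engine to run with, erroring out on a mismatch.
fn choose_engine(data_dir: &Path, requested: Option<&str>) -> Result<String, String> {
    match (previous_engine(data_dir), requested) {
        (Some(prev), Some(req)) if prev != req => {
            Err(format!("wrong engine: data was written with {}", prev))
        }
        (Some(prev), _) => Ok(prev),
        (None, Some(req)) => Ok(req.to_string()),
        (None, None) => Ok("kvs".to_string()),
    }
}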
Once launched, the server uses TcpListener to receive incoming connections from clients (each represented as a TcpStream object). For each incoming connection, the server attempts to interpret the data sent as a Bincode-serialized command (see the definition in the Network Protocol section above). Depending on the command type, it performs either a get, set or remove operation, delegating the operation itself to the configured storage engine. The result of the operation then gets written back to the client, again Bincode-serialized. If an error occurred during the operation, information about it gets encoded in the response for the client to interpret.
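Put together, the accept loop and request handling look roughly like the sketch below. KvStore and the get/set/remove signatures stand in for whatever engine interface is actually used, so treat this as an outline rather than the real server code:

use std::net::{TcpListener, TcpStream};

// Accept client connections one at a time and serve each synchronously.
fn run(addr: &str, engine: &mut KvStore) -> std::io::Result<()> {
    let listener = TcpListener::bind(addr)?;
    for stream in listener.incoming() {
        if let Ok(stream) = stream {
            handle_client(stream, engine);
        }
    }
    Ok(())
}

fn handle_client(stream: TcpStream, engine: &mut KvStore) {
    // Interpret the incoming bytes as a Bincode-serialized Command.
    let cmd: Command = match bincode::deserialize_from(&stream) {
        Ok(cmd) => cmd,
        Err(_) => return,
    };
    // Small helpers for building successful and failed responses.
    let ok = |value| Response { value, error: None, error_message: None };
    let err = |kind, msg: String| Response {
        value: None,
        error: Some(kind),
        error_message: Some(msg),
    };
    // Delegate the operation to the configured storage engine.
    let response = match cmd.r#type {
        CommandType::Get => match engine.get(cmd.key) {
            Ok(value) => ok(value),
            Err(e) => err(ErrorType::Unknown, e.to_string()),
        },
        CommandType::Set => match engine.set(cmd.key, cmd.value) {
            Ok(()) => ok(None),
            Err(e) => err(ErrorType::Unknown, e.to_string()),
        },
        CommandType::Remove => match engine.remove(cmd.key) {
            Ok(()) => ok(None),
            // Simplification: treat any removal failure as a missing key.
            Err(e) => err(ErrorType::NonExistentKey, e.to_string()),
        },
    };
    // Write the Bincode-serialized response back to the client.
    let _ = bincode::serialize_into(&stream, &response);
}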
It’s worth mentioning that the server is written such that it can only handle a single request at a time, synchronously, so it’s not very performant (although easily understood as a result). Parallelization of request handling gets introduced in the next course module. Exciting!
Benchmarks
A part of this module of the course is to write and run benchmarks. Rust has built-in benchmarking tooling, but the course stipulates using the third-party framework Criterion instead. It plugs into Cargo with some tweaking of Cargo.toml, so you can still use the standard cargo bench command:
[[bench]]
name = "engine_benches"
harness = false
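For reference, the body of a Criterion benchmark file looks something like this simplified sketch (the real benchmarks naturally involve more setup, such as populating a store first):

use criterion::{criterion_group, criterion_main, Criterion};

fn bench_get(c: &mut Criterion) {
    // Measure a single get operation; the store setup is elided here.
    c.bench_function("kvs get", |b| {
        b.iter(|| {
            // ... call the engine's get(...) here ...
        })
    });
}

criterion_group!(benches, bench_get);
criterion_main!(benches);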
Below, I’ve included plots from benchmarking the get operation:
And the set operation:
Conclusion
I hope you enjoyed this part of the series! More to come!