Writing a Web API Client in Rust (Part 1)
Writing a client for a web API is a trivial task in many languages. In fact, as a new student of Python, you might even build such a thing soon after a "hello world".
If you are new to Rust, you may immediately run into a few topics which I think are non-obvious, and perhaps difficult to navigate from a cold start.
This series of posts aims to break the ice by working through a toy example using the public Marvel Comics API. Along the way, I'll touch on some of the popular library choices and how they apply to the problem at hand.
Some of the Rust topics we'll cover in this series include:
- Working with a "split" crate project (multiple binaries with a shared library).
- Interior Mutability.
- Using generics to reduce code duplication.
- Making async HTTP requests with hyper, futures, and tokio.
- JSON parsing with serde.
In this post of the series, we'll be focusing on project setup and interior mutability.
If you want to skip all the rationale and explanation, you can head over to GitHub and check out marvel-explorer which is where I prototyped a bunch of the code we'll be talking about.
The Mission ◈
Marvel's Infinity War is currently showing in theaters and has been heralded as the most ambitious cross-over event in history. For this article, our mission will be to use the Marvel Comics API to answer the question:
Which was the first cross-over event to feature two particular characters?
To accomplish this, we can break the problem space down into the following areas. The tasks on the critical path are:
- A way to see how characters are identified in the remote system (so we can ask for additional information about them).
- A way to get a list of events featuring a given character.
In addition to these tasks, we need to implement some sort of system for safely building the API URLs in a way which satisfies the authorization requirements set by the service.
This "URL building" aspect is the first we'll tackle as we get underway.
Project Setup ◈
Following the marvel-explorer prototype, built as a companion to this article, we'll start by creating a new "lib" cargo project.
$ cargo new --lib marvel-explorer
The problem space we've defined could call for more than one program, each sharing some common code. Cargo accommodates this nicely: while a project can have only one library target, it can still have many additional binary targets.
In your Cargo.toml you can define extra sections for each binary target, mapping a name to a "main" source file (which must contain a main() function). For example:
[[bin]]
name = "foo"
path = "src/any/old/path/foo.rs"
[[bin]]
name = "bar"
path = "extras/bin/bar/main.rs"
In this case, we've defined two programs named foo and bar. Note that the name of the source file is inconsequential, and that the sources are not even required to be under your project's src/ directory.
This is the explicit way to define binary targets, but since this is a common practice, there is a more convenient implicit way that follows a simple convention: any Rust source file matching src/bin/NAME.rs produces a binary target called NAME.
Regardless of how you define binaries in your project, each one can pull in our lib with extern crate to get access to it (in code the crate is named marvel_explorer, since hyphens in a package name become underscores).
While we're ready to share code across multiple binaries, all the code we'll write today will be in src/lib.rs (part of our shared lib target).
Now, on to the coding!
Building API URLs ◈
In the simplest case, building URLs could be done with basic string formatting, but to do this safely I tend to use the url crate. I say "safely" because your URLs must be percent-encoded where required, and when you accept user input to build them, there's no way to know what will get fed in. It's nice to have a library take care of the encoding concerns for you!
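To make the encoding concern concrete, here is a minimal sketch that uses only the standard library. The helper names (percent_encode_component, naive_query, encoded_query) are hypothetical and exist purely for illustration. This is not how the url crate is implemented, and its exact rules for query strings may differ (form-style encoding, for instance, typically writes a space as +).

```rust
/// Percent-encode every byte of `input` except the RFC 3986 "unreserved"
/// characters (ASCII letters, digits, '-', '.', '_' and '~').
/// Hypothetical helper for illustration; in the article the url crate
/// takes care of this for us.
fn percent_encode_component(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else becomes %XX, using the UTF-8 byte value.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Naive formatting: the user input lands in the query string untouched.
fn naive_query(name: &str) -> String {
    format!("characters?name={}", name)
}

/// Safer formatting: the user input is encoded before it is spliced in.
fn encoded_query(name: &str) -> String {
    format!("characters?name={}", percent_encode_component(name))
}
```

With an input such as Thor&apikey=oops, the naive version lets the caller smuggle in an extra query parameter, while the encoded version keeps the whole input inside the name value as Thor%26apikey%3Doops.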
As it happens, the Marvel Comics API is not the simplest case. The service uses a "shared secret" authorization scheme where each request must include the following query string parameters:
- apikey, your public key in clear text.
- ts, a value which varies from request to request (Marvel recommends using a timestamp for this).
- hash, the md5 hash of ts + your private key + apikey.
By sending all this information, Marvel can verify each request by repeating the hashing process on their end to confirm that we both have the same private key string. If the private key portion differs, the resulting hashes would not match!
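The verification idea can be sketched without any external crates. Everything below is an assumption made for illustration: the function names are hypothetical, the server-side logic is only my reading of the scheme described above, and toy_digest is a deliberately trivial stand-in that is NOT md5 and offers no security.

```rust
/// Build the string that gets hashed: ts + private key + public key.
fn hash_input(ts: &str, private_key: &str, public_key: &str) -> String {
    format!("{}{}{}", ts, private_key, public_key)
}

/// Client side: compute the `hash` query parameter.
/// `digest` stands in for the real md5 function.
fn sign<F>(ts: &str, private_key: &str, public_key: &str, digest: F) -> String
where
    F: Fn(&str) -> String,
{
    digest(&hash_input(ts, private_key, public_key))
}

/// Server side (assumed behavior): repeat the hashing with the private key
/// on file for this public key, then compare against the hash we received.
fn verify<F>(
    ts: &str,
    received_hash: &str,
    public_key: &str,
    private_key_on_file: &str,
    digest: F,
) -> bool
where
    F: Fn(&str) -> String,
{
    sign(ts, private_key_on_file, public_key, digest) == received_hash
}

/// Toy digest for demonstration only: NOT md5 and not secure.
fn toy_digest(input: &str) -> String {
    let sum = input
        .bytes()
        .fold(0u32, |acc, b| acc.wrapping_mul(31).wrapping_add(b as u32));
    format!("{:08x}", sum)
}
```

Note that the private key itself never travels over the wire; only the digest does, which is the point of the scheme.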
To compute the md5 hash these requirements call for, we also bring in the rust-crypto crate.
Finally, while the url crate is nice for building URLs programmatically, the url::Url struct it provides is actually not compatible with the HTTP client provided by hyper. In order to make our requests, we'll have to convert our final url::Url into a hyper::Uri, so we need to bring in hyper as well.
extern crate crypto;
extern crate hyper;
extern crate url;

use std::cell::RefCell;
use std::time::{SystemTime, UNIX_EPOCH};

use crypto::digest::Digest;
use crypto::md5::Md5;
use hyper::Uri;
use url::Url;

struct UriMaker {
    /// Our Marvel API *public* key
    key: String,
    /// Our Marvel API *private* key
    secret: String,
    /// The prefix of every url we'll be producing.
    api_base: String,
    /// Our md5 hasher, used to generate our `hash` query
    /// string parameter.
    hasher: RefCell<Md5>,
}
The above code lays out the general types we'll be working with for this problem. This struct has been modeled to hold all the pieces we'll need to do the work we've outlined. Basic Strings are used to store the public and private keys, as well as the common prefix for all the URLs we want to build.
The hasher field is a little more exotic in comparison. First, a little background on the hasher itself.
The Md5 hasher we get from the crypto crate has an API whereby you feed in bytes of data, potentially via multiple calls, then at some point later you request the hash digest, which is the string of characters the hasher has computed for the supplied input. In order for this to work, the various inputs fed into the hasher are buffered internally, which in Rust requires mutable access to the data structure.
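As a rough illustration of that "feed, then read out" shape, here is a toy streaming hasher built only on the standard library. ToyHasher and hash_chunks are hypothetical names, and the algorithm (64-bit FNV-1a) is a stand-in chosen for brevity: it is NOT md5 and NOT the rust-crypto implementation. The method names merely echo the ones used later in this post.

```rust
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Toy streaming hasher (64-bit FNV-1a), for illustration only.
struct ToyHasher {
    state: u64,
}

impl ToyHasher {
    fn new() -> ToyHasher {
        ToyHasher { state: FNV_OFFSET_BASIS }
    }

    /// Feeding data changes the internal state, hence `&mut self`.
    fn input_str(&mut self, input: &str) {
        for byte in input.bytes() {
            self.state ^= byte as u64;
            self.state = self.state.wrapping_mul(FNV_PRIME);
        }
    }

    /// Forget everything fed in so far, so the hasher can be reused.
    fn reset(&mut self) {
        self.state = FNV_OFFSET_BASIS;
    }

    /// Read out the digest for everything fed in so far, as hex.
    fn result_str(&self) -> String {
        format!("{:016x}", self.state)
    }
}

/// Feed several chunks in order and return the resulting digest.
fn hash_chunks(chunks: &[&str]) -> String {
    let mut hasher = ToyHasher::new();
    for chunk in chunks {
        hasher.input_str(chunk);
    }
    hasher.result_str()
}
```

Because the state carries over between calls, feeding "ts", "secret" and "key" one after another gives the same digest as feeding the concatenated string in one go, and every one of those feeding calls needs a mutable hasher.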
In Rust, the mutability of objects is managed by something called the borrow checker. The borrow checker acts like a lock, ensuring that only one part of the program can modify an object at a time.
When I first started working with Rust, I was confused by the borrow checker barking at me because of my unintentional double-borrow attempts, and how I had to keep marking things as mutable with the mut keyword even though I felt like I shouldn't have to.
A big gap in my initial mental model for how the borrow checker works was
around how the mutable state of an object is all or nothing. If you need to
call a method on an object which modifies some private field, the entire object
needs to be marked as mutable.
This can be a big hassle for the callers of your code since it means they will need to be aware of what operations will require the object to be mut. Additionally, this impacts any objects, all the way up the ownership chain, which own fields of a type requiring mutability to function.
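A small sketch of how this plays out, using hypothetical Tally and Report types that are not part of marvel-explorer: one private counter forces &mut self on the method that touches it, then on the owner's method, then on the caller.

```rust
/// A component that needs to mutate a private field to do its work.
struct Tally {
    count: u32,
}

impl Tally {
    /// Touching `count` forces `&mut self` here...
    fn bump(&mut self) -> u32 {
        self.count += 1;
        self.count
    }
}

/// An owner further up the chain (hypothetical type).
struct Report {
    title: String,
    tally: Tally,
}

impl Report {
    fn new(title: &str) -> Report {
        Report {
            title: title.to_string(),
            tally: Tally { count: 0 },
        }
    }

    /// ...which forces `&mut self` here too, even though `title`
    /// never changes.
    fn next_heading(&mut self) -> String {
        let n = self.tally.bump();
        format!("{} #{}", self.title, n)
    }
}

/// Callers, in turn, must hold a `&mut Report` (or a `let mut` binding).
fn first_two_headings(report: &mut Report) -> (String, String) {
    (report.next_heading(), report.next_heading())
}
```

A caller has to write let mut report = Report::new("Event"); before any of this compiles, even though from the outside next_heading() looks like a read-only question.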
This is where RefCell can help! RefCell can be employed to limit the scope of a mutability change such that the pivot from immutable to mutable is not seen further up the chain of ownership.
The term for this is managing interior mutability.
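Here is a minimal sketch of interior mutability in isolation, assuming a hypothetical Report type with a private counter (not part of marvel-explorer). The counter lives in a RefCell, so the method that changes it can take &self and callers never need a mutable binding.

```rust
use std::cell::RefCell;

struct Report {
    title: String,
    /// Wrapped in a `RefCell` so it can change behind a shared `&self`.
    counter: RefCell<u32>,
}

impl Report {
    fn new(title: &str) -> Report {
        Report {
            title: title.to_string(),
            counter: RefCell::new(0),
        }
    }

    /// Note `&self`, not `&mut self`: the mutation stays an internal detail.
    fn next_heading(&self) -> String {
        // Ask the `RefCell` for a mutable handle to the value inside.
        let mut counter = self.counter.borrow_mut();
        *counter += 1;
        format!("{} #{}", self.title, *counter)
    }
}

/// The caller only needs a shared reference.
fn first_two_headings(report: &Report) -> (String, String) {
    (report.next_heading(), report.next_heading())
}
```

A plain let report = Report::new("Event"); is enough here; no mut is needed on the binding or on anything that owns it.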
As we implement methods for the UriMaker struct, we'll see how RefCell works in practice.
impl UriMaker {
    /// convenience method to initialize a new `UriMaker`.
    pub fn new(
        key: String,
        secret: String,
        api_base: String
    ) -> UriMaker {
        UriMaker {
            key,
            secret,
            api_base,
            hasher: RefCell::new(Md5::new()),
        }
    }

    /// Produces an md5 digest hash for ts + private key + public key
    fn get_hash(&self, ts: &str) -> String {
        // The `RefCell` lets us get a mutable reference to the
        // object within while not having to flag the whole `UriMaker`
        // as mutable.
        let mut hasher = self.hasher.borrow_mut();
        hasher.reset();
        hasher.input_str(ts);
        hasher.input_str(&self.secret);
        hasher.input_str(&self.key);
        hasher.result_str()
    }

    /// Convert from a `url::Url` to a `hyper::Uri`.
    fn url_to_uri(url: &url::Url) -> Uri {
        url.as_str().parse().unwrap()
    }

    /// Append a path to the api root, and set the authorization
    /// query string params.
    fn build_url(
        &self,
        path: &str
    ) -> Result<Url, url::ParseError> {
        let ts = {
            let since_the_epoch =
                SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
            let ms = since_the_epoch.as_secs() * 1000
                + since_the_epoch.subsec_nanos() as u64 / 1_000_000;
            format!("{}", ms)
        };
        let hash = &self.get_hash(&ts);
        let mut url = Url::parse(&self.api_base)?.join(path)?;
        url.query_pairs_mut()
            .append_pair("ts", &ts)
            .append_pair("hash", hash)
            .append_pair("apikey", &self.key);
        Ok(url)
    }

    // ... snip ...
}
In the above code, we're defining methods on UriMaker. We've attached a public UriMaker::new() method which simply returns a new instance, handling the creation of the hasher for the caller. Notice that all the fields are private; there's currently no need for access to these from the outside.
In addition to UriMaker::new(), we've also added a few private helpers, which the methods making up our public interface will call.
In the get_hash() method, we are able to get access to a mutable reference to the hasher thanks to the RefCell wrapper. If we didn't have RefCell::borrow_mut() to manage the mutability for us, get_hash() would need to be defined instead with &mut self in the parameter list, which in turn would require whatever owned UriMaker to be mutable as well.
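One caveat worth keeping in mind: RefCell does not switch the exclusivity rule off, it checks it at run time instead of compile time. Asking for a second mutable borrow while the first is still alive makes borrow_mut() panic. The sketch below uses the standard library's try_borrow_mut(), which reports the conflict as an Err; the two function names are hypothetical and only there for the demonstration.

```rust
use std::cell::RefCell;

/// Returns true if a second mutable borrow succeeds while the first
/// one is still alive. With `RefCell` it does not.
fn second_borrow_while_first_alive(cell: &RefCell<Vec<u8>>) -> bool {
    let _first = cell.borrow_mut();
    // `try_borrow_mut` reports the conflict as an `Err` instead of
    // panicking the way `borrow_mut` would here.
    cell.try_borrow_mut().is_ok()
}

/// Returns true if a second mutable borrow succeeds once the first
/// one has gone out of scope.
fn second_borrow_after_first_dropped(cell: &RefCell<Vec<u8>>) -> bool {
    {
        let mut first = cell.borrow_mut();
        first.push(1);
    } // `first` is dropped here, releasing the borrow.
    cell.try_borrow_mut().is_ok()
}
```

In get_hash() this is not a concern, because the mutable borrow is taken and released within a single method call.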
The UriMaker::url_to_uri() method is just a little helper to convert from one type to the other. We'll be calling it to finalize the values of our public methods right before they return.
The build_url() method simply assembles all the parts. We take the prefix + path + standard query string and package it all up in a url::Url which the caller can modify further as needed. The common case for further modification is to simply add additional query string parameters (as we'll see in this next code sample).
Moving on, we'll flesh out the public interface for this type.
impl UriMaker {
    // ... snip ... continued from the previous code sample

    /// Lookup character data by name (exact match).
    pub fn character_by_name_exact(&self, name: &str) -> Uri {
        let mut url = self.build_url("characters").unwrap();
        url.query_pairs_mut().append_pair("name", name);
        Self::url_to_uri(&url)
    }

    /// Lookup character data by name (using a "starts with" match).
    pub fn character_by_name(&self, name_starts_with: &str) -> Uri {
        let mut url = self.build_url("characters").unwrap();
        url.query_pairs_mut()
            .append_pair("nameStartsWith", name_starts_with);
        Self::url_to_uri(&url)
    }

    /// Get all the events for a given character.
    pub fn character_events(&self, character_id: i32) -> Uri {
        let mut url = self.build_url(
            &format!("characters/{}/events", character_id)
        ).unwrap();
        url.query_pairs_mut()
            // 100 is currently the largest limit we can set.
            .append_pair("limit", "100");
        Self::url_to_uri(&url)
    }
}
These public methods offer a simple interface to turn questions like, "which characters match this name prefix?" or, "which events were this character featured in?" into fully-fledged hyper::Uri instances we can pass directly to hyper to fetch some answers.
Wrapping Up ◈
In this post, we looked at how to arrange a cargo project for sharing code across multiple binary targets, and at how RefCell lets UriMaker keep its hasher's mutability an internal detail while it builds signed API URLs.
In the next post, we'll introduce the serde and futures crates, then build a new struct called MarvelClient which will house all the hyper HTTP client tech we'll need to actually make requests.