Writing a Web API Client in Rust (Part 3)
Continuing on from the previous post in the series (Part 2), this post will be focusing on making async HTTP requests, generics, and JSON parsing with serde.
This time we'll be assembling all the pieces we've built thus far, and calling them from some new binary targets.
Finally, we will use the API client library to answer the questions we originally set out to answer:
Which was the first cross-over event to feature two particular characters?
Once again:
If you want to skip all the rationale and explanation, you can head over to GitHub and check out marvel-explorer which is where I prototyped a bunch of the code we'll be talking about.
Exploring Marvel's Character Data ◈
The problem we're trying to solve will require a couple different programs to help us explore the Marvel API.
The first program will be for searching for characters by name so we can pick out specific entities (or records). For this, we can leverage the helper function, UriMaker#character_by_name(), which we wrote in Part 1.
The function was written to build a URL we can use to fetch a list of character entities that have a name starting with a given prefix.
This program is important for the exploration process since otherwise, we'd be stabbing in the dark looking for exact matches. At least now we can cast a wide net and see what comes back.
Much of the work for this program will happen in src/lib.rs, but there is also some argument and presentation handling over in src/bin/character-search.rs.
Here's an example of the usage:
$ cargo run --bin character-search thor
Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/character-search thor`
+---------+----------------------------------+
| ID | Name |
+---------+----------------------------------+
| 1009664 | Thor |
| 1017576 | Thor (Goddess of Thunder) |
| 1017106 | Thor (MAA) |
| 1017315 | Thor (Marvel Heroes) |
| 1017328 | Thor (Marvel War of Heroes) |
| 1017302 | Thor (Marvel: Avengers Alliance) |
| 1011025 | Thor (Ultimate) |
| 1010820 | Thor Girl |
+---------+----------------------------------+
The actual program also includes a third column with a truncated description which has been cut out for readability's sake.
Modeling Character Responses ◈
The documentation for the Marvel Comics API shares a fairly detailed blueprint for what the various responses are shaped like.
A simplified version of the responses for "character" data might be:
{
  "data": {
    "results": [
      {"id": 1, "name": "...", "description": "..."},
      {"id": 2, "name": "...", "description": "..."},
      {"id": 3, "name": "...", "description": "..."},
      {"id": 4, "name": "...", "description": "..."}
    ]
  }
}
Many fields have been omitted at each level of the JSON data.
We don't need them to complete our task.
Likewise, serde does not require us to exhaustively model the data we will receive; we only have to model the data we want to extract.
We can write some structs to mirror the anticipated types of the fields (just as before in Part 2).
#[derive(Debug, Deserialize)]
pub struct Character {
    pub id: i32,
    pub name: String,
    pub description: String,
}

#[derive(Debug, Deserialize)]
struct CharacterDataWrapper {
    pub data: CharacterDataContainer,
}

#[derive(Debug, Deserialize)]
struct CharacterDataContainer {
    pub results: Vec<Character>,
}
If you look closely at the spec in the API docs, you may notice that the intermediate fields, data and results, are actually optional.
If we were to update our model to account for this, we could have our structs use Option<T> for each field. Still, this might be pedantic and overly verbose for our use case. We have modeled the data we need, and anything less will effectively mean we can't offer a meaningful return value.
Another way to think about it is, if at any point one of the fields turns out to be null or missing, the end result (for us) is the same. We cannot proceed any further, so the outcome can only be failure. By modeling our needs we can skip the formality of carefully interrogating each Option to see if it is Some or None. Instead, we can simply ask serde if the data matches our expectations.
If fields are missing, serde will give us an Err when we try to unpack the data into our struct.
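To make the trade-off concrete, here is a rough sketch of what the "everything is optional" alternative might look like. It uses plain structs with no serde involved, and the Loose* names and extract_results() are hypothetical; none of them exist in marvel-explorer.

```rust
// Hypothetical "loose" model: every intermediate field is optional,
// mirroring what the API spec technically allows.
#[derive(Debug, Clone, PartialEq)]
pub struct Character {
    pub id: i32,
    pub name: String,
}

#[derive(Debug)]
pub struct LooseWrapper {
    pub data: Option<LooseContainer>,
}

#[derive(Debug)]
pub struct LooseContainer {
    pub results: Option<Vec<Character>>,
}

/// Every caller has to peel the layers apart by hand. A missing layer at
/// any depth still ends in the same place: there is nothing to return.
pub fn extract_results(wrapper: LooseWrapper) -> Result<Vec<Character>, String> {
    wrapper
        .data
        .and_then(|container| container.results)
        .ok_or_else(|| String::from("response was missing `data` or `results`"))
}
```

All of that unwrapping collapses into the same single failure we get for free from the strict model, which is why the strict model is probably the better fit here.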
Fetching the Character List ◈
To build our program, we'll write a function that can accept a name prefix. The pieces we built earlier in the series, Parts 1 and 2, are finally put to work for this.
In src/lib.rs we add on to MarvelClient.
impl MarvelClient {
    // ... snip ... continued from the work done in Part 2

    pub fn search_characters(
        &self,
        name_prefix: &str
    ) -> Result<Vec<Character>, io::Error> {
        let uri = self.uri_maker.character_by_name(name_prefix);
        let work = self.get_json(uri).and_then(|value| {
            let wrapper: CharacterDataWrapper =
                serde_json::from_value(value).map_err(to_io_error)?;
            Ok(wrapper.data.results)
        });
        self.core.borrow_mut().run(work)
    }
}
Our new method puts all those pieces together. We use:
- UriMaker to build our Uri.
- MarvelClient#get_json() to prepare a Future that will run our request.
- Future#and_then() to transform the return from that Future.
- our helper to_io_error() to convert any potential serde_json::Error to std::io::Error so as to be compatible with hyper's internal error handling.
- our core to schedule the Future work, and block until it completes.
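The error-conversion step is easy to gloss over, so here is a minimal, std-only sketch of the pattern. The real to_io_error() was written in Part 2; the body below is only a guess at its shape, and parse_id() is a hypothetical stand-in that uses ParseIntError where our client would see serde_json::Error.

```rust
use std::error::Error;
use std::io;

/// Assumed shape of a `to_io_error`-style helper: wrap any error type in
/// an `io::Error` so it fits a `Result<_, io::Error>` signature.
pub fn to_io_error<E>(err: E) -> io::Error
where
    E: Into<Box<dyn Error + Send + Sync>>,
{
    io::Error::new(io::ErrorKind::Other, err)
}

/// Stand-in for the "parse, then convert the error" step. The parse error
/// here is `ParseIntError` rather than `serde_json::Error`.
pub fn parse_id(raw: &str) -> Result<i32, io::Error> {
    let id = raw.trim().parse::<i32>().map_err(to_io_error)?;
    Ok(id)
}
```

The `.map_err(to_io_error)?` line has the same shape as the one inside search_characters(): convert the foreign error first, then let `?` return early.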
Writing the character-search binary ◈
With this method written, we can build our command line program in src/bin/character-search.rs:
extern crate marvel_explorer;
#[macro_use]
extern crate prettytable;

use marvel_explorer::MarvelClient;
use prettytable::Table;
use prettytable::format;
use std::env;

fn main() {
    // read our auth info from environment vars
    let key = env::var("MARVEL_KEY").unwrap();
    let secret = env::var("MARVEL_SECRET_KEY").unwrap();

    // create an instance of our client.
    let client = MarvelClient::new(key, secret);

    // read a hero name to search for from the
    // arguments to our program.
    let name = env::args().nth(1).expect("name");

    match client.search_characters(&name) {
        Err(e) => eprintln!("{:?}", e),
        Ok(results) => {
            // Create the table
            let mut table = Table::new();
            table.set_format(
                *format::consts::FORMAT_NO_LINESEP_WITH_TITLE);
            // Set the header row
            table.set_titles(row!["ID", "Name"]);
            // Add one row per matching character
            for character in &results {
                table.add_row(row![character.id, character.name]);
            }
            table.printstd();
        }
    };
}
Most of the work has already been done for us in our library code.
The binary program brings in our library and pulls keys for the Marvel API from environment variables, which need to be set in the shell before running the program.
See the "Get a Key" section of the Marvel Comics API site for how to get your own keys.
It also reads from the arguments list to capture the name we want to search for.
Once all the inputs for our request have been collected, we can make our request by calling our new method.
Since the method returns a Result, and we're at the top of our call stack, we finally use it to present the outcome to the user. We use match to inform the user of either the success or failure of our request.
In the case of an error, we simply print to stderr some debug information about the error itself by using the debug format token ("{:?}"). In a more complete program, we might take further steps to provide better details or recommendations to the user in this section of the program.
In the case of success, we format the data and print it out. I found a crate to help me with this. Using prettytable-rs, we can set some table headers and insert rows, one per matching entity, then print the result.
The output looks like the example up at the start of the post.
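As one example of the "further steps" a more complete program might take, the two .unwrap() calls on the environment variables could be swapped for something that names the missing variable. This is only a sketch: read_credentials() and its lookup parameter are hypothetical, and the lookup is passed in so the logic can be exercised without touching the real environment.

```rust
/// Look up both API credentials, reporting exactly which variable is
/// missing. In a binary, `lookup` would be `|name| std::env::var(name).ok()`.
pub fn read_credentials<F>(lookup: F) -> Result<(String, String), String>
where
    F: Fn(&str) -> Option<String>,
{
    let fetch = |name: &str| {
        lookup(name).ok_or_else(|| format!("environment variable `{}` is not set", name))
    };
    let key = fetch("MARVEL_KEY")?;
    let secret = fetch("MARVEL_SECRET_KEY")?;
    Ok((key, secret))
}
```

In main(), the Err case could then be printed with eprintln! before exiting, instead of surfacing as a panic message from .unwrap().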
Exploring Marvel's Event Data ◈
In Part 1 we wrote a method for building URLs to fetch a list of events associated with a specific character id.
We will now use this to build a binary in src/bin/character-events.rs.
Here is some example output using the top id from our earlier search for "thor":
$ cargo run --bin character-events 1009664
Compiling marvel-explorer v0.1.0 (file:///home/owen/projects/marvel-explorer)
Finished dev [unoptimized + debuginfo] target(s) in 4.39 secs
Running `target/debug/character-events 1009664`
+-----+-----------------------+---------------------+
| ID | Title | Date |
+-----+-----------------------+---------------------+
| 116 | Acts of Vengeance! | 1989-12-10 00:00:00 |
| 233 | Atlantis Attacks | 1989-01-01 00:00:00 |
... snip ...
| 273 | Siege | 2009-12-06 00:00:00 |
| 60 | World War Hulks | 2007-07-07 00:00:00 |
+-----+-----------------------+---------------------+
Again, in the actual program, there is a 4th column with a truncated description which is omitted for readability.
Modeling Event Responses ◈
Just as with the character data, we begin by building some structs to represent the data.
#[derive(Clone, Debug, Deserialize, Eq, Hash, PartialEq)]
pub struct Event {
    pub id: i32,
    pub title: String,
    pub start: Option<String>,
    pub description: String,
}

#[derive(Debug, Deserialize)]
struct EventDataWrapper {
    pub data: EventDataContainer,
}

#[derive(Debug, Deserialize)]
struct EventDataContainer {
    pub results: Vec<Event>,
}
This time, we will be just a bit more permissive. We'll define the Event.start field as Option<String>. This is to say, if there are any events in the response without dates associated with them, we don't want to reject the whole data set.
Instead, we can sort the entities without a date to the bottom of our list, or filter them out completely when searching for the earliest event.
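Both options can be sketched with the standard library alone. The Event below is a trimmed, serde-free copy of our struct, and sort_undated_last() and dated_only() are hypothetical helper names. The sketch also assumes the start strings keep the "YYYY-MM-DD HH:MM:SS" shape seen in the sample output, so that plain string comparison matches chronological order.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub id: i32,
    pub title: String,
    pub start: Option<String>,
}

/// Sort events by start date, oldest first, with undated events last.
/// The built-in ordering on `Option` puts `None` *before* `Some(_)`, so
/// the `None` cases are spelled out explicitly.
pub fn sort_undated_last(events: &mut [Event]) {
    events.sort_by(|a, b| match (&a.start, &b.start) {
        (Some(x), Some(y)) => x.cmp(y),
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => Ordering::Equal,
    });
}

/// Drop undated events entirely, as we might before looking for the
/// earliest one.
pub fn dated_only(events: Vec<Event>) -> Vec<Event> {
    events.into_iter().filter(|e| e.start.is_some()).collect()
}
```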
You may notice there is an alarming amount of overlap between the EventDataContainer and the CharacterDataContainer we wrote earlier in the post. In fact, if these two types could be unified somehow, then the EventDataWrapper and CharacterDataWrapper types would also be virtually identical.
This is where the Rust generics system can really shine! Let's take a quick detour to reduce this duplication.
Refactoring the Model with Generics ◈
When we talk about generics we are talking about types that are framed in terms of other types.
We've already seen this in practice with Option, where we have a single Option type that can be reused for any number of other types by filling in the type parameter in angle brackets.
Option<String> represents either a String or None. Option<i32> represents either an i32 or None. The generics system allows us to parameterize a type as an input so we don't have to have separate concrete types for each variant. This is what makes it possible to not need an OptionalString type, for example.
To leverage this in our program, we can define some new types to replace the ones we originally wrote.
#[derive(Debug, Deserialize)]
struct DataWrapper<T> {
    pub data: DataContainer<T>,
}

#[derive(Debug, Deserialize)]
struct DataContainer<T> {
    pub results: Vec<T>,
}
In this case, T represents an unknown type to be decided later by the caller.
To trace through, imagine you have a DataWrapper<Character>. The compiler makes the replacement for T all the way through these type definitions.
By using this signature in our code, we are telling the compiler that we have a struct with a data field pointing to a DataContainer<Character>, which has a results field pointing to a Vec<Character>.
Likewise, by using a DataWrapper<Event> we follow the same path, ending with a results field holding a Vec<Event>. We just let the compiler stamp out equivalent types to the ones we wrote manually earlier! To sweeten the deal, we can continue to use these generic structs for new types later on, if we wanted to explore other areas of the Marvel data.
The generics system is really awesome and can be a great help to reduce repetition in your code. Be on the lookout for places where generics can help.
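Here is a small, serde-free sketch of that "stamping out" in action. The structs are trimmed copies of ours without the Deserialize derive, and into_results() is a hypothetical helper added purely to show one generic function serving both payload types.

```rust
#[derive(Debug, PartialEq)]
pub struct Character {
    pub id: i32,
    pub name: String,
}

#[derive(Debug, PartialEq)]
pub struct Event {
    pub id: i32,
    pub title: String,
}

#[derive(Debug)]
pub struct DataWrapper<T> {
    pub data: DataContainer<T>,
}

#[derive(Debug)]
pub struct DataContainer<T> {
    pub results: Vec<T>,
}

/// One function serves every payload type: `T` is chosen by the caller,
/// and the compiler generates a concrete version for each `T` in use.
pub fn into_results<T>(wrapper: DataWrapper<T>) -> Vec<T> {
    wrapper.data.results
}
```

Passing a DataWrapper<Character> yields a Vec<Character>, and passing a DataWrapper<Event> yields a Vec<Event>, with no duplicated wrapper or container types.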
Fetching the Event List ◈
With our new generic data model, we can now implement a new fetcher method to get us a list of events for a given character id.
impl MarvelClient {
    // ... snip ...

    pub fn events_by_character(
        &self,
        character_id: i32
    ) -> Result<Vec<Event>, io::Error> {
        let uri = self.uri_maker.character_events(character_id);
        let work = self.get_json(uri).and_then(|value| {
            let wrapper: DataWrapper<Event> =
                serde_json::from_value(value).map_err(to_io_error)?;
            Ok(wrapper.data.results)
        });
        self.core.borrow_mut().run(work)
    }
}
The work being done in this function is pretty much the same as the last.
We're gathering the inputs needed to make our request, building the request, transforming the data by attaching a closure to the Future returned by get_json(), and finally scheduling the work on our tokio core, and blocking to await the final result.
Writing the character-events binary ◈
Just like last time, we gather up authorization info from the environment, collect an argument from the args list, and present the outcome to the user.
extern crate marvel_explorer;
#[macro_use]
extern crate prettytable;

use marvel_explorer::MarvelClient;
use prettytable::Table;
use prettytable::format;
use std::env;

fn main() {
    let key = env::var("MARVEL_KEY").unwrap();
    let secret = env::var("MARVEL_SECRET_KEY").unwrap();
    let client = MarvelClient::new(key, secret);

    let id: i32 = env::args()
        .nth(1)
        .expect("character_id")
        .parse()
        .expect("parse character_id");

    match client.events_by_character(id) {
        Err(e) => eprintln!("{:?}", e),
        Ok(results) => {
            let mut table = Table::new();
            table.set_format(
                *format::consts::FORMAT_NO_LINESEP_WITH_TITLE);
            table.set_titles(row!["ID", "Title", "Date"]);
            for event in &results {
                // fall back to empty string if an event
                // has no start date. The `&str` annotation lets
                // both arms (`&String` and `&'static str`) coerce
                // to the same type.
                let start: &str = match event.start {
                    Some(ref s) => s,
                    None => "",
                };
                table.add_row(row![event.id, event.title, start]);
            }
            table.printstd();
        }
    };
}
This program is very similar to the last, except for:
- parsing the first argument as an i32 instead of leaving it as String.
- normalizing the value of start so it is always a str (even when the value is None).
Beyond these minor points, everything should look familiar.
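As an aside, the normalization of start can be written more compactly on newer compilers. This is a sketch that assumes Rust 1.40 or later, where Option::as_deref() is available; start_or_empty() is a hypothetical helper name.

```rust
/// Borrow the start date as `&str`, falling back to an empty string.
/// This does the same job as matching on `Some(ref s)` / `None`.
pub fn start_or_empty(start: &Option<String>) -> &str {
    start.as_deref().unwrap_or("")
}
```

Inside the loop this would read `let start = start_or_empty(&event.start);`.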
Putting it all together ◈
So far, we've been able to fetch a list of characters matching a given name prefix and a list of events for a given character id. Technically we have enough information to answer the question, "which event was the first to feature two particular characters?"
Still, it would be nice if we didn't have to run our two programs multiple times to spit out lists of events that we'd have to find matches in, then sort by date.
Seems like this is the sort of thing a computer should be able to do for us neatly. Let's see if we can manage to automate all that.
Calculating the Earliest Event ◈
To calculate the earliest event that two characters participated in, we can follow a process like:
- Look up the character id so we can get a list of events.
- Fetch the event list for each character.
- Pack the event lists into a HashSet so we can intersect the two lists.
- Sort the intersection by date to get the earliest result.
// requires `use std::collections::HashSet;` at the top of src/lib.rs
impl MarvelClient {
    // ... snip ...

    pub fn earliest_event_match(
        &self,
        name1: &str,
        name2: &str,
    ) -> Result<Option<Event>, io::Error> {
        let name_to_event_set = |name: String| {
            let id_lookup = self.uri_maker
                .character_by_name_exact(&name);
            self.get_json(id_lookup)
                .and_then(move |characters_resp| {
                    let wrapper: DataWrapper<Character> =
                        serde_json::from_value(characters_resp)
                            .map_err(to_io_error)?;
                    match wrapper.data.results.first() {
                        Some(character) => Ok(character.id),
                        None => Err(io::Error::new(
                            io::ErrorKind::Other,
                            format!("Character `{}` Not Found", name),
                        )),
                    }
                })
                .and_then(|id| {
                    let uri = self.uri_maker.character_events(id);
                    // Return a future from a future.
                    // The next `.and_then()` receives the resolved
                    // value of this.
                    self.get_json(uri)
                })
                .and_then(|events_resp| {
                    let wrapper: DataWrapper<Event> =
                        serde_json::from_value(events_resp)
                            .map_err(to_io_error)?;
                    let result_set: HashSet<Event> =
                        wrapper.data.results.into_iter().collect();
                    Ok(result_set)
                })
        };

        // build up a graph of futures to compute the final value.
        let work = name_to_event_set(name1.to_owned())
            .join(name_to_event_set(name2.to_owned()))
            .and_then(|(events1, events2)| {
                let maybe_event: Option<Event> = events1
                    .intersection(&events2)
                    // `None` orders before `Some(_)`, so skip undated
                    // events or one of them would win as "earliest".
                    .filter(|x| x.start.is_some())
                    .min_by_key(|x| &x.start)
                    .cloned();
                Ok(maybe_event)
            });
        self.core.borrow_mut().run(work)
    }
}
If I'm completely honest, this design is greatly influenced by the Rust borrow checker. If I were stronger with the language, I might have figured out a way to rewrite the name_to_event_set closure as a method on MarvelClient, which I would prefer, but I suppose there are some complicated ownership aspects to this. Likely this sort of thing will be improved with the async API changes coming to Rust later this year.
This method, as written, is a bit of a mouthful, so let's step through it.
Early in the method, we define a new closure called name_to_event_set. This represents the work we will be doing for each character name.
At the start of this section, we laid out some bullet points for the steps required to compute our final result. This closure effectively handles all but the final point (comparing the events for each character). The closure will:
- Translate a character name into an id.
- Use that id to fetch event list JSON data.
- Convert the data into a HashSet<Event> so we can easily find the intersection.
Once we finish defining this closure (which accounts for the bulk of this method), we can build a complete graph of the work to be done.
The .join() method on Future allows us to transform two separate futures into a single future that will resolve once each of the individual futures resolves. By using this, we can run our closure twice, concurrently, once for each character name. Since the return of a .join() is yet another future, we can chain a call to .and_then() to perform a final task that consumes the final values from each call to name_to_event_set.
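If futures are new to you, the "wait for both, fail if either fails" shape of .join() may be easier to see in a rough analogy built on std threads. To be clear, this is not how our client works: the futures in our method are interleaved on a single tokio core, whereas the sketch below uses two OS threads, and run_both() is a hypothetical name.

```rust
use std::thread;

/// Run two independent jobs at the same time and wait for both, failing
/// if either one fails. This mirrors the *shape* of `Future::join`, but
/// it uses OS threads rather than futures on an event loop.
pub fn run_both<A, B, FA, FB>(job_a: FA, job_b: FB) -> Result<(A, B), String>
where
    A: Send + 'static,
    B: Send + 'static,
    FA: FnOnce() -> Result<A, String> + Send + 'static,
    FB: FnOnce() -> Result<B, String> + Send + 'static,
{
    // Both jobs are started before either result is awaited.
    let handle_a = thread::spawn(job_a);
    let handle_b = thread::spawn(job_b);
    // The first `?` handles a panicked thread, the second the job's own error.
    let a = handle_a.join().map_err(|_| String::from("first job panicked"))??;
    let b = handle_b.join().map_err(|_| String::from("second job panicked"))??;
    Ok((a, b))
}
```

Just as with .join(), the caller only ever sees a pair of successful values or a single error.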
This diagram shows how these tasks might be scheduled on our tokio core.
The boxes inside each blue grouping roughly correspond to the three .and_then() calls chained to the initial call to get_json(). Each of these steps echoes work we did previously so I won't go into detail on them.
The final step performs the intersection on the two event sets, and uses the .min_by_key() method to either find None or Some(event) with the lowest start field.
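That final step can be tried out on its own, without any HTTP or futures involved. The sketch below uses a trimmed, serde-free Event and a hypothetical earliest_shared() function. It skips undated events up front, because None orders before Some(_) and would otherwise be picked as the minimum, and it assumes the date strings sort chronologically.

```rust
use std::collections::HashSet;

#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct Event {
    pub id: i32,
    pub title: String,
    pub start: Option<String>,
}

/// Earliest dated event present in both sets, if any.
pub fn earliest_shared(a: &HashSet<Event>, b: &HashSet<Event>) -> Option<Event> {
    a.intersection(b)
        // Skip undated events rather than reporting one as "earliest".
        .filter(|event| event.start.is_some())
        // Cloning the key keeps the closure free of borrowed return values.
        .min_by_key(|event| event.start.clone())
        .cloned()
}
```

An empty intersection, or one containing only undated events, comes back as None.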
You may have noticed the definition for the Event struct had a long list of traits derived for it, most of which were not required for the other types we defined. The Eq, Hash, and PartialEq traits are what allow Event to be stored in a HashSet.
Writing the first-event binary ◈
Here's src/bin/first-event.rs:
extern crate marvel_explorer;

use marvel_explorer::MarvelClient;
use std::env;

fn main() {
    let key = env::var("MARVEL_KEY").unwrap();
    let secret = env::var("MARVEL_SECRET_KEY").unwrap();
    let client = MarvelClient::new(key, secret);

    let name1: String = env::args().nth(1).unwrap();
    let name2: String = env::args().nth(2).unwrap();

    match client.earliest_event_match(&name1, &name2) {
        Err(e) => eprintln!("{:?}", e),
        Ok(maybe_event) => {
            println!("{:?}", maybe_event);
        }
    };
}
Here's the final output:
$ cargo run --bin first-event Deadpool Nightcrawler
Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/first-event Deadpool Nightcrawler`
Some(Event { id: 318, title: "Dark Reign", start: Some("2008-12-01 00:00:00"), description: "Norman Osborn came out the hero of Secret Invasion, and now the former Green Goblin has been handed control of the Marvel Universe. With his Cabal and the Dark Avengers at his side, can anything stop this long time villain from reshaping the world in his own image? And what has become of the heroes?" })
Not as pretty as the other table-based outputs, but it does the job.
Because we're using the debug format token, we can tell whether the two characters have a match or none at all: the printed output will be one of Some(... blob of data ...), None, or some kind of error.
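If the debug output feels too raw, the three outcomes could be mapped to friendlier messages. Here is a minimal sketch; describe() is a hypothetical helper, the message wording is made up, and the Event is trimmed down to the two fields it needs.

```rust
use std::io;

#[derive(Debug, Clone)]
pub struct Event {
    pub title: String,
    pub start: Option<String>,
}

/// Turn the three possible outcomes into a one-line message.
pub fn describe(outcome: &Result<Option<Event>, io::Error>) -> String {
    match outcome {
        Ok(Some(event)) => match &event.start {
            Some(date) => format!("{} (started {})", event.title, date),
            None => format!("{} (start date unknown)", event.title),
        },
        Ok(None) => String::from("No shared events found."),
        Err(e) => format!("Request failed: {}", e),
    }
}
```

The binary would then print `describe(&client.earliest_event_match(&name1, &name2))` instead of matching by hand.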
Wrapping Up ◈
This has been a guided tour of working with an assortment of libraries to build a small client for the Marvel Comics API.
While the code may not be production-ready, it shows how quickly a somewhat trivial task can send a developer into dealing with topics that wouldn't require attention when working in another language.
Notably, our interior mutability handling is not thread-safe, so concurrent access to UriMaker in our implementation for MarvelClient#earliest_event_match() could result in a panic! at runtime. I suggest you check out Ricardo Martins' Interior Mutability in Rust, part 2: thread safety for an in-depth guide on thread-safe alternatives to RefCell.
At the very least, I hope this series helps to "break the ice," highlighting topics for further reading.