Wrapping a JSON API to access your personal data

Tags:

In my last tale I outlined how I used Dist::Zilla to do much of the heavy lifting for packaging and releasing a Perl module to CPAN. In this tale I want to outline how quick and easy it was to write that module; wrapping a proprietary undocumented API so that I could easily use my personal data in ways the company did/does not provide.

The module in question is WebService::SmartRow. SmartRow is a physical device and app that provides power and other performance data from a indoor rowing machine (specifically the WaterRower) machines.

It is a bluetooth device, that sends data to a web service via a mobile application. This data is then accessed via the app, or the website. The data includes things like the power in watts, stokes per minute, heart rate, etc.

The specific data I wanted to see, was/is how my average power changes over time. Specifically, across a series of 5000 meter rows, am I generating more power over time. Also I wanted to see my personal best times, over time. I.e. how much better was this 5000m over my previous best.

The API

SmartRow do not provide a developer API (at the time of writing, and as far as I was able to see). What they do have is a website that is mainly driven by JavaScript and a JSON API. As logged in user my browser makes calls to the API and then creates pretty HTML from the JSON returned.

The API is really quite tidy and easy to understand. Which is nice! I have previous experience wrapping JSON APIs and it can be really ugly; thankfully the SmartRow team have made a clean, easy to understand API.

The API relies on you being logged in, but quick exploration showed that basic auth is possible on the API endpoint. Meaning I did not need to worry about OAuth or keys and so forth.

The web requests

Knowing that I could get away with basic auth, that meant that the code I needed could work simply by including the username and password in the URL I make a HTTPS request to.

I turned to HTTP::Tiny and was quickly able to craft a get request that looked a little like this:

    my $response = HTTP::Tiny->new->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

Then (after some basic tests to make sure I got a 200), I could parse the $response->{content} from JSON to a Perl data structure using Cpanel::JSON::XS.

This gave me a data structure looking like this:

  [
    {
    "accessory_mac"        => E,
    "account"              => E,
    "ave_bpm"              => E,
    "ave_power"            => E,
    "calc_ave_split"       => E,
    "calc_avg_stroke_rate" => E,
    "calc_avg_stroke_work" => E,
    "calories"             => E,
    "confirmed"            => E,
    "created"              => E,
    "curve"                => E,
    "device_mac"           => E,
    "distance"             => E,
    "elapsed_seconds"      => E,
    "extra_millies"        => E,
    "id"                   => E,
    "mod"                  => E,
    "option"               => E,
    "option_distance"      => E,
    "option_time"          => E,
    "p_ave"                => E,
    "protocol_version"     => E,
    "public_id"            => E,
    "race"                 => E,
    "strava_id"            => E,
    "stroke_count"         => E,
    "time"                 => E,
    "user_age"             => E,
    "user_max_hr"          => E,
    "user_weight"          => E,
    "watt_kg"              => E,
    "watt_per_beat"        => E,
    },
    ...
  ]

(The above code is actually taken from the Test2::V0 .t file I wrote to confirm the structure.)

So you can see it's pretty easy to understand, the keys are all mainly understandable. In my case I wanted the p_ave and distance, so I could filter on 5000 meters and build an array of all the average power values.

Module creation

At first this could have been a simple script, but I wanted to make this something portable and usable by anyone wanting to work with their personal data.

So after I proved the approach would work, I started a module (using dzil new WebService::SmartRow). This is currently a single file with little refinement.

I used Moo for some simple OOPness and structure. This allowed me to specify the attributes I want:

has username => ( is => 'ro', required => 1 );
has password => ( is => 'ro', required => 1 );

There are pretty self-explanatory, to use the API you need those, so add them as required attributes.

Next I added an http attribute:

has http => (
    is      => 'ro',
    default => sub {
        return HTTP::Tiny->new();
    },
);

The default here creates a HTTP::Tiny object which I can later use in methods via $self, which meant my earlier get request changes to look like this:

    my $response = $self->http->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

You can set your own http attribute when creating via WebService::SmartRow->new() so if you need to do something like change the user agent, or have a preferred module, you can inject it easily (assuming the methods match HTTP::Tiny).

Testing

Currently the module is pretty simple, three attributes and 4 public methods. The module has little smarts so the t directory is pretty spartan as the object is pretty simple.

I am using the xt directory to hold tests that talk to the API and as such require an internet connection and credentials.

Not wanting to include my personal credentials in the repo, I have a private sub in the class that gets the username and password from environment variables. Which is good as it means I can commit my tests, and if someone using this module does not need to commit their credentials in code either.

Perl makes the environment variables easy to work with, so the small sub that handles it looks like this:

sub _credentials_via_env {
    my $self = shift;

    my $user = $self->username || $ENV{SMARTROW_USERNAME};

    my $pass = $self->password || $ENV{SMARTROW_PASSWORD};

    return ( $user, $pass );
}

So if you have instantiated the module with username or password it will use those. If they are not present it will use SMARTROW_USERNAME or SMARTROW_PASSWORD.

Then (and I know I can make this a bit smarter) my get_workouts() method, has a call to my ( $user, $pass ) = $self->_credentials_via_env; prior to calling the URL.

This means I can run my tests like this:

SMARTROW_USERNAME=yyyyyyy SMARTROW_PASSWORD=xxxxxxx prove -lvr xt/

And it will go and connect to the API and execute tests for me by getting real data from the API.

Earlier I mentioned I am using Test2::V0 for tests, so in xt I have a growing collection of files that primarily confirm the structure of the data returned from the API.

Mainly they use E() to prove hash element exists, some could/should/do test the content. For example in one test I just wanted to confirm that the data and distribution fields are arrays before later testing the content of those arrays. So I have a test like this:

is $leaderboard, {
    distribution => {
        ageMode => E,
        data    => array {
            etc(),
        },
        max            => E,
        mean           => E,
        min            => E,
        range          => E,
        userPercentile => E,
    },
    id      => E,
    mod     => E,
    records => array {
        etc(),
    },
    },
    'Leaderboard structure is as expected';

It is mainly E() checking that is exists, then array to confirm it contains an array. etc() tell the testing code to assume some rows, but not to check the content of them. They just have to exist. As etc() is the only thing in there, as long at data and records contain arrays with some rows, the tests pass.

Having tests like this is really helpful when wrapping someone elses API previous pain has taught me. If the structure of the return data changes, I can easily confirm what has changed.

When you are wrapping an API this way it is an inevitability that the API will change so having the tests is one of the only ways to ensure you can maintain it over time. My Webservice::Judobase has had this happen a multitude of times in the past 5 or so years.

Summary

As you can see from the brevity of this tale, wrapping an API is pretty easy to to. I know from previous experience that doing it helps others build things with Perl. So it's the sort of project that helps both the service and our language.

Perl is a famous "glue code" language and this is a great example of why. Two CPAN modules do almost all the heavy lifting and our fantastic test tooling means I was easily able to write tests that will make it easy to maintain.

Dist::Zilla (as per previous post) I use to automate most of the publishing work. I will write something up about that another day. It has lots of plugins that make sure your module is up to scratch.

The JSON API and JavaScript front-end trend has meant that lots of websites have a JSON endpoint that you could wrap. This means you could as I have create a tool that uses your data in a way that the website may never build for you.

It also gives me confidence that I could pull my personal data out of the service and store it in my backups (I am doing that next), so that if the company ever goes bust, I have my data there and have it backed up securely and independently. I can do with it what I please as it is just data on my disk(s), either as plain files or maybe pushed into another database.

If you use a website and there is no module for it on CPAN, maybe it has an API you can wrap too?