Wrapping a JSON API to access your personal data

In my last tale I outlined how I used Dist::Zilla to do much of the heavy lifting for packaging and releasing a Perl module to CPAN. In this tale I want to outline how quick and easy it was to write that module; wrapping a proprietary undocumented API so that I could easily use my personal data in ways the company did/does not provide.

The module in question is WebService::SmartRow. SmartRow is a physical device and app that provides power and other performance data from an indoor rowing machine (specifically the WaterRower).

It is a Bluetooth device that sends data to a web service via a mobile application. This data is then accessed via the app or the website. The data includes things like the power in watts, strokes per minute, heart rate, etc.

The specific data I wanted to see was how my average power changes over time. Specifically, across a series of 5000 meter rows, am I generating more power over time? I also wanted to see my personal best times over time, i.e. how much better was this 5000m than my previous best?

The API

SmartRow do not provide a developer API (at the time of writing, and as far as I was able to see). What they do have is a website that is mainly driven by JavaScript and a JSON API. As a logged-in user, my browser makes calls to the API and then creates pretty HTML from the JSON returned.

The API is really quite tidy and easy to understand. Which is nice! I have previous experience wrapping JSON APIs and it can be really ugly; thankfully the SmartRow team have made a clean, easy to understand API.

The API relies on you being logged in, but quick exploration showed that basic auth is possible on the API endpoint. Meaning I did not need to worry about OAuth or keys and so forth.

The web requests

Knowing that I could get away with basic auth meant that the code I needed could work simply by including the username and password in the URL I make an HTTPS request to.

I turned to HTTP::Tiny and was quickly able to craft a GET request that looked a little like this:

    my $response = HTTP::Tiny->new->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );
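
One caveat with putting credentials in the URL: HTTP::Tiny expects reserved characters in the user:password part to be percent-escaped, and an email-style username contains an @. A minimal sketch of building the URL safely follows; the helper names and the api.example.test host are made up for illustration and are not part of the module.

```perl
use strict;
use warnings;

# Percent-escape everything except the RFC 3986 "unreserved" characters.
sub escape_userinfo {
    my ($text) = @_;
    utf8::encode($text);    # work on bytes so multi-byte characters escape correctly
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Hypothetical helper: $api_host is a bare host name, with no scheme.
sub public_game_url {
    my ( $user, $pass, $api_host ) = @_;
    return 'https://'
        . escape_userinfo($user) . ':'
        . escape_userinfo($pass) . '@'
        . $api_host
        . '/public-game';
}

# Made-up credentials and host, purely for illustration.
my $url = public_game_url( 'me@example.com', 'p@ss:w/rd', 'api.example.test' );
print "$url\n";
```

The resulting string can be handed to request() exactly as in the snippet above.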

Then (after some basic tests to make sure I got a 200), I could parse the $response->{content} from JSON to a Perl data structure using Cpanel::JSON::XS.
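
As a rough sketch of that check-then-parse step: the function name below is hypothetical, and it uses the core JSON::PP module so it runs without extra installs. Cpanel::JSON::XS exports a compatible decode_json, so swapping the import should be all that is needed. The response is a canned hash in the shape HTTP::Tiny returns, not real API output.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # assumption: stand-in for Cpanel::JSON::XS

# Hypothetical helper: turn an HTTP::Tiny response hash into Perl data.
sub workouts_from_response {
    my ($response) = @_;

    # HTTP::Tiny sets "success" to true for any 2xx status.
    die "Request failed: $response->{status} $response->{reason}\n"
        unless $response->{success};

    my $data = decode_json( $response->{content} );
    die "Expected a JSON array of workouts\n" unless ref $data eq 'ARRAY';

    return $data;
}

# Canned response with invented values, so no network is needed.
my $canned = {
    success => 1,
    status  => 200,
    reason  => 'OK',
    content => '[{"distance":5000,"p_ave":152}]',
};

my $workouts = workouts_from_response($canned);
print "First workout distance: $workouts->[0]{distance}\n";
```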

This gave me a data structure looking like this:

  [
    {
    "accessory_mac"        => E,
    "account"              => E,
    "ave_bpm"              => E,
    "ave_power"            => E,
    "calc_ave_split"       => E,
    "calc_avg_stroke_rate" => E,
    "calc_avg_stroke_work" => E,
    "calories"             => E,
    "confirmed"            => E,
    "created"              => E,
    "curve"                => E,
    "device_mac"           => E,
    "distance"             => E,
    "elapsed_seconds"      => E,
    "extra_millies"        => E,
    "id"                   => E,
    "mod"                  => E,
    "option"               => E,
    "option_distance"      => E,
    "option_time"          => E,
    "p_ave"                => E,
    "protocol_version"     => E,
    "public_id"            => E,
    "race"                 => E,
    "strava_id"            => E,
    "stroke_count"         => E,
    "time"                 => E,
    "user_age"             => E,
    "user_max_hr"          => E,
    "user_weight"          => E,
    "watt_kg"              => E,
    "watt_per_beat"        => E,
    },
    ...
  ]

(The above code is actually taken from the Test2::V0 .t file I wrote to confirm the structure.)

So you can see it's pretty easy to understand; most of the keys are self-explanatory. In my case I wanted the p_ave and distance, so I could filter on 5000 meters and build an array of all the average power values.
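
That filter is only a couple of lines of Perl. Here is a minimal sketch, assuming the decoded workouts are plain hashes with numeric distance and p_ave values. The sample numbers are invented and the function name is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Invented sample rows; only the two keys used here are included.
my @workouts = (
    { distance => 5000, p_ave => 148 },
    { distance => 2000, p_ave => 181 },
    { distance => 5000, p_ave => 155 },
    { distance => 5000, p_ave => 151 },
);

# Hypothetical helper: the p_ave of every row at the given distance,
# in the same order the rows were supplied.
sub average_power_for {
    my ( $distance, @rows ) = @_;
    return map { $_->{p_ave} } grep { $_->{distance} == $distance } @rows;
}

my @power = average_power_for( 5000, @workouts );

if (@power) {
    printf "5000m rows: %d, mean of averages: %.1f W, best: %d W\n",
        scalar @power, sum(@power) / @power, max(@power);
}
```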

Module creation

At first this could have been a simple script, but I wanted to make this something portable and usable by anyone wanting to work with their personal data.

So after I proved the approach would work, I started a module (using dzil new WebService::SmartRow). This is currently a single file with little refinement.

I used Moo for some simple OOPness and structure. This allowed me to specify the attributes I want:

has username => ( is => 'ro', required => 1 );
has password => ( is => 'ro', required => 1 );

These are pretty self-explanatory: to use the API you need those, so I added them as required attributes.

Next I added an http attribute:

has http => (
    is      => 'ro',
    default => sub {
        return HTTP::Tiny->new();
    },
);

The default here creates an HTTP::Tiny object which I can later use in methods via $self, which meant my earlier GET request changes to look like this:

    my $response = $self->http->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

You can set your own http attribute when creating via WebService::SmartRow->new() so if you need to do something like change the user agent, or have a preferred module, you can inject it easily (assuming the methods match HTTP::Tiny).
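
The same injection point is handy for testing without a network. The sketch below shows the pattern with two made-up classes: Local::Client is a cut-down stand-in written in plain Perl, not the real module, and Local::FakeHTTP only imitates the request() method of HTTP::Tiny.

```perl
use strict;
use warnings;

# Stand-in for HTTP::Tiny: same request() interface, canned reply, no network.
package Local::FakeHTTP {
    sub new { return bless { calls => [] }, shift }

    sub request {
        my ( $self, $method, $url ) = @_;
        push @{ $self->{calls} }, "$method $url";
        return { success => 1, status => 200, reason => 'OK', content => '[]' };
    }
}

# Hypothetical cut-down client, mirroring an "http" attribute with a default.
package Local::Client {
    sub new {
        my ( $class, %args ) = @_;

        # Only load and build the real HTTP::Tiny when nothing was injected.
        $args{http} //= do { require HTTP::Tiny; HTTP::Tiny->new };
        return bless {%args}, $class;
    }

    sub http { return $_[0]{http} }

    sub get_workouts {
        my ($self) = @_;
        return $self->http->request( 'GET', 'https://api.example.test/public-game' );
    }
}

package main;

my $fake   = Local::FakeHTTP->new;
my $client = Local::Client->new( http => $fake );
my $res    = $client->get_workouts;

print "status $res->{status}; calls: @{ $fake->{calls} }\n";
```

With the real module the equivalent would be passing something like http => HTTP::Tiny->new( agent => 'my-script/0.1', timeout => 10 ) to WebService::SmartRow->new() alongside the username and password.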

Testing

Currently the module is pretty simple: three attributes and four public methods. The module has little smarts, so the t directory is rather spartan.

I am using the xt directory to hold tests that talk to the API and as such require an internet connection and credentials.

Not wanting to include my personal credentials in the repo, I have a private sub in the class that gets the username and password from environment variables. This is good as it means I can commit my tests, and anyone using this module does not need to commit their credentials in code either.

Perl makes the environment variables easy to work with, so the small sub that handles it looks like this:

sub _credentials_via_env {
    my $self = shift;

    my $user = $self->username || $ENV{SMARTROW_USERNAME};

    my $pass = $self->password || $ENV{SMARTROW_PASSWORD};

    return ( $user, $pass );
}

So if you have instantiated the module with username or password it will use those. If they are not present it will use SMARTROW_USERNAME or SMARTROW_PASSWORD.
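
Here is the fallback on its own as a small runnable sketch. It is written as a plain function (a hypothetical name, not the module's method) so it needs no object. One thing to keep in mind: Moo attributes declared with required => 1 must still be passed to new(), so in that setup the environment is only consulted when an empty or undefined value is passed.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the fallback logic.
sub credentials {
    my (%args) = @_;

    # Explicit arguments win; the environment is the fallback.
    my $user = $args{username} || $ENV{SMARTROW_USERNAME};
    my $pass = $args{password} || $ENV{SMARTROW_PASSWORD};

    # Failing early is an addition for this sketch, not the module's behaviour.
    die "No SmartRow credentials supplied or found in the environment\n"
        unless defined $user && length $user && defined $pass && length $pass;

    return ( $user, $pass );
}

# Made-up values, set only for the duration of this script.
local $ENV{SMARTROW_USERNAME} = 'env-user';
local $ENV{SMARTROW_PASSWORD} = 'env-pass';

my @from_env  = credentials();
my @from_args = credentials( username => 'arg-user', password => 'arg-pass' );

print "from environment: @from_env\n";
print "from arguments:   @from_args\n";
```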

Then (and I know I can make this a bit smarter) my get_workouts() method has a call to my ( $user, $pass ) = $self->_credentials_via_env; prior to calling the URL.

This means I can run my tests like this:

SMARTROW_USERNAME=yyyyyyy SMARTROW_PASSWORD=xxxxxxx prove -lvr xt/

And it will go and connect to the API and execute tests for me by getting real data from the API.

Earlier I mentioned I am using Test2::V0 for tests, so in xt I have a growing collection of files that primarily confirm the structure of the data returned from the API.

Mainly they use E() to prove that a hash element exists; some could/should/do test the content. For example, in one test I just wanted to confirm that the data and records fields are arrays before later testing the content of those arrays. So I have a test like this:

is $leaderboard, {
    distribution => {
        ageMode => E,
        data    => array {
            etc(),
        },
        max            => E,
        mean           => E,
        min            => E,
        range          => E,
        userPercentile => E,
    },
    id      => E,
    mod     => E,
    records => array {
        etc(),
    },
},
    'Leaderboard structure is as expected';

It is mainly E() checking that a key exists, then array to confirm that the value is an array. etc() tells the testing code that there may be rows, but not to check their content. As etc() is the only thing in there, as long as data and records contain arrays, the tests pass.
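
A cut-down, runnable version of the same idea, using invented data rather than a real API response. It also illustrates two details as I understand them from the Test2 documentation: E() passes for a key whose value is undef, and array { etc() } on its own accepts an empty array as well as a populated one.

```perl
use strict;
use warnings;
use Test2::V0;

# Invented data in roughly the shape of the leaderboard response.
my $leaderboard = {
    id      => 7,
    mod     => undef,    # E() only checks that the key exists
    records => [ { rank => 1 }, { rank => 2 } ],
};

my $ok = is $leaderboard, {
    id      => E,
    mod     => E,
    records => array { etc() },
    },
    'Leaderboard structure is as expected';

# An empty list is still an array, so this check is expected to pass too.
my $empty    = { records => [] };
my $ok_empty = is( $empty, { records => array { etc() } }, 'Empty records is still an array' );

done_testing;
```

If a test needs at least one row, that has to be checked separately, for example by asserting on the size of the array.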

Previous pain has taught me that having tests like this is really helpful when wrapping someone else's API. If the structure of the returned data changes, I can easily confirm what has changed.

When you are wrapping an API this way it is inevitable that the API will change, so having the tests is one of the only ways to ensure you can maintain it over time. My WebService::Judobase has had this happen a multitude of times in the past 5 or so years.

Summary

As you can see from the brevity of this tale, wrapping an API is pretty easy to do. I know from previous experience that doing it helps others build things with Perl. So it's the sort of project that helps both the service and our language.

Perl is a famous "glue code" language and this is a great example of why. Two CPAN modules do almost all the heavy lifting and our fantastic test tooling means I was easily able to write tests that will make it easy to maintain.

As per the previous post, I use Dist::Zilla to automate most of the publishing work. I will write something up about that another day. It has lots of plugins that make sure your module is up to scratch.

The JSON API and JavaScript front-end trend has meant that lots of websites have a JSON endpoint that you could wrap. This means you could, as I have, create a tool that uses your data in a way that the website may never build for you.

It also gives me confidence that I could pull my personal data out of the service and store it in my backups (I am doing that next), so that if the company ever goes bust, I have my data there and have it backed up securely and independently. I can do with it what I please as it is just data on my disk(s), either as plain files or maybe pushed into another database.

If you use a website and there is no module for it on CPAN, maybe it has an API you can wrap too?

Using Dist::Zilla to create a new CPAN module

Recently, I posted about some tools that are perhaps not as well known as they should be. I asked on Twitter (from @perlkiwi) for suggestions and one was distzilla.

I didn't include it at the time in part because I knew I wanted to mint a new CPAN module and planned on using distzilla to do it, so this tale is going to cover how it works and how it helped me put a module together.

What is Dist::Zilla

Dist::Zilla (aka distzilla aka dzil) is a command line tool to help you create and perhaps more importantly maintain packages you intend to share via CPAN.

It really does help ensure you create a good package and makes it easy to maintain that module over time. Perl people know the importance of maintaining legacy code, so dzil is really valuable.

Getting started

If you have never tried distzilla, then do pop over to the main website dzil.org which has a fabulous "Choose your own adventure" way of introducing you to the tool. I really like the site and LOVE a different approach to doing documentation.

In short... you start by typing something like dzil new WebService::SmartRow, which will create a new directory for you with a dist.ini, a base lib/WebService/SmartRow.pm file and a few other bits and pieces.

From here, I started a new repo on GitHub and added the "remote" to the created directory (git init then git remote add origin git@github.com:lancew/WebService-SmartRow.git), after which I could happily git add . then commit and push the changes.

Writing the code

The module I was writing is a small wrapper around a JSON API endpoint from https://smartrow.fit/. I just wanted to be able to access the workout data so I could munge the data in a few different ways and create some charts for myself.

The module itself is pretty simple. I used Moo out of habit and HTTP::Tiny for the network part and Cpanel::JSON::XS for the JSON handling.

These are specified in the dist.ini file as prerequisites:

[Prereqs]
perl = v5.06.0 ;
Cpanel::JSON::XS = 4.27 ;
HTTP::Tiny       = 0.080 ;
Moo              = 2.005004 ;
namespace::clean = 0.27 ;

[Prereqs / TestRequires]
Test2::V0 = 0.000145 ;

You can see I have two sets of dependencies: one for the module itself and one for the tests that run at installation. I use Test2::V0 so I pop it in there.

Uploading to CPAN

After having written the code and tests I wanted to upload to CPAN. This I can do with distzilla via the helpful dzil release --trial command which, as I have already used dzil to upload modules before, worked first time. :-)

All you need to know is your PAUSE credentials.

Dist::Zilla plugins for the win!

Having uploaded the package as trial, it was a good time to use distzilla to fine tune my module. This is super easy via the large array of plugins (and plugin bundles). The biggest problem is choosing them and dealing with the odd conflict.

A good one to start with is the @TestingMania bundle, which includes a lot of really helpful testing tools: PerlTidy, PerlCritic, PodCoverage and more intricate ones to test your META info, which is important if the package is going to CPAN.

Adding the bundle is easy: just add [@TestingMania] into your dist.ini and run dzil test. If the dependencies you need are not there, distzilla will prompt you with the command you need, which is dzil authordeps --missing | cpanm.

Git and GitHub integration

I am using GitHub to host the code for this module, so I added a few plugins:

GitHub::Meta

This plugin includes things like the repository url and issue tracker in the META for the package.

By including the plugin (and META info), metacpan will know that the repo is on GitHub and present it in the UI. It also tells metacpan that I am using the issue tracker on GitHub and this prevents it from defaulting to RT.

@Git

The @Git bundle is helpful for ensuring a few Git related things are done correctly, e.g. that the repo is "clean" before you release. It also does a git tag on release and pushes it to GitHub for me automatically.

This bundle did push me to update my .gitignore file to ignore the WebService-SmartRow-* files and directory that Dist::Zilla builds.

Git::NextVersion

The last of the Git plugins is my favourite. This one automatically bumps the version when I do a release, which meant I could remove the version = 0.001 from the dist.ini as the plugin increments based on the tags in the repo. It's cool, I like it.

Changes

I am terrible at updating the Changes file in repositories, so I added these two plugins:

Test::ChangesHasContent

This plugin creates a .t file that confirms you have content in the Changes file matching the next release.

CheckChangesHasContent

This plugin is similar, except that it actually prevents you from releasing if you have not got content for the new release in the Changes file.

I use both, as I like the early warning via dzil test that the first plugin gives me, and the second is there mainly in case I ever remove the first, I suppose. Maybe I could/should delete it?

The Bundles gotcha

The plugin bundles are awesome, with one exception for new dzil users. They can cause conflicts where the same plugin is referenced twice. This is not too scary once you have seen it a couple of times; but can be a stress when it hits you the first time. Just read the message and try and understand which two bundles are conflicting. You can (at least I could in @TestingMania) disable a plugin and that can solve the issue for you (I just added disable = Test::Version ; for example immediately after the [@TestingMania] in dist.ini; YMMV).

dzil as a tool

dzil is really helpful. dzil test and dzil test --release ensure my module is in a pretty good state.

dzil build (and dzil clean) are handy for building the module so I can look at what will end up on CPAN.

dzil install installs the module like a normal cpan module; so I could use the module in a script elsewhere on my machine as if I had downloaded it from CPAN. Given this module is mainly about accessing a JSON API; this is a helpful feature to save uploading to cpan and then installing from cpan more than I need to.

Summary

Dist::Zilla can be a little overwhelming, but I find it pretty intuitive if you take small steps. I added each plugin one at a time as that helps keep the confusion to a minimum. Bundles are great, but can have conflicts. It is by its own definition "maximum overkill", but it's easy overkill so give it a try.

Let me know how you get on.

  • Lance (perl.kiwi).