Wrapping a JSON API to access your personal data

Tags: perl projects cpan

In my last tale I outlined how I used Dist::Zilla to do much of the heavy lifting for packaging and releasing a Perl module to CPAN. In this tale I want to outline how quick and easy it was to write that module; wrapping a proprietary undocumented API so that I could easily use my personal data in ways the company did/does not provide.

The module in question is WebService::SmartRow. SmartRow is a physical device and app that provides power and other performance data from an indoor rowing machine (specifically the WaterRower).

It is a Bluetooth device that sends data to a web service via a mobile application. This data is then accessed via the app, or the website. The data includes things like the power in watts, strokes per minute, heart rate, etc.

The specific data I wanted to see was/is how my average power changes over time. Specifically, across a series of 5000 meter rows, am I generating more power over time? I also wanted to see my personal best times, over time, i.e. how much better was this 5000m than my previous best?

The API

SmartRow do not provide a developer API (at the time of writing, and as far as I was able to see). What they do have is a website that is mainly driven by JavaScript and a JSON API. As a logged-in user, my browser makes calls to the API and then creates pretty HTML from the JSON returned.

The API is really quite tidy and easy to understand. Which is nice! I have previous experience wrapping JSON APIs and it can be really ugly; thankfully the SmartRow team have made a clean, easy to understand API.

The API relies on you being logged in, but quick exploration showed that basic auth is possible on the API endpoint. Meaning I did not need to worry about OAuth or keys and so forth.

The web requests

Knowing that I could get away with basic auth meant that the code I needed could work simply by including the username and password in the URL I make an HTTPS request to.

I turned to HTTP::Tiny and was quickly able to craft a get request that looked a little like this:

    my $response = HTTP::Tiny->new->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

Then (after some basic tests to make sure I got a 200), I could parse the $response->{content} from JSON to a Perl data structure using Cpanel::JSON::XS.
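The check-then-decode step can be sketched roughly as below. This is a minimal sketch, not the module's actual code: decode_response is a hypothetical helper name, the canned response is made up, and JSON::PP (core Perl) is swapped in for Cpanel::JSON::XS so the example runs without extra installs. Both export a decode_json function, so the swap should be a one-line change.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: turn an HTTP::Tiny-style response hash into Perl data.
sub decode_response {
    my ($response) = @_;

    # HTTP::Tiny sets 'success' to true for any 2xx status.
    die "Request failed: $response->{status} $response->{reason}\n"
        unless $response->{success};

    return decode_json( $response->{content} );
}

# A canned response, shaped like the hash HTTP::Tiny returns (made-up values).
my $fake_response = {
    success => 1,
    status  => 200,
    reason  => 'OK',
    content => '[{"distance":5000,"p_ave":150}]',
};

my $workouts = decode_response($fake_response);
print "First workout distance: $workouts->[0]{distance}\n";
```

Dying on a non-2xx response keeps the failure close to the request, rather than letting a JSON parse error on an HTML error page hide the real cause.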

This gave me a data structure looking like this:

  [
    {
    "accessory_mac"        => E,
    "account"              => E,
    "ave_bpm"              => E,
    "ave_power"            => E,
    "calc_ave_split"       => E,
    "calc_avg_stroke_rate" => E,
    "calc_avg_stroke_work" => E,
    "calories"             => E,
    "confirmed"            => E,
    "created"              => E,
    "curve"                => E,
    "device_mac"           => E,
    "distance"             => E,
    "elapsed_seconds"      => E,
    "extra_millies"        => E,
    "id"                   => E,
    "mod"                  => E,
    "option"               => E,
    "option_distance"      => E,
    "option_time"          => E,
    "p_ave"                => E,
    "protocol_version"     => E,
    "public_id"            => E,
    "race"                 => E,
    "strava_id"            => E,
    "stroke_count"         => E,
    "time"                 => E,
    "user_age"             => E,
    "user_max_hr"          => E,
    "user_weight"          => E,
    "watt_kg"              => E,
    "watt_per_beat"        => E,
    },
    ...
  ]

(The above code is actually taken from the Test2::V0 .t file I wrote to confirm the structure.)

So you can see it's pretty easy to understand; most of the keys are self-explanatory. In my case I wanted the p_ave and distance, so I could filter on 5000 meters and build an array of all the average power values.
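As a rough illustration of that filtering, here is a minimal sketch using made-up sample data. Only the distance and p_ave keys come from the structure above; the values are invented, and I am assuming distance is reported in meters.

```perl
use strict;
use warnings;

# Made-up sample data using only the two keys discussed above.
my $workouts = [
    { distance => 5000, p_ave => 142 },
    { distance => 2000, p_ave => 180 },
    { distance => 5000, p_ave => 151 },
];

# Keep only the 5000 m rows, then pull out the average power of each.
my @average_power =
    map  { $_->{p_ave} }
    grep { $_->{distance} == 5000 } @$workouts;

print join( ', ', @average_power ), "\n";    # prints "142, 151"
```

The values come out in whatever order the API returned the workouts, so sorting by a date field first may be needed before plotting a trend.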

Module creation

At first this could have been a simple script, but I wanted to make this something portable and usable by anyone wanting to work with their personal data.

So after I proved the approach would work, I started a module (using dzil new WebService::SmartRow). This is currently a single file with little refinement.

I used Moo for some simple OOPness and structure. This allowed me to specify the attributes I want:

has username => ( is => 'ro', required => 1 );
has password => ( is => 'ro', required => 1 );

These are pretty self-explanatory: to use the API you need those, so I added them as required attributes.

Next I added an http attribute:

has http => (
    is      => 'ro',
    default => sub {
        return HTTP::Tiny->new();
    },
);

The default here creates an HTTP::Tiny object which I can later use in methods via $self, which meant my earlier get request changes to look like this:

    my $response = $self->http->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

You can set your own http attribute when creating via WebService::SmartRow->new() so if you need to do something like change the user agent, or have a preferred module, you can inject it easily (assuming the methods match HTTP::Tiny).
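To illustrate the injection idea, here is a minimal sketch. Sketch::Client and Fake::HTTP are hypothetical stand-ins written for this example (not the real WebService::SmartRow code), and the URL is a placeholder that is never actually requested. The fake only has to provide the request() method that the client calls.

```perl
use strict;
use warnings;

# A fake HTTP client: it only needs the request() method the client calls.
package Fake::HTTP {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

    sub request {
        my ( $self, $method, $url ) = @_;
        push @{ $self->{seen} }, "$method $url";    # record the call
        return { success => 1, status => 200, reason => 'OK', content => '[]' };
    }
}

# Hypothetical stand-in for the real class, reduced to the http attribute.
package Sketch::Client {
    use Moo;
    use HTTP::Tiny;

    has http => ( is => 'ro', default => sub { HTTP::Tiny->new } );

    sub get_workouts {
        my $self = shift;
        return $self->http->request( 'GET',
            'https://api.example.invalid/public-game' );
    }
}

package main;

my $fake     = Fake::HTTP->new;
my $client   = Sketch::Client->new( http => $fake );
my $response = $client->get_workouts;

print "Status: $response->{status}\n";
print "Requests seen: @{ $fake->{seen} }\n";
```

The same constructor argument would accept a configured HTTP::Tiny, for example HTTP::Tiny->new( agent => 'my-script/0.1' ), if all you want is a different user agent.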

Testing

Currently the module is pretty simple: three attributes and four public methods. The module has little smarts, so the t directory is pretty spartan.

I am using the xt directory to hold tests that talk to the API and as such require an internet connection and credentials.

Not wanting to include my personal credentials in the repo, I have a private sub in the class that gets the username and password from environment variables. This is good, as it means I can commit my tests, and anyone using this module does not need to commit their credentials in code either.

Perl makes the environment variables easy to work with, so the small sub that handles it looks like this:

sub _credentials_via_env {
    my $self = shift;

    my $user = $self->username || $ENV{SMARTROW_USERNAME};

    my $pass = $self->password || $ENV{SMARTROW_PASSWORD};

    return ( $user, $pass );
}

So if you have instantiated the module with username or password it will use those. If they are not present it will use SMARTROW_USERNAME or SMARTROW_PASSWORD.

Then (and I know I can make this a bit smarter) my get_workouts() method has a call to my ( $user, $pass ) = $self->_credentials_via_env; prior to calling the URL.

This means I can run my tests like this:

SMARTROW_USERNAME=yyyyyyy SMARTROW_PASSWORD=xxxxxxx prove -lvr xt/

And it will go and connect to the API and execute tests for me by getting real data from the API.

Earlier I mentioned I am using Test2::V0 for tests, so in xt I have a growing collection of files that primarily confirm the structure of the data returned from the API.

Mainly they use E() to prove a hash element exists; some could/should/do test the content. For example, in one test I just wanted to confirm that the data and records fields are arrays before later testing the content of those arrays. So I have a test like this:

is $leaderboard, {
    distribution => {
        ageMode => E,
        data    => array {
            etc(),
        },
        max            => E,
        mean           => E,
        min            => E,
        range          => E,
        userPercentile => E,
    },
    id      => E,
    mod     => E,
    records => array {
        etc(),
    },
    },
    'Leaderboard structure is as expected';

It is mainly E() checking that a key exists, then array to confirm it contains an array. etc() tells the testing code to assume some rows, but not to check the content of them. They just have to exist. As etc() is the only thing in there, as long as data and records contain arrays with some rows, the tests pass.

Having tests like this is really helpful when wrapping someone else's API, as previous pain has taught me. If the structure of the return data changes, I can easily confirm what has changed.

When you are wrapping an API this way, it is an inevitability that the API will change, so having the tests is one of the only ways to ensure you can maintain it over time. My Webservice::Judobase has had this happen a multitude of times in the past 5 or so years.

Summary

As you can see from the brevity of this tale, wrapping an API is pretty easy to do. I know from previous experience that doing it helps others build things with Perl. So it's the sort of project that helps both the service and our language.

Perl is a famous "glue code" language and this is a great example of why. Two CPAN modules do almost all the heavy lifting and our fantastic test tooling means I was easily able to write tests that will make it easy to maintain.

I use Dist::Zilla (as per the previous post) to automate most of the publishing work. I will write something up about that another day. It has lots of plugins that make sure your module is up to scratch.

The JSON API and JavaScript front-end trend has meant that lots of websites have a JSON endpoint that you could wrap. This means you could, as I have, create a tool that uses your data in a way that the website may never build for you.

It also gives me confidence that I could pull my personal data out of the service and store it in my backups (I am doing that next), so that if the company ever goes bust, I have my data there and have it backed up securely and independently. I can do with it what I please as it is just data on my disk(s), either as plain files or maybe pushed into another database.

If you use a website and there is no module for it on CPAN, maybe it has an API you can wrap too?

Perl Zettelkasten tooling hackery

Tags: projects

This week I have worked on some Go-lang for Fantasy-Judo.com adding some internationalisation (Italian support and French translation). I also did a Perl live coding session where I worked on a command line tool I wrote last year for my Zettelkasten note taking system.

I forgot to zoom the terminal so it's a little hard to read, apologies for that. And some of the audio seems muted as I think the music in the background has triggered some copyright bot in Twitch.

The script originally added back links to files that related to one another automatically; I've decided not to do that as many people suggest this is an anti-pattern in the Zettelkasten world. But I did/do still want to know when I have notes that:

Don't link to other notes.
Don't have links from other notes.

So I picked up the repo and started working on it, the stream is just me pottering around in Perl, so although I include it above and am leaving it up, it's pretty dry/dull... you have been warned.

What sticks in my mind from the session were a couple of things:

Deleted code is debugged code (Jeff Sickel)
Test2::V0 and Test::MockFile don't play nicely together
TDD is a skill that comes with practice

"Deleted code is debugged code." – Jeff Sickel
— Programming Wisdom (@CodeWisdom) April 30, 2021

This tweet resonated with me after the live coding session. The reason being that I knew I had a small bug in my code, where an additional blank line was added each time I ran the script and it added back links. I had done some debugging back in August 2020 when I worked on this script last, but not solved the problem.

Today the code is fully "debugged" as I have now deleted the code in question. As a developer, the quote and the experience are important to remember. When we find bugs, should we dig into them and write complicated fixes/tests? Or, as per this example, maybe the best answer is to delete the code. "Less is More" and all that good stuff.

Test2::V0 vs Test::MockFile

As I'd not worked on the code for a while I started by running the tests (Always run the tests first is a mantra we could all do with chanting... especially developers in practical coding interviews, run the darn tests).

In my situation, there were odd errors being thrown. I am presuming this has something to do with differences in the version of Perl and modules between my current machine and the machine I was writing the code on originally, as I don't recall seeing the errors back then.

On the video you'll see me scratching my head, then starting a new test file, with no tests, just the initial use statements. This showed that including the two libraries together caused the error. I realise now as I write this perhaps I could have done some more diagnosis (and probably will later) but at the time I solved my problem the easiest way I could... deleting the code.

Well, sort of. What I actually did was stop using Test2::V0 and switched to Test::More and seeing as the errors went away I made minor changes to my tests to use that instead.

Test Driven Development (TDD) is a skill that comes with practice.

If you watch my live coding sessions, you'll notice I try to work in a TDD style, which when you are doing an abstract problem is a form of "deliberate practice". This week when working on an actual tool I was also operating in a TDD style. It felt, I think, pretty smooth, and smooth as a result of using it in the Perl Weekly Challenge solutions.

This is not a surprise; all the TDD people I trust do suggest that you should learn TDD outside of your job, as it's hard and you get it wrong at first, and that you should not slow down your salary-earning work learning the skill. This week's coding felt like the TDD was flowing nicely as a result of the practice done in other weeks.

I found myself describing what I wanted in pseudo-code comments in the script, then writing tests and code that delivered that, then replacing the pseudo-code with calls to the tested methods I'd written. The code feels pretty comfortable. I also find that I am writing a module that is larger than the script. So the script is pretty expressive and understandable:

use strict;
use warnings;

use lib './lib';
use Zettlr::Backlinker;

my $ZB = Zettlr::Backlinker->new;

my $files = $ZB->get_file_list('/home/lancew/zettel');

for my $file (@$files) {
    if ( $ZB->number_of_links_out($file) == 0 ) {
        print "NO LINKS OUT: $file\n";
    }

    if ( $ZB->number_of_links_in( $file, @$files ) == 0 ) {
        print "NO LINKS IN: $file\n";
    }

}

As you can see, I am not doing much, just getting a list of files, checking the number of outbound links and checking the number of inbound links. The code has three corresponding methods. Tidy.
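Counting outbound links comes down to finding the [[ID]] markers in a note's text. A minimal sketch of that step is below; extract_links is a hypothetical helper written for illustration, not a method of the module, and it assumes links are always a run of digits inside double square brackets.

```perl
use strict;
use warnings;

# Hypothetical helper: return the unique [[ID]] links found in a note's text,
# in order of first appearance.
sub extract_links {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ } $text =~ m/\[\[(\d+)\]\]/g;
}

my $note = <<'HERE';
Links to [[20200802022902]] and [[20200716164911]].
Links to [[20200802022902]] again.
HERE

my @links = extract_links($note);
print scalar(@links), " unique links: @links\n";
# prints "2 unique links: 20200802022902 20200716164911"
```

De-duplicating with a %seen hash means a note that mentions the same ID twice still counts as one link.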

The tests I wrote in parts, but the overall tests I wrote that evening look like this:

use strict;
use warnings;

use Test::MockFile;
use Test::More;

use Zettlr::Backlinker;

my $CLASS = Zettlr::Backlinker->new;

my $file_1_content = <<HERE;
# This is the first file
#tag1 ~tag2

Paragraph one has no links.

Paragraph 2 links to [[20200716164925]] Coding standards.
Which is only the ID and not the full file name


Paragraph 3 links to [[20200802022902]] Made up at the time I wrote the test.

Paragraph 4 links to [[20200802022902]] and [[20200716164911]] to test a bug.

Paragraph 5 links to the second file [[22222222222222]] So should be a backlink for that file

HERE

my $file_2_content = <<HERE;
# This is the second file
#tag1 ~tag2

Paragraph one has no links.


HERE

my $file_3_content = <<HERE;
# This is the third file
#tag1 ~tag2

Links to [[11111111111111]] and [[22222222222222]]

HERE
my $mock_file_1
    = Test::MockFile->file( '11111111111111 some file.md', $file_1_content );
my $mock_file_2 = Test::MockFile->file( '22222222222222 another file.md',
    $file_2_content );
my $mock_file_3
    = Test::MockFile->file( '33333333333333 third file.md', $file_3_content );
my $mock_file_4 = Test::MockFile->file( 'README.md', $file_1_content );

my @file_list = (
    '11111111111111 some file.md',
    '22222222222222 another file.md',
    '33333333333333 third file.md',
    'README.md',
);

my $mock_dir = Test::MockFile->dir( '/foo', \@file_list, { mode => 0700 } );

subtest 'number_of_links_out' => sub {
    is $CLASS->number_of_links_out('11111111111111 some file.md'),
        4, 'Has 4 unique links in it, 5 in total as one is repeated';
    is $CLASS->number_of_links_out('22222222222222 another file.md'),
        0, 'Has no links in it';
    is $CLASS->number_of_links_out('33333333333333 third file.md'),
        2, 'Has 2 links in it';
};

subtest 'number_of_links_in' => sub {
    is $CLASS->number_of_links_in(
        '11111111111111 some file.md', @file_list
        ),
        1, 'Only the third file links to file 1';
    is $CLASS->number_of_links_in(
        '22222222222222 another file.md', @file_list
        ),
        2, 'Both the other files link to this file';
    is $CLASS->number_of_links_in(
        '33333333333333 third file.md', @file_list
        ),
        0, 'Neither of the other files link to this file';
};
done_testing;

As I hope you can see, I have two things I am testing, links in and links out, broken into two subtest blocks. I like using subtest to give clarity in TAP output and visibly in the test file itself. I could/should move the here-docs lower in the file as that's noise. Though arguably, it gives the reader the content of the three test files before they see the tests.

I used Test::MockFile for the first time on this project. I don't think it's particularly popular and ideally I'd not use it and structure things in different ways. But the tool is all about reading (and originally writing) to files on disk so this test library was great for allowing me to create files and directories without actually creating temp files.

The methods I wrote look like this:

sub number_of_links_out {
    my ( $self, $filename ) = @_;

    my $links = $self->get_links_from_files($filename);

    return scalar( @{ $links->{$filename} } );
}

sub number_of_links_in {
    my ( $self, $filename, @file_list ) = @_;

    my $links = $self->get_links_from_files(@file_list);
    delete $links->{$filename};

    # The note ID is the leading run of digits in the file name.
    my ($id) = $filename =~ m/^(\d+)/;
    return 0 unless defined $id;

    my $occurrences = 0;
    for my $key ( keys %$links ) {
        for my $link ( @{ $links->{$key} } ) {
            $occurrences++ if $link eq $id;
        }
    }
    return $occurrences;
}

The first one is pretty concise now, it was longer, but I did refactor a little. The second came later and I was tired, and it shows. It is still the verbose version I tend to start with and whittle down once I have a working solution. I actually finished this code off stream as I was not getting things right. Just after stopping the stream I stopped making small mental mistakes and got it working properly.

This makes sense too; coding is a skill. And just as your skills improve with practice, they deteriorate with tiredness.

This is one of those things that a professional coder knows. You are creating something that requires skill, working tired or sick is something you learn not to do. I have told many people about "that time" when I wrote some code at work when I had a cold/flu. The (awesome) tester who worked with me took one look and straight out told me it was rubbish and looked like someone else had written it. They saw the result of using a skill sick/fatigued, it's just not as good as when you are fresh and healthy.

Given writing software is done in our heads, making sure that our heads are clear should be a priority you'd think. But how often do we ask that question of ourselves, let alone peers or reports (if you are a manager)? How much effort are we as an industry wasting because we are allowing people creating software to do it tired or sick?

So along with practice, I am adding "Rest and wellness matter!" to my list of things I learnt this week.

This week I have more l10n work to do on/in Go-lang and want to look at how to run the above script every time I push to my Zettelkasten and have the results emailed to me. I will also try and do the Perl Weekly Challenge (I seem to be settling into a fortnightly rhythm of doing them). I really should also put some time into the Judo simulation project I have been "renovating" from the early noughties.

Lance.