Building a tool to integrate Readwise.io highlights into my Zettelkasten via Perl

Recently I started using Readwise.io as a replacement for both my RSS reader and my bookmarking app. I also use a Zettelkasten system for things I want to record, so I decided to integrate with the API so that highlights I make in Readwise end up as markdown files on my local machine (and in a git repo).

The editors I mostly use for my Zettelkasten are the lovely Zettlr and vim, so I wanted to craft simple markdown files. Being me, I reached for Perl, and along the way have written a small CPAN module called WebService::Readwise, which includes an example script of the basics of what I cover in this tale.

TDD for Good... strings

One of the things about Test Driven Development (TDD) is that it takes practice. A great way to get that practice is via the Weekly Challenge, which is led by the amazing Mohammad Anwar. Mohammad is a force for good in the Perl community, and this week I wanted to take the time to thank him, express how much I appreciate him, and wish him well always... but especially at the moment, when he faces some challenges of his own! Kia Kaha my friend!

The first challenge this week (https://theweeklychallenge.org/blog/perl-weekly-challenge-221/) was to develop some software to calculate the sum of the lengths of "Good Strings", where a good string is described as:

A string is good if it can be formed by characters from $chars, each character can be used only once.

For me this starts with creating a .t test file. Yes, a test file first. I use Test2::V0 and target a module to test, so something like this:


use Test2::V0 -target => 'Good::Strings';
ok 1;
done_testing;

This fails as I have not created the module file yet.

So my next step is to create a package/module in a lib directory, with this as content:


package Good::Strings;

1;

Then I run my tests again. I make lots of typos, so running my tests regularly/constantly and making small changes helps prevent those small mistakes causing frustration. To help with that, I like to run the tests automatically. My current tooling (when working purely from a terminal session) is to run my tests in a separate window like this:

git ls-files -mo | entr -s 'yath'

What the above does is use entr to watch a selection of files, which it gets from git. So each time I save a file, yath gets fired and runs my tests.

Once I have confirmed that the test file is able to load the package (no syntax errors etc.), it's time to write the first test, against the first example:


use Test2::V0 -target => 'Good::Strings';

subtest "Example 1" => sub {
    my @words    = ( "cat", "bt", "hat", "tree" );
    my $chars    = "atach";
    my $expected = 6;

    is $CLASS->sum_lengths(
        words => \@words,
        chars => $chars,
    ), $expected;

};

done_testing;

This fails as expected, as there is no sum_lengths method/sub in the package. So I add that: sub sum_lengths {}; this fails in a different way. This time the test calls the method successfully, but the method itself is not returning the right answer... so I change it to sub sum_lengths { return 6; }. At this point I am "green" and the code does exactly what is required of it.
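
At this point the entire module is tiny; a sketch of its whole state (the %args handling is my assumption, matching the named parameters in the test):

```perl
package Good::Strings;
use strict;
use warnings;

# Deliberately hard-coded: the smallest possible change that
# makes the "Example 1" test pass.
sub sum_lengths {
    my ( $class, %args ) = @_;
    return 6;
}

1;
```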

It does not do any real calculations, and that is the right thing to do in a TDD exercise. Each step you take wants to be tiny. You want to make the smallest step you can each time. Do the smallest thing that makes the tests pass, including hard-coding the return value, as we don't need to do anything else to make the test pass. It seems silly, but it does a couple of good things. First, it teaches you not to try and solve the whole problem at once. As professional software developers we do that all too often: rather than solving the small, easy problem immediately in front of us, we try to solve the bigger problems whilst solving the small ones. The cognitive load is hard to manage, we constantly run the risk of over-complicating things, and it is a recipe for disaster. So intentionally practising the smallest change that meets the immediate requirement is good for discouraging that trap. The other thing it helps with is catching human errors: typos, spelling mistakes, syntax errors and the ever-threatening missing semi-colon. As a developer you need to plan for human error, not pretend we can be perfect.

Having reached this point, I'd normally commit the change, grab my favourite beverage, take a moment to look at the next requirement (in this case, example 2), and whilst hydrating think about how the code needs to change. Again, this is a good habit that working in a TDD manner encourages: stopping, thinking, moving forward.

So as you'd expect, I start by adding a new test:


subtest "Example 2" => sub {
    my @words    = ( "hello", "world", "challenge" );
    my $chars    = "welldonehopper";
    my $expected = 10;

    is $CLASS->sum_lengths(
        words => \@words,
        chars => $chars,
    ), $expected;
};

This test fails because the code always returns 6, so now is the right time to change the function to make this test pass.

I'm going to skip over the repeated cycles I took to get to a working solution... why? Because it was probably 50 small steps, and that would be dull. It included things like switching from passing named parameters to using sub signatures (still a Perl feature I don't use very often, so I decided to use it for educational purposes). I decided not to explore using a CPAN module like List::MoreUtils and wrote a really ugly solution... but one that worked. Here it is:


sub sum_lengths ( $self, $words, $chars ) {
    my @chars_array = sort split '', $chars;
    my %chars_hash  = map { $_ => $_ } @chars_array;

    my $char_count = 0;
    for my $word (@$words) {
        my @word_array = split '', $word;

        my $built_word;
        for my $char (@word_array) {
            $built_word .= $chars_hash{$char} if $chars_hash{$char};
        }
        $char_count += @word_array if $word eq $built_word;

    }

    return $char_count;
}

At this point I stopped and had another coffee. Coming back to the screen, I realised I had stopped doing good TDD. I'd written a lot of code without tests actually supporting it, beyond the larger view of whether it worked. I'd used Data::Dumper and warn to check what the code was doing rather than a test. This happens a lot in "the real world"; as a developer you cut corners, mainly when you think you know what you want to do. Your intuition leads you to skip some steps. Coming back to this code after a break, I was able to see I had "cheated".

This is part of the reason to practise TDD on non-work code. I get the chance to see this behaviour on an exercise, not on my employer's production code. It's good to remind myself that I am fallible; even when the intention is to do TDD, I skip to intuitive coding. No harm done, and it works... but the steps went from small to large. If someone was looking at this as a merge request, they would have more code to understand, and there is no story of how I approached it, or why.

This is a micro demonstration, and it's good for me to see it here; much better to write this poorly here than do it at work and make my colleagues suffer a big, ugly change. What was/is also interesting is that I felt mentally tired after this change (hence the coffee break). That was in part because of the duration, but also because without TDD it was one big block of concentration, without the micro-breaks that come from the TDD test, change, test, change cycle.

Having "solved" the problem, both the examples are passing at this stage and I have met the "business requirements". Now is the time to refactor and improve on what I know is a sub-optimal solution. Time to TDD this; in fact, I think TDD makes this easier.

So how might we refactor this?

Well, the code that builds the word looks like a likely candidate, in part because a loop inside a loop is never a good look. So we can try refactoring that out into a sub, testing it with something like this:


subtest "_build_word" => sub {
    is $CLASS->_build_word(
        word  => 'cat',
        chars => {
            c => 'c',
            a => 'a',
            t => 't',
        },
        ),
        'cat';
};

Then we can make the sum_lengths sub look like this:


sub sum_lengths ( $self, $words, $chars ) {
    my @chars_array = sort split '', $chars;
    my %chars_hash  = map { $_ => $_ } @chars_array;

    my $char_count = 0;

    for my $word (@$words) {
        my @word_array = split '', $word;

        my $built_word = $self->_build_word(
            word  => $word,
            chars => \%chars_hash,
        );

        $char_count += @word_array if $word eq $built_word;
    }

    return $char_count;
}

So that is a little cleaner; it's perhaps easier to follow that we loop around the words. Looking at this, we can see that the only reason we split the word into an array is to tell us how many characters to add to the count. This is an idiomatic Perl trick where an array in scalar context returns the number of elements in the array. It's a bit magical... and not necessary. We can use Perl's length function to achieve the same thing, which is both shorter and probably easier for others to understand (especially non Perl natives). The extracted _build_word sub looks like this:


sub _build_word {
    my ( $self, %params ) = @_;

    my @word_array = split '', $params{word};

    my $built_word;
    for my $char (@word_array) {
        $built_word .= $params{chars}{$char} if $params{chars}{$char};
    }

    return $built_word;
}

What is important here is that because I have tests, I know immediately that I've not broken the functionality. So I can make my refactorings with confidence. Because I have confidence in not breaking the functionality, I can experiment with the code and see if I can hone it.

This freedom also lets me test a bit more thoroughly, doing a little exploration of edge cases, such as "what happens if we can't build any words?". So I can add another test:


subtest "Edge case: unable to build any words" => sub {
    my @words    = ( "hello", "world", "challenge" );
    my $chars    = "xxx";
    my $expected = 0;

    is $CLASS->sum_lengths( \@words, $chars, ), $expected;
};

This test passes... BUT we get a warning, as Perl does not like us using eq on $built_word when it is undef. Knowing this, we can change our test as follows:


subtest "Edge case: unable to build any words" => sub {
    my @words    = ( "hello", "world", "challenge" );
    my $chars    = "xxx";
    my $expected = 0;

    my $got;
    ok no_warnings { $got = $CLASS->sum_lengths( \@words, $chars, ); };
    is $got, $expected;
};

This test fails as we would hope, so I can fix it with a simple skip in the loop when $built_word is falsey: something like next unless $built_word; just above where we check if the built word and the target word are equal. You could of course do it other ways. :-)
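
Folding in the length swap and the undef guard, the whole module might now look like this (a sketch of one possible end state; the sort was dropped as a hash doesn't need it, and the actual final code may differ):

```perl
package Good::Strings;
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

sub _build_word {
    my ( $self, %params ) = @_;

    my $built_word;
    for my $char ( split '', $params{word} ) {
        $built_word .= $params{chars}{$char} if $params{chars}{$char};
    }

    return $built_word;
}

sub sum_lengths ( $self, $words, $chars ) {
    my %chars_hash = map { $_ => $_ } split '', $chars;

    my $char_count = 0;
    for my $word (@$words) {
        my $built_word = $self->_build_word(
            word  => $word,
            chars => \%chars_hash,
        );

        # Skip before eq can ever see an undef $built_word
        next unless $built_word;

        # length replaces the array-in-scalar-context trick
        $char_count += length $word if $word eq $built_word;
    }

    return $char_count;
}

1;
```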

So there you have it, my contribution to the Weekly Challenge. I wanted to make the effort to participate this week as I wanted to support Mohammad as he put the effort in to administer the challenge this week despite all that is going on with him at the moment. Hang in there Mohammad!!

Writing a Dist::Zilla test plugin

Recently I wrote a small Dist::Zilla plugin to help with the problem of dependencies being old in modules I author.

I use update-cpanfile to check dependencies in projects I write that use a cpanfile (and carton); I wanted a tool to do the same thing for libraries I write.

It was pretty simple to do over a couple of evenings, and it worked really well; a trial version is on CPAN now as Dist::Zilla::Plugin::Test::Prereqs::Latest.

Using it is really simple: you just add [Test::Prereqs::Latest] to your dist.ini, and when you run dzil test it will add and run a test file (xt/author/prereqs-latest.t) which checks the version of each module in the [Prereqs] section of the dist.ini and fails if the version you have specified is lower than the latest on CPAN.
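
In dist.ini that might look like this minimal sketch (the [Prereqs] entries are purely illustrative):

```ini
[Prereqs]
Moo = 2.005004

[Test::Prereqs::Latest]
```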

The code is pretty simple, building on existing work by HITODE, whose update-cpanfile I mentioned above.


use strict;
use warnings;

use App::UpdateCPANfile::PackageDetails;
use Dist::Zilla::Util::ParsePrereqsFromDistIni qw(parse_prereqs_from_dist_ini);
use Test::More;

my $prereqs = parse_prereqs_from_dist_ini(path => 'dist.ini');
my $checker = App::UpdateCPANfile::PackageDetails->new;

for my $key (sort keys %$prereqs) {
    for my $req (sort keys %{$prereqs->{$key}->{requires}}) {
        my $current_version = $prereqs->{$key}->{requires}->{$req};
        $current_version =~ s/v//g;
        my $latest_version = $checker->latest_version_for_package($req) || '0';
        my $up_to_date = ( $latest_version <= $current_version );

        ok( $up_to_date, "$req: Current:$current_version, Latest:$latest_version" );
    }
}

It's simplistic, but works well. If the version in the dist.ini is lower than the version on CPAN it fails the test and tells you as much.

Once written, I was easily able to test it against a couple of modules I maintain: I installed the package with dzil install, then in the other module added [Test::Prereqs::Latest] to the dist.ini, ran dzil test, and it worked.

Once I had done basic testing locally, I was able to create a trial release and upload to CPAN with dzil release --trial, which built and uploaded the distribution.

Of course CPAN is amazing, so shortly afterwards the CPAN Testers started discovering the module and testing that it built on a variety of versions of Perl and a variety of platforms. People love GitHub Actions, but CPAN Testers was first, and I did literally nothing to get all this amazing testing: no configuration work, nothing... it just happens. It's amazing, seriously amazing.

The module is not ready for a proper release, but it's been nice to "scratch my own itch" so to speak.

Wrapping a JSON API to access your personal data

In my last tale I outlined how I used Dist::Zilla to do much of the heavy lifting for packaging and releasing a Perl module to CPAN. In this tale I want to outline how quick and easy it was to write that module; wrapping a proprietary undocumented API so that I could easily use my personal data in ways the company did/does not provide.

The module in question is WebService::SmartRow. SmartRow is a physical device and app that provides power and other performance data from an indoor rowing machine (specifically WaterRower machines).

It is a bluetooth device that sends data to a web service via a mobile application. This data is then accessed via the app or the website. The data includes things like the power in watts, strokes per minute, heart rate, etc.

The specific data I wanted to see was/is how my average power changes over time. Specifically, across a series of 5000 metre rows, am I generating more power over time? I also wanted to see my personal best times over time, i.e. how much better was this 5000m than my previous best?

The API

SmartRow does not provide a developer API (at the time of writing, and as far as I was able to see). What they do have is a website that is mainly driven by JavaScript and a JSON API. As a logged-in user, my browser makes calls to the API and then creates pretty HTML from the JSON returned.

The API is really quite tidy and easy to understand. Which is nice! I have previous experience wrapping JSON APIs and it can be really ugly; thankfully the SmartRow team have made a clean, easy to understand API.

The API relies on you being logged in, but quick exploration showed that basic auth is possible on the API endpoint. Meaning I did not need to worry about OAuth or keys and so forth.

The web requests

Knowing that I could get away with basic auth meant that the code I needed could work simply by including the username and password in the URL I make an HTTPS request to.

I turned to HTTP::Tiny and was quickly able to craft a GET request that looked a little like this:

    my $response = HTTP::Tiny->new->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

Then (after some basic tests to make sure I got a 200), I could parse the $response->{content} from JSON to a Perl data structure using Cpanel::JSON::XS.
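
The decode step is a one-liner; here is a sketch using core JSON::PP, which shares the decode_json interface with Cpanel::JSON::XS (the record below is an invented, trimmed stand-in for the real response body):

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # core module; Cpanel::JSON::XS has the same interface

# Pretend this came back from the API: a trimmed, hypothetical
# version of $response->{content} with invented values.
my $content = '[{"distance":5000,"p_ave":185,"ave_bpm":152}]';

my $workouts = decode_json($content);

for my $workout (@$workouts) {
    printf "%dm at %dW average\n", $workout->{distance}, $workout->{p_ave};
    # prints: 5000m at 185W average
}
```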

This gave me a data structure looking like this:

  [
    {
    "accessory_mac"        => E,
    "account"              => E,
    "ave_bpm"              => E,
    "ave_power"            => E,
    "calc_ave_split"       => E,
    "calc_avg_stroke_rate" => E,
    "calc_avg_stroke_work" => E,
    "calories"             => E,
    "confirmed"            => E,
    "created"              => E,
    "curve"                => E,
    "device_mac"           => E,
    "distance"             => E,
    "elapsed_seconds"      => E,
    "extra_millies"        => E,
    "id"                   => E,
    "mod"                  => E,
    "option"               => E,
    "option_distance"      => E,
    "option_time"          => E,
    "p_ave"                => E,
    "protocol_version"     => E,
    "public_id"            => E,
    "race"                 => E,
    "strava_id"            => E,
    "stroke_count"         => E,
    "time"                 => E,
    "user_age"             => E,
    "user_max_hr"          => E,
    "user_weight"          => E,
    "watt_kg"              => E,
    "watt_per_beat"        => E,
    },
    ...
  ]

(The above code is actually taken from the Test2::V0 .t file I wrote to confirm the structure.)

So you can see it's pretty easy to understand; the keys are all mainly self-explanatory. In my case I wanted p_ave and distance, so I could filter on 5000 metres and build an array of all the average power values.
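
That filtering can be sketched in a few lines; the records here are invented stand-ins for the decoded API data:

```perl
use strict;
use warnings;

# Hypothetical workout records, trimmed to the two fields we need
my @workouts = (
    { distance => 5000, p_ave => 180 },
    { distance => 2000, p_ave => 210 },
    { distance => 5000, p_ave => 188 },
);

# Collect the average power for every 5000 metre row, in order
my @five_k_power =
    map  { $_->{p_ave} }
    grep { $_->{distance} == 5000 } @workouts;

print "@five_k_power\n";    # prints: 180 188
```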

Module creation

At first this could have been a simple script, but I wanted to make it something portable and usable by anyone wanting to work with their personal data.

So after I proved the approach would work, I started a module (using dzil new WebService::SmartRow). This is currently a single file with little refinement.

I used Moo for some simple OOPness and structure. This allowed me to specify the attributes I want:

has username => ( is => 'ro', required => 1 );
has password => ( is => 'ro', required => 1 );

These are pretty self-explanatory: to use the API you need them, so they are added as required attributes.

Next I added an http attribute:

has http => (
    is      => 'ro',
    default => sub {
        return HTTP::Tiny->new();
    },
);

The default here creates an HTTP::Tiny object which I can later use in methods via $self, which means my earlier GET request changes to look like this:

    my $response = $self->http->request( 'GET',
              'https://'
            . $user . ':'
            . $pass . '@'
            . $api_url
            . '/public-game'
    );

You can set your own http attribute when calling WebService::SmartRow->new(), so if you need to change the user agent, or have a preferred module, you can inject it easily (assuming its methods match HTTP::Tiny's).

Testing

Currently the module is pretty simple: three attributes and four public methods. It has few smarts, so the t directory is pretty spartan as the object itself is simple.

I am using the xt directory to hold tests that talk to the API and as such require an internet connection and credentials.

Not wanting to include my personal credentials in the repo, I have a private sub in the class that gets the username and password from environment variables. This is good as it means I can commit my tests, and anyone using this module does not need to commit their credentials in code either.

Perl makes the environment variables easy to work with, so the small sub that handles it looks like this:

sub _credentials_via_env {
    my $self = shift;

    my $user = $self->username || $ENV{SMARTROW_USERNAME};

    my $pass = $self->password || $ENV{SMARTROW_PASSWORD};

    return ( $user, $pass );
}

So if you have instantiated the module with a username or password, it will use those. If they are not present, it will use SMARTROW_USERNAME and SMARTROW_PASSWORD.

Then (and I know I can make this a bit smarter) my get_workouts() method, has a call to my ( $user, $pass ) = $self->_credentials_via_env; prior to calling the URL.

This means I can run my tests like this:

SMARTROW_USERNAME=yyyyyyy SMARTROW_PASSWORD=xxxxxxx prove -lvr xt/

And it will go and connect to the API and execute tests for me by getting real data from the API.

Earlier I mentioned I am using Test2::V0 for tests, so in xt I have a growing collection of files that primarily confirm the structure of the data returned from the API.

Mainly they use E() to prove a hash element exists; some could/should/do test the content. For example, in one test I just wanted to confirm that the data and records fields are arrays before later testing the content of those arrays. So I have a test like this:

is $leaderboard, {
    distribution => {
        ageMode => E,
        data    => array {
            etc(),
        },
        max            => E,
        mean           => E,
        min            => E,
        range          => E,
        userPercentile => E,
    },
    id      => E,
    mod     => E,
    records => array {
        etc(),
    },
    },
    'Leaderboard structure is as expected';

It is mainly E() checking that a key exists, then array to confirm it contains an array. etc() tells the testing code to assume some rows but not to check their content; they just have to exist. As etc() is the only thing in there, as long as data and records contain arrays with some rows, the tests pass.

Having tests like this is really helpful when wrapping someone else's API, as previous pain has taught me. If the structure of the returned data changes, I can easily confirm what has changed.

When you are wrapping an API this way, it is inevitable that the API will change, so having the tests is one of the only ways to ensure you can maintain the module over time. My WebService::Judobase has had this happen a multitude of times in the past five or so years.

Summary

As you can see from the brevity of this tale, wrapping an API is pretty easy to do. I know from previous experience that doing it helps others build things with Perl, so it's the sort of project that helps both the service and our language.

Perl is a famous "glue code" language and this is a great example of why. Two CPAN modules do almost all the heavy lifting and our fantastic test tooling means I was easily able to write tests that will make it easy to maintain.

Dist::Zilla (as per the previous post) I use to automate most of the publishing work; I will write something up about that another day. It has lots of plugins that make sure your module is up to scratch.

The JSON API and JavaScript front-end trend means that lots of websites have a JSON endpoint that you could wrap. This means you could, as I have, create a tool that uses your data in a way that the website may never build for you.

It also gives me confidence that I could pull my personal data out of the service and store it in my backups (I am doing that next), so that if the company ever goes bust, I have my data there and have it backed up securely and independently. I can do with it what I please as it is just data on my disk(s), either as plain files or maybe pushed into another database.

If you use a website and there is no module for it on CPAN, maybe it has an API you can wrap too?

Using Dist::Zilla to create a new CPAN module

Recently, I posted about some tools that are perhaps not as well known as they should be. I asked on Twitter (from @perlkiwi) for suggestions, and one was distzilla.

I didn't include it at the time, in part because I knew I wanted to mint a new CPAN module and planned on using distzilla to do it; so this tale is going to cover how it works and how it helped me put a module together.

What is Dist::Zilla?

Dist::Zilla (aka distzilla aka dzil) is a command line tool to help you create and perhaps more importantly maintain packages you intend to share via CPAN.

It really does help ensure you create a good package and make it easy to maintain that module over time. Perl people know the importance of maintaining legacy code; so dzil is really valuable.

Getting started

If you have never tried distzilla, then do pop over to the main website dzil.org which has a fabulous "Choose your own adventure" way of introducing you to the tool. I really like the site and LOVE a different approach to doing documentation.

In short... you start by typing something like dzil new WebService::SmartRow, which will create a new directory for you with a dist.ini, a base lib/WebService/SmartRow.pm file and a few other bits and pieces.

From here, I started a new repo on GitHub and added the "remote" to the created directory (git init, then git remote add origin git@github.com:lancew/WebService-SmartRow.git), after which I could happily git add ., then commit and push the changes.

Writing the code

The module I was writing is a small wrapper around a JSON API endpoint from https://smartrow.fit/. I just wanted to be able to access the workout data so I could munge the data in a few different ways and create some charts for myself.

The module itself is pretty simple. I used Moo out of habit and HTTP::Tiny for the network part and Cpanel::JSON::XS for the JSON handling.

These are specified in the dist.ini file as prerequisites:

[Prereqs]
perl = v5.06.0 ;
Cpanel::JSON::XS = 4.27 ;
HTTP::Tiny       = 0.080 ;
Moo              = 2.005004 ;
namespace::clean = 0.27 ;

[Prereqs / TestRequires]
Test2::V0 = 0.000145 ;

You can see I have two sets of dependencies: one for the module itself and one for testing during installation. I use Test2::V0, so I pop it in there.

Uploading to CPAN

After having written the code and tests, I wanted to upload to CPAN. This I can do with distzilla via the helpful dzil release --trial command, which, as I have already used dzil to upload modules before, worked first time. :-)

All you need to know is your PAUSE credentials.

Dist::Zilla plugins for the win!

Having uploaded the package as a trial, it was a good time to use distzilla to fine-tune my module. This is super easy via the large array of plugins (and plugin bundles). The biggest problem is choosing them and dealing with the odd conflict.

A good one to start with is the @TestingMania bundle, which includes a lot of really helpful testing tools: PerlTidy, PerlCritic, PodCoverage and more intricate ones to test your META info, which is important if the package is going to CPAN.

Adding the bundle is easy: just add [@TestingMania] to your dist.ini and run dzil test. If the dependencies you need are not there, distzilla will prompt you with the command you need, which is dzil authordeps --missing | cpanm.

Git and GitHub integration

I am using GitHub to host the code for this module, so I added a few plugins:

GitHub::Meta

This plugin includes things like the repository url and issue tracker in the META for the package.

By including the plugin (and META info), metacpan will know that the repo is on GitHub and present it in the UI. It also tells metacpan that I am using the issue tracker on GitHub, which prevents it from defaulting to RT.

@Git

The @Git bundle is helpful for ensuring a few Git-related things are done correctly, i.e. that the repo is "clean" before you release. It also creates a git tag automatically on release and pushes it to GitHub for me.

This bundle did push me to update my .gitignore file to ignore the WebService-SmartRow-* files and directory that Dist::Zilla builds.

Git::NextVersion

The last of the Git plugins is my favourite: this one automatically bumps the version when I do a release, which meant I could remove the version = 0.001 line from the dist.ini as the plugin increments based on the tags in the repo. It's cool, I like it.
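
A minimal configuration sketch, based on my reading of the plugin's documented options (check the plugin's own docs before relying on these):

```ini
[Git::NextVersion]
first_version  = 0.001      ; only used when the repo has no version tags yet
version_regexp = ^v(.+)$    ; how release tags map to version numbers
```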

Changes

I am terrible at updating the Changes file in repositories, so I added these two plugins:

Test::ChangesHasContent

This plugin creates a .t file that confirms you have content in the Changes file matching the next release.

CheckChangesHasContent

This plugin is similar, in that it actually prevents you from releasing if you have not got content for the new release in the Changes file.

I use both, as I like the early warning via dzil test that the first plugin gives me, and the second is there mainly in case I ever remove the first, I suppose. Maybe I should delete it?

The Bundles gotcha

The plugin bundles are awesome, with one exception for new dzil users: they can cause conflicts where the same plugin is referenced twice. This is not too scary once you have seen it a couple of times, but can be a stress when it hits you the first time. Just read the message and try to understand which two bundles are conflicting. You can (at least I could in @TestingMania) disable a plugin, and that can solve the issue for you (I just added disable = Test::Version ; immediately after the [@TestingMania] in dist.ini, for example; YMMV).

dzil as a tool

dzil is really helpful. dzil test and dzil test --release ensures my module is in a pretty good state.

dzil build (and dzil clean) are handy for building the module so I can look at what will end up on CPAN.

dzil install installs the module like a normal CPAN module, so I could use it in a script elsewhere on my machine as if I had downloaded it from CPAN. Given this module is mainly about accessing a JSON API, this is a helpful feature that saves uploading to CPAN and then installing from CPAN more often than I need to.

Summary

Dist::Zilla can be a little overwhelming, but I find it pretty intuitive if you take small steps. I added each plugin one at a time, as that helps keep the confusion to a minimum. Bundles are great, but can have conflicts. It is by its own definition "maximum overkill"; but it's easy overkill, so give it a try.

Let me know how you get on.

  • Lance (perl.kiwi).