Perl tooling - less well known but powerful


Perl is a mature language; businesses have been relying on it for decades, and a common refrain is that Perl is battle tested. One of the strengths of a mature language is that the community builds tooling around it. Perl has some well known tools and some less well known ones. In this tale I want to talk about some of the tooling I like that is perhaps less well known.

Having a good set of tools helps developers build software better and more easily.

Well known tools

Here are some of the better known tools that many Perl developers use every day.

Perl::Tidy

Consistent formatting is surprisingly important and time consuming. Especially in a team environment, running Perl::Tidy is helpful for visually formatting the code so that no matter who writes the changes, they are formatted consistently, be that tabs vs. spaces or aligning the "=" signs in variable assignments.

Perltidy is similar to gofmt (GoLang), Prettier (Node, etc.) or elm-format (Elm).
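Perltidy is driven by a .perltidyrc file in the project root or your home directory. A minimal sketch (the flag choices here are purely illustrative, not a recommendation):

# .perltidyrc
-l=100    # maximum line length
-i=4      # indent with 4 columns
-ci=4     # continuation indentation
-b        # modify files in place, keeping a .bak backup

With that in place, running perltidy over a file reformats it to the shared style.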

Perl::Critic

perlcritic is a static analysis tool. It is very flexible and configurable, allowing you to define rules and detect things you don't want in your code base.

For example, a common rule is that all subroutines need an explicit return; perlcritic can enforce this across your code base. There are over 100 "policies" on CPAN for perlcritic. Another one, Perl::Critic::Policy::logicLAB::RequireParamsValidate, insists that the Params::Validate module is used to validate all parameters to your subroutines.

Perl::Critic::Policy::Variables::ProhibitUnusedVarsStricter prevents you from defining a variable that is never used. This is a pretty common thing to find in a large code base, either because the variable is no longer used or there is a typo somewhere.
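Policies are enabled and tuned via a .perlcriticrc file. A minimal sketch, with severity levels chosen purely as examples:

# .perlcriticrc
severity = 3    # only report severity 3 and above by default

[Subroutines::RequireFinalReturn]
severity = 5    # treat a missing explicit return as critical

[Variables::ProhibitUnusedVarsStricter]
severity = 4

Running perlcritic lib/ then reports violations against exactly the rules your team agreed on.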

prove

Perl has a strong and well developed testing culture and tooling. prove is the test runner most developers are used to.

There is a wide selection of Test:: modules that you can use for mocking, unit testing, BDD, even enforcing Perl::Critic policies.
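Running a suite with prove typically looks like this (the flags are just the ones I tend to reach for):

prove -lr t/

The -l adds lib/ to @INC and -r recurses into subdirectories; add -j4 to run four test files in parallel, or -v for verbose TAP output.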

perlbrew

Perlbrew is relatively well known, though perhaps less often used than it deserves to be.

Perlbrew provides an easy way to install various versions of Perl and switch between them. It is often used on your development machine, especially if you need to work on specific versions of Perl to support legacy applications. Increasingly we are seeing it used in the building of Docker containers.
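Day-to-day usage looks something like this (version numbers are only examples):

perlbrew install perl-5.36.0    # build and install a fresh Perl
perlbrew list                   # show the Perls you have installed
perlbrew switch perl-5.36.0     # make it the default Perl
perlbrew use perl-5.20.3        # use an older Perl in this shell only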

Less well known tools

carton

If you have written any Node you will understand carton. With carton you describe your module dependencies in a file called cpanfile and then install them with carton install. This installs all the modules into a ./local directory; you can then run the Perl application with carton exec, which runs the code with all the dependencies in the correct path.

This is particularly helpful when you have multiple projects that use differing versions of the same dependencies.
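A cpanfile is just a small Perl DSL listing your requirements. A minimal sketch (modules and versions are illustrative):

# cpanfile
requires 'Mojolicious', '>= 9.0';
requires 'DBI';

on 'test' => sub {
    requires 'Test2::V0';
};

carton install resolves and installs everything into ./local, and carton exec perl some_script.pl runs against exactly those modules.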

Because I use carton with almost all my projects now, I have the following two aliases set up:

alias perl='carton exec perl'
alias prove='carton exec prove'

These mean that if I forget to type carton exec perl some_script.pl and just type perl some_script.pl, it works as expected using the local dependencies. The prove alias is handy as I almost never remember to type carton exec prove.

update-cpanfile

This command-line tool is really helpful for maintaining your dependencies. When you run update-cpanfile pin it will pin your dependencies in the cpanfile, and carton install will then install the specific versions in the file. This keeps your dependencies consistent; but you could do that by hand in the cpanfile.
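The effect is something like this (module and version invented for illustration): a loose entry such as

requires 'Mojolicious';

becomes a pinned one:

requires 'Mojolicious', '== 9.22';

so every carton install, on any machine, resolves to the same release.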

Where update-cpanfile is really helpful is when you run update-cpanfile update: the tool checks CPAN for the latest version of each module and updates the cpanfile for you with the new version.

If you maintain a variety of projects or have lots of dependencies update-cpanfile is a real time saver.

yath

The simplest way to describe yath is to say that yath is the new prove... but better.

The venerable prove has been used for a long time; yath is relatively new (2016) and can be used to do pretty much all the things prove does.

However, it's able to do some really interesting things, such as running in a daemon mode that saves startup time... give it a try.
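For everyday use the commands map over fairly directly; the daemon mode works roughly like this (check yath --help for the specifics):

yath test t/     # roughly equivalent to prove t/

yath start       # start a persistent runner that preloads your code
yath run t/      # run tests against it, skipping the startup cost
yath stop        # shut the runner down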

perlvars and perlimports

These two tools are super handy for analyzing your code base to catch some undesirable code smells.

perlvars identifies unused variables in your code.

perlimports identifies unused modules you may have included in your code but are not using.

Removing unused variables and modules helps keep your code "tidy", can improve memory consumption, and protects against subtle bugs, such as a function from an unused module clashing with another module's function of the same name.
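As a sketch of the kind of thing perlimports tightens up, imagine code like this (a hypothetical example):

use List::Util qw( first sum uniq );    # only sum is actually used below

my $total = sum @prices;

perlimports can rewrite the import list down to use List::Util qw( sum );, so the code documents exactly what it depends on.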

What am I missing?

This is just a short list of tools I wanted to mention, in part to invite people to let me know what they use.

So if you have some tools you use all the time, drop me an email or a tweet.

Building a simple RSS aggregator with Perl and Mojolicious to replace a PHP app


In the last tale I shared a little of my experience of using the Digital Ocean app service to build, deploy and host a simple web application.

In this tale, I'd like to summarise the small application I built.

Introduction

The application I am replacing is an old (early 2000s) PHP application. It aggregates RSS feeds from a variety of Judo blogs and produces both a web presentation and a new RSS feed of the combined RSS items. The old site has limped along with little modification since the early 2010s. It did well until the hosting platform it was on (cPanel) upgraded PHP and broke the site.

There were two specific hacks that both helped and hindered. The web pages were being written to static HTML files by a cron job that ran the PHP. To address the multi-lingual need, the site had three installations of the same code running against different OPML files of sites. This actually meant the site stayed "up" when the PHP upgrade broke the app.

New Version

I decided to build the site using Mojolicious. I've done a lot of work with Dancer2 in the past, so this was a good opportunity to build something real with Mojo.

Making the site properly multi-lingual was high on my agenda. Speed was also an important factor for me.

Starting

I started by using the Mojolicious command line tool to scaffold a working application:

mojo generate app

This gave me the shape of the app; Mojo/Mojolicious gives a pretty standard MVC web application structure to work with. As someone familiar with Dancer2 it's not difficult to adjust. All your regular tools for handling GET and POST requests are there already. Mojo also comes with Morbo.

Morbo

Morbo is a development server that handles HTTP and has an almost mandatory feature for me... hot reloading. When you change some code, Morbo identifies that and reloads. This is pretty common in other languages and it's nice to have it baked into Mojo.
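Starting it is a one-liner; the script name is whatever mojo generate app created for you:

morbo ./script/my_app

The app is then served on http://localhost:3000 and restarts automatically whenever a watched file changes.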

Test::Mojo

Mojo comes with Test::Mojo, a really nice library to help you test the WebUI elements of your application. I like to try to build things in a Test Driven Development style, both in terms of unit tests AND integration style tests. Test::Mojo makes it easy to describe the web page I want to build and then confirm I have built it as I go.

It does not replace unit testing for business logic (I still use Test2::V0 for that). Test::Mojo provides a lovely out of the box, pre-wired web page testing tool. The ease of use as a developer is important to me, and frankly it makes a more TDD style of development more of a default option than if you have to wire things together yourself.
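A small sketch of what a Test::Mojo test looks like (the app class name and route here are assumptions based on this app, not its real test suite):

use Mojo::Base -strict;

use Test::More;
use Test::Mojo;

# Load the full application and make requests against it in-process
my $t = Test::Mojo->new('PlanetJudo');

$t->get_ok('/english')
    ->status_is(200)
    ->element_exists('h1')
    ->content_like(qr/judo/i);

done_testing();

The chained get_ok/status_is/element_exists calls read almost like a description of the page you want.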

XML

This was actually the hardest part of the puzzle. There are many options, and in the end I have been using XML::OPML to parse the lists of sites in each language. XML::Feed handles the reading and writing of the RSS feeds. Beyond this, I use HTML::Strip to clean the content from the external feeds, along with Text::Truncate to limit the text shown for each item on screen. Lastly, I am using Moo for the object oriented "business logic" part of the code.

Since starting (and in part from writing this tale) I see that Mojolicious has a "JSON and HTML/XML parser with CSS selector support", so maybe I don't need all of this? I shall need to explore.
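That parser is exposed as Mojo::DOM, so stripping and querying HTML might look like this (a sketch, not code from this app):

use Mojo::Base -strict;
use Mojo::DOM;

my $dom = Mojo::DOM->new('<div><p>First post</p><p>Second</p></div>');

# CSS selectors instead of hand-rolled stripping
say $dom->at('p')->text;                         # "First post"
say $dom->find('p')->map('text')->join(', ');    # "First post, Second"

It could plausibly replace HTML::Strip here, but I have not tried that yet.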

Code structure

The site is pretty simple at the moment; I have broken it into two parts. Feed fetching, parsing and aggregating is one part, starting with reading an OPML file and culminating in writing an RSS file.

The second part is the website that reads the RSS file and presents it on screen. Currently, the website does not fetch the feeds from the external sites; this is being done locally on my machine as a standalone script. This will probably change, though it does mean that the site is very simple currently.

Routes

The routing code is simple:


 $r->get('/')->to('Home#welcome');
 $r->get('/:lang')->to('Main#index');

The whole site is basically one route, where the parameter is the language we are serving, i.e. /english, /french or /spanish. The :lang is used within the code to decide which RSS file to read and display, so /english reads rss_english.xml and displays it. Currently the controller looks like this:


sub index ($self) {
    my $xml
        = XML::Feed->parse( 'public/rss_' . $self->param('lang') . '.xml' );
    my $opml_parser = XML::OPML->new;
    my $opml
        = $opml_parser->parse( 'public/' . $self->param('lang') . '.opml' );

    # Render template "example/welcome.html.ep" with message
    $self->render(
        lang    => ucfirst $self->param('lang'),
        rss_xml => $xml,
        opml    => $opml,
    );
}

As you can see, we read the OPML file (to show a list of sites down one side of the page) and the aggregated RSS feed, whose items come from the sites on that list.

Hidden complexity in the template

The controller is simple, in part because I have hidden a large amount of code in the template itself. The template engine is powerful enough to allow me to generate loops, strip HTML, truncate text, etc. This is convenient for me, but feels very wrong.
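To give a flavour, the template ends up with embedded Perl along these lines (a simplified sketch, not the real template):

% for my $entry (@{$rss_xml->entries}) {
  <h2><a href="<%= $entry->link %>"><%= $entry->title %></a></h2>
  <p><%= $entry->summary->body %></p>
% }

Looping, linking and trimming every item inside the .ep file works, but that logic probably belongs in the controller or a helper.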

Bootstrap

There is a little Bootstrap used to apply some design.

Summary

So for me Mojolicious has been enjoyable; well worth a try if you are looking for a developer focused web framework, no matter what language you are coming from. The tooling (Morbo and Test::Mojo) makes for an easy and reliable development process. The app structure is familiar to someone coming from almost any MVC framework (be that Dancer2, Ruby on Rails, Django or Express).

Give it a try.

Using Digital Ocean App service to run a Perl Mojolicious docker app


Recently I needed to renovate an old web project: an RSS aggregator site written in PHP in the early 2000s which, after not being maintained for about a decade, had finally died.

It is now a new Perl Mojolicious website, moved from cPanel hosting onto Digital Ocean's Docker "App" service. In this post I'd like to share some of the learnings and experiences of moving the app to Digital Ocean's service.

I am not going to cover the code side of the Mojo app in this tale/post, just the infrastructure side; mainly because I have been really impressed with the ease and usefulness of the Digital Ocean service.

Please note, I am NOT sponsored, or paid by Digital Ocean. This is just a service I have used and enjoyed.

Overview

I write changes in git -> push to GitHub -> Digital Ocean picks up the git change -> Digital Ocean builds the Docker image -> Digital Ocean runs the image (including HTTP).

More detailed explanation:

I used the basic Docker file that Mojo generates, with a couple of small modifications:

FROM perl
WORKDIR /opt/mojo-planetjudo
COPY . .
# XML::LibXML needs to be there before we
# try to install SimpleObject, Parser.
# TODO: See if parser needed here
# TODO: See why cpanm does not sort this out
RUN cpanm install -n XML::LibXML
RUN cpanm install -n XML::Parser
RUN cpanm install -n XML::SimpleObject
RUN cpanm --installdeps -n .
EXPOSE 3000
CMD ./script/planetjudo prefork

You can see that it's pretty basic. The only oddity is that I install a couple of modules before installing the rest via a cpanfile. That's just because they were not installing nicely for some reason. ;-)

The other thing I do is have a .dockerignore file; it only has two entries:


.git
local

This prevents Docker copying in the .git and local directories. .git is pretty self explanatory: you don't need it. local is where cpanm has installed my Perl modules; copying that in caused an issue I struck where the build tried to reuse those locally built modules and failed.

And that's about it!

Developer Workflow

The really nice thing with this setup is that my entire workflow consists of git add, commit, push.

After that Digital Ocean takes care of everything; the build/deploy is all done for me. It is currently a bit slower than I'd like (mainly cpanm, so cpm might be faster). But I love that I don't need to think about anything.

Niceties

Digital Ocean's app service takes care of not only the build, deploy and hosting of the app. It also provides console logs, graphs of things like memory and CPU usage and even a terminal I can use to interact with the app.

Cost

The smallest instance you can run is $5 USD per month, which is equivalent to their smallest "droplet" (virtual machine). If, like mine, it's a low traffic, resource light site, then it's a great price to pay for a simple, reliable solution.

Summary

Digital Ocean's app service is great for a Perl web developer. With virtually no effort we can get a Perl Mojolicious dockerised application up and running. No need to worry about pipelines or even SSL, it's all just taken care of for you. It does put you firmly in the "vendor lock-in" situation; but it's a pleasant trap to be in. We as Perl developers get to host our applications on a new service.

As a developer who works in Perl (as well as other languages), it's great to have tried this and felt it work really smoothly. It's sometimes not the case; sometimes you try a new approach or tool and Perl is not practical on it.

The only build and deploy step I need to make is git push, which is phenomenal. I didn't have to worry about LetsEncrypt or anything like that. Everything "just worked".

Give it a try fellow Perl devs, it's nice.

Renovating a CGI app talk


Recently I was invited to give a tech talk at the Southampton Perl Mongers group online event. It was great to spend some time with local and not so local (Hello to our new friends in Texas!) Perl users.

The talk was an abbreviated version of the talk I had planned to give at the German Perl Workshop 2021 but was unable to. In this post I want to share the content, plus some bonus content that probably should have been included.

Here is the video of the talk I made after the event:

Overview

The talk covered some experiences I've had renovating an old Perl CGI app I wrote in the late 1990s and early 2000s, which had been parked gathering dust until Christmas time 2020.

The 3 steps I described were:

  • Get it working
  • Tidy up
  • Modernise

Followed by some learnings and some advice.

Get it working

This seems obvious, but it made a big difference for me. Mixing making it work with making improvements is a mistake, I think. It's tempting, but just getting it to work will more than likely force changes in the code. Couple those changes with tidying up and you are more likely to break things.

One of the big challenges I found was (re)learning how the code works and what ideas and approaches were in use at the time. Just "getting it working" provides an opportunity to learn the general shape of the code and what it depends on to work, be that old CPAN modules, environment variables, files on disk or databases (and specific database versions).

You will have to change code even if you are just trying to get a working local version.

Getting it working TWICE is super valuable too; so both a local dev setup and a fresh server somewhere. Doing it twice helps find things that are not obvious. For me, setting it up on both a local ArchLinux machine and an Ubuntu server helped me identify differences.

In the talk, and afterwards in the discussion, the idea of writing down what you learn as you go along was covered. This is a skill and habit well worth developing if you are working on legacy code... or new code in fact. I kept notes on blog articles I read and followed. I scribbled notes on how to start/stop things. I absolutely had moments where I did something on Friday and come Monday had forgotten what I had done, why, or how I came upon the information that influenced what I tried. WRITE IT DOWN... it's worth it!

Plack::App::CGIBin

Specifically for me, I was working with an old set of .cgi files. They were previously running on a server under Apache. Locally, I did not have, nor want, that overhead. So I needed a tool called Plack::App::CGIBin, which allows you to simply point it at a directory of .cgi files and serve them as a Plack app via plackup. This was essential for my local development setup so I could see the working app once again.
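The whole local server can be a tiny app.psgi; a minimal sketch for a typical layout (the cgi-bin path is an assumption):

# app.psgi
use Plack::App::CGIBin;

# Serve every .cgi file under ./cgi-bin as one Plack app
Plack::App::CGIBin->new( root => './cgi-bin' )->to_app;

Then plackup app.psgi serves the old scripts on http://localhost:5000/.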

At this stage the home page loaded nicely, but not a lot else, as the CPAN modules were not on my system.

Carton and cpanfile

Managing the dependencies the app had was important (especially with an older application, where breaking changes can often appear). My choice here is to use the tool carton which reads a cpanfile and installs the CPAN modules locally (into a directory called local in the working directory). Then you can run with these specific CPAN modules using carton exec.

When I was trying to get the application up and running, I had some issues, and it was really helpful to have two copies of the source code in different directories and be able to use carton to run different versions of the same CPAN modules. This helped me identify breaking changes. Not having to rely on (or mess with) system wide CPAN modules was/is really valuable.
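In practice that meant each checkout was self-contained, something like:

carton install                   # install the versions from this copy's cpanfile into ./local
carton exec plackup app.psgi     # run the app against exactly those modules

so the two copies of the code could happily run different module versions side by side.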

Hard coding and deleting

As I got to understand the code better while getting it working, it became clear that some things were not worth retaining. So the delete key was a really effective tool for getting the application working. The other trick I used was to simplify the problem by hard coding some variables that were originally designed to be more flexible but generated complexity.

Deleting code and hard coding things helped get the app to the state that it "worked" again. It was not 100% of the functionality restored; that in itself was a great learning experience. It's easy to think that all the features the legacy code had in place mattered and/or worked.

Tidy up

Once the application was working, the next phase was to tidy the code in advance of planned modernisation.

I feel it's valuable to separate the tidy up from the modernisation, though I find that the act of tidying up involves some modernisation anyway. What I mean here is that I knew at this stage that I wanted to replace the data handling part of the code, but decided to tidy up first, knowing it would involve some changes to the code AND some collateral modernisation. Trying to modernise during the clean up would have made the task more difficult.

Perltidy

I suspect most Perl developers have used Perltidy and are familiar with the way it formats source code in a consistent manner. When picking up a legacy code base it's really helpful to find a moment to perltidy everything. This will make the existing code look more familiar (especially for those of us who use Perltidy a lot and are used to the style choices). I tend not to customise the perltidy settings too much, leaving them pretty much on defaults.

Perlcritic

Another well used tool, perlcritic allows you to automate some stylistic decisions that are regarded as "Best Practices"; specifically, the standards from the book "Perl Best Practices". I am not saying that all the PBP standards are ones I follow, but I appreciate the standardisation it offers. It is a wonderful tool to help shape a legacy codebase. It helped me identify some common things to improve (two-argument to three-argument file opens, for example).
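For example, the two-argument open it flagged looks like this next to the safer three-argument form (snippet invented for illustration):

# Two-argument open: the mode and the filename are mixed in one string
open my $fh, ">$filename" or die "Can't open: $!";

# Three-argument open: the mode is separate, so an odd filename can't change it
open my $fh, '>', $filename or die "Can't open: $!";

The relevant policy is InputOutput::ProhibitTwoArgOpen.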

WebService::SQLFormat

SQL is a language in itself, so as with the Perl, I think it's valuable to have some consistent formatting of the SQL in the app. I didn't use an ORM when the app was first written, and broadly I find the benefits of writing SQL outweigh the benefits of an ORM... your mileage may vary.

I started out using a website and copying and pasting SQL back and forth. Later I identified the WebService::SQLFormat module and was able to write a small script that formats my SQL in a more automated fashion.

As with Perltidy, I don't necessarily agree with or like all the formatting choices it makes. But I value the consistency and ease of automation more than my aesthetic preferences.

Human Eye

A trick I applied to this code base is simply zooming out my editor so that the code is tiny and I can see the "shape" of the code.

It's a remarkably effective way of "seeing" code smells. The easiest to describe is complexity added by layers of loops or if statements: you can easily see multiple levels of indentation and tell that something there is complex. This is helped by having previously run Perltidy; so do that first.

You can also spot large/long subs and areas of dense code.

Zooming in on the areas that from "30,000 feet" look wrong, you can make quick gains by tackling these areas, then zooming back out to find what else looks problematic.

I did use some other tools to help with this, like trying to measure cyclomatic complexity or file size. Frankly, though they were helpful, the human eye and mind are exceptionally good at seeing patterns, and I got more benefit from this trick than from the tools.

Modernise

Having tidied up the code I was in a position to modernise, starting with introducing Dancer2 as my web framework (Mojo or Catalyst might have been your choice... I went with Dancer2 as it's a tool I know well).

Having created the basics, the next stage was a cut and paste exercise of moving code out of CGIs and into routes. I was fortunate that I had used HTML templates originally, so I was saved the pain of breaking the HTML out of the code... many of us have been there and it's not fun.

Database change

The original code used a module called DBD::AnyData, which was handy at the time (it's deprecated now). I used it to read and write CSV files using SQL statements. Yes, really. It was a mad decision to use CSV files as the data store for the app; but in terms of modernisation it was fortunate, as it meant I had not written code to read/write data to files directly. I had written SQL inserts and selects, which meant it was comparatively easy to migrate the app to Postgres.

I did need to write the schema creation etc.

Database Migrations

The schema was "OK" but not perfect, and as I tidied more and had to make more changes I became annoyed with destroying and recreating the database each time. I explored using migration tools like sqitch, but ended up quickly writing a migration tool in the admin area of the app that applied SQL statements in numerical order from a directory of .sql files (with a migration level being stored in the database to prevent re-applying the same changes). Not sophisticated... but it works and was the simplest solution at the time.
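The heart of that home-grown tool is only a few lines. A minimal sketch of the idea (table and directory names are invented; this is not the app's actual code):

use DBI;

my $dbh = DBI->connect( 'dbi:Pg:dbname=myapp', 'myuser', 'mypass',
    { RaiseError => 1 } );

# A single-row table records the highest migration level applied so far
my ($level) = $dbh->selectrow_array('SELECT level FROM migration_level');

# Zero-padded filenames (001_..., 002_...) keep the string sort numerical
for my $file ( sort glob 'migrations/*.sql' ) {
    my ($num) = $file =~ /(\d+)/;
    next if $num <= $level;

    open my $fh, '<', $file or die "Can't read $file: $!";
    my $sql = do { local $/; <$fh> };

    $dbh->do($sql);
    $dbh->do( 'UPDATE migration_level SET level = ?', undef, $num );
}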

Docker

Initially I had a local installation of Postgres, but I work from multiple machines and quickly moved to using a dockerised installation of Postgres to simplify my development cycles.
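Something along these lines (container name, password and version are arbitrary):

docker run -d --name planetjudo-pg \
    -e POSTGRES_PASSWORD=devpassword \
    -p 5432:5432 \
    postgres:13

gives every machine an identical, disposable Postgres on localhost:5432.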

Following this I added a Perl container to run the app itself.

Learnings

In preparing and giving the talk I was struck by how much of what mattered to me was not the "technical" parts but the "human" parts.

  • Write it down This ended up being more important than pretty much anything else. Having notes on what I was doing and why proved important when I took breaks from the code and came back to it. When I did this poorly, I would come back and not recall what specific things I was trying to do and why. This was really prevalent during the Postgres changes. I am not a Docker guru and followed several guides; at least once I came back the next day and got lost, as I had not taken notes on which article I had read and what it taught me.

  • Automate it Perltidy, Perlcritic, etc. The more I was able to automate, the easier it became to use the tools and to remain consistent. This is true also of deploying the code. A "pipeline", be it GitHub Actions or a bash script, makes life so much easier, and the easier the process the better. My mind could stay on the code and the problems, not on how I got this deployed. So take those moments to automate things.

  • Do the simple things first It's tempting to get stuck in and tackle hard problems at the start. I think this is a mistake. By starting with small, simple things you gain familiarity with the code and the "domain". You discover the hidden complexity before you get deep into hard problems. This way, when you get to the hard problems, your familiarity is high and you've discovered many of the complexities already. So do those simple things; it's why we often give the new member of a development team the simple "fix the typo in the template" ticket, right? Simple tasks mean you run through all the steps sooner and resolve the things that are not obvious.

  • Legacy code is a great place There is enjoyment to be had in an older code base. Unlike writing from new, a legacy code base is nuanced. You often have multiple ideas spanning the code base. As a developer touching legacy code you have the opportunity to discover how it was built, how it was changed, and why. This is satisfying work; learning how to code in the "old" style can be fun. Moving code from old to new can also be satisfying. Finding the multiple styles of a code base and bringing them into alignment is a skill in itself, and one that deserves more highlighting.

Legacy code has made many decisions already, so often the "paralysis by analysis" problem is avoided: you are stepping into a situation where decisions have already been made. Another skill is understanding the constraints and mindsets that resulted in the code looking as it does. The old adage that the people who wrote it were doing the best they could given the situation is valuable to keep in mind.

Legacy code is important; it exists and did something. New code is a gamble in a way: it's not proven, it's not been used. So legacy code, and maintaining it, is a vital part of our industry, and we need more people to value it.

  • Perl's Legacy Working in multiple languages, it's interesting to see the influence Perl has had on newer languages.

For example, Go has great testing tools and formatting out of the box. JavaScript's NPM is CPAN. PHP is getting better and has better tooling than it once did. Raku, oddly given its lineage, does not have great tooling; there is no Rakucritic or Rakutidy. Go in many ways shows more influence of legacy Perl. Go is arguably built with legacy in mind: it's intentionally constrained and comes with tools to help the legacy code age well.

Summary

I'll try to record myself giving the talk and share it on the site. Till then, thanks for reading along.

Perl flexibility for the win


This week I spent some time on the Weekly Challenge 119 which is always a good opportunity to get some deliberate practice in.

This week I completed task one, "Swap Nibbles" in both Perl and GoLang.

Now I know I am biased and my main language is Perl, but this was definitely easier/faster than doing it in GoLang.

The point of the exercise is to take a decimal integer, convert it to its binary representation, swap the two "nibbles" and return the resulting decimal integer.

Even with my verbose style of coding these things, my solution was this:


sub swap {
    my ( $self, $n ) = @_;

    # Format $n as an 8-bit binary string, e.g. 101 -> "01100101"
    my $bin = sprintf( "%08b", $n );

    # Capture the high and low nibbles (4 bits each)
    $bin =~ /^(.{4})(.{4})$/;

    # Reassemble with the nibbles swapped and convert back to decimal
    return oct "0b$2$1";
}

I streamed myself coding this up on the Perl.Kiwi Twitch Channel which is also on YouTube:

Perl's ability to let me take a variable $n and treat it like an integer, plus the awesome power of Perl's regex engine, makes it simple to swap the nibbles. Then the return statement auto-magically combines the strings captured by the regex into another string, and oct treats it like a number and converts it to decimal.

It's helpful in a way that non-Perl people can find disconcerting. It's hugely flexible and useful. It's also one of the reasons people hate Perl; it is super easy to write code that is buggy as things are "fluid".

My second attempt was in GoLang:


func NSwap(n int) int {
    // Format n as an 8-bit binary string, e.g. 101 -> "01100101"
    bin := fmt.Sprintf("%08b", n)

    // Slice out the high and low nibbles
    a := bin[0:4]
    b := bin[4:8]

    // Concatenate them in swapped order
    c := b + a

    // Parse the binary string back into an integer
    d, _ := strconv.ParseInt(c, 2, 64)
    return int(d)
}

It's not too dissimilar to be honest, probably influenced by my Perl implementation (and certainly not idiomatic Go). It's not much more complex and follows a similar style of approach. However the conversion steps are more distinct as GoLang will not allow different types to interact in the way Perl does.

The second task was another example of how Perl's flexibility can make it very easy to solve problems quickly. In this task, you need to create a sequence of numbers using just the digits 1, 2 and 3, in which the digit 1 never appears next to itself (so 11 is excluded, and 211 is excluded).

My implementation looks like this:


sub no_one_on_one {
    my ( $self, $n ) = @_;

    my @seq;

    my $x = 0;
    while (1) {
        $x++;
        next unless $x =~ /^[123]/;    # must start with 1, 2 or 3
        next if $x =~ /[4567890]/;     # only the digits 1, 2 and 3 allowed
        next if $x =~ /11/;            # no 1 next to another 1
        push @seq, $x;
        last if @seq > $n - 1;
    }
    return $seq[-1];
}

Again, Perl's regex engine for the win. And unless is super helpful here; it makes the code easy to read.

Rather than trying to construct a sequence that obeys the rules, I just loop around, incrementing $x by one each time. I skip to the next iteration if the number does not start with one of our digits (1, 2, 3). We skip if any digit other than 1, 2 or 3 appears. Finally, we skip if "11" appears.

If we get past the "next" tests, we push the number onto an array. If that array is long enough to match the limit the user input ($n), we exit the loop with last and return the final element in the array, $seq[-1].

I should probably point out here that I am not criticising Go; it's a great language. But Perl is a great language too. And much of the time I prefer Perl over Go, Elm, Raku, PHP, etc. Why? Mainly familiarity, if I am being honest with you. It's a really mature, battle tested language. Perl is a general purpose language designed for flexibility and expressiveness. It is not the fastest (or slowest); it has some weaknesses and strengths... as do all programming languages.

Perl is designed to be the "swiss-army chainsaw" a developer reaches for when they have a problem to solve. Go, on the other hand, is designed to create "simple, reliable and efficient software".

The next post might be on last week's challenge; I did take a crack at task #2 and it was really interesting to solve, and to do in a specific way.