Building a simple RSS aggregator with Perl and Mojolicious to replace a PHP app

In the last tale I shared a little of my experience of using the Digital Ocean app services to build, deploy and host a simple web application.

In this tale, I'd like to summarise the small application I built.

Introduction

The application I am replacing is an old (early 2000s) PHP application. It aggregates RSS feeds from a variety of Judo blogs and produces both a web presentation and a new RSS feed of the combined RSS items. The old site has limped along with little modification since the early 2010s. It did well until the hosting platform it was on (cPanel) upgraded PHP and broke the site.

There were two specific hacks that both helped and hindered. First, the web pages were being written to static HTML files by a cron job that ran the PHP. Second, to address the multi-lingual need, the site had three installations of the same code running against different OPML files of sites. The static files actually meant the site stayed "up" when the PHP upgrade broke the app.

New Version

I decided to build the site using Mojolicious. I've done a lot of work with Dancer2 in the past, so this was a good opportunity to build something real with Mojo.

Making the site properly multi-lingual was high on my agenda. Speed was also an important factor for me.

Starting

I started by using the Mojolicious command line tool to scaffold a working application:

mojo generate app

This gave me the shape of the app; Mojo/Mojolicious gives a pretty standard MVC web application structure to work with. As someone familiar with Dancer2 it's not difficult to adjust. All your regular tools for handling GET and POST requests are there already. Mojo also comes with Morbo.

Morbo

Morbo is a development server that handles HTTP and has an almost mandatory feature for me... hot reloading. When you change some code, Morbo identifies that and reloads. This is pretty common in other languages and it's nice to have it baked into Mojo.

Test::Mojo

Mojo comes with Test::Mojo, a really nice library for testing the web UI elements of your application, and it works well. I like to build things in a Test Driven Development style, with both unit tests and integration-style tests. Test::Mojo makes it easy to describe the web page I want to build and then confirm I have done it as I build it.

It does not replace unit testing for business logic (I still use Test2::V0 for that). Test::Mojo provides a lovely out-of-the-box, pre-wired web page testing tool. That ease of use is important to me as a developer, and frankly it makes a TDD style of development more of a default option than if you have to wire things together yourself.
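
To give a flavour of what that looks like, here is a minimal sketch rather than a test from the real site. The tiny Mojolicious::Lite app and its heading text are hypothetical stand-ins so that the example runs on its own; the Test::Mojo calls (get_ok, status_is, text_is) are the parts that matter.

```perl
use Mojo::Base -strict;
use Test::More;
use Test::Mojo;
use Mojolicious::Lite -signatures;

# Hypothetical stand-in app: one route that echoes the language in a heading
get '/:lang' => sub ($c) {
    my $heading = ucfirst( $c->param('lang') ) . ' Judo blogs';
    $c->render( inline => '<h1><%= $heading %></h1>', heading => $heading );
};

# Test::Mojo drives the app in-process, no running server needed
my $t = Test::Mojo->new;

# Describe the page we expect: a 200 response with the right <h1> text
$t->get_ok('/english')->status_is(200)->text_is( h1 => 'English Judo blogs' );
$t->get_ok('/french')->status_is(200)->text_is( h1 => 'French Judo blogs' );

done_testing;
```

Each call returns the Test::Mojo object, so the checks chain naturally, and text_is takes a CSS selector to pick out the element being checked.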

XML

This was actually the hardest part of the puzzle. There are many options, and in the end I have been using XML::OPML to parse the lists of sites in each language. XML::Feed handles the reading and writing of the RSS feeds. Beyond this, I use HTML::Strip to clean the content from the external feeds, along with Text::Truncate to limit the text shown for each item on screen. Lastly, I am using Moo for the object oriented "business logic" part of the code.
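
As a rough illustration of the clean-up step (a sketch with a made-up snippet of feed content and an arbitrary 40 character limit, not the code from the site), HTML::Strip and Text::Truncate combine like this:

```perl
use strict;
use warnings;
use HTML::Strip;
use Text::Truncate qw(truncstr);

# Made-up item content of the kind an external feed might contain
my $raw = '<p>Report from the <b>national championships</b>, '
        . 'with photos and full results.</p>';

# Remove the markup, then tidy the whitespace left behind
my $hs    = HTML::Strip->new;
my $plain = $hs->parse($raw);
$hs->eof;    # reset the parser before it is reused for another item
$plain =~ s/\s+/ /g;
$plain =~ s/^\s+|\s+$//g;

# Limit the text to 40 characters (an assumed limit), including the
# continuation marker that truncstr appends
my $summary = truncstr( $plain, 40 );

print "$summary\n";
```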

Since starting (and in part from writing this tale) I see that Mojolicious has "JSON and HTML/XML parser with CSS selector support." so maybe I don't need all this? I shall need to explore.
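
I have not switched over, but as a first exploration, a sketch along these lines shows Mojo::DOM pulling titles and links out of RSS with CSS selectors. The feed here is an inline, made-up sample. XML mode is switched on explicitly so that element names such as link are not treated with HTML rules.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Made-up sample feed standing in for a real blog's RSS
my $rss = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample Judo Blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>
XML

# Parse in XML mode, then walk the items with CSS selectors
my $dom = Mojo::DOM->new->xml(1)->parse($rss);

for my $item ( $dom->find('channel > item')->each ) {
    my $title = $item->at('title')->text;
    my $link  = $item->at('link')->text;
    say "$title <$link>";
}

# Collections can also be mapped straight to a plain list
my @titles = $dom->find('item > title')->map('text')->each;
```

Whether this could replace XML::Feed entirely is a separate question, since XML::Feed also deals with dates and the differences between feed formats.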

Code structure

The site is pretty simple at the moment. I have broken it into two parts. The first part is feed fetching, parsing and aggregating, starting with reading an OPML file and culminating in writing an RSS file.
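
The aggregating step can be sketched with XML::Feed alone. This is a simplified illustration and not my actual script: the two source feeds are inline, made-up strings rather than feeds fetched from the sites in the OPML file, and the combined feed's title and link are placeholders.

```perl
use strict;
use warnings;
use XML::Feed;

# Two made-up source feeds standing in for fetched blog feeds
my $feed_a = <<'XML';
<?xml version="1.0"?>
<rss version="2.0"><channel><title>Blog A</title>
<item><title>First post</title><link>https://example.com/a/1</link></item>
</channel></rss>
XML

my $feed_b = <<'XML';
<?xml version="1.0"?>
<rss version="2.0"><channel><title>Blog B</title>
<item><title>Second post</title><link>https://example.com/b/1</link></item>
</channel></rss>
XML

# Start an empty RSS feed to hold the combined items (placeholder metadata)
my $combined = XML::Feed->new('RSS');
$combined->title('Combined Judo feed');
$combined->link('https://example.com/');

for my $source ( $feed_a, $feed_b ) {

    # parse() accepts a reference to a string of XML; skip feeds that fail
    my $feed = XML::Feed->parse( \$source );
    unless ($feed) {
        warn 'Skipping feed: ' . XML::Feed->errstr . "\n";
        next;
    }

    # splice() copies across entries not already in the combined feed
    $combined->splice($feed);
}

my @entries = $combined->entries;
my $out     = $combined->as_xml;
print $out;
```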

The second part is the website that reads the RSS file and presents it on screen. Currently, the website does not fetch the feeds from the external sites. This is being done locally on my machine by a standalone script. That will probably change, though it does mean that the site is very simple currently.

Routes

The routing code is simple:


 $r->get('/')->to('Home#welcome');
 $r->get('/:lang')->to('Main#index');

The whole site is basically one route whose placeholder is the language we are serving, i.e. /english, /french or /spanish. The :lang is used within the code to decide which RSS file to read and display. So /english reads rss_english.xml and displays it. Currently the controller looks like this:


sub index ($self) {
    my $xml
        = XML::Feed->parse( 'public/rss_' . $self->param('lang') . '.xml' );
    my $opml_parser = XML::OPML->new;
    my $opml
        = $opml_parser->parse( 'public/' . $self->param('lang') . '.opml' );

    # Render the template with the language name, the RSS feed and the OPML site list
    $self->render(
        lang    => ucfirst $self->param('lang'),
        rss_xml => $xml,
        opml    => $opml,
    );
}

As you can see, we read the OPML file (to show a list of sites down one side of the page) and the aggregated RSS feed (to show the items from the sites on that list).
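
Because :lang ends up in a file name, one option I may adopt is to limit it to the languages that actually exist. The following is a sketch of that idea as a stand-alone Mojolicious::Lite script, not the site's code: it uses a restrictive placeholder so that any other value is a 404, and for illustration it just renders the file name it would read.

```perl
use Mojolicious::Lite -signatures;

# Only these three values match the :lang placeholder; anything else
# falls through to a 404 before the action is ever called
get '/:lang' => [ lang => [qw(english french spanish)] ] => sub ($c) {
    my $lang = $c->param('lang');

    # Illustration only: show which RSS file this language maps to
    $c->render( text => "public/rss_$lang.xml" );
};

app->start;
```

Saved as a script and run under Morbo, /english responds with public/rss_english.xml, while an unknown language is not routed at all.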

Hidden complexity in the template

The controller is simple, which is in part because I have hidden a large amount of code in the template itself. The template engine is powerful enough to let me write loops, strip HTML, truncate text and so on. This is convenient for me, but feels very wrong.

Bootstrap

There is a little Bootstrap used to apply some design.

Summary

So for me Mojolicious has been enjoyable. It is well worth a try if you are looking for a developer focussed web framework, no matter what language you are coming from. The tooling (Morbo and Test::Mojo) makes for an easy and reliable development process. The app structure is familiar to someone coming from most any MVC framework (be that Dancer2, Ruby on Rails, Django or Express).

Give it a try.