Tatsuhiko Miyagawa's Blog

Perl Toolchain Summit 2018

April 25, 2018

I attended the Perl Toolchain Summit 2018 in Oslo.

It’s a unique event with lots of Perl hackers who work on the Perl toolchain, testing tools, core support and ecosystem services such as PAUSE and MetaCPAN. They get together once a year and hack on these things in the same room.

Day 0: Monday to Wednesday

I spent a few days in Stockholm as a stopover on this trip. It was my first time visiting both Stockholm and Oslo, and they’re both beautiful, a bit cold, expensive but really nice. I also enjoyed that contactless payment via NFC is literally *everywhere*, and it was really hard to use the cash that I had withdrawn just as a backup. Maybe more on that in another post.

Viking ship museum in Oslo

Day 1: Thursday

I started the day off by merging a few PRs for HTTP::Tinyish from Shoichi Kaji (SKAJI). HTTP::Tinyish is a wrapper module that lets you use curl, wget, LWP and HTTP::Tiny transparently with the same API. You might wonder why you would need the wrapper when you can fatpack and fall back to HTTP::Tiny anyway. The reason is to support HTTPS (TLS) requests on a stock perl, which doesn’t ship with an HTTP client that has TLS capabilities. This will become more important in the coming years as more websites enforce TLS, although I believe the PAUSE/CPAN websites will allow non-TLS requests for a while.

Similar code has been in cpanm for a long time, and this is the first time this kind of utility has been extracted into its own module.

Next, I moved on to the outstanding pull requests and bug fixes for cpanm. At this point I was maintaining two different code bases: the cpanminus-1.7 (devel) branch and the Menlo 2.0 (menlo) branch.

They have very similar code because one is a copy of the other, and they provide the same functionality. Because they live in different branches of the same repository, every change has to be applied to both branches, or merged from one to the other, with a lot of conflicts to resolve on every merge.

This is obviously painful, so I decided to split the repository into two repos: miyagawa/cpanminus and miyagawa/Menlo. At the same time, I removed the App::cpanminus package and started to use Menlo::CLI::Compat from cpanm, so that I don’t need to maintain two different packages for future bug fixes. (Spoiler alert: I thought this was a good idea, but it turned out not to be. More on that later.)

I also updated cpanm’s fatpacking tool to use Carton to get the dependencies we want. Previously it used App::FatPacker’s trace functionality, which works most of the time but is painful to get working when your perl has an unclean site_perl directory, because of the side effects of loading modules from there.

This is the fun part of bootstrapping: when you think about it, we’re building the next release of cpanm using Carton, which itself relies on the current release of cpanm.

Day 2: Friday

On Friday I continued most of the release-engineering work from Day 1.

First of all, the split of repositories for cpanminus and Menlo done on Day 1 meant that I now had to commit, merge and release from two different repositories for every update. This is slightly annoying, and it confuses contributors opening an issue or pull request on GitHub, because it’s unclear in which repository something has to be fixed.

I quickly decided to move Menlo back into the cpanminus repo, but as a subdirectory. Basically this makes it a monorepo with a subdirectory for each distribution. It turns out that this gives the best of both worlds: commits can be made across multiple dists at the same time, while each distribution can still be released from its own directory.

In the afternoon I managed to implement a long-awaited feature in cpanfile: ‘dist’, ‘mirror’ and ‘url’ support. It was once added in Carton in its 1.1 branch, but the complexity of the implementation made me abandon it.

This time, the patch against cpanm for this feature is really clean and simple. I learned from past mistakes and decided not to DWIM on the handling of these values: dist only takes CPAN dist names such as MIYAGAWA/Plack-1.000.tar.gz, mirror lets you specify your DarkPAN URL, and url takes an arbitrary full URL.

requires 'Path::Class', 0.26,
    dist => "KWILLIAMS/Path-Class-0.26.tar.gz";

# omit version specifier
requires 'Hash::MultiValue',
    dist => "MIYAGAWA/Hash-MultiValue-0.15.tar.gz";

# use dist + mirror
requires 'Cookie::Baker',
    dist => "KAZEBURO/Cookie-Baker-0.08.tar.gz",
    mirror => "http://cpan.cpantesters.org/";

# use the full URL
requires 'Try::Tiny', 0.28,
    url => "http://backpan.perl.org/authors/id/E/ET/ETHER/Try-Tiny-0.28.tar.gz";
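
To make the relationship between the three keys concrete, here is a rough sketch of how a dist name and a mirror could be expanded into a download URL. This is not the actual Menlo code: resolve_dist_url is a hypothetical helper, the fallback mirror is an assumption, and the only thing it relies on is the standard authors/id/A/AU/AUTHOR/ layout that is also visible in the full URL example.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not part of cpanm or Menlo.
# Turns the cpanfile options (dist, mirror, url) into one download URL.
sub resolve_dist_url {
    my (%opts) = @_;

    # A full URL is used as-is, with no guessing.
    return $opts{url} if defined $opts{url};

    # Nothing pinned: let the normal index lookup happen.
    my $dist = $opts{dist} or return;

    # dist must look like AUTHOR/Dist-Name-1.00.tar.gz
    my ($author) = $dist =~ m{^([A-Z0-9-]+)/}
        or die "not a CPAN dist path: $dist\n";

    # Assumption: fall back to a public mirror when none is given.
    (my $mirror = $opts{mirror} || 'https://www.cpan.org') =~ s{/+\z}{};

    # Standard CPAN layout: authors/id/K/KA/KAZEBURO/...
    return join '/', $mirror, 'authors/id',
        substr($author, 0, 1), substr($author, 0, 2), $dist;
}

print resolve_dist_url(
    dist   => "KAZEBURO/Cookie-Baker-0.08.tar.gz",
    mirror => "http://cpan.cpantesters.org/",
), "\n";
# prints http://cpan.cpantesters.org/authors/id/K/KA/KAZEBURO/Cookie-Baker-0.08.tar.gz
```

The point of the sketch is that each key has exactly one meaning, so there is nothing to guess: url wins outright, and mirror only ever changes the host part of a dist path.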

In principle I hesitate to add this kind of new feature to cpanm itself, but the argument here is that a) the patch is relatively straightforward and optional, and b) users are already abusing this by specifying the URL in the requires argument:

requires "http://host/path/Foo-Bar-1.00.tar.gz";

which, I hate to admit, accidentally works, so the feature is not really new, but is a cleaner upgrade. Also, implementing it in cpanm (or Menlo in this case) means the downstream clients such as Carton, Carmel and App::cpm will all get this feature for free, with their ability to override it if needed.

PAUSE hackers

Day 3: Saturday

On Saturday I took a bit of a break and went on a hike to the lake, which is pretty easy to do in Oslo with just a 30-minute metro ride.

In the evening I came back to the hackathon and added support for x_use_unsafe_inc in META.json after discussing it with ETHER, HAARG and LEONT.

cpanm by default adds PERL_USE_UNSAFE_INC=1 when configuring, building and testing modules so that it can install distributions that have not been updated since perl 5.26 removed “.” (current directory) from the library include path.

Some authors want to disable this: they don’t want perl to load modules from an arbitrary directory after chdir-ing into it, so that their tests can reveal bugs if the module relies on that behavior. Now you can do that by declaring:

"x_use_unsafe_inc": "0"

This is of course opt-in and configured per distribution. For everything else cpanm will continue to set PERL_USE_UNSAFE_INC=1, unless the variable is already set to a different value in the user’s shell.
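
As a rough sketch of the decision described above (not the code that is actually in cpanm), the value could be picked like this. The helper name unsafe_inc_value is hypothetical, and the precedence order (user’s shell first, then the distribution’s metadata, then the default of 1) is my reading of the behavior rather than a quote from the implementation.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper for illustration; not cpanm's actual code.
# Decides which PERL_USE_UNSAFE_INC value to use for one distribution.
sub unsafe_inc_value {
    my ($meta, $env) = @_;

    # Assumption: an explicit setting in the user's shell always wins.
    return $env->{PERL_USE_UNSAFE_INC}
        if defined $env->{PERL_USE_UNSAFE_INC};

    # The author opted in or out via META.json ("0" is false in Perl).
    return $meta->{x_use_unsafe_inc} ? 1 : 0
        if exists $meta->{x_use_unsafe_inc};

    # Default: keep "." in @INC so that older distributions still install.
    return 1;
}

my $meta = JSON::PP->new->decode('{ "name": "Foo-Bar", "x_use_unsafe_inc": "0" }');

print unsafe_inc_value($meta, {}), "\n";                             # 0: author opted out
print unsafe_inc_value({},    {}), "\n";                             # 1: default
print unsafe_inc_value($meta, { PERL_USE_UNSAFE_INC => 1 }), "\n";   # 1: shell wins
```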

Day 4: Sunday

Sunday was the last day and I should have been wrapping up, but I basically continued working on the remaining stuff, and finally fixed a Win32 cmd quoting issue that had been in cpanm from day 1.

Essentially on Win32 you can’t rely on system() and pass arguments in a list, and you have to use modules like Win32::ShellQuote to quote them by hand. cpanm was using its own which function to get the command path name, and added quotes in this command output. This is wrong, and could cause a double escaping issue if you pass an already quoted command to Win32::ShellQuote.

SKAJI set up an AppVeyor CI repository for me to test the new Menlo-based cpanm on Windows, and I was able to refactor the way we execute shell commands so that it works correctly on both UNIX and Win32, even when the command path or file names include spaces.
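
To illustrate the double-escaping problem, here is a small self-contained sketch. naive_quote is a deliberately simplified stand-in that I made up for this example; it is not how Win32::ShellQuote works internally, and the paths are hypothetical. It only shows what happens when a command that already carries quotes is quoted a second time.

```perl
use strict;
use warnings;

# Deliberately simplified quoter, for illustration only. Real Windows
# quoting has more rules (backslashes, cmd.exe metacharacters), which is
# why a dedicated module should do this job.
sub naive_quote {
    my ($arg) = @_;
    (my $escaped = $arg) =~ s/"/\\"/g;   # escape embedded double quotes
    return qq{"$escaped"};
}

# Hypothetical results of a "which"-style lookup:
my $raw_path    = 'C:\Program Files\Perl\bin\perl.exe';   # plain path
my $quoted_path = qq{"$raw_path"};                        # quotes already added

# Quoting once, at the point where the command line is built, is fine.
my $good = join ' ', map { naive_quote($_) } $raw_path, '-v';

# Quoting an already quoted path escapes the existing quotes, so they
# end up as literal characters in the program name.
my $bad = join ' ', map { naive_quote($_) } $quoted_path, '-v';

print "$good\n";
print "$bad\n";
```

The first line printed is a usable command line, while the second contains escaped quote characters around the path. Keeping the lookup function quote-free and quoting in exactly one place avoids this.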

I also discussed some quick updates about the CPAN static installer with LEONT and ETHER. The basic version of the implementation, which is essentially a port of Module::Build::Tiny, is now in cpanm. This is also an opt-in for CPAN authors: by default cpanm will continue to configure, build and test modules using the standard tools, unless static install is explicitly declared in META.json with x_static_install: "1".

(Right now there’s no way for users to turn this feature off, but in case it blows up with some buggy distributions, we might need an option to disable it globally.)

RJBS and xdg

After the Summit

After the summit, at the hotel and the airport lounge, I continued investigating the effects of merging cpanm and Menlo. The most striking fact for me was that App::cpm uses and depends on a lot of Menlo::CLI::Compat internals. This makes it difficult, or nearly impossible, for me to change or refactor the internals of Menlo without breaking downstream clients every time.

Even though SKAJI agreed to patch his code whenever Menlo::CLI::Compat is updated, we decided this is not a great arrangement for either the maintainer (me) or the downstream consumers.

I also know that there are gross hacks out there in the wild that mess with cpanm internals, or parse its log files or stdout/stderr output, because frankly there has been no other way to extend cpanm. I would love to refactor these internals and make them better, but if that breaks these users it might not be worth the hassle.

As of this writing, I have created a third distribution, Menlo-Legacy, which contains a Menlo::CLI::Compat that’s compatible with cpanm 1.7. This is what gets fatpacked into cpanm itself, and it can be used by tools like Carton, Carmel or cpm. Menlo will gain more features, as well as a lot of refactoring and cleanups, while the downstream clients can continue using this legacy module and upgrade to the newer version whenever they’re ready.

Wrap Up

Working on the toolchain is a hard job, since you have to support all the old versions of the software, and you will be blamed and criticized for “fixing broken features” because people are relying on these broken features.

This is why the Perl Toolchain Summit is so valuable: I can get together with these people, exchange ideas and get great moral support.

xdg, sjn and stigo

Thanks to NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services and Oetiker+Partner for sponsoring, and to sjn & stigo for organizing this great event.