Oh, The Huge Manatee

A blog about technology, open source, and the web... from someone who works with all three.

An Open Letter to My MEPs About Article 17 (Formerly Article 13)

The proposal for a directive on copyright in the digital single market is disastrous for the EU economy, culture, and democracy in the digital world. It is particularly bad for my country, Germany, a leading light in Europe in all three areas. I am writing to all of my MEPs listed in support of this impossibly bad proposal.

The German and European economies would be terribly damaged by this article, which effectively rules out small and medium-sized competition in favor of the largest incumbents. I work for Microsoft on precisely the kind of machine-understanding tasks involved in the copyright filter requirement. I can tell you with authority: it is an impossible task that only the deepest pockets can approach. Artic…

BTRFS and Free Space - Emergency Response

I run BTRFS on my root filesystem (on Linux), mostly for the quick snapshot and restore functionality. Yesterday I ran into a common problem: my drive was suddenly full. I went from 4GB of free space on my system drive to 0 in an instant, causing all sorts of chaos on my system.

This problem happens to lots of people because, on BTRFS, the reported “free space available” does not map linearly to how much data you can still write. There are a few concepts that get in the way:

  • Compression: BTRFS supports compressing data as it writes. This changes the amount of data that can be stored: 50MB of text may take only 5MB of “room” on the drive.
  • Metadata: BTRFS stores your data separately from metadata. Both data and metadata occupy “space”.
  • Chunk allocation: BTRFS allocates space for your data in chunks.
  • Multiple devices: BTRFS supports multiple devices working together, RAID-style. That means there’s extra information to store for every file. For example, RAID-1 stores two copies of every file, so a 50MB file takes 100MB of space….
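The arithmetic in the list above can be sketched in a few lines of Python. This is only a rough illustration, not how BTRFS itself accounts for space: the function name `raw_usage_mb` and its parameters are hypothetical, and the model deliberately ignores metadata and chunk allocation. It uses only the two figures quoted above (50MB of text compressing to 5MB, and RAID-1 keeping two copies).

```python
def raw_usage_mb(logical_mb: float, compressed_fraction: float = 1.0, copies: int = 1) -> float:
    """Estimate the raw drive space (in MB) consumed by a file.

    logical_mb          -- size of the file as applications see it
    compressed_fraction -- on-disk size divided by logical size (1.0 = no compression)
    copies              -- how many copies the RAID profile keeps (RAID-1 keeps 2)

    Hypothetical helper for illustration only; metadata and chunk
    allocation are not modelled.
    """
    if logical_mb < 0 or compressed_fraction <= 0 or copies < 1:
        raise ValueError("sizes must be non-negative, fraction positive, copies >= 1")
    return logical_mb * compressed_fraction * copies


if __name__ == "__main__":
    # 50MB of text that compresses to a tenth of its size
    print(raw_usage_mb(50, compressed_fraction=0.1))
    # the same 50MB file, uncompressed, on RAID-1
    print(raw_usage_mb(50, copies=2))
```

The point of the sketch is that the same 50MB file can cost anywhere from a fraction of its size to a multiple of it, which is why a single “free space” number is hard to interpret.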

Kubernetes for Stateful Applications: Scaling Macroservices

I recently got to proctor an Openhack event on modern containerization. It ended up being an excuse to dig deep into one of the corner cases that we all encounter, but no one likes to talk about.

Kubernetes is one of the greatest orchestration and scaling tools ever built, designed for modern decoupled, stateless architectures. Kubernetes tutorials abound to show you these strong use cases. But in the real world where you don’t get to build “green field” every time, there are a lot of applications that don’t fit that model.

Lots of people out there are still writing tightly-coupled monoliths, in many cases for good reason. In some use cases microservices-style scalability isn’t even useful – you actually prefer stateful applications with tight coupling. For example, a game server, where you don’t wa…

Optimizing Data Transfer Speeds

One of my holiday projects was to set up my home “data warehouse.” Ever since Dropbox killed modern Linux filesystem support I’ve been using (and loving) Nextcloud from my home. It backs up to an encrypted Duplicati store on Azure blob store, so that’s offsite backups taken care of. But it was time to knit all my various drives together into a single RAID data warehouse. The only problem: how to transfer my 2 terabytes (rounded to make the math in the post easier) of data, without nasty downtime during the holidays?

A local network transfer is the fastes…

Drupal Does Face Recognition: Introducing Image Auto Tag Module

Last week I wrote a Drupal module that uses face recognition to automatically tag images with the people in them. You can find it on GitHub, of course. With this module, you can add an image to a node, and automatically populate an entity_reference field with the names of the people in the image. This isn’t such a big deal for individual nodes, of course; it’s really interesting for bulk use cases, like Digital Asset Management systems.

I had a great time at Drupalcon Nashville, reconnecting with friends, mentors, and colleagues as always. But this time I had some fresh perspective. After 3 months working with Microsoft’s (badass) CSE unit – building cutting edge proofs-of-concept for some of their biggest customers – the contrast was powerful. The Drupal core development team are famously obsessive about code quality and about optimizing the experience for de…

The #1 Question I Get Asked Working at MS: Why Do You Run Linux?

I’ve had all of a week working at Microsoft, and so far it’s been great. I’ve managed to get my Linux daily driver machine working with almost all the MS internal systems (more on that in a separate post), and all the daily use applications have web or Linux versions available. So far, so good!

But every time I’ve asked for support, or visited IT, or encountered something that didn’t work perfectly, I hear the same question: “Why do you want to run open source software, anyway?” The question isn’t asked with malice or condescension. It’s asked with genuine curiosity. This is tremendous progress for Microsoft.

It is an impo…

My War on Systemd-resolved

I run Ubuntu as the base for my daily driver machine – heavily customized though it is – because Canonical’s choices are, by definition, mainstream. That makes them easy to support, easy to understand, and generally easy to work with. So what I’m about to describe is exceptional in how frustrating it is for me. Seriously, this one issue is enough to drive me into the arms of another distro.

Ubuntu has a built-in DNS cache, which it checks first when trying to resolve anything. This Makes Sense For The User in that DNS queries are resolved faster if they come from local cache. It Makes Sense for the network admin, in that repetitive DNS queries don’t take up bandwidth. But it really doesn’t Make Sense for the web developer.

The local DNS cache takes up port 53, which is a problem if you’re trying to run any different kind of DNS service locally. For example, the DNS service that practically an…
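One way to see the port conflict described above is to try binding the port yourself. The sketch below is a minimal Python check, not anything from systemd-resolved: the helper name `can_bind` is hypothetical, and the address `127.0.0.53` in the usage note is an assumption about where the local stub listener sits on a default Ubuntu install.

```python
import socket


def can_bind(host: str, port: int, kind: int = socket.SOCK_DGRAM) -> bool:
    """Return True if this process can bind host:port, i.e. nothing else holds it.

    Hypothetical helper for illustration. A False result can also mean
    "not permitted": binding ports below 1024 normally needs elevated
    privileges, and PermissionError is a subclass of OSError.
    """
    with socket.socket(socket.AF_INET, kind) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    # Assumed address of the local DNS stub; adjust for your own system.
    for host in ("127.0.0.53", "127.0.0.1"):
        print(host, "port 53 bindable:", can_bind(host, 53))
```

Run with sufficient privileges, a `False` for port 53 suggests that another service already owns it, which is exactly the situation that blocks a second local DNS service.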

I’m Joining Microsoft, Because They’re Doing Open Source Right

I’m excited to announce that I’ve signed with Microsoft as a Principal Software Engineering Manager. I’m joining Microsoft because they are doing enterprise Open Source the Right Way, and I want to be a part of it. This is a sentence that I never believed I would write or say, so I want to explain.

First I have to acknowledge the history. I co-founded my first tech company just as the Halloween documents were leaked. That’s where the world learned that Microsoft considered Open Source (and Linux in particular) a threat, and was intentionally spreading FUD as a strategic counter. It was also the origin of their famous Embrace, Extend, and Extinguish strategy. The Microsoft approach to Open Source only got more aggressive from there, funneling money to SCO’s lawsuits against Linux and its users, calling OSS licensing a “cancer”, and accusing Linux of violating MS …

The 3 Skills You Need to Become a Rock Star Developer

A lot is made of so-called “rockstar developers” in any given language or framework. They have a seemingly magical knowledge of the language and API, finding obscure methods and writing best-practice code as if by instinct. I’d like to lift the curtain on this: it’s not that hard to be this kind of rock star. You can do it, too… even without a fancy computer science degree. If you know how to code in a given language, you just need 3 skills and some patience. It will take about a year of working this way to get there, but you’ll find people throwing around “the R word” sooner than you think.

1) Know how to explore the source

For most of us this c…

Why No Mainstream PHP Speakers Come to Drupalcon - and How We’re Changing That

I’ve learned something incredible as the PHP Track Chair for Drupalcon Vienna. The Drupal Association has no way to invite PHP speakers to Drupalcon.

This blew me away when I first learned about it. After all the work to bring mainstream PHP to Drupal core, after all the outreach to PHP-FIG, after all the talks Drupalists have given at major PHP conferences, how is this possible?

You see, basically every other PHP conference covers their speakers’ travel and accommodation costs. Drupalcon doesn’t, and never has. Historically it has to do with Drupalcon’s identity as a community conference, rather than a professional one. But it means the best PHP speakers never get to Drupalcon.

On one hand that’s great for our project: our speakers are all passionate volunteers! They’re specialists who care deeply about the project. On the other hand, it contributes to isolated, “stay on the island” thinking. If the only speakers we hear are Drupalists, where do we get new insights? If the only people at the BoF or code sprint table…