Oh, The Huge Manatee

A blog about technology, open source, and the web... from someone who works with all three.

What Is Good Code

This is a question I have to define a lot, every time I step in to lead a new team, and it's an important one. Your team has to know the standards against which they are measured, and code reviewers have to understand what they're looking for. A well defined idea of "good code" can make a huge difference in the on-ramp and acceleration speed for new team members, too. I hope this post helps clarify different ways of thinking of this for the other leads out there. I recognize that it's a bit of hubris for a _PHP specialist_ to make comments about good code. After all, PHP has a (sadly deserved) reputation as the home of amateurs, taking their first stabs at spaghetti procedural lcode in a simple imperative language. But bear in mind that PHP is also the home of some serious "grown-up" development: Wikipedia, Facebook, Slack, and others. Modern PHP development uses modern development practices common to every language. So it's important for any professional modern PHP developer to hold high-falutin' opinions like these. Writing good code is subject to many of the same rules as writing good prose. Bear with me while I stretch this analogy as far as I can. Keep your target audience in mind --- Good prose is targeted to the reader. You write differently for a 7 year old than you would for a popular magazine, and differently again for an essay competition. So, who is your code's target audience? The answer will vary a bit, but I can tell you who is not your target audience: the computer. We're all writing in compiled languages now. Comments are free. Extra lines are free. There's a good reason that every modern coding standard includes rules like "no assignments in conditionals". It's because in a compiled language, _the compiler does all the work to translate your code for the computer._ Your target reader is guaranteed to be some kind of human. So no matter what, we can say that: ``` php if ($value = \Entity::load(123)) { ... } ``` is worse than: ``` php // Load the entity. $value = \Entity::load(123); if (!empty($value)) { ... } ``` And not just because some arbitrary coding standard said so! Immediately my _reducto ad absurdum_ sensor starts to go off - you can write insane code this way, if you go too far. ``` php // Initialize the $value variable. We'll use this later. $value = NULL; // Try loading the entity. $value = \Entity::load(123); // React only if the entity loaded OK, and if it's a sane entity. if (!empty($value) && $value instanceof \EntityInterface) { ... } ``` Is this better than the examples above? I certainly wouldn't enjoy writing this way, or reading it. But depending on your target reader, this might be just the way to go. If you're writing an examples package, or a demonstration for a blog post, then this is not so crazy after all. So consider your audience. In most of my work I try to imagine a new developer on the team. When I know I'm writing to open source something, I imagine the pedantic community reviewers. Think about their skill level, their approach to the codebase, how much they understand the framework, and write for them. Name semantically --- Since your target audience is human, they will appreciate having things named for what they are or how they function. This is actually quite hard - there's a reason that "naming things" is one of the famous two hardest problems in programming! Whenever possible, remember the nouns/verbs metaphor you learned from object oriented 101, and try to name according to what the thing IS or what it DOES. (if it falls into both categories, refactor and thank me later). Remember that your names are all metaphors, and be careful of ambiguities. My favorite pet peeve is the word "Controller." Easily the most abused word in the PHP class-naming lexicon, and it's a useless buzzword. Is the ThingieController like a nintendo controller, a control panel for interacting with an individual Thingie? Is it the person who regulates Thingie inputs and outputs? Is it a class that collects Thingies, or something that changes them? In Drupal, it's likely to be the class responsible for Thingie routes. And it's likely to make me swear at the screen under my breath. If you want to pass my code review, never name _anything_ "Controller". If you call something "Manager" you'll get the same dirty looks -I don't care if you caught that bad habit from Java, it's a bad habit and I won't have it in my codebase! Try to describe _completely_ what the thing IS or DOES. For functions, describe the return (ie "getUserAccount"). If the name gets ridiculously long, that's a code smell and it's time to refactor. Order matters --- Humans learn things by connecting them to each other. That is to say, that in order for a concept to make sense, it cannot exist in a vacuum. This means that when explaining something complex - like a board game or a class - you should always start from the big points, or the points that connect to the rest of the framework, and go from there. In a well structured class, I expect to see the constructor method(s) at the top, and then the methods starting from biggest overview to smallest detail. Properties should go right before their assigning method, which 99.9% of the time means at the top of the class - enough to say that "all properties go at the top of the class" is a perfectly fine rule of thumb. This concept of explaining from largest to smallest, only adding nouns as they're needed, is easy to illustrate. Here's a simplified explanation of the board game _Settlers of Catan_: * You build settlements by spending resources. * You collect resources every time someone throws a number touching one of your existing settlements. * You can also buy a Development card, which gives you certain bonuses. * You can trade with anyone on your turn. * Each settlement is worth 1 point. * The first to 10 points wins. This is actually quite a difficult way to explain the game! The listener has to keep all the previous 5 points in their head before getting the one piece of information that integrates it all. If we put the big goal first, we can attach every point to the ones before it. * The first to 10 points wins. * Each settlement is worth 1 point. * You build settlements by spending resources. * You can also buy a Development card, which gives you certain bonuses. * You collect resources every time someone throws a number touching one of your existing settlements. * You can trade with anyone on your turn. Similarly in your code, start with the big concept first. What does this class do? It is not crazy to have this "main" method just call a bunch of component methods. If you name them clearly, this can be a great experience for the reader. ```php public function getFieldValuesFromData(\SimpleXML $data) : array { $this->addXpathNamespaces($data); $documentId = $this->getDocumentId($data); $fieldMapping = $this->getFieldMapping($documentId); $rawFieldValues = $this->applyFieldMapping($fieldMapping, $data); return $this->filterFieldValues($rawFieldValues); } ``` This barely even needs comments, it's so clear what's going on. In a well-structured class, this would be right at the top (after `__construct()`), and the other methods would appear in call order below. (Note that this is probably pretty hard to unit test. The one public function is just glue, and it's definitely implied that all the others are protected methods, hiding the actual unit testable logic of the class. Actually you would want to move the key functionality into the public methods of other classes, so they're testable there.) Write descriptive tests --- One of the purposes of your test, is to explain your thought process and the intended functionality to the reader. It's a form of documentation. I should be able to skim down to the `Assert`s, and get a good idea of what your code is supposed to do. This can be tricky, particularly if you're bad at separation of concerns (or in a framework that's bad at it - I'm looking at you, Drupal). It often (for me) requires refactoring the code in the SUT to be equally clear. But it's worthwhile. In my code reviews, we often look at the tests first for this reason. The first thing I want is a description of the objectives, and the tests should give me that. Be consistent --- Perhaps above all else: be consistent in your names. That means all your id functions should be named identically. Don't intermingle "get" and "retrieve" words. Try and maintain the opposites principle (create methods in opposite pairs when possible, ie getters/setters. Pick a naming style (camelCase, StudlyCaps, undderscores, or something else, I ont care), and stick with it. Configure your IDE to be as anal-retentive as possible about code style, and clear up _all_ the coding style notices before you commit.