I consult, write, and speak on running better technology businesses (tech firms and IT captives) and the things that make it possible: good governance behaviors (activist investing in IT), what matters most (results, not effort), how we organize (restructure from the technologically abstract to the business concrete), how we execute and manage (replacing industrial with professional), how we plan (debunking the myth of control), and how we pay the bills (capital-intensive financing and budgeting in an agile world). I am increasingly interested in robustness over optimization.

Friday, February 28, 2025

American manufacturers have forgotten Deming's principles. Their software teams never learned them.

American manufacturing struggled with quality problems in the 1970s and 1980s. Manufacturers got their house in order with the help of W. Edwards Deming, applying statistical quality control and total quality management throughout the manufacturing process, from raw materials to work in process to finished goods. The quality of American-made products improved significantly as a result.

Once again, American manufacturing is struggling with quality problems, from Seattle’s planes to Detroit’s cars. But the products are a little different this time. Not only are manufactured products physically more complex than they were 50 years ago; they now consist of both physical and digital components. Making matters worse, quality processes have not kept pace with the increase in complexity. Product testing practices haven’t changed much in thirty years; software testing practices haven’t changed much in twenty.

And then, of course, there are “practices” and “what is practiced.”

The benefit of having specifications, whether for the threading of fasteners or scalability of a method, is that they enable components to be individually tested to see whether they meet expectations. The benefit of automating inspection is that it gives a continuous picture of the current state of quality of the things coming in, the things in-flight, and the things going out. Automated tests provide both of these things in software.

If tests are coded before their corresponding methods, there is a good chance that the tests validate the expectations of the code, and that the code is constructed in such a way that the fulfillment of those expectations is visible and quantifiable. Provided the outcome - the desired technical and functional behavior - is achieved, the code is within the expected tolerance, and the properties the code needs to satisfy can be confirmed.
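
To make that concrete, here is a minimal sketch of a test written before the code it describes, assuming pytest and a hypothetical apply_discount function; the tests state the expectation, and the implementation that follows exists only so the sketch runs end to end.

    import pytest

    def test_orders_over_threshold_get_ten_percent_discount():
        # Expectation stated first: totals above 100.00 are discounted 10%,
        # accurate to the cent.
        assert apply_discount(150.00) == pytest.approx(135.00, abs=0.01)

    def test_orders_at_or_below_threshold_are_unchanged():
        assert apply_discount(100.00) == 100.00

    def apply_discount(total: float) -> float:
        # Written after the tests and shaped by them: the behavior the tests
        # describe is the behavior the code makes visible and quantifiable.
        return round(total * 0.90, 2) if total > 100.00 else total

Because the tests came first, they read as the specification rather than as a record of the implementation.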

All too often, tests are written after the fact, which leads to “best endeavors” testing of the software as constructed. Yes, those tests will catch some technical errors, particularly as the code changes over time, but (a) tests composed after the fact can only test for specific characteristics to the extent that the code itself is testable; and (b) they relegate testing to verifying exactness of implementation (a standard of acceptance that physical products grew out of in the 19th century).

Another way to look at it: code can satisfy tests written ex post facto, but all that indicates is that the code still works the way it was originally composed, to the extent that the code exposes the relevant properties. This is not the same as indicating how well the code does what it is expected to do. That’s a pretty big gap in test fidelity.
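
A hedged illustration of that gap, using a hypothetical summarize_orders function: the first test merely pins the implementation as constructed, while the second states the expectation and exposes the difference.

    def summarize_orders(amounts):
        # The code as constructed: silently drops negative amounts.
        return sum(amount for amount in amounts if amount > 0)

    def test_written_after_the_fact():
        # Passes because the code drops the refund, whether or not that is
        # what the business actually expects. It confirms the implementation.
        assert summarize_orders([100, -20, 50]) == 150

    def test_written_from_the_expectation():
        # States the intent: refunds reduce the summary. It fails against the
        # code as constructed, which is exactly the gap in test fidelity.
        assert summarize_orders([100, -20, 50]) == 130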

It’s also a pretty big indicator that measurement of quality is valued over executing with quality.

Quality in practice goes downhill quickly from here. Tests that do nothing except increase the count of automated tests. Tests with inherent dependencies that produce Type I and Type II errors (false alarms and false passes). Tests skipped or ignored. It’s just testing theater when it is the tests rather than the code being exercised, as in, “the tests passed (or failed)” rather than “the code passed (or failed)”.
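
These failure modes are easy to sketch, again with hypothetical names: shared state between tests, a permanently skipped test, and a test that exists only to pad the count.

    import pytest

    SHARED_CART = []  # hidden dependency shared across tests

    def test_add_item_to_cart():
        SHARED_CART.append("widget")
        assert len(SHARED_CART) == 1  # passes only if it happens to run first

    def test_cart_starts_empty():
        # Passes or fails depending on test order, not on the code under test:
        # a false failure or a false pass, i.e., a Type I or Type II error.
        assert SHARED_CART == []

    @pytest.mark.skip(reason="flaky, fix later")
    def test_checkout_total_is_correct():
        ...  # skipped indefinitely: it inflates the count without exercising anything

    def test_always_passes():
        assert True  # increases the number of automated tests and nothing else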

Automated testing is used as a proxy for the presence of quality. But a large number of automated tests is not an indicator of a culture of quality if the tests exercise what was coded and not why it was coded. When it is the former, there will always be a large, late-stage, labor-intensive QA process to see whether or not the software does what it is supposed to do, in a vain attempt to inspect in just enough quality.

Automated test assets are treated as a measurable indicator of quality when they should be treated as evidence that quality is built in. Software quality will never level up until the industry figures this out.

Friday, January 31, 2025

We knew what didn’t work in software development a generation ago. Those practices are still common today.

In the not too distant past, software development was notorious for taking too long, costing too much, and having a high rate of cancellation. Companies spent months doing up-front analysis and design before a line of code was written. The entire solution was coded before being deemed ready to test. Users only got a glimpse of the software shortly before the go-live date. No surprise that there were a lot of surprises, all of them unpleasant: a shadow “component integration testing” phase, an extraordinarily lengthy testing phase, and extensive changes to overcome user rejection. Making the activities of software development continuous rather than discrete, collaborative rather than linear, and automated rather than repeated meant more transparency, less time wasted, and less time taken to create useful solutions through code.

It was pretty obvious back then what didn’t work, but it wasn’t so obvious what would. Gradually people throughout the industry figured out what worked less bad and, eventually, what worked better. Just like it says in the first line of the Agile manifesto. But I do sometimes wonder how much of it really stuck when I see software delivery that looks a lot like it did in the bad old days. Such as when:

  • "Big up front design" is replaced with … "big up front design": pre-development phases that rival in duration and cost those of yesteryear.
  • It takes years before we get useful software. The promise during the pitch presentation was that there would be frequent releases of valuable software… but flip down a few slides and you’ll see that only happens once the foundation is built. It’s duplicitous to talk about “frequent releases” when it takes a long time and 8 figures of spend to get the foundation - sorry, the platform - built.
  • We have just as little transparency as we did in the waterfall days. Back then, we knew how much we’d spent but not what we actually had to show for it: requirements were “complete” but inadequate for coding; software was coded but it didn’t integrate; code was complete but was laden with defects; QA was complete but the software wasn’t usable. The project tasks might have been done, but the work was not. Today, when work is defined as tasks assigned to multiple developers, we have the same problem, because the tasks might very well be complete but “done” - satisfying a requirement - takes more than the sum of the effort to complete each task. Just as back then, we have work granularity mismatches, shadow phases, deferred testing, and rework cycles. Once again, we know how much we’re spending, but have no visibility into the efficacy of that spend.
  • The team is continuously learning … what they were reasonably expected to have known before they started down this path in the first place. What the team is incrementally "learning" is new only to people on the development team. Significant code refactoring that results from the team "discovering" something we already knew is a non-value-generative rework cycle.
  • We have labor-intensive late-stage testing despite there being hundreds and hundreds of automated tests, because tests are constructed poorly (e.g., functional tests being passed off as unit tests - see the sketch after this list), bloated, flaky, inert, disabled, and/or ignored. Rather than fixing the problems with the tests and the code, we push QA to the right and staff more people.
  • Deadlines are important but not taken seriously. During the waterfall days, projects slipped deadlines all the time, and management kept shoveling more money at them. Product thinking has turned software development into a perpetual investment, and development teams into a utility service. Utility service priorities are self-referential to the service they provide, not to the people who consume the service. The standard of timeliness is therefore "you’ll get it when you get it."
  • Responsibility isn’t shared but orphaned. Vague role definitions, a dearth of domain knowledge, unclear decision rights, and a preference for sourcing capacity (people hours) and specialist labor mean that process doesn’t empower people so much as it gives them plausible deniability when something bad happens.
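
A minimal sketch of that mislabeled test - a functional test passed off as a unit test - with a hypothetical orders table and figures:

    import sqlite3

    def test_order_total():  # sits in the "unit test" suite and runs on every commit
        # A functional test in disguise: it exercises a database, a schema, and
        # its own setup data far more than any single unit of code. It is slower
        # than a unit test, and when it fails it rarely points to the unit at fault.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE orders (amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?)", [(500.0,), (750.0,)])
        total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
        assert total == 1250.0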

Companies and teams will traffic in process as a way to signal something, or several somethings: because of our process we deliver value, our quality is very high, our team is cost-effective, we get product to market quickly. If in practice they do the things we learned specifically not to do a generation ago, they are none of the things they claim to be.