I consult, write, and speak on running better technology businesses (tech firms and IT captives) and the things that make it possible: good governance behaviors (activist investing in IT), what matters most (results, not effort), how we organize (restructure from the technologically abstract to the business concrete), how we execute and manage (replacing industrial with professional), how we plan (debunking the myth of control), and how we pay the bills (capital-intensive financing and budgeting in an agile world). I am increasingly interested in robustness over optimization.

Friday, February 28, 2025

American manufacturers have forgotten Deming's principles. Their software teams never learned them.

American manufacturing struggled with quality problems in the 1970s and 1980s. Manufacturers got their house in order with the help of W. Edwards Deming, applying statistical quality control and total quality management throughout the manufacturing process, from raw materials to work in process to finished goods. The quality of American-made products improved significantly as a result.

Once again, American manufacturing is struggling with quality problems, from Seattle’s planes to Detroit’s cars. But the products are a little different this time. Not only are manufactured products physically more complex than they were 50 years ago, they also consist of both physical and digital components. Making matters worse, quality processes have not kept pace with the increase in complexity. Product testing practices haven’t changed much in thirty years; software testing practices haven’t changed much in twenty.

And then, of course, there are “practices” and “what is practiced.”

The benefit of having specifications, whether for the threading of fasteners or the scalability of a method, is that they enable components to be individually tested to see whether they meet expectations. The benefit of automating inspection is that it gives a continuous picture of the current state of quality of the things coming in, the things in-flight, and the things going out. In software, automated tests provide both.
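To make the parallel concrete, here is a specification expressed as a testable tolerance, sketched in Python with pytest; the component and the tolerance are invented for the example:

```python
import math

def round_to_cents(value: float) -> float:
    # Hypothetical component; stands in for any unit with a written spec.
    return round(value, 2)

def test_round_to_cents_meets_spec():
    # The spec as a tolerance: the result is never off by more than
    # half a cent, the software analog of a dimensional tolerance on
    # a machined part.
    for value in (1.0 / 3.0, 19.999, 0.004):
        assert math.isclose(round_to_cents(value), value, abs_tol=0.005)
```

Run on every change by a continuous integration job, checks like this give the same rolling picture of quality that automated inspection gives on a production line.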

If tests are coded before their corresponding methods are coded, there is a good chance that the tests validate the expectations of the code, and that the code is constructed in such a way that fulfillment of those expectations is visible and quantifiable. Provided the outcome (the desired technical and functional behavior) is achieved, the code is within the expected tolerance, and the properties the code needs to satisfy can be confirmed.
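A minimal test-first sketch, again in Python with pytest, makes the point; the module and function are hypothetical, and deliberately do not exist yet:

```python
import pytest

# Written before any implementation: billing.parse_amount does not
# exist yet. The tests state the expectations the code must satisfy.
from billing import parse_amount  # hypothetical module under test

def test_dollar_string_becomes_exact_cents():
    # The functional expectation, stated as a verifiable outcome.
    assert parse_amount("12.34") == 1234

def test_malformed_input_fails_loudly():
    # The expectation covers boundary behavior, too.
    with pytest.raises(ValueError):
        parse_amount("12.3.4")
```

Whatever implementation follows is free to take any shape, so long as it lands inside the tolerance the tests define.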

All too often, tests are written after the fact, which leads to “best endeavors” testing of the software as constructed. Yes, those tests will catch some technical errors, particularly as the code changes over time, but (a) tests composed after the fact can only test for specific characteristics to the extent that the code itself is testable; and (b) they relegate testing to verifying exactness of implementation (a standard of acceptance that physical products outgrew in the 19th century).
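The contrast shows up in a few lines. In this sketch (all names invented), the test mirrors a hypothetical implementation step for step, so it can only confirm the code agrees with itself:

```python
def parse_amount(text: str) -> int:
    # Hypothetical implementation, shown so the mirroring is visible.
    dollars, cents = text.split(".")
    return int(dollars) * 100 + int(cents)

def test_parse_amount_matches_implementation():
    # Re-derives the answer exactly the way the code does. This verifies
    # that the code still does what it did when the test was written,
    # not whether splitting on "." was ever the right way to parse an
    # amount in the first place.
    dollars, cents = "12.34".split(".")
    assert parse_amount("12.34") == int(dollars) * 100 + int(cents)
```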

Another way to look at it: code can satisfy tests written ex post facto, but all that indicates is that the code still works the way it was originally composed, to the extent that the code exposes the relevant properties. That is not the same as indicating how well the code does what it is expected to do. That’s a pretty big gap in test fidelity.

It’s also a pretty big indicator that measurement of quality is valued over executing with quality.

Quality in practice goes downhill quickly from here. Tests that do nothing except increase the number of automated tests. Tests with hidden dependencies that produce Type I and Type II errors (false alarms and missed defects). Tests skipped or ignored. It’s just testing theater when it is the tests rather than the code being exercised, as in, “the tests passed (or failed)” rather than “the code passed (or failed)”.
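Each of these failure modes fits in a few lines. A sketch in Python with pytest, everything invented for illustration:

```python
import pytest

def test_suite_is_green():
    # Exercises no production code at all; exists only to raise the
    # count of automated tests.
    assert True

_cache = {}

def test_populates_cache():
    _cache["user"] = "alice"
    assert _cache["user"] == "alice"

def test_reads_cache():
    # Hidden dependency on the test above: run alone or reordered, it
    # fails when nothing is broken (Type I, a false alarm); run in
    # order, it passes without touching production code at all (Type
    # II, a miss). Either way, it is the tests being exercised.
    assert _cache.get("user") == "alice"

@pytest.mark.skip(reason="flaky, revisit later")
def test_the_behavior_that_actually_matters():
    ...
```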

Automated testing is used as a proxy for the presence of quality. But a large number of automated tests is not an indicator of a culture of quality if what gets exercised in tests is what was coded and not why it was coded. When it is the former, there will always be a large, late-stage, labor-intensive QA process to see whether the software does what it is supposed to do, in a vain attempt to inspect in just enough quality.

Automated test assets are treated as a measurable indicator of quality when they should be treated as evidence that quality is built in. Software quality will never level up until the industry figures this out.