Agility in the construction industry
Drawing parallels in agility between two seemingly disparate industries: software engineering and construction.
I-405 Sepulveda Pass Widening. Image: kiewit.com
I can’t define the loaded term ‘business agility’ to everyone’s agreement, but I can spot it when I see it. And of all the places to spot it, the construction industry would have been my last bet, yet here I am writing about it.
Agility means different things in different contexts, but many of the principles are shared across industries. In this post I’ll discuss agile approaches adopted in the construction industry. Construction is considered to sit at the other end of the spectrum from software development, and parallels between the two are often dismissed as incongruent. This is because, unlike software development, construction has high fixed costs, high inventories of physical materials, and a high cost of iteration (e.g., building a bridge’s foundation more than once is very costly). But as I’ll illustrate, innovative approaches in construction are rooted in the same agile thinking that is applied in software development: specifically, the emphasis on continuous learning, incremental approaches, trial and error, product teams, lowering work in progress, and breaking down silos through collaboration.
I will refer to a comprehensive article by Brian Balkus titled Why America Can’t Build, an in-depth piece detailing the inefficiencies of the American construction industry and answering the question of why America cannot complete infrastructure projects as well as it used to. The final third of the article focuses on new approaches to construction, which served as the motivation for this post. In each of the sections below, I will cover three things:
How things were done in the failed approach
How things were done in the more agile, and ultimately successful, approach
The common principle behind the approach connecting software engineering and construction
Iterate Intentionally
The construction firm Kiewit won the contract for California’s Sepulveda Freeway Expansion project because its price came in much lower, thanks to a novel retaining wall design. After completion, the vertical retaining wall began buckling when the metal straps connecting its panels broke, and the whole wall had to be demolished. This resulted in several lawsuits and ultimately project shutdowns.
In Madrid, instead of one novel design on which everything depended, the team used simple, similar modules which could be built repeatedly to make the final product. With each build the team was able to make iterative improvements, which contributed to the overall quality of the result. They chose to forgo the “savings” of unproven technologies and instead focused on creating a process that allowed them to learn continuously with each build, increasing quality through sheer practice. The higher the quality, the lower the long-term cost.
Choosing a process which allows for iteration is a prerequisite for success in software development. If you are doing something once, you have one chance to get it right. If you do it 20 times, your 20th try will be better than your first, and each of the intermediate 18 will reflect the learnings of the one before it. In the Madrid case, a conscious decision was made to choose raw materials which allowed for iteration. Reducing complexity and repeatedly building the same simplified design made iterative improvements possible. The sum of the costs of each module may have been higher than Kiewit’s novel design, but the overall cost and risk were significantly lower because the team inspected and adapted with each iteration of the modular builds.
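As a loose analogy of my own (not from Balkus’s article), here is iteration as a quality strategy in miniature: Newton’s method refining a square-root estimate, where every pass inspects the result of the previous one and adapts.

```python
# A minimal sketch, assuming nothing from the original article: each
# attempt builds on the learnings (the error) of the attempt before it.

def refine(estimate: float, target: float) -> float:
    """One inspect-and-adapt step toward sqrt(target) (Newton's method)."""
    return (estimate + target / estimate) / 2

estimate = 1.0  # a rough first attempt
for attempt in range(1, 6):
    estimate = refine(estimate, target=2.0)
    print(f"attempt {attempt}: {estimate:.6f}")  # the error shrinks every pass
# The fifth attempt is far better than the first, purely through repetition.
```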
Intentionally designing your way of working to enable iterations is the common theme between the Madrid success story and software development practices.
Elevate the Performance of the Constraint
When building the High Speed Rail project in California, a single tunnel boring machine (TBM) was used at a time. These machines are effective but extremely slow: a snail moves 14 times faster than the best drill. A TBM can also only be used at certain times of the day, which means the constraint on creating the tunnels is the amount of time a TBM is able to operate per day. The project is currently 14 years and counting behind schedule, and $44 billion USD over budget. More than the speed of the TBM has held it back; regulation and political opposition have also contributed.
A TBM is largely automated, yet California employed 25 people per machine instead of the required 10. Why? Unlike in other countries where many TBM tasks are automated, in California more people are thrown at the problem (largely due to union laws) instead of improving the boring process.
By contrast, in Madrid:
When tunneling segments, instead of using one TBM as is typical, it deployed up to six at a time—a number previously unheard of. Most importantly, Madrid ran its construction crews 24 hours a day, seven days a week, and achieved consistent worker productivity gains.
The groundbreaking book The Goal discusses how the Theory of Constraints is central to process improvement. The idea is that any system’s throughput is governed by a small number of constraints (often just one); identify and elevate that constraint and the overall system throughput will increase. A new constraint will then arise, and the process repeats.
It is a book that has informed software development and lean thinking to a considerable degree, because bottlenecks are present in every team. For example, the capacity of the continuous integration server governs how long software builds take, which affects how often developers get feedback about their code, which in turn affects deployment frequency. Fix the server and you increase deployment frequency.
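The mechanic is easy to demonstrate. Below is a minimal sketch of a delivery pipeline whose throughput is capped by its slowest stage; the stage names and daily capacities are hypothetical, not taken from any real pipeline.

```python
# Hypothetical stages and their daily capacities; "ci build" is the constraint.
stages = {
    "code review": 30,  # items per day
    "ci build": 8,      # the slowest stage caps the whole system
    "deploy": 25,
}

def constraint_of(stages: dict[str, float]) -> tuple[str, float]:
    """Return the constraining stage and the system's overall throughput."""
    name = min(stages, key=stages.get)
    return name, stages[name]

print(constraint_of(stages))   # ('ci build', 8): 8 items/day overall

# Elevate the constraint (e.g., a faster CI server)...
stages["ci build"] = 40
print(constraint_of(stages))   # ...and 'deploy' (25/day) becomes the new one
```

Elevating any stage other than the constraint leaves throughput stuck at 8 items per day; it is the software equivalent of assigning 15 extra workers to a TBM.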
These constraints are easy to address when they’re visible, but most aren’t. Tools like Value Stream Mapping help to identify and analyze them, but if you analyze and address the wrong constraint, the system’s throughput does not improve while its costs rise (like the 25 people employed per TBM). Identify and elevate the right constraint and you build 35 miles of rail in 4 years at $65 million per mile, like in Madrid (compared to $3.5 billion per mile in New York).
Experimentation at a Financial Loss
Innovation comes at a cost: projects that don’t produce a financial return but do increase organizational knowledge. TBM technology had not changed in decades, which prompted Madrid, unlike California, to use six machines at the same time. It was a process innovation that yielded immediate returns. However, not all investments should be expected to do that.
A case in point is the energy company Oxy:
Oxy is currently in preconstruction on a $1 billion direct air capture (DAC) plant in Texas. DAC plants remove carbon dioxide from the air and then convert it into concentrated carbon that can be sequestered or re-used.
This approach requires Oxy to know it will likely lose money on the project. It is a $1 billion experiment that has no real parallel to public-sector infrastructure projects. That is because none of the public agencies can afford a costly project it knows will likely fail. Real physical innovation requires a trial-and-error approach and a tolerance for risk and loss.
This is perhaps the strongest example of an agile mindset. Investment in learning through experimentation is key to process and product improvements. There are many parallels in software development, including A/B testing and skunkworks projects, where “throwaway” work is acknowledged as necessary to reach optimal solutions. Good agile teams value experimentation with different tools, libraries, architectures, and processes, knowing that of the many candidates, some will work and most will fail. Triangulating to the right one is a matter of trial and error, in both software development and construction. Though the cost of a trial in construction is likely to be much higher, proportionally the long-term benefits still outweigh the near-term costs.
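For a flavor of what this looks like in software, here is a toy A/B test; the variants, conversion rates, and traffic numbers are invented for illustration. Most variants lose, and the occasional winner pays for the rest.

```python
import random

def run_variant(conversion_rate: float, visitors: int) -> float:
    """Simulate a variant's observed conversion rate over `visitors`."""
    conversions = sum(random.random() < conversion_rate for _ in range(visitors))
    return conversions / visitors

random.seed(42)  # reproducible illustration
true_rates = {"control": 0.10, "variant_a": 0.08, "variant_b": 0.13}
observed = {name: run_variant(rate, 10_000) for name, rate in true_rates.items()}

# Two of the three experiments "fail"; the one winner justifies the trials.
print(max(observed, key=observed.get), observed)
```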
Fast Reliable Feedback
Fast feedback is critical to software delivery. Every effort is made to get feedback as quickly as possible in every part of the software development process.
Tests give feedback on whether code works as intended
Code reviews solicit another person’s feedback on software design
Static code analysis provides feedback on code quality
Security scans provide feedback on whether code has vulnerabilities
Performance tests provide feedback on the stability of the application
There are countless examples.
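As a small, hypothetical illustration of the first loop above, a unit test turns “does the code work as intended?” into feedback that arrives in milliseconds; the function under test is invented purely to give the test something to check.

```python
def toll(miles: float, rate_per_mile: float = 0.25) -> float:
    """A trivial, invented function: compute a toll for a distance."""
    if miles < 0:
        raise ValueError("distance cannot be negative")
    return round(miles * rate_per_mile, 2)

def test_toll():
    assert toll(10) == 2.50
    assert toll(0) == 0.0

if __name__ == "__main__":
    test_toll()  # runs in milliseconds: near-instant feedback on intent
    print("all tests passed")
```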
Consider the key feedback loop in the South Mountain Freeway Project:
…this project was completed in 2019 in fewer than 1,000 days, an estimated three years earlier than what would normally be expected. Early coordination between the contractor and engineer ensured that the design issues that appeared on Sepulveda or HSR were avoided, saving the project an estimated $100 million.
Feedback provided by the contractor on the engineering designs early in the process resulted in the designs being tweaked to avoid rework and to deliver the project early and well below budget.
In California, it has been 14 years since the High Speed Rail project was started, and not a single mile of track has been laid. The project still lacks the land parcels it needs to do so, and only 50% of the environmental clearances have been received. This points to a gap in communication and expectations between all parties: the construction company, the environmental agencies, and the government’s land development department.
Starting a project with such an exorbitantly high risk profile without getting the right feedback from critical parties has resulted in catastrophic financial failure. As a general model, feedback can be thought of in two aspects: feedback on the product we are building, and feedback on the process we are using to build the product. Both are equally important, though feedback on process is rarely a first-order concern.
In the South Mountain Freeway Project mentioned above, the construction companies had learned the value of improving their feedback process by talking to their counterparts in Texas. The key process improvement was getting feedback on engineering designs earlier, when changing them carried a lower cost. This is the same principle as, for example, getting feedback on your code before it’s deployed: the cost of change is lower when the code review happens earlier.
Sticky Teams
Key unionized roles like electricians and crane operators are a big factor in construction speed. Instead of scaling these roles by hiring and training more people, unions promote overtime work, which drives up costs and limits scaling. The same people are spread across multiple projects, resulting in dependencies and slower construction. The cost of labour is intentionally driven up at the expense of the system, creating contradictory incentives: if a worker’s pay depends on overtime, then inefficiency is desirable.
In Arizona and Texas, where the influence of labour unions is limited, renewable energy developers are free to choose suppliers. They end up hiring the same engineering, procurement, and construction companies for similar projects. This builds mutually beneficial relationships, increases collaboration, and matches the right people to the right problems, with repeat participants achieving economies of scale. This is in sharp contrast to California, where the same participants (union workers) are overworked and split across projects, driving up inefficiency for individual financial gain.
“Move the work, not the people” is a popular idiom in software development. The principle is that once a team learns how to collaborate, you keep it together, because collaboration is often the constraint that governs speed. Intentionally keeping people who have proven to work well together on one team is in sharp contrast to spreading inefficient and expensive people across different projects when they haven’t proven to work well anywhere.
This is a nuanced topic because working with the same people using the same process carries the risk of getting complacent, so this principle has to be considered in tandem with experimentation and reliable feedback loops.
An agile mindset is different from an agile method. Mechanically following a method (e.g., Scrum) without the mindset has the benefit of building good practices in the short term. However, a method that worked yesterday may not work tomorrow, and an agile mindset can help create suitable practices in perpetuity. Methods are only manifestations of principles applied to a very particular context. We should avoid copying methods and aim to stay true to the principles of agility.