The sad state of evidence-based development management patterns
I have been in the development game for many decades. I wrote my first programs in APL/360 and Fortran (WATFIV) at the University of Waterloo, and have seen and coded in a lot of languages over the years (FORTH, COBOL, Asm, Pascal, B, C, C++, SAS, etc).
My academic training was in Operations Research, that is, the mathematical optimization of business processes. Today, when I look at the development processes that I see, they are predominantly “fly by the seat of the pants”, “everybody is doing it” or “academic correctness”. I am not talking about waterfall or agile or scrum. I am not talking about architecture etc. Yet in some ways I am. Some processes assert Evidence Based Management (EBM), yet fail to deliver the evidence of better results. Some bloggers detail the problems with EBM. A few books attempt to summarize the little research that has occurred, such as "Making Software: What Really Works, and Why We Believe It".
As an Operations Research person, I would define the optimization problem facing a development manager, director or lead in terms of the following factors:
- Performance (which often comes at increased man hours to develop and operational costs)
- Scalability (which often comes at increased man hours to develop and operational costs)
- Cost to deliver
- Accuracy of deliverable (Customer satisfaction)
- Completeness of deliverable
- Elapsed time to delivery (a shorter time often increases cost to deliver and defect rates exponentially)
- Ongoing operational costs (a bad design may result in huge cloud computing costs)
- Time for a new developer to become efficient across the entire product
- Defect rate
  - Number of defects
  - ETA from reporting to fix
- Developer resources
  - For development
  - For maintenance
All of these factors interact. As for evidence, there are no studies covering all of them, and I do not expect there to be any. Technology is changing too fast, there are huge differences between projects, and any study would be outdated before it was usable. There is, however, some evidence that we can work from.
Lines of Code across a system
Lines of code directly impact several of the above.
- Defect rate is a function of the number of lines of code, ranging from 200 to 1,000 defects per 100K lines [source], scaled by developer skill level. Junior or new developers will have a higher defect rate.
- Some classic measures are defined in the literature, for example cyclomatic complexity. Studies find a positive correlation between cyclomatic complexity and defects: functions and methods that have the highest complexity also tend to contain the most defects.
- Time to deliver is often a function of the lines of code written.
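To make the defect-density range above concrete, here is a minimal arithmetic sketch. It assumes defects scale linearly with code size, which is a simplification, and the function name is my own. The two code base sizes are the ones from the rewrite described in this post.

```python
def expected_defects(lines_of_code, defects_per_100k):
    """Estimate a defect count from code size and a defect density.

    Assumes a simple linear relationship, which is only a rough model.
    """
    return lines_of_code * defects_per_100k / 100_000


# Range quoted above: 200 to 1,000 defects per 100K lines.
LOW_RATE, HIGH_RATE = 200, 1000

for loc in (25_000, 474_000):
    low = expected_defects(loc, LOW_RATE)
    high = expected_defects(loc, HIGH_RATE)
    print(f"{loc:>7,} lines: {low:,.0f} to {high:,.0f} expected defects")
```

Under these assumptions a 474,000-line code base would carry roughly 948 to 4,740 defects, against 50 to 250 for a 25,000-line one.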
There is a mistaken belief that the number of lines of code for a project is immutable. In the early 2000s I led a rewrite of a middle tier and backend tier (with the web front end being left as is). The original C++/SQL Server code base was 474,000 lines of code and was the result of 25 man years of coding. With a team of 6 new (to the application) developers sent over from India and 2 intense local developers, we recreated these tiers with 100% API compliance in just 25,000 lines of code in about 8 weeks. 25 man years became about 1 man year, with a nearly 20 fold decrease in code base. And the last factor was a 20 fold increase in concurrent load.
On other projects I have seen massive copy and paste (with some minor change) that results in code bloat. When a bug was discovered, it was often fixed in only some of the pastes. Martin Fowler describes Lines of Code as a useless measure of developer productivity; the same applies to lines of code in a project. A change of programming language can result in a 10 fold drop (or increase) in lines of code. A change of developer can also result in a similar change, depending on skill sets.
The use of Object-Relational Mapping (ORM) can often result in increased lines of code, more defects, steeper learning curves and greater challenges addressing performance issues. A simple illustration is moving all addresses in Washington State from a master table to a child table. In SQL Server T-SQL it is a one line statement; calling that statement from C# amounts to about 4 lines of code. Using an ORM, this can quickly grow to 100-200 lines. ORMs came along because of a shortage of SQL developer skills. As with most things, they carry hidden costs that are omitted in the sales literature!
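As a rough illustration of the set-based approach, here is a sketch using Python's built-in sqlite3 module rather than SQL Server and C#. The table and column names are hypothetical, and SQLite needs an INSERT ... SELECT plus a DELETE inside one transaction, so it is not the one line T-SQL statement mentioned above. The point is only that the database moves the rows without the application touching them one at a time.

```python
import sqlite3

# In-memory database with hypothetical master/child address tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_address (id INTEGER PRIMARY KEY, street TEXT, state TEXT);
    CREATE TABLE child_address  (id INTEGER PRIMARY KEY, street TEXT, state TEXT);
    INSERT INTO master_address (street, state) VALUES
        ('1 Pine St', 'WA'), ('2 Oak Ave', 'OR'), ('3 Elm Rd', 'WA');
""")


def move_state_set_based(conn, state):
    """Move every address for one state with two set-based statements."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "INSERT INTO child_address (id, street, state) "
            "SELECT id, street, state FROM master_address WHERE state = ?",
            (state,),
        )
        conn.execute("DELETE FROM master_address WHERE state = ?", (state,))


move_state_set_based(conn, "WA")
master_count = conn.execute("SELECT COUNT(*) FROM master_address").fetchone()[0]
child_count = conn.execute("SELECT COUNT(*) FROM child_address").fetchone()[0]
print(master_count, child_count)
```

The function body stays the same size whether three rows or three million rows are moved, which is where the line count advantage comes from.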
“Correct academic design” does not mean effective (i.e. low cost) development. One of the worst systems (for performance and maintenance) that I have seen was absolutely beautifully designed, with a massive array of well defined classes, which unfortunately ignored the database reality. Many calls of a single method cascaded through these classes and resulted in 12 to 60 individual SQL queries being executed against the database. Most of the methods could be converted to a wrapper on a single stored procedure, with a major improvement in performance. The object hierarchy was flattened (or downsized!).
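The cascading query pattern is easy to reproduce in miniature. The sketch below uses Python's sqlite3 with a hypothetical orders schema and a small counting wrapper of my own (CountingConnection) to show how many statements each style sends to the database. SQLite has no stored procedures, so a single joined query stands in for the stored procedure wrapper.

```python
import sqlite3


class CountingConnection:
    """Hypothetical helper: counts the statements sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


raw = sqlite3.connect(":memory:")
raw.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (order_id INTEGER, item TEXT);
    INSERT INTO orders VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO order_lines VALUES (1, 'pen'), (1, 'ink'), (2, 'pad'), (3, 'cap');
""")


def load_cascading(db):
    """Object-style loading: one query for the parents, then one per parent."""
    result = {}
    for order_id, customer in db.execute("SELECT id, customer FROM orders").fetchall():
        rows = db.execute(
            "SELECT item FROM order_lines WHERE order_id = ?", (order_id,)
        ).fetchall()
        result[customer] = sorted(item for (item,) in rows)
    return result


def load_single_query(db):
    """Set-based loading: one joined query returns everything."""
    result = {}
    query = (
        "SELECT o.customer, l.item FROM orders o "
        "JOIN order_lines l ON l.order_id = o.id"
    )
    for customer, item in db.execute(query).fetchall():
        result.setdefault(customer, []).append(item)
    return {customer: sorted(items) for customer, items in result.items()}


cascading_db = CountingConnection(raw)
cascading = load_cascading(cascading_db)

single_db = CountingConnection(raw)
single = load_single_query(single_db)

print(cascading_db.count, single_db.count, cascading == single)
```

With three parent rows the cascading version already issues four statements to the single query version's one, and the gap grows with every extra level of classes.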
I extend the concept of cyclomatic complexity to the maximum stack depth in developer written code. The greater the depth, the longer the code takes to debug (because the developer has to walk through the stack) and, likely, to write. The learning curve goes up. I suggest a maximum depth of 7 (lower than the usual limits for cyclomatic complexity), ideally 5. This number comes out of research on short term memory (wikipedia). Going beyond seven significantly increases the effort that a developer needs to make to understand the stack. A deep hierarchy of objects looks nice academically, but it is counterproductive for efficient coding. Seven is a magic number: keep asking “Why do we have more than seven ….”
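One way to get a number for this is to measure it. Here is a minimal sketch in Python that uses the standard sys.setprofile hook to record the deepest chain of Python-level calls made while running a function. The helper name max_call_depth and the level1..level3 functions are my own illustrations, and the count includes library code written in Python, so treat the result as an upper bound on developer written depth.

```python
import sys


def max_call_depth(func, *args, **kwargs):
    """Run func and return the deepest nesting of Python calls it reached."""
    depth = 0
    deepest = 0

    def profiler(frame, event, arg):
        nonlocal depth, deepest
        if event == "call":        # a Python function was entered
            depth += 1
            deepest = max(deepest, depth)
        elif event == "return":    # a Python function exited (also on exceptions)
            depth -= 1

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)       # always remove the hook
    return deepest


# A tiny call chain to measure: level1 -> level2 -> level3.
def level3():
    return "done"


def level2():
    return level3()


def level1():
    return level2()


MAX_SUGGESTED_DEPTH = 7  # the limit suggested above
measured = max_call_depth(level1)
print(measured, measured <= MAX_SUGGESTED_DEPTH)
```

Run against the entry point of a real request handler, a check like this could flag the call paths worth asking the "why more than seven" question about.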
Developer Skill Sets
Many architects suffer from the delusion that all developers are as skilled as they are, i.e. have IQs over 145. During my high school teaching years, I was assigned both gifted classes and challenged classes, and learned to present appropriately to both. In some cities (for example Stockholm, Sweden) 20% of the work force is in IT. This means that the IQ of the developers likely ranges from 100 upwards. When an application is released, the support developers will likely end up with an average IQ around 100. The question must be asked: how simple is the code to understand for future enhancements and maintenance?
If a firm has a policy of significant use of off-shore or contractor resources, there are further challenges:
- A high percentage of the paid time is in ramp-up mode
- There is a high level of nonconformity to existing standards and practices.
- Higher defect rate, greater time for existing staff to come up to speed on the code
- Size of team and ratio of application-experienced versus new developers can greatly alter delivery schedules (see Brooks's law)
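The team size effect behind Brooks's law is often explained by counting pairwise communication paths, which grow as n(n-1)/2 for a team of n people. A minimal sketch of that arithmetic follows; the function name is my own, and the model ignores ramp-up time, which is the other half of the problem.

```python
def communication_paths(team_size):
    """Number of distinct pairs of people who may need to coordinate."""
    return team_size * (team_size - 1) // 2


for n in (2, 4, 8, 16):
    print(f"{n:>2} developers -> {communication_paths(n):>3} communication paths")
```

Doubling a team from 8 to 16 developers more than quadruples the paths, from 28 to 120, while the hours available only double.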
Pseudo-coding different architectures rarely happens. It has some advantages: if you code up the most complex logic, you can then ask the question, “A bug happens and nothing comes back; what are the steps to isolate the issue with certainty?” The architecture with the fewest diagnostic steps may be the more efficient one.
Last is the availability, now and in the future, of developers with the appropriate skills. The industry is full of technology that was hot and promised the moon and was then disrupted by a new technology (think of Borland Delphi and Pascal!). I often compute a weighted value composed of years since launch, popularity at the moment and trend to refine choices (and in some cases to say No to a developer or architect who wants to play with the latest and greatest!). Some useful sites are the DB-Engines Ranking and PYPL. After short listing, it's a matter of coding up some complex examples in each and counting the lines of code needed.
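A weighted value of this kind could be sketched as below. Everything specific here is an assumption for illustration: the weights, the 10 year maturity cap, the 0 to 1 popularity scale, the -1 to 1 trend scale and the two made-up candidates. Real inputs would come from ranking sites and your own judgment.

```python
from dataclasses import dataclass


@dataclass
class Technology:
    name: str
    years_since_launch: float
    popularity: float  # 0..1, e.g. a share taken from a ranking site
    trend: float       # -1..1, negative means declining


# Hypothetical weights; tune them to your own priorities.
WEIGHTS = {"maturity": 0.4, "popularity": 0.4, "trend": 0.2}
MATURITY_CAP_YEARS = 10  # beyond this, extra age earns no extra credit


def score(tech):
    """Combine maturity, popularity and trend into one 0..1 value."""
    maturity = min(tech.years_since_launch, MATURITY_CAP_YEARS) / MATURITY_CAP_YEARS
    trend = (tech.trend + 1) / 2  # rescale -1..1 to 0..1
    return (
        WEIGHTS["maturity"] * maturity
        + WEIGHTS["popularity"] * tech.popularity
        + WEIGHTS["trend"] * trend
    )


# Made-up candidates, not real measurements.
candidates = [
    Technology("EstablishedLang", years_since_launch=20, popularity=0.6, trend=0.0),
    Technology("HotNewFramework", years_since_launch=1, popularity=0.3, trend=0.9),
]

ranked = sorted(candidates, key=score, reverse=True)
for tech in ranked:
    print(f"{tech.name}: {score(tech):.2f}")
```

With these particular weights the established option wins comfortably even though the newcomer has the stronger trend, which is the kind of result that helps when saying No.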
Specification Completeness And Stability
On one side, I have worked with a few PMs who delivered wonderful specifications (200-500 pages) that had no change-orders between the first line of code being written and final delivery a year later. What was originally handed to developers was not changed. Work was done in sprints. The behavior and content of every web page was detailed. There was a clean and well-reviewed dictionary of terms and meanings. Needless to say, delivery was prompt, on schedule, etc.
On the other side, I have had minor change-requests which mutated constantly. The number of lines of code written over all of these changes was 20x the number of lines of code finally delivered.
Concurrent development means that two or more sets of changes are happening to the same code base. At one firm we had several GitHub forks: Master, Develop, Sprint, Epic and Saga. The titles indicated when the changes were expected to be propagated to master. It worked reasonably, but I often ended up spending two days resolving conflicts and fixing bugs that were introduced whenever I attempted to get the forks in sync. Concurrent development increases overhead exponentially with the number of independent forks that are active. Almost everything in development has a cost that grows exponentially with size; there is no economy of scale in development.
On the flip side, at Amazon, using the microservices model, there was no interaction between feature requests. Each API was self contained and would evolve independently. If an API needed another API changed, then the independent API would be changed, tested and released. The dependent API was then developed against the released independent API. There was no code-juggling act. Each API code base had a single line of development and was self-contained. Dependencies were on APIs, not on libraries and code bases.
Controlling costs and improving delivery depends greatly on the preparation work IMHO -- namely:
- Specification stability and completeness
- Architecture / design being well crafted for the developer population
- Minimum noise (i.e. no concurrent development, change orders, change of priorities)
- Methodology (Scrum, Agile, Waterfall, Plan Driven) is of low significance IMHO – except for those selling it and ‘true believers’.
On the flip side, often the business will demand delivery schedules that add technical debt and significantly increase ongoing costs.
A common problem that I have seen is solving this multi-dimensional problem by looking at just one (and rarely two) dimensions, and then discovering the consequences of that decision downstream. I will continue to add additional dimensions as I recall them from past experience.