Saturday, May 28, 2016

Theory about Test Environments

Throughout my career I have often had to test in an arbitrary environment. The environment preceded my arrival and was often still there at my departure, by which time many developers had become fatalistic towards it. This is not good.


The Rhetorical Goal Recomposed

“We use our test environment to verify that our code changes will work as expected”

While this assures upper management, it lacks specifics to evaluate if the test environment is appropriate or complete. A more objective measurement would be:

  • The code changes perform as specified at the six-sigma level of certainty.

This then logically cascades into sub-measurements:

  • A1: The code changes perform as specified at the highest projected peak load for the next N years (typically 1-2) at the six-sigma level of certainty.
  • A2: The code changes perform as specified on a freshly created (perfect) environment at the six-sigma level of certainty.
  • A3: The code changes perform as specified on a copy of production environment with random data at the six-sigma level of certainty.

The last one is actually the most critical because too often there is bad data from bad prior released code (which may have been rolled back – but the corrupted data remained!). There is a corollary:

  • C1: The code changes do not need to perform as specified when the environment has had its data corrupted by arbitrary code and data changes that have not made it to production. In other words, ignore a corrupted test environment.


Once thru is not enough!

Today’s systems are often multi-layered, with timeouts, blockage under load and other things making the outcome not a certainty but a random event. Above, I cited six sigma – this is a classic level sought in quality assurance of mechanical processes.


“A six sigma process is one in which 99.99966% of all opportunities to produce some feature of a part are statistically expected to be free of defects (3.4 defective features per million opportunities).”


To translate this into a single test context – the test must run 1,000,000 times and fail fewer than 4 times. Alternatively, 250,000 times with no failures.
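
The arithmetic behind these run counts can be sketched in a few lines of C#. This is only an illustration: the SixSigma class and its method names are hypothetical, and the single input is the 3.4 defects per million opportunities quoted above.

```csharp
using System;

public static class SixSigma
{
    // Defect rate quoted above: 3.4 defects per million opportunities.
    public const double DefectRate = 3.4 / 1000000.0;

    // Expected number of failures if a test is run 'runs' times
    // at exactly the six-sigma defect rate.
    public static double ExpectedFailures(long runs)
    {
        return runs * DefectRate;
    }

    // Smallest number of runs at which at least one failure
    // is expected on average.
    public static long RunsForOneExpectedFailure()
    {
        return (long)Math.Ceiling(1.0 / DefectRate);
    }
}
```

ExpectedFailures(1000000) comes to about 3.4 and ExpectedFailures(250000) to about 0.85, so a run of 250,000 calls performing at the six-sigma rate is expected to show less than one failure.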


Load testing to reach six-sigma

Load testing will often result in 250,000 calls being made. In some cases, it may mean that the load test needs to run for 24 hours instead of 1 hour. There are some common problems with many load tests:

  • The load test does not run on a full copy of the production environment – this violates A3.
  • The same data is used time and again for the tests – thus the random-data requirement of A3 fails.
    • If you have a system that has been running for 5 years, then the data should be selected from user-created data with 1/5 from each year
    • If the system has had N releases, then the data should be selected from user-created data with 1/N from each release period
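
The "equal share from each year" selection above can be sketched as a stratified random sample. This is a minimal sketch under my own assumptions: the TestUser type, its CreatedYear property and the StratifiedSampler class are hypothetical names, and the same idea applies if the grouping key is a release period instead of a year.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a user record pulled from a production copy.
public class TestUser
{
    public int Id { get; set; }
    public int CreatedYear { get; set; }
}

public static class StratifiedSampler
{
    // Picks up to 'perGroup' users at random from each creation year,
    // so that every year of production data is equally represented.
    public static List<TestUser> SamplePerYear(
        IEnumerable<TestUser> users, int perGroup, Random rng)
    {
        return users
            .GroupBy(u => u.CreatedYear)
            // Simple shuffle: order each group by a random key, then take the first few.
            .SelectMany(g => g.OrderBy(u => rng.Next()).Take(perGroup))
            .ToList();
    }
}
```

With five years of data and perGroup set to 2, the result holds ten users, two from each year.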

Proposal for a Conforming Pattern

Preliminary development (PD) is done on a virgin system each day. By virgin I mean that databases and other data stores are created from scripts and populated with perfect data. There may be super user data but no common user data.  This should be done by an automated process. I have seen this done in some firms and it has some real benefits:

  • Integration tests must create (instead of borrow) users
    • Integration tests are done immediately after build – the environment is confirmed before any developers arrive at work.
    • Images of this environment could be saved to allow faster restores.
  • Performance is good because the data store is small
  • A test environment is much smaller and can be easily (and cheaply) created on one or more cloud services or even VMs
  • Residue from bad code does not persist (often reducing triage time greatly) – when developers realize they have accidentally jacked the data, they just blow away the environment and recreate it

After the virgin system is built, the developer’s “release folder scripts” are executed – for example, adding new tables, altering stored procedures, adding new data to system tables. Then the integration tests are executed again. Some tests may fail. A simple solution that I have seen is for these tests to call into the data store to get the version number, and to add an extension to NUnit that indicates whether a test applies before or after this version number. Tests that are expected to fail can then be excluded (and also identified as needing a new version to be written).
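
The version check itself can be sketched independently of NUnit. The VersionGate class below is a hypothetical helper, not the extension used at that firm and not part of NUnit; in practice its result could be used to skip a test (for example via NUnit's Assert.Ignore) when the data store version falls outside the range the test was written for.

```csharp
using System;

public static class VersionGate
{
    // Returns true when a test written for data store versions in
    // [minVersion, maxVersion] should run against a store reporting 'current'.
    // A null bound means "no limit" on that side.
    public static bool AppliesTo(Version current, Version minVersion, Version maxVersion)
    {
        if (current == null) throw new ArgumentNullException("current");
        if (minVersion != null && current < minVersion) return false;
        if (maxVersion != null && current > maxVersion) return false;
        return true;
    }
}
```

A test that only makes sense before release 2.0 would pass a maxVersion just below 2.0 and no minVersion; its replacement would pass a minVersion of 2.0.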


Integration development (ID) applies to the situation where there may be multiple teams working on stuff that will go out in a single release. Often it is more efficient to keep the teams in complete isolation for preliminary development – if there are complexities and side-effects then only one team suffers. A new environment is created, then each team’s “release folder scripts” are executed and the tests are run.

i.e. PD+PD+….+PD = ID

This keeps the number of moving code fragments controlled.


Scope of Testing in PD and ID

A2 level is as far as we can go in this environment. We cannot do A1 or A3.


SmokeTest development (STD) means that an image of the production database is made available to the integration team and they can test the code changes using real data. Ideally, they should regress with users created during each release period so artifact issues can be identified. This may be significant testing, but it is not load testing because we do not push up to peak volumes.

A test either creates a new user (in the case of PD and ID) or searches for a random user that was created in a given release cycle (say, release cycle 456) in the case of STD. Of course, code like SELECT TOP 1 *… should not be used; rather, all matching users should be retrieved and one selected at random.


This gets us close to A3 if we do enough iterations.


Designing Unit Tests for Multiple Test Environments

Designing a UserFactory with a signature such as

UserFactory.GetUser(UserAttributes[] requiredAttributes)

can simplify the development of unit tests that can be used across multiple environments. This UserFactory reads a configuration file which may have properties such as

  • CreateNewUser=”true”
  • PickExistingUser=”ByCreateDate”
  • PickExistingUser=”ByReleaseDate”
  • PickExistingUser=”ByCreateDateMostInactive”

In the first case, a user is created with the desired attributes. In the other cases, the attributes are used to filter the production data to get a list of candidates to randomly pick from.
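
A minimal sketch of such a factory follows. It is an illustration under several assumptions of mine: the UserAttributes values and the User type are hypothetical, the configuration is passed in as a constructor flag rather than read from a file, GetUser is an instance method rather than the static call shown above, and the candidate list stands in for a query against the production copy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical attribute values, for illustration only.
public enum UserAttributes { Admin, HasOrders, Inactive }

public class User
{
    public int Id { get; set; }
    public HashSet<UserAttributes> Attributes { get; set; }
}

public class UserFactory
{
    private readonly bool createNewUser;
    private readonly IList<User> existingUsers;
    private readonly Random rng;
    private int nextId = 1000000;

    public UserFactory(bool createNewUser, IList<User> existingUsers, Random rng)
    {
        this.createNewUser = createNewUser;
        this.existingUsers = existingUsers;
        this.rng = rng;
    }

    public User GetUser(UserAttributes[] requiredAttributes)
    {
        if (createNewUser)
        {
            // PD/ID environments: build a fresh user with the requested attributes.
            return new User
            {
                Id = nextId++,
                Attributes = new HashSet<UserAttributes>(requiredAttributes)
            };
        }

        // STD environments: filter existing users, then pick one at random.
        var candidates = existingUsers
            .Where(u => requiredAttributes.All(a => u.Attributes.Contains(a)))
            .ToList();
        if (candidates.Count == 0)
            throw new InvalidOperationException("No existing user has the required attributes.");
        return candidates[rng.Next(candidates.Count)];
    }
}
```

The test code calls GetUser the same way in every environment; only the configuration decides whether the user is created or borrowed.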


In stress scenarios where we want to test for side-effects due to concurrent operations by the same user, we could use the current second to select the same user for all tests starting in that second.


Developers Hiding Significant Errors – Unintentional

At one firm, we successfully established the following guidance:

  • Fatal: When the unexpected happens – for example, the error that was thrown was not mapped to a known error response (i.e. Unexpected Server Error should not be returned)
  • Error: When an error happens that should not happen, i.e. try/catch worked to recover the situation… but…
  • Warning: When the error was caused by customer input. The input must be recorded in the log (less passwords). This typically indicates a defect in UI, training or child applications
  • Info: everything else, i.e. counts
  • Debug: whatever

We also implemented the ability to change the log4net settings on the fly – so we could, in production, get every message for a short period of time (massive logs)

Load Stress with Concurrency

Correct load testing is very challenging and requires significant design and statistics to do and validate the results.


One of the simplest implementations is to have a week-old copy of the database, capture all of the web request traffic from the last week, and play it back in a reduced time period. With new functionality extending existing APIs we are then reasonably good – except that we need to make sure we reach the six-sigma level, i.e. were there at least 250,000 calls? This can be further complicated if the existing system has a 0.1% error rate. A 0.1% error rate means 250 errors are expected on average; unfortunately, this means that detecting a difference of 1 error in 250,000 calls is impossible from a single run (or even a dozen runs). Often the first stage is to drive error rates down to near zero on the existing code base. I have personally (over several months) driven a 50K/day exception logging rate down to less than 10. It can be done – it is just a lot of systematic slow work (and fighting to get these “not business significant” bug fixes into production). IMHO, they are business significant: they reduce triage time, false leads and bug reports, and thus improve the customer experience with the application.
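
To see why one extra error disappears in the noise, it helps to put numbers on the run-to-run variation. The sketch below assumes the calls fail independently (a binomial model, which is my simplification); the BackgroundErrors class and its method names are hypothetical.

```csharp
using System;

public static class BackgroundErrors
{
    // Mean number of errors expected from 'calls' calls at the given error rate.
    public static double ExpectedErrors(long calls, double errorRate)
    {
        return calls * errorRate;
    }

    // Standard deviation of the error count, assuming independent calls (binomial model).
    public static double StandardDeviation(long calls, double errorRate)
    {
        return Math.Sqrt(calls * errorRate * (1.0 - errorRate));
    }
}
```

For 250,000 calls at a 0.1% error rate this gives a mean of 250 errors with a standard deviation of roughly 15.8, so a change of a single error is a small fraction of the normal variation between two identical runs.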


One of the issues is whether the 250,000 calls apply to the system as a whole – or just to the method being added or modified. For true six-sigma, it needs to be the method modified – sorry! And if there are 250,000 different users (or other objects) to be tested, then random selection of test data is required.


I advocate the use of PNUnit (Parallel NUnit) on multiple machines with a slight twist. In the UserFactory.GetUser() described above, we randomly select the user, but for stress testing we could take the current time in seconds (a long), compute it modulo the number of candidate users, and use the result to pick the user before executing the tests. This approach intentionally creates a situation where concurrent activity will be generated, potentially creating blocks, deadlocks and inconsistencies.
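
The seconds-modulo selection can be sketched as below. The ConcurrentUserPicker class is a hypothetical helper, and it assumes that all test machines have reasonably synchronized clocks and see the candidate users in the same order.

```csharp
using System;

public static class ConcurrentUserPicker
{
    // Maps a point in time to an index into the candidate list.
    // Every caller that asks within the same second gets the same index,
    // so parallel tests starting in that second all hit the same user.
    public static int IndexForSecond(DateTime utcNow, int candidateCount)
    {
        if (candidateCount <= 0)
            throw new ArgumentOutOfRangeException("candidateCount");
        long seconds = utcNow.Ticks / TimeSpan.TicksPerSecond;
        return (int)(seconds % candidateCount);
    }
}
```

A test would call IndexForSecond(DateTime.UtcNow, candidates.Count) and use the result to index into its candidate list; one second later the index moves on to the next user.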


There is a nasty problem with using integration tests to mirror the production distribution of calls. Marking tests appropriately may help: the test runner can then select tests to simulate the actual production call distribution and rates. Of course, this requires that data on call rates and error rates is available from the production system.


Make sure that you are giving statistically correct reports!


The easy question to answer is “Does the new code make the error rate statistically worse?” Taking our example above of a 0.1% error rate, we had 250 errors being expected. If we want to have 95% confidence then we would need to see 325 errors to deem it to be worse. You must stop and think about this, because our stated goal was less than 1 error in 250,000 – and we ignore 75 more errors as not being significant!!! This is a very weak criterion. It also makes clear that driving down the background error rate is essential. You cannot get strong results with a high background error rate; you may only be able to demonstrate a 1 sigma defect rate.
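
As a rough cross-check, the threshold can be estimated with a normal approximation to the binomial. This is an assumption on my part (the 325 figure above was not necessarily derived this way), and the ErrorRateCheck class and its method names are hypothetical.

```csharp
using System;

public static class ErrorRateCheck
{
    // Number of standard deviations by which an observed error count
    // exceeds the count expected at the background rate.
    public static double ZScore(long calls, double backgroundRate, long observedErrors)
    {
        double mean = calls * backgroundRate;
        double sd = Math.Sqrt(calls * backgroundRate * (1.0 - backgroundRate));
        return (observedErrors - mean) / sd;
    }

    // Smallest whole error count lying at least 'z' standard deviations above the mean.
    public static long Threshold(long calls, double backgroundRate, double z)
    {
        double mean = calls * backgroundRate;
        double sd = Math.Sqrt(calls * backgroundRate * (1.0 - backgroundRate));
        return (long)Math.Ceiling(mean + z * sd);
    }
}
```

With z = 1.645 (one-sided 95%) this approximation puts the threshold at about 276 errors for 250,000 calls at a 0.1% background rate, which makes 325 a conservative cut-off. Either way the conclusion is the same: dozens of extra errors are treated as noise when the background rate is this high.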


In short, you can rarely demonstrate a better sigma level than your current one unless you first fix the current code base to lower its defect rate.

Thursday, May 12, 2016

The sad state of evidence based development management patterns

I have been in the development game for many decades. I wrote my first programs using APL/360 and Fortran (WatFiv) at the University of Waterloo, and have seen and coded in a lot of languages over the years (FORTH, COBOL, Asm, Pascal, B, C, C++, SAS, etc).


My academic training was in Operations Research – that is, mathematical optimization of business processes. Today, when I look at the development processes that I see, they are dominantly “fly by the seat of the pants”, “everybody is doing it” or “academic correctness”. I am not talking about waterfall or agile or scrum. I am not talking about architecture etc. Yet in some ways I am. Some processes assert Evidence Based Management, yet fail to deliver the evidence of better results. Some bloggers detail the problems with EBM. A few books attempt to summarize the little research that has occurred, such as "Making Software: What Really Works and Why We Believe It".


As an Operations Research person, I would define the optimization problem facing a development manager, director or lead in terms of the following factors:

  • Performance (which often comes at increased man hours to develop and operational costs)
  • Scalability (which often comes at increased man hours to develop and operational costs)
  • Cost to deliver
  • Accuracy of deliverable (Customer satisfaction)
  • Completeness of deliverable
  • Elapsed time to delivery (a shorter time often exponentially increases cost to deliver and defect rates)
  • Ongoing operational costs (a bad design may result in huge cloud computing costs)
  • Time for a new developer to become efficient across the entire product
  • Defect rate
    • Number of defects
    • ETA from reporting to fix
  • Developer resources
    • For development
    • For maintenance

All of these factors interact. As for evidence, there are no studies and I do not expect there to be any. Technology is changing too fast, there are huge differences between projects, and any study would be outdated before it was usable. There is, however, some evidence that we can work from.

Lines of Code across a system

Lines of code directly impact several of the above.

  • Defect rate is a function of the number of lines of code, ranging from 200 to 1,000 defects per 100K lines [source], which is scaled by developer skill level. Junior or new developers will have a higher defect rate.
  • Some classic measures are defined in the literature, for example cyclomatic complexity. Studies find a positive correlation between cyclomatic complexity and defects: functions and methods that have the highest complexity tend to also contain the most defects.
  • Time to deliver is often a function of the lines of code written.

There is a mistaken belief that the lines of code for a project are immutable. In the early 2000s I led a rewrite of a middle tier and backend tier (with the web front end being left as is). The original C++/SQL Server code base was 474,000 lines of code and was the result of 25 man years of coding. With a team of 6 developers new to the application, sent over from India, and 2 intense local developers, we recreated these tiers with 100% API compliance in just 25,000 lines of code in about 8 weeks. 25 man years –> 1 man year, and a roughly 20 fold decrease in code base. The last factor was a 20 fold increase in concurrent load.


On other projects I have seen massive copy and paste (with some minor change) that results in code bloat. When a bug was discovered it was often only fixed in some of the pastes. Martin Fowler describes Lines of Code as a measure of developer productivity as useless; the same applies to lines of code in a project. A change of programming language can result in a 10 fold drop (or increase) in lines of code. A change of a developer can also result in a similar change – depending on skill sets.


Implementation Design

The use of Object-Relational Mapping (ORM) can often result in increased lines of code, more defects, steeper learning curves and greater challenges in addressing performance issues. A simple illustration is moving all addresses in Washington State from a master table to a child table. In SQL Server TSQL it is a one line statement; calling this from C# amounts to 4 lines of code. Using an ORM, this can quickly grow to 100-200 lines. ORMs came along because of a shortage of SQL developer skills. As with most things, they carry hidden costs that are omitted in the sales literature!


“Correct academic design” does not mean effective (i.e. low cost) development. One of the worst systems (for performance and maintenance) that I have seen was absolutely beautifully designed with a massive array of well defined classes – which unfortunately ignored the database reality. Many calls of a single method cascaded through these classes and resulted in 12 – 60 individual SQL queries being executed against the database. Most of the methods could be converted to a wrapper on a single stored procedure with a major improvement of performance. The object hierarchy was flattened (or downsized!).


I extend the concept of cyclomatic complexity to the maximum stack depth in developer-written code. The greater the depth, the longer the code takes to debug (because the developer has to walk through the stack) and, likely, to write. The learning curve goes up. I suggest a maximum depth of 7 (lower than typical cyclomatic complexity limits), ideally 5. This number comes out of research on short term memory (wikipedia). Going beyond seven significantly increases the effort that a developer needs to make to understand the stack. On the one hand, having a deep hierarchy of objects looks nice academically – but it is counterproductive for efficient coding. Seven is a magic number: keep asking “Why do we have more than seven ….”

Developer Skill Sets

Many architects suffer from the delusion that all developers are as skilled as they are, i.e. IQs over 145. During my high school teaching years, I was assigned both gifted classes and challenged classes – and learned to present appropriately to both. In some cities (for example Stockholm, Sweden) 20% of the work force is in IT. This means that the IQ of the developers likely ranges from 100 upwards. When an application is released, the support developers will likely end up with an average IQ around 100. The question must be asked: how simple is the code to understand for future enhancements and maintenance?


If a firm has a policy of significant use of off-shore or contractor resources, there are  further challenges:

  • A high percentage of the paid time is in ramp-up mode
  • There is a high level of non-conformity to existing standards and practices.
    • Higher defect rate, greater time for existing staff to come up to speed on the code
  • Size of team and ratio of application-experienced versus new developers can greatly alter delivery schedules (see Brooks’s law)

Pseudo-coding different architectures rarely happens. It has some advantages – if you code up the most complex logic and then ask the question “A bug happens and nothing comes back, what are the steps to isolate the issue with certainty?”, the architecture with the fewest diagnostic steps may be the more efficient one.


Last, there is the availability, now and in the future, of developers with the appropriate skills. The industry is full of technologies that were hot and promised the moon and then were disrupted by a new technology (think of Borland Delphi and Pascal!). I often compute a weighted value composed of years since launch, popularity at the moment and trend to refine choices (and in some cases to say No to a developer or architect who wants to play with the latest and greatest!). Some useful sites are DB-Engines Ranking and PYPL. After short-listing, it’s a matter of coding up some complex examples in each and counting the lines of code needed.

Specification Completeness And Stability

On one side, I have worked with a few PMs who delivered wonderful specifications (200-500 pages) that had no change-orders between the first line of code being written and final delivery a year later. What was originally handed to developers was not changed. Work was done in sprints. The behavior and content of every web page was detailed. There was a clean and well-reviewed dictionary of terms and meanings. Needless to say, delivery was prompt, on schedule, etc.


On the other side, I have had minor change-requests which mutated constantly. The number of lines of code written over all of these changes was 20x the number of lines of code finally delivered.

Concurrent Development

Concurrent development means that two or more sets of changes are happening to the same code base. At one firm we had several GitHub forks: Master, Develop, Sprint, Epic and Saga. The titles indicate when the changes were expected to be propagated to master. It worked reasonably, but often I ended up spending two days resolving conflicts and fixing bugs that were introduced whenever I attempted to get forks in sync. Concurrent development increases overhead exponentially with the number of independent forks that are active. Almost everything in development has exponential cost with size; there is no economy of scale in development.


On the flip side, at Amazon using the microservices model, there was no interaction between feature requests. Each API was self contained and would evolve independently. If an API needed another API changed, then the independent API would be changed, tested and released. The dependent API then was developed against the released independent API. There was no code-juggling act. Each API code base was developed on its own and was self-contained. Dependencies were by API, not by libraries and code bases.


Bottom Line

Controlling costs and improving delivery depends greatly on the preparation work IMHO -- namely:

  • Specification stability and completeness
  • Architectural / Design being well crafted for the developer population
  • Minimum noise (i.e. no concurrent development, change orders, change of priorities)
  • Methodology (Scrum, Agile, Waterfall, Plan Driven) is of low significance IMHO – except for those selling it and ‘true believers’.

On the flip side, often the business will demand delivery schedules that add technical debt and significantly increase ongoing costs.


A common problem that I have seen is solving this multi-dimensional problem by looking at just one (and rarely two) dimensions and discovering the consequences of that decision downstream. I will continue to add additional dimensions as I recall them from past experience.

Tuesday, May 10, 2016

Mining PubMed via Neo4J Graph Database–Getting the data

I have a blog dealing with various complex autoimmune diseases and spend a lot of time walking links at PubMed. Often readers send me an article that I missed.


I thought that a series of posts on how to do it would help other people (including MDs, grad students and citizen scientists) better research medical issues.


Getting the data from Pub Med

I implemented a simple logic to obtain a collection of relevant articles:

  • Query for 10,000 articles on a subject or key word

  • Retrieve each of these articles and any articles they referenced (i.e. the knowledge graph).
  • Keep repeating until you have enough articles or you run out of them!!
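
The three steps above amount to a breadth-first walk of the citation graph. The sketch below shows only that control flow, using an in-memory stand-in for the downloads: CitationCrawler and the getReferences delegate are hypothetical names, and no PubMed calls are made.

```csharp
using System;
using System.Collections.Generic;

public static class CitationCrawler
{
    // Breadth-first walk of the citation graph starting from the seed ids.
    // 'getReferences' stands in for "download this article and read the ids it cites".
    // Stops when there is nothing left to visit or 'maxArticles' ids have been collected.
    public static HashSet<string> Crawl(
        IEnumerable<string> seedIds,
        Func<string, IEnumerable<string>> getReferences,
        int maxArticles)
    {
        var seen = new HashSet<string>();
        var todo = new Queue<string>();

        foreach (var id in seedIds)
            if (seen.Count < maxArticles && seen.Add(id))
                todo.Enqueue(id);

        while (todo.Count > 0)
        {
            var current = todo.Dequeue();
            foreach (var child in getReferences(current))
            {
                if (seen.Count >= maxArticles) return seen;   // enough articles
                if (seen.Add(child)) todo.Enqueue(child);     // only queue new ids
            }
        }
        return seen;                                          // ran out of articles
    }
}
```

The set of seen ids plays the role of the "already downloaded" list, and the queue is the "to do next" list.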

Getting the bootstrapping list of articles

A console application that reads the command line arguments and retrieves the list. For example,

downloader.exe Crohn’s Disease

which produces a URI ending in Crohn's+disease

This results in an XML file being sent


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD esearch 20060628//EN" "">

So let us look at the code

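class Program {
    static Downloader downloader = new Downloader();
    static void Main(string[] args) {
        if (args.Length > 0) {
            var search = new StringBuilder();
            foreach (var arg in args)
                search.AppendFormat("{0} ", arg);
            // Seed the queue from the search, download everything, then save the index.
            downloader.TermSearch(search.ToString().Trim());
            downloader.ProcessAll();
            downloader.Save();
        }
    }
}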

The Downloader class tracks articles already downloaded and those to do next. It simply starts downloading and saving each article summary to an Xml file using the unique article Id as the file name. I wanted to keep the summaries on my disk to speed reprocessing if my Neo4J model changes.

using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Net;
using System.Linq;
using System.Threading.Tasks;
using System.Xml;
using System.Text;
using System.Configuration;
using System.IO;

namespace PubMed {
    public class Downloader {
        // Entrez E-utilities at the US National Center for Biotechnology Information:
        static readonly String server = "";
        string dataFolder = "C:\\PubMed";
        string logFile;
        // Ids of articles already downloaded.
        public ConcurrentBag<string> index = new ConcurrentBag<string>();
        // Ids of articles waiting to be downloaded.
        public ConcurrentQueue<string> todo = new ConcurrentQueue<string>();

        public Downloader() {
            // Reload the ids saved by a previous run, if any.
            logFile = Path.Combine(dataFolder, "article.log");
            if (File.Exists(logFile)) {
                var lines = File.ReadAllLines(logFile);
                foreach (var line in lines)
                    if (!string.IsNullOrWhiteSpace(line))
                        index.Add(line);
            }
        }

        public void Save() {
            File.WriteAllLines(logFile, index.ToArray());
        }

        public void ProcessAll() {
            var nextId = string.Empty;
            while (todo.Count > 0) {
                if (todo.Count > 12) {
                    // Plenty of work queued: download up to ten articles in parallel.
                    var tasks = new List<Task>();
                    for (int t = 0; t < 10; t++)
                        if (todo.TryDequeue(out nextId)) {
                            // Copy the id so that each task captures its own value.
                            var id = nextId;
                            tasks.Add(Task.Factory.StartNew(() => NcbiPubmedArticle(id)));
                        }
                    Task.WaitAll(tasks.ToArray());
                } else {
                    if (todo.TryDequeue(out nextId))
                        NcbiPubmedArticle(nextId);
                }
            }
        }

        // Runs the term search and queues every article id it returns.
        public void TermSearch(String term) {
            var search = string.Format("{0}", term.Replace(" ", "+"));
            new WebClient().DownloadFile(new Uri(search), "temp.log");
            var xml = new XmlDocument();
            xml.Load("temp.log");
            foreach (XmlNode node in xml.DocumentElement.SelectNodes("//Id")) {
                var id = node.InnerText;
                if (!index.Contains(id) && !todo.Contains(id))
                    todo.Enqueue(id);
            }
        }

        // Downloads one article summary to <id>.xml and queues the articles it references.
        public void NcbiPubmedArticle(String term) {
            if (!index.Contains(term)) {
                try {
                    var fileLocation = Path.Combine(dataFolder, string.Format("{0}.xml", term));
                    if (File.Exists(fileLocation)) return;
                    var search = string.Format("{0}&retmode=xml", term);
                    new WebClient().DownloadFile(new Uri(search), fileLocation);
                    index.Add(term);
                    GetChildren(fileLocation);
                } catch (Exception exc) {
                    Console.WriteLine(exc.Message);
                }
            }
        }

        // Queues every PMID found in a downloaded article that has not been seen yet.
        private void GetChildren(string fileName) {
            try {
                var dom = new XmlDocument();
                dom.Load(fileName);
                foreach (XmlNode node in dom.DocumentElement.SelectNodes("//PMID")) {
                    var id = node.InnerText;
                    if (!index.Contains(id) && !todo.Contains(id))
                        todo.Enqueue(id);
                }
            } catch (Exception exc) {
                Console.WriteLine(exc.Message);
            }
        }
    }
}

Next: Importing into Neo4J

An example of the structured data to load is shown below. Try defining your own model while you wait for the next post. 


<?xml version="1.0"?>
<!DOCTYPE PubmedArticleSet PUBLIC "-//NLM//DTD PubMedArticle, 1st January 2016//EN" "">
    <MedlineCitation Owner="NLM" Status="MEDLINE">
        <PMID Version="1">10022306</PMID>
        <Article PubModel="Print">
                <ISSN IssnType="Print">0378-4274</ISSN>
                <JournalIssue CitedMedium="Print">
                <Title>Toxicology letters</Title>
                <ISOAbbreviation>Toxicol. Lett.</ISOAbbreviation>
            <ArticleTitle>Epidemiological association in US veterans between Gulf War illness and exposures to anticholinesterases.</ArticleTitle>
                <AbstractText>To investigate complaints of Gulf War veterans, epidemiologic, case-control and animal modeling studies were performed. Looking for OPIDP variants, our epidemiologic project studied 249 Naval Reserve construction battalion (CB24) men. Extensive surveys were drawn for symptoms and exposures. An existing test (PAI) was used for neuropsychologic. Using FACTOR, LOGISTIC and FREQ in 6.07 SAS, symptom clusters were sought with high eigenvalues from orthogonally rotated two-stage factor analysis. After factor loadings and Kaiser measure for sampling adequacy (0.82), three major and three minor symptom clusters were identified. Internally consistent by Cronbach's coefficient, these were labeled syndromes: (1) impaired cognition; (2) confusion-ataxia; (3) arthro-myo-neuropathy; (4) phobia-apraxia; (5) fever-adenopathy; and (6) weakness-incontinence. Syndrome variants identified 63 patients (63/249, 25%) with 91 syndromes. With pyridostigmine bromide as the drug in these drug-chemical exposures, syndrome chemicals were: (1) pesticide-containing flea and tick collars (P &lt; 0.001); (2) alarms from chemical weapons attacks (P &lt; 0.001), being in a sector later found to have nerve agent exposure (P &lt; 0.04); and (3) insect repellent (DEET) (P &lt; 0.001). From CB24, 23 cases, 10 deployed and 10 non-deployed controls were studied. Auditory evoked potentials showed dysfunction (P &lt; 0.02), nystagmic velocity on rotation testing, asymmetry on saccadic velocity (P &lt; 0.04), somatosensory evoked potentials both sides (right P &lt; 0.03, left P &lt; 0.005) and synstagmic velocity after caloric stimulation bilaterally (P-range, 0.02-0.04). Brain dysfunction was shown on the Halstead Impairment Index (P &lt; 0.01), General Neuropsychological Deficit Scale (P &lt; 0.03) and Trail Making part B (P &lt; 0.03). Butylcholinesterase phenotypes did not trend for inherent abnormalities. 
Parallel hen studies at Duke University established similar drug-chemical delayed neurotoxicity. These investigations lend credibility that sublethal exposures to drug-chemical combinations caused delayed-onset neurotoxic variants.</AbstractText>
            <AuthorList CompleteYN="Y">
                <Author ValidYN="Y">
                    <ForeName>T L</ForeName>
                        <Affiliation>Department of Internal Medicine, University of Texas Southwestern Medical School, Dallas 75235, USA.</Affiliation>
                <PublicationType UI="D016428">Journal Article</PublicationType>
                <PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
            <MedlineTA>Toxicol Lett</MedlineTA>
                <NameOfSubstance UI="D002800">Cholinesterase Inhibitors</NameOfSubstance>
                <DescriptorName MajorTopicYN="N" UI="D016022">Case-Control Studies</DescriptorName>
                <DescriptorName MajorTopicYN="N" UI="D002800">Cholinesterase Inhibitors</DescriptorName>
                <QualifierName MajorTopicYN="Y" UI="Q000633">toxicity</QualifierName>
                <DescriptorName MajorTopicYN="N" UI="D006801">Humans</DescriptorName>
                <DescriptorName MajorTopicYN="N" UI="D008297">Male</DescriptorName>
                <DescriptorName MajorTopicYN="N" UI="D018923">Persian Gulf Syndrome</DescriptorName>
                <QualifierName MajorTopicYN="Y" UI="Q000209">etiology</QualifierName>
                <DescriptorName MajorTopicYN="Y" UI="D014728">Veterans</DescriptorName>
            <PubMedPubDate PubStatus="pubmed">
            <PubMedPubDate PubStatus="medline">
            <PubMedPubDate PubStatus="entrez">
            <ArticleId IdType="pubmed">10022306</ArticleId>


Saturday, May 7, 2016

Microservices–Do it right!

In my earlier post, A Financially Frugal Architectural Pattern for the Cloud, I advocated the use of microservices. Microservices are similar to REST: a concept, pattern or architectural style, unlike SOAP, which is standards based. The modern IT industry trends towards “good enough”, “lip-service” and “we’ll fix it in the next release”. A contemporary application may use relational database software (SQL Server, Oracle, MySql) and thus the developers (and their management) would assert that theirs is a relational database system. If I move a magnetic tape based system into tables (one table for each type of tape) using relational database software – would that make it a relational database system? My opinion is no – never!!!


Then what makes it one? The data has been fully normalized in the logical model. Often the database has never been reviewed for normalization, despite such information being ancient (see William Kent, A Simple Guide to Five Normal Forms in Relational Database Theory, 1982) – older than many developers. The implementation may be de-normalized in the physical model (if you have just a ‘database model’ and not separate physical and logical models, then you are likely heading for trouble in time, usually after the original developers have left!). For NoSql databases, there is a lot of ancient literature out there dealing with both hierarchical databases and network databases, which should also be applied when using MongoDB and Neo4j – but likely is not.


My academic training is in mathematics and thus in axioms and deriving theorems from them through rigorous logic. The normalization of databases is immediately attractive to me. Knowing the literature (especially Christopher J. Date’s early writings from the 1970’s) is essential since “Those who do not learn history are doomed to repeat it.”


Microservices Normalization Rules

Below is my attempt to define equivalent rules for microservices. They will likely be revised over time. They are very mathematical in definition by intent. Martin Fowler’s article is also a good read. Much of the discussion on the web is at a high level (the hand waving level), such as Microservices Architecture and Design Principles and Microservices Design Principles, with some echoing some of the issues cited below, such as Adopting Microservices at Netflix: Lessons for Architectural Design.


A public REST API consumed over the internet is probably not a microservice. It may use many microservices and other composite APIs.


  • Composite API: A service that consumes other composite APIs and/or microservices but does not qualify as either type of microservice below
  • Independent Microservice: A service that does not call any other microservices
  • Dependent Microservice: A service that calls Independent Microservices in parallel
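
The three definitions above can be checked mechanically once you know who calls whom. The following is a minimal sketch, not a definitive implementation: the Service record, the calls_in_parallel flag and the example service names are all hypothetical, and it ignores the data store rules that come later in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Service:
    """Hypothetical description of a service, gathered by inspection."""
    name: str
    calls: List[str] = field(default_factory=list)  # names of services it calls
    calls_in_parallel: bool = True                   # False if any call is sequential


def classify(name: str, registry: Dict[str, Service]) -> str:
    """Classify a service using only the three definitions in the list above."""
    svc = registry[name]
    if not svc.calls:
        return "independent microservice"
    # A dependent microservice may only call independent microservices,
    # and must make those calls in parallel.
    callees_independent = all(not registry[c].calls for c in svc.calls)
    if callees_independent and svc.calls_in_parallel:
        return "dependent microservice"
    return "composite API"


# Hypothetical example system.
registry = {s.name: s for s in [
    Service("customers"),
    Service("orders"),
    Service("order_summary", calls=["customers", "orders"]),
    Service("storefront", calls=["order_summary", "customers"]),
]}
```

Here storefront falls through to composite API because one of the services it calls (order_summary) is itself dependent.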

An Independent Microservice

An independent microservice is the exclusive owner of a data store.

  • No other service or system may access the data store.
  • A microservice may change the software used to create the datastore with no consequences on any other system.
  • A microservice does not make calls to other services
    • A corollary of this is that microservices rarely use any libraries that are not generic across the industry
      • Exception: libraries of static functions that are explicit to a firm, for example, encryption of keys (i.e. Identity Integers –> strings)
  • A microservice may contain publishers
    • The datastore that it controls may need to be pushed to reporting and other systems
  • A microservice may create log records that are directly consumed by other systems.
    • Non-blocking outputs from a microservice are fine
  • A microservice may make periodic calls.
    • A microservice may pull things off a queue, or push things to a queue
      • The nature of this data is transient. The queue service interaction must match the model of an API call and response: the call comes from one queue and the response is written to another queue.
        • Ideally there will be no references to queues inside the microservice.
        • A call to the microservice would start reading a queue at a regular interval (the queue is in the call parameters)
        • The data on the queue would specify where the results should be sent
    • A microservice should not pull data from a (non-microservice) data store.
      • Exception: a configurationless implementation such as described for queues above is fine.
  • A microservice configuration should never reference anything outside of its own world.
    • Configuration Injection is allowed. The microservice is deployed and loaded, then a call is made to supply its configuration.
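
The queue rules above (no queue references inside the service, the queue supplied in the call parameters, the data naming where results go) might look roughly like this. This is a sketch under stated assumptions: it uses Python's in-process queue module as a stand-in for a real queue service, and handle_request, drain, reply_to and the message layout are hypothetical names for illustration.

```python
import queue
from typing import Callable, Dict


def handle_request(payload: dict) -> dict:
    """The service logic itself: same shape as an API call and response."""
    return {"total": sum(payload["items"])}


def drain(inbox: queue.Queue, outboxes: Dict[str, queue.Queue],
          handler: Callable[[dict], dict]) -> int:
    """Process every message currently on inbox and return how many were handled.

    The queues arrive as call parameters, so nothing about them is stored
    in the service's own configuration.
    """
    handled = 0
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            return handled
        result = handler(message["payload"])
        # The message itself says where the response should be written.
        outboxes[message["reply_to"]].put(result)
        handled += 1


# Hypothetical usage: one request arrives, the reply goes to the "billing" queue.
inbox = queue.Queue()
replies = {"billing": queue.Queue()}
inbox.put({"reply_to": "billing", "payload": {"items": [2, 3]}})
handled_count = drain(inbox, replies, handle_request)
```

Calling drain at a regular interval is left to whatever scheduler makes the periodic call; the service logic never needs to know a queue was involved.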

A Dependent Microservice

  • Has exclusive ownership of its data store just like an independent microservice.
  • Calls independent microservices to obtain data only (no update, delete or create)
    • Create Update Delete calls must always go to the independent microservice that owns it, no relaying should occur.
    • Calls are in parallel, never sequential
  • Note: Some types of queries may be inefficient, those should be directed at a reporting microservice (which independent microservices may publish to) or a composite service.
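
As a rough illustration of "parallel, never sequential" read-only calls, here is a sketch using a thread pool. The two fetch functions are hypothetical stand-ins for HTTP GETs against independent microservices, and they return canned data so that the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_customer(customer_id: int) -> dict:
    """Stand-in for a read-only call to a (hypothetical) customers microservice."""
    return {"id": customer_id, "name": "Ada"}


def fetch_orders(customer_id: int) -> list:
    """Stand-in for a read-only call to a (hypothetical) orders microservice."""
    return [{"order": 1, "customer": customer_id}]


def customer_overview(customer_id: int) -> dict:
    """Dependent microservice: issue all reads at once, then combine the results."""
    readers = {"customer": fetch_customer, "orders": fetch_orders}
    with ThreadPoolExecutor(max_workers=len(readers)) as pool:
        # Submit every call before waiting on any of them.
        futures = {key: pool.submit(fn, customer_id) for key, fn in readers.items()}
        return {key: future.result() for key, future in futures.items()}
```

If the second call needed the result of the first, the calls would be sequential, and by the rules above the service would be a composite service rather than a dependent microservice.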


Control of Microservices

The best model that I have seen is one that some groups at Amazon used.

  • All calls to the microservice must have an identifier (cookie?) that identifies the caller.
    • The microservice determines if it is an authorized caller based on the identifier and possibly the IP address
  • The consumer of the microservice must be authorized by the owner of the microservice based on a contract containing at least:
    • Daily load estimates
    • Peak load estimates
    • Availability
  • The microservice may disable any consumer that exceeds the contracted load.
  • The consumer should be given a number from 1 to 100 indicating business importance.
    • If the microservice is stressed, then those consumers with lower values will be disabled first.
  • There should always be an SLA in place
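
The control rules above can be sketched as a small admission check. This is an illustration only, not how any group at Amazon implemented it: Contract, Gatekeeper, set_stress_floor and the per-minute counting are assumptions, and resetting the counters each minute is left out.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Contract:
    """Hypothetical contract terms agreed with the microservice owner."""
    peak_calls_per_minute: int
    importance: int  # 1 to 100, higher means more important to the business


class Gatekeeper:
    def __init__(self, contracts: Dict[str, Contract]):
        self.contracts = contracts
        self.calls_this_minute: Dict[str, int] = {}
        self.min_importance = 1  # raised while the service is stressed

    def set_stress_floor(self, min_importance: int) -> None:
        """Under stress, disable every consumer below this importance."""
        self.min_importance = min_importance

    def admit(self, caller_id: str) -> bool:
        contract = self.contracts.get(caller_id)
        if contract is None:
            return False  # no contract, so not an authorized caller
        if contract.importance < self.min_importance:
            return False  # shed low-importance consumers first
        used = self.calls_this_minute.get(caller_id, 0)
        if used >= contract.peak_calls_per_minute:
            return False  # consumer exceeded its contracted load
        self.calls_this_minute[caller_id] = used + 1
        return True
```

For example, a consumer with importance 20 is refused as soon as the service raises its floor to 50, while a consumer with importance 90 keeps being served up to its contracted peak.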

Microservices should be executed on isolated VMs behind a load distributor. The datastore should also be on a dedicated set of VMs, for example a set of VMs supporting a Cassandra implementation.


More suggestions?

To quote Martin Fowler, “If the components do not compose cleanly, then all you are doing is shifting complexity from inside a component to the connections between components. Not just does this just move complexity around, it moves it to a place that's less explicit and harder to control.” I have seen this happen – with the appearance of microservices (because there are tons of REST APIs on different servers) but behind this layer, there are shared DLLs accessing multiple databases, causing endless pain with keeping the DLLs current as features are added or bugs fixed. A bug in a single library may require the fix to be propagated to a dozen REST APIs. If it must be propagated, it is not a microservice.


My goal with this post is to define a set of objective check items that can be clearly determined by inspection. The ideal implementation would have all of them passing.
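
As one example of a check item that can be determined by inspection, the exclusive ownership rule can be tested from a simple inventory of which service touches which data store. This is a minimal sketch; the inventory format and the service and store names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def shared_data_stores(access: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return every data store touched by more than one service.

    access maps a service name to the set of data stores it reads or writes.
    An empty result means the exclusive ownership check passes.
    """
    users = defaultdict(list)
    for service, stores in access.items():
        for store in stores:
            users[store].append(service)
    return {store: sorted(svcs) for store, svcs in users.items() if len(svcs) > 1}


# Hypothetical inventory: billing reaches into the orders database.
inventory = {
    "orders": {"orders_db"},
    "billing": {"billing_db", "orders_db"},
    "customers": {"customers_db"},
}
violations = shared_data_stores(inventory)
```

Each of the other rules could be given a similar pass or fail check, so that a design review becomes a list of named violations instead of an argument.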


One of the side-effects is that the rules can often be inconvenient for quick designs. A rethinking of what you are trying to do often results – similar to what I have seen happen when you push for full normalization in a logical model. 


Do you have further suggestions?