Monday, November 25, 2013

Counterexample #1 - Performance Engineering of

For the past few months we have all had ringside seats to the spectacular failure of planning and communication that is - the personalized health insurance marketplace run by the United States Federal Government.

We now know that the project team and its managers were aware of problems with the application as early as March. That insufficient testing, evolving requirements, and performance were all contributors to the limitations that were seen at launch where the system could only handle 1,100 users per day. Considering that initial estimates were anticipating 50 to 60 thousand simultaneous users and in reality has been seeing upwards of 250,000 simultaneous users, this is a remarkable example of the impact of failing to engineer for performance.

On September 27th, four days before go-live, the Acting Directory of the CMS Office of Enterprise Management David Nelson wrote the following, illuminating, quote: "We cannot proactively find or replicate actual production capacity problems without an appropriately sized operational performance testing environment." By September 30th, the day before go-live another email : "performance degradation started when there were around 1,100 to 1,200 users".

The Catalogue of Catastrophe, a list of failed or troubled projects around the world has this to say about the project: " joins the list of projects that underestimated the volume of transactions they would be facing (see "Failure to address performance requirements" for further examples)."

If we example the list of Classic Mistakes as to why projects fail we can see that committed no less than 7 out of the top 10. Clay Shirky has written
 a fabulous article titled and the Gulf Between Planning and Reality, that explains the scope and magnitude of failure of communication that occurred on this project, as well as the inherent flaw in the statement "Failure is not an option."