Software development effort estimation

In software development, effort estimation is the process of predicting the most realistic amount of effort (expressed in terms of person-hours or money) required to develop or maintain software based on incomplete, uncertain and noisy input. Effort estimates may be used as input to project plans, iteration plans, budgets, investment analyses, pricing processes and bidding rounds.^[1]^[2]

State-of-practice

Published surveys on estimation practice suggest that expert estimation is the dominant strategy when estimating software development effort.^[3]

Typically, effort estimates are over-optimistic and there is strong over-confidence in their accuracy. The mean effort overrun seems to be about 30% and does not appear to be decreasing over time; for a review of effort estimation error surveys, see Molokken and Jørgensen.^[4] However, the measurement of estimation error is problematic (see Assessing the accuracy of estimates). The strong overconfidence in the accuracy of effort estimates is illustrated by the finding that, on average, when a software professional is 90% confident or “almost sure” that a minimum-maximum interval includes the actual effort, the observed frequency of including the actual effort is only 60-70%.^[5]
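The overconfidence finding can be checked against an organization's own records by comparing stated minimum-maximum intervals with actual effort. The following Python sketch shows one way to compute the observed hit rate; the function name `interval_hit_rate` and all project figures are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def interval_hit_rate(
    intervals: Sequence[Tuple[float, float]], actuals: Sequence[float]
) -> float:
    """Return the fraction of projects whose actual effort fell inside
    the stated (minimum, maximum) effort interval."""
    if len(intervals) != len(actuals) or not intervals:
        raise ValueError("need one interval per actual and at least one project")
    hits = sum(
        1 for (low, high), actual in zip(intervals, actuals) if low <= actual <= high
    )
    return hits / len(actuals)


# Hypothetical "90% confident" intervals and actual effort, in person-hours.
intervals = [(80, 120), (40, 60), (200, 260), (10, 15), (90, 130)]
actuals = [110, 75, 250, 22, 125]

hit_rate = interval_hit_rate(intervals, actuals)
print(f"Stated confidence: 90%, observed hit rate: {hit_rate:.0%}")
```

A hit rate well below the stated confidence level, as in this made-up data set, would indicate intervals that are too narrow.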

Currently the term “effort estimate” is used to denote concepts as different as the most likely use of effort (modal value), the effort that corresponds to a 50% probability of not being exceeded (median), the planned effort, the budgeted effort, or the effort used to propose a bid or price to the client. This is believed to be unfortunate, because communication problems may occur and because the concepts serve different goals.^[6]^[7]
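A small numerical illustration may help show why these concepts should not be mixed. The Python sketch below assumes a hypothetical, right-skewed set of possible effort outcomes for one task; the numbers are invented, and the mean is included only for comparison.

```python
import statistics

# Hypothetical possible effort outcomes for one task, in person-days.
# The long right tail (20, 30) is an assumption made for illustration.
outcomes = [8, 10, 10, 10, 12, 14, 20, 30]

most_likely = statistics.mode(outcomes)    # modal value: most frequent outcome
median = statistics.median(outcomes)       # 50% chance of not exceeding this value
mean = statistics.mean(outcomes)           # expected value, shown for comparison

print(f"most likely: {most_likely}, median: {median}, mean: {mean}")
```

In this made-up example the most likely effort is lower than the median, so a plan based on the "most likely" figure would be exceeded more often than not.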

History

Software researchers and practitioners have been addressing the problems of effort estimation for software development projects since at least the 1960s; see, e.g., work by Farr^[8]^[9] and Nelson.^[10]

Most of the research has focused on the construction of formal software effort estimation models. The early models were typically based on regression analysis or mathematically derived from theories from other domains. Since then a large number of model-building approaches have been evaluated, such as approaches founded on case-based reasoning, classification and regression trees, simulation, neural networks, Bayesian statistics, lexical analysis of requirement specifications, genetic programming, linear programming, economic production models, soft computing, fuzzy logic modeling, statistical bootstrapping, and combinations of two or more of these models. Perhaps the most common estimation methods today are the parametric estimation models COCOMO, SEER-SEM and SLIM. They have their basis in estimation research conducted in the 1970s and 1980s and have since been updated with new calibration data, with the last major release being COCOMO II in the year 2000. The estimation approaches based on functionality-based size measures, e.g., function points, are also based on research conducted in the 1970s and 1980s, but have been re-calibrated with modified size measures and different counting approaches, such as the use case points^[11] or object points in the 1990s.

Estimation approaches

There are many ways of categorizing estimation approaches; see, for example, Briand and Wieczorek^[12] and Jørgensen and Shepperd.^[13] The top-level categories are the following:

Expert estimation: The quantification step, i.e., the step where the estimate is produced, is based on judgmental processes.^[14]
Formal estimation model: The quantification step is based on mechanical processes, e.g., the use of a formula derived from historical data.
Combination-based estimation: The quantification step is based on a judgmental and mechanical combination of estimates from different sources.
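To make the idea of "a formula derived from historical data" concrete, the following Python sketch fits a simple power-law relationship, effort = a * size^b, to past projects by least squares on log-transformed values. The functional form, the function names `fit_power_law` and `predict_effort`, and the historical data are assumptions made for illustration; they are not the formula of any particular published model.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(sizes: Sequence[float], efforts: Sequence[float]) -> Tuple[float, float]:
    """Fit effort = a * size**b by ordinary least squares on log(size), log(effort)."""
    if len(sizes) != len(efforts) or len(sizes) < 2:
        raise ValueError("need at least two (size, effort) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all sizes are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                          # exponent (slope in log-log space)
    a = math.exp(mean_y - b * mean_x)      # multiplier (intercept in log-log space)
    return a, b


def predict_effort(a: float, b: float, size: float) -> float:
    """Apply the fitted formula to a new project size."""
    return a * size ** b


# Hypothetical history: size in function points, effort in person-hours.
history_sizes = [120, 250, 400, 600, 900]
history_efforts = [300, 700, 1200, 1900, 3100]

a, b = fit_power_law(history_sizes, history_efforts)
print(f"effort = {a:.2f} * size^{b:.2f}")
print(f"predicted effort for 500 FP: {predict_effort(a, b, 500):.0f} person-hours")
```

The quantification step here is purely mechanical: once the parameters are fitted, the estimate follows from the size input without further judgment.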

Below are examples of estimation approaches within each category.

Estimation approach	Category	Examples of support of implementation of estimation approach
Analogy-based estimation	Formal estimation model	ANGEL, Weighted Micro Function Points
WBS-based (bottom up) estimation	Expert estimation	Project management software, company specific activity templates
Parametric models	Formal estimation model	COCOMO, SLIM, SEER-SEM, TruePlanning for Software
Size-based estimation models^[15]	Formal estimation model	Function Point Analysis,^[16] Use Case Analysis, Use Case Points, SSU (Software Size Unit), Story points-based estimation in Agile software development, Object Points
Group estimation	Expert estimation	Planning poker, Wideband Delphi
Mechanical combination	Combination-based estimation	Average of an analogy-based and a Work breakdown structure-based effort estimate^[17]
Judgmental combination	Combination-based estimation	Expert judgment based on estimates from a parametric model and group estimation

Selection of estimation approaches

The evidence on differences in estimation accuracy between estimation approaches and models suggests that there is no “best approach” and that the relative accuracy of one approach or model in comparison to another depends strongly on the context.^[18] This implies that different organizations benefit from different estimation approaches. Findings^[19] that may support the selection of an estimation approach based on its expected accuracy include:

Expert estimation is on average at least as accurate as model-based effort estimation. In particular, situations with unstable relationships and information of high importance not included in the model may suggest use of expert estimation. This assumes, of course, that experts with relevant experience are available.
Formal estimation models not tailored to a particular organization’s own context may be very inaccurate. Use of the organization’s own historical data is consequently crucial if one cannot be sure that the estimation model’s core relationships (e.g., formula parameters) are based on similar project contexts.
Formal estimation models may be particularly useful in situations where the model is tailored to the organization’s context (either through use of its own historical data or because the model is derived from similar projects and contexts), and it is likely that the experts’ estimates will be subject to a strong degree of wishful thinking.

The most robust finding, in many forecasting domains, is that combining estimates from independent sources, preferably applying different approaches, will on average improve the estimation accuracy.^[19]^[20]^[21]
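A mechanical combination can be as simple as an unweighted average of estimates from different sources. The Python sketch below assumes equal weights and uses invented numbers; `combine_mechanically` is a hypothetical helper name. It also shows a general property of averaging: the absolute error of the averaged estimate is never larger than the average of the individual absolute errors, although it can be larger than the error of the best single estimate.

```python
from statistics import mean
from typing import Sequence


def combine_mechanically(estimates: Sequence[float]) -> float:
    """Combine independent effort estimates with an unweighted average
    (equal weights are an assumption; other weightings are possible)."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return mean(estimates)


# Hypothetical estimates for the same project, in person-hours.
analogy_estimate = 400.0   # from an analogy-based approach
wbs_estimate = 520.0       # from a work breakdown structure-based approach
actual_effort = 500.0      # known only after the project has finished

combined = combine_mechanically([analogy_estimate, wbs_estimate])

individual_errors = [abs(e - actual_effort) for e in (analogy_estimate, wbs_estimate)]
combined_error = abs(combined - actual_effort)

print(f"combined estimate: {combined}")
print(f"mean individual error: {mean(individual_errors)}, combined error: {combined_error}")
```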

It is important to be aware of the limitations of each traditional approach to measuring software development productivity.^[22]

In addition, other factors such as ease of understanding and communicating the results of an approach, ease of use of an approach, and cost of introduction of an approach should be considered in a selection process.

Assessing the accuracy of estimates

The most common measure of the average estimation accuracy is the MMRE (Mean Magnitude of Relative Error), where the MRE of each estimate is defined as:

MRE = ${\frac {|{\text{actual effort}}-{\text{estimated effort}}|}{\text{actual effort}}}$
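As a rough illustration of the definition above, the following Python sketch computes the MRE of each estimate and the MMRE of a set of projects. The function names and the project figures are hypothetical.

```python
from typing import Sequence


def mre(actual: float, estimated: float) -> float:
    """Magnitude of relative error: |actual - estimated| / actual."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - estimated) / actual


def mmre(actuals: Sequence[float], estimates: Sequence[float]) -> float:
    """Mean of the MRE values over all projects."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need one estimate per actual and at least one project")
    return sum(mre(a, e) for a, e in zip(actuals, estimates)) / len(actuals)


# Hypothetical projects, effort in person-hours.
actual_efforts = [100, 200, 400]
estimated_efforts = [80, 250, 400]

for a, e in zip(actual_efforts, estimated_efforts):
    print(f"actual={a}, estimated={e}, MRE={mre(a, e):.2f}")
print(f"MMRE={mmre(actual_efforts, estimated_efforts):.2f}")
```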

This measure has been criticized^[23]^[24]^[25] and there are several alternative measures, such as more symmetric measures,^[26] Weighted Mean of Quartiles of relative errors (WMQ)^[27] and Mean Variation from Estimate (MVFE).^[28]

MMRE is not reliable if the individual MRE values are skewed, because a few projects with very large relative errors can dominate the mean. In such cases PRED(25) is preferred as a measure of estimation accuracy. PRED(25) measures the percentage of predicted values that are within 25 percent of the actual value.
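The following Python sketch computes PRED(25) for a small, invented data set and prints the mean relative error next to it for comparison. The function name `pred` is hypothetical, and treating an error of exactly 25 percent as "within" the threshold is an assumption.

```python
from typing import Sequence


def pred(actuals: Sequence[float], estimates: Sequence[float], level: float = 0.25) -> float:
    """Fraction of estimates whose relative error is at most `level`
    (level=0.25 corresponds to PRED(25))."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need one estimate per actual and at least one project")
    within = sum(1 for a, e in zip(actuals, estimates) if abs(a - e) / a <= level)
    return within / len(actuals)


# Hypothetical projects; the third estimate is a severe outlier.
actual_efforts = [100, 200, 400, 50]
estimated_efforts = [80, 240, 900, 49]

relative_errors = [abs(a - e) / a for a, e in zip(actual_efforts, estimated_efforts)]
mean_relative_error = sum(relative_errors) / len(relative_errors)
pred_25 = pred(actual_efforts, estimated_efforts)

print(f"PRED(25) = {pred_25:.0%}")
print(f"mean relative error = {mean_relative_error:.2f}")
```

In this made-up example three of four estimates are within 25 percent of the actual effort, yet the single outlier pushes the mean relative error above 40 percent.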

A high estimation error cannot automatically be interpreted as an indicator of low estimation ability. Alternative, competing or complementing reasons include low cost control of the project, high complexity of the development work, and more delivered functionality than originally estimated. A framework for improved use and interpretation of estimation error measurement is described by Grimstad and Jørgensen.^[29]

Psychological issues

Many psychological factors may explain the strong tendency towards over-optimistic effort estimates, and they need to be dealt with to increase the accuracy of effort estimates. These factors are essential even when using formal estimation models, because much of the input to these models is judgment-based. Factors that have been demonstrated to be important are wishful thinking, anchoring, the planning fallacy and cognitive dissonance. A discussion of these and other factors can be found in work by Jørgensen and Grimstad.^[30]

It's easy to estimate what you know.
It's hard to estimate what you know you don't know. (known unknowns)
It's very hard to estimate things that you don't know you don't know. (unknown unknowns)

Humor

The chronic underestimation of development effort has led to the coinage and popularity of numerous humorous adages, such as ironically referring to a task as a "small matter of programming" (when much effort is likely required), and citing laws about underestimation:

Ninety-ninety rule:

The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.^[31]
— Tom Cargill, Bell Labs

Hofstadter's law:

Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
— Douglas Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid^[32]

Fred Brooks' law:

What one programmer can do in one month, two programmers can do in two months.
— Fred Brooks

Besides the difficulty of estimating development effort, it is worth stating that assigning more resources does not always help.

Comparison of development estimation software

Software	Schedule estimate	Cost estimate	Cost Models	Input	Report Output Format	Supported Programming Languages	Platforms	Cost	License
AFCAA REVIC^[33]	Yes	Yes	REVIC	KLOC, Scale Factors, Cost Drivers	proprietary, Text	any	DOS	Free	Proprietary Free for public distribution
Seer for Software	Yes	Yes	SEER-SEM	SLOC, Function points, use cases, bottoms-up, object, features	proprietary, Excel, Microsoft Project, IBM Rational, Oracle Crystal Ball	any	Windows, Any (Web-based)	Commercial	Proprietary
SLIM^[34]	Yes	Yes	SLIM	Size (SLOC, Function points, Use Cases, etc.), constraints (size, duration, effort, staff), scale factors, historical projects, historical trends	proprietary, Excel, Microsoft Project, Microsoft PowerPoint, IBM Rational, text, HTML	any	Windows, Any (Web-based)^[35]	Commercial	Proprietary
TruePlanning^[36]	Yes	Yes	PRICE	Components, Structures, Activities, Cost drivers, Processes, Functional Software Size (Source Lines of Code (SLOC), Function Points, Use Case Conversion Points (UCCP), Predictive Object Points (POPs) etc.)	Excel, CAD	any	Windows	Commercial	Proprietary

References

[1] "What We do and Don't Know about Software Development Effort Estimation".

[2] "Cost Estimating And Assessment Guide GAO-09-3SP Best Practices for developing and managing Capital Program Costs" (PDF). US Government Accountability Office. 2009.

[3] Jørgensen, M. (2004). "A Review of Studies on Expert Estimation of Software Development Effort". Journal of Systems and Software. 70 (1–2): 37–60. doi:10.1016/S0164-1212(02)00156-5.

[4] Molokken, K.; Jorgensen, M. (2003). "A review of software surveys on software effort estimation". 2003 International Symposium on Empirical Software Engineering (ISESE 2003), Proceedings. pp. 223–230. doi:10.1109/ISESE.2003.1237981. ISBN 978-0-7695-2002-5. S2CID 15471986.

[5] Jørgensen, M.; Teigen, K.H.; Ribu, K. (2004). "Better sure than safe? Over-confidence in judgement based software development effort prediction intervals". Journal of Systems and Software. 70 (1–2): 79–93. doi:10.1016/S0164-1212(02)00160-7.

[6] Edwards, J.S. Moores (1994). "A conflict between the use of estimating and planning tools in the management of information systems". European Journal of Information Systems. 3 (2): 139–147. doi:10.1057/ejis.1994.14. S2CID 62582672.

[7] Goodwin, P. (1998). "Enhancing judgmental sales forecasting: The role of laboratory research". In Wright, G.; Goodwin, P. (eds.). Forecasting with Judgment. New York: John Wiley & Sons. pp. 91–112.

[8] Farr, L.; Nanus, B. "Factors that affect the cost of computer programming, volume I" (PDF).

[9] Farr, L.; Nanus, B. "Factors that affect the cost of computer programming, volume II" (PDF).

[10] Nelson, E. A. (1966). Management Handbook for the Estimation of Computer Programming Costs. AD-A648750, Systems Development Corp.

[11] Anda, B.; Angelvik, E.; Ribu, K. (2002). "Improving Estimation Practices by Applying Use Case Models". Lecture Notes in Computer Science. 2559: 383–397. doi:10.1007/3-540-36209-6_32. ISBN 978-3-540-00234-5, 9783540362098.

[12] Briand, L. C.; Wieczorek, I. (2002). "Resource estimation in software engineering". In Marcinak, J. J. (ed.). Encyclopedia of Software Engineering. New York: John Wiley & Sons. pp. 1160–1196.

[13] Jørgensen, M.; Shepperd, M. "A Systematic Review of Software Development Cost Estimation Studies".

[14] "Custom Software Development Services - Custom App Development - Oxagile".

[15] Hill, Peter (ISBSG). Estimation Workbook 2. International Software Benchmarking Standards Group (ISBSG), Estimation and Benchmarking Resource Centre. Archived 2008-08-29 at the Wayback Machine.

[16] Morris, Pam. Overview of Function Point Analysis. Total Metrics, Function Point Resource Centre.

[17] Gopal, Srinivasa; D'Souza, Meenakshi (2012). "Improving estimation accuracy by using case based reasoning and a combined estimation approach". Proceedings of the 5th India Software Engineering Conference (ISEC '12). New York: ACM. pp. 75–78. doi:10.1145/2134254.2134267.

[18] Shepperd, M.; Kadoda, G. (2001). "Comparing software prediction techniques using simulation". IEEE Transactions on Software Engineering. 27 (11): 1014–1022. doi:10.1109/32.965341.

[19] Jørgensen, M. "Estimation of Software Development Work Effort: Evidence on Expert Judgment and Formal Models".

[20] Winkler, R.L. (1989). "Combining forecasts: A philosophical basis and some current issues". International Journal of Forecasting. 5 (4): 605–609. doi:10.1016/0169-2070(89)90018-6.

[21] Blattberg, R.C.; Hoch, S.J. (1990). "Database Models and Managerial Intuition: 50% Model + 50% Manager". Management Science. 36 (8): 887–899. doi:10.1287/mnsc.36.8.887. JSTOR 2632364.

[22] BlueOptima (2019-10-29). "Identifying Reliable, Objective Software Development Metrics".

[23] Shepperd, M.; Cartwright, M.; Kadoda, G. (2000). "On Building Prediction Systems for Software Engineers". Empirical Software Engineering. 5 (3): 175–182. doi:10.1023/A:1026582314146. S2CID 1293988.

[24] Kitchenham, B.; Pickard, L.M.; MacDonell, S.G.; Shepperd. "What accuracy statistics really measure".

[25] Foss, T.; Stensrud, E.; Kitchenham, B.; Myrtveit, I. (2003). "A Simulation Study of the Model Evaluation Criterion MMRE". IEEE Transactions on Software Engineering. 29 (11): 985–995. doi:10.1109/TSE.2003.1245300.

[26] Miyazaki, Y.; Terakado, M.; Ozaki, K.; Nozaki, H. (1994). "Robust regression for developing software estimation models". Journal of Systems and Software. 27: 3–16. doi:10.1016/0164-1212(94)90110-4.

[27] Lo, B.; Gao, X. "Assessing Software Cost Estimation Models: criteria for accuracy, consistency and regression".

[28] Hughes, R.T.; Cunliffe, A.; Young-Martos, F. (1998). "Evaluating software development effort model-building techniques for application in a real-time telecommunications environment". IEE Proceedings - Software. 145: 29. doi:10.1049/ip-sen:19983370.

[29] Grimstad, S.; Jørgensen, M. (2006). "A Framework for the Analysis of Software Cost Estimation Accuracy".

[30] Jørgensen, M.; Grimstad, S. "How to Avoid Impact from Irrelevant and Misleading Information When Estimating Software Development Effort".

[31] Bentley, Jon (1985). "Programming pearls". Communications of the ACM. 28 (9): 896–901. doi:10.1145/4284.315122. ISSN 0001-0782. S2CID 5832776.

[32] Gödel, Escher, Bach: An Eternal Golden Braid. 20th anniversary ed., 1999, p. 152. ISBN 0-465-02656-7.

[33] AFCAA Revic 9.2 manual. Revic memorial site.

[34] "SLIM Suite Overview". Qsm.com. Retrieved 2019-08-27.

[35] "SLIM-WebServices". Qsm.com. Retrieved 2019-08-27.

[36] TruePlanning Integrated Cost Models. PRICE Systems site. Archived 2015-11-05 at the Wayback Machine.
