David Voorhees: Ph.D. work at Nova Southeastern University

Predicting Software Size and Development Effort: Models based on Stepwise Refinement

Abstract

This study designed a Software Size Model and an Effort Prediction Model, then performed an empirical analysis of these two models. Each model design began with identifying its objectives, which led to describing the concept to be measured and the meta-model. The numerical assignment rules were then developed, providing a basis for size measurement and effort prediction across software engineering projects. The Software Size Model was designed to test the hypothesis that a software size measure represents the amount of knowledge acquired and stored in software artifacts, and the amount of time it took to acquire and store this knowledge. The Effort Prediction Model is based on the estimation by analogy approach and was designed to test the hypothesis that this model will produce reasonably close predictions when it uses historical data that conforms to the Software Size Model.

The empirical study implemented each model, collected and recorded software size data from software engineering project deliverables, simulated effort prediction using the jackknife approach, and computed the absolute relative error and magnitude of relative error (MRE) statistics. In this study, 35.3% of the predictions had an MRE at or below 25%. This result satisfies the criterion established for the study: at least 31% of the predictions with an MRE of 25% or less.
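The evaluation procedure described above can be illustrated with a minimal sketch. It is not the study's implementation. It assumes the jackknife is run as leave-one-out: each project is held out in turn and its effort is predicted from the remaining projects. It also assumes a single numeric size value per project, a nearest-neighbor analogy on size, and a size-ratio adjustment. The Project type, the function names, and the sample figures are hypothetical and serve only to show how MRE and the share of predictions at or below 25% are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    size: float    # hypothetical size units, not the study's size measure
    effort: float  # hypothetical effort, e.g. person-hours


def predict_by_analogy(target_size, history):
    """Predict effort from the historical project closest in size."""
    nearest = min(history, key=lambda p: abs(p.size - target_size))
    # Assumed adaptation rule: scale the analog's effort by the size ratio.
    return nearest.effort * target_size / nearest.size


def mre(actual, predicted):
    """Magnitude of relative error: |actual - predicted| / actual."""
    return abs(actual - predicted) / actual


def jackknife_mres(projects):
    """Hold out each project once and predict it from all the others."""
    results = []
    for i, held_out in enumerate(projects):
        history = projects[:i] + projects[i + 1:]
        predicted = predict_by_analogy(held_out.size, history)
        results.append(mre(held_out.effort, predicted))
    return results


def pred(mres, threshold=0.25):
    """Fraction of predictions whose MRE is at or below the threshold."""
    return sum(1 for m in mres if m <= threshold) / len(mres)


if __name__ == "__main__":
    # Made-up data for illustration only.
    sample = [
        Project("A", 100, 210),
        Project("B", 120, 300),
        Project("C", 200, 380),
        Project("D", 400, 1000),
    ]
    errors = jackknife_mres(sample)
    for project, error in zip(sample, errors):
        print(f"{project.name}: MRE = {error:.3f}")
    print(f"Share of predictions with MRE <= 25%: {pred(errors):.1%}")
```

Under the study's criterion, a result would count as acceptable when pred(errors) is at least 0.31.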

This study is significant for three reasons. First, no subjective factors were used to estimate effort. The elimination of subjective factors removes a source of error in the predictions and makes the study easier to replicate. Second, both models were described using metrology and measurement theory principles. This allows others to consistently implement the models and to modify these models while maintaining the integrity of the models' objectives. Third, the study's hypotheses were validated even though the software artifacts used to collect the software size data varied significantly in both content and quality. Recommendations for further study include applying the Software Size Model to other data-driven estimation models, collecting and using software size data from industry projects, looking at alternatives for how text-based software knowledge is identified and counted, and studying the impact of project cycles and project roles on predicting effort.

Last updated on April 22, 2005.