Russell Frith
4/12/2011
Abstract
Software engineering cost estimation is the process of predicting the effort required to
develop a software system. Cost estimation techniques involve distinctive steps, tools, algorithms,
and assumptions. Many estimation models have been developed since the 1980s owing to the
dynamic nature of software engineering practices. Despite the evolution of new cost estimation
techniques, fundamental economic principles underlie the overall structure of the software
engineering life cycle and its primary refinements of prototyping, incremental development, and
advancement. This paper provides a general overview of software cost estimation methods,
including recent advances in the field. Many of the models rely on a software project size
estimate as input, and this paper provides details for common size metrics. The primary economic
driver of the software life-cycle structure is the significantly increasing cost of making software
changes or fixing software problems as a function of the development phase in which the change
or fix is made. Software engineering cost models are classified into two major categories: algorithmic
and non-algorithmic. Each has its own strengths and weaknesses with regard to implementing
modifications to software projects. A key factor in selecting a cost estimation model is the
1. Introduction
In recent years, software has become the most expensive component of computer system
projects. The cost of software development derives mostly from human effort, and most
estimation methods accordingly give estimates in terms of person-months. If one
considers economics as the study of how people make decisions in resource-limited situations,
then microeconomics is the study of how people make decisions on a
more personal scale: it treats decisions that individuals and organizations make on such issues
as how much insurance to buy, which software development systems to procure, or how to allocate scarce
resources. There is never enough time or money to encompass all the essential features software
vendors would like to put into their products. Even with cheap hardware, storage, memory, and
networks, software projects must always operate within a world of limited computing and
network resources. Consequently, accurate software cost estimates are critical to both developers
and customers. Those estimates can be used for generating requests for proposals, contract
negotiations, scheduling, and project monitoring. Underestimating costs could result in management
approving proposed systems that later exceed their budgets, ship with underdeveloped functions, or
fail to complete on time. Conversely, overestimating costs may result in too many resources being
committed to a project, or, during contract bidding, in losing a contract and a loss of jobs.
Accurate cost estimation carries several benefits:
- It can help to classify and prioritize development projects with respect to an overall
business plan,
- It can be used to determine what resources to commit to a project and how well those
resources will be used,
- Projects can be easier to manage and control when resources are better matched to real
needs, and
- Customers expect actual costs to be in line with estimated costs.
Three fundamental values typically comprise a software cost estimate: effort in
person-months, project duration, and cost. Most cost estimation models attempt to generate an
effort estimate, which is then converted into a project duration time-line and cost. Effort is
measured in person-months of programmers, analysts, and project managers, and effort estimates
can be converted to a dollar cost figure by calculating an average salary per unit time of the staff
involved and multiplying that rate by the estimated effort, although the relation between effort
and cost may be non-linear. A further question is which software size measurement to use:
lines of code (LOC) or function points.
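As a minimal sketch, an effort estimate in person-months can be converted to a dollar cost by multiplying by an average loaded salary; the rate below is illustrative, not from the paper:

```python
# Sketch: converting a person-month effort estimate into a dollar cost.
# The monthly salary figure is an assumption for illustration only.
def effort_to_cost(effort_pm, avg_monthly_salary):
    """Convert effort in person-months to a dollar cost."""
    return effort_pm * avg_monthly_salary

# 60 person-months at an assumed $10,000/month loaded rate
cost = effort_to_cost(60, 10_000)
print(cost)  # 600000
```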
A widely used cost estimation method is expert judgment. Using this
technique, project managers rely on experience and prevailing industry norms as a basis to
develop cost estimates. Basing estimates on expert judgment can be error prone,
however:
- The approach is not repeatable, and the means of deriving an estimate is subjective.
- In general, the relationship between cost and system size is not linear; costs tend to
increase exponentially with size, which confines expert judgment estimates to projects
similar to past experience.
- Budget alterations by management aimed at avoiding cost overruns make experience and
data from previous projects questionable.
There exist alternatives to expert judgment, some theoretical and not very useful, others having
more pragmatic value; they are presented in the software engineering cost estimation section.
In the last four decades, many quantitative software cost estimation models have been developed,
ranging from empirical models such as Boehm's COCOMO models [4] to analytical
models such as those in [7, 22, 23]. An empirical model uses data from previous projects to
evaluate the current project and derives the basic formulae from analysis of the particular
database available. Alternative analytical models use formulae based on global assumptions,
such as the rate at which developers solve problems and the number of problems available.
A well-constructed software cost estimate should have the following properties [24]:
- It is conceived and supported by the project manager and the development team.
- It is defined in enough detail so that its key risk areas are understood and the probability
of success is objectively assessed.
Hindrances to developing a reliable software engineering cost estimate include the many
interrelated factors that influence software development effort and productivity, whose
relationships are not well understood.
Throughout the software life cycle, there are many decision situations involving limited
resources in which software engineering techniques provide useful assistance. See Figure II in
the appendix for elements of a computer programming project cycle. To provide a feel for the
nature of these economic decision issues, an example is given below for each of the major phases
in the software life cycle. In addition, refer to Figure III in the appendix for the loopback nature
of the life cycle.
Feasibility Phase: How much should one invest in information system analyses (user
questionnaires, simulations, scenarios, prototypes) in order to obtain convergence on an appropriate
definition of the proposed system?
Plans and Requirements Phase: How rigorously should requirements be specified? How
much should be invested in validating them?
Product Design Phase: Should developers organize software to make it possible to use a
complex piece of existing software which generally but not completely meets
requirements?
Programming Phase: Given a choice between three data storage and retrieval schemes,
which should be selected?
Integration and Test Phase: How much testing and formal verification should be
performed on a product before releasing it?
Cost estimation is typically performed as part of a planning process in
which the cost estimate is used to derive a project plan. Typical steps in a planning process
include:
1. The project manager develops a characterization of the overall functionality, size,
process, environment, and required quality of the project.
2. A macro-level estimate of the total effort and schedule is developed using a software cost
estimation model.
3. The project manager partitions the effort estimate into a top-level work breakdown
structure. In addition, the schedule is partitioned into major milestone dates and a staffing
profile is configured.
strengths;
7. Once the project has started, monitor its actual cost and progress, and feed results back to
project management.
Regardless of which estimation model is selected, consumers of the model must pay attention to
the coverage of the estimate: some models generate effort estimates for the full software
life-cycle, while others do not include effort for the requirements stage.
The microeconomics field provides a number of techniques for dealing with software
life-cycle decision issues such as the ones mentioned earlier in this section. Standard optimization
techniques can be used when one can find a single quantity, such as rupees or dollars, to serve as a
universal solvent into which all decision variables can be converted. Or, if nonmonetary
objectives can be expressed as constraints (system availability must be 98%, throughput must be
150 transactions per second), then standard constrained optimization techniques can be used. If
cash flows occur at different times, then present-value techniques can be used to normalize them
to a common point in time; these are standard engineering economics analysis techniques. One
such technique compares costs and benefits. An example involves the provisioning of a cell phone
service in which there are two options.
Option A: Accept an available operating system that requires $80K in software costs, but
will achieve a peak performance of 120 transactions per second using five $10K
minicomputers.
Option B: Build a new operating system that would be more efficient and would support a
higher peak transaction rate, at a greater software development cost.
In general, software engineering decision problems are even more complex than Options A and B
suggest, as the alternatives will have several important criteria on which they differ, such as
robustness, ease of tuning,
ease of change, functional capability, and so on. If these criteria are quantifiable, then some type
of figure of merit can be defined to support a comparative analysis of the preference of one
option over another. If some of the criteria are unquantifiable (user goodwill, programmer
morale, etc.), then some techniques for comparing unquantifiable criteria need to be used.
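Where the criteria are quantifiable, the figure of merit mentioned above can be sketched as a weighted sum; the criteria, weights, and scores below are illustrative, not taken from the paper:

```python
# Sketch: a weighted figure of merit for comparing design options when
# the decision criteria are quantifiable. All values are illustrative.
def figure_of_merit(scores, weights):
    """Weighted sum of normalized criterion scores in [0, 1]."""
    return sum(scores[c] * weights[c] for c in weights)

weights = {"robustness": 0.4, "ease_of_change": 0.35, "capability": 0.25}
option_a = {"robustness": 0.6, "ease_of_change": 0.9, "capability": 0.7}
option_b = {"robustness": 0.8, "ease_of_change": 0.5, "capability": 0.9}

print(round(figure_of_merit(option_a, weights), 2))  # 0.73
print(round(figure_of_merit(option_b, weights), 2))  # 0.72
```

A higher figure of merit ranks an option ahead of another, but as the text notes, unquantifiable criteria such as user goodwill fall outside this kind of comparison.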
In software engineering, decision issues are generally complex and involve analyzing
risk, uncertainty, and the value of information. The main economic analysis techniques available
include the following:
1. Techniques for decision making under complete uncertainty, such as the maximax rule,
the maximin rule, and the Laplace rule [19]. These techniques are generally inadequate for
practical software engineering decisions.
2. Expected-value techniques, in which one estimates the probability of each possible
outcome; i.e., successful development of a new operating system, and completes the
expected-value calculation. These are better than decision making under complete uncertainty,
but they still involve a great deal of risk if the unfavorable outcome occurs.
3. Techniques in which the value of information is considered: prototyping is a way of buying
information to reduce uncertainty about the likely success of a project. By prototyping the
high-risk elements, one can get a clearer picture of the likelihood of successfully completing the full project.
Information-buying often tends to be the most valuable aid for software engineering decisions.
The question of how much information-buying is enough can be answered via statistical
decision-theoretic techniques using Bayes' Law, which provides calculations for the expected payoff from
a software project as a function of the level of investment in a prototype. In practice, the use of
Bayes' Law involves the estimation of a number of conditional probabilities which are not easy
to estimate accurately. However, the Bayes' Law approach can be translated into a set of
qualitative conditions under which buying information is worthwhile.
Condition 1: There exist attractive alternatives whose payoff varies greatly, depending on
some critical states of nature. If not, engineers can commit themselves to one of the attractive
alternatives with no great risk of loss.
Condition 2: The critical states of nature have an appreciable probability of occurring. If
not, engineers can again commit without major risk. For situations with extremely high
variations in payoff, the appreciable probability level is lower than in situations with smaller
variations in payoff.
Condition 3: The investigations have a high probability of accurately identifying the
occurrence of the critical states of nature. If not, the investigations will not do much to reduce
the risk of a wrong decision.
Condition 4: The required cost and schedule of the investigations do not overly curtail
their net value. It does one little good to obtain results which cost more than those results can
save for us, or which arrive too late to help make a decision.
Condition 5: There exist significant side benefits derived from performing the
investigations. Again, one may be able to justify an investigation solely on the basis of the value
of these side benefits.
During the 1950s and the 1960s, relatively little progress was made in software cost
estimation, while the frequency and magnitude of software cost overruns were becoming critical
for many large systems employing computers. In 1964, the U.S. Air Force contracted with System
Development Corporation for a landmark project in software cost estimation. The project
collected 104 attributes of 169 software projects and subjected them to extensive statistical analysis.
One result was the 1965 SDC cost model, the best statistical 13-parameter linear model for the
sample data. When applied to its database of 169 projects, this model produced a mean estimate of 40 MM
and a standard deviation of 62 MM; not a very accurate predictor. The model is also
counterintuitive: a project with all zero values for its variables is estimated at -33 MM, and changing the
language from a higher order language to assembly adds 7 MM, independent of project size. One
can conclude that there were too many nonlinear aspects of software development for a linear
model to capture.
Today, software size is the most important factor that affects software cost. There exist
five fundamental software size metrics used in practice. Two of the most commonly used are
the Lines of Code and Function Point metrics. The Lines of Code metric is the
number of lines of delivered source code for the software, known as LOC [9], and is
programming language dependent. Most models relate this measurement to the software cost, but
the exact LOC can only be obtained after the project has completed, so project size must be
estimated beforehand.
One method for estimating code size is to use expert judgment together with a
technique called PERT (Program Evaluation and Review Technique) [2]. The model is based
upon three possible code-sizes: Sl, the lowest possible size; Sh, the highest possible size; and
Sm, the most likely size. An estimate of the code-size S may be computed as
S = (Sl + Sh + 4*Sm) / 6.
This formula is valid for modular code components, and the per-module estimates can be summed
to produce a total size estimate.
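The PERT size estimate can be sketched as follows; the per-module figures are illustrative:

```python
# Sketch of the PERT code-size estimate S = (Sl + Sh + 4*Sm) / 6,
# summed over independently estimated modules. Module data is made up.
def pert_size(lowest, highest, likely):
    """Expected code size from three-point (PERT) estimates."""
    return (lowest + highest + 4 * likely) / 6

# (lowest, highest, most likely) LOC estimates for two modules
modules = [(800, 2000, 1100), (300, 900, 500)]
total = sum(pert_size(lo, hi, m) for lo, hi, m in modules)
print(round(total, 1))  # 1733.3
```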
An alternative measure proposed by Halstead [11] uses code length and volume metrics.
Code length measures source code program length and is defined as N = N1 + N2, where
N1 is the total number of operator occurrences and N2 is the total number of operand occurrences.
Volume corresponds to the amount of storage space and is defined as V = N log2(n1 + n2), where
n1 is the number of distinct operators and n2 is the number of distinct operands that appear in the
program.
The second commonly used size metric is function points. This is a measurement based on the
functionality of the program and was introduced by
Albrecht [1]. The total number of function points depends on the counts of distinct logic types in
the program:
1. User-input types: data or control input types that enter the system
2. User-output types: output data types to the user that leave the system
3. Inquiry types: interactive inputs requiring a response from the system
4. Internal file types: files that are used and shared inside the system
5. External file types: files that are passed or shared between the system and other systems.
Each of these types is individually assigned one of three complexity levels {1 = simple, 2 =
medium, 3 = complex} and given a weighting value that varies from 3 for simple input to 15
for complex internal files. The unadjusted function-point count (UFC) is given as
UFC = sum over i = 1..5 and j = 1..3 of Nij * Wij,
where Nij and Wij are respectively the number and weight of types of class i
with complexity j. For instance, if the raw function-point counts of a project are two simple inputs
(Wij = 3), two complex outputs (Wij = 7), and one complex internal file (Wij = 15), then the UFC
value is computed as 2*3 + 2*7 + 15 = 35. This initial function-point count is either directly
used for cost estimation or is further modified by factors whose values depend on the overall
complexity of the project. These adjustment factors account for the degree of distributed processing, the
amount of reuse, performance requirements, and so on. The advantage of the function-point
measurement is that it can be obtained based on the system requirement specification in the early
stages of a project.
UFC may also be used for code size estimation using a linear formula:
LOC = a*UFC + b. The parameters a and b can be obtained using linear regression and
previously completed project data. The latest Function Point Counting Practices Manual is
maintained by the International Function Point Users Group (IFPUG).
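The UFC computation above, together with the LOC conversion, can be sketched as follows; the regression coefficients a and b are assumed values, not from the paper:

```python
# Sketch: unadjusted function-point count UFC = sum of N_ij * W_ij, using
# the example counts from the text, followed by a LOC = a*UFC + b conversion
# with assumed (not calibrated) regression coefficients.
def ufc(counts):
    """counts: list of (number_of_items, weight) pairs."""
    return sum(n * w for n, w in counts)

# 2 simple inputs (w=3), 2 complex outputs (w=7), 1 complex internal file (w=15)
u = ufc([(2, 3), (2, 7), (1, 15)])
print(u)  # 35

a, b = 60, 200  # illustrative regression parameters
print(a * u + b)  # 2300
```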
An extension of the function point software measurement technique is the feature point
measurement technique. Feature point extends function points to include algorithms as a new
class [16]. An algorithm is defined as the set of rules which must be completely expressed to
solve a significant computational problem. For example, a square root routine can be considered
as an algorithm. Each algorithm used is given a weight ranging from one (elementary) to ten
(advanced) and the feature point is the weighted sum of the algorithms plus the function points.
This measurement is especially useful for systems with few inputs/outputs and high algorithmic
complexity.
Real-time applications development cost estimation is based on full function point (FFP)
analysis. It takes into account the special hardware control aspects of such applications. FFP introduces two
new control data function types and four new control transactional function types, which are
described in [28].
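As a sketch, the feature-point idea described above adds weighted algorithm counts to a function-point count; the weights below are illustrative:

```python
# Sketch of a feature-point count: function points plus per-algorithm
# weights from 1 (elementary) to 10 (advanced). Weights are illustrative.
def feature_points(function_points, algorithm_weights):
    return function_points + sum(algorithm_weights)

# e.g. a UFC of 35 plus a square-root routine (weight 3) and a more
# advanced scheduling algorithm (weight 8)
print(feature_points(35, [3, 8]))  # 46
```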
A final consideration for software project size estimation is the use of object points.
While feature points and FFP extend function-point estimates, object points measure size from a
different dimension. These measurements are based on the number and complexity of the
following objects: screens, reports, and 3GL components. Each of these objects is counted and
given a weight ranging from one (simple screen) to ten (3GL component), and the object-point
count is the weighted sum of these counts.
There are two major types of cost estimation methods: algorithmic and non-algorithmic.
Algorithmic methods vary widely in mathematical sophistication: some are based on simple
arithmetic formulae using summary statistics [8], while others are based on regression models [30]
or differential equations [23]. To improve the accuracy of algorithmic models, there is a need to
adjust or calibrate the model to specific circumstances, but even this added work can still yield
mixed accuracy. Table I in the appendix lists strengths and weaknesses of software cost-
estimation methods. The first part of this comparative discussion will treat non-algorithmic
costing.
Analogy Costing: This method involves reasoning by analogy with one or more completed
projects to relate their actual costs to an estimate of the cost of a similar new project. This
protocol may be used at either the total project level or at the subsystem level. The total project
level has the advantage that all cost components of the system will be considered while the
subsystem level has the advantage of providing a more detailed assessment of the similarities and
differences between the new project and the completed project. Success with this method depends
on identifying genuinely similar past projects and on the accuracy of their recorded cost data.
Expert Judgment: This method involves consulting one or more experts, perhaps with the aid
of an expert-consensus mechanism. Experts provide estimates using their own methods and
experience. The PERT technique can be used to resolve inconsistencies in estimates. The Delphi
technique can also be used to reach a consensus; it proceeds in the following steps:
1. The coordinator presents each expert with a specification and a form to record estimates.
2. Each expert fills in the form individually and is allowed to ask the coordinator questions.
3. The coordinator prepares a summary of all the experts' estimates on a form,
requesting another iteration of the experts' estimates and the rationale for the estimates.
A modification of the Delphi technique proposed by Boehm and Farquhar [4] has proven more
effective. Before the estimation, a group meeting involving the coordinator and the experts is
arranged to discuss the estimation issues. In step three, the experts do not offer any rationale for
their estimates. Instead, after each round of estimation, the coordinator calls a meeting to have
the experts discuss the points on which their estimates varied widely.
Parkinson: Parkinson's principle, "work expands to fill the available volume," is used here to
equate the cost estimate to the available resources [21]. For instance, if the software has to
be delivered in 12 months and five people are available, then the effort is estimated to be 60
person-months. This method is hazardous in that it has the potential to produce unrealistic
estimates and does not promote good software engineering practice.
Price-to-win: Using this method, the cost estimate is equated to the price believed necessary to
win the job or to the schedule believed necessary to be first in the market with a new product.
The estimate is based on the customer's budget instead of the software functionality. For
example, if a reasonable estimate for a project is 100 person-months but the customer can
only afford 60, it is common for the estimator to be asked to modify the estimate
to fit 60 person-months of effort in order to win the project. This is a poor practice, but an
all-too-common one.
Bottom-up: Each component of the software job is separately estimated, and the results are
aggregated to produce an estimate for the overall job. This method requires an initial design
indicating how the system is decomposed into components.
Top-down: An overall cost estimate for the project is derived from global properties of the
software product. The total cost is then split among the various components.
The main conclusions one can draw from Table I are the following:
- None of the alternatives is better than the others in all respects.
- The Parkinson and price-to-win methods are unacceptable and do not produce satisfactory
cost estimates.
Algorithmic methods are based on mathematical models that produce cost estimates as a
function of a number of variables, which are considered to be the major cost factors. Any
algorithmic model has the form Effort = f(x1, x2, ..., xn), where {x1, x2, ..., xn} denotes the set
of cost factors. The existing algorithmic methods differ in two aspects: the selection of cost
factors and the form of the function f, and new methods are still
emerging. Despite the seven approaches to software cost estimation, there is no definitive one,
and one cannot expect a particular technique to compensate for a lack of definition or understanding of
the software job to be done. Until a software specification is fully defined, a software cost
estimation technique represents a range of software development costs. Figure I in the appendix
shows a limitation in cost estimation technology. In the figure, the accuracy of the cost estimates
is shown as a function of the software life-cycle phase. The horizontal line labeled x is a
convergent estimate for the cost of a human-machine interface for a hypothetical software
project. The level of cost uncertainty is shown on the y-axis, and its range is between one-fourth
of and four times the convergent cost. This range is somewhat subjective and is intended to represent 80% confidence
limits, that is, within a factor of four on either side, 80% of the time. At the feasibility phase of
the human-machine interface component, the software engineering estimator does not know what
classes of people (clerks, computer specialists, middle managers, etc.) or what classes of data
(raw or pre-edited, numerical or text, digital or analog) the system will have to support. Until
those uncertainties are clarified, a factor of four in either direction serves as a best-guess for a
range of estimates.
The uncertainty envelope contracts once the feasibility phase is completed and the
operational concept is settled. At this stage, the range of estimates constricts to a factor of two on
either side of the convergent estimate. Outstanding issues include the specific types of query to
be supported; such issues are resolved once the requirements specification has
been developed, at which point the estimate of software costs ranges within a factor of 1.5 in
either direction.
Once the product design specification is completed and validated, design issues such as
the internal data structure of the software product and the specific techniques for handling
network input/output between the client computer and the web server will have been resolved. At
this point the software estimate should be accurate to within a factor of 1.2 of the convergent
estimate. The remaining discrepancies are caused by sources of uncertainty in specific algorithms
to be used for database queries, internet error handling, network failure recovery, and so on.
Those issues will be resolved at the detailed design phase, but there will still be some residual
uncertainty of around ten percent, stemming from how well programmers understand the
specifications to which they are to code, or possibly from personnel turnover during the development
and test phases.
So far a substantial part of this discussion has treated software cost estimation in terms of
software size. There exist, however, many additional cost factors that are worth mentioning.
Table II in the appendix summarizes a set of cost factors proposed by Boehm et al. in the
COCOMO II model for software engineering cost estimation [5]. There are four types of cost
factors. The first set includes product factors: required reliability, product complexity, database
size, required reusability, and documentation matched to life-cycle needs. The second set includes
computer factors: execution time constraints, storage constraints, computer turnaround constraints,
and platform volatility. The third set includes personnel factors: analyst and programmer
capabilities, applications experience, platform experience, language and tool experience, and
personnel continuity. The final set includes project factors: multisite development, use of
software tools, and development schedule. Many of these factors are hard to quantify, and in
many models some are combined and others are omitted. Furthermore, some factors take on discrete
values, resulting in estimation functions with a piecewise form.
Linear models have the form Effort = a0 + sum_{i=1..n} ai*xi, where the coefficients ai are
chosen to best fit the completed project data, as in Nelson's work [19]. Needless to say, software
development is mostly comprised of nonlinear interactions, so this model is less than optimal.
Multiplicative models have the form Effort = a0 * prod_{i=1..n} ai^xi. This model was used by
Walston-Felix [30] with each xi taking one of three possible values: -1, 0, and +1. Such models
are quite restrictive, since each factor can only scale the effort by a fixed ratio.
Power function models have the general form Effort = a * S^b, where S is the code size
and a and b are functions of other cost factors. This class of models contains two popular
algorithmic models.
COCOMO (Constructive Cost Model)
This family of models was proposed by Boehm [3, 4] and has been widely
accepted in practice. In these models, code size S is given in thousands of LOC (KLOC) and effort is
in person-months. The primary motivation for the COCOMO model has been to help people
understand the cost consequences of the decisions they will make in commissioning, developing,
and supporting a software product. It comprises a hierarchy of increasingly detailed
models which range from a single macro-estimation scaling model as a function of product size
to a micro-estimation model with multipliers for each cost driver attribute. COCOMO applies to
three classes of software projects:
1. Organic projects: small teams with good experience working with less-than-rigid
requirements,
2. Semi-detached projects: medium teams with mixed experience working with a mix of rigid
and less-than-rigid requirements, and
3. Embedded projects: projects developed within a set of tight constraints such as hardware,
software, and operational constraints.
A) Basic COCOMO: This model uses three sets of {a, b}, one per project class, depending only on the
complexity of the software. The basic COCOMO model is simple, and it excludes many cost factors.
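A sketch of basic COCOMO follows. The {a, b} pairs are the coefficients commonly published for Boehm's original model; since the paper's own table is not reproduced here, treat them as illustrative:

```python
# Sketch of basic COCOMO, Effort = a * (KLOC)**b, with the commonly
# published coefficient pairs for the three project classes.
COEFFS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}

def basic_cocomo_effort(kloc, mode):
    """Estimated effort in person-months for a project of `kloc` KLOC."""
    a, b = COEFFS[mode]
    return a * kloc ** b

# A 32 KLOC organic project
print(round(basic_cocomo_effort(32, "organic"), 1))
```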
B) Intermediate COCOMO: The nominal effort estimate is obtained using the power function with three
sets of {a, b}, with coefficient a slightly different from the basic model.
Next, fifteen cost factors with values ranging from 0.7 to 1.66, drawn from Table II, are
determined [4]. The overall impact factor M is obtained as the product of all the individual factors,
and the final estimate is obtained by multiplying the nominal estimate by M. While both basic and
intermediate COCOMO estimate software costs at the system level, the detailed COCOMO
works on each sub-system separately and has an obvious advantage for large systems made up of
non-homogeneous subsystems.
The detailed COCOMO estimate is obtained in the following way:
1) A nominal development effort is estimated as a function of the product's size in delivered
source instructions in thousands (KDSI) and the product's development mode, which is
one of organic, semi-detached, or embedded.
2) A set of effort multipliers is determined from the product's ratings on a set of 15 cost driver
attributes.
3) The estimated development effort is obtained by multiplying the nominal effort estimate by all
of the effort multipliers.
4) Additional factors can be used to determine dollar costs, development schedules, phase and
activity distributions, computer costs, annual maintenance costs, and other elements from the
development effort estimate.
C) COCOMO II: In this contemporary revision, the exponent b of the earlier COCOMO
models varies according to scale factors: precedentedness, development flexibility, architecture
and risk resolution, team cohesion, and process maturity. COCOMO II also revises the set of
cost drivers, adding new factors and dropping some of the originals.
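The variable exponent can be sketched as follows; the constants (2.94, 0.91, 0.01) are the published COCOMO II.2000 calibration values, used here only for illustration, and the scale-factor ratings are made up:

```python
# Sketch of the COCOMO II variable exponent: b = 0.91 + 0.01 * sum(SF),
# Effort = 2.94 * KSLOC**b * product(effort multipliers).
# Constants follow the COCOMO II.2000 calibration; ratings are illustrative.
def cocomo2_effort(ksloc, scale_factors, effort_multipliers=()):
    b = 0.91 + 0.01 * sum(scale_factors)
    effort = 2.94 * ksloc ** b
    for em in effort_multipliers:
        effort *= em
    return effort  # person-months

# Five nominal-ish scale-factor ratings for a 100 KSLOC project
print(round(cocomo2_effort(100, [3.72, 3.04, 4.24, 3.29, 4.68]), 1))
```

Note how larger scale-factor sums raise the exponent above 1, capturing the diseconomy of scale that fixed-exponent models miss.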
The Putnam model is based on the Norden/Rayleigh manpower distribution and on Putnam's
findings from analyzing many completed projects [23]. The software equation forms the main part
of the model: S = E * (Effort)^(1/3) * td^(4/3), where td is the software delivery time and E is the
environment factor that reflects the development capability, which can be derived from historical
data using the software equation. The size S is in LOC and the Effort is in person-years. Another
important parameter is the manpower buildup D0 = Effort / td^3, which ranges from 8 for entirely
new software with many interfaces to 27 for rebuilt
software. Combining the above equation with the software equation, we obtain the power
function forms: Effort = (D0^(4/7) * E^(-9/7)) * S^(9/7) and td = (D0^(-1/7) * E^(-3/7)) * S^(3/7).
SLIM is a software tool based on this model for cost estimation and manpower scheduling
(http://www.qsm.com/).
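The Putnam power-function forms can be sketched and checked against the software equation; the values of S, E, and D0 below are illustrative:

```python
# Sketch of the Putnam power-function forms derived from the software
# equation S = E * Effort**(1/3) * td**(4/3). E and D0 are illustrative.
def putnam_effort(size, E, D0):
    """Effort in person-years."""
    return (D0 ** (4 / 7)) * (E ** (-9 / 7)) * size ** (9 / 7)

def putnam_duration(size, E, D0):
    """Delivery time td in years."""
    return (D0 ** (-1 / 7)) * (E ** (-3 / 7)) * size ** (3 / 7)

S, E, D0 = 50_000, 4000, 15
effort = putnam_effort(S, E, D0)
t_d = putnam_duration(S, E, D0)

# Consistency check: plugging back into the software equation recovers S
print(round(E * effort ** (1 / 3) * t_d ** (4 / 3)))  # 50000
```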
4.3.5 Model Calibration Using Linear Regression
A direct application of the above models does not take local circumstances into
consideration. One can adjust the cost factors using local data and a linear regression method.
Let the cost estimation power formula be Effort = a * S^b. Taking the logarithm of both sides
transforms the result into a linear equation: Y = A + b*X, where Y = log(Effort), A = log(a),
and X = log(S). By applying the least squares method to a set of previous project data
{(Yi, Xi) : i = 1, ..., k}, one obtains the parameters b and A for the power function.
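This calibration can be sketched as a least-squares fit in log space; the project history below is synthetic, generated from a known power law so the fit can be checked:

```python
import math

# Sketch: calibrating Effort = a * S**b by least squares in log space,
# Y = A + b*X with Y = log(Effort), X = log(S). Project data is synthetic.
def calibrate_power_model(sizes, efforts):
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)  # A = log(a), so a = exp(A)
    return a, b

# Synthetic history generated from Effort = 3 * S**1.1, to verify the fit
sizes = [5, 10, 20, 40, 80]
efforts = [3 * s ** 1.1 for s in sizes]
a, b = calibrate_power_model(sizes, efforts)
print(round(a, 2), round(b, 2))  # 3.0 1.1
```

With real, noisy project data the recovered parameters would of course only approximate the underlying relationship.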
Discrete models have a tabular form, which usually relates effort, duration, difficulty, and
other cost factors. This class contains models from [2, 3, 31]. These models have gained some
acceptance because they are simple to use.
PRICE-S is a proprietary software cost estimation model developed and maintained by
RCA [20]. It is a macro cost-estimation model developed for embedded system applications,
whose formulation has evolved from subjective complexity factors to equations based on the
number of computers/servers, personnel, and project attributes that modulate complexity. The
program provides a wide range of useful outputs, such as activity distribution analyses and
cost-schedule forecasts.
In the 1980s, PRICE-S added a software life-cycle support cost estimation capability
called PRICE SL, which involved the definition of three categories of support activities.
Growth: The estimator specifies the amount of code to be added to the product. PRICE
SL then uses its standard techniques to estimate the resulting life-cycle-effort distribution.
Enhancement: PRICE SL estimates the fraction of the existing product which will be
modified.
Maintenance: The estimator provides a parameter indicating the quality level of the
developed code. PRICE SL uses this to estimate the effort required to eliminate
remaining errors.
This model is the result of extensive data analysis collected by the Air Force in the 1960s
and 1970s. A number of models of similar form were developed for different application areas.
The effort multipliers are shown in Table V. This model has numerical stability issues because it
exhibits a discontinuity at KDSI = 10 and produces widely varying estimates via the f factors.
For instance, answering "yes" to "first software developed on CPU" adds 92% to the estimated
cost.
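A sketch of this style of model, a nominal power law scaled by multiplicative f factors, follows; the nominal coefficients are assumed, and the 1.92 factor mirrors the 92% example in the text:

```python
# Sketch of a multiplicative f-factor adjustment: a nominal power-law
# estimate scaled by yes/no environment factors. The nominal coefficients
# (a, b) are illustrative; 1.92 mirrors "first software developed on CPU
# adds 92%" from the text.
def adjusted_effort(kdsi, a=5.2, b=0.91, factors=()):
    effort = a * kdsi ** b
    for f in factors:
        effort *= f
    return effort

base = adjusted_effort(20)
with_new_cpu = adjusted_effort(20, factors=(1.92,))
print(round(with_new_cpu / base, 2))  # 1.92
```

This makes the instability concern concrete: a single binary factor can nearly double the estimate regardless of project size.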
One common error measure for software engineering cost estimation is the Mean
Absolute Relative Error (MARE). The formula is
MARE = (1/n) * sum_{i=1..n} |estimate_i - actual_i| / actual_i,
where estimate_i is the estimated effort from the model, actual_i is the actual effort,
and n is the number of projects. To establish whether models are biased, the Mean Relative Error
(MRE) can be used. Its formulation is
MRE = (1/n) * sum_{i=1..n} (estimate_i - actual_i) / actual_i.
A large positive MRE suggests that the model overestimates the effort, while a large negative value
indicates underestimation.
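The two error measures can be sketched directly from their definitions; the sample data is illustrative:

```python
# Sketch of the two error measures: MARE averages absolute relative errors,
# while MRE averages signed relative errors (a positive mean suggests an
# overestimation bias). Sample data below is illustrative.
def mare(estimates, actuals):
    return sum(abs(e - a) / a for e, a in zip(estimates, actuals)) / len(actuals)

def mre(estimates, actuals):
    return sum((e - a) / a for e, a in zip(estimates, actuals)) / len(actuals)

estimates = [120, 80, 150]  # model predictions (person-months)
actuals = [100, 100, 100]   # observed efforts
print(round(mare(estimates, actuals), 3))  # 0.3
print(round(mre(estimates, actuals), 3))   # 0.167
```

Note how the two under/over errors of 20% cancel in the MRE but not in the MARE, which is why both measures are reported together.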
The following criteria can be used for evaluating cost estimation models [4]:
1. Definition: Has the model clearly defined the costs it is estimating and the costs it is
excluding?
2. Fidelity: Are the estimates close to the actual costs expended on the projects?
3. Objectivity: Does the model avoid allocating most of the software cost variance to poorly
calibrated subjective factors such as complexity? Is it hard to adjust the model to obtain any
result you want?
4. Constructiveness: Can a user tell why the model gives the estimates it does? Does it help the
user understand the software job to be done?
5. Detail: Does the model easily accommodate the estimation of a software system consisting of
a number of subsystems and units? Does it give accurate phase and activity breakdowns?
6. Stability: Do small differences in inputs produce small differences in output cost estimates?
7. Scope: Does the model cover the class of software projects whose costs the user needs to
estimate?
8. Ease of Use: Are the model inputs and options easy to understand and specify?
9. Prospectiveness: Does the model avoid the use of information that will not be well known
until the project is complete?
10. Parsimony: Does the model avoid the use of highly redundant factors, or factors which
make no appreciable contribution to the results?
Many studies have attempted to evaluate cost estimation models, and the results are
discouraging in that many cost estimation techniques were found to be inaccurate.
1. Kemerer performed an empirical validation of four algorithmic models (SLIM, COCOMO,
Estimacs, and FPA) [17], where no recalibration of models was performed on the project data,
which was different from that used for model development. Most models showed a strong over-
estimation bias and large estimation errors, with MAREs ranging from 57% to 800%.
2. Vicinanza, Mukhopadhyay, and Prietula used experts to estimate project effort on
Kemerer's data set without formal algorithmic techniques and found that the experts outperformed
the models in the original study [29]. The MARE, however, ranged from 32% to 1107%.
3. Ferens and Gurner evaluated three models (SPANS, Checkpoint, and COSTAR) using 22
projects from Albrecht's database and 14 projects from Kemerer's data set. The estimation errors
were found to be large, with MAREs ranging from 46% (for the Checkpoint model) to 105%.
4. Jeffery and Low investigated the need for model calibration at both the industry and
organization levels [15]. Without model calibration, their estimation errors were large,
with MAREs ranging from 43% to 105%. They later compared the SPQR/20 model to FPA using
data from 64 projects from a single organization [15]. The models were recalibrated to the local
environment to remove estimation biases, which produced some improvement in the estimates.
5. Shepperd and Schofield found that estimating by analogy outperformed estimation based on
regression models [26].
6. Heemstra surveyed 364 organizations and found that only 51 used models to estimate effort,
and that model users made no better estimates than non-model users [12.5].
7. A survey of software development within JPL found that only 7% of estimators use algorithmic
models as their primary method of producing estimates [13].
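The local recalibration investigated by Jeffery and Low (study 4) can be illustrated by fitting a single multiplicative correction to an organization's own project history. The sketch below is a generic illustration of that idea, not their actual procedure, and the project figures are invented.

```python
import math

def recalibrate_constant(model_estimates, local_actuals):
    """Fit one multiplicative correction c that minimizes log-scale error:
    c is the geometric mean of actual/estimate over the local history."""
    ratios = [a / e for e, a in zip(model_estimates, local_actuals)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical local history: the uncalibrated model overestimates roughly 2x.
model_estimates = [200.0, 90.0, 400.0]   # person-months predicted by the model
local_actuals   = [100.0, 50.0, 190.0]   # person-months actually expended

c = recalibrate_constant(model_estimates, local_actuals)
calibrated = [c * e for e in model_estimates]  # bias-corrected estimates
```

Using the geometric mean rather than the arithmetic mean keeps the correction symmetric: over- and under-estimation by the same factor cancel out.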
6. New Approaches
In recent years, non-algorithmic techniques based on machine learning have begun to
attract considerable research attention. Recently, Finnie and Wittig applied artificial neural
networks (ANN) and case-based reasoning (CBR) to estimation of effort [10] on a data set from
the Australian Software Metrics Association. ANN was able to estimate development effort
within 25% of the actual effort in more than 75% of the projects, and with a MARE of less than
25%. The results from CBR were less encouraging. In 73% of the cases, the estimates were
within 50% of the actual effort, and for 53% of the cases, the estimates were within 25% of the
actual.
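At its core, CBR estimates by analogy: retrieve the completed projects most similar to the new one and adapt their actual efforts. A minimal nearest-neighbor sketch follows; the feature vectors (function points, team size) and effort figures are invented for illustration.

```python
def estimate_by_analogy(target, history, k=2):
    """Case-based effort estimate: find the k completed projects whose feature
    vectors are closest to the target (Euclidean distance) and average their
    actual efforts."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(history, key=lambda p: dist(p["features"], target))[:k]
    return sum(p["effort_pm"] for p in nearest) / k

# Hypothetical project history: features = (function points, team size).
history = [
    {"features": (120, 5),  "effort_pm": 30.0},
    {"features": (300, 9),  "effort_pm": 95.0},
    {"features": (150, 6),  "effort_pm": 38.0},
    {"features": (500, 12), "effort_pm": 180.0},
]

print(estimate_by_analogy((140, 5), history, k=2))  # 34.0
```

Real CBR systems additionally normalize features and adjust the retrieved efforts for known differences, but the retrieve-and-average step above is the essential mechanism.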
Srinivasan and Fisher used machine learning approaches based on regression trees and
neural networks to estimate costs [27]. The learning approaches were found to be competitive
with SLIM, COCOMO, and function points, compared to the previous study by Kemerer [17]. A
primary advantage of learning systems is that they are adaptable and nonparametric.
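A regression tree partitions the project data on attribute thresholds and predicts the mean effort within each partition. The one-level sketch below shows the core splitting step on invented size/effort data; systems like those Srinivasan and Fisher studied grow deeper trees by applying this step recursively.

```python
def best_split(xs, ys):
    """Pick the threshold on a single feature that minimizes the summed squared
    error of predicting each side's mean: a one-level regression tree, the
    building block of tree-based effort estimators."""
    def sse(v):
        m = sum(v) / len(v)
        return sum((y - m) ** 2 for y in v)
    best = None
    for t in sorted(set(xs))[1:]:
        left  = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            best = (err, t, sum(left) / len(left), sum(right) / len(right))
    return best[1:]  # (threshold, left-side mean, right-side mean)

# Hypothetical project sizes (KLOC) and efforts (person-months).
sizes   = [10, 12, 15, 40, 45, 60]
efforts = [24, 30, 33, 110, 120, 170]

threshold, low_pred, high_pred = best_split(sizes, efforts)
# Splits at 40 KLOC: small projects predicted at 29 PM, large at ~133 PM.
```

Because the split point and leaf predictions are learned directly from data, the method is nonparametric: no functional form such as effort = a * size^b is assumed.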
7. Conclusion
As of today, almost no software engineering cost estimation model can predict the cost of
software development with a high degree of accuracy. This state of the practice is created
because:
(1) there are a large number of interrelated factors that influence the software development
process of a given development team, and a large number of project attributes, such as number of
web pages, volatility of system requirements, and the use of reusable software components;
(2) the development environment is evolving continuously; and
(3) there is a lack of measurement that truly reflects the complexity of software systems.
To produce better estimates, estimators must improve their understanding of those project
attributes and their causal relationships, model the impact of the evolving environment, and
develop effective ways of measuring software complexity.
Estimates produced at early stages of development are inevitably inaccurate, as the accuracy
depends highly on the amount of reliable information available to the estimator. As more project
details emerge during analysis and later design stages, uncertainties are reduced and more
accurate estimates can be made. Most models produce a single point estimate without regard to this
uncertainty. They need to be enhanced to produce a range of estimates and their probabilities.
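One simple way to attach a range and probabilities to a point estimate is to resample the relative errors observed on past projects. The sketch below is a generic Monte Carlo illustration with invented error figures, not a method taken from the literature surveyed here.

```python
import random

def estimate_range(point_estimate_pm, past_relative_errors, trials=10000, seed=1):
    """Monte Carlo sketch: perturb a point estimate by relative errors resampled
    from completed projects, then report the 10th and 90th percentiles of the
    resulting distribution instead of a single number."""
    rng = random.Random(seed)
    samples = sorted(point_estimate_pm * (1 + rng.choice(past_relative_errors))
                     for _ in range(trials))
    return samples[trials // 10], samples[(9 * trials) // 10]

# Hypothetical signed relative errors observed on completed projects.
errors = [-0.30, -0.10, 0.0, 0.15, 0.25, 0.60]

low, high = estimate_range(100.0, errors)  # e.g. "probably between low and high PM"
```

Reporting such an interval makes the early-stage uncertainty explicit, and the interval naturally narrows as later-phase estimates are made with better information.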
To improve algorithmic models, there is a great need for the industry to collect project
data on a wider scale. The recent effort of the ISBSG is a step in the right direction [14]. This
standards group has established a repository of over 790 projects, which can serve as a potential
basis for model calibration and comparison.
With new types of applications, new development paradigms, and new development
tools, cost estimators are facing great challenges in applying known estimation models in the 21st
century. Historical data may prove to be irrelevant for future projects. The search for reliable,
accurate, and low cost estimation methods must continue. Several areas are in need of immediate
attention and these include the need for models based on development using formal methods or
those based on iterative software processes. Also, more studies are needed to improve the
accuracy of cost estimates for maintenance projects. Although a good deal of progress has been
made in software cost estimation, a great deal remains to be done.
References
1. A.J. Albrecht and J.E. Gaffney, Software function, source lines of code, and development
effort prediction: a software science validation, IEEE Trans. Software Eng., SE-9, 1983, pp. 639-
648.
2. J.D. Aron, Estimating Resource for Large Programming Systems, NATO Science Committee,
3. R.K.D. Black, R.P. Curnow, R. Katz and M.D. Gray, BCS Software Production Data, Final
4. B.W. Boehm, Software engineering economics, Englewood Cliffs, NJ: Prentice-Hall, 1981
5. B.W. Boehm et al., The COCOMO 2.0 Software Cost Estimation Model, American Programmer.
6. L.C. Briand, K. El Emam, and F. Bomarius, COBRA: A hybrid method for software cost
estimation, benchmarking, and risk assessment, Proceedings of the 20th International Conference
on Software Engineering, 1998.
7. G. Cantone, A. Cimitile and U. De Carlini, A comparison of models for software cost
8. W.S. Donelson, Project planning and control, Datamation, June 1976, pp. 73-80.
9. N. E. Fenton and S. L. Pfleeger, Software Metrics: A Rigorous and Practical Approach, PWS
10. G.R. Finnie, G.E. Wittig, AI tools for software development estimation, Software
Engineering and Education and Practice Conference, IEEE Computer Society Press, pp. 346-
353, 1996.
12. P.G. Hamer and G.D. Frewin, M.H. Halstead's Software Science: a critical examination,
Proceedings of the 6th International Conference on Software Engineering,Sept. 13-16, 1982, pp.
197-206.
12.5. F.J. Heemstra, Software cost estimation, Information and Software Technology vol. 34,
13. J. Hihn and H. Habib-Agahi, Cost Estimation of software intensive projects: a survey of
15. D.R. Jeffery, G. C. Low, A comparison of function point counting techniques, IEEE Trans
16. C. Jones, Applied Software Measurement, Assuring Productivity and Quality, McGraw-Hill,
1997.
17. C.F. Kemerer, An empirical validation of software cost estimation models,
Communications of the ACM, vol. 30, no. 5, May 1987, pp. 416-429.
18. R.D. Luce and H. Raiffa, Games and Decisions. New York: Wiley, 1957.
19. R. Nelson, Management Handbook for the Estimation of Computer Programming Costs,AD-
20. R.E. Park, PRICE S: The calculation of within and why, Proceedings of ISPA Tenth Annual
21. C.N. Parkinson, Parkinson's Law and Other Studies in Administration, Houghton-Mifflin,
Boston, 1957.
22. N.A. Parr, An alternative to the Rayleigh Curve Model for Software development effort,
23. L.H. Putnam, A general empirical solution to the macro software sizing and estimating
24. W. Royce, Software project management: a unified framework, Addison Wesley, 1998
25. V.Y. Shen, S.D. Conte, and H.E. Dunsmore, Software Science revisited: a critical analysis of
the theory and its empirical support, IEEE Transactions on Software Engineering, SE-9, 2, 1983,
pp. 155-165.
26. M. Shepperd and C. Schofield, Estimating software project effort using analogy, IEEE
Trans. Software Eng., 1997.
27. K. Srinivasan and D. Fisher, Machine learning approaches to estimating software
development effort, IEEE Trans. Soft. Eng., vol. 21, no. 2, Feb. 1995, pp. 126-137.
28. D. St-Pierre, et al, Full Function Points: Counting Practice Manual, Technical Report 1997-
29. S. Vicinanza, T. Mukhopadhyay, and M.J. Prietula, Software-effort estimation: an
exploratory study of expert performance, Information Systems Research, vol. 2, no. 4, Dec. 1991.
30. C.E. Walston and C.P. Felix, A method of programming measurement and estimation,
31. R.W. Wolverton, The cost of developing large-scale software, IEEE Trans. Computer, June
Appendix
Figure II: Computer Programming Project Cycle [19]
Figure III: Computer Programming Processing Steps [19]
Figure IV: Early Sample Cost Justification Form [19]
Figure V: Early Sample Project Description Form [19]
Figure VI: Early Software Budget Form [19]
Table I: Strengths and Weaknesses of Software Cost-Estimation Methods [4]

Method             Strengths                                    Weaknesses
Algorithmic model  Objective, repeatable, analyzable formula;   Subjective inputs; assessment of
                   efficient, good for sensitivity analysis;    exceptional circumstances;
                   objectively calibrated to experience         calibrated to past, not future
Expert judgment    Assessment of representativeness,            No better than participants;
                   interactions, exceptional circumstances      biases, incomplete recall
Analogy            Based on representative experience           Representativeness of experience
Parkinson          Correlates with some experience              Reinforces poor practice
Price to win       Often gets the contract                      Generally produces large overruns
Top-down           System-level focus; efficient                Less detailed basis; less stable
Bottom-up          More detailed basis; more stable;            May overlook system-level costs;
                   fosters individual commitment                requires more effort
Table II: Cost factors and their weights in COCOMO II [4]. The cost drivers fall into four
categories: Product, Computer, Personnel, and Project attributes.
Table III: Distinguishing features of COCOMO development modes [4]

Feature                                                            Organic   Semidetached   Embedded
Need for software conformance with pre-established requirements    Basic     Considerable   Full
Need for software conformance with external interface specs        Basic     Considerable   Full
Concurrent development of associated new hardware and
operational procedures                                             Some      Moderate       Extensive
Need for innovative data processing architectures, algorithms      Minimal   Some           Considerable
Table IV [4]
Table V [4]