UNIT 63 - BENCHMARKING
A. INTRODUCTION
B. QUALITATIVE BENCHMARKS
C. QUANTITATIVE BENCHMARKS
Subtasks
Products and data input
Frequency required
Execution of tasks
Prediction
Forecast
Summary of phases of analysis
E. APPLICATION OF MODEL
F. LIMITATIONS
G. AGT BENCHMARK EXAMPLE
Project Background
REFERENCES
EXAM AND DISCUSSION QUESTIONS
NOTES
This unit contains far more information than can possibly be covered in a single lecture. The
middle sections, D and E, contain a detailed technical review of a benchmark model. Depending
on the abilities and interests of your students you may wish to omit these sections and move on
to the description of the AGT benchmark in section G, or focus the lecture on the technical
aspects and omit the descriptive example.
UNIT 63 - BENCHMARKING
A. INTRODUCTION
Benchmark script
handout - Benchmark script example (2 pages)
benchmark uses a script which
o details tests for all of the functions required
o permits both:
subjective evaluation by an observer (qualitative)
objective evaluation of performance (quantitative)
must allow all of the required functionality to be examined
failure of one test must not prevent the remainder of the tests from being carried out
must be modular
o customer must be able to separate the results of each test
conditions must be realistic
o real data sets, realistic data volumes
B. QUALITATIVE BENCHMARKS
in the qualitative part of the benchmark it is necessary to evaluate the way the program
handles operations
functions cannot be evaluated simply as present or absent
C. QUANTITATIVE BENCHMARKS
developed in the early days of computing because of need to allocate scarce computing
resources carefully
performance evaluation (PE) is a subfield of computer science
requires that tasks be broken down into subtasks for which performance is predictable
early PE concentrated on the machine instruction as the subtask
o specific mixes of machine instructions were defined for benchmarking general-
purpose mainframes
e.g. the "Gibson mix" - a standard mixture of instructions for a general
computing environment, e.g. a university mainframe
multi-tasking systems are much more difficult to predict because of interaction between
jobs
o time taken to do my job depends on how many other users are on the system
o it may be easier to predict a level of subtask higher than the individual machine
instruction
o modern operating systems must be "tuned" to perform optimally for different
environments
e.g. use of memory caching, drivers for input and output systems
specifying data structures would bias the benchmark toward certain vendors
o e.g. cannot specify whether raster or vector is to be used, must leave the choice to
the vendor
similarly, cannot specify programming language, algorithms or data structures
a GIS benchmark must use a higher level of subtask
o an appropriate level of subtask for a GIS benchmark is:
understandable without technical knowledge
makes no technical specifications
o e.g. "overlay" is acceptable as long as the vendor is free to choose a raster or
vector approach
o e.g. "data input" is acceptable, specifying digitizing or scanning is not
therefore, a GIS PE can be based on an FRS and its product descriptions, which may have
been generated by resource managers with no technical knowledge of GIS
need a mathematical model which will predict resource utilization (CPU time, staff time,
plotter time, storage volume) from quantities which can be forecast with reasonable
accuracy
o numbers of objects - lines, polygons - are relatively easy to forecast
o technical quantities - numbers of bytes, curviness of lines - are less easy to
forecast
the mathematical form of the model will be chosen based on expectations about how the
system operates
o e.g. staff time in digitizing a map is expected to depend strongly on the number of
objects to be digitized, only weakly on the size of the map (unless large maps
always have more objects)
o requires a proper balance between quantitative statistical analysis and knowledge
about how the procedures operate
Subtasks
Frequency required
Execution of tasks
Prediction
in order to predict the amount of resources needed to create a product, need to find a
mathematical relationship between the amount of resource that will be needed and
measurable indicators of task size
o e.g. number of polygons, queries, raster cells, lines
Pakn is predictor n for measure k, subtask a
Mak = f(Pak1,Pak2,...,Pakn,...)
o e.g. the amount of staff time (Mak) used in digitizing (a) is a function of the
number of polygons to be digitized (Pak1) and the number of points to be
digitized (Pak2)
the general form of the prediction function f will be chosen based on expert insight into
the nature of the process or statistical procedures such as regression analysis
o e.g. use the results of the benchmark to provide "points on the curve" with which
to determine the precise form of f
Forecast
given a prediction function, we can then forecast resource use during production with
useful, though not perfect, accuracy
Wkit is the use of resource k by the tth subtask required for a single generation of product
i
Wki = sum of Wkit over all t is the amount of resource k used by all subtasks in
making product i once
Vkj = sum of (Wki Yij) over all i is the amount of resource k used to make the required
numbers of all products in year j, where Yij is the number of generations of product i
required in year j
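The forecast arithmetic above can be sketched in a few lines of Python. The products, subtask times, and yearly frequencies below are hypothetical numbers invented for illustration; Yij is taken as the required number of generations of product i in year j, as supplied by the FRS:

```python
# Hypothetical figures for illustration only: staff minutes (resource k)
# used by each subtask t of product i for a single generation.
W = {
    "timber_map": [930, 136, 45],   # e.g. digitizing, editing, plotting
    "soil_map":   [120, 60],
}
# Y[i][j]: required number of generations of product i in year j.
Y = {
    "timber_map": [10, 15],
    "soil_map":   [4, 8],
}

# W_ki: resource used by all subtasks in making product i once.
W_i = {i: sum(subtasks) for i, subtasks in W.items()}

# V_kj: resource used to make the required numbers of all products in year j.
V = [sum(W_i[i] * Y[i][j] for i in W_i) for j in range(2)]
print(W_i, V)   # V = [11830, 18105] staff minutes in years 1 and 2
```

Summing the same W_i over other resources (CPU seconds, plotter hours) gives the full Vkj table that is compared against available capacity.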
Summary of phases of analysis
3. Analyze the system's ability to make the products from the qualitative evaluations in
(2) above
4. Obtain performance measures for known workloads from the results of the quantitative
benchmark
5. Build suitable models of performance from the data in (4) above
7. Predict future resource utilization from future workloads and performance models, and
compare to resources available, e.g. how does CPU utilization compare to time available?
E. APPLICATION OF MODEL
this section describes the application of this model of resource use in a benchmark
conducted for a government forest management agency with responsibilities for
managing many millions of acres/hectares of forest land
FRS was produced using the "fully internalized" methodology described in Unit 61
FRS identified 33 products
o 50 different GIS functions required to make them out of a total library of 75
GIS acquisition anticipated to exceed $2 million
these three phases provided at least one test of every required function
for functions which are heavy users of resources, many tests were conducted under
different workloads
o e.g. 12 different tests of digitizing ranging from less than 10 to over 700 polygons
Qualitative benchmark
each function was scored subjectively on a 10-point scale ranging from 0 = "very fast,
elegant, user-friendly, best in the industry" to 9 = "impossible to implement without
major system modification"
o score provides a subjective measure of the degree to which the function inhibits
generation of a product
o maximum score obtained in the set of all subtasks of a product is a measure of the
difficulty of making the product
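The product-difficulty measure described above can be computed mechanically from the subtask scores. The products, subtasks, and scores below are hypothetical examples, not results from the study:

```python
# Hypothetical qualitative scores (0 = best in the industry,
# 9 = impossible without major system modification) per subtask.
scores = {
    "timber_inventory_map": {"digitize": 2, "overlay": 5, "plot": 1},
    "harvest_schedule":     {"query": 3, "report": 7},
}

# Difficulty of a product = worst (maximum) score among its subtasks,
# since the weakest function is the one that inhibits generation.
difficulty = {product: max(s.values()) for product, s in scores.items()}
print(difficulty)   # {'timber_inventory_map': 5, 'harvest_schedule': 7}
```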
Quantitative benchmark
since this was an extensive study, consider for example the quantitative analysis for a
single function - digitizing
digitizing is a heavy user of staff time in many systems
delays in digitizing will prevent system reaching operational status
o digitizing of complete database must be phased carefully over 5 year planning
horizon to allow limited production as early as possible
as stated above, benchmark included 12 different digitizing tasks
resource measure of digitizing is staff time in minutes
predictors are number of polygons and number of line arcs
o line arcs are topological arcs (edges, 1-cells) not connected into polygons, e.g.
streams, roads
o other predictors might be more successful - e.g. number of polygons does not
distinguish between straight and wiggly lines though the latter are more time-
consuming to digitize - however predictors must be readily accessible and easy to
forecast
sample of results of quantitative benchmark:
polygons  line arcs  staff time (mins)
     766          0                930
     129          0                136
       0         95                120
benchmark digitizing was done by vendor's staff - well-trained in use of software, so
speeds are likely optimistic
Model
overhead - Models of time resources required
Results
the equation which fits the data best (least squares) is:
m = 1.21 p1 + 0.97 p2
o i.e. it took 1.21 minutes to digitize the average polygon, 0.97 minutes to digitize
the average line arc
to predict CPU use in seconds for the digitizing operation:
m = 2.36 p1 + 2.63 p2
o these models were used to forecast the resources required, despite the bias in
using the vendor's own staff in the digitizing benchmark
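A least-squares fit of this form can be reproduced with NumPy. Only the three sample observations quoted above are used here, so the coefficients differ from the reported model fitted to all twelve tests: the polygon coefficient comes out near 1.21, but the line-arc coefficient, estimated from a single observation in this sample, comes out near 1.26 rather than 0.97:

```python
import numpy as np

# Three of the twelve digitizing tests (from the sample table above).
p1 = np.array([766., 129., 0.])    # polygons digitized
p2 = np.array([0., 0., 95.])       # line arcs digitized
m  = np.array([930., 136., 120.])  # staff time, minutes

# Fit m = b1*p1 + b2*p2 through the origin by least squares
# (no intercept: zero work should take zero time).
X = np.column_stack([p1, p2])
(b1, b2), *_ = np.linalg.lstsq(X, m, rcond=None)
print(round(b1, 2), round(b2, 2))   # 1.21 1.26
```

Fitting through the origin is itself a modeling choice justified by knowledge of the process, in line with the balance between statistics and insight argued for above.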
F. LIMITATIONS
G. AGT BENCHMARK EXAMPLE
Project Background
REFERENCES
Goodchild, M.F., 1987. "Application of a GIS benchmarking and workload estimation model,"
Papers and Proceedings of Applied Geography Conferences 10:1-6.
Goodchild, M.F. and B.R. Rizzo, 1987. "Performance evaluation and workload estimation for
geographic information systems," International Journal of Geographical Information Systems
1:67-76. Also appears in D.F. Marble, Editor, Proceedings of the Second International
Symposium on Spatial Data Handling, Seattle, 497-509 (1986).
Marble, D.F. and L. Sen, 1986. "The development of standardized benchmarks for spatial
database systems," in D.F. Marble, Editor, Proceedings of the Second International Symposium
on Spatial Data Handling, Seattle, 488-496.
EXAM AND DISCUSSION QUESTIONS
1. Discuss the Marble and Sen paper listed in the references, and the differences between its
approach and that presented in this unit.
2. How would you try to predict CPU utilization in the polygon overlay operation? What
predictors would be suitable? How well would you expect them to perform based on your
knowledge of algorithms for polygon overlay?
4. Compare the approach to GIS applications benchmarking described in this unit with a standard
description of computer performance evaluation, for example D. Ferrari, 1978, Computer
Systems Performance Evaluation. Prentice Hall, Englewood Cliffs, NJ.
5. In some parts of the computing industry, the need for benchmarks has been avoided through
the development of standardized tests. For example such tests are used to compare the speed and
throughput rates of numerically intensive supercomputers, and of general-purpose mainframes.
Are such tests possible or appropriate in the GIS industry?
Benchmarking is comparing one's business processes and performance metrics to industry bests and best
practices from other companies. Dimensions typically measured are quality, time and cost. In the process of
best practice benchmarking, management identifies the best firms in their industry, or in another industry where
similar processes exist, and compares the results and processes of those studied (the "targets") to one's own
results and processes. In this way, they learn how well the targets perform and, more importantly, the business
processes that explain why these firms are successful.
Benchmarking is used to measure performance using a specific indicator (cost per unit of measure, productivity
per unit of measure, cycle time of x per unit of measure or defects per unit of measure) resulting in a metric of
performance that is then compared to others.[1]
Also referred to as "best practice benchmarking" or "process benchmarking", this process is used in
strategic management, in which organizations evaluate various
aspects of their processes in relation to best practice companies' processes, usually within a peer group
defined for the purposes of comparison. This then allows organizations to develop plans on how to make
improvements or adapt specific best practices, usually with the aim of increasing some aspect of performance.
Benchmarking may be a one-off event, but is often treated as a continuous process in which organizations
continually seek to improve their practices.
History
The term bench mark, or benchmark, originates from the chiseled horizontal marks that surveyors made in
stone structures, into which an angle-iron could be placed to form a "bench" for a leveling rod, thus ensuring
that a leveling rod could be accurately repositioned in the same place in the future. These marks were usually
indicated with a chiseled arrow below the horizontal line. In 1994, one of the first technical journals
on the subject, "Benchmarking: An International Journal", was published.
In 2008, a comprehensive survey[2] on benchmarking was commissioned by The Global Benchmarking
Network, a network of benchmarking centres representing 22 countries.
1. Mission and Vision Statements and Customer (Client) Surveys are the most used (by 77% of
organizations) of 20 improvement tools, followed by SWOT analysis (strengths, weaknesses,
opportunities, and threats) (72%), and Informal Benchmarking (68%). Performance
Benchmarking was used by 49% and Best Practice Benchmarking by 39%.
2. The tools that are likely to increase in popularity the most over the next three years are Performance
Benchmarking, Informal Benchmarking, SWOT, and Best Practice Benchmarking. Over 60% of
organizations that are not currently using these tools indicated they are likely to use them in the next
three years.
Procedure
There is no single benchmarking process that has been universally adopted. The wide appeal and
acceptance of benchmarking has led to the emergence of benchmarking methodologies. One seminal
book is Boxwell's Benchmarking for Competitive Advantage (1994).[3] The first book on benchmarking,
written and published by Kaiser Associates,[4] is a practical guide and offers a seven-step approach.
Robert Camp (who wrote one of the earliest books on benchmarking in 1989)[5] developed a 12-stage
approach to benchmarking.
The 12-stage methodology consists of:
1. Select subject
2. Define the process
3. Identify potential partners
4. Identify data sources
5. Collect data and select partners
6. Determine the gap
7. Establish process differences
8. Target future performance
9. Communicate
10. Adjust goal
11. Implement
12. Review and recalibrate
The following is an example of a typical benchmarking methodology:
Identify problem areas: Because benchmarking can be applied to any business process or function, a
range of research techniques may be required. They include informal conversations with customers,
employees, or suppliers; exploratory research techniques such as focus groups; or in-depth marketing
research, quantitative research, surveys, questionnaires, re-engineering analysis, process mapping, quality
control variance reports, financial ratio analysis, or simply reviewing cycle times or other performance
indicators. Before embarking on comparison with other organizations it is essential to know the
organization's function and processes; base lining performance provides a point against which
improvement effort can be measured.
Identify other industries that have similar processes: For instance, if one were interested in improving
hand-offs in addiction treatment one would identify other fields that also have hand-off challenges. These
could include air traffic control, cell phone switching between towers, transfer of patients from surgery to
recovery rooms.
Identify organizations that are leaders in these areas: Look for the very best in any industry and in any
country. Consult customers, suppliers, financial analysts, trade associations, and magazines to determine
which companies are worthy of study.
Survey companies for measures and practices: Companies target specific business processes using
detailed surveys of measures and practices used to identify business process alternatives and leading
companies. Surveys are typically masked to protect confidential data by neutral associations and
consultants.
Visit the "best practice" companies to identify leading edge practices: Companies typically agree to
mutually exchange information beneficial to all parties in a benchmarking group and share the results
within the group.
Implement new and improved business practices: Take the leading edge practices and develop
implementation plans which include identification of specific opportunities, funding the project and selling
the ideas to the organization for the purpose of gaining demonstrated value from the process.
Costs
The three main types of costs in benchmarking are:
Visit Costs - This includes hotel rooms, travel costs, meals, a token gift, and lost labor time.
Time Costs - Members of the benchmarking team will be investing time in researching problems, finding
exceptional companies to study, visits, and implementation. This will take them away from their regular
tasks for part of each day so additional staff might be required.
Benchmarking Database Costs - Organizations that institutionalize benchmarking into their daily
procedures find it useful to create and maintain a database of best practices and the companies
associated with each best practice.
The cost of benchmarking can substantially be reduced through utilizing the many internet resources that have
sprung up over the last few years. These aim to capture benchmarks and best practices from organizations,
business sectors and countries to make the benchmarking process much quicker and cheaper.[6]
Technical/product benchmarking
The technique initially used to compare existing corporate strategies with a view to achieving the best possible
performance in new situations (see above), has recently been extended to the comparison of technical
products. This process is usually referred to as "technical benchmarking" or "product benchmarking". Its use is
well-developed within the automotive industry ("automotive benchmarking"), where it is vital to design products
that match precise user expectations, at minimal cost, by applying the best technologies available worldwide.
Data is obtained by fully disassembling existing cars and their systems. Such analyses were initially carried out
in-house by car makers and their suppliers. However, as these analyses are expensive, they are increasingly
being outsourced to companies who specialize in this area. Outsourcing has enabled a drastic decrease in
costs for each company (by cost sharing) and the development of efficient tools (standards, software).
Types
Benchmarking can be internal (comparing performance between different groups or teams within an
organization) or external (comparing performance with companies in a specific industry or across industries).
Within these broader categories, there are three specific types of benchmarking: 1) Process benchmarking, 2)
Performance benchmarking and 3) Strategic benchmarking. These can be further detailed as follows:
Process benchmarking - the initiating firm focuses its observation and investigation of business processes
with a goal of identifying and observing the best practices from one or more benchmark firms. Activity
analysis will be required where the objective is to benchmark cost and efficiency; increasingly applied to
back-office processes where outsourcing may be a consideration. Benchmarking is appropriate in nearly
every case where process redesign or improvement is to be undertaken so long as the cost of the study
does not exceed the expected benefit.
Financial benchmarking - performing a financial analysis and comparing the results in an effort to assess
your overall competitiveness and productivity.
Benchmarking from an investor perspective - extending the benchmarking universe to also compare to
peer companies that can be considered alternative investment opportunities from the perspective of an
investor.
Benchmarking in the public sector - functions as a tool for improvement and innovation in public
administration, where state organizations invest efforts and resources to achieve quality, efficiency and
effectiveness of the services they provide.[7]
Performance benchmarking - allows the initiator firm to assess their competitive position by comparing
products and services with those of target firms.
Product benchmarking - the process of designing new products or upgrades to current ones. This process
can sometimes involve reverse engineering, which is taking apart competitors' products to find strengths
and weaknesses.
Strategic benchmarking - involves observing how others compete. This type is usually not industry
specific, meaning it is best to look at other industries.
Functional benchmarking - a company will focus its benchmarking on a single function to improve the
operation of that particular function. Complex functions such as Human Resources, Finance and
Accounting and Information and Communication Technology are unlikely to be directly comparable in cost
and efficiency terms and may need to be disaggregated into processes to make valid comparison.
Best-in-class benchmarking - involves studying the leading competitor or the company that best carries out
a specific function.
Operational benchmarking embraces everything from staffing and productivity to office flow and analysis of
procedures performed.[8]
Energy benchmarking - process of collecting, analysing and relating energy performance data of
comparable activities with the purpose of evaluating and comparing performance between or within
entities.[9] Entities can include processes, buildings or companies. Benchmarking may be internal between
entities within a single organization, or - subject to confidentiality restrictions - external between competing
entities.
Tools
Benchmarking software can be used to organize large and complex amounts of information. Software
packages can extend the concept of benchmarking and competitive analysis by allowing individuals to handle
such large and complex amounts of information. Such tools support different types of benchmarking (see above)
and can reduce the above costs significantly.
Metric benchmarking
Another approach to making comparisons involves using more aggregative cost or production information to
identify strong and weak performing units. The two most common forms of quantitative analysis used in metric
benchmarking are data envelopment analysis (DEA) and regression analysis. DEA estimates the cost level an
efficient firm should be able to achieve in a particular market. In infrastructure regulation, DEA can be used to
reward companies/operators whose costs are near the efficient frontier with additional profits. Regression
analysis estimates what the average firm should be able to achieve. With regression analysis, firms that
performed better than average can be rewarded while firms that performed worse than average can be
penalized. Such benchmarking studies are used to create yardstick comparisons, allowing outsiders to
evaluate the performance of operators in an industry. Advanced statistical techniques, including stochastic
frontier analysis, have been used to identify high and weak performers in industries, including applications to
schools, hospitals, water utilities, and electric utilities.[10]
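The regression half of this approach can be sketched as follows. The utilities, outputs, and cost figures below are invented for illustration; firms whose actual cost falls below the fitted average-firm prediction are flagged as strong performers:

```python
import numpy as np

# Hypothetical data: annual output (GWh) and cost ($M) for six utilities.
output = np.array([10., 20., 30., 40., 50., 60.])
cost   = np.array([13., 19., 33., 38., 52., 55.])

# Fit the "average firm" cost model: cost = a + b * output.
b, a = np.polyfit(output, cost, 1)

# A firm costing less than the average-firm prediction performs well.
strong = (cost < a + b * output).tolist()
print(strong)   # [False, True, False, True, False, True]
```

A regulator applying yardstick comparisons would typically use many more observations and additional cost drivers, but the logic of rewarding below-the-line firms is the same.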
One of the biggest challenges for metric benchmarking is the variety of metric definitions used among
companies or divisions. Definitions may change over time within the same organization due to changes in
leadership and priorities. The most useful comparisons can be made when metric definitions are common
between compared units and do not change over time, so that improvements can be verified.