trb05 Simcalval

Development and Evaluation of a Procedure for the Calibration of Simulation Models
Byungkyu (Brian) Park and Hongtu (Maggie) Qi

Microscopic traffic simulation models have been playing an important role in the evaluation of transportation engineering and planning practices for the past few decades, particularly in cases in which eld implementation is difficult or expensive to conduct. To achieve high delity and credibility for a traffic simulation model, model calibration and validation are of utmost importance. Most calibration efforts reported in the literature have focused on the informal practice, and they have seldom proposed a systematic procedure or guideline for the calibration and validation of simulation models. This paper proposes a procedure for microscopic simulation model calibration. The validity of the proposed procedure was demonstrated by use of a case study of an actuated signalized intersection by using a widely used microscopic traffic simulation model, Verkehr in Staedten Simulation (VISSIM). The simulation results were compared with multiple days of eld data to determine the performance of the calibrated model. It was found that the calibrated parameters obtained by the proposed procedure generated performance measures that were representative of the eld conditions, while the simulation results obtained with the default and best-guess parameters were signicantly different from the eld data.
Microscopic traffic simulation models have been widely and increasingly applied for the evaluation of transportation system design, trafc operations, and management alternatives for the past few decades because simulation is inexpensive, risk-free, and fast. As recognized by the Highway Capacity Manual (HCM), simulation becomes a valuable aid in assessing the performance of transportation systems (1). Furthermore, traffic simulation models sometimes provide signicant advantages over traditional planning or analytical models, such as Highway Capacity Software (HCS). For example, the estimation of delay from HCS is often inappropriate when the impacts of a large-scale system are considered or oversaturated conditions are prevalent (2). Other common reasons for the popularity of simulation models include their attractive animations; their stochastic variability for the capture of real-world traffic conditions; and their capabilities to model complex roadway geometries, such as combined systems of urban streets and freeways. When more and more people begin to rely on simulation as an important evaluation tool, an essential concern is its credibility. Potential issues are whether these models accurately replicate realistic driver
B. Park, Department of Civil Engineering, University of Virginia, and Virginia Transportation Research Council, P.O. Box 400742, Charlottesville, VA 22904-4742. H. Qi, BMI-SG, 8330 Boone Boulevard, Suite 400, Vienna, VA 22182-2624. Transportation Research Record: Journal of the Transportation Research Board, No. 1934, Transportation Research Board of the National Academies, Washington, D.C., 2005, pp. 208217.
behavior and whether one can trust the decisions made on the basis of the simulations. Microscopic simulation models contain numerous independent parameters that can be used to describe traffic ow characteristics, driver behavior, and traffic control operations. These models provide a default value for each parameter, but they also allow users to change the values to represent local traffic conditions (3). The process of adjusting and ne-tuning model parameters by using realworld data to reect local traffic conditions is model calibration. Simulation model-based analyses are often performed with default parameter values or manually adjusted values. Rigorous calibration procedures are often omitted because doing so requires a great deal of time as well as a huge amount of eld data. The ndings and conclusions based on the uncalibrated or inappropriately calibrated models could be misleading and even erroneous. Thus, proper calibration is a crucial step in simulation applications. To achieve adequate reliability of the simulation models, it is important that a rigorous calibration procedure be applied before any further study and analysis are conducted. Changes to the parameters during calibration should be justied and defensible by the users. Most of the calibration efforts were performed to achieve a reasonable correspondence between the eld data and the output of the simulation model. Recently, more and more transportation researchers and practitioners have realized the importance of model calibration and have spent signicant time and effort to demonstrate the validities of their models. However, it was found that simulation model calibration was often informally practiced among researchers and had seldom been formally proposed as a procedure or guideline (3). Therefore, proposing a general and practical procedure for simulation model calibration is an urgent task. Several researchers have discussed the general requirements of a simulation model calibration procedure (39). Park and Schneeberger proposed a general calibration procedure based on a linear regression model (3). However, they did not consider the correlations among the parameters. Sacks et al. demonstrated an informal validation process and emphasized the importance of data quality and visualization (4). Hellinga proposed general requirements for model calibration (5). However, a formal procedure for conducting calibration was not provided. The objective of this paper is to propose and evaluate a general procedure for calibration of microscopic simulation models. The validity of the proposed procedure was evaluated and demonstrated by use of a case study with an actuated signalized intersection by using a widely used microscopic traffic simulation model, Verkehr in Staedten Simulation (VISSIM). The remainder of the paper is organized as follows: a brief background of the test site and simulation model, VISSIM, is provided; the methodology of the proposed procedure is described and is followed by the implantation and evaluation of the procedure
208
Park and Qi
209
by use of a case study; lastly, conclusions and recommendations are provided.
TEST SITE AND SIMULATION MODEL Test Site An actuated signalized intersection was modeled in VISSIM to test the proposed procedure for simulation model calibration. The intersection is located at the junction of U.S. Route 15 and U.S. Route 250 in Virginia. The test site is referred to as Site 15 throughout the remainder of this paper. As shown in Figure 1, it has four legs with a single-lane approach for all directions, and there are exclusive rightturn bays 110 ft long on all approaches. The southbound approach carries a relatively heavy traffic volume and has a large proportion of left-turning vehicles during peak hours. This isolated actuated intersection has two phases and functions, such as minimum and maximum green times, added initial, passage time, and minimum gap.
logic, including actuated signal control. VISSIM uses the psychophysical driver behavior model developed by Wiedemann (11), which uses dynamic speeds and stochastic car-following models and, hence, models driving behavior close to that detected in the eld. In this research, all simulation work was carried out with Version 3.70 of VISSIM on a personal computer with a Pentium 2.53-GHz central processing unit and 1.00 GB of random access memory.
PROPOSED PROCEDURE The proposed procedure for simulation model calibration consists of simulation model setup, initial calibration, feasibility test, parameter calibration by use of a genetic algorithm (GA), and model evaluation, as shown in Figure 2.
Simulation Model Setup Simulation model setup comprises those tasks that are conducted before the commencement of model calibration. These tasks include the denition of the study scope and purpose, site selection, determination of measures of effectiveness (MOEs), eld data collection, and network coding.
Simulation Model VISSIM, a microscopic and stochastic simulation model, was chosen in this research for evaluation of the proposed procedure. VISSIM is capable of simulating traffic operations on urban streets and freeways, with a special emphasis on public transportation and multimodal transportation (10). VISSIM consists of two different programs: the traffic simulator and the signal state generator. The traffic simulator is a microscopic simulation model that comprises car-following and lane-changing logics, and it provides a simulation environment. The simulator is capable of simulation at a resolution of up to 0.1 s. The signal state generator allows users to dene their own signal control
Initial Evaluation Once a simulation model is correctly set up, the simulation performance data based on the default parameter are obtained and compared with the eld data. If a close match between them is found, the default model could be deemed appropriate for use for further analysis. Otherwise, the subsequent steps must be performed.
Route 15
Route 250
FIGURE 1
Site 15 in VISSIM.
210
Transportation Research Record 1934
Simulation Model Setup - Study scope and purpose - Site selection - Determination of measures of effectiveness - Field data collection - Network coding
Initial Evaluation on Default Parameters Unsatisfied
Satisfied
Initial Calibration - Identification of calibration parameters - Experimental design for calibration (Latin Hypercube Design) - Multiple runs
number of combinations to a practical amount while still reasonably covering the entire parameter surface. LHD provides an orthogonal array that randomly samples the entire design space broken into regions of equal probability. This type of sampling can be regarded as a stratied Monte Carlo sampling method in which the pairwise correlations can be minimized to a small value, which is essential for uncorrelated parameter estimates (12). LHD is especially useful for exploring the interior of the parameter space and for limiting the experiment to a xed or a user-dened number of combinations. This technique ensures that the entire range of each parameter is sampled. In this study, a Latin Hypercube Sampling toolbox in MatLab (13) was used to generate a predetermined number of scenarios by using the calibration parameters of each simulation model.
Multiple Runs
Feasibility Test - Statistical plots - Analysis of variance - Identification of key parameters
Adjust key parameter ranges
Feasibility Test Passed? Yes Parameter Calibration Using Genetic Algorithm
No
To reduce stochastic variability, multiple runs are conducted for each scenario from the experimental design. The number of replications should be a function of the variability of the system, the acceptable error level, and computation time constraints. The average performance measure and standard deviation are recorded for each of the runs. The simulation results are used in the feasibility test.
Feasibility Test
The purpose of the feasibility test is to check whether the eld data are reasonably covered by the distribution of simulation results and, hence, to determine whether the current parameter ranges are feasible. If eld data fall within the middle 90% of the distribution of the simulation results, dened as the acceptable range, the current parameter ranges are deemed sufficient to generate the field condition. Otherwise, the ranges of key parameters are adjusted to achieve the desired results. However, the adjusted value should be reasonable and realistic. With the adjusted parameters, another set of a predetermined number of scenarios is generated and evaluated. This process is repeated until the eld data fall within the acceptable range.
Evaluation of the Parameter Sets Unsatisfied - Distribution of simulation output - Statistical test - Visualization check
Satisfied End
FIGURE 2
Methodology flowchart.
Statistical Plots
With the performance measure recorded in each of the runs, statistical plots such as scatter plots and histograms are drawn to help interpret the results. The plots include the distribution of performance measures from multiple runs and the performance measure versus each calibration parameter. For the key parameters, the trend is usually easy to observe.
Initial Calibration
Identication of Calibration Parameters

All calibration parameters within the simulation model except those that do not have any relevant impact on the simulation results are identied. For example, for a simple network without a route diversion possibility, the parameters related to the route changes could be eliminated from the calibration parameter list. The acceptable ranges of selected calibration parameters are determined.
Analysis of Variance
Analysis of variance (ANOVA), a more rigorous statistical method, is applied to identify key parameters. ANOVA tests the null hypothesis that the means for several groups in the population are equal by comparing the sample variance estimated from the group means with that estimated within the groups (14). In this study each pair of calibration parameter and simulation results was examined by one-way ANOVA by use of a statistical package, SPSS (15). It can identify statistically signicant parameters on the basis of the signicance value of the F-test and can quantify the
Experimental Design for Calibration

The number of combinations among feasible controllable parameters is extremely large, such that all possible scenarios cannot be evaluated in a reasonable amount of time. A Latin hypercube design (LHD) algorithm (12), an experimental design method, is used to reduce the
Park and Qi
211
degree to which each parameter explains the performance measure on the basis of the sum of squares (SSRs) between the groups. When the P-value is smaller than the user-dened condence level, the null hypothesis is rejected, thereby indicating that group means are statistically different. Therefore, the parameters with small P values and large SSRs are identied as key parameters.
evaluations of travel time distributions, statistical tests, and animations are satisfactory. Otherwise, either the previous step, which is parameter calibration by use of a GA, or the initial calibration stage must be reperformed.
IMPLEMENTATION OF PROPOSED PROCEDURE Parameter Calibration by Use of GA After the identication of appropriate parameters and their acceptable ranges, a GA is applied to nd the optimal parameter values. Analysis by use of a GA is a heuristic optimization technique based on the mechanics of natural selection and evolution (16 ). It works with a population of individuals, each of which represents a possible solution to a given problem. A tness value is assigned to each individual according to how good its represented solution is. Selection of the best individuals from the current generation and mating of the best individuals to produce a new set of individuals produce a new population of possible solutions. Three basic operators used in GA analysis are reproduction, crossover, and mutation. The reproduction operator selects individuals with higher tness. The crossover operator creates the next population from the intermediate population, whereas the mutation operator is used to explore some areas that have not been searched. Schema theorem and building blocks hypothesis are rigorous explanations of how the highly t individuals survive to the next generations (17 ). In this study the relative error of the average travel time between the simulation output and the eld data was used as the tness value for the GA. The tness function takes the form FV = where FV = tness value, TTeld = average travel time from the eld, and TTsim = average travel time from the simulation. TTfield TTsim TTfield (1) The proposed procedure was implemented and evaluated in a case study by using VISSIM. This section describes the main steps in detail and highlights the strategies of ne-tuning of the key parameters.
Simulation Model Setup The tasks under the rst step, simulation model setup, consist of the denition of the study scope and purpose, site selection, determination of MOEs, eld data collection, and network coding. The data needed in this case study included the simulation input data and MOEs for calibration purposes. To code Site 15 in the simulation environment, input data such as traffic counts, the length of each approach lane, intersection geometric characteristics, detector locations, and the posted speed limit were required. To account for dayto-day variability, data were collected during the evening peak hour between 5:00 and 6:00 p.m. on multiple days, April 15 and 22 and May 13, 2003. The MOE used in this network was the average travel time on the southbound through and left-turn movements because it directly reflected the level of service of the signalized intersection. Furthermore, the collection of data from both the eld and the simulation was relatively easy.
Initial Evaluation Once the simulation model was correctly set up, multiple simulations based on the default parameters were conducted and the average travel time was recorded. It was found that the average travel time of 23.4 s from the simulation was far less than the eld travel time of 56.75 s. Thus, it was deemed that calibration was necessary and that the following steps needed to be practiced.
Evaluation of Parameter Sets Once GA nishes parameter calibration, multiple runs are separately conducted for the default parameters and the GA-based parameters for comparison. The best-guess parameter set is another parameter set that could be included in the comparison. Here, the best guess is derived by assignment of the most adequate value to each parameter on the basis of engineering judgment and knowledge of local traffic conditions. For each parameter set, a distribution of performance measures is developed and compared with the eld measures. Statistical testing, for example, by the t-test and visualization checking, is conducted. The t-test is used to verify whether the calibrated parameter set from GA can generate statistically signicant results and perform better than the default parameter set. In this study, the performance measure used in the t-test was the average travel time within the study period. Visualization is used to evaluate the calibrated models. A model cannot be considered calibrated if the animations are not realistic. Animations from multiple runs of each model are viewed to ensure that the animations are acceptable. The model is regarded as properly calibrated only if the results of the Initial Calibration
Identication of Calibration Parameters

In the initial phase, users may not be able to identify correctly all the critical parameters or the reasonable range for each parameter. Two criteria could potentially help users select the parameter candidates to be calibrated: 1. Conduct a trial-and-error test for each parameter. If the parameter does not affect the simulation result at all, it can be excluded from the parameter candidate list. 2. Judge from a traffic engineers point of view. For instance, because the study network has only one lane for each approach and because the origindestination was simple without any diversion possibility, parameters related to lane change or route choice behaviors should not have an impact on the simulation result. Thus, the minimum headway (front and rear) and the waiting time before diffusion were eliminated from the parameter list.
212
Users can determine the acceptable range for each parameter selected on the basis of a review of the literature, including the manuals for simulation models and other existing research ndings. So that the calibration direction is not misled, the use of unrealistic values in the range should be avoided. The feasibility of selected parameters and their ranges was later veried through a feasibility test. The following is the list of parameters and acceptable ranges determined initially. A detailed description of each parameter can be found in the VISSIM manual (10). 1. Simulation resolution: 1 to 9 time steps per simulation second 2. Number of observed preceding vehicles: 1 to 4 3. Average standstill distance: 1 to 5 m 4. Saturation ow rate: 1,756, 1,800, 1,846, 1,895, and 1,946 vehicles per hour Additive part of desired safety distance: 2.0 to 2.5 Multiple part of desired safety distance: 3.0 to 3.7 5. Priority rules for minimum headway: 5 to 20 m 6. Priority rules for minimum gap time: 3 to 6 s 7. Desired speed distribution: 40 to 50, 30 to 40, and 20 to 30 mph Both the additive and the multiple parts of the desired safety distances in the Wiedemann 74 car-following model determine the saturation ow rate for VISSIM. The values were based on Table 5.2.6 (ow rate, 25 s of through green time) in the VISSIM manual. The priority rules and the two associated parameters determine the priorities of conicting traffic movements in the intersection and designate the right-of-way.
Multiple Runs
Five random seeded runs were conducted in VISSIM for each of the 200 cases, for a total of 1,000 runs. The average travel time was recorded for each of the 1,000 runs. The results from the ve multiple runs of each scenario were then averaged to represent each of the 200 parameter sets.
Feasibility Test With the simulation results based on the initial parameter set, it is important to answer the following key questions by applying statistical methods and analyzing plots. Are the simulation results based on the current parameters, and do their ranges include the eld conditions? If not, some parameters and their ranges need to be adjusted or new parameters should be explored. Histogram analysis is appropriate for this purpose. Figure 3 shows that the average eld travel time, 56.75 s, falls outside of the simulated distribution generated from the initial ranges of parameters. It indicates that the current parameters and their ranges were not satisfactory and needed adjustment. Is the simulation result sensitive to the different values of each parameter? Also, which parameters are critical to the simulation results? The use of plots and the ANOVA test is appropriate. Scatter plots of each parameter versus the travel time from the simulations were investigated. It was found that the minimum gap and the desired speed distribution were two parameters important to the results. The travel time became higher when the minimum gap increased or the mean desired speed decreased, or both. Figure 4 provides two examples of the scatter plots. The mean desired speed demonstrates an apparent trend, while the simulation resolution has no effect on the simulation result. However, the identication of the critical parameters through statistical analysis is more rigorous. The values of each parameter were made discrete, separated into several groups, and indexed. A one-way ANOVA procedure was then used to test the null hypothesis that the means for two or more groups were equal. If the parameter was sensitive to the result, the means for different groups should be statistically different. Table 1 shows the ANOVA results and lists the signicance value of the F-test, the mean difference between groups, and the SSRs between groups. A small P-value and a large SSR indicate that the null hypothesis is rejected and that this parameter is important to the result. The group pairs were listed if the mean difference was signicant at either the .5 or the .05 level.
Experimental Design for Calibration

The Latin Hypercube Sampling toolbox in MATLAB was used to generate 200 scenarios by using the rst set of parameters and their ranges, as dened above. In this case, for eight controllable input parameters and ve different values for each parameter, it would require 58 (i.e., 390,625) possible combinations. Any number of scenarios within this value could have been used. However, a signicant amount of time would be required to conduct such a large number of experiments. Furthermore, the large number of simulation runs would be increased if multiple simulation runs for each scenario were required to reduce stochastic variability. Initial tests indicated that the use of 200 scenarios was adequate to cover the entire parameter surface and for computational simulation and calculation.
70 60 Frequency 50 40 30 56.75 sec 20 10 0 25 30 35 40 45 Travel Time (sec) 50 55 60
FIGURE 3
Feasibility test results for Site 15 with VISSIM.
Park and Qi
213
60 50 Travel Time (sec) 40 30 20 10 0 20 25 30 35 40 Mean Desired Speed (mph) (a) 45 50
60 50 Travel Time (sec) 40 30 20 10 0
4 5 6 Simulation Resolution (b)
10
FIGURE 4 Calibration parameters versus VISSIM travel time: ( a ) mean desired speed and ( b ) simulation resolution.
On the basis of the ANOVA results in Table 1, the desired speed distribution and the minimum gap time were identified as critical parameters. The desired speed distribution is an important parameter because it has a signicant inuence on roadway capacity and achievable speeds. If a driver is not hindered by other vehicles, the driver will travel at a desired speed with small stochastic variation. The min-
imum gap is an important parameter for the modeling of conicting traffic movements and determination of when the movement with a lower priority could proceed. To achieve a longer travel time from the simulation, the mean desired speed should be decreased or the minimum gap should be increased. However, because the upper bound of the minimum gap (i.e., 6 s) has already been reached, the use of higher
TABLE 1
ANOVA Results for Site 15 with VISSIM Signicance Value (P-value) 0.989 0.336 0.369 0.910 0.220 0.000 0.000 Signicant Mean Differences (sig. < 0.5) NA 13, 14 13 NA 1 4, 2 4 13, 1 4*, 2 4*, 3 4 All SSR Between Groups 12.8 137.0 80.8 40.9 178.2 834.8 6655.9
Site 15VISSIM Simulation resolution No. of preceding vehicles Average standstill distance Saturation ow Min. headway (m) Min. gap time (s) Desired speed distribution
* = The signicance value is less than .05; NA = not applicable.
214
values for the minimum gap is not realistic. It was believed that some other factors that might have affected the result were not yet identied. Therefore, additional tools, like the HCM procedure and eld data, were used to help clarify the ranges of some parameters, such as the saturation ow on the southbound approach and the eld speed within the intersection.
and the new ranges of the additive part of desired safety distance and the multiple part of desired safety distance were 1.0 to 5.0 and 1.0 to 6.0, respectively. The updated parameter set and its ranges are as follows: Simulation resolution: 1 to 9 time steps per simulation second Number of observed preceding vehicles: 1 to 4 Maximum look-ahead distance: 200 to 300 m Average standstill distance: 1 to 5 m Saturation ow rate Additive part of desired safety distance: 1.0 to 5.0 Multiple part of desired safety distance: 1.0 to 6.0 6. Priority rules for minimum headway: 5 to 20 m 7. Priority rules for minimum gap time: 3 to 6 s 8. Desired speed distribution: 30 to 40, 32.5 to 37.5, and 27.5 to 42.5 mph With the new parameters, another 200 cases were generated by the LHD method. The result shows that the new simulated distribution covers the eld data and indicates that the new parameter ranges were able to capture the eld condition. Parameter Calibration by Use of GA A GA was integrated with the VISSIM model to calibrate the parameters. First, the GA is started by generating a number of individuals in the population, each of which represents a feasible solution. Second, each individual is decoded into actual parameter values and inserted into the simulation model. Multiple simulations were then carried out for each set of decoded values, with the simulation results channeled into a tness function. In this case, ve simulation runs were conducted for each parameter set to reduce simulation variability. Third, a tness value is calculated by comparison of the simulation results and eld data by using Equation 1. If the comparison result does not meet the stopping criterion, then a new population is generated after the implementation of selection, crossover, mutation, and elitism operations. This process is continued until the stopping criterion is met or the maximum number of generations is reached. Figure 5a shows the convergence of the GA. The results were based on 10 generations and a population size of 20. Other parameter settings included a crossover rate of 0.8 and a mutation rate of 0.05. Both the best tness and the average tness values decrease as the number of generations increases, which indicates that the simulation result becomes closer to the eld value. The small oscillation of tness values in the last generations is due to the stochastic variability of the VISSIM simulation because only ve runs were conducted for each set of parameters. For the parameter set with the best tness value, VISSIM was run 100 times and the average travel time was recorded for each run. The resulting distribution of travel times is shown in Figure 5b. The eld travel times from 3 days also are shown, and they all fall within the simulated distribution. Evaluation of Parameter Sets In addition to the default and GA-optimized parameter sets, a bestguess parameter set was prepared to evaluate the performances of the calibrated simulation models. The best-guess parameter set was prepared because default simulation parameters were often adjusted on the basis of the engineers knowledge of local traffic and other conditions. As both the default and the GA-optimized parameter sets 1. 2. 3. 4. 5.
Check Speed Within the Intersection

Although a posted speed limit of 45 mph was observed in the eld, the actual speed within the intersection could be less than 45 mph because of inadequate sight distance, permitted left turns, a narrow intersection, and other factors, such as the presence of gas stations about 500 ft upstream of the stop line on the southbound approach. It is more reasonable to use the eld average speed to set the link speed and turning speed. By viewing the videotapes, the speed data were retrieved as the tail of the vehicle crossed the stop bar on the southbound approach. The rst few vehicles in the queue were not considered because they were still accelerating. The result shows that the mean eld speed within the intersection is 35.3 mph, with a standard deviation of 14.9 mph.
Check Saturation Flow

The saturation ow rate is another important factor for determination of the intersection capacity, and this in turn affects the vehicle travel time. It can be obtained from eld data and by the HCM procedure (1). The following shows how the saturation ow rate was calculated and estimated by using these approaches: Calculation of queue discharge headway from eld data. The eld saturation ow rate on the southbound approach was obtained from the videotapes. First, the time stamps of the third and the last vehicles in the queue were recorded as they passed the stop line in each cycle. The saturation ow rate was then calculated by using Equation 2. The average headway was 3.13 s, which resulted in a saturation ow of 1,149 vehicles per hour per lane. SF = where SF = saturation ow rate (vehicles per hour per lane), t3 = discharge headway of the third vehicle, and tn = discharge headway of the nth vehicle. Calculation by HCM procedure. A saturation ow rate was calculated by the HCM procedure (1). The ideal saturation ow rate of 1,900 vehicles per hour per lane was adjusted for the prevailing conditions. The adjustment is made by introducing factors such as the number of lanes, the lane width, the heavy vehicle percentage, and right and left turns. The adjusted saturation ow rate on the southbound approach was 1,313 vehicles per hour per lane. 3,600 (t3 tn ) (n 3) (2)
Parameter Range Modication

The parameter ranges were modified on the basis of the field speed data and the estimated saturation flow rates. The modified desired speed ranges were 30 to 40, 32.5 to 37.5, and 27.5 to 42.5 mph;
Park and Qi
215
0.45 0.4 0.35 0.3

Best Avg.
Fitness
0.25 0.2 0.15 0.1 0.05 0 0 1 2 3 4 5 6 7 Number of Generations (a) 8 9 10
60 50 40
Field 3: 46.51 secs Field 2: 53.32 secs
Frequency
30 20 10 0 30 40 50 60 Travel Time (sec) (b) 70 80

Field 1: 70.43 secs
FIGURE 5 Initial calibration result obtained by use of GA: ( a ) convergence of fitness value with generation and ( b ) travel time distribution of GA-based parameter set.
were already identied, this section describes how the best-guess parameter set was obtained. Best-guess parameters were obtained on the basis of the users experience with the study network and with the help of the analytical tool in HCM. For instance, in this case study, additional information such as saturation ow was obtained from the HCM procedure. According to the tables provided in the VISSIM manual, a linear regression model with an R2 value of .92 was derived to estimate the saturation ow rate: SF = 2, 469.105 (172.93 bx_add ) (65.51 bx_mult ) where SF = saturation ow rate (vehicles per hour per lane), bx_add = additive part of desired safety distance, and bx_mult = multiple part of desired safety distance. By use of the Solver function in Microsoft Excel software, an optimal combination of these two parameters was obtained to minimize (3)
the difference between the HCM and VISSIM saturation ow. The values for bx_add and bx_mult were 5 and 2.85 m, respectively. For comparison purposes, the parameter values for each set are listed in Table 2. The evaluation of three parameter sets (i.e., default, GA optimized, and best guess) were based on three criteria: (a) comparison of the distributions of 100 VISSIM simulation outputs, (b) paired t-test, and (c) animations. Comparison of these three parameter sets shows the importance of calibration for microscopic simulation models. The travel times along the southbound approach are compared in Figure 6. Figure 6 shows that all three eld travel times fall within the distribution of simulation results generated with the GA-optimized parameter set. The two other models (i.e., the default and best-guess parameter sets) generated travel times much less than those observed in the eld. A t-test was conducted to compare the simulation results obtained with the GA-optimized parameter set with those obtained with the two other parameter sets. The result shows that the GA-optimized parameter set generated statistically signicant results, unlike the two other parameter sets. Animations of each parameter set were viewed to determine whether the animations
216
TABLE 2
Three Parameter Sets for Site 15 with VISSIM Default 5 2 250.00 2.00 3.00 3.00 5.0 3.0 Car: 4050 HV: 3040 23.40 Best-Guess 10 4 300.00 2.00 5.00 2.85 15.0 3.5 Car: 4050 HV: 3040 29.53 GA-Based 6 4 215.15 3.85 5.0 5.3 20.0 4.0 Car: 27.542.5 HV: 15.518.6 50.33
Site 15VISSIM Simulation resolution Number of observed preceding vehicles Maximum look-ahead distance (m) Average standstill distance (m) Additive part of desired safety distance Multiple part of desired safety distance Priority rulesminimum headway (m) Priority rulesminimum gap time (s) Desired speed distribution (mph) Average travel time (s)
HV = heavy vehicle.
were realistic or unrealistic. For the GA-optimized calibrated parameter set, the animations at several travel time percentiles were found to be acceptable. For the default and the best-guess parameter sets, the majority of vehicles passed the intersection without waiting, which was not realistic.
CONCLUSIONS AND RECOMMENDATIONS This paper has proposed and evaluated a general calibration procedure for microscopic simulation models. The validity of the proposed procedure was demonstrated by use of a case study of an actuated signalized intersection with VISSIM. The performance of the GAoptimized calibrated model was evaluated by comparing the distribution of the simulation output with multiple days of field data. The result shows that use of the GA-optimized calibrated parameters yielded a distribution of simulation outputs that matched the eld travel times for multiple days, while use of the default parameters and best-guess parameters resulted in signicant discrepancies between the simulated and the eld data. Therefore, the importance of model
calibration is manifested. It is recommended that a simulation model be calibrated before any further simulation-based traffic analyses are conducted. Feasibility testing was found to be useful in identifying the reasonable and appropriate ranges of calibration parameters. It was performed as an initial check of whether or not the selected parameters and their ranges were able to achieve the eld condition. In addition, eld data and analytical tools could be used to help verify the initial parameter ranges. The impact of key parameters cannot be overstated, and appropriate values need to be assigned. Scatter plots and statistical analysis, such as analysis by the ANOVA test, were effective in identifying key parameters. Because this study used only one MOE, travel time, for model calibration, the performance of other MOEs is uncertain. Further research is recommended to include additional MOEs in the calibration process. However, for some MOEs, such as delay and queue length, it should be noted that different simulation models provide different calculation methods. Furthermore, to validate the model, it is desirable to collect another set of data under conditions different from those used for the calibration. Different conditions could be those that have been
45 40 35
53.32 sec
Frequency
30 25 20 15 10 5 0 20 30 40
46.51 sec
70.43 sec
50 Travel Time (sec)

Best-Guess
60
70
80
Default Parameters
GA-based
FIGURE 6
Comparison of travel times by different parameter sets.
Park and Qi
217
untried, such as after the implementation of a new signal timing plan, or the use of different days on similar transportation facilities. To account for day-to-day variability, multiple days of eld data should be used. The values of the data from only a single day could be higher or lower than the true mean of the eld value and may not represent the general eld conditions. A comprehensive range of data collected by using advanced equipment as well as intelligent transportation system technologies should be obtained for the full calibration and validation of a simulation model. Although a network with simple geometric alignments was tested in this study, a more complex network should be considered to verify the performance of the proposed procedure. Doing so will provide more insight into the feasibility of the proposed procedure. Finally, with the increased trend of the use of simulation applications for large and complex network-based problems, it is necessary to calibrate not only driver behavior parameters but also dynamic origindestination demand and route choice parameters. The proposed procedure for simulation model calibration could be enhanced to include more functions and evaluated for use with these scenarios.
5.
6.
7.
8.
9.
10. 11.
12.
REFERENCES
1. Highway Capacity Manual. TRB, National Research Council, Washington, D.C., 2000. 2. White, T. General Overview of Simulation Models. 49th Annual Meeting, Southern District Institute of Transportation Engineers, Williamsburg, Va., Jan. 2002. 3. Park, B., and J. D. Schneeberger. Microscopic Simulation Model Calibration and Validation: A Case Study of VISSIM for a Coordinated Actuated Signal System. In Transportation Research Record: Journal of the Transportation Research Board, No. 1856, Transportation Research Board of the National Academies, Washington, D.C., 2003, pp. 185192. 4. Sacks, J., N. Rouphail, B. Park, and P. Thakuriah. Statistically Based Validation of Computer Simulation Models in Traffic Operations and Man13. 14. 15. 16. 17.
agement. Journal of Transportation and Statistics, Vol. 5, No. 1, 2002, pp. 124. Hellinga, B. R. Requirement for the Calibration of Traffic Simulation Models. Department of Civil Engineering, University of Waterloo. http://www.civil.uwaterloo.ca/b.PDF. Accessed March 2003. Milam, R. T. Recommended Guidelines for the Calibration and Validation of Traffic Simulation Models. Fehr & Peers Associates, Inc. http://www.fehrandpeers.com/publications/traff_simulation_m.html. Accessed March 2003. Cohen, S. L. An Approach to Calibration and Validation of Traffic Simulation Models. Presented at 83rd Annual Meeting of the Transportation Research Board, Washington, D.C., 2004. Dowling, R., A. Skabardonis, J. Halkias, G. M. Hale, and G. Zammit. Guidelines for Calibration of Microsimulation Models: Framework and Applications. Presented at 83rd Annual Meeting of the Transportation Research Board, Washington, D.C., 2004. Zhang, Y., and L. E. Owen. Systematic Validation of a Microscopic Traffic Simulation Program. Presented at 83rd Annual Meeting of the Transportation Research Board, Washington, D.C., 2004. VISSIM Version 3.7 Manual. PTV Planug Transport Verkehr AG, Innovative Transportation Concepts, Inc., Jan. 2003. Wiedemann, R. Modeling of RTI-Elements on Multi-Lane Roads. In Advanced Telematics in Road Transport, DG XIII, Commission of the European Community, Brussels, Belgium, 1991. McKay, M. D., and R. J. Beckman. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics, Vol. 21, No. 2, 1979, pp. 239245. MATLAB Users Manual. The Mathworks, Inc., Natick, Mass., 1999. Milton, J. S., and J. C. Arnold. Introduction to Probability and Statistics. McGraw-Hill, New York, 2003. SPSS for Windows Release 10.1.0. SPSS Inc., Chicago, Ill., 2001. Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1989. Beasley, D., D. R. Bull, and R. R. Martin. An Overview of Genetic Algorithms. Part I. Fundamentals. University Computing, Vol. 15, No. 2, 1993, pp. 5869.
The Traffic Flow Theory and Characteristics Committee sponsored publication of this paper.

trb05 Simcalval

Caricato da

Informazioni sul documento

Titolo originale

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

trb05 Simcalval

Caricato da

Copyright:

Formati disponibili

Development and Evaluation of a Procedure for the Calibration of Simulation Models

Byungkyu (Brian) Park and Hongtu (Maggie) Qi

by use of a case study; lastly, conclusions and recommendations are provided.

Transportation Research Record 1934

Initial Evaluation on Default Parameters Unsatisfied

Feasibility Test Passed? Yes Parameter Calibration Using Genetic Algorithm

Identication of Calibration Parameters

Experimental Design for Calibration

Identication of Calibration Parameters

Transportation Research Record 1934

Experimental Design for Calibration

70 60 Frequency 50 40 30 56.75 sec 20 10 0 25 30 35 40 45 Travel Time (sec) 50 55 60

Feasibility test results for Site 15 with VISSIM.

60 50 Travel Time (sec) 40 30 20 10 0 20 25 30 35 40 Mean Desired Speed (mph) (a) 45 50

60 50 Travel Time (sec) 40 30 20 10 0

4 5 6 Simulation Resolution (b)

* = The signicance value is less than .05; NA = not applicable.

Transportation Research Record 1934

Check Speed Within the Intersection

Check Saturation Flow

Parameter Range Modication

0.45 0.4 0.35 0.3

0.25 0.2 0.15 0.1 0.05 0 0 1 2 3 4 5 6 7 Number of Generations (a) 8 9 10

30 20 10 0 30 40 50 60 Travel Time (sec) (b) 70 80

Transportation Research Record 1934

50 Travel Time (sec)

Comparison of travel times by different parameter sets.

Potrebbero piacerti anche