
ORIGINAL ARTICLE

A Review of Surgical Simulation With Attention to Validation Methodology
John A. Aucar, MD, FACS,* Nicholas R. Groch, DO,
Scott A. Troxel, MD,* and Steve W. Eubanks, MD*

Abstract: The use of simulation technology for teaching and evaluating surgical skills has gained considerable attention in recent
years. This is driven by interest in quality of care, concerns over
increasing operative complexity, constraints on the use of animal
models, limited available patient material, medicolegal pressures, and
fiscal mandates for cost-effective performance. Traditional mechanical models are yielding to techniques dependent on electronic technology, including virtual reality. Data to support the validity of
simulation techniques for surgical training, assessment, and certification represent only a fraction of the literature available on the
subject. Literature searches were conducted in MEDLINE and ERIC,
covering the period from 1966 to the present. The electronic and
bioengineering literature was not surveyed because it deals extensively with technology development rather than assessment of
context-specific validity. The search results and the bibliographies of
key review articles were examined to identify articles that contained
original data, measured performance between cohorts, defined performance measures, and described a standard against which performance was compared. Most of the literature pertaining to simulation
techniques for surgical training has been published within the past 5
years and consists of review, opinion, and feasibility articles. There is
an emerging body of evidence to establish the validity of simulation
techniques for assessing surgical skills. Further refinement of simulation techniques, identification of specific performance measures,
longitudinal evaluations, and comparison to practice outcomes are
still needed to establish the validity and the value of surgical simulation for teaching and assessing surgical skills prior to considering
implementation for certification purposes.
Key Words: computer simulation, surgery, education, clinical
competence, virtual reality
(Surg Laparosc Endosc Percutan Tech 2005;15:82–89)

Received for publication May 7, 2004; accepted January 28, 2005.
From the *Department of Surgery, University of Missouri at Columbia,
Columbia, MO; and Department of Otolaryngology/Facial Plastic
Surgery, Bi-County Community Hospital, Warren, MI.
Dr. Aucar made substantial contributions to the intellectual content of the
paper in conception and design, acquisition of a substantial portion of
data, analysis and interpretation of data, drafting of the manuscript, critical
revision of the manuscript for important intellectual content, and
supervision.
Dr. Groch made substantial contributions to the intellectual content of the
paper in acquisition of a substantial portion of data, drafting of the
manuscript, and analysis and interpretation of data.
Dr. Troxel made substantial contributions to the intellectual content of the
paper in analysis and interpretation of data and critical revision of the
manuscript for important intellectual content.
Dr. Eubanks made substantial contributions to the intellectual content of the
paper in critical revision of the manuscript for important intellectual
content.
Reprints: John A. Aucar, MD, FACS, Department of Surgery, MC 418,
McHaney Hall, One Hospital Drive, Columbia, MO 65212 (e-mail:
AucarJ@health.missouri.edu).
Copyright © 2005 by Lippincott Williams & Wilkins

The craft of surgery is often distinguished from other medical professions by a mystique attributable to its technical
challenges. Although there is no universal system by which
surgical skills are reliably assessed, 3 general categories have
been identified as a framework for assessing surgical quality:
(1) cognitive/clinical skills, (2) technical skills, and (3) social/interactive skills. It has been estimated that even within the
context of operative performance, a skillful operation is 75%
decision making and 25% dexterity.1 Surgical skills have traditionally been taught through an apprenticeship model, and
then subsequently through the rotating residency model transferred from Europe by William Halsted.2 The assessment of
surgical technique has been predominantly subjective, without
reliable correlation between dexterity and surgical outcomes.3
Interest in the assessment of surgical technique has been
recently amplified by the attention afforded to medical errors
by the Institute of Medicine report To Err Is Human.4 The
inadequacies of our current system of training are becoming
increasingly scrutinized and the "learning by doing" approach, based on the random opportunity of patient flow, is
recognized to produce significant variability in educational
experience.5 A number of factors make it desirable to develop
objective, reliable tools for training and assessment of surgical
skills (Table 1).6
Simulation techniques have been identified as potential
methods to reduce risks to both students and patients by
allowing training, practice and testing in a protected environment prior to real-world exposure. Traditional methods of
technical skill assessment vary in reliability and validity.
Observational methods, with or without criteria, are highly
subjective and demand the presence of a qualified instructor.
There is evidence that criteria-based performance scoring
carries a higher degree of interobserver reliability and is considered more valid than the alternative.7,8 Perceptions about
these methods are noted in Table 2.
Modern technology has provided the opportunity to shift
from mechanically based simulation models to electronically
assisted devices and even virtual reality environments as a
means for supplementing surgical experience. The need to
develop and refine simulation technology for surgical training
has been compared with the aviation industry, where simulator
use has become a routine tool for training and testing.9 The
purpose of this review is to identify the state of the art in regard
to the development and use of simulation technology for the
training and assessment of surgical technical skills. Attention
is directed at identifying the methodologies used for assessing
the validity of surgical simulators as tools for measuring the
technical skills of a surgeon.

Surg Laparosc Endosc Percutan Tech  Volume 15, Number 2, April 2005

TABLE 1. Factors Motivating the Use of Surgical Simulators6
Increasing complexity of operations
Constraints on the use of animal models
Limitations of available patient material
Medicolegal pressures; must have optimal skills
Fiscal mandates for cost-effective performance
Tabulated from text in Barnes.6

METHODS
This review was conducted by performing literature
searches in the MEDLINE (Medicine) and ERIC (Education)
databases, covering the period from 1966 to the present.
Search terms that were exploded and then combined include
"computer simulation," "surgery," "education, medical," and
"competency based education." In ERIC, additional terms
exploded included "labor force development," "technical
skills," "adult education," "competency based education,"
"education work relationship," "job training," "job skills,"
and "skill development." The bibliographies of selected
articles identified by the literature search were then reviewed
for additional sources of information. The search results were
manually filtered to identify articles that focused on the use
of simulation models to assess or teach technical skills in a
surgical context. Some articles were identified by manual
review of current surgical journals and surgical society bulletins. The literature thus identified was then separated into
review articles and articles that present original comparative
data on the use of electronic surgical simulators.
Articles were considered historical if they described general
training considerations or the use of mechanical simulation
techniques. Review articles include those limited to opinions
and editorial comments. Original data articles were categorized according to their methodological approach to assessing
the validity10 of simulation techniques for surgical skill training and assessment. The first set of comparative articles
describes the performance of experienced surgeons or a single
cohort of trainees on simulated tasks. This approach can be
viewed as supportive of internal validity since the controlled
difference between groups is the use of a simulator rather than
level of experience. The second group of articles compares the
performance of experienced surgeons with surgical trainees,
medical students, or nonsurgical personnel. These articles test
the assumption that performance on a simulator will correlate
with level of experience. This approach can be viewed as
supportive of external validity to the extent that performance
on the simulator can be generalized as a measure of proficiency, substituting for the measure of time in residency. The
third group of comparative, original data articles compares
simulator performance to performance of actual operative
tasks. This approach provides the closest approximation to
construct validity, supporting the idea that simulators can actually measure the technical skills that a surgeon requires to
operate successfully. The electronic and bioengineering literature is not included in this review since it focuses predominantly on development and feasibility of electronic
simulation rather than context-specific validation of simulation
techniques. Additionally, articles were excluded if their methodologies were vague or they inadequately described the
simulation model used. Articles were included whether the
models used were mechanical, electronic, or combined and
whether they pertained to open or laparoscopic surgery.

TABLE 2. Traditional Skill Assessment Methods

Method                            Reliability   Validity
Procedure list with logs          Limited       Poor
Direct observation                Poor          Modest
Direct observation with criteria  High          High
Animal models with criteria       High          Proportional to realism
Videotapes                        High          Proportional to realism

RESULTS
The MEDLINE search identified 134 potentially useful
articles; however, many were manually filtered out because they pertained
to patient simulation for cognitive/clinical training, telepresence,
robotics, tele-education, surgical subspecialty, endoscopy, and
nonsurgical skills training. The ERIC search identified 32 potentially useful articles; however, many of these were manually
filtered out because they pertained to cognitive skills training, nursing and
ancillary service education, and communication skills. Combined with the additional techniques described above, 16
review/historical articles were identified6–21 for inclusion here.
One article focused on a purely qualitative research approach
to assess the value of surgical simulation19 and 1 article used
a bench model and an animal model in an attempt to validate
a criteria-based scoring system.7 There were 21 articles identified with original comparative data that allow assessment of
validity.22–42 While this review cannot claim to be comprehensive, due to ambiguity of terms, indexing variability, design inconsistencies, and the continued introduction of new
literature on the subject, it was believed that these selected
articles illustrate the current issues in research methodology
for validation of surgical simulators and are appropriately
representative of the state of the art in surgical simulation
technology.

Review Articles
Among the review articles selected for examination,
relatively few focus specifically on the use of technology for
training and assessment of surgical technique. Most address
the subject within a broader context of virtual reality applications in medicine. Four articles were selected for their
historical implications because they discuss tools for technical
skill practice, without relying on advanced technology.6,8–10
Some form of mechanical simulation has been common in
surgical education for many years. Trainees traditionally tie
knots on any rigid anchor that they can find in the hopes that
they will not appear clumsy when finally allowed to tie in the
operating room. In 1981, Zikria11 endorsed the value of
a simple knot-tying board, developed in 1967, as a practice
anchor for students and residents to learn standard knot-tying
techniques. Barnes and colleagues6,12 described more elaborate mechanical models to teach clamping, tying, and suturing
techniques. They emphasize the importance of a structured
curriculum and defined performance criteria and suggest a
head-mounted videocamera as a way to monitor the technical
performance of residents during bench practice and surgical
activity.12 Some mechanical models use processed animal
parts to impart a greater degree of realism. While mechanical
models are well established, they are not considered glamorous
and have not generated much attention for the development
of objective assessment techniques. Live animal models are a
well-established method of teaching surgical techniques. The
cost and inconvenience, however, are formidable. In addition,
significant political and social barriers exist. The use of animals to gain surgical proficiency is prohibited in the United
Kingdom by the Cruelty to Animals Act of 1876.13 Torkington
et al13 provide a concise overview of the historical motivations
for developing simulation models, a description of various
types of models, and the implications that simulation models
may hold for the future.
It is expected that mechanical bench models and animal
models will continue to play a significant role in surgical
training and testing for many years. However, there is an
escalating movement toward the development and use of
virtual environment computer technology to achieve the benefits of practice, with minimal consumption of material or
animal resources. The most significant advantage of virtual-reality surgical simulators may be in their ability to monitor
performance by objective parameters. Four articles are discussed here for their descriptions of the theoretical foundations
of using virtual-reality models for surgical simulation.14–17
Dumay and Jense14 provide an overview of the technical
features of virtual environments and the potential application
those environments have for endoscopic surgery, including
laparoscopy. They discuss the roles of immersion and the
challenges of virtual environment modeling, including modeling of anatomy, physiology, pathology, and tissue deformation.
They also identify three main subsystems that characterize
virtual environment systems: (1) an actuator subsystem, (2)
a sensor subsystem, and (3) a control subsystem.14 Lange
et al15 describe several specific virtual-reality applications,
including a surgical planning system, which creates 3-dimensional models from CT and MRI data, and a limb trauma
simulator model that illustrates the potential effects of ballistic
wounds. Ota et al16 describe the application of fuzzy logic data
analysis to evaluate performance of suturing tasks in a virtual
environment, while Satava17 has focused on the potential for
virtual-reality simulation to address many of the training
problems encountered in a military environment. He also
provides an excellent update of the status of various virtual-reality simulation models that were available in 2001. Some of
these models are listed in Table 3. An important concept also
advanced by Satava and Ellis18 is that the simulation tools must
be intuitive and must accommodate the surgeon rather than
demand a long learning curve or require significant adaptation
of natural techniques. Many of the review articles encountered
compare surgical simulation to the flight simulation movement
that began over 40 years ago, pointing out that aviators
are expected to demonstrate proficiency in a simulated
environment prior to participating in commercial flying
activity.9,13,15,17,18
Most of the review articles encountered that address
advanced technologies for surgical simulation describe current
and theoretical achievements in general terms. Systematic
citation of original data to support validity is notably absent while feasibility, editorial, and theoretical comments
abound.19–21 Schijven and Jakimowicz22 support the value of
a virtual reality surgical simulator based on qualitative research in the form of a questionnaire given to expert and
referent surgeons at several international meetings. An
article by Berg et al23 cites various studies with conflicting
results regarding the value and reliability of technical skill
assessment by the use of simulators. They also review concepts in validation and propose metrics for evaluation of a
suture simulator, recognizing that no generalized standards
exist. Table 4 demonstrates a variety of parameters derived
from various articles that have been used or recommended for
simulator validation studies directed at assessing technical
performance. A significant move toward increasing the rigor of
scientific evaluation of simulation technology was made
through an international workshop aimed at standardizing the
definitions and taxonomy of term use in surgical skill
assessment.24
In the following section, original data articles are presented in the context of whether they use single cohort
comparisons, multiple cohort comparisons, or comparison
between simulated task performance and actual surgical tasks.

TABLE 3. Virtual Reality Simulation Models for Surgical Applications
Tendon transplant simulator
Cholecystectomy simulator
Graphic simulation of abdomen
Visible Human Project; anatomic modeling
Limb trauma simulator
Hysteroscopy simulator
Liver surgery simulator
Intravenous insertion simulator
Sinus endoscopy simulator
Anastomosis simulator
Anesthesia simulator
Ophthalmology simulator
Ultrasound simulator
Arthroscopy simulator
Tabulated from text in Dumay and Jense.14


TABLE 4. Parameters for Measuring Technical Performance


Time taken per task
Hand distance traveled
Instrument distance traveled
Repetition counts and economy of motion
Variation in suture entry and exit points
Symmetry of suturing or other maneuvers
Motion produced at application point
Force produced at application point

Comparative Data Articles


There were 22 articles identified that contained original
comparative data that allowed some assessment of the validity
of surgical simulation techniques.8,25–45 It is notable that the
definition of various types of validity is not constant throughout the surgical literature. Most discussions about validity
stem from the social and behavioral sciences, where direct
objective measurement is not feasible and subtypes of validity
are recognized to overlap.10 Of the 22 articles cited, 9 of them
examine a single cohort of subjects, such that some element of
internal validity could be ascribed.25–33 Nine articles compared
performance on simulators between at least 2 groups with
different degrees of surgical experience, such that some measure of external validity could be assessed.8,34–41 Articles comparing performance on simulators to that in an actual operating
environment are considered to assess construct validity, recognizing that the inference to be drawn applies only to technical skills, rather than the comprehensive skills of a surgeon.
Only 6 such articles were encountered.26,28,42,43–45 Two of these
overlap with assessment of internal validity.27,28 The number
of cohorts, number of subjects per cohort and assessment
techniques, and validity implications for these 22 articles are
summarized in Table 5.

Articles Comparing Performance of Single Cohorts
Of the 9 articles with single cohorts, 2 of them focused
on basic knot-tying skills25,26 and 2 articles compared performance on a live animal surgical task, providing some element
of construct validity.27,28 Tytherleigh et al25 attempted to
develop a structured scoring system by which 6 trained observers could score the performance of surgical trainees in
a basic open surgical knot-tying task. Although there was good
correlation between scores assessed between observers, there
was some interobserver variability and there was significant
variation in performance between trainees. This illustrates the
inherent need to evaluate interrater reliability in evaluating an
objective assessment method, which requires meticulous attention to study design and execution detail. Pearson et al26
examined the performance of 43 graduate students without
prior laparoscopic experience on a simulated intracorporeal
knot-tying task and then examined the effect of 4 training
methods on subsequent improvement. One method was the
Minimally Invasive Surgical Trainer-Virtual Reality (MIST-VR) device. The MIST-VR provides a crude visual representation of a laparoscopic field on a desktop computer screen and
uses laparoscopic instrument handles with electronic sensors
as the interface device. One of the 5 groups received no
training between reassessments. There was improvement in
performance with each of the practice methods, using time to
completion as the outcome measure. There are 2 articles by
Torkington et al29,30 that attempt to validate the MIST-VR
device as a training tool. In the first case, 30 medical students
were assessed and then received training on the MIST-VR device, training by standard methods, or no training.
Hand movements, distance, speed, and time were used as the
outcome variables. Both methods of training showed advantage over no training, but there was no significant difference in
performance following the 2 training methods.29 The same
group tested 13 surgical trainees who either trained with the
MIST-VR or did not train and was able to show that there is
a learning effect associated with the use of the simulator
compared with no training.30 Jordan et al31 tested 32 participants with no laparoscopic experience on a simulated cutting
task under normal laparoscopic visualization, before and after
training with the MIST-VR, a video-box simulator, or no
training. Once again, training showed statistically significant
benefit over no training, but there was a trend for a greater
degree of improvement using the MIST-VR. Strom et al32
support the use of simulators as a pedagogical tool for medical
students based on observed improvement during practice on 2
distinct simulators, with and without haptic feedback. An
article by Gallagher et al33 measured the performance of 210
surgeons who had performed at least 50 laparoscopic cholecystectomies, on both the MIST-VR simulator and a mechanical box trainer. They found that 12% of those tested had
performance scores more than 2 standard deviations from the
mean. Because the cohort was identified by self-reporting on
a questionnaire and because no reference measurement was
available on their actual laparoscopic performance, construct
validity could not be ascribed. However, the study could be
interpreted to reflect significant variation in the measurable
skills of even experienced surgeons or to reflect a limitation in
the ability of the simulators to accurately measure the skills of
experienced surgeons.
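
The observation that a fraction of an experienced cohort scored more than 2 standard deviations from the mean can be illustrated with a short calculation. The following is a minimal sketch, not the analysis used in the cited study: the function name `flag_outliers`, the threshold parameter `k`, and the scores are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def flag_outliers(scores, k=2.0):
    """Return indices of scores lying more than k sample SDs from the mean."""
    m = mean(scores)
    sd = stdev(scores)  # sample standard deviation (n - 1 denominator)
    return [i for i, s in enumerate(scores) if abs(s - m) > k * sd]


# Hypothetical simulator scores for ten surgeons (not data from any cited study).
scores = [50, 52, 48, 51, 49, 50, 53, 47, 50, 90]
flagged = flag_outliers(scores)
fraction = len(flagged) / len(scores)
print(f"flagged indices: {flagged}, fraction of cohort: {fraction:.0%}")
```

A flag of this kind cannot, by itself, separate a true skill deficit from a measurement limitation of the simulator, which is the ambiguity noted above.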

Articles Comparing Multiple Cohorts


Articles comparing performance between distinct cohorts may provide some support for external validity if they
show that performance measured among those known to have
better technical dexterity is better than that of those who do not. This
presumes that experienced surgeons have a greater degree of
surgical dexterity than inexperienced surgeons or others. The
results of those studies are somewhat mixed. A single article
was encountered that used a motion tracking device to assess
suturing technique applied in a bench model. Performance was
compared between basic surgical trainees, junior specialist
registrars, senior specialist registrars, and experienced consultants.34 These authors provided statistical support that the
motion tracking technique represents an effective, objective
measure of technical performance. At least 3 articles provide
supportive data for external validity in the context of laparoscopic surgery.35–37 The performance of laparoscopically experienced surgeons was compared with that of surgeons with
little laparoscopic experience by McNatt and Smith.35 Surgeons with greater laparoscopic experience completed several
specific tasks on the MIST-VR simulator in less time and
with fewer performance errors than less experienced surgeons.

TABLE 5. Validity Design for Articles With Original Comparative Data

Ref.                           Cohorts  No. of Subjects  Training or Assessment Methods                                             Validity Design
Tytherleigh et al,25 2001      1        31               Open knot tying                                                            Internal
Pearson et al,26 2002          1        43               Unstructured training; Rosser drills; MIST-VR; Self practice; No practice  Internal
Ahlberg et al,27 2002                   29               MIST-VR; Simulated Lap Appy in live pigs                                   Internal; Construct
Grantcharov et al,28 2001               17               MIST-VR; Lap Chole in live pigs                                            Internal; Construct
Torkington et al,29 2001                30               MIST-VR; Standard training; No training                                    Internal
Torkington et al,30 2001                13               MIST-VR; No training                                                       Internal
Jordan et al,31 2001                    32               MIST-VR; Video-box simulator; No training                                  Internal
Strom et al,32 2004                     15               Procedicus KSA (haptic); MIST-VR                                           Internal
Gallagher et al,33 2003                 210              MIST-VR; Mechanical trainer                                                Internal
Datta et al,34 2001            4        12/13/13/13      Motion tracking                                                            External
McNatt & Smith,35 2001         2        5/3              MIST-VR                                                                    External
O'Toole et al,36 1999          2        8/12             Vascular anastomosis simulator                                             External
Chaudhry et al,37 1999         3        11/18/7          MIST-VR                                                                    External
Schijven & Jakimowicz,38 2003  2        37/37            Xitact LS 500                                                              External
Gallagher et al,39 2001        3        12/12/12         MIST-VR                                                                    External
Paisley et al,40 2001          3        36/37/16         MIST-VR                                                                    External
Fraser et al,41 2003           2        83/82            MISTELS                                                                    External
Fried et al,8 2004             3        82/66/67         MISTELS                                                                    External
Richards et al,42 2000         2        5/7              Force/torque sensor; Lap Chole in pigs; Lap Nissen in pigs                 Construct
Rosen et al,43 2001                     5/5              Force/torque sensor; Lap Chole in pigs                                     Construct
Scott et al,44 2000                     9/13             Video/bench model; Taped Lap Chole                                         Construct
Fried et al,45 1999            1        12               MISTELS and porcine model                                                  Construct
Seymour et al,46 2002          1        16               MIST-VR; Taped Lap Chole                                                   Construct

MIST-VR, Minimally Invasive Surgical Trainer-Virtual Reality; MISTELS, McGill Inanimate System for Training and Evaluation of Laparoscopic
Skills; Lap Chole, laparoscopic cholecystectomy; Lap Appy, laparoscopic appendectomy; Lap Nissen, laparoscopic Nissen fundoplication.

O'Toole et al36 compared performance between 8 vascular
surgeons and 12 medical students using a vascular anastomosis simulator (Boston Dynamics, Inc). The experienced
surgeons showed significantly better performance than the
medical students on specific suturing tasks. The surgeons'
performance improved measurably during sequential sessions,
but the medical students' performance was noted to increase to
a greater degree. This suggests that practice on the simulator
will improve performance on the simulator but does not
support that such practice will improve surgical performance,
unless construct validity can be established for the particular
simulation model. Chaudhry et al37 compared performance of
11 surgeons with 18 medical students and 7 nonsurgical personnel on 6 specific tasks using the MIST-VR simulator. There
was an evident performance advantage for surgeons; however,
the extent of difference and degree of variability was different
between the various tasks. The authors conclude that the
simulator is able to distinguish performance between surgeons
and nonsurgeons, but its value may be variable according to
the specific task. Schijven and Jakimowicz38 compared
performance between 37 surgeons who had performed over
100 laparoscopic cholecystectomies and 37 surgeons with no
laparoscopic experience on the Xitact LS 500 simulator. They
found the expected difference in performance and both groups
showed improvement in scores over time. Although the article
is entitled "Construct Validity," it is thought to represent
external validity in the context of this discussion. Gallagher
et al39 measured the psychomotor skills of 36 surgeons using
the MIST-VR. They were stratified into groups of 12, by
whether they had performed more than 50, between 10 and
50, or fewer than 10 minimally invasive procedures. The more
experienced surgeons performed significantly better in regard
to speed, error rate, and economy of motion. Paisley et al,40 on
the other hand, used the MIST-VR simulator to compare task
performance between 36 basic surgical trainees, 37 surgically
naive individuals, and 16 experienced surgeons. They found
only a weak correlation between simulator performance and
level of experience. There was also limited correlation with
performance among trainees who trained with the simulator,
compared with performance assessed by observation in the
operating theater. They point out that further work is needed
to increase the reliability and validity of current surgical
simulators. Fraser et al41 attempted to set a pass/fail score by
comparing performance on a physical model with video
imaging using the McGill Inanimate System for Training and
Evaluation of Laparoscopic Skills (MISTELS) set of tasks
between "competent" subjects (experienced surgeons, fellows,
and chief residents) and "noncompetent" subjects (medical
students and residents within their first 2 years). While they
noted a significant difference in scores, the best cutoff provided only a sensitivity of 82% and specificity of 82% for
being able to identify the "competent" subjects. The same
group has demonstrated a significant performance difference
on the MISTELS task simulator, based on level of experience,
in a multi-institutional sample of surgeons and trainees.8
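
The pass/fail cutoff problem described above can be made concrete with a small worked example. This is a sketch under stated assumptions rather than the method of the cited study: the scores are hypothetical, the function names `sens_spec` and `best_cutoff` are illustrative, a subject is assumed to "pass" when the score is at or above the cutoff, and Youden's J statistic (sensitivity + specificity - 1) is assumed as the criterion for the "best" cutoff.

```python
def sens_spec(competent, noncompetent, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called competent."""
    sens = sum(s >= cutoff for s in competent) / len(competent)
    spec = sum(s < cutoff for s in noncompetent) / len(noncompetent)
    return sens, spec


def best_cutoff(competent, noncompetent):
    """Observed score that maximizes Youden's J (an assumed criterion)."""
    candidates = sorted(set(competent) | set(noncompetent))

    def youden(cutoff):
        sens, spec = sens_spec(competent, noncompetent, cutoff)
        return sens + spec - 1

    return max(candidates, key=youden)


# Hypothetical simulator scores (not data from any cited study).
competent = [80, 85, 90, 70, 60]
noncompetent = [50, 55, 65, 40, 75]

sens, spec = sens_spec(competent, noncompetent, cutoff=68)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
print("cutoff maximizing Youden's J:", best_cutoff(competent, noncompetent))
```

When the two score distributions overlap, as they do here, no single cutoff classifies every subject correctly, which is why a threshold with sensitivity and specificity in the low 80% range leaves a meaningful fraction of subjects misclassified.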

Articles Comparing Simulator Performance to Live Animal or Human Surgical Tasks
There are limited data supporting the construct validity
of simulator training. No studies were encountered comparing
simulator performance with the technical dexterity of fully
trained surgeons performing real operations. There are, however, 7 studies providing some suggestion that virtual-reality
and bench simulators may have value for surgical training and
assessment.27,28,42–46 Two articles by the same group describe
the use of force/torque measurements at the hand-tool interface
to compare performance between experienced and inexperienced operators during live animal surgery.42,43 They found
that the technique could consistently distinguish performance
based on underlying experience. Grantcharov et al28 studied 14
surgical residents as they performed 6 tasks on the MIST-VR
simulator and then performed laparoscopic cholecystectomy
on living pigs. Performance errors and economy of motion
scores were assessed. Correlation was noted between error
scores in vivo and 3 of the 6 simulator tasks. Economy of
motion scores correlated with 5 of 6 tasks for the right hand,
but only 1 of 6 for the left hand. Ahlberg et al27 randomized 29
medical students to either train with the MIST-VR or not and
then had them perform a simulated laparoscopic appendectomy on live pigs by preparing divided loops of small intestine
to represent the appendix. The operations were videotaped and
examined by independent observers. Interestingly, there was
no significant difference in operative performance between the
2 groups. However, among those who used the MIST-VR,
operative performance correlated with performance on the
simulator. This suggests that while the MIST-VR did not
improve performance of the subjects, it may be able to predict
surgical performance and theoretically successful outcomes.
Scott et al44 randomized second- and third-year residents to
train or not train on a video/bench model and then assessed
performance on 5 simulated tasks with specific criteria. Additionally, pre- and posttraining assessment using 8 specific
criteria was made through review of taped laparoscopic cholecystectomy by 3 observers blinded to training status. There
were no differences in baseline scores. However, there were
significant differences in posttraining scores in 5 of 5 simulated tasks and 4 of 8 operative criteria, favoring the residents
who used the training system. In 1999, Fried et al45 used
a methodology of baseline testing on an inanimate model
(MISTELS) and a porcine model, followed by randomization
to practice or not practice on the bench model, followed by
repeat testing on both models, for third-year surgical residents.
They reported a significantly improved assessment score among
those who practiced. Seymour et al46 prospectively randomized 16 surgical residents to either train or not train on the MIST-VR surgical
simulator and used a blinded, multiobserver technique to
review performance on videotaped laparoscopic cholecystectomy. Simulator-trained residents performed 29% faster and
were 6 times less likely to commit defined errors than the untrained cohort. This provides the strongest direct evidence
supporting the value of a training simulator noted so far.
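
Several of the studies above report correlation between simulator scores and in vivo scores. A minimal Python sketch of one way such an association could be quantified, using a rank (Spearman) correlation, follows. The choice of Spearman correlation, the data, and all function names are assumptions made for illustration; the cited studies may have used other statistics.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical error counts for six residents, invented for illustration.
simulator_errors = [12, 9, 15, 7, 11, 20]
in_vivo_errors = [5, 4, 8, 2, 6, 7]
print(f"rank correlation: {spearman(simulator_errors, in_vivo_errors):.2f}")
```

A rank-based statistic is used in this sketch because error counts and rating scores are often ordinal or skewed; with samples as small as those in the studies above, any such coefficient carries wide uncertainty.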

DISCUSSION
Surgical simulators are undergoing a transition in status
from curious novelty, predominantly aimed at attracting attention at surgical equipment exhibits, to sophisticated instruments for training and assessing surgical skills. The assumption
that an individual's performance on a simulator reflects his or
her ability to perform real surgical tasks must be carefully
examined. Poor or marginal performance on a surgical simulator may reflect either a deficiency in technical skill
or a lack of the simulator's validity in measuring true surgical
agility. The combined data reviewed above provide supportive
evidence of the ability of surgical simulators to serve as an
objective measure of technical skill level among surgeons. The
sophistication of simulator technology, simulation
techniques, and assessment methodologies is continually
improving. However, limited data and several inconsistencies
remain that should temper the acceptance of current simulators
as inherently valid for certification of surgeons. There is no
single methodology that can optimally establish validity because there are variations in context. Additional studies are
needed that correlate the observed clinical performance of
capable and experienced surgeons, using defined criteria, with
their measured performance on simulators. There remain
numerous pitfalls in the methodological design of studies
aimed at establishing construct validity for surgical simulators.
For instance, if a trainee's performance is measured
longitudinally on a simulation device and in the operating
room, and improved performance is observed in both contexts,
it might be assumed that the 2 activities relate to one another.
However, the same correlation in longitudinal improvement
might be predicted if the simulator practice was substituted
with an alternative unrelated activity, such as juggling or
sports. The probability that an individual might perform well
in one activity and poorly in another must be driven quite low
for construct validity to be established. The ability to
consistently distinguish the performance of experienced and
novice operators through blinded and contemporaneous
measure of simulated and clinical performance with limited
premeasurement practice will eventually support the validity
of surgical simulators. The strength of the association between
simulator scoring and clinical performance will likely increase
as simulation devices mature and experience in their use
expands. There is little potential danger in continuing to
explore the role of surgical simulators for training and
assessment unless a great weight is placed on their validity
prior to maturation of the technology and techniques.
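
The pitfall of longitudinal correlation described above can be made concrete with a small simulation. The following Python sketch generates two series that improve over time but are otherwise independent, standing in for simulator scores and an unrelated activity such as juggling. All numbers, names, and the use of first differences to remove the shared trend are assumptions chosen for illustration, not a method taken from the reviewed studies.

```python
import random


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_trainee(sessions=200, seed=0):
    """Two hypothetical score series sharing only a steady upward trend."""
    rng = random.Random(seed)
    simulator = [t + rng.gauss(0, 2) for t in range(sessions)]
    juggling = [t + rng.gauss(0, 2) for t in range(sessions)]
    return simulator, juggling


def first_differences(series):
    """Session-to-session change, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


simulator, juggling = simulate_trainee()
raw_r = pearson(simulator, juggling)
detrended_r = pearson(first_differences(simulator),
                      first_differences(juggling))
print(f"correlation of raw scores:        {raw_r:.2f}")
print(f"correlation after removing trend: {detrended_r:.2f}")
```

In this sketch the raw scores correlate strongly only because both series rise with time; once the common trend is removed, little association remains. This is the reason a shared improvement over time is weak evidence that simulator performance and operative performance measure the same underlying skill.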
There are several articles in the literature to suggest that
surgical simulators are already being used as surrogate measures of surgical performance.47–49 Shah et al47 found a
correlation between performance errors on the MIST-VR
simulator and certain visual characteristics among medical
students by measuring tonic accommodation. It is suggested
that tonic accommodation may predict aptitude for laparoscopic abilities and could serve as a screening measure. There
are 2 articles that examine the effect of sleep deprivation on
surgical dexterity using a laparoscopic simulator.48,49 In 1998,
the performance of 6 surgical trainees was repeatedly measured in the evening and the next morning on the MIST-VR
in a crossover study design.48 Significant deterioration in
performance was noted, measured by the number of errors and
the time for task completion for each of the subjects under 3
conditions: a normal night's sleep, sham call with scheduled
disturbances, and a night of no sleep. In a similar study in
2001, Grantcharov et al49 used the 6 standard tasks of the
MIST-VR to compare the performance of 14 surgical trainees
under normal circumstances with performance after a night on
call and less than 3 hours of sleep. Performance was significantly different for 5 of the 6 tasks, with time, errors, and
unnecessary movements consistently increased after sleep
deprivation. The American College of Surgeons has already
identified the potential for simulation techniques to make
significant impacts on patient safety through their ability to
permit learning in a risk-free environment, refresh techniques
for surgeons returning to practice after an extended absence,
correct case-mix inequalities during training, and allow
prototyping of new procedures and testing of new devices
in a simulated environment.50 Additionally, simulation technologies may be advocated as a tool for identifying technical
aptitude among potential training candidates, and a tool for
measuring technical competency for certification and recertification purposes.50 The last 2 issues will undoubtedly elicit
controversy and underscore the importance of developing
reliable and reproducible techniques for the validation of surgical simulation technology.

CONCLUSIONS
Validity is not an inherent characteristic of a system but
depends on the purpose of measurement and the interpretation
of results. The surgical literature is replete with editorial,
concept, and feasibility articles describing the potential of
surgical simulators. Relatively few data have been obtained so
far that examine the validity of simulators for the training and
assessment of surgical skills. There is a clear trend toward the
establishment of internal, external, and construct validity of
electronic simulators, although the validation of mechanical
and animal models was never systematically pursued. A number of issues will become increasingly important for defining
the role of simulation technologies in surgical training and
practice. These include the refinement of simulation technology, identification of the appropriate context for their use,
reduction of costs to increase availability, identification of
appropriate metrics, and scientific validation of the techniques
for both teaching and competency assessment.
REFERENCES
1. Spencer F. Teaching and measuring surgical techniques: the technical
evaluation of competence. Bull Am Coll Surg. 1978;63:9–12.
2. Halsted W. The training of the surgeon. Bull Johns Hopkins Hosp. 1904;
15:267–276.
3. Darzi A, Smith S, Taffinder N. Assessing operative skill: needs to be more
objective. BMJ. 1999;318:887–888.
4. Kohn LT, Corrigan JM, Donaldson MS (Committee on Quality of Health
Care in America, Institute of Medicine), eds. To Err Is Human: Building
a Safer Health System. Washington, DC: National Academy Press; 2000.
5. Meier A, Rawn C, Krummel T. Virtual reality surgical application
challenge for the new millennium. J Am Coll Surg. 2001;192:372–384.
6. Barnes R. Surgical handicraft: teaching and learning surgical skills. Am
J Surg. 1987;153:422–427.
7. Martin JA, Regehr G, Reznick R, et al. Objective structured assessment
of technical skill (OSATS) for surgical residents. Br J Surg. 1997;84:
273–278.
8. Fried GM, Feldman LS, Vassiliou MC, et al. Proving the value of
simulation in laparoscopic surgery. Ann Surg. 2004;240:518–528.
9. Satava R. Virtual reality surgical simulator: the first steps. Surg Endosc.
1993;7:203–205.
10. Trochim W. The Research Methods Knowledge Base, 2nd ed. Cincinnati:
Atomic Dog Publishing, 2000.
(http://www.socialresearchmethods.net/kb/contents.htm [version current as of August 2004]).
11. Zikria BA. Exercise and drill boards for surgical training. 10 year
experience with a knot-tying board. Am J Surg. 1981;141:612–613.
12. Barnes RW, Lang NP, Whiteside MF. Halstedian technique revisited.
Innovations in teaching surgical skills. Ann Surg. 1989;210:118–121.
13. Torkington J, Smith SG, Rees BI, et al. The role of simulation in surgical
training. Ann R Coll Surg Engl. 2000;82:88–94.
14. Dumay AC, Jense GJ. Institution TNO Physics and Electronics
Laboratory, The Hague, The Netherlands. Endoscopic surgery simulation
in a virtual environment. Comput Biol Med. 1995;25:139–148.
15. Lange T, Indelicato DJ, Rosen JM. Virtual reality in surgical training. Surg
Oncol Clin North Am. 2000;9:61–79.
16. Ota D, Loftin B, Saito T, et al. Virtual reality in surgical education.
Comput Biol Med. 1995;25:127–137.
17. Satava RM. Accomplishments and challenges of surgical simulation. Surg
Endosc. 2001;15:232–241.
18. Satava RM, Ellis SR. Human interface technology. An essential tool for
the modern surgeon. Surg Endosc. 1994;8:817–820.
19. Smith SG, Torkington J, Darzi A. Objective assessment of surgical
dexterity using simulators. Hosp Med. 1999;60:672–675.

© 2005 Lippincott Williams & Wilkins

Surg Laparosc Endosc Percutan Tech  Volume 15, Number 2, April 2005

20. Issenberg SB, McGaghie WC, Hart IR. Simulation technology for health
care professional skills training and assessment. JAMA. 1999;282:
861–866.
21. Vanchieri C. Virtual reality: will practice make perfect? J Natl Cancer
Inst. 1999;91:207–209.
22. Schijven M, Jakimowicz J. Face-, expert, and referent validity of the Xitact
LS500 laparoscopy simulator. Surg Endosc. 2002;16:1764–1770.
23. Berg D, Berkley J, Weghorst S, et al. Issues in validation of a dermatologic
surgery simulator. Studies in Health Technology and Informatics. 2001;
81:60–65.
24. Satava RM, Cuschieri A, Hamdorf J. Metrics for Objective Assessment of
Surgical Skills Workshop. Metrics for objective assessment. Surg Endosc.
2003;17:220–226.
25. Tytherleigh MG, Bhatti TS, Watkins RM, et al. The assessment of surgical
skills and a simple knot-tying exercise. Ann R Coll Surg Engl. 2001;83:
69–73.
26. Pearson AM, Gallagher AG, Rosser JC, et al. Evaluation of structured and
quantitative training methods for teaching intracorporeal knot tying. Surg
Endosc. 2002;16:130–137.
27. Ahlberg G, Heikkinen T, Iselius L, et al. Does training in a virtual
reality simulator improve surgical performance? Surg Endosc. 2002;16:
126–129.
28. Grantcharov TP, Rosenberg J, Pahle E, et al. Virtual reality computer
simulation. Surg Endosc. 2001;15:242–244.
29. Torkington J, Smith SG, Rees BI, et al. Skill transfer from virtual reality to
a real laparoscopic task. Surg Endosc. 2001;15:1076–1079.
30. Torkington J, Smith SG, Rees B, et al. The role of the basic surgical skills
course in the acquisition and retention of laparoscopic skill. Surg Endosc.
2001;15:1071–1075.
31. Jordan JA, Gallagher AG, McGuigan J, et al. Virtual reality training leads
to faster adaptation to the novel psychomotor restrictions encountered by
laparoscopic surgeons. Surg Endosc. 2001;15:1080–1084.
32. Strom P, Kjellin A, Hedman L, et al. Validation and learning in the
Procedicus KSA virtual reality surgical simulator. Surg Endosc. 2003;17:
227–231.
33. Gallagher AG, Smith CD, Bowers SP, et al. Psychomotor skills assessment
in practicing surgeons experienced in performing advanced laparoscopic
procedures. J Am Coll Surg. 2003;197:479–488.
34. Datta V, Mackay S, Mandalia M, et al. The use of electromagnetic motion
tracking analysis to objectively measure open surgical skill in the
laboratory-based model. J Am Coll Surg. 2001;193:479–485.
35. McNatt SS, Smith CD. A computer-based laparoscopic skills assessment
device differentiates experienced from novice laparoscopic surgeons. Surg
Endosc. 2001;15:1085–1089.
36. O'Toole RV, Playter RR, Krummel TM, et al. Measuring and developing
suturing technique with a virtual reality surgical simulator. J Am Coll
Surg. 1999;189:114–127.
37. Chaudhry A, Sutton C, Wood J, et al. Learning rate for laparoscopic
surgical skills on MIST VR, a virtual reality simulator: quality of
human-computer interface. Ann R Coll Surg Engl. 1999;81:281–286.
38. Schijven M, Jakimowicz J. Construct validity. Surg Endosc. 2003;17:
803–810.
39. Gallagher AG, Richie K, McClure N, et al. Objective psychomotor skills
assessment of experienced, junior, and novice laparoscopists with virtual
reality. World J Surg. 2001;25:1478–1483.
40. Paisley AM, Baldwin PJ, Paterson-Brown S. Validity of surgical simulation
for the assessment of operative skill. Br J Surg. 2001;88:1525–1532.
41. Fraser SA, Klassen DR, Feldman LS, et al. Evaluating laparoscopic skills.
Surg Endosc. 2003;17:964–967.
42. Richards C, Rosen J, Hannaford B, et al. Skills evaluation in minimally
invasive surgery using force/torque signatures. Surg Endosc. 2000;14:791–798.
43. Rosen J, Hannaford B, Richards CG, et al. Markov modeling of minimally
invasive surgery based on tool/tissue and force/torque signatures for
evaluating surgical skills. IEEE Trans Biomed Eng. 2001;48:579–591.
44. Scott DJ, Bergen PC, Rege RV, et al. Laparoscopic training on bench
models: better and more cost effective than operating room experience?
J Am Coll Surg. 2000;191:272–283.
45. Fried GM, Derossis AM, Bothwell J, et al. Comparison of laparoscopic
performance in vivo with performance measured in a laparoscopic
simulator. Surg Endosc. 1999;13:1077–1081; discussion 1082.
46. Seymour NE, Gallagher AG, Roman SA, et al. Virtual reality training
improves operating room performance: results of a randomized,
double-blinded study. Ann Surg. 2002;236:458–463; discussion 463–464.
47. Shah J, Paul I, Buckley D, et al. Can tonic accommodation predict surgical
performance? Surg Endosc. 2003;17:787–790.
48. Taffinder NJ, McManus IC, Gul Y, et al. Effect of sleep deprivation on
surgeons' dexterity on laparoscopy simulator. Lancet. 1998;352:1191.
49. Grantcharov TP, Bardram L, Funch-Jensen P, et al. Laparoscopic
performance after one night on call in a surgical department: prospective
study. BMJ. 2001;323:1222–1223.
50. Dawson SL. A critical approach to medical simulation. Bull Am Coll Surg.
2002;87:12–18.
