Best Practice in SAS Program Validation: A Case Study
CROS NT srl
Contract Research Organisation
Clinical Data Management
Statistics
Dr. Paolo Morelli, CEO
Dr. Luca Girardello, SAS programmer

AGENDA
Introduction

Program Verification: a Business Approach


Program Verification: some case studies

FACTS about CROS NT


Headquarters in Verona (Italy)
Founded in 1993
Offices in Milan and Munich
40 employees
Data Management, Statistics, Pharmacovigilance (PhV) and
hosting services
Services to Pharma, Biotech and CROs
Cooperation with Universities of Padua, Bologna, Milan

Introduction
Topic of the presentation: how to maximize the quality of
programming while minimizing the time needed to verify programs.
In the first part of the presentation we will discuss the
business side:
What is program verification?
Why is program verification necessary?
When is program verification done?
Who performs program verification?
How does the verification process work?
In the second part of the presentation we will discuss a
case study.

What is program verification

Making certain that the program does what it is
supposed to do, and producing documented evidence
of this

Why program verification is necessary


The aim of SAS validation in the pharmaceutical research area
is that end-users produce high-quality programs that fit
the purpose for which they are designed and provide
accurate results, in a style that promotes:

Reliability
Efficiency
Portability
Flexibility
Ease of use

When is program verification done

Program verification should be performed as soon as possible after the
development of the SAS code, before the product is put into
production
Development and production environments should be clearly
defined
An audit trail of program changes should be in place as soon as
the program is released to production

Who performs program verification


The SAS programmer who creates the code should perform basic
testing and follow coding rules, such as:
Error log search
Warning evaluation
Comments on critical steps
Comments on macro usage
Details of the SAS program (date and time of creation, SAS
programmer name, datasets used, date and time of verification, name
of the second SAS programmer, etc.)

A second SAS programmer should then perform the program
verification
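
The error log search and warning evaluation listed above can be partly automated. The following is a minimal sketch, not a complete log checker: the log path is a hypothetical placeholder, and the list of suspicious NOTE messages is an assumption that each team would adapt to its own coding rules.

```sas
/* Hypothetical path: point this at the saved log of the program under review */
%let logfile = /path/to/program.log;

data log_issues;
   infile "&logfile" truncover lrecl=32767;
   input line $char256.;
   lineno = _n_;
   /* Keep ERROR and WARNING lines, plus NOTEs that often hide data problems */
   if line =: 'ERROR'
      or line =: 'WARNING'
      or index(line, 'uninitialized') > 0
      or index(line, 'more than one data set with repeats of BY values') > 0
      or index(line, 'Missing values were generated') > 0
      or index(line, 'have been converted to') > 0;
run;

/* An empty listing is the evidence that the log is clean */
proc print data=log_issues noobs;
   var lineno line;
run;
```

Saving the resulting listing alongside the program gives the second programmer a quick starting point for the review.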

How does the verification process work


Biostatistician creates specs, then submits request
(interactive process)
SAS developer produces TLGs, then submits verification request
(interactive process)
Quality Control programmer verifies results

Different Verification Procedures


SOP should define different verification procedures.
Independent programming
Reviewing results
Random review of results
Visually verify code
Some of them should be mandatory, others optional.

The document containing the programming specs (for example the
SAP) should define which approach to follow and illustrate the program
verification techniques (for example, using alternative SAS
programming procedures)
The determination of the level of validation should follow a risk-based
model. The key is to determine the effect on the process if the program
does not produce the desired result.

Error Types
Business strategy should identify common error types found in:
Statistical tables
Listings
Graphs
Data analysis files
Header section of SAS programs
Bad programming specifications

Metric reports on error types should be analyzed in order to
take corrective and preventive actions

Specific CDISC SDTM Validation specs


Metadata Level
Verifies that all required variables are present in the dataset
Reports as an error any variables in the dataset that are not
defined in the domain
Reports a warning for any expected domain variables which
are not in the dataset

Notes any permitted domain variables which are not in the
dataset
Verifies that all domain variables are of the expected data type
and proper length
Detects any domain variables which are assigned a controlled
terminology specification by the domain and do not have a
format assigned to them
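
Several of the metadata-level checks above can be sketched by joining the actual dataset attributes to a specification table. This is only an illustration under stated assumptions: WORK.DM and WORK.DM_SPEC are hypothetical names, and the layout of the specification table (NAME, TYPE, LENGTH, CORE) is invented for the example. The controlled terminology and format check is not covered here.

```sas
/* Hypothetical inputs: WORK.DM is the dataset under test; WORK.DM_SPEC holds
   one row per domain variable with NAME, TYPE (1=numeric, 2=character),
   LENGTH and CORE ('Req', 'Exp' or 'Perm'). */
proc contents data=work.dm noprint
              out=actual(keep=name type length);
run;

proc sql;
   create table metadata_check as
   select coalescec(upcase(s.name), upcase(a.name)) as varname length=32,
          case
             when s.name is missing then 'ERROR: variable not defined in the domain'
             when a.name is missing and s.core = 'Req'  then 'ERROR: required variable missing'
             when a.name is missing and s.core = 'Exp'  then 'WARNING: expected variable missing'
             when a.name is missing and s.core = 'Perm' then 'NOTE: permitted variable not present'
             when a.name is missing then 'ERROR: variable missing, core value not recognised'
             when a.type ne s.type     then 'ERROR: unexpected data type'
             when a.length ne s.length then 'ERROR: unexpected length'
             else 'OK'
          end as finding length=60
   from work.dm_spec as s
        full join actual as a
        on upcase(s.name) = upcase(a.name)
   order by varname;
quit;
```

The full join is what lets one query report both variables missing from the dataset and variables missing from the specification.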

SAS Programming Rules when validating
Emphasize well-commented programs.
Use macros so that verification programs can be reused to verify
different programs (re-usability).
Use alternative SAS programming procedures when validating.
Define a workflow to follow when errors are identified.
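
As an illustration of the re-usability rule, a dataset comparison can be wrapped in a small macro and called for every pair of production and validation datasets. This is a sketch only: the macro name `validate_ds` and its parameters are hypothetical, and the call uses the dataset names from the case study in this presentation.

```sas
/* Hypothetical reusable macro: compares a production dataset with its
   independently programmed counterpart and reports the outcome in the log. */
%macro validate_ds(base=, compare=, id=);
   %local rc;

   proc compare base=&base compare=&compare listbase listcomp;
      %if %length(&id) %then %do;
         id &id;
      %end;
   run;

   /* SYSINFO holds the PROC COMPARE return code; read it before any other step runs */
   %let rc = &sysinfo;

   %if &rc = 0 %then %put NOTE: [validate_ds] &base and &compare are identical.;
   %else %put WARNING: [validate_ds] &base and &compare differ (SYSINFO=&rc).;
%mend validate_ds;

%validate_ds(base=listing, compare=validation, id=pt)
```

A return code of 0 means PROC COMPARE found no differences of any kind; any other value points the reviewer to the procedure output for details.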

How to optimize the process

Good specs & Good standards & Good training = Good programming results

A Case Study

Example of Derived Datasets Validation (1/4)
First Programmer programs all derived datasets
Second Programmer programs all derived datasets
PROC COMPARE: compare original derived datasets
versus validation derived datasets

Example of Derived Datasets Validation (2/4)
proc compare base=listing compare=validation listbase listcomp;
   id pt;
run;
The COMPARE Procedure
Comparison of WORK.LISTING with WORK.VALIDATION
(Method=EXACT)
Observation Summary

Observation      Base  Compare  ID
First Obs           1        1  pt=121
First Unequal      79       79  pt=201
Last  Unequal      79       79  pt=201
Last  Obs          89       89  pt=212

Number of Observations in Common: 89.


Total Number of Observations Read from WORK.LISTING: 89.
Total Number of Observations Read from WORK.VALIDATION: 89.
Number of Observations with Some Compared Variables Unequal: 1.
Number of Observations with All Compared Variables Equal: 88.

Example of Derived Datasets Validation (3/4)
Values Comparison Summary
Number of Variables Compared with All Observations Equal: 3.
Number of Variables Compared with Some Observations Unequal: 1.
Total Number of Values which Compare Unequal: 1.
Maximum Difference: 1.
Variables with Unequal Values

Variable  Type  Len  Label        Ndif  MaxDif
age       NUM     8  AGE (years)     1   1.000

Value Comparison Results for Variables
_________________________________________________________
         ||  AGE (years)
         ||       Base    Compare
     pt  ||        age        age      Diff.     % Diff
_______  ||  _________  _________  _________  _________
         ||
    201  ||         41         40    -1.0000    -2.4390
_________________________________________________________

Example of Derived Datasets Validation (4/4)
The COMPARE Procedure
Comparison of WORK.LISTING with WORK.VALIDATION
(Method=EXACT)
Observation Summary

Observation      Base  Compare  ID
First Obs           1        1  pt=121
Last  Obs          89       89  pt=212

Number of Observations in Common: 89.


Total Number of Observations Read from WORK.LISTING: 89.
Total Number of Observations Read from WORK.VALIDATION: 89.
Number of Observations with Some Compared Variables Unequal: 0.
Number of Observations with All Compared Variables Equal: 89.
NOTE: No unequal values were found. All values compared are exactly equal.

Example of Tables Validation (1/3)
First Programmer programs all tables, applying the set of
layout specifications, and saves the outputs in Word
Second Programmer programs all tables without adding
extra SAS code to control the output
Comparison of outputs

Example of Tables Validation (2/3)

First Programmer Output in Word
________________________________________________________________
                    Tmt A              Tmt B
________________________________________________________________
Age (years)
   n                41                 48
   Mean (SD)        51.44 (10.39)      52.10 (11.00)
   Median           55.00              55.00
   Min - Max        30.00 - 66.00      27.00 - 71.00
Gender
   Female           14 (34.15%)        21 (43.75%)
   Male             27 (65.85%)        27 (56.25%)
________________________________________________________________

Second Programmer Output in SAS
proc means data=demog n mean stddev median min max;
   var age;
   by tmt;
run;

Example of Tables Validation (3/3)

First Programmer Output in Word
________________________________________________________________
                    Tmt A              Tmt B
________________________________________________________________
Age (years)
   n                41                 48
   Mean (SD)        51.44 (10.39)      52.10 (11.00)
   Median           55.00              55.00
   Min - Max        30.00 - 66.00      27.00 - 71.00
Gender
   Female           14 (34.15%)        21 (43.75%)
   Male             27 (65.85%)        27 (56.25%)
________________________________________________________________

Second Programmer Output in SAS
proc freq data=demog;
   tables gender*tmt;
run;

Example of Listings Validation (1/2)
First Programmer programs all listings, applying the set of
layout specifications, and saves the outputs in Word
Second Programmer prints the derived datasets in SAS
Compare listing output in Word
versus output in SAS of the derived dataset

Example of Listings Validation (2/2)

Listing Output in Word

Listing 1
Demographic Characteristics

Subject ID       Gender   Age   Race
_______________  _______  ____  _____
121              M        50    3
122              M        34    3
123              F        58    3
124              M        64    3
125              M        57    3
126              F        64    3
127              M        39    3
128              M        55    2
129              M        41    3
130              M        44    3
131              M        32    3
132              M        37    3
133              M        61    3
134              F        56    3
135              M        34    3
136              M        34    3

Print of Derived Dataset

Example of Registration Errors

Metrics on Programming Errors

Specification: 14%
   Specification not detailed: 40%
   Wrong interpretation of specification: 60%
Layout: 45%
   Output structure: 30%
   Display variables: 14%
   Output writing: 56%
Programming: 41%
   Calculation of variables: 20%
   Selection of variables: 14%
   SAS programming: 66%

Examples of Errors
Layout
Writing of a note in table
Incorrect: Percentages are calculated number of patients
Correct: Percentages are calculated on number of patients

Examples of Errors
Programming
Incorrect (patients aged exactly 20 are not assigned to any category):
data age;
   set demog;
   if age<20 then age_c=1;
   else if 20<age<40 then age_c=2;
   else if age>=40 then age_c=3;
run;
Correct:
data age;
   set demog;
   if age<20 then age_c=1;
   else if 20<=age<40 then age_c=2;
   else if age>=40 then age_c=3;
run;
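
A boundary error of this kind can be caught with a simple check on the derived dataset. The sketch below assumes the dataset and variable names from the example above (age, age_c); the check dataset `age_check` and the `issue` variable are hypothetical. It also flags a second risk: SAS treats a missing numeric value as smaller than any number, so `age<20` is true when age is missing and such records would be placed in category 1.

```sas
/* Hypothetical check: list records that the derivation left unclassified
   or classified despite a missing age */
data age_check;
   set age;
   length issue $40;
   if missing(age) and not missing(age_c) then issue = 'missing age was categorised';
   else if not missing(age) and missing(age_c) then issue = 'age not assigned to a category';
   /* Keep only the problem records */
   if not missing(issue);
run;

proc print data=age_check noobs;
   var age age_c issue;
run;
```

An empty listing indicates that every non-missing age received a category and no missing age was categorised.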

Examples of Errors
Wrong interpretation of specification
Note of a table (in SAP):
Note 1: Only patients with all values for the primary analysis are
included in the table.
In the SAS program:
All patients are included in the table.

Thank you for your attention


Questions?