SDTM (Study Data Tabulation Model) defines a standard structure for human clinical
trial (study) data tabulations and for nonclinical study data tabulations that are
to be submitted as part of a product application to a regulatory authority such as
the United States Food and Drug Administration (FDA). The Submission Data Standards
team of the Clinical Data Interchange Standards Consortium (CDISC) defines SDTM.

On July 21, 2004, SDTM was selected as the standard specification for submitting
tabulation data to the FDA for clinical trials and on July 5, 2011 for nonclinical
studies. Eventually, all data submissions will be expected to conform to this
format. As a result, clinical and nonclinical Data Managers will need to become
proficient in the SDTM to prepare submissions and apply the SDTM structures, where
appropriate, for operational data management.

Contents
1 Background
2 Datasets and domains
3 Special-purpose domains
4 The general domain classes
5 The CDISC standard domain models (SDTMIG 3.2)
6 Limitations and criticism of standards
7 References
8 See also
9 External links
Background
SDTM is built around the concept of observations collected about subjects who
participated in a clinical study. Each observation can be described by a series of
variables, corresponding to a row in a dataset or table. Each variable can be
classified according to its Role. A Role determines the type of information
conveyed by the variable about each distinct observation and how it can be used.
Variables can be classified into four major roles:

Identifier variables, which identify the study, subject of the observation, the
domain, and the sequence number of the record
Topic variables, which specify the focus of the observation (such as the name of a
lab test)
Timing variables, which describe the timing of the observation (such as start date
and end date)
Qualifier variables, which include additional illustrative text or numeric values
that describe the results or additional traits of the observation (such as units or
descriptive adjectives).

A fifth type of variable role, Rule, can express an algorithm or executable method
to define start, end, or looping conditions in the Trial Design model.

The set of Qualifier variables can be further categorized into five sub-classes:

Grouping Qualifiers are used to group together a collection of observations within
the same domain. Examples include --CAT and --SCAT.
Result Qualifiers describe the specific results associated with the topic variable
for a finding. They are the answer to the question raised by the topic variable.
Examples include --ORRES, --STRESC, and --STRESN. Many of the values in the DM
domain are also classified as Result Qualifiers.
Synonym Qualifiers specify an alternative name for a particular variable in an
observation. Examples include --MODIFY and --DECOD, which are equivalent terms for
a --TRT or --TERM topic variable, and --TEST and --LOINC, which are equivalent terms
for a --TESTCD.
Record Qualifiers define additional attributes of the observation record as a whole
(rather than describing a particular variable within a record). Examples include
--REASND, AESLIFE, and all other SAE (serious adverse event) flag variables in the
AE domain, as well as --BLFL, --POS, --LOC, --SPEC, --LOT, and --NAM.
Variable Qualifiers are used to further modify or describe a specific variable
within an observation and are only meaningful in the context of the variable they
qualify. Examples include --ORRESU, --ORNRHI, and --ORNRLO, all of which are
variable qualifiers of --ORRES, and --DOSU and --DOSFRM, both of which are variable
qualifiers of --DOSE.

For example, in the observation 'Subject 101 had mild nausea starting on Study Day
6', the Topic variable value is the term for the adverse event, 'NAUSEA'. The
Identifier variable is the subject identifier, '101'. The Timing variable is the
study day of the start of the event, which captures the information 'starting on
Study Day 6', while an example of a Record Qualifier is the severity, the value for
which is 'MILD'.
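
The nausea example can be sketched as a small Python record in which each variable carries its role. The variable names USUBJID, AETERM, AESTDY, and AESEV follow common SDTM naming for the AE domain but are assumptions here, since the text gives only the values. The `Variable` class and the `by_role` helper are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str      # assumed SDTM-style variable name (not given in the text)
    role: str      # one of the roles described above
    value: object  # the value recorded for this observation


# 'Subject 101 had mild nausea starting on Study Day 6'
observation = [
    Variable("USUBJID", "Identifier", "101"),
    Variable("AETERM", "Topic", "NAUSEA"),
    Variable("AESTDY", "Timing", 6),
    Variable("AESEV", "Record Qualifier", "MILD"),
]


def by_role(variables, role):
    """Return {name: value} for every variable that has the given role."""
    return {v.name: v.value for v in variables if v.role == role}


print(by_role(observation, "Topic"))
```

Grouping the variables by role in this way makes it easy to see that one row of a dataset is a single observation described from several angles.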

Additional Timing and Qualifier variables could be included to provide the
necessary detail to adequately describe an observation. The SDTM addition to
PROC CDISC does not convert existing SDS 2.x content to SDTM 3.x representations.

Datasets and domains


Observations are normally collected for all subjects in a series of domains. A
domain is defined as a collection of logically-related observations with a topic-
specific commonality about the subjects in the trial. The logic of the relationship
may relate to the scientific matter of the data, or to its role in the trial.

Typically, each domain is represented by a dataset, but it is possible to have
information relevant to the same topicality spread among multiple datasets. Each
dataset is distinguished by a unique, two-character DOMAIN code that should be used
consistently throughout the submission. This DOMAIN code is used in the dataset
name, the value of the DOMAIN variable within that dataset, and as a prefix for
most variable names in the dataset.
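
As a minimal sketch of how the DOMAIN code is used as a variable-name prefix, the following Python function expands the '--' placeholder that appears in generic variable names (such as --ORRES) into a domain-specific name. The function name `expand_variable` and the example domain code 'LB' are illustrative assumptions, not taken from the text.

```python
def expand_variable(generic_name: str, domain: str) -> str:
    """Replace a leading '--' placeholder with the two-character domain code."""
    if len(domain) != 2:
        raise ValueError("DOMAIN code must be exactly two characters")
    if generic_name.startswith("--"):
        return domain.upper() + generic_name[2:]
    # Names without the placeholder are returned unchanged.
    return generic_name


print(expand_variable("--ORRES", "LB"))   # LBORRES
print(expand_variable("--TESTCD", "LB"))  # LBTESTCD
```

Names that do not start with the placeholder pass through untouched, which reflects the statement that the prefix applies to most, not all, variable names.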

The dataset structure for observations is a flat file representing a table with one
or more rows and columns. Normally, one dataset is submitted for each domain. Each
row of the dataset represents a single observation and each column represents one
of the variables. Each dataset or table is accompanied by metadata definitions that
provide information about the variables used in the dataset. The metadata are
described in a data definition document named 'Define' that is submitted along with
the data to regulatory authorities.

The Submission Metadata Model defines seven distinct metadata attributes for
each dataset variable in the metadata definition document:

The Variable Name (limited to 8 characters for compatibility with the SAS System V5
Transport format)
A descriptive Variable Label, using up to 40 characters, which should be unique for
each variable in the dataset
The data Type (e.g., whether the variable value is character or numeric)
The set of controlled terminology for the value or the presentation format of the
variable (Controlled Terms or Format)
The Origin or source of each variable
The Role of the variable, which determines how the variable is used in the dataset.
Roles are used to represent the categories of variables as Identifier, Topic,
Timing, or the five types of Qualifiers. Since these roles are predefined for all
domains that follow the general classes, they do not need to be specified by
sponsors in their Define data definition document. Actual submission metadata may
use additional role designations, and more than one role may be assigned per
variable to meet different needs.
Comments or other relevant information about the variable or its data.

Data stored in dataset variables include both raw (as originally collected) and
derived values (e.g., converted into standard units, or computed on the basis of
multiple values, such as an average). In SDTM, only the name, label, and type are
listed, together with a set of CDISC guidelines that provide a general description
of each variable used by a general observation class.
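
The seven metadata attributes described above can be sketched as a small Python data class that also checks the two length limits stated in the text (8 characters for the name, 40 for the label). The class name `VariableMetadata`, the `problems` method, and the example values for type and origin ("Char", "CRF") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariableMetadata:
    name: str              # Variable Name, at most 8 characters
    label: str             # Variable Label, at most 40 characters
    type: str              # character or numeric (spelling is an assumption)
    controlled_terms: str  # Controlled Terms or Format
    origin: str            # Origin or source of the variable
    role: str              # Identifier, Topic, Timing, or a Qualifier type
    comments: str = ""     # Comments, if any

    def problems(self) -> List[str]:
        """Return the length-limit violations found (empty list if none)."""
        found = []
        if len(self.name) > 8:
            found.append(f"name '{self.name}' exceeds 8 characters")
        if len(self.label) > 40:
            found.append(f"label for '{self.name}' exceeds 40 characters")
        return found


aeterm = VariableMetadata(
    name="AETERM",
    label="Reported Term for the Adverse Event",
    type="Char",
    controlled_terms="",
    origin="CRF",
    role="Topic",
)
print(aeterm.problems())  # []
```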

Comments are included as necessary according to the needs of individual studies.


The presence of an asterisk (*) in the 'Controlled Terms or Format' column
indicates that a discrete set of values (controlled terminology) is expected to be
made available for this variable. This set of values may be sponsor-defined in
cases where standard vocabularies have not yet been defined (represented by a
single *) or from an external published source such as MedDRA (represented by
**).

Special-purpose domains
The CDISC Version 3.x Submission Data Domain Models include special-purpose domains
that have a specific structure and cannot be extended with any qualifier or
timing variables other than those specified.

Demographics includes a set of standard variables that describe each subject in a
clinical study.
Comments describes a fixed structure for recording free-text comments on a subject,
or comments related to records or groups of records in other domains.

Additional fixed-structure, non-extensible special-purpose domains are discussed in
the Trial Design model.

The general domain classes


Most observations collected during the study (other than those represented in
special-purpose domains) should be assigned to one of three general observation
classes (Interventions, Events, or Findings):

The Interventions class captures investigational treatments, therapeutic
treatments, and surgical procedures that are intentionally administered to the
subject (with some actual or expected physiological effect) either as specified by
the study protocol (e.g., 'exposure') or coincident with the study assessment period
(e.g., 'concomitant medications'), as well as other substances self-administered by
the subject (such as alcohol, tobacco, or caffeine).
The Events class captures occurrences or incidents independent of planned study
evaluations occurring during the trial (e.g., 'adverse events' or 'disposition') or
prior to the trial (e.g., 'medical history').
The Findings class captures the observations resulting from planned evaluations to
address specific questions such as observations made during a physical examination,
laboratory tests, ECG testing, and sets of individual questions listed on
questionnaires.

In most cases, the identification of the general class appropriate to a specific
collection of data by topicality is straightforward. Often the Findings general
class is the best choice for general observational data collected as measurements
or responses to questions. In cases when the topicality may not be as clear, the
choice of class may be based more on the scientific intent of the protocol or
analysis plan or the data structure.
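
The assignment of domains to classes can be sketched as a simple lookup in Python. The two-letter codes below are the conventional SDTMIG domain codes for the examples named in the text (they are not listed in this section, so treat them as an assumption), and the `classify` function is a hypothetical helper. Domains that are not in the table are deliberately reported as undecided, since the text notes that such cases depend on the scientific intent of the protocol or analysis plan.

```python
# Assumed domain codes for the examples mentioned in the text.
GENERAL_CLASS = {
    "EX": "Interventions",  # exposure
    "CM": "Interventions",  # concomitant medications
    "SU": "Interventions",  # substance use (alcohol, tobacco, caffeine)
    "AE": "Events",         # adverse events
    "DS": "Events",         # disposition
    "MH": "Events",         # medical history
    "PE": "Findings",       # physical examination
    "LB": "Findings",       # laboratory tests
    "EG": "Findings",       # ECG testing
    "QS": "Findings",       # questionnaires
}

# Demographics and Comments are special-purpose, not general-class, domains.
SPECIAL_PURPOSE = {"DM", "CO"}


def classify(domain: str) -> str:
    """Return the class name for a domain code, or an 'undecided' marker."""
    code = domain.upper()
    if code in SPECIAL_PURPOSE:
        return "Special-purpose"
    return GENERAL_CLASS.get(code, "Undecided")


for code in ("AE", "LB", "DM", "XX"):
    print(code, "->", classify(code))
```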

All datasets based on any of the general observation classes share a set of common
Identifier variables and Timing variables. Three general rules apply when
determining which variables to include in a domain:

The same set of Identifier variables applies to all domains based on the general
observation classes. An optional identifier can be used wherever appropriate.
Any valid Timing variable is permissible for use in any submission dataset (such as
to describe studies with more precise time points such as a Pharmacokinetics
trial), but it should be used consistently where applicable for all domains.
Any additional Qualifier variables from the same general class may be added to a
domain model.
