
FDA QUALITY METRICS RESEARCH

FINAL REPORT

JULY 2017

Prof. Thomas Friedli, Stephan Koehler, Paul Buess
University of St.Gallen

Prabir Basu, PhD
OPEX and cGMP Consultant

Nuala Calnan, PhD
Regulatory Science Adjunct Research Fellow at DIT

TABLE OF CONTENTS



1 EXECUTIVE SUMMARY...............................................................................................................................................8
2 BACKGROUND ................................................................................................................................................................10
3 PROJECT DESIGN .......................................................................................................................................................... 12
3.1 Research Objective....................................................................................................................................................13
3.2 St.Gallen OPEX Benchmarking and Database .....................................................................................................13

4 THE PHARMACEUTICAL PRODUCTION SYSTEM MODEL (PPSM) HOUSE ................................. 15


4.1 Overview.................................................................................................................................................................... 16
4.2 Cultural Excellence .................................................................................................................................................. 18
4.3 CAPA Effectiveness .................................................................................................................................................. 21
4.4 Operational Stability (OS)....................................................................................................................................... 21
4.5 Lab Quality & Robustness (LQR) ........................................................................................................................... 21
4.6 PQS Effectiveness ..................................................................................................................................................... 23
4.7 PQS Efficiency .......................................................................................................................................................... 23
4.8 PQS Excellence ......................................................................................................................................................... 23

5 ANALYSIS APPROACH ................................................................................................................................................24


5.1 General Approach .................................................................................................................................................... 25
5.2 Analysis and Statistical Tools ................................................................................................................................. 25

6 FINDINGS..........................................................................................................................................................................26
6.1 Summary ...................................................................................................................................................................27
6.2 Analysis: PQS Effectiveness ....................................................................................................................................29
6.2.1 Service Level Delivery (OTIF) as Surrogate for PQS Effectiveness ..................................................................29
6.2.2 Inventory - Stability Matrix (ISM) .........................................................................................................................29
6.2.3 Moderating Effects ................................................................................................................................................... 33
6.2.4 Impact of C-categories on PQS Effectiveness...................................................................................................... 35
6.2.5 Impact of Performance Metrics on PQS Effectiveness ...................................................................................... 37
6.2.6 CAPA Effectiveness and PQS Effectiveness ......................................................................................................... 37
6.3 Analysis: PQS Effectiveness and Efficiency .......................................................................................................... 37
6.3.1 Linkage between PQS Effectiveness and Efficiency ........................................................................................... 37
6.3.2 Linkage between PQS Effectiveness and Efficiency with peer-group split.....................................................39
6.4 Analysis: Customer Complaint Rate .....................................................................................................................39
6.4.1 Linkage between Customer Complaint Rate and PQS Effectiveness..............................................................39
6.4.2 Linkage between Customer Complaint Rate and PQS Effectiveness for DS/DP Split ................................. 41
6.4.3 Linkage between Customer Complaint Rate and Rejected Batches
moderated by Operational Stability ...................................................................................................................... 41
6.5 Analysis: Cultural Excellence .................................................................................................................................43
6.5.1 Linkage between Quality Maturity and Quality Behavior ................................................................................43
6.5.2 Top-10 Quality Maturity Attributes that drive Quality Behavior ....................................................................43
6.5.3 Cultural Excellence as the foundation for PQS Effectiveness ..........................................................................45
6.5.4 Linkage between St.Gallen OPEX Enablers and Operational Stability .......................................................... 46
6.6 Limitations of the Data Analysis .......................................................................................................................... 46



7 IMPLICATION FOR FDA QUALITY METRICS INITIATIVE .........................................................................47
8 CONCLUSION AND OUTLOOK ................................................................................................................................50
9 REFERENCES .................................................................................................................................................................. 52
APPENDIX .........................................................................................................................................................................54
Appendix 1: Dissemination..................................................................................................................................... 55
Appendix 1.1: Presentations at Conferences/Workshops .................................................................................. 55
Appendix 1.2: Publications ...................................................................................................................................... 55
Appendix 1.3: Further Industry Interaction ......................................................................................................... 55
Appendix 2: Questions and Definitions from St.Gallen OPEX Report ...........................................56
Appendix 2.1: Questions and Definitions from St.Gallen OPEX Report – Structural Factors ....................56
Appendix 2.2: Questions and Definitions from St.Gallen OPEX Report – Cost and Headcount .............. 60
Appendix 2.3: Questions and Definitions from St.Gallen OPEX Report – Enabler....................................... 61
Appendix 2.4: Questions and Definitions from St.Gallen OPEX Report – Performance Metrics ............. 66
Appendix 3: SPSS Output – MLR - Impact of C-Categories on PQS Effectiveness ...................................... 68
Appendix 4: SPSS Output – MLR - Impact of Performance Metrics on PQS Effectiveness ....................... 70
Appendix 5: Correlation table Compliance Metrics and Performance Metrics ............................................. 73
Appendix 6: Cultural Excellence Subelements.................................................................................................... 73
Appendix 7: OPEX Enabler Categories Implementation for OS HP vs. LP ....................................................74



LIST OF FIGURES
Figure 1: Structure of St.Gallen OPEX Benchmarking database ...................................................................................... 14
Figure 2: St.Gallen OPEX Benchmarking Model ................................................................................................................. 14
Figure 3: Pharmaceutical Production System Model.......................................................................................................... 17
Figure 4: PPSM House with Metrics and Enabler ............................................................................................................... 17
Figure 5: The Sand Cone Model (Ferdows & De Meyer, 1990) .......................................................................................... 18
Figure 6: Inventory-Stability Matrix (ISM) ............................................................................................................................31
Figure 7: Inventory-Stability Matrix Excel.............................................................................................................................31
Figure 8: ISM: Level on inventory vs. Service Level Delivery .............................................................................................31
Figure 9: ISM: Rejected Batches vs. Service Level Delivery ............................................................................................... 32
Figure 10: ISM: Rejected Batches vs. Customer Complaint Rate ....................................................................................... 32
Figure 11: Moderating Effect Approach .................................................................................................................................. 33
Figure 12: Inventory Effect on Relationship Rejected Batches vs. Service Level Delivery ..............................................34
Figure 13: Effect of selected Production Strategy on Relationship Rejected Batches
vs. Service Level Delivery ........................................................................................................................................34
Figure 14: Effect of selected Production Strategy on Relationship Operational Stability
vs. Service Level Delivery ........................................................................................................................................36
Figure 15: Plot: Number of non-critical overdue CAPAs vs. Service Level Delivery (OTIF)...........................................38
Figure 16: Scatter plot between agg. PQS Effectiveness and PQS Efficiency ................................................................... 40
Figure 17: Scatter plot between agg. PQS Effectiveness and PQS Efficiency with peer-group ..................................... 40
Figure 18: Scatter plot for Customer Complaint Rate and the aggregated PQS Effectiveness ......................................42
Figure 19: Scatter plot for Rejected Batches and Customer Complaint Rate with
Operational Stability peer-groups .........................................................................................................................42
Figure 20: Linkage between Quality Maturity and Quality Behavior
– PDA results (left) and St.Gallen results (right)..................................................................................................42
Figure 21: Significant differences of the implementation level of Enabler Categories and Sub-Categories .............. 44
Figure 22: Appendix: MLR Impact of Level 3 Categories on PQS Effectiveness ............................................................. 68
Figure 23: Appendix: MLR Impact of Supplier Reliability (SR) on Operational Stability (OS) ..................................... 69
Figure 24: Appendix: MLR - Enter method ........................................................................................................................... 70
Figure 25: Appendix: MLR - Stepwise method ...................................................................................................................... 71
Figure 26: Appendix: MLR - Backward method ....................................................................................................................72
Figure 27: Appendix: Correlation table Compliance Metrics and Performance Metrics ................................................ 73
Figure 28: Appendix: Implementation Level of Quality Behavior and Maturity for OTIF HP vs. OTIF LP ............... 73
Figure 29: Appendix: Quality Behavior and Maturity for OTIF HP vs. OTIF LP t-Test Output ................................... 73
Figure 30: Appendix: Engagement Metrics Score for OTIF HP vs. OTIF LP ................................................................... 73
Figure 31: Appendix: Engagement Metrics Score for OTIF HP vs. OTIF LP t-Test Output .........................................74
Figure 32: Appendix: OPEX Enabler Categories Implementation for OS HP vs. LP .......................................................74
Figure 33: Appendix: OPEX Enabler Categories t-Test Output ..........................................................................................74



LIST OF TABLES
Table 1: Engagement Metrics of the PPSM......................................................................................................................... 19
Table 2: St.Gallen Enabler - Quality Behavior match ........................................................................................................ 19
Table 3: St.Gallen Enabler - Quality Maturity match ........................................................................................................ 19
Table 4: Calculation of Cultural Excellence Score ............................................................................................................. 19
Table 5: CAPA Effectiveness Metrics................................................................................................................................... 20
Table 6: Calculation of Supplier Reliability Score............................................................................................................. 20
Table 7: Overview Operational Stability Metrics and Purpose of Measure.................................................................. 20
Table 8: Calculation of Operational Stability Score ......................................................................................................... 20
Table 9: Calculation of Lab Quality & Robustness Score .................................................................................................22
Table 10: Calculation of aggregated PQS Effectiveness Score ...........................................................................................22
Table 11: Calculation of PQS Efficiency Score .....................................................................................................................22
Table 12: Statistical Tools used ............................................................................................................................................... 25
Table 13: Findings overview ....................................................................................................................................................27
Table 14: Differences of mean of (aggregated) PQS Effectiveness (Score)
for OTIF HP and OTIF LP ......................................................................................................................................28
Table 15: T-test for equality of means of (aggregated) PQS Effectiveness (Score)
between OTIF HP and OTIF LP ............................................................................................................................28
Table 16: Average stability and inventory of four ISM-groups ...........................................................................................31
Table 17: ISM Overview Service Level Delivery (OTIF) ...................................................................................................... 32
Table 18: Overview Production Strategy ...............................................................................................................................34
Table 19: Correlation Analysis SR, OS, OTIF .......................................................................................................................36
Table 20: Metrics included in MLR ........................................................................................................................................36
Table 21: Results MLR Impact of KPIs on OTIF ..................................................................................................................38
Table 22: CAPA Effectiveness Metrics for Correlation Analysis ........................................................................................38
Table 23: Group statistics showing the mean difference between CCR HP and LP ..................................................... 40
Table 24: Independent Samples Test showing the significance of the statistical t-Test ............................................... 40
Table 25: Group statistics showing the mean difference between OTIF HP and LP .................................................... 44
Table 26: Independent Sample Test showing the significance of the statistical t-Test................................................. 44
Table 27: Comparison FDA Quality Metrics with St.Gallen PPSM Approach ............................................................... 49
Table 28: Appendix: Structural Factors from St.Gallen OPEX Questionnaire ...............................................................56
Table 29: Appendix: Cost and Headcount figures from St.Gallen OPEX Questionnaire ............................................ 60
Table 30: Appendix: Enabler from St.Gallen OPEX Questionnaire .................................................................................. 61
Table 31: Appendix: Performance Metrics from St.Gallen OPEX Questionnaire ........................................................ 66



LIST OF ABBREVIATIONS
CI: Continuous Improvement
EFQM: European Foundation for Quality Management
EMS: Effective Management System
ISPE: International Society for Pharmaceutical Engineering
JIT: Just-in-Time
LQR: Lab Quality and Robustness
OPEX: Operational Excellence
OPQ: Office of Pharmaceutical Quality
OS: Operational Stability
OTIF: On-time-in-full
PDA: Parenteral Drug Association
PPSM: Pharmaceutical Production System Model
PQS: Pharmaceutical Quality System
QMS: Quality Management System
SR: Supplier Reliability
TPM: Total Productive Maintenance
TQM: Total Quality Management
UID: Unique identifier



1 EXECUTIVE SUMMARY

The FDA Quality Metrics initiative has emerged directly from the FDA Science and Innovation Act (US Congress, 2012) (FDASIA, 2012) and aims to provide both industry and regulators with better insight into the current state of quality across the global pharmaceutical manufacturing sector that serves the American public's healthcare needs.

As part of this initiative the FDA awarded a research grant to the University of St.Gallen to help establish the scientific base for relevant performance metrics which might be useful in predicting risks of quality failures or drug shortages. An important factor in the academic collaboration for this research was the availability of the St.Gallen Pharmaceutical OPEX Benchmarking database, consisting of key performance indicator and enabler data related to more than 330 pharmaceutical manufacturing sites.

This report provides an account of the research activities, the initial data analysis undertaken and the key findings arising from the first year of this research program. The research has now progressed into year two, and an outline of the future research activities planned can be found in Chapter 8.

The report is structured to provide the Background and Research Design in Chapters 2 and 3 respectively. A key body of work is then introduced in Chapter 4, outlining the design and development of a holistic, system-based approach to performance management, namely the Pharmaceutical Production System Model (PPSM). The Analysis Approach is explained in Chapter 5, with all of the detailed analysis and Findings provided comprehensively in Chapter 6. These detailed analyses are further supported and referenced with additional materials provided in numbered appendices. The implications of the research for the current FDA Metrics Initiative are discussed in Chapter 7, while Chapter 8 provides the conclusions and future outlook.

The main findings arising from the research conducted by the University of St.Gallen in close collaboration with the FDA Quality Metrics Team are summarized below.



Key Findings1:

» The St.Gallen Pharmaceutical Production System Model (PPSM) was developed as a prerequisite for conducting a structured data analysis to demonstrate how Pharmaceutical Quality System (PQS) Excellence may be achieved. It is a holistic model that illustrates a system-based understanding of pharmaceutical production.

» PQS Excellence comprises both PQS Effectiveness and PQS Efficiency. A positive correlation has been demonstrated between these two elements of the PPSM.

» The key performance indicator Service Level Delivery (OTIF) has been identified as a suitable surrogate for the effectiveness of the Pharmaceutical Quality System for the purpose of data analysis2.

» Operational Stability has been found to have a significant impact on PQS Effectiveness3.

» Supplier Reliability has been found to have a significant impact on Operational Stability.

» PQS Effectiveness high performing sites have a significantly higher Cultural Excellence compared to PQS Effectiveness low performing sites.

» A newly developed Inventory-Stability Matrix (ISM) allows for a better understanding of the impact of inventory on PQS performance at a site.

» A high level of Inventory (Days on Hand) can compensate for stability issues experienced on sites but may also mask insufficient process capability.

» Sites with Low Stability and Low Inventory have the highest risk profile regarding Rejected Batches, Customer Complaint Rate and Service Level Delivery (OTIF) (the PQS Effectiveness surrogate).

» Operational Stability high performing sites have a significantly lower level of Customer Complaints and a significantly lower level of Rejected Batches compared to Operational Stability low performing sites.

Implications for FDA Quality Metrics Program

» Lot Acceptance Rate and Customer Complaint Rate are reasonable measures to assess Operational Stability and PQS Effectiveness and should remain part of the Quality Metrics Program.

» The level of detail of the FDA suggested quality metrics definitions is appropriate given the limited number of metrics requested.

» A prerequisite to identifying risks based on the reportable metrics will be to define appropriate thresholds or ranges for these metrics4.

» The absence of any metrics addressing culture should be reconsidered, given the high importance of Cultural Excellence for PQS performance demonstrated by the data analysis conducted.

» Reporting on a product level should also be reconsidered, as the additional value (e.g. for preventing drug shortages) is limited and may not justify the comparably high reporting burden across the supply chain. On the other hand, it must be acknowledged that FDA intends to use quality metrics data for other purposes as well, such as a more targeted preparation of site inspections.

» Without considering the level of inventory, the program's ability to assess the risk for drug shortages is limited5.

» Evaluating advantages and disadvantages of other voluntary reporting programs (such as OSHA, U.S. Department of Labor, 2017) versus mandatory participation is recommended.

Implications for Industry

» The research supports alignment of the reporting of quality performance metrics with internal OPEX programs in order to:

› Justify the additional reporting effort and highlight the benefits to the business

› Further systematize continuous improvement activities within organizations

› Improve the understanding of the actual performance of the company's production system in general, and of the reported FDA metrics in particular.

» Fostering Quality Maturity will have a positive impact on the Quality Behavior at a firm, leading to superior Cultural Excellence and subsequently providing the foundation of PQS Excellence.

1. One of the suggested metrics from FDA's revised draft guidance, Invalidated Out-of-Specification (OOS), could not be tested based on existing St.Gallen OPEX data. However, as Invalidated OOS is part of the recently launched St.Gallen OPEX Benchmarking in QC Labs, such an analysis will be conducted in year 2 in the context of the Pharmaceutical Production System Model (PPSM).
2. As this metric is the only performance indicator in the entire database that covers time, quantity and quality from a customer perspective, no other metrics have been considered and tested as surrogates.
3. Operational Stability is an average of multiple variables. The impact of single metrics is assessed in chapter 5.2.5.
4. Doing this bears some complexity: first, risk has to be operationalized, and then a certain amount of data is needed to find relations between the metric values and the risk exposure. As FDA intends to do the analysis only in combination with other data it already has available, other patterns may arise that serve the aim of identifying the respective risks.
5. This conclusion has not been derived from data analysis but from theory and from the study of sources like the Drug Shortages report by the International Society for Pharmaceutical Engineering [ISPE] and The Pew Charitable Trusts [PEW] (2017).



2 BACKGROUND



Within the pharmaceutical industry, it is universally understood that a robust Pharmaceutical Quality System (PQS) provides key elements of the assurance and oversight necessary for pharmaceutical manufacturing and quality control laboratory processes: it ensures that patients are provided with medications that are safe, effective, and reliably produced to a high level of quality. However, despite recent advances in the manufacturing sector, manufacturing quality issues remain a frequent occurrence and can result in recalls, withdrawals, or even harm to patients (Woodcock & Wosinska, 2013; Yu & Kopcha, 2017). Furthermore, manufacturing quality issues have also been recently linked to the rise in critical drug shortages (ISPE & PEW, 2017).

Many global regulatory agencies now employ risk-based inspection scheduling by assessing the risk profile of manufacturing sites based on the treatments they provide and their compliance history, as seen in warning letters and field reports, in conjunction with records on product recalls and market-based quality problems. These are not necessarily the most informative measures, and by their nature they provide historical or lagging data or signal detection. More relevant data relating to the state of quality, provided in advance, could better inform the risk factors that might predict potential quality problems or the likelihood of future drug shortages. This could become a valuable additional source of information for a risk-based assessment and inspection scheduling of pharmaceutical manufacturing operations around the world.

FDA's approach to quality oversight has evolved in recent years. The Office of Pharmaceutical Quality (OPQ), established in 2015, has made it a priority to ensure that pharmaceutical products available to the American public meet high quality standards right throughout their product lifecycle. The FDA Quality Metrics initiative, which stems from the FDA Science and Innovation Act (US Congress, 2012) (FDASIA, 2012), aims to develop and implement the reporting of a set of standardized manufacturing quality metrics. The establishment and collection of these metrics should provide various stakeholders – from industry to regulators – with better insight into the state of quality at a given manufacturing facility, and allow stakeholders to better anticipate and address quality issues, and the risks associated with them, while simultaneously reducing extensive regulatory burden.

As part of this initiative the FDA has awarded a research grant (Grant #1UO1FD005675-01; Title: FDA Pharmaceutical Manufacturing Quality Metrics Research) to the University of St.Gallen to help establish the scientific base for such metrics. An important factor in the academic collaboration for this research was the availability of the St.Gallen Pharmaceutical OPEX Benchmarking database, consisting of key performance data related to more than 330 pharmaceutical manufacturing sites.

The following chapters of this report provide an overview of the research conducted by the University of St.Gallen in close collaboration with the FDA Quality Metrics Team.



3 PROJECT DESIGN



3.1 Research Objective

In support of OPQ's commitment to transform the assessment of drug quality from a qualitative to a quantitative or semi-quantitative, expertise-based assessment, the key objective of this project is to evaluate potential quality metrics candidates, including the ones suggested in FDA's Quality Metrics Draft Guidance of November 20166, and to derive conclusions.

Recommended quality metrics should facilitate oversight of the effectiveness of current manufacturing controls and the delivery of key quality outcomes in manufacturing operations. In short, the principal aim of this research is to explore success factors which enable a robust Pharmaceutical Quality System (PQS), or in other words the achievement of PQS Excellence.

Based on St.Gallen's global OPEX database and nearly fifteen years of experience doing research with the pharmaceutical industry, the research team focused the evaluation on meaningful, measurable and reportable potential candidates for quality metrics, incorporating both quantitative indicators and qualitative, culture-related indicators. The research strategy was executed in three stages:

» Stage 1 (Understand): the current FDA metrics concepts released in the "Request for Quality Metrics – Guidance for Industry" (FDA, 2015) and the revised guidance (FDA, 2016) were examined in detail. The underlying research assumptions informed further work.

» Stage 2 (Develop & Analyze): the researchers developed a set of quality metrics suitable to inform about overall production system performance. Quality performance is modelled as the very foundation of this set of metrics. The resulting system-based model, entitled the Pharmaceutical Production System Model (PPSM), describes the value chain from supplier inputs to final delivery and also comprises maintenance-related data, enablers, cultural indicators and standard operational performance metrics. This model serves as the basis for the detailed analysis of selected data from the St.Gallen OPEX Benchmarking database. In an additional step, the St.Gallen metric sets and the FDA guideline metrics approaches were compared. The main objective of this exercise was to examine whether the limited set of KPIs given in the draft FDA guideline is capable of providing insights comparable to those of the overall system-based PPSM evaluation.

» Stage 3 (Verify): the research team used their access to the industry to check the usability of a quality metrics approach.

3.2 St.Gallen OPEX Benchmarking and Database

Since 2004, the Institute of Technology Management at the University of St.Gallen has been assisting a number of pharmaceutical companies to improve their performance with its benchmarking study on Operational Excellence. The St.Gallen OPEX benchmarking has established itself as an important success component, providing practitioners in pharmaceutical companies with exclusive industry intelligence and support for data-backed decision making. Today, the St.Gallen OPEX benchmarking database consists of more than 330 manufacturing sites from over 124 different companies and thus represents the largest independent7 OPEX benchmarking in the pharmaceutical industry worldwide (see Figure 1).

The following paragraphs provide an explanation of the underlying St.Gallen OPEX Benchmarking model and its individual sub-systems. When developing the OPEX model back in 2004, the intention was not to create something new from scratch, but rather to develop an OPEX reference model by adapting and integrating proven production models already in existence in other industries to the specific needs of the pharmaceutical industry (Friedli, Basu, Bellm, & Werani, 2013). This procedure ensured that the St.Gallen model was built upon a profound theoretical foundation and likewise enables practical applications.

For the St.Gallen team, Operational Excellence is a philosophy which directs an organization towards continuous improvement. It is the balanced management of cost, quality and time focusing on the needs of the patient, comprising both structural and behavioral changes that support the necessary activities in the best way possible. To be sustainable it has to be driven by top management and be designed to engage every single employee. Operational Excellence is not only about performance; it is also about the way an organization achieves superior performance and about how it continuously improves itself.

The St.Gallen OPEX Model serves as an analytical "thought model" for the benchmarking, providing a sound base for an overall system-based interpretation of data. The current St.Gallen OPEX Benchmarking reference model is exhibited in Figure 2.

The St.Gallen OPEX reference model includes several sub-systems, each of which in itself constitutes an important element that contributes to the overall success. Even more important than the individual sub-systems is the way they reinforce each other. Thus, the model represents manufacturing as a holistic system in which single elements or interventions have a direct and indirect impact on other elements or sub-systems. At the highest level of abstraction, the OPEX reference model is divided into two larger sub-systems: a technical and a social sub-system. The technical sub-system comprises well-known manufacturing programs such as Total Productive Maintenance (TPM), Total Quality Management (TQM) and Just-in-Time (JIT), and structures them in a logical and consistent manner (Cua, McKone, & Schroeder, 2001).

Apart from that, the social sub-system takes up the quest for an operational characterization of management quality and work organization. This second, higher-level sub-system focuses on supporting, encouraging and motivating people to steadily improve processes (and by doing so, apply the technical practices in ways that contribute to the overall goal of the company).

6. Food and Drug Administration [FDA] (2016)


7. Independent in this context means "not consultant driven".



TOTAL: 336 SITES

Site Production Structure: Mixed 165, API 53, Solids & Semi Solids 66, Liquids & Sterile Liquids 25, Unassigned 27

Site Size (Employees): 0-100: 55, 101-300: 126, 301-500: 87, 501-1000: 50, >1000: 18

Site Production Type: R&D 61, Generic 12, CMO 29, Mix 234

Figure 1: Structure of St.Gallen OPEX Benchmarking database

[Figure: the St.Gallen OPEX Benchmarking Model. Framed by Structural Factors and Costs, the technical sub-system comprises TPM (Preventive Maintenance, Housekeeping, Effective Technology Usage), TQM (Process Management, Customer Integration, Cross-functional Product Development, Supplier Quality Management) and JIT (Set-up Time Reductions, Pull System, Planning Adherence, Layout Optimization), resting on Standardization and Visual Management and targeting (1) Stable Equipment, (2) Stable Processes and (3) Low Inventories; outcomes are captured as Operational Performance. The foundation is the Effective Management System: Direction Setting, Management Commitment & Company Culture, Employee Involvement & Continuous Improvement, and Functional Integration & Qualification.]

Figure 2: St.Gallen OPEX Benchmarking Model



4 THE PHARMACEUTICAL PRODUCTION
SYSTEM MODEL (PPSM) HOUSE



4.1 Overview

The Pharmaceutical Production System Model (PPSM) is a new model specifically developed for this FDA Quality Metrics project. It has been designed to enable a structured analysis of the components which support the achievement of Pharmaceutical Quality System (PQS) Excellence, which has been the primary focus of this research. It illustrates a holistic, system-based understanding of pharmaceutical production.

The PPSM is displayed in Figure 3. The [S] indicates that an aggregated score was calculated for the category. The letters A-E help to structure the discussion, without making a statement about the relative importance of the different parts of the house. For instance, 'C-categories' refers to the three categories Supplier Reliability, Operational Stability and Lab Quality and Robustness.

The model serves several aims:

1. First, the PPSM provides a structured and holistic depiction of the relevant, available data from the St.Gallen OPEX Database, including: Key Performance Indicators8 (e.g. the metrics within the C-categories), Enabler implementation9 (e.g. the qualitative enablers within the category Cultural Excellence) and the Structural Factors10 of the given organization (e.g. site structure, product mix, technology employed).

2. Secondly, the model facilitates positioning of the three metrics suggested in the revised FDA Draft Guidance (2016) within the broader context of the holistic St.Gallen understanding, in order to test them for significance from a system perspective. In doing so:
a. The KPI Lot Acceptance Rate was assigned to the C-category Operational Stability.
b. The KPI Invalidated OOS was assigned to the C-category Lab Quality and Robustness.
c. Customer Complaint Rate is considered an outcome metric within the PPSM and is therefore located in the D-category PQS Effectiveness.

3. Thirdly, the model facilitates the grouping and discussion of the elements within the PPSM as well as the examination of the relationships between elements. For instance, the proposal to examine the "Relationship of individual Operational Stability metrics with PQS Effectiveness" clearly defines the scope of the analysis to be discussed.

4. Fourthly, the PPSM provides a structure for the overall research project, as it facilitates the tracking and communication of each analysis already performed as well as indicating any potential blank spots between the different PPSM elements, thereby supporting the identification of potentially interesting future analyses.

The PPSM has evolved throughout the research and has been revised and refined several times as the understanding and insights gained developed. The current PPSM version presented here is heavily influenced by three key aspects:

1. Firstly, the initial data available for analysis was limited to data already collected through the St.Gallen OPEX Benchmarking activities. Consequently, it is acknowledged that there may be other appropriate metrics that could also be included in the chosen categories in the future.

2. Secondly, based on the holistic St.Gallen 'Excellence' understanding, the PPSM goes beyond a pure focus on the effectiveness of the PQS and also incorporates efficiency aspects (costs and headcounts).

3. Thirdly, from a scientific perspective, the model is inspired by two renowned models: the Sand Cone Model (cf. Figure 5), which suggests that there is a hierarchy (a sequence to follow) between the four competitive capabilities of Quality, Dependability, Speed and Cost Efficiency (Ferdows & De Meyer, 1990), with Quality as the foundation; and the European Foundation for Quality Management (EFQM) model (European Foundation for Quality Management, 2017), which promotes the consideration of two key aspects when undertaking improvement programs, the Enablers (how) and the Results (what).

In line with the Sand Cone Model, the St.Gallen PPSM deals with metrics reflecting quality, dependability, speed and cost. The basic PPSM assumption holds that achievement of higher performance in PQS Effectiveness goes hand in hand with achieving higher PQS Efficiency too.

Inspired by the EFQM classification, the aspects incorporated into parts A and B of the PPSM are considered enabling elements, whereas the C-elements, in conjunction with the D-elements of PQS Effectiveness and PQS Efficiency, are associated with the results.

Finally, regarding the Lab Quality and Robustness (LQR) category in the PPSM: collection of this aspect of the St.Gallen benchmarking assessment only started in Q1 2017, therefore the LQR data was not available for research. Nevertheless, as this aspect was considered a fundamental element of the model, the PPSM house includes the category Lab Quality and Robustness, thereby adding to the completeness of the model and ensuring that as soon as the lab data is available it can be seamlessly integrated.

Figure 4 shows the PPSM House including all metrics assigned to the PPSM categories.

8. Key performance indicators (KPIs) are a set of quantifiable measures that a company uses to gauge its performance over time. These metrics are used to
determine a company’s progress in achieving its strategic and operational goals, and also to compare a company’s finances and performance against other
businesses within its industry.
9. Enablers are production principles (methods & tools but also observable behaviour). The values show the degree of implementation based on a self-assess-
ment on a 5 point Likert scale.
10. Structural factors provide background information on the site, such as size and FTEs, technology, product program. Structural factors allow to build
meaningful peer groups for comparisons (“compare apples with apples”).
11. See footnote 8.



[Figure: the PPSM house. At the top (E) sits PQS Excellence [S]; the Result System (D) comprises PQS Effectiveness [S] and PQS Efficiency [S]; the C-categories are Supplier Reliability [S], Operational Stability [S] and Lab Quality & Robustness [S]; the Enabling System (B) contains CAPA Effectiveness and the Structural Factors; Cultural Excellence [S] (A) forms the foundation. The three FDA draft guidance metrics are annotated: Customer Complaint Rate points to PQS Effectiveness, Lot Acceptance Rate (1 - Rejected Batches) to Operational Stability, and Invalidated OOS to Lab Quality & Robustness.]

Figure 3: Pharmaceutical Production System Model

PQS EXCELLENCE: SCORE BUILT FROM PQS EFFECTIVENESS & PQS EFFICIENCY

PQS EFFECTIVENESS:
» Service Level Delivery (OTIF)
» Customer Complaint Rate

PQS EFFICIENCY:
» Maintenance Cost/Total Cost
» Quality Cost/Total Cost
» Cost for Preventive Maintenance/Total Cost
» FTE QC/Total FTE
» FTE QA/Total FTE
» Inventory

SUPPLIER RELIABILITY:
» Service level supplier (OTIF)
» Complaint rate supplier

OPERATIONAL STABILITY:
» Unplanned Maintenance
» OEE (average)
» Rejected batches
» Deviation
» Yield
» Scrap rate
» Release time (formerly DQ)
» Deviation closure time (formerly DQ)

LAB QUALITY & ROBUSTNESS:
» Analytical Right First Time
» Lab Investigations
» Invalidated OOS
» Total OOS
» Lab Deviation Events
» Recurring Deviation
» CAPAs Overdue
» Customer Complaints Requiring Investigation
» Product Re-Tests due to Complaints
» Routine Product Re-tests
» Annual Product Quality Reviews (APQR)
» APQR On Time Rate
» Stability Reports
» Audits

NR. OF OBSERVATIONS:
» From internal audit

CAPA SYSTEM:
» Number of CAPAs
» Number of critical overdue CAPAs
» Number of non-critical overdue CAPAs

ENGAGEMENT METRICS:
» Suggestions (Quantity)
» Suggestions (Quality)
» Employee turnover
» Sick leave
» Training
» Level of qualification
» Level of safety (Incidents)

CULTURAL EXCELLENCE: QUALITY MATURITY
» Preventive maintenance [3]
» Housekeeping [2]
» Process management [6]
» Cross functional product development [3]
» Customer involvement [2]
» Supplier quality management [5]
» Set-up time reduction [1]

CULTURAL EXCELLENCE: QUALITY BEHAVIOR
» Preventive maintenance [4]
» Housekeeping [1]
» Process management [1]
» Cross functional product development [1]

Figure 4: PPSM House with Metrics and Enabler
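For illustration, the category-to-metric mapping in Figure 4 can be captured as a simple lookup structure. The following Python sketch is ours, not the report's (the name PPSM_HOUSE and the helper function are assumptions; the metric names follow Figure 4), and is shortened to a subset of the scored categories:

```python
# Illustrative sketch of the Figure 4 mapping from PPSM categories to
# metrics. Categories marked [S] receive an aggregated score.
PPSM_HOUSE = {
    "PQS Effectiveness [S]": [
        "Service Level Delivery (OTIF)",
        "Customer Complaint Rate",
    ],
    "PQS Efficiency [S]": [
        "Maintenance Cost/Total Cost",
        "Quality Cost/Total Cost",
        "Cost for Preventive Maintenance/Total Cost",
        "FTE QC/Total FTE",
        "FTE QA/Total FTE",
        "Inventory",
    ],
    "Supplier Reliability [S]": [
        "Service level supplier (OTIF)",
        "Complaint rate supplier",
    ],
    "Operational Stability [S]": [
        "Unplanned Maintenance",
        "OEE (average)",
        "Rejected batches",
        "Deviation",
        "Yield",
        "Scrap rate",
        "Release time",
        "Deviation closure time",
    ],
    "Lab Quality & Robustness [S]": [
        "Analytical Right First Time",
        "Lab Investigations",
        "Invalidated OOS",
        "Total OOS",
    ],
}

def category_of(metric: str) -> str:
    """Return the PPSM category (per Figure 4) a metric is assigned to."""
    for category, metrics in PPSM_HOUSE.items():
        if metric in metrics:
            return category
    raise KeyError(f"{metric!r} is not mapped in the PPSM house")

# Two of the FDA draft guidance metrics land in different categories:
print(category_of("Customer Complaint Rate"))  # PQS Effectiveness [S]
print(category_of("Invalidated OOS"))          # Lab Quality & Robustness [S]
```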



4.2 Cultural Excellence

According to Yu and Kopcha (2017), a critical enabler for product quality is the culture of quality prevalent within an organization. This concurs with the understanding of the research team, which is demonstrated by placing the category Cultural Excellence, as the foundation of an effective and efficient PQS, at the base of the Pharmaceutical Production System Model.

In the context of this research project, the term Cultural Excellence is used as an umbrella term for a combination of one set of quantitative metrics and two sets of qualitative enablers.

Engagement Metrics

The first element of the PPSM Cultural Excellence category are the so-called Engagement Metrics11, which are listed in Table 1. Engagement Metrics serve as an indicator of the motivation of employees in striving for continuous improvement, of their technical and organizational capabilities, and of whether the workplace provides a safe and healthy environment.

Motivation is approximated by the number of improvement suggestions (Suggestions Quantity), the financial impact or return on investment of the improvements (Suggestions Quality) and the average turnover of employees. It is assumed that a high employee turnover rate indicates a culture where people are not happy and would leave if they had the chance to find alternative employment.

Level of qualification covers the work-related qualification of employees when entering the company. Training days addresses the willingness of a site to invest in building the capabilities of its workforce. Sick leave and level of safety serve as indicators for a safe and healthy work environment.

Quality Behavior and Quality Maturity Enablers

The research team then assigned the Enablers already available within the St.Gallen OPEX Database to one of two groups: a group of Quality Behavior attributes or a group of Quality Maturity attributes. These groupings were based on the definition of the terms Quality Behavior and Quality Maturity in the PDA Quality Culture Survey Report (Patel et al., 2015)12.

Quality Behavior summarizes all quality-related behaviors of an individual that can be observed in an organization, covering aspects such as commitment, engagement, transparency and active assistance from supervisors. Quality Maturity, on the other hand, comprises implementable elements of the pharmaceutical quality system such as methods, procedures and tools. In total, 26 of the St.Gallen Enablers have been assigned to Quality Behavior, and 36 to Quality Maturity. (Note: there remain 53 other Enablers within the St.Gallen database that were not assigned to either group as they did not align with the two PDA categories used.)

Table 2 and Table 3 provide an overview of which Enablers from the five parts of the St.Gallen OPEX Benchmarking (TPM, TQM, JIT, EMS13 and Basic Elements) have been assigned to either Quality Behavior or Quality Maturity14.

Cultural Excellence Score

The overall PPSM Level 1 Cultural Excellence Score is calculated as an average of the Engagement Metrics Score, the Quality Behavior Score and the Quality Maturity Score. For all three scores, the following rule applies: the higher the score, the better.

The Engagement Metrics Score is calculated as an average of the relative16 values of all Engagement Metrics (cf. Table 1). The Quality Behavior Score is calculated as an average of all Enablers assigned to the Quality Behavior group. In order to normalize from the 1-5 Likert scale (as used in the St.Gallen questionnaire) to 0-100%, a five has been converted to 100%, a four to 75%, and so on. In cases where a 1 is considered better than a 5 for some Enablers, 100% was assigned to 1 and 0% to 5. The same approach has been used for the Quality Maturity Score. Table 4 summarizes the calculation of the Cultural Excellence Score.

[Figure: the Sand Cone Model; layers from the foundation upwards: Quality, Reliability, Speed, Cost Efficiency.]

Figure 5: The Sand Cone Model (Ferdows & De Meyer, 1990)

12. See footnote 9.
13. Effective Management System.
14. Please find the detailed assignment of Enablers to the categories Quality Maturity and Quality Behavior in the Appendix.
15. Assigned to both categories.
16. The term 'relative value' indicates that not the absolute metric values but their relative positions within the sample have been considered. For example, the lowest absolute value for Sick Leave in the sample is considered the best value in the sample (see the 'better if' column in Table 1) and therefore entered the calculation of the Engagement Metrics Score as 100%; the highest value for Sick Leave entered the calculation as 0%. For Number of Suggestions, the highest absolute value was assigned 100% and the lowest absolute value 0%. The site with the aggregated Engagement Metrics Score closest to 100% is therefore the best site of the sample in this category. The transformation from absolute to relative values has been done with Excel's percentile-rank function (PERCENTRANK).
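To make the transformation in footnote 16 concrete, here is a minimal sketch that reproduces the percentile-rank logic outside Excel. The function name and the sample numbers are ours, and the tie-handling is simpler than Excel's PERCENTRANK, which is enough for illustration:

```python
def relative_value(sample, x, better_if_higher=True):
    """Map an absolute metric value onto its relative position in the
    sample, so the best site gets 100% and the worst 0% (footnote 16).
    Simplified tie-handling compared with Excel's PERCENTRANK."""
    rank = sum(v < x for v in sample) / (len(sample) - 1)
    return rank if better_if_higher else 1.0 - rank

# 'Better if lower' metric (Sick Leave, %): the lowest value is best.
sick_leave = [2.1, 3.4, 4.0, 5.5, 6.2]
print(relative_value(sick_leave, 2.1, better_if_higher=False))  # 1.0
print(relative_value(sick_leave, 6.2, better_if_higher=False))  # 0.0

# 'Better if higher' metric (Suggestions Quantity): the highest is best.
suggestions = [1, 5, 8, 12, 20]
print(relative_value(suggestions, 20))  # 1.0
```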



Engagement Metric               Unit                Better if
Suggestions (Quantity)          Number              higher
Suggestions (Quality)           Currency unit       higher
Employee turnover               %                   lower
Sick leave                      %                   lower
Training Days                   -                   higher
Level of qualification          %                   higher
Level of safety (Incidents)     Number per month    lower

Table 1: Engagement Metrics of the PPSM

Quality Behavior

Enabler Category                                                   Assigned to         UID in St.Gallen 2016
                                                                   Quality Behavior    Questionnaire
TPM              Preventive maintenance                            4/8                 D03, D05-D07
TPM              Housekeeping                                      1/3                 D15
TQM              Process management                                1/8                 E02
TQM              Cross functional product development              1/5                 E10 (cf. footnote 15)
EMS              Direction setting                                 3/6                 G02, G05, G06
EMS              Management commitment and company culture         7/11                G07, G08, G11, G13-G16
EMS              Employee involvement and continuous improvement   5/11                G19, G20, G23-G26
EMS              Functional integration and qualification          1/5                 G31
Basic Elements   Standardization and simplification                3/6                 H01-H03

Table 2: St.Gallen Enabler - Quality Behavior match

Quality Maturity

Enabler Category                                                   Assigned to         UIDs
                                                                   Quality Maturity
TPM              Preventive maintenance                            3/8                 D01, D02, D04
TPM              Housekeeping                                      2/3                 D16, D17
TQM              Process management                                6/8                 E01, E4-E8
TQM              Cross functional product development              3/5                 E9, E10
TQM              Customer involvement                              2/6                 E15, E16
TQM              Supplier quality management                       5/7                 E20-E22, E24, E26
JIT              Set-up time reduction                             1/6                 F06
EMS              Direction setting                                 2/6                 G03, G04
EMS              Employee involvement and continuous improvement   4/11                G17, G18, G21, G22
EMS              Functional integration and qualification          4/5                 G28-G30, G32
Basic Elements   Visual management                                 4/4                 H07-H10

Table 3: St.Gallen Enabler - Quality Maturity match

Cultural Excellence Score = average of:
  Engagement Metrics Score    average of the seven Engagement Metrics (cf. Table 1)
  Quality Behavior Score      average of the 26 Enablers assigned to Quality Behavior (normalized to a 0-100% scale, cf. Table 2)
  Quality Maturity Score      average of the 36 Enablers assigned to Quality Maturity (normalized to a 0-100% scale, cf. Table 3)

Table 4: Calculation of Cultural Excellence Score
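As a worked illustration of Table 4 and the Likert normalization described above (5 maps to 100%, 4 to 75%, and so on), the following sketch computes the score for one invented site. The helper names and all numbers are ours, not the report's:

```python
def likert_to_pct(rating, one_is_best=False):
    """Normalize a 1-5 Likert rating to 0-100% (5 -> 100%, 4 -> 75%, ...).
    For Enablers where 1 is the better answer, the scale is inverted."""
    pct = (rating - 1) / 4
    return 1.0 - pct if one_is_best else pct

def mean(values):
    return sum(values) / len(values)

# Invented site: a handful of Likert ratings stand in for the 26 Quality
# Behavior and 36 Quality Maturity Enablers; the Engagement Metrics are
# already expressed as relative values per footnote 16.
behavior_ratings = [4, 5, 3, 4]
maturity_ratings = [3, 4, 4, 5, 2]
engagement_relative = [0.80, 0.60, 0.90]

quality_behavior_score = mean([likert_to_pct(r) for r in behavior_ratings])
quality_maturity_score = mean([likert_to_pct(r) for r in maturity_ratings])
engagement_score = mean(engagement_relative)

# Table 4: Cultural Excellence Score = average of the three sub-scores.
cultural_excellence = mean(
    [engagement_score, quality_behavior_score, quality_maturity_score]
)
print(f"Cultural Excellence Score: {cultural_excellence:.1%}")  # ~72.2%
```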



Category             Metric                                                     Number of data points
CAPAs                Number of CAPAs                                            14
CAPAs                Number of critical overdue CAPAs                           14, whereof 13 have reported 0
CAPAs                Number of non-critical overdue CAPAs                       14
Observation          Number of observations of a health authority inspection   14
Observation          Number of observations per internal audit                  14
Market actions       Number of recalls                                          14
Market actions       Number of supply stops (e.g. drug shortages)               -
Market actions       Others (e.g. withdrawals)                                  -
Regulatory actions   Number of Warning Letters                                  14, whereof 14 have reported 0
Regulatory actions   Number of 483s                                             14
Regulatory actions   Others (e.g. Field Alert Reports)                          14

Table 5: CAPA Effectiveness Metrics

Supplier Reliability Score = average of relative values of:
  Complaint Rate (Supplier)    %    better if lower
  Service Level Supplier       %    better if higher

Table 6: Calculation of Supplier Reliability Score

Metric                            Purpose of measure
Overall Equipment Effectiveness   Measurement of Equipment Stability and Availability / Maintenance Effectiveness
Unplanned Maintenance             Measurement of Maintenance Quality
Rejected Batches                  Measurement of Manufacturing Failure Rates
Scrap Rate                        Measurement of Manufacturing Waste / Failure Rates
Deviations per Batch              Process Capability
Deviation Closure Time            Tension on system
Release time                      Tension on system

Table 7: Overview Operational Stability Metrics and Purpose of Measure

Operational Stability Score = average of relative values of:
  Overall Equipment Effectiveness   %                better if higher
  Unplanned Maintenance             %                better if lower
  Rejected Batches                  %                better if lower
  Scrap Rate                        %                better if lower
  Deviations per Batch              Number / batch   better if lower
  Deviation Closure Time            Working Days     better if lower
  Release time                      Working Days     better if lower

Table 8: Calculation of Operational Stability Score
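Tables 6, 8 and 9 all follow the same pattern: average the relative (percentile-rank) values of the category's metrics, inverting metrics where lower is better. The following self-contained sketch illustrates this for the Table 8 metrics; the helper function and the site data are invented by us, while the 'better if' directions are Table 8's:

```python
def relative_value(sample, x, better_if_higher):
    """Percentile-rank position of x within the sample (cf. footnote 16)."""
    rank = sum(v < x for v in sample) / (len(sample) - 1)
    return rank if better_if_higher else 1.0 - rank

# Invented values for five sites; directions per Table 8.
METRICS = {
    "Overall Equipment Effectiveness": (True,  [0.45, 0.55, 0.62, 0.70, 0.75]),
    "Unplanned Maintenance":           (False, [0.30, 0.25, 0.20, 0.15, 0.10]),
    "Rejected Batches":                (False, [0.020, 0.015, 0.010, 0.008, 0.005]),
    "Scrap Rate":                      (False, [0.06, 0.05, 0.04, 0.03, 0.02]),
    "Deviations per Batch":            (False, [1.5, 1.2, 0.9, 0.6, 0.4]),
    "Deviation Closure Time":          (False, [40, 35, 30, 25, 20]),
    "Release time":                    (False, [18, 15, 12, 10, 8]),
}

def operational_stability_score(site):
    """Average the site's relative values across the Table 8 metrics."""
    rels = [relative_value(values, values[site], better_if_higher)
            for better_if_higher, values in METRICS.values()]
    return sum(rels) / len(rels)

for site in range(5):
    print(f"Site {site + 1}: OS score = {operational_stability_score(site):.0%}")
```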



4.3 CAPA Effectiveness

The system for implementing corrective and preventive actions (CAPA) is a fundamental part of any pharmaceutical quality system. According to the ICH Q10 Pharmaceutical Quality System guideline (FDA, 2009), CAPAs result from the investigation of complaints, nonconformances, recalls, deviations, findings, and trends from process performance and product quality monitoring as well as internal audits and external regulatory inspections. The level of effort and documentation should be proportionate to the level of risk. The CAPA system may be considered effective if it achieves its key objective: to support the improvement of products and processes as well as to enhance the understanding of products and processes. Furthermore, the CAPA methodology may be applied throughout the whole product lifecycle, including Pharmaceutical Development, Technology Transfer, Commercial Manufacturing and Product Discontinuation (FDA, 2009).

Until the end of 2016, no CAPA-related metrics had been requested in the standard St.Gallen OPEX Benchmarking. However, during a recent quality-related discussion with St.Gallen, fourteen sites did report metrics that can be labeled as CAPA metrics. Table 5 provides an overview of the metrics that have been summarized in the category CAPA Effectiveness. Because of the limited number of data points, and because for some specific metrics very little difference between the 14 sites exists, the analysis results of the CAPA Effectiveness category can be generalized only to a limited extent. However, the new 2017 St.Gallen Benchmarking Questionnaire now includes all metrics listed in Table 5; the ability to perform statistical analysis and derive meaningful results in this category is therefore expected to increase in the future. To further strengthen the usability of the PPSM, some of the metrics (e.g. from the category Observation) have been allocated to other categories of the model (cf. Figure 4).

It should be noted that, due to the limited number of data points, no CAPA Effectiveness Score has been calculated and used during year one of the research project.

Supplier Reliability (SR)

According to the ICH Q10 guideline, the pharmaceutical quality system also extends to the control and review of any outsourced activities and the quality of purchased materials. The PQS is therefore responsible for implementing systematic processes which ensure the control of outsourced activities and the quality of all purchased material. This includes the assessment of the suitability and competence of any third party prior to outsourcing operations or selecting material suppliers. It also requires the establishment of a clear definition of responsibilities for all quality-related activities of any involved parties and for monitoring the quality of incoming material (FDA, 2009). In order to assess the reliability of external suppliers, represented by the PPSM Supplier Reliability Score, the research team uses the following metrics from the St.Gallen OPEX Benchmarking: Service Level Supplier, which measures the supplier's ability to deliver on time, and Complaint Rate Supplier, which measures the supplier's ability to deliver products of high quality.

Supplier Reliability Score
The Supplier Reliability Score is calculated as an average of the relative values of the metrics Complaint Rate (Supplier) and Service Level Supplier. Table 6 summarizes the calculation of the Supplier Reliability Score.

4.4 Operational Stability (OS)

Operational stability within the St.Gallen PPSM equates to the provision of capable and reliable processes and equipment. Referring to the Sand Cone Model, the PPSM Operational Stability (OS) embodies the core capabilities of Quality and Dependability.

The importance of robust manufacturing processes was highlighted in the ICH Quality Implementation Working Group on Q8/Q9/Q10 Questions & Answers document, which outlines the potential benefits of implementing an effective PQS as follows:

"Facilitated robustness of the manufacturing process, through facilitation of continual improvement through science and risk-based post approval change processes; Further reducing risk of product failure and incidence of complaints and recalls thereby providing greater assurance of pharmaceutical product consistency and availability (supply) to the patient" (FDA, 2011)

Table 7 provides an overview of the metrics that compose the PPSM Operational Stability category.

Operational Stability Score
The Operational Stability Score is calculated as an average of the relative values of the metrics shown in Table 8.

4.5 Lab Quality & Robustness (LQR)

To provide a comprehensive view of the production system and to cover the whole value chain from supply to release within a pharmaceutical company, the PPSM categories Supplier Reliability and Operational Stability are complemented with the final C-category, Lab Quality & Robustness. This category is also seen as one pillar of the risk-based approach of FDA's Quality Metrics Initiative (Yu, 2017).

The PPSM Lab Quality & Robustness category comprises the FDA metric Invalidated OOS and additional indicators of the quality level and robustness of the lab operations (e.g. analytical RFT, recurring deviations or product re-testing due to complaints).

Lab Quality & Robustness Score
The Lab Quality & Robustness Score is calculated as an average of the relative values of the metrics shown in Table 9.
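Each of the three C-category scores above is defined as "an average of the relative values" of its metrics (Tables 6, 8 and 9). The following is a minimal sketch of this calculation, using the Supplier Reliability Score as the example. It assumes min-max normalization across the benchmarked sites as the "relative value", oriented by each metric's "better if" direction; the data, column names and the normalization choice are illustrative assumptions, not the exact St.Gallen implementation.

```python
import pandas as pd

def relative_value(series: pd.Series, higher_is_better: bool) -> pd.Series:
    """Min-max normalize a metric across sites so 1.0 is best and 0.0 is worst."""
    rel = (series - series.min()) / (series.max() - series.min())
    return rel if higher_is_better else 1.0 - rel

# Toy data: three sites reporting the two Supplier Reliability metrics (Table 6).
sites = pd.DataFrame({
    "complaint_rate_supplier": [0.8, 2.5, 1.1],    # %, better if lower
    "service_level_supplier":  [98.0, 91.0, 95.0], # %, better if higher
})

# Supplier Reliability Score = average of the two relative values per site.
sr_score = pd.concat(
    [
        relative_value(sites["complaint_rate_supplier"], higher_is_better=False),
        relative_value(sites["service_level_supplier"], higher_is_better=True),
    ],
    axis=1,
).mean(axis=1)
print(sr_score)  # one score in [0, 1] per site
```

The same pattern extends to the Operational Stability and Lab Quality & Robustness Scores by averaging over the metric lists of Tables 8 and 9.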



Score                            Average of relative value of                             Unit                Better if
Lab Quality & Robustness Score   Analytical Right First Time                              %                   higher
                                 Lab Investigations/1'000 Tests                           No./1'000 Tests     lower
                                 Invalidated OOS/100'000 Tests                            No./100'000 Tests   lower
                                 Total OOS/100'000 Tests                                  No./100'000 Tests   lower
                                 Lab Deviations Event/1'000 Tests                         No./1'000 Tests     lower
                                 Recurring Deviation                                      %                   lower
                                 CAPAs Overdue                                            %                   lower
                                 Customer Complaints req. Investigation/100'000 Tests    No./100'000 Tests   lower
                                 Product Re-Test due to Complaints                        %                   lower
                                 Routine Product Re-Tests                                 No.
                                 Annual Product Quality Reviews (APQR)/Products tested    No./Product
                                 APQR On Time Rate                                        %                   higher
                                 Stability Batches/Stability Reports                      No./Report
                                 Batches/Audits                                           No./Audit

Table 9: Calculation of Lab Quality & Robustness Score

Details                   Better if   Aggregation 1                 Aggregation 2
Complaint Rate Supplier   lower       Supplier Reliability Score    Aggregated PQS Effectiveness Score
Service Level Supplier    higher
Unplanned Maintenance     lower       Operational Stability Score
OEE                       higher
Rejected Batches          lower
Yield                     higher
Scrap Rate                lower
Release Time              lower
Deviation Closure Time    lower

Table 10: Calculation of aggregated PQS Effectiveness Score

Score                  Average of relative value of                 Unit   Better if
PQS Efficiency Score   Maintenance Cost/Total Cost                  %      lower
                       Quality Cost/Total Cost                      %      lower
                       Cost for Preventive Maintenance/Total Cost   %
                       FTE QC/Total FTE                             %      lower
                       FTE QA/Total FTE                             %      lower

Table 11: Calculation of PQS Efficiency Score



4.6 PQS Effectiveness

The Pharmaceutical Quality System (PQS) is defined as 'the management system to direct and control a pharmaceutical company with regard to quality' (FDA, 2009). The PQS is at the center of interest for this research project as it plays an important role in fostering the FDA vision formulated as part of the FDA's Pharmaceutical Quality for the 21st Century Initiative:

"A maximally efficient, agile, flexible pharmaceutical manufacturing sector that reliably produces high quality drugs without extensive regulatory oversight" (Yu & Kopcha, 2017).

Two aspects of the FDA vision have to be highlighted to convey the St.Gallen understanding of what a high level of effectiveness for a PQS constitutes:

1. Firstly, the pharmaceutical production system is supposed to be reliable; that means it is able to provide the right drug in the right quantity at the right time.
2. Secondly, the drugs have to be produced at a quality level that meets the quality expectations of the customer and the regulatory authorities.

As described in more detail in chapter 6.2.1, there was no single metric within the St.Gallen database that was initially designed to measure PQS Effectiveness; rather, an aggregated PQS Effectiveness Score is calculated from several metrics from the C-categories. The research team therefore examined best-fit surrogate candidates for PQS Effectiveness to use as a dependent variable in the statistical analysis and identified the metric Service Level Delivery (On Time In Full) (OTIF) as the best available surrogate; details can be found in section 6.2.1.

Aggregated PQS Effectiveness Score
The Aggregated PQS Effectiveness Score is currently calculated from the average of the Supplier Reliability Score and the Operational Stability Score as shown in Table 10.

4.7 PQS Efficiency

While the PPSM category PQS Effectiveness addresses the question of how well the PQS is working (i.e. does it achieve its objectives, the "what"), the PPSM category PQS Efficiency considers how many resources are deployed to achieve this level of effectiveness.

The consideration of cost and the deployment of FTE resources is of central interest for companies that intend to use their performance metrics not only for fulfilling regulatory requirements but also in striving for continuous improvement. Investments in the effectiveness of the PQS are much more likely to be supported by top management if it can be convinced that those investments will not only have a positive impact on the effectiveness of the PQS but also a positive impact on its efficiency. This is one key reason why the research team investigated the relationships between PQS Effectiveness and PQS Efficiency.

In their article on drug shortages, Woodcock and Wosinska argue that the market for pharmaceutical products does not reward quality (Woodcock & Wosinska, 2013), thus creating an economic incentive to minimize investments in manufacturing quality. Showing a positive impact of investments in quality on efficiency may change this for the better, provided that pharmaceutical managers take a long-term perspective.

PQS Efficiency Score
The PPSM PQS Efficiency Score is calculated as an average of the relative values of the ratios shown in Table 11.

4.8 PQS Excellence

Following the holistic St.Gallen excellence understanding, the overall PPSM PQS Excellence Score comprises both aspects of a pharmaceutical quality system, effectiveness and efficiency. This score allows for a high-level ranking of a site's pharmaceutical quality system compared to other sites.

It should be noted that at the current state of the project the PQS Excellence Score was not part of the analysis performed, as the focus to date was to identify general links between the PPSM elements rather than ranking the sites according to the overall performance of their PQS. However, the PPSM model facilitates this ranking in future analysis.
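The aggregations described in sections 4.6 to 4.8 can be expressed in a few lines. Below is a minimal sketch, assuming the category scores are already available per site; the unweighted averaging used for the Excellence Score is one plausible reading of "comprises both aspects", offered only as an illustration, not a formula stated in this report.

```python
import pandas as pd

# Assumed inputs: per-site scores in [0, 1], each already computed as an
# average of relative metric values (Tables 6, 8 and 11). Values are illustrative.
scores = pd.DataFrame({
    "sr_score": [0.71, 0.55, 0.62],        # Supplier Reliability Score
    "os_score": [0.66, 0.40, 0.58],        # Operational Stability Score
    "pqs_efficiency": [0.52, 0.47, 0.60],  # PQS Efficiency Score (Table 11)
})

# Aggregation 2 in Table 10: the aggregated PQS Effectiveness Score is the
# plain average of the two C-category scores.
scores["agg_pqs_effectiveness"] = scores[["sr_score", "os_score"]].mean(axis=1)

# Section 4.8: PQS Excellence combines effectiveness and efficiency; an
# unweighted average is assumed here purely for illustration.
scores["pqs_excellence"] = scores[
    ["agg_pqs_effectiveness", "pqs_efficiency"]
].mean(axis=1)
print(scores.round(3))
```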



5 ANALYSIS APPROACH



5.1 General Approach

The conceptual background of this research is based on the overall system-based understanding of Ulrich, Dyllick, and Probst (1984), providing a holistic view on the unit of analysis to enable a better understanding of problems from practice.

The complexity of the system is accepted and the idea of total control is abandoned. All elements of a system are seen as interrelated, and together they influence the overall system performance. Single aspects of the system are not analyzed in isolation; rather, all analysis is conducted from a system perspective. The isolation of a single element does not have the power to provide a better understanding of the overall system performance (Friedli, 2006; Ulrich et al., 1984).

A descriptive model is used to better illustrate the overall Pharmaceutical Production System Model (see chapter 4). This allows scholars and practitioners to come to an easier understanding of the analyzed system.

5.2 Analysis and Statistical Tools

For the detailed analysis of the PPSM the research team used different types of statistical analysis tools appropriate to convey a good understanding of the interrelations between different elements of the PPSM.

Table 12 provides an overview of these tools together with a short description of the power of each analysis. For further reading we refer to Dixon and Massey (1992), Eckstein (2016), Huizingh (2007) and Abramowitz and Weinberg (2008).
Tool                               Description

(Pearson) Correlation              » The correlation analysis helps to understand the relationship between two
                                     individual variables. It shows the strength and direction (positive/negative)
                                     of the relation. By using the Pearson correlation a linear relationship
                                     between the two variables is assumed (Abramowitz & Weinberg, 2008;
                                     Huizingh, 2007).
                                   » The correlation coefficient shows the degree of correlation. The higher this
                                     value, the stronger the relation between the two variables. Significance at
                                     the 0.01 level means that the false rejection probability of hypothesis H0
                                     (no significant correlation) amounts to 1% (Abramowitz & Weinberg, 2008).
                                   » Correlation does not mean causation. No cause-effect relationship can be
                                     disclosed (Abramowitz & Weinberg, 2008).

T-Test                             » A t-test compares two groups to determine whether the mean of a specific
                                     variable is equal across the two groups. Consequently, it can be identified
                                     whether there is a significant difference between the means and which
                                     group has the higher value (Abramowitz & Weinberg, 2008; Huizingh, 2007).

(Multiple) Linear Regression       » Multiple linear regression is the concept of a linear equation that predicts
(MLR)                                the values of a target variable (dependent variable, DV) from predictors
                                     (independent variables, IV). In contrast to a correlation analysis, regression
                                     analysis assumes that the IVs cause the DV (causal relationship); however,
                                     this causal relationship has to be concluded from theory by the research
                                     team (Abramowitz & Weinberg, 2008; Eckstein, 2016; Huizingh, 2007).
                                   » For one IV and one DV we talk about linear regression; for two or more IVs,
                                     about multiple linear regression.
                                   » Method 1 (default method): Enter (all IVs are simultaneously entered into
                                     the regression). Method 2: Stepwise Forward Selection (based on statistical
                                     criteria the IVs are entered in sequence into the regression).
                                   » Method 3: Backward Selection (all independent variables are entered into
                                     the equation first, and each one is deleted one at a time if it does not
                                     contribute to the regression equation).
                                   » There are further methods that were not used in this research (Abramowitz
                                     & Weinberg, 2008; Huizingh, 2007).

Scatter Plot                       » A scatter plot visualizes the interrelation between two variables. Both
                                     coordinates, x-axis and y-axis, can comprise one specific metric or an
                                     aggregated score. Generally (x) is seen as an impact factor for (y)
                                     (Huizingh, 2007).
                                   » A regression line can be drawn into the scatter plot to illustrate the
                                     aggregated relationship based on all individual data points.

Scatter Plot with Moderator /      » Analogous to the scatter plot.
Grouped Scatter Plot               » The moderator is a third variable used to distinguish, via colors, the
                                     different samples/groups that are plotted (Huizingh, 2007).

Table 12: Statistical Tools used
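As a concrete illustration of the first and third tools in Table 12, the sketch below computes a Pearson correlation with its p-value and fits a multiple linear regression with the "Enter" method (all independent variables entered at once). The analyses in this report were run in SPSS; this Python equivalent uses synthetic data and illustrative variable names only.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(0)
os_score = rng.uniform(0.2, 0.8, 200)                    # stand-in for OS Score
sr_score = rng.uniform(0.2, 0.8, 200)                    # stand-in for SR Score
otif = 0.6 + 0.4 * os_score + rng.normal(0, 0.05, 200)   # stand-in for OTIF

# Pearson correlation: strength and direction of a linear relation; the
# p-value tests H0 "no significant correlation".
r, p = stats.pearsonr(os_score, otif)
print(f"r={r:.3f}, p={p:.4f}")

# MLR, "Enter" method: both IVs entered simultaneously, OTIF as the DV.
X = sm.add_constant(np.column_stack([os_score, sr_score]))
fit = sm.OLS(otif, X).fit()
print(fit.rsquared, fit.pvalues)
```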



6 FINDINGS



6.1 Summary

The first section of this chapter provides an overview of the findings of the many statistical analyses performed to date as part of this research project. Further detail on each of these analyses and the associated findings can be found in the sections indicated in Table 13.

6.2.1  Service Level Delivery (OTIF) as a Surrogate for PQS Effectiveness
       » Service Level Delivery (OTIF) is deemed to be a good surrogate for PQS Effectiveness
         measured by the aggregated PQS Effectiveness Score

6.2.2  Inventory-Stability Matrix
       Analysis I: Inventory - OTIF
       » A high level of operational stability seems to be the major lever to achieve high levels of
         Service Level Delivery
       » A high level of inventories may compensate for stability issues
       Analysis II: Rejected Batches - OTIF
       » Sites with low operational stability show significantly higher levels of Rejected Batches
       » Sites with high levels of Rejected Batches and low inventory show a comparably low level
         of Service Level Delivery
       » Sites with high levels of Rejected Batches and high inventory have a similar level of
         Service Level Delivery as sites with few Rejected Batches
       » Inventory mitigates the negative effect of high levels of Rejected Batches on the Service
         Level Delivery (OTIF)
       Analysis III: Rejected Batches - Customer Complaint Rate
       » Sites with low stability and low inventory show a weak performance for both metrics,
         Rejected Batches and Customer Complaint Rate
       » Sites with low stability and high inventory also show higher levels of Rejected Batches
         than the high stability groups, but demonstrate a Customer Complaint Rate level similar
         to the high stability groups
       » Mitigating effect of inventory on the impact of low operational stability and a high level
         of Rejected Batches on the Customer Complaint Rate

6.2.3  Moderating Effects
       Analysis I: Rejected Batches - OTIF, Moderator: Days on Hand
       » A high level of inventory reduces the negative impact of Rejected Batches on the Service
         Level Delivery level
       Analysis II: Rejected Batches - OTIF, Moderator: Make-to-Strategy
       » Sites with a Make-to-Order (MtO) production strategy are less capable of mitigating the
         negative impact of Rejected Batches on the Service Level Delivery level
       Analysis III: Operational Stability - OTIF, Moderator: Make-to-Strategy
       » Make-to-Order (MtO) sites demonstrate a lower level of PQS Effectiveness (OTIF) when
         there is a lower level of Operational Stability
       » Make-to-Stock (MtS) sites do not show this relationship

6.2.4  Impact of C-categories on PQS Effectiveness
       » MLR demonstrates, for different entering methods, an elevated impact of the metrics Lot
         Acceptance Rate (1 - Rejected Batches) and Scrap Rate as predictors for PQS Effectiveness
         (OTIF)

6.2.6  CAPA Effectiveness
       » A highly significant correlation is only detectable between the metric Number of
         non-critical overdue CAPAs and PQS Effectiveness (OTIF)
       » The Pearson correlation coefficient is -.810, indicating a strongly negative correlation

6.3    PQS Effectiveness and Efficiency
       Overall Sample
       » Pharmaceutical manufacturing sites with a higher PQS Effectiveness have the tendency to
         also show a higher PQS Efficiency
       » However, it has to be noted that the degree of determination of 11% is rather limited
       Sub-sample of high stability, low inventory sites
       » Stronger relationship between PQS Effectiveness and PQS Efficiency for sites in the high
         stability, low inventory group compared to the overall sample
       » Degree of determination increased to 25%

6.4    Customer Complaint Rate
       Customer Complaint Rate and PQS Effectiveness
       » Customer Complaint Rate High Performers (peer-group with a low CCR) have a significantly
         higher aggregated PQS Effectiveness Score compared to the Customer Complaint Rate Low
         Performers
       Customer Complaint Rate and PQS Effectiveness for DS/DP Split
       » A higher Customer Complaint Rate is accompanied by a lower aggregated PQS Effectiveness
         Score
       » For drug substance sites this relationship is stronger than for drug product sites
       Customer Complaint Rate and Rejected Batches moderated by Operational Stability
       » Operational Stability High Performers have both a low level of Customer Complaints and a
         low level of Rejected Batches

6.5    Cultural Excellence
       Quality Maturity and Quality Behavior
       » High Quality Maturity is accompanied by a high degree of Quality Behavior
       Top-10 Quality Maturity Attributes driving Quality Behavior
       » A special focus of the Top-10 Maturity Attributes includes the use of standardization,
         visualization and best-practice sharing
       Cultural Excellence as the foundation for PQS Effectiveness
       » PQS Effectiveness High Performers have a significantly higher implementation level of
         Cultural Excellence compared to the PQS Effectiveness Low Performers
       St.Gallen OPEX Enablers and Operational Stability
       » For most St.Gallen OPEX Enabler (sub)categories the Operational Stability High Performers
         have a significantly higher level of implementation compared to the Operational Stability
         Low Performers
       » The category Total Quality Management (TQM), however, does not show a significantly
         different implementation level for the two peer-groups (only the same or a slightly
         better implementation level)

Table 13: Findings overview

Group Statistics

                    OTIF Peer   N    Mean    Std. Deviation   Std. Error Mean
PQS Effectiveness   HP          26   .5508   .15485           .03037
                    LP          25   .4597   .13059           .02612

Table 14: Differences of mean of (aggregated) PQS Effectiveness (Score) for OTIF HP and OTIF LP

Independent Samples Test (t-test for Equality of Means)

                                              F       Sig.   t       df       Sig.         Mean     Std. Error   Lower    Upper
                                                                              (2-tailed)   Diff.    Diff.
PQS Effectiveness   Equal variances assumed   1.392   .244   2.268   49       .028         .09117   .04019       .01040   .17194
                    Equal variances not
                    assumed                                  2.276   48.194   .027         .09117   .04006       .01064   .17170

Table 15: T-test for equality of means of (aggregated) PQS Effectiveness (Score) between OTIF HP and OTIF LP
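Tables 14 and 15 follow the standard SPSS independent-samples layout: Levene's test for equality of variances (the F / Sig. columns), followed by the t-test under both variance assumptions. A minimal sketch of the same computation, using synthetic groups whose moments roughly match Table 14 (the data are illustrative, not the study sample):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
hp = rng.normal(0.55, 0.155, 26)  # OTIF High Performer group (cf. Table 14)
lp = rng.normal(0.46, 0.131, 25)  # OTIF Low Performer group

# Levene's test for equality of variances (F / Sig. columns of Table 15).
f_stat, p_lev = stats.levene(hp, lp)

# Both SPSS rows: equal variances assumed, and not assumed (Welch's t-test).
t_eq, p_eq = stats.ttest_ind(hp, lp, equal_var=True)
t_w, p_w = stats.ttest_ind(hp, lp, equal_var=False)

print(f"Levene: F={f_stat:.3f}, p={p_lev:.3f}")
print(f"t (equal var): t={t_eq:.3f}, p={p_eq:.3f}")
print(f"t (Welch):     t={t_w:.3f}, p={p_w:.3f}")
```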



6.2 Analysis: PQS Effectiveness

This section focuses on analyses that are directly related to the effectiveness of the PQS.

6.2.1 Service Level Delivery (OTIF) as Surrogate for PQS Effectiveness

6.2.1.1 Motivation and Objectives

One of the overall objectives of this research project was to assess the impact of individual metrics on the related category (e.g. Rejected Batches on Operational Stability) and subsequently to assess the impact of the three PPSM C-categories Supplier Reliability, Operational Stability and Lab Quality & Robustness on the PPSM D-categories of PQS Effectiveness and PQS Efficiency.

In order to conduct statistical analysis, a dependent and an independent variable are usually required. Unfortunately there was no preexisting independent variable for PQS Effectiveness available. Calculating an aggregated PQS Effectiveness Score based on the category scores (SR Score and OS Score) is possible in principle, but is of no use for statistical analysis. This is due to the fact that the scores comprise entirely, or at least partly, the same metrics, so that the statistical relation between the individual performance metrics and the aggregated PQS Effectiveness Score is defined by the formula used to calculate the aggregated PQS Effectiveness Score rather than describing the actual relationship.

This situation resulted in the necessity to identify a suitable surrogate metric for the aggregated PQS Effectiveness Score for use in statistical analysis. The core team defined two basic requirements for the surrogate metric:

1. First, from a theoretical perspective, the surrogate metric has to assess the same, or at least very similar, aspects as the term PQS Effectiveness does: the ability of a PQS to effectively deliver high quality drugs when they are needed and in the quantity in which they are needed.
2. Second, the surrogate metric has to show a similar distribution among the production sites as the aggregated PQS Effectiveness Score; sites that perform well/badly regarding the surrogate metric should therefore also perform well/badly regarding the aggregated PQS Effectiveness Score.

The first requirement resulted in the identification of the metric Service Level Delivery (OTIF)17, 18 as a good surrogate from a theoretical perspective. OTIF stands for 'On-Time and In-Full'. In-full implies both within the specifications ("the right quality") and in the full amount ordered ("the right quantity"). Combining the three aspects of the right time, the right quality and the right quantity, this metric appears to be a good surrogate for the aggregated PQS Effectiveness Score.

The statistical validation of the second requirement is discussed in the next section.

6.2.1.2 Approach and Sample

The second requirement demands that good/bad OTIF performers are also good/bad performers regarding the aggregated PQS Effectiveness Score. In order to test this hypothesis a t-test for equality of means was selected as the statistical tool. For the comparison of means of the aggregated PQS Effectiveness Score the following two subgroups were derived from the overall sample:

The OTIF High Performer Group (OTIF HP) consists of the 10% best performing sites for OTIF: N(OTIF HP) = 26. The OTIF Low Performer Group (OTIF LP) consists of the 10% worst performing sites for OTIF: N(OTIF LP) = 25.

A significant difference between the aggregated PQS Effectiveness Scores, along with a higher value for the OTIF HP, would support the suggestion that Service Level Delivery (OTIF) is a good surrogate for PQS Effectiveness.

6.2.1.3 Results

Performing a t-test for equality of means reveals that the OTIF High Performer Group has a significantly (t-test p-value = 0.027) higher value of the aggregated PQS Effectiveness Score (average for OTIF HP = 55% compared to OTIF LP = 46%). This confirms the hypothesis that good OTIF performers are, on average, good aggregated PQS Effectiveness Score performers.

6.2.1.4 Implications

Combining the good fit of PQS Effectiveness and the metric Service Level Delivery (OTIF) from a theoretical perspective with the results of the t-test is a strong indicator that OTIF is a good surrogate for PQS Effectiveness as measured by the aggregated PQS Effectiveness Score.

Consequently, the research team used OTIF in the research project as a surrogate, when required, for the PQS Effectiveness Score. If, for any reason, the aggregated PQS Effectiveness Score is used instead, it is clearly marked as such.

6.2.2 Inventory-Stability Matrix (ISM)

6.2.2.1 Motivation and Objectives

When the metric Service Level Delivery (OTIF) was introduced as a surrogate metric for the PQS Effectiveness Score, a lively debate was triggered among the team. It is generally agreed that OTIF is a good indicator for the effectiveness of the PQS system, that is, to provide the right drugs at the right quality in the right amount at the right time. However, discussion emerged around the question of whether OTIF is also a good indicator for the stability of a pharmaceutical production system or whether a high level of delivery capability could also be achieved through high inventories.

Based on this discussion the research team developed the idea of splitting the overall sample into four distinct groups based on the dimensions of Stability and Inventory, with the objective of identifying distinct features of the groups and significant differences between them.

17. Definition of Service Level Delivery (OTIF): perfect order fulfillment (percentage of orders shipped on time from a site (+/- 1 day of the agreed shipment day), in the right quantity (+/- 3% of the agreed quantity) and in the right quality) to its customer.
18. OTIF is used as a synonym for Service Level Delivery (OTIF).
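The peer-group construction of section 6.2.1.2 (top and bottom deciles of OTIF, compared on the aggregated PQS Effectiveness Score) translates into a few lines of code. A sketch with synthetic data and illustrative column names:

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "otif": rng.uniform(0.6, 1.0, 260),
    "agg_pqs_eff": rng.uniform(0.2, 0.8, 260),
})

# OTIF HP / LP: the 10% best and 10% worst performing sites for OTIF.
hp = df[df["otif"] >= df["otif"].quantile(0.90)]
lp = df[df["otif"] <= df["otif"].quantile(0.10)]

# t-test for equality of means of the aggregated PQS Effectiveness Score.
t, p = stats.ttest_ind(hp["agg_pqs_eff"], lp["agg_pqs_eff"], equal_var=False)
print(f"HP mean={hp['agg_pqs_eff'].mean():.3f}, "
      f"LP mean={lp['agg_pqs_eff'].mean():.3f}, t={t:.2f}, p={p:.3f}")
```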



6.2.2.2 Approach and Sample

Figure 6 visualizes the concept of splitting the sample into four groups along the dimensions Stability and Inventory. The 2 x 2 matrix is referred to as the Inventory-Stability Matrix (ISM).

The dimension Stability is operationalized by the Operational Stability Score (see section 4.4). Sites that have an over-median value for the OS Score are categorized in Group 1 or 2.

The dimension Inventory is operationalized by the metric Days on Hand (DOH)19. Sites that have an over-median value (30 days) for DOH are categorized in Group 2 or 4.

Sample:
The concept leads to four distinct groups, drawn from the overall sample, as shown in Table 16.

Note on the sample used: the basic sample comprises all 336 sites available from the St.Gallen OPEX Benchmarking database at the start of the research project. In order to assign any given site to one of the four ISM groups, values for Days on Hand and the OS Score are needed based on the criteria given above. In total, 204 sites have been assigned to the ISM groups as shown in Table 16.

Implementation
The result of implementing the ISM concept in MS Excel is shown in Figure 7. The larger point per group indicates the average value of all sites within that group. The large blue point, for instance, represents the average Stability level (OS Score) and the average Inventory level, measured by the absolute number of Days on Hand (DOH_abs), for Group 4. The first figure per average point indicates the value according to the x-axis, the second figure the value according to the y-axis.

Besides generating diagrams, such as Figure 9, the Excel tool provides a detailed overview of the average, the median value, the value of the 0.75 percentile, the value of the 0.25 percentile as well as the rank within the four ISM groups for the selected metric (see Table 17, left side).

A second table provides an overview of the differences of the means between the four groups. The difference between Group i and Group j is defined as the absolute difference of the two group means, divided by the mean of the overall sample:

Diff = |Mean(Group i) - Mean(Group j)| / Mean(Overall Sample)

Along with calculating the difference as defined above, the Excel tool calculates a t-test between the two samples. If the difference between two groups is highlighted in green (see Table 17, right side), the groups' mean values show a significant difference (p-value below 0.05).

6.2.2.3 Results

6.2.2.3.1 Service Level Delivery and Level of Inventory

The first analysis assesses the relationship between the level of Inventory and the Service Level Delivery (OTIF) level.

Figure 8 shows that Group 3 has the lowest level of Service Level Delivery. According to Table 17 the average value of Service Level Delivery of Group 3 is significantly lower (p-value < 0.05) than the average value of the other groups. Group 4 also has a low stability but, in contrast to Group 3, a higher level of safety stock. Table 17 shows that the performance of Group 4 regarding Service Level Delivery is significantly better than the performance of Group 3. This indicates that having a high level of inventory compensates for a lack of operational stability concerning the ability to provide drugs on time and in the right quality. The second best performance is achieved by Group 2. Interestingly, the very best performance regarding Service Level Delivery is achieved by Group 1, even with low inventory. This can be explained by the very high level of operational stability of this group (OS Score (Group 1) = 66% > OS Score (Group 2) = 64%).

Summary: A high level of operational stability (OS Score) is the major lever to achieve high levels of Service Level Delivery. While high levels of inventories may compensate for stability issues, the inherent risks introduced by low stability present a threat to the organization's ability to consistently meet market demand, on time, in full. The combination of low stability and low inventory, as represented by Group 3, results in a lower capability to deliver on time.

The following two analyses focus on the metrics proposed by the revised FDA Draft Guidance (FDA, 2016): firstly, the Lot Acceptance Rate and, secondly, the Customer Complaint Rate.

6.2.2.3.2 Service Level Delivery and Lot Acceptance Rate

This analysis assesses the position of the four groups with respect to the FDA proposed metric Lot Acceptance Rate, represented by the metric Rejected Batches, and the PQS Effectiveness surrogate Service Level Delivery (OTIF).

Figure 9 shows that Group 3 and Group 4 have the highest levels of Rejected Batches. Both groups are characterized by a low level of stability. In comparison, Groups 1 and 2, which have a higher level of operational stability, reveal a significantly20 lower level of Rejected Batches. Figure 9 does not support the assumption that a high level of Rejected Batches is directly linked with a low level of Service Level Delivery, as Group 4, which has a higher level of Rejected Batches, still achieves OTIF values very similar to the high stability groups. However, the weak performance of Group 3 indicates a strong link between Rejected Batches and poorer Service Level Delivery when no inventory is available. This analysis demonstrates evidence of the buffering or masking effect of inventory on poor performance.

6.2.2.3.3 Customer Complaint Rate and Lot Acceptance Rate

The third analysis assesses the position of the four groups regarding the two metrics Rejected Batches and Customer Complaint Rate.

In the course of the project, discussions came up on the question whether or not the metrics Rejected Batches and Customer Complaint Rate are redundant and, as a consequence, whether collecting and reporting both metrics provides little additional value compared to asking for only one of them.

Figure 10 shows (as also seen in Figure 9) a significantly higher level of Rejected Batches for the low stability Groups 3 and 4, regardless of inventory. However, the performance of these two groups differs in the comparison regarding the level of customer complaints. Group 4 actually shows a similar level of customer complaints as Groups 1 and 2, even though its level of Rejected Batches is more than double that of the high stability groups. Within the high stability groups, even though Group 2 experiences a higher level of Rejected Batches

19. Days on Hand (DOH): average inventory less write-downs x 365, divided by the 'Cost of Goods Sold'.
20. Difference in mean between Group 2 and Group 4. P-value of t-test is 0.00 < 0.05.



Figure 6: Inventory-Stability Matrix (ISM) — 2 x 2 matrix spanned by Stability (y-axis, low to high)
and Inventory (x-axis, low to high): Group 1 = high stability, low inventory; Group 2 = high
stability, high inventory; Group 3 = low stability, low inventory; Group 4 = low stability, high
inventory

Group   N     Level of stability -    Level of inventory -
              Average [%]             Average value (DOH) [Days]
1       50    High    66%             Low     12
2       59    High    62%             High    103
3       47    Low     37%             Low     14.25
4       48    Low     40%             High    127
ALL     204           50%                     30

Table 16: Average stability and inventory of four ISM-groups

Figure 7: Inventory-Stability Matrix Excel


Figure 8: ISM: Level of inventory vs. Service Level Delivery (x-axis: Days on Hand, absolute;
y-axis: Service Level Delivery, absolute)



Metric: Service Level Delivery (OTIF)

Overview Values
           N     Average   Median   High 25%   Low 25%   Rank
Group 1    48    94%       97%      99%        95%       1
Group 2    44    93%       97%      100%       89%       2
Group 3    41    87%       91%      97%        80%       4
Group 4    39    92%       96%      99%        86%       3
All        266   92%       96%      99%        90%
Average          91%       95%      99%        87%

Overview Differences between Groups (Dif. / Avr.)
           Group1   Group2   Group3   Group4
Group1     -        1.60%    8.20%    2.40%
Group2     1.60%    -        6.60%    0.80%
Group3     8.20%    6.60%    -        5.80%
Group4     2.40%    0.80%    5.80%    -

If a cell is highlighted in green, the p-value of the t-test on equality of means is below 0.05;
the difference between the groups is therefore significant.

Table 17: ISM Overview Service Level Delivery (OTIF)
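Both panels of Table 17 — the per-group summary statistics and the pairwise difference/significance matrix, including the Diff formula from section 6.2.2.2 — can be reproduced with a short script. A sketch with synthetic data; column names and distributions are illustrative:

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "os_score": rng.uniform(0.2, 0.8, 204),  # Operational Stability Score
    "doh": rng.gamma(2.0, 30.0, 204),        # Days on Hand
    "otif": rng.uniform(0.6, 1.0, 204),      # Service Level Delivery
})

# Median splits on both dimensions define the four ISM groups (Figure 6).
high_os = df["os_score"] > df["os_score"].median()
high_inv = df["doh"] > df["doh"].median()
df["group"] = np.select(
    [high_os & ~high_inv, high_os & high_inv, ~high_os & ~high_inv],
    [1, 2, 3],
    default=4,
)

# Left panel: per-group summary statistics.
print(df.groupby("group")["otif"].describe(percentiles=[0.25, 0.75]))

# Right panel: normalized mean difference and t-test per group pair.
overall_mean = df["otif"].mean()
for i, j in combinations([1, 2, 3, 4], 2):
    a = df.loc[df["group"] == i, "otif"]
    b = df.loc[df["group"] == j, "otif"]
    diff = abs(a.mean() - b.mean()) / overall_mean  # Diff formula (6.2.2.2)
    _, p = stats.ttest_ind(a, b, equal_var=False)
    print(f"Group {i} vs {j}: diff={diff:.2%}, significant={p < 0.05}")
```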

Figure 9: ISM: Rejected Batches vs. Service Level Delivery

Figure 10: ISM: Rejected Batches vs. Customer Complaint Rate

Figure 11: Moderating Effect Approach — left: regression line of metric A (DV, y-axis) on metric B
(predictor, x-axis) over the whole sample; right: moderated regression lines for sites with metric
C above the average [1] and below the average [2], checked for interaction effects



than Group 1, it achieves the best performance for the metric Customer Complaint Rate. Considering the hypothesis examined in this analysis, of whether to collect one or both of the metrics Rejected Batches and Customer Complaint Rate, Group 3 (low stability, low inventory) is the only group that consistently displays poor performance in both measures. For the other three groups there are distinguishing features in their performance on both measures.

These observations indicate a mitigating effect of inventory on the impact of low operational stability, coupled with high levels of Rejected Batches, on the Customer Complaint Rate.

A potential explanation for the described observation is as follows. Assume a scenario of a production facility which operates with a high level of safety stock. In the event of a high number of rejected batches occurring, site managers have the option to fulfill their deliveries to the customers fully or partly from safety stocks. Therefore, they have no incentive to release batches under time pressure. Quite to the contrary, managers of sites with low inventory facing instability and increased rejects of batches do not have the option to fulfill delivery obligations from stock and may be incentivized to close process deviations or compliance non-conformances faster, without concluding true root causes for defects, in order to release batches to the market. Subsequently, post-market quality defects are discovered by the customers and reported.

In conclusion, the data indicates that Rejected Batches and Customer Complaint Rate are not redundant metrics, as the latter is also dependent on the inventory level whereas the former is not directly. Examining Rejected Batches therefore is a good indicator of underlying operational stability, whereas an examination of the Customer Complaint Rate is more complex, as both inventory levels and operational stability could have an impact on performance.

6.2.3 Moderating Effects

6.2.3.1 Motivation and Objectives

The objective of this analysis is to identify moderating effects of structural factors on the relationship between PQS Effectiveness, represented by Service Level Delivery (OTIF), and Rejected Batches. Structural factors are context information about the sites that is collected together with KPIs and Enablers within the St.Gallen OPEX Questionnaire21.

6.2.3.2 Approach and Sample

The analysis approach is visualized in Figure 11 and described in principle below.

The plot on the left-hand side shows an (X,Y) diagram with the predictor metric B on the x-axis and the dependent variable metric A on the y-axis. The plot presents a regression that is calculated based on the whole sample. On the right side, the sample is further divided based on the value of a third, moderating metric C. The blue regression line represents sites with a metric C value above the sample average, whereas the yellow line represents sites with metric C values below the sample average.

For example, in sites with older equipment the impact of metric B on the dependent variable metric A may be higher compared to sites with newer equipment (where metric C = age of equipment).

The sample depends on the metrics selected and only includes sites with values for metrics A-C.

6.2.3.3 Results

6.2.3.3.1 Relationship of Rejected Batches and Service Level Delivery moderated by Level of Inventory

The first analysis using this moderating effect method addresses the impact of inventory, measured by the metric Days on Hand (DOH), on the relationship between the level of Rejected Batches and the Service Level Delivery performance. The findings are in line with the observations from the Inventory-Stability Matrix analysis described previously.

Figure 12 shows the relationship between the level of Rejected Batches (x-axis) and the Service Level Delivery (y-axis). The sample is divided into three groups based on the level of inventory held. The group DOH = low contains sites with a Days on Hand level between 1 and 22; sites with values between 22 and 78 days are grouped in the group DOH = medium; all sites with more inventory than 78 days belong to the group DOH = high.

The following three observations can be made from Figure 12:

1. For plants with a low DOH value (blue), on average a higher level of Rejected Batches leads to a lower level of Service Level Delivery (R2=0.048)
2. For plants with a medium DOH value (green), on average a higher level of Rejected Batches leads to a lower level of Service Level Delivery (R2=0.023)
3. For plants with a high DOH value (yellow), on average a higher level of Rejected Batches does not result in a lower level of Service Level Delivery (R2=0.009)

In conclusion, the observations indicate that a high level of inventory reduces the negative impact of Rejected Batches on Service Level Delivery. This finding is in line with the conclusion of the Inventory-Stability Matrix analysis, which also demonstrated a mitigating effect of inventory.

21. Cf. Appendix 1.1: Questions and Definitions from St.Gallen OPEX Report – Structural Factors
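The moderated regressions behind Figure 12 — one regression line per DOH band — can be sketched as follows. The DOH band edges are taken from the text above, while the data and column names are synthetic stand-ins:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "rejected_batches": rng.uniform(0.0, 0.05, 180),
    "doh": rng.gamma(2.0, 30.0, 180),
})
df["otif"] = 0.97 - 1.2 * df["rejected_batches"] + rng.normal(0, 0.03, 180)

# DOH bands as in the text: 1-22 days (low), 22-78 (medium), >78 (high).
df["doh_band"] = pd.cut(df["doh"], bins=[0, 22, 78, np.inf],
                        labels=["low", "medium", "high"])

# One simple regression of OTIF on Rejected Batches per moderator band.
for band, grp in df.groupby("doh_band", observed=True):
    X = sm.add_constant(grp["rejected_batches"])
    fit = sm.OLS(grp["otif"], X).fit()
    print(f"DOH={band}: slope={fit.params.iloc[1]:.2f}, R2={fit.rsquared:.3f}")
```

A flattening slope (and shrinking R²) in the higher-inventory bands is what the mitigating effect described above would look like in this output.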



Figure 12: Inventory Effect on Relationship Rejected Batches vs. Service Level Delivery (x-axis:
Rejected Batches, absolute; y-axis: Service Level Delivery, absolute)

                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Make-to-Stock   69          20.7      50.7            50.7
          Make-to-Order   67          20.1      49.3            100.0
          Total           136         40.7      100.0
Missing   System          198         59.3
Total                     334         100.0

Table 18: Overview Production Strategy


Figure 13: Effect of selected Production Strategy on Relationship Rejected Batches vs. Service
Level Delivery (x-axis: Rejected Batches, absolute; y-axis: Service Level Delivery, absolute)



6.2.3.3.2 Relationship of Rejected Batches and Service Level Delivery moderated by Make-to-Strategy

The second analysis focuses on the question whether or not the production strategy at the site has a moderating impact on the relationship between the level of Rejected Batches and the Service Level Delivery. For this analysis the two production strategies Make-to-Order (MtO) and Make-to-Stock (MtS) are compared. In order to assign the sites from the overall sample to either the MtO Group or the MtS Group, the value of the following Enabler item was considered:

"We mainly produce one unit when the customer orders one. We normally do not produce to stock."

The Likert scale ranges from one ("Not at all") to five ("Completely"). Sites indicating four or five are assigned to the MtO Group; sites indicating one or two are assigned to the MtS Group. Table 18 provides an overview of how many sites are assigned to the two groups.

Figure 13 shows the relationship between the level of Rejected Batches (x-axis) and the Service Level Delivery (OTIF) (y-axis), moderated by the selected production strategy.

The following two observations can be made from Figure 13:

1. For plants with an MtS strategy, on average a higher level of Rejected Batches does not impact the level of OTIF (R2=0.020)
2. For plants with an MtO strategy, on average a higher level of Rejected Batches leads to a lower level of OTIF (R2=0.098)

In conclusion, the observations indicate that sites with a Make-to-Order (MtO) production strategy are less capable of mitigating the negative impact of Rejected Batches on the Service Level Delivery. This finding supports the prior outcomes indicating a compensation effect of inventory on instabilities in the production system.

6.2.3.3.3 Relationship of Operational Stability and Service Level Delivery moderated by Make-to-Strategy

The third analysis has a similar focus as the second one. Instead of considering Rejected Batches as the predictor variable on the x-axis, this analysis evaluates the implication of the two production strategies Make-to-Order (MtO) and Make-to-Stock (MtS) on the relationship between Operational Stability (x-axis) and Service Level Delivery (labeled OTIF Abs. in Figure 14).

The following two observations can be made from Figure 14:

1. Make-to-Order sites show an increased PQS Effectiveness (OTIF) with an increased Operational Stability
2. Make-to-Stock sites do not show this relationship. Make-to-Stock sites have a high PQS Effectiveness (OTIF) independent of their Operational Stability.

In conclusion, the observations show that Make-to-Order (MtO) sites demonstrate a lower level of PQS Effectiveness (OTIF) when there is a lower level of Operational Stability. Make-to-Stock (MtS) sites do not show this relationship.

6.2.4 Impact of C-categories on PQS Effectiveness

6.2.4.1 Motivation and Objectives

After having identified the metric Service Level Delivery (OTIF) as a surrogate for PQS Effectiveness, the research team was able to evaluate the impact of the C-categories Supplier Reliability and Operational Stability on the effectiveness of the PQS system.

Supplier Reliability is operationalized with the SR Score and Operational Stability with the OS Score (see chapter 4). The third C-category, Lab Quality & Robustness, has not been included in the analysis as no data points have been available to date.

6.2.4.2 Approach and Sample

The approach included the application of two statistical tools. Firstly, a correlation analysis including the SR Score, the OS Score and the metric Service Level Delivery (OTIF). The sample size ranges from 252 sites for the bivariate correlation of OTIF and the SR Score to 303 sites for the bivariate correlation of the SR Score and the OS Score.

Secondly, a multiple linear regression (MLR) with the metric Service Level Delivery (OTIF) as dependent variable and the OS Score and the SR Score as independent predictor variables.

6.2.4.3 Results

Correlation Analysis:
Table 19 reveals a highly significant (0.01 level) correlation between the Operational Stability Score and the PQS Effectiveness surrogate Service Level Delivery (OTIF) (p=0.01). In contrast, no significant correlation can be identified between the Supplier Reliability Score and Service Level Delivery (OTIF) (p=0.315). However, a highly significant correlation is shown between the Supplier Reliability Score and the Operational Stability Score (p=0.003).

The correlation analysis therefore indicates that the category Operational Stability is of highest importance as a contributing factor or predictor of the effectiveness of the PQS system. The reliability of suppliers appears to have a more indirect impact on PQS Effectiveness, as it is correlated with Operational Stability but not directly with PQS Effectiveness.

Multiple Linear Regression (MLR):
In the multiple linear regression analysis the Operational Stability Score and the Supplier Reliability Score serve as independent variables and Service Level Delivery (OTIF) as the dependent variable. As the entering method, "Enter" was selected in SPSS. The MLR model itself is significant at the 0.05 level (p=0.028), with an R Square of 29%22. On the level of the independent predictor variables, the Operational Stability Score is significant at the 0.05 level (p=0.011), supporting the first result of the correlation analysis, whereas the Supplier Reliability Score is not found to be significant as a predictor variable (p=0.738).

In a second regression the Supplier Reliability Score serves as the single predictor variable and the Operational Stability Score as the dependent variable. This analysis demonstrates a significant (p=0.003) and positive (beta = +0.169) impact of the SR Score on the OS Score, confirming the result from the correlation analysis.

In conclusion, there is statistical evidence that Operational Stability is a major contributing factor for achieving a high level of PQS Effectiveness, whereas Supplier Reliability is not directly linked. However, both analyses indicate that a high Supplier Reliability has a positive influence on Operational Stability, thus exerting influence on the PQS system more indirectly.
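Because not every site reports every metric, each bivariate correlation behind Table 19 has its own sample size (252 to 303 sites). A sketch of this pairwise-deletion approach, with synthetic data standing in for the St.Gallen sample:

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(6)
n = 336
df = pd.DataFrame({
    "sr_score": rng.uniform(0.0, 1.0, n),
    "os_score": rng.uniform(0.0, 1.0, n),
    "otif": rng.uniform(0.6, 1.0, n),
})
# Simulate incomplete reporting: some sites do not provide OTIF values,
# which is why each variable pair ends up with its own sample size.
df.loc[rng.choice(n, 80, replace=False), "otif"] = np.nan

for a, b in [("os_score", "otif"), ("sr_score", "otif"), ("sr_score", "os_score")]:
    sub = df[[a, b]].dropna()  # pairwise deletion per variable pair
    r, p = stats.pearsonr(sub[a], sub[b])
    print(f"{a} vs {b}: n={len(sub)}, r={r:.3f}, p={p:.3f}")
```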



Figure 14: Effect of selected Production Strategy on Relationship Operational Stability vs. Service
Level Delivery (x-axis: Operational Stability Score; y-axis: Service Level Delivery (OTIF),
absolute; Make-to-Stock: y = 0.96 + 9.64E-4*x, R2 Linear = 7.728E-6; Make-to-Order:
y = 0.77 + 0.21*x, R2 Linear = 0.111)

Table 19: Correlation Analysis SR, OS, OTIF

Independent Variables

  Metrics from category Supplier Reliability:
    Customer Complaint Rate (supplier); Service Level Supplier

  Metrics from category Operational Stability:
    Unplanned Maintenance; Scrap Rate; Customer Complaint Rate; Yield;
    Deviation Closure Time; OEE Average; Release Time

Dependent Variable

  Service Level Delivery (OTIF)

Table 20: Metrics included in MLR



6.2.5 Impact of Performance Metrics on PQS Effectiveness

6.2.5.1 Motivation and Objectives

The objective of this analysis is to evaluate the impact of single key performance indicators (KPIs) from the categories Supplier Reliability and Operational Stability on the PQS Effectiveness surrogate Service Level Delivery (OTIF).

6.2.5.2 Approach and Sample

In order to analyze the impact of individual metrics a multiple linear regression is applied. SPSS offers multiple techniques to enter the metrics into the MLR model. For this analysis the entering techniques Enter, Stepwise selection and Backward selection were used (cf. Table 12).

Table 20 provides an overview of the key performance indicators that serve as independent variables as well as the dependent variable. It contains all variables of the two C-categories with the exception of Number of Deviations, as this metric is deemed to be heavily dependent on site context (e.g. size, product type, volume) and is therefore removed from this analysis.

6.2.5.3 Results

The results of the MLR analysis are summarized in Table 21. The SPSS output can be found in the Appendix23.

The overview demonstrates that different entering methods for MLR with OTIF as the dependent variable consistently show an elevated impact of the metrics Lot Acceptance Rate (1 - Rejected Batches) and Scrap Rate as predictors for PQS Effectiveness.

In the research team's opinion this result supports the proposition of FDA to select the Lot Acceptance Rate as a suitable quality metric candidate.

6.2.6 CAPA Effectiveness and PQS Effectiveness

6.2.6.1 Motivation and Objectives

As described in the PPSM section (see section 4.3), the CAPA system is an essential part of a pharmaceutical quality system. The objective of this analysis is to analyze the correlation of:

1. CAPA Effectiveness metrics with the PQS Effectiveness metrics
2. CAPA Effectiveness metrics with Operational Stability metrics

6.2.6.2 Approach and Sample

Due to the limited number of data points available for the CAPA Effectiveness metrics, the approach was limited to a simple bivariate correlation analysis.

As this category has only recently been added to the standard St.Gallen OPEX Questionnaire, the sample size is limited to the 14 sites that have provided data for some of these metrics. No data is currently available for Number of supply stops and Others (e.g. withdrawals). In addition, some metrics are not differentiating because either all or nearly all sites have indicated zero (Number of Warning Letters / Number of critical overdue CAPAs). Consequently, correlations based on very few data points are not reported.

Table 22 provides an overview of all metrics that are used in the correlation analysis.

6.2.6.3 Results

The correlation analysis provides the following results:

Significant correlations at the 0.01 level
1. A highly significant correlation (p=0.008<0.01) is only detectable between the metric Number of non-critical overdue CAPAs and the metric Service Level Delivery (OTIF). The Pearson correlation coefficient is -.810, indicating a strongly negative correlation.

Significant correlations at the 0.05 level
1. The metric Number of observations per internal audit shows significant correlations with the metrics:
   › Rejected Batches: positive correlation (Pearson correlation coefficient .580)
   › Yield: negative correlation (Pearson correlation coefficient -.609)
2. The metric Number of Recalls shows a significant correlation with the metric:
   › Unplanned Maintenance: positive correlation (Pearson correlation coefficient .601)

For the highly significant correlation, Figure 15 visualizes the negative correlation between the metric Number of non-critical overdue CAPAs and the metric Service Level Delivery (OTIF).

Considering the low number of data points for the CAPA Effectiveness metrics (at most 14), all correlations described above should be treated with caution, as the ability to draw conclusions from this analysis is rather low compared to the other analyses in this report.

6.3 Analysis: PQS Effectiveness and Efficiency

6.3.1 Linkage between PQS Effectiveness and Efficiency

6.3.1.1 Motivation and Objectives

In theory there is a positive link between the effectiveness and the efficiency of a production system: companies that achieve a high effectiveness also have a high efficiency. However, the link only runs from effectiveness to efficiency; significantly, a high efficiency does not directly enable a high effectiveness (Corbett & Van Wassenhove, 1993; Ferdows & De Meyer, 1990; Größler & Grübner, 2006). The research team's objective was to identify whether the St.Gallen OPEX Benchmarking database confirms the theory that achieving high effectiveness allows for, respectively drives, a higher efficiency of the pharmaceutical quality system.

22. See Appendix.
23. See Appendix: Performance Metrics on PQS Effectiveness.



Nr.   Dependent   Entering   Adjusted   Significance of Model     Predictors with best significance
      Variable    Method     R Square   at .05 level (p-value)    values
1     OTIF        Enter      .417       Yes (Sig. 0.017)          Rejected Batches (Sig. 0.007)*
                                                                  Scrap Rate (.003)*
                                                                  Unplanned Maintenance (.051)
2     OTIF        Stepwise   .482       Yes (.000)                Rejected Batches (.002)*
                                                                  Scrap Rate (.003)*
                                                                  Unplanned Maintenance (.019)*
3     OTIF        Backward   .495       Yes (.000)                Rejected Batches (.004)*
                                                                  Scrap Rate (.001)*
                                                                  Unplanned Maintenance (.027)*

Table 21: Results MLR Impact of KPIs on OTIF (* significant at the 0.05 level)
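Of the three entering methods in Table 21, Backward selection is the easiest to spell out: start with all predictors and repeatedly drop the least significant one. statsmodels has no built-in stepwise procedure, so the sketch below implements a simple p-value-based backward elimination on synthetic data; the variable names, coefficients and significance threshold are illustrative, not the SPSS algorithm's exact criteria.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def backward_selection(y: pd.Series, X: pd.DataFrame, alpha: float = 0.05):
    """Repeatedly drop the predictor with the highest p-value above alpha."""
    X = sm.add_constant(X)
    while True:
        fit = sm.OLS(y, X).fit()
        pvals = fit.pvalues.drop("const")
        worst = pvals.idxmax()
        if pvals[worst] <= alpha or len(pvals) == 1:
            return fit
        X = X.drop(columns=[worst])

rng = np.random.default_rng(4)
n = 120
X = pd.DataFrame({
    "rejected_batches": rng.uniform(0, 0.05, n),
    "scrap_rate": rng.uniform(0, 0.10, n),
    "unplanned_maintenance": rng.uniform(0, 0.30, n),
})
y = (0.95 - 1.5 * X["rejected_batches"] - 0.4 * X["scrap_rate"]
     + rng.normal(0, 0.02, n))

fit = backward_selection(y, X)
print(fit.params, fit.rsquared_adj)
```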

CAPA Effectiveness Metrics                    Operational Stability Metrics   PQS Effectiveness Metrics
Number of CAPAs                               Unplanned Maintenance           Service Level Delivery (OTIF)
Number of critical overdue CAPAs              OEE Average                     Customer Complaint Rate
Number of non-critical overdue CAPAs          Rejected Batches
Number of Observations of a health            Deviation (absolute)
authority inspection                          Yield
Number of Observations per internal audit     Release Time
Number of Recalls                             Deviation Closure Time

Table 22: CAPA Effectiveness Metrics for Correlation Analysis


Figure 15: Plot: Number of non-critical overdue CAPAs (x-axis) vs. Service Level Delivery (OTIF)
(y-axis, absolute)



6.3.1.2 Approach and Sample 6.3.2.3 Results
In order to assess the relationship between PQS Effectiveness and The scatter plot between PQS Effectiveness and PQS Efficiency
PQS Efficiency of the PPSM, the aggregated PQS Effectiveness Score (cf. Figure 17) suggests that sites from the peer-group “High Op-
(see section 4.7) comprising Supplier Reliability and Operational erational Stability, Low Inventory” (blue) have an even stronger
Stability was examined in relation to the PQS Efficiency Score (see relationship between the aggregated PQS Effectiveness and PQS
section 4.8). Efficiency compared to the overall sample. In addition, for this high
performing group the degree of determination increased from 11%
To enable greater clarity in the result, the sample examined com-
to 25%. Internationally recognized research considers a degree of
prises of only those sites that belong to the High Performer or the
Low Performer peer-groups respectively, of the surrogate of PQS Effectiveness, Service Delivery Level (OTIF). To visualize the relationship, a scatter plot was selected, illustrating the aggregated PQS Effectiveness on the x-axis and PQS Efficiency on the y-axis.

6.3.1.3 Results

The scatter plot between PQS Effectiveness and PQS Efficiency (cf. Figure 16) suggests that the data of the St.Gallen OPEX Benchmarking database confirms the findings from traditional quality management (e.g. Deming, 1986) as well as later, partly empirically based models (e.g. Ferdows & De Meyer, 1990). The data indicates a positive relationship between both categories. Therefore, we can conclude that pharmaceutical manufacturing sites with a higher PQS Effectiveness have an increased opportunity to also achieve a higher PQS Efficiency, which can deliver real benefits to the business in terms of reduced operating costs. It has to be noted that the degree of determination is only 11%; nevertheless, this limited but positive relationship should encourage businesses to seek to realize these potential benefits.

6.3.1.4 Implications

Previous research by Flynn and Flynn did not find evidence to confirm the theory of Ferdows and De Meyer and suggested further investigation (Flynn, 2004). However, the St.Gallen data confirms Ferdows and De Meyer's model, which suggests that efficiency builds on effectiveness. A focus on improving PQS Effectiveness will also allow PQS Efficiency improvements at a later point in time in the pharmaceutical industry. Importantly, the opposite approach, improving PQS Efficiency first, will not lead on to long-term improvements in both aspects.

6.3.2 Linkage between PQS Effectiveness and Efficiency with peer-group split

6.3.2.1 Motivation and Objectives

Based on the finding that PQS Effectiveness and PQS Efficiency have a positive relationship (cf. 6.3.1 Linkage between PQS Effectiveness and Efficiency), the St.Gallen research team aimed to further investigate this relationship. The key objective was to identify whether there is a specific peer-group within the data that demonstrates an even stronger relationship between the two categories compared to the overall sample.

6.3.2.2 Approach and Sample

To identify any differing relationship, the research team extracted the peer-group "High Operational Stability, Low Inventory" (Group 1 from the Inventory-Stability Matrix (ISM) analysis, see 6.2.2) from the overall sample and plotted their aggregated PQS Effectiveness (x-axis) against PQS Efficiency (y-axis).

6.3.2.3 Results

For the "High Stability, Low Inventory" peer-group the scatter plot (cf. Figure 17) shows a degree of determination of 25%, demonstrating a strong relationship (Cohen, 1977).

6.3.2.4 Implications

Beyond simply establishing a positive relationship between PQS Effectiveness and PQS Efficiency, the results from this analysis show that there is a specific peer-group of "High Stability, Low Inventory" sites that have a distinctively stronger relationship compared to other sites in the sample.

6.4 Analysis: Customer Complaint Rate

To understand the relationship between the FDA metric Customer Complaint Rate (CCR) and the Pharmaceutical Quality System, this section presents a range of statistical analyses and peer-group comparisons to identify how the Customer Complaint Rate is related to other metrics, and their respective categories, within the Pharmaceutical Production System Model (PPSM).

6.4.1 Linkage between Customer Complaint Rate and PQS Effectiveness

6.4.1.1 Motivation and Objectives

Based on the importance of PQS Effectiveness in establishing the level of excellence of the overall Pharmaceutical Quality System, a key objective of the research is to identify whether the FDA metric Customer Complaint Rate shows any specific relationships that would enable deriving conclusions about the PQS Effectiveness from this single metric.

6.4.1.2 Approach and Sample

To identify if there is a significant relationship between Customer Complaint Rate and PQS Effectiveness, the overall sample of sites from the St.Gallen OPEX Benchmarking database was split into two peer-groups to perform a statistical t-Test (a minimal sketch of this procedure follows below):

» Peer-group 1: Customer Complaint Rate High Performer (top 10% best performing sites for Customer Complaint Rate): n = 28

» Peer-group 2: Customer Complaint Rate Low Performer (bottom 10% worst performing sites for Customer Complaint Rate): n = 29

For PQS Effectiveness the aggregated PQS Effectiveness Score was used. This score is calculated based on Operational Stability and Supplier Reliability (see section 4.6).
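As an illustration of this peer-group comparison, the following minimal Python sketch splits sites into top and bottom deciles on a metric and applies an independent-samples t-test. It is a sketch of the general approach under assumed column names (`ccr`, `pqs_effectiveness` are hypothetical), not the research team's actual SPSS workflow.

```python
import pandas as pd
from scipy import stats

def decile_peer_groups(sites: pd.DataFrame, metric: str, outcome: str):
    """Split sites into top/bottom 10% peer-groups on `metric` and
    t-test the difference of `outcome` between the two groups."""
    lo_cut = sites[metric].quantile(0.10)
    hi_cut = sites[metric].quantile(0.90)
    # For CCR, fewer complaints is better: the High Performer group
    # is the decile with the LOWEST complaint rate.
    high_perf = sites.loc[sites[metric] <= lo_cut, outcome].dropna()
    low_perf = sites.loc[sites[metric] >= hi_cut, outcome].dropna()
    # Welch's t-test (no equal-variance assumption), analogous to the
    # "Independent Samples Test" output reported in Table 24.
    t, p = stats.ttest_ind(high_perf, low_perf, equal_var=False)
    return high_perf.mean(), low_perf.mean(), t, p

# Hypothetical usage on a benchmarking extract:
# hp_mean, lp_mean, t, p = decile_peer_groups(sites, "ccr", "pqs_effectiveness")
```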

Figure 16: Scatter plot between aggregated PQS Effectiveness Score (x-axis) and PQS Efficiency Score (y-axis)

Figure 17: Scatter plot between aggregated PQS Effectiveness Score and PQS Efficiency Score with peer-group split; High Stability, Low Inventory: R² Linear = 0.249, y = 0.03 + 0.85x; Rest: R² Linear = 0.005, y = 0.46 + 0.10x

Table 23: Group statistics showing the mean difference between CCR HP and LP

Table 24: Independent Samples Test showing the significance of the statistical t-Test


6.4.1.3 Results

Based on the t-Test the researchers were able to show that the Customer Complaint Rate High Performer (peer-group 1, with a low CCR) have a significantly higher aggregated PQS Effectiveness Score (cf. Table 23 & Table 24) compared to the Customer Complaint Rate Low Performer (peer-group 2, with a high CCR).

6.4.1.4 Implications

This analysis result shows that there is a significant difference in the PQS Effectiveness between sites with a low and a high Customer Complaint Rate. However, as PQS Effectiveness may be influenced by many more factors, it cannot be concluded that Customer Complaint Rate is a good single measure to infer PQS Effectiveness.

6.4.2 Linkage between Customer Complaint Rate and PQS Effectiveness for DS/DP Split

6.4.2.1 Motivation and Objectives

Based on the previous finding that sites with a low Customer Complaint Rate also have a significantly higher aggregated PQS Effectiveness Score compared to sites with a high Customer Complaint Rate, the following analysis is intended to identify whether there are differing relationships between these two metrics for drug substance and drug product sites.

6.4.2.2 Approach and Sample

In order to assess if there is a differing relationship between Customer Complaint Rate and the aggregated PQS Effectiveness, a scatter plot was selected with a color coding for drug substance sites (blue) and drug product sites (green).

To enable greater clarity in the result, the sample comprises only those sites that belong to either the High Performer (10% best performing sites for Customer Complaint Rate) or the Low Performer (10% worst performing sites for Customer Complaint Rate) peer-group for Customer Complaint Rate.

6.4.2.3 Results

The scatter plot between Customer Complaint Rate and PQS Effectiveness (cf. Figure 18) shows that the data from the St.Gallen OPEX Benchmarking confirms a negative relationship between the Customer Complaint Rate and the aggregated PQS Effectiveness Score: a higher Customer Complaint Rate (x-axis) is accompanied by a lower aggregated PQS Effectiveness Score. For drug substance sites this relationship is stronger than for drug product sites. It is also notable that the degree of determination for drug substance sites (R² = 0.829) and for drug product sites (R² = 0.378) is very high.

6.4.2.4 Implications

This analysis shows that the same relationship was identified for both of the selected peer-groups; only the degree of determination differed. Further investigation into the various drug product types (e.g. between chemical and biotech sites) will be needed to better understand these differences and will form part of the future research scope of the research team.

6.4.3 Linkage between Customer Complaint Rate and Rejected Batches moderated by Operational Stability

6.4.3.1 Motivation and Objectives

Building on the previous findings for the FDA metric Customer Complaint Rate, the intention of this analysis is to analyze the relationship between this metric, another FDA metric, Lot Acceptance Rate (i.e. the available metric from the St.Gallen Benchmarking: 1 - Rejected Batches), and the level of Operational Stability within sites.

6.4.3.2 Approach and Sample

To study the relationship between Customer Complaint Rate, Rejected Batches and Operational Stability, a scatter plot was selected. Operational Stability was used to build the peer-groups: Operational Stability High Performer (blue) can be distinguished from Operational Stability Low Performer (green).

The limited number of sites incorporated in this analysis can be explained by the number of different metrics that needed to be completed by the OPEX Benchmarking participants. As the peer-groups are based on Operational Stability, only sites that provided all individual data points required to calculate an Operational Stability Score were considered (see section 4.4).

6.4.3.3 Results

The scatter plot (cf. Figure 19) shows that Operational Stability High Performers (blue) have both a low level of Customer Complaints as well as a low level of Rejected Batches, and consequently a high Lot Acceptance Rate. For Operational Stability Low Performers (green) no pattern can be identified.

6.4.3.4 Implications

Linking two of the FDA metrics with Operational Stability provides a better understanding of the relationship between all three elements of the Pharmaceutical Quality System. The findings from this analysis, although limited in number, indicate the importance of focusing on the robustness and stability of the operational systems at sites in order to avoid issues that affect the overall PQS and may lead to product quality issues.
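The peer-group scatter analyses in this chapter (Figures 16-19) follow a common recipe: plot two metrics against each other, color the points by peer-group, and fit a separate linear regression per group to compare the degrees of determination. The following minimal Python sketch illustrates that recipe; the column names (`rejected_batches`, `ccr`, `os_group`) are hypothetical, and this is not the research team's actual tooling.

```python
import numpy as np
import pandas as pd

def fit_by_group(sites: pd.DataFrame, x: str, y: str, group: str):
    """Fit y = a + b*x per peer-group and report intercept, slope and R^2."""
    results = {}
    for name, g in sites.dropna(subset=[x, y]).groupby(group):
        if len(g) < 2:
            continue  # need at least two sites to fit a line
        b, a = np.polyfit(g[x], g[y], deg=1)   # returns [slope, intercept]
        r = np.corrcoef(g[x], g[y])[0, 1]      # Pearson correlation
        results[name] = {"intercept": a, "slope": b, "r2": r ** 2}
    return results

# Hypothetical usage, in the style of Figure 19 (Rejected Batches vs.
# Customer Complaint Rate, split by Operational Stability peer-group):
# print(fit_by_group(sites, "rejected_batches", "ccr", "os_group"))
```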

Figure 18: Scatter plot for Customer Complaint Rate (absolute, x-axis) and the aggregated PQS Effectiveness Score (y-axis)

Figure 19: Scatter plot for Rejected Batches (absolute) and Customer Complaint Rate (absolute) with Operational Stability peer-groups

Figure 20: Linkage between Quality Maturity and Quality Behavior - PDA results (left) and St.Gallen results (right); St.Gallen scatter: Quality Maturity (x-axis) vs. Quality Behavior (y-axis), R² Linear = 0.664, y = 0.19 + 0.77x


6.5 Analysis: Cultural Excellence

6.5.1 Linkage between Quality Maturity and Quality Behavior

6.5.1.1 Motivation and Objectives

In both research and industry domains, Cultural Excellence is understood to provide the basis for PQS Effectiveness, and it is recognized to play an important role in delivering superior performance at a site, over and above that which can be directly measured through quality metrics (Patel et al., 2015), (ISPE, 2017).

In the 2014 PDA Quality Culture study the association's objective was to determine whether there is a relationship between 'Quality Maturity' and 'Quality Behavior' (Patel et al., 2015). In addition, the PDA report authors aimed to identify certain 'Quality Maturity' attributes that may be used as surrogates to assess Quality Culture. 'Quality Behavior' was summarized by PDA to include quality related behaviors of an individual that can be observed within an organization, covering aspects such as commitment, engagement, transparency and active assistance from supervisors. 'Quality Maturity', as defined by PDA, comprises implementable elements of the quality system such as methods and tools.

Following the hypothesis of Patel et al. (2015) that Cultural Excellence is driven by quality behavior, which in turn is driven by quality maturity, the St.Gallen research team included the objective to analyze these relationships by translating the PDA understanding of 'Quality Behavior' and 'Quality Maturity' using data from the St.Gallen OPEX Benchmarking database.

6.5.1.2 Approach and Sample

For this analysis, the Operational Excellence (OPEX) Enablers from the St.Gallen OPEX Benchmarking first had to be assigned to either the Quality Behavior or the Quality Maturity group. In total the OPEX Benchmarking comprises 114 Enablers covering different areas of OPEX, i.e. Total Productive Maintenance (TPM), Total Quality Management (TQM), Just-In-Time (JIT), Effective Management System (EMS) and Basic Elements. An Enabler may be a tool, a method or, more generally, individual behaviors that improve a specific outcome (e.g. TPM). Enablers therefore describe "how" a given organization operates. Following this categorization, 26 of the OPEX Enablers were assigned to the Quality Behavior group and 36 to the Quality Maturity group.24 A further 52 Enablers were not assigned, as those do not have a direct link to either of the two PDA categories under consideration.

In order to assess the linkage between Quality Maturity and Quality Behavior, the research team used a scatter plot to identify if a high Quality Maturity (x-axis) is accompanied by a high Quality Behavior (y-axis) (cf. Figure 20). For both categories a score was calculated that represents the average of the implementation level of all attributes of the respective category.

6.5.1.3 Results

The statistical measure, adjusted R² of 0.66, means that 66% of the variation of the Quality Behavior Score can be explained by the Quality Maturity Score. This supports and enhances the finding of the PDA study in 2014, which showed a degree of determination of 34% (Patel et al., 2015).

6.5.1.4 Implications

Both analyses, the PDA study and the more recent work of this research team, show that a high Quality Maturity is accompanied by a high degree of Quality Behavior. The confirmation of the previous findings of PDA was the starting point for the research team to enhance the analysis in the field of Cultural Excellence with regards to the most significant maturity attributes that drive quality behavior (cf. 6.5.2 Top-10 Quality Maturity Attributes that drive Quality Behavior) and to identify if high performing PQS Effectiveness sites have a higher implementation of Cultural Excellence compared to low performing PQS Effectiveness sites (cf. 6.5.3 Cultural Excellence as the foundation for PQS Effectiveness), therefore providing data-backed evidence that cultural excellence is indeed a basis for superior quality.

6.5.2 Top-10 Quality Maturity Attributes that drive Quality Behavior

6.5.2.1 Motivation and Objectives

Based on the theory that a high Quality Maturity level drives a high Quality Behavior that provides evidence of greater Cultural Excellence, the research team objective was to identify which maturity attributes have the most significant impact on the quality behavior at a pharmaceutical manufacturing site. Each maturity attribute represents one individual Enabler from the OPEX Benchmarking database.

6.5.2.2 Approach and Sample

To identify the most significant Quality Maturity Attributes a Multiple Linear Regression (MLR) was used. For this analysis all Maturity Attributes were inserted as the independent variables (IV) and the Quality Behavior Score (the aggregated average implementation of all Quality Behavior Attributes) was used as the dependent variable (DV).

The sample used for this analysis is the entire St.Gallen OPEX Benchmarking database with more than 330 sites covering a variety of technologies, different sizes and locations in regions from all around the world.

24. Appendix 2.3: Questions and Definitions from St.Gallen OPEX Report – Enabler
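A minimal sketch of such an attribute-ranking MLR is shown below, using Python's statsmodels; the DataFrame and column naming are hypothetical, and since the report's analysis was produced with SPSS, this only mirrors the approach. Attributes are ranked by the p-values of their coefficients, the smallest p-values indicating the most significant contributors to the Quality Behavior Score.

```python
import pandas as pd
import statsmodels.api as sm

def rank_maturity_attributes(df: pd.DataFrame, behavior_col: str):
    """Regress the Quality Behavior Score on all maturity attribute
    columns and rank attributes by the significance of their coefficient."""
    X = sm.add_constant(df.drop(columns=[behavior_col]))  # adds intercept
    model = sm.OLS(df[behavior_col], X, missing="drop").fit()
    ranking = (model.pvalues.drop("const")
               .sort_values()                 # smallest p = most significant
               .to_frame("p_value"))
    ranking["coef"] = model.params            # aligned on attribute names
    return model.rsquared_adj, ranking

# Hypothetical usage on an extract where every column except
# "quality_behavior" is one maturity attribute (implementation level):
# adj_r2, ranked = rank_maturity_attributes(maturity_df, "quality_behavior")
# print(adj_r2, ranked.head(10))  # the Top-10 attributes
```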


Table 25: Group statistics showing the mean difference between OTIF HP and LP

Table 26: Independent Samples Test showing the significance of the statistical t-Test

Figure 21: Significant differences of the implementation level of Enabler Categories and Sub-Categories of the St.Gallen OPEX Benchmarking. Categories and sub-categories shown:
TPM: Preventive Maintenance; Technology Assessment and Usage; Housekeeping
TQM: Process Management; Cross-functional Product Development; Supplier Quality Management
JIT: Set-up Time Reduction; Pull Production; Layout Optimization; Planning Adherence
EMS: Direction Setting; Management Commitment and Company Culture; Employee Involvement and CI; Functional Integration and Qualification
Basic Elements: Standardization and Simplification; Visual Management


6.5.2.3 Results

The following list is sorted in descending order, starting with the maturity attribute revealing the highest level of significance. All attributes are significant at the 95% confidence level:

1. Optimized set-up and cleaning procedures are documented as best-practice process and rolled-out throughout the whole plant.

2. A large percentage of equipment on the shop floor is currently under statistical process control (SPC).

3. For root cause analysis we have standardized tools to get a deeper understanding of the influencing factors (e.g. DMAIC).

4. Goals and objectives of the manufacturing unit are closely linked and consistent with corporate objectives. The site has a clear focus.

5. We have joint improvement programs with our suppliers to increase our performance.

6. All potential bottleneck machines are identified and supplied with additional spare parts.

7. For product and process transfers between different units or sites standardized procedures exist, which ensure a fast, stable and compliant knowledge transfer.

8. Charts showing the current performance status (e.g. current scrap-rates, current up-times etc.) are posted on the shop-floor and visible for everyone.

9. We regularly survey our customer's requirements.

10. We rank our suppliers; therefore we conduct supplier qualifications and audits.

6.5.2.4 Implications

A special focus of these Top-10 Maturity Attributes lies in the fields of standardization, visualization and best-practice sharing. These attributes support a high Quality Behavior for the pharmaceutical manufacturing sites of the St.Gallen OPEX Benchmarking database.

6.5.3 Cultural Excellence as the foundation for PQS Effectiveness

6.5.3.1 Motivation and Objectives

The importance of Cultural Excellence and Quality Culture has been widely discussed in the industry in recent times and is generally understood on a qualitative basis. The research team aimed to analyze if this qualitative understanding of the role of Cultural Excellence can be confirmed quantitatively.

The researchers' objective was to identify whether there is a significant impact of Cultural Excellence on PQS Effectiveness at pharmaceutical manufacturing sites.

6.5.3.2 Approach and Sample

To identify if there is a significant impact of Cultural Excellence on PQS Effectiveness, the overall sample of sites from the St.Gallen OPEX Benchmarking database was split into two peer-groups to perform a statistical t-Test. To identify sites for the respective peer-group, the PQS Effectiveness Surrogate Service Level Delivery (OTIF) was used.

» Peer-group 1: PQS Effectiveness High Performer (10% best performing sites for OTIF): n = 32

» Peer-group 2: PQS Effectiveness Low Performer (10% worst performing sites for OTIF): n = 34

Cultural Excellence in this analysis represents an aggregated score of the Quality Behavior Score, the Quality Maturity Score and the Engagement Metrics25 Score from the St.Gallen OPEX Benchmarking database. The Quality Behavior and Quality Maturity Scores represent an average of the implementation level of all attributes of the respective category. The Engagement Metrics Score represents the relative position of each site within the overall sample for this category, based on a percentile rank calculation (thus enabling us to include Engagement Metrics with different scales, e.g. days and percentages); a minimal sketch of this percentile-rank scoring follows at the end of this section.

6.5.3.3 Results

Based on the t-Test the researchers were able to show that the PQS Effectiveness High Performer (peer-group 1) have a significantly higher implementation level of Cultural Excellence (cf. Table 25 & Table 26) compared to the PQS Effectiveness Low Performer (peer-group 2).

Furthermore, the research team identified that this relationship between Cultural Excellence and the two peer-groups also applies to each sub-category of Cultural Excellence. For Quality Maturity, Quality Behavior and Engagement Metrics the PQS Effectiveness High Performer have a statistically significantly higher implementation level compared to the PQS Effectiveness Low Performer (cf. Appendix26).

6.5.3.4 Implications

The result of this analysis shows that a high degree of PQS Effectiveness is accompanied by a high level of Cultural Excellence evidenced at the site. Taking into account that there are other influencing factors for achieving a high PQS Effectiveness (e.g. Operational Stability, Supplier Reliability), the results also show a significant relationship between these two categories of the PPSM and the level of Cultural Excellence at a site.

As a consequence, the widely discussed and understood importance of Cultural Excellence and Quality Culture for the effectiveness of a site's PQS can be statistically confirmed with the data of the St.Gallen OPEX Benchmarking database.

25. Engagement Metrics (see section 4.2)

26. Appendix 6: Cultural Excellence Subelements
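As referenced in 6.5.3.2 above, the following minimal Python sketch shows how a percentile rank can put Engagement Metrics with different units onto a common 0-1 scale before averaging them into a single score. The column names are hypothetical and this is not the team's actual implementation; metrics where lower raw values are better would first be inverted so that a higher percentile always means better.

```python
import pandas as pd

def engagement_score(metrics: pd.DataFrame, lower_is_better=()):
    """Convert each engagement metric (possibly in days, percent, counts)
    to a percentile rank across all sites, then average per site."""
    ranks = pd.DataFrame(index=metrics.index)
    for col in metrics.columns:
        series = -metrics[col] if col in lower_is_better else metrics[col]
        # pct=True yields the percentile rank in (0, 1] for each site
        ranks[col] = series.rank(pct=True)
    return ranks.mean(axis=1)  # aggregated Engagement Metrics Score

# Hypothetical usage with metrics on different scales:
# score = engagement_score(df[["training_days", "absenteeism", "suggestions"]],
#                          lower_is_better=["absenteeism"])
```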


6.5.4 Linkage between St.Gallen OPEX Enablers and Operational Stability

6.5.4.1 Motivation and Objectives

To gain a deeper understanding of the relationship between OPEX Enablers and Operational Stability, the research team aimed to analyze whether the St.Gallen OPEX Benchmarking database shows a relationship between those two categories.

The objective was to identify if a higher implementation level of the Enabler categories of the St.Gallen OPEX Benchmarking Model (see chapter 3.2), Total Productive Maintenance (TPM), Total Quality Management (TQM), Just-In-Time (JIT), Effective Management System (EMS) and Basic Elements, correlates with a higher Operational Stability. In addition, the research team aimed to identify if some of the sub-categories (e.g. Preventive Maintenance) within each Enabler section show any significant relationship.

6.5.4.2 Approach and Sample

To identify if there is a significant difference in the implementation level of the St.Gallen OPEX Benchmarking Enabler categories (e.g. TPM) and sub-categories (e.g. Preventive Maintenance) between Operational Stability High Performers and Low Performers, the overall sample of sites from the St.Gallen OPEX Benchmarking database was split into two peer-groups to perform a statistical t-Test. To identify sites for the respective peer-group, the aggregated Operational Stability Score (see section 4.4) was used.

» Peer-group 1: Operational Stability High Performer (10% best performing sites for the aggregated Operational Stability Score): n = 27

» Peer-group 2: Operational Stability Low Performer (10% worst performing sites for the aggregated Operational Stability Score): n = 32

6.5.4.3 Results

The analysis shows that in 4 out of 5 of the Enabler categories, as well as their respective sub-categories, the Operational Stability High Performer have a significantly higher level of implementation compared to the Operational Stability Low Performer (cf. Figure 21).

However, the category Total Quality Management (TQM) does not show a significantly different implementation level for the two peer-groups. A closer look reveals that the Operational Stability High Performers have at least the same or only a slightly higher implementation level compared to the Operational Stability Low Performers in TQM and its sub-categories. Further examination of this result is necessary in the next phase of this research to better distinguish the site capabilities within the TQM Enabler category for high and low Operational Stability performers.

The color coding in green in Figure 21 marks all categories and sub-categories with a significantly differing implementation level between the two peer-groups. The detailed outcome of this analysis, with the level of implementation for each category and sub-category for both peer-groups, can be found in the Appendix.27

6.5.4.4 Implications

In 4 out of 5 of the OPEX Enabler categories and sub-categories the implementation level of OPEX strategies at the site is higher for the Operational Stability High Performers compared to the Operational Stability Low Performers. For the Quality Metrics research this result shows that a formal Operational Excellence program is decisive in improving Operational Stability at a site and consequently in improving PQS Effectiveness (for the impact of Operational Stability on PQS Effectiveness see section 6.2.4). This finding is in line with the current industry and FDA discussions on resolving the missing link between Quality and Manufacturing in many pharmaceutical companies. Bringing these two functions together to create greater cohesion in objectives will support and improve Operational Excellence activities at a given site, which then may lead on to achieving higher PQS Effectiveness.

6.6 Limitations of the Data Analysis

» For the purpose of participating in the OPEX Benchmarking it is not the ultimate goal of each organization to fill in all data points of the benchmarking questionnaire. Thus, the St.Gallen database comprises incomplete datasets. This does not affect the outcome of the individual benchmarking project of a participating company, but for the purpose of this statistical analysis some datasets could not be used for specific analyses due to incomplete data. It was deemed by the research team that replacing missing values with the sample mean could have weakened the validity of the analysis. Thus, sites with missing data were excluded from certain analyses.

» For the section CAPA Effectiveness of the PPSM there was only a limited number of datasets available, as this section has not traditionally been part of the St.Gallen OPEX Benchmarking questionnaire. As a result of the recent updates to the standard questionnaire, in the future there will be more data available for this section.

» The Quality Metric Invalidated OOS is part of the recently (in 2016) introduced QC Lab OPEX Benchmarking questionnaire. For the analysis undertaken in year 1 there was not enough data available to draw conclusions. In the future there will be more data available for this section.

» During year 1 of this research no access was available to any FDA databases that could help to understand the relationship between PQS Excellence and FDA site-specific accessible metrics (e.g. number of recalls, time since last recall, site compliance history).

» The St.Gallen OPEX Benchmarking database has worldwide coverage; however, the major proportion of the companies are from Europe and the US. Consequently, findings are especially valid for these regions.

27. Appendix 7: OPEX Enabler



7 IMPLICATION FOR FDA QUALITY
METRICS INITIATIVE



In this chapter we first compare the current FDA Quality Metrics Reporting Program as described in the revised draft guidance for industry (FDA, 2016) to the St.Gallen overall system approach and then derive some conclusions for the further development of the program.

Table 27 highlights key differences between the FDA Quality Metrics Reporting Program as described in the revised guidance (FDA, 2016) and the St.Gallen PPSM Approach.

1. SCOPE OF ANALYSIS

The researchers had full access to the St.Gallen OPEX Benchmarking database. This enabled examining multiple relationships between numerous metric candidates including qualitative enablers, which provided an opportunity to look for patterns in the Pharmaceutical Production Systems. This St.Gallen approach covers a comprehensive view of the production system as visualized in the St.Gallen Pharmaceutical Production System Model (PPSM) (cf. chapter 4). Understandably, the metrics selected in the draft guidance have a narrower focus, in order to limit the efforts needed for collecting and reporting them. However, the research indicates that a single focus on outcome measures in the FDA reporting program may not be comprehensive enough to infer overall performance at a site. This is especially the case when acknowledging the impact of cultural excellence on quality outcomes as demonstrated with St.Gallen data. On the other hand, the quality metrics are only one part of the risk-based approach and would be used only in conjunction with existing data (e.g. site compliance history) to better understand the risk of sites and products.

2. ASSURANCE OF COMPARABILITY

The definitions of the requested data points in the FDA reporting program are on a very specific and detailed level in order to assure the comparability of data submitted. St.Gallen operates with a different approach for the benchmarking data. Based on the experience gained from more than a decade of data collection, analysis and interpretation from the St.Gallen Benchmarking program, there is less emphasis on high precision of definitions, as misunderstandings and implausible data can be detected and addressed by an overall pattern analysis. In addition, the participants of the St.Gallen Benchmarking are participating on a voluntary basis and the data is used for internal purposes only. Consequently, there is a high motivation to provide correct data, which in turn limits the necessity of ensuring data quality by highly detailed definitions.

3. CENTER OF ANALYSIS

The St.Gallen analysis is exclusively focused on the plant or site level performance (i.e. the establishment). No data is collected on a per product basis. The main reason for this approach is that within the St.Gallen philosophy, systemic risks are not product specific but are in fact engrained in, and arise from, the culture, structure and processes which operate at a plant level. Furthermore, regulatory inspections are focused on plant level performance. The suggested reporting in the FDA program of data segregated by product might be justified if this serves additional objectives beyond the risk assessment of plants. As product level data could be helpful for FDA in better focusing inspections, a deeper analysis is needed that clarifies whether this additional benefit justifies the higher complexity of data collection, reporting and analysis.

4. PQS OUTCOME FOCUS

St.Gallen traditionally collects data providing information about effectiveness ("doing the right things") and efficiency ("doing things right"), i.e. in the St.Gallen operational excellence approach it is important to achieve a high level of quality but also to do this in an efficient way (costs, time, workflow). The measurement of both dimensions of effectiveness and efficiency helps to align the overall business interests with the attainment of high quality and helps to incentivize businesses. This is especially true given that the data analysis (cf. chapter 6.3) shows that a high effectiveness level goes hand in hand with high efficiency. The FDA reporting approach does not currently include any business efficiency measures, which from a regulatory perspective is understandable, but may account for the concentration on reporting burden as seen in much of the industry commentary. The benefits of promoting meaningful PQS performance measurement and management, as per ICH Q10, can be lost in the debate on mandatory reporting and potential burden.

5. REPORTING

The St.Gallen Benchmarking is a voluntary program. Companies that want to improve are participating and reporting their data. The FDA program could become mandatory after the pilot phase. The voluntary nature of the St.Gallen OPEX program ensures that there is a high owner interest and motivation in reporting correct data.28 Formal reporting to a regulator is by definition another situation. Nevertheless, in the next phase of research it could be interesting to have a closer look at other regulatory reporting programs like the OSHA Voluntary Protection Program (VPP).

6. REPORTING EFFORTS

The research team acknowledges the diverse range of challenges facing organizations in initially establishing a metrics reporting program, depending on their existing performance management capabilities. Nevertheless, leaving these one-time initialization costs aside, the FDA reporting program may be deemed to present a moderate level of effort for participating organizations on an ongoing basis. Upon review, much of the reporting burden concern raised by industry relates to reporting at a product level, which could multiply the reportable data points considerably, particularly when considering the requirement to provide data over the whole supply chain. Another burden often voiced by industry is the exact accuracy needed for the FDA Quality Metrics program. However, without such accuracy, comparability would not be given.

28. The concern has been raised that the St.Gallen data set may not be representative, as only well-performing sites might choose to share data voluntarily. In the last decade, however, companies with different levels of maturity and performance, ranging from very poor performance to world class performance, have participated in the benchmarking. Therefore, we do not believe that the database is biased in favor of well-performing companies.



On the other hand, reporting in the St.Gallen Benchmarking program presents a considerable cross-functional team effort to complete the comprehensive questionnaire; however, this overall benchmarking would normally only be repeated every two to three years to measure progress and establish the next improvement priorities. From our perspective it also makes sense for companies to track a subset of the metrics continuously between benchmarks in order to measure current developments.

7. POTENTIAL TO DERIVE RISKS

As the St.Gallen Benchmark is based on an overall system perspective, inherent risks can be identified by examining the patterns and relationships revealed by the data. Whether the reportable data from the FDA program will provide a good risk predictor for a firm will require further investigation. What may be necessary is to establish over time a target range of acceptable/expected values for each single metric. Then, examining the patterns and relationships between the actual metric results and other available data about each firm (e.g. number of recalls, time since last recall, compliance history) would also be necessary to inform the risk estimation.

8. POTENTIAL TO DRIVE CONTINUOUS IMPROVEMENT (CI)

The very motivation to take part in the St.Gallen Benchmarking program is to identify the areas with the highest potential for further improvements. It is inherent to the approach that continuous improvement is enabled and driven from the insights gained. In the case of the FDA reporting program, a similar impact on driving continuous improvement could only be achieved if industry does not solely focus on the reportable metrics but integrates them into an overall internal system-based continuous improvement approach. That would give industry the possibility not only to report, but also to understand the data and derive conclusions for future improvements.
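One way the "calibration" of target ranges described in point 7 could work in practice is sketched below, purely as an illustration under assumed data structures (the report does not prescribe an implementation): derive an expected range per metric from historical reporting, then flag establishments whose current values fall outside it as candidates for closer risk review.

```python
import pandas as pd

def calibrate_ranges(history: pd.DataFrame, lo: float = 0.05, hi: float = 0.95):
    """Derive an acceptable/expected range per metric from historical data."""
    return {m: (history[m].quantile(lo), history[m].quantile(hi))
            for m in history.columns}

def flag_sites(current: pd.DataFrame, ranges: dict):
    """Flag sites whose reported values fall outside the calibrated range."""
    flags = pd.DataFrame(index=current.index)
    for metric, (low, high) in ranges.items():
        flags[metric] = ~current[metric].between(low, high)
    # Sites with at least one out-of-range metric become review candidates.
    return flags[flags.any(axis=1)]

# Hypothetical usage with the three draft guidance metrics:
# ranges = calibrate_ranges(history[["lot_acceptance", "invalidated_oos", "ccr"]])
# review_candidates = flag_sites(current_reports, ranges)
```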

No. | Topic | FDA Quality Metrics | St.Gallen PPSM Approach
1 | Scope of Performance Analysis | Three single performance metrics | Overall Production System
1 | Scope of Enabler Analysis | None | Overall Production System including behavioral aspects
2 | Assurance of comparability | Precision in definitions | Pattern analysis
3 | Center of Analysis | Establishment & Product | Plant
4 | PQS Outcome Focus | Effectiveness | Effectiveness & Efficiency
5 | Reporting | Could become mandatory | Voluntary
6 | Reporting efforts | Moderate | High
7 | Potential to derive risks | Depending on capability to identify critical values ("calibration") | Possibility for overall system analysis (pattern recognition) helps to determine risks
8 | Potential to drive CI | Depending on implementation philosophy in industry | Possibility to link performance outcomes to enablers directly leads to the systematic identification of next improvement opportunities

Table 27: Comparison of the FDA Quality Metrics program with the St.Gallen PPSM Approach



8 CONCLUSION AND OUTLOOK



Based on the findings (cf. chapter 6) and the comparison of the FDA Quality Metrics with the St.Gallen PPSM Approach, the research team can confirm that the FDA proposed metrics are key metrics of the Pharmaceutical Quality System (cf. chapter 4). The comparison also reveals some major differences as well as commonalities between the two approaches.
The latest revision of the St.Gallen OPEX Benchmarking questionnaire will generate additional, distinct findings based on an increased level of granularity in both the standard OPEX and QC Lab questionnaires and the ability to collect technology-specific performance data.

Further analysis is necessary to continue to enhance the understanding of the holistic Pharmaceutical Production System Model (PPSM). Specifically, the analysis of the FDA proposed metric Invalidated OOS, in the context of QC lab robustness and within the overall context of the manufacturing strategy employed, will allow a more comprehensive understanding of the Pharmaceutical Quality System.

Further work in the field of Quality Metrics Research will be focused on:
» Consideration of additional moderators to obtain distinct results for new analyses

» Deepening the understanding of the impact of inventory and possible adjustments for analysis

» Investigation of the metric Invalidated OOS based on the new QC Lab OPEX Benchmarking tool

» Linkage of PQS outcomes to regulatory factors (e.g. time since last recall)

» A closer look at CAPA Effectiveness with an increased dataset

» Calibration of the three draft guidance metrics with respect to risk in the context of OPEX performance data: which values, or combinations of values, reveal a higher risk for either quality issues or drug shortages?

» Establishing criteria for Data Quality assessment, i.e. determining the rules and justification for inclusion or exclusion of specific data sets within the model

» Identification of "critical to quality" patterns within establishments, with the potential to endanger plant stability and quality outcomes

» Means to sustainably support continuous improvement and the true unification of quality and excellence across the pharmaceutical industry



REFERENCES



Abramowitz, S. K., & Weinberg, S. L. (2008). Statistics using SPSS: An integrative approach (2nd ed.). New York, NY: Cambridge University Press.

Cohen, J. (1977). Statistical power analysis for the behavioral sciences (Revised ed.). New York, NY: Academic Press.

Corbett, C. J., & Van Wassenhove, L. N. (1993). Trade-Offs? What Trade-Offs? California Management Review, 35(4), 107–122.

Cua, K. O., McKone, K. E., & Schroeder, R. G. (2001). Relationships between implementation of TQM, JIT, and TPM and manufacturing performance. Journal of Operations Management, 19(6), 675–694. https://doi.org/10.1016/S0272-6963(01)00066-3

Deming, W. E. (1986). Out of the crisis: Quality, productivity and competitive position. Cambridge, Mass.: Cambridge University Press.

Dixon, W. J., & Massey, F. J. (1992). Introduction to statistical analysis (4th ed.). New York, NY: McGraw-Hill.

Eckstein, P. P. (2016). Angewandte Statistik mit SPSS: Praktische Einführung für Wirtschaftswissenschaftler (8., überarb. u. erw. Aufl.). Wiesbaden: Springer Gabler.

European Foundation for Quality Management. (2017). EFQM Model. Retrieved from http://www.efqm.org/efqm-model/model-criteria

Ferdows, K., & De Meyer, A. (1990). Lasting improvements in manufacturing performance: In search of a new theory. Journal of Operations Management, 9(2), 168–184. https://doi.org/10.1016/0272-6963(90)90094-T

Flynn, B. (2004). An exploratory study of the nature of cumulative capabilities. Journal of Operations Management, 22(5), 439–457. https://doi.org/10.1016/j.jom.2004.03.002

Food and Drug Administration (FDA). (2009). Q10 Pharmaceutical Quality System.

Food and Drug Administration (FDA). (2011). Guidance for Industry: Q8, Q9, and Q10 Questions and Answers (R4).

Food and Drug Administration (FDA). (2015). Request for Quality Metrics: Guidance for Industry.

Food and Drug Administration (FDA). (2016). Submission of Quality Metrics Data: Guidance for Industry. Retrieved from https://www.fda.gov/downloads/drugs/guidances/ucm455957.pdf

Friedli, T. (2006). Technologiemanagement: Modelle zur Sicherung der Wettbewerbsfähigkeit. Berlin: Springer.

Friedli, T., Basu, P., Bellm, D., & Werani, J. (2013). Leading Pharmaceutical Operational Excellence: Outstanding Practices and Cases. Berlin, Heidelberg: Springer.

Größler, A., & Grübner, A. (2006). An empirical model of the relationships between manufacturing capabilities. International Journal of Operations and Production Management, 26(5), 458–485.

Huizingh, E. (2007). Applied statistics with SPSS. London: SAGE. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1099413

International Society for Pharmaceutical Engineering (ISPE), & The Pew Charitable Trusts (PEW). (2017). Drug Shortages: A report from The Pew Charitable Trusts and the International Society for Pharmaceutical Engineering.

ISPE. (2017). Cultural Excellence Report. Retrieved from http://www.ispe.org/news/5-may-2017/cultural-excellence-report

Patel, P., Baker, D., Burdick, R., Chen, C., Hill, J., Holland, M., & Sawant, A. (2015). Quality Culture Survey Report. PDA Journal of Pharmaceutical Science and Technology, 69(5), 631–642. https://doi.org/10.5731/pdajpst.2015.01078

U.S. Department of Labor. (2017). Voluntary Protection Programs (VPP). Retrieved from https://www.osha.gov/dcsp/vpp/all_about_vpp.html

Ulrich, H., Dyllick, T., & Probst, G. (1984). Management. Schriftenreihe Unternehmung und Unternehmungsführung: Bd. 13. Bern: P. Haupt.

US Congress. (2012). Food and Drug Administration Safety and Innovation Act. Retrieved from https://www.gpo.gov/fdsys/pkg/PLAW-112publ144/pdf/PLAW-112publ144.pdf

Woodcock, J., & Wosinska, M. (2013). Economic and technological drivers of generic sterile injectable drug shortages. Clinical Pharmacology and Therapeutics, 93(2), 170–176. https://doi.org/10.1038/clpt.2012.220

Yu, L. X. (2017). FDA's Vision for Quality Metrics.

Yu, L. X., & Kopcha, M. (2017). The future of pharmaceutical quality and the path to get there. International Journal of Pharmaceutics, 528(1–2), 354–359. https://doi.org/10.1016/j.ijpharm.2017.06.039



APPENDIX



Nr. Title
1.1 Presentations at Conferences/Workshops
1.2 Publications
1.3 Further Industry Interaction
2.1 Questions and Definitions from St.Gallen OPEX Report – Structural Factors
2.2 Questions and Definitions from St.Gallen OPEX Report – Cost and Headcount
2.3 Questions and Definitions from St.Gallen OPEX Report – Enabler
2.4 Questions and Definitions from St.Gallen OPEX Report – Performance Metrics
3 SPSS Output – MLR – Impact of Level 3 Categories on PQS Effectiveness
4 SPSS Output – MLR – Impact of Performance Metrics on PQS Effectiveness
5 Correlation table Compliance Metrics and Performance Metrics
6 Cultural Excellence Subelements
7 OPEX Enabler
Appendix 1: Dissemination

The following section provides an overview of how the results were disseminated by the research team. The presentations and publications are sorted anti-chronologically, starting with the most recent event/publication.

Appendix 1.1: Presentations at Conferences/Workshops

» AFI Symposium, June 2017, Rimini, Italy
» ISPE/St.Gallen Operational Excellence and Quality Metrics Training, May 2017, St.Gallen, Switzerland
» Leadership Workshop, May 2017, Helsinki, Finland
» ISPE Cultural Excellence Conference, April 2017, Washington DC, USA
» Paperless Lab Academy, April 2017, Barcelona, Spain
» ISPE Global Pharma Manufacturing Leaders Forum 2017, April 2017, Barcelona, Spain
» ISPE Quality Metrics Workshop, March 2017, East Hanover, USA
» PDA Quality Culture and Quality Metrics Conference, February 2017, Bethesda, USA
» Bayer Quality Conference, September 2016, Düsseldorf, Germany
» ISPE Annual Meeting 2016: Quality Metrics Revisited, September 2016, Atlanta, USA
» SwissMedic – Operational Excellence in the Pharmaceutical Industry and its contribution to FDA's Quality Metrics Initiative, August 2016, Bern, Switzerland

Appendix 1.2: Publications

» Pharma Focus Asia: Cultural Excellence as the foundation for PQS Effectiveness (to be published in August 2017)
» ISPE Cultural Excellence Report, April 2017
» PharmaTech 2017 – Achieving Manufacturing Excellence (interview with Prof. Friedli for the article), March 2017
» PharmTech 2017 – Defining Quality: Joining the Quality Lab and the Plant Floor, January 2017
» ISPE Pharmaceutical Engineering 2016 – Leading Indicators of Quality: Pinpointing Behaviors and Measuring Results, June 2016
» CPhI Annual Industry Report 2016 – Measurement of Pharmaceutical Quality in an Operational Excellence (OPEX) Environment, October 2016
» ISPE Pharmaceutical Engineering 2016 – Announcing FDA's Pharmaceutical Manufacturing Quality Metrics Research, June 2016
» HSG Focus – Inspection in the pharmaceutical industry: FDA is counting on HSG know-how, Issue 3, 2016
» Pharmaceutical Manufacturing – Innovation in Quality Metrics, October 2016

Appendix 1.3: Further Industry Interaction

» INTERACTION WITH INDUSTRIAL ADVISORY BOARD
The Industrial Advisory Board was set up at the beginning of the research project. Its role is to discuss and evaluate research findings, provide experience from industry and provide suggestions for further analysis. The work of the research team was well received during the three WebEx meetings since the project start.

» INTERACTION WITH PDA
During the project the interaction with PDA was focused on work in the field of Quality Culture and Cultural Excellence. In the future, PDA and the St.Gallen research team aim to deepen the collaboration regarding statistical analysis linking the PDA Culture data with St.Gallen OPEX Performance data to enable a better understanding of key levers of Quality Culture for OPEX Performance.

» INTERACTION WITH ISPE
The ISPE/St.Gallen vision is to work together to align the Quality Metrics discussion with Operational Excellence. The aim is to provide members with the ability to conduct an individual project in order to evaluate their own company data in relation to the draft guidance metrics from an overall system perspective.

» INTERACTION WITH THE OPEX RESEARCH GROUP
The OPEX Research Group is an exchange platform for all kinds of topics related to operational excellence within the pharmaceutical industry. The Research Group was established three years ago and meets four times per year for a two-day meeting. In 2017 eleven large and mid-size pharmaceutical companies have been part of the research group. Results from the analyses of this research have been shared and discussed with the members of the research group.



Appendix 2:
The following tables provide an overview of the Structural Factors, Cost and Headcount figures, Enabler and Performance figures asked for in the St.Gallen OPEX questionnaire.

Appendix 2.1: Questions and Definitions from St.Gallen OPEX Report – Structural Factors

Table 28: Appendix: Structural Factors from St.Gallen OPEX Questionnaire

UID Questions
Corporate Level
A01 How many production sites does your company have?
A02 What was your total sales in the last year?
Please fill in the cost structure of your company as a percentage of sales.
A03 R&D
A04 Manufacturing costs
A05 General & administration costs
A06 Sales & marketing costs
A07 Net profit
Compared to your competitors, indicate the development of your company on the following dimensions within the last 3
years.
A09 Market share
A10 Sales growth
A11 Return on sales
A12 Launches of new promising products
A13 Share price
Company type - Please indicate your company type ( yes/ no). Multiple answers are possible!
A14 Pharmaceutical company with R&D
A15 Generics manufacturer
A16 Contract manufacturer
A17 Biotechnology
A18 Miscellaneous (If the stated types do not reflect your business please provide us the information)
Site role - If your site is part of a manufacturing network, does the site have a specific role within this network? Multiple
answers are possible!
A19 We have a manufacturing network. (yes/no)
If yes – Rate the following metrics from “No competence” to “High competence”
A20 Launch site
A21 Special technology
A22 Special capacity size
A23 High packaging and production flexibility
A24 Access and entrance to markets
A25 Close to regional technology clusters
A26 Follow-the-customer
A27 Low cost site
A28 Securing of raw material sources
A29 Development site
A30 Back-up site (redundancy/ capacity)
A31 No special site role (yes/no)



Site Level
Planned improvements of the manufacturing strategy at your site. Indicate the degree of emphasis which your manufactur-
ing plant places on the following future activities.
Range: “No activities planned” to “Key activities”
B01 Reduce cycle time
B02 Reduce set-up time and cleaning time
B03 Increase flexibility to respond to demand changes in volume
Increase flexibility to respond to market needs for broad product mix (concerning package size, concentra-
B04
tions, flavors etc.)
B05 Increase flexibility to respond to shorter product lifecycles and higher number of product launches
B06 Accelerate new product introductions (time to market)
B07 Reduce process variance through statistical process control
B08 Increase supplier quality performance
B09 Reduce scrap rates
B10 Reduce lead time (lead time: time from raw material to finished goods incl. all kinds of process steps)
B11 Increase on-time delivery rate
B12 Reduce stock
B13 Increase asset utilization (e.g. machines)
B14 Increase employee productivity
B15 Increase capital investment productivity
Indicate how your production site is organized.
B16 Cost-center or Profit-center
Indicate the proportion of products manufactured at your plant (%) in the last year.
IP-protected products / Not IP-protected products / Contract manufacturing (share of CMO of total products)
(each separated in Synthetic products, Phytopharmaceuticals, Biotechnological products & Other products)
Number of products in the last year
B20 Number of different products produced at your plant
B21 Number of different technologies/ platforms (dosage forms) used at your plant (blister, bottles etc.)
B22 Number of different formats (dosages) produced at your plant
B23 Number of different SKUs produced at your plant
Procurement and supplier structure in the last year
Number of active suppliers for… / Total amount purchased (in millions) of… (separated in API / starting & raw material, Excipients, Packaging material and Others)
B26 Total of the following 4 metrics
B27 Percentage of internal suppliers (suppliers within the same manufacturing network)
B28 Percentage of suppliers that deliver to your site frequently
B29 Percentage of suppliers that deliver to your site frequently
B30 Frequently means on average every … days.
Sourcing by regions (primary the location of production, not the registered office/ in %) in the last year.
B31 West Europe
B32 North America
B33 East Europe



B34 South and Middle America
B35 Middle East and Africa
B36 India
B37 China
B38a Asia (excl. India, China and Japan)
B38b Japan
B38c Australia and New Zealand
Production structure
B39 Amount of API produced at your site in the last year
B40 Percentage of API produced at your site that was processed for own production
Indicate the volume of bulk goods produced at your site in the last year.
B41 Solid forms (tablets, capsules etc.)
B42 Liquids
B43 Sterile liquids
B44 Semi solid forms (creams etc.)
Indicate the total volume of bulk goods that was packed at your site in the last year.
B45 Solid forms (tablets, capsules etc.)
B45a - thereof packed in blisters
B45b - thereof packed in bottles
B46 Liquids
B46a - thereof packed in amps
B46b - thereof packed in vials
B46c - thereof packed in bottles
B47 Sterile liquids
B47a - thereof packed in amps
B47b - thereof packed in vials
B47c - thereof packed in bottles
B47d - thereof packed in syringes
B48 Semi solid forms (creams etc.)
Indicate the number of packed units (boxes for sale) at your site in the last year.
B49 Solid forms (tablets, capsules etc.)
B50 - average packaging size
B51 Liquids
B52 - average packaging size
B53 Sterile liquids
B54 - average packaging size
B55 Semi solid forms (creams etc.)
B56 - average packaging size
Batch & campaign structure in the last year
Number of batches produced in the last year (separated in API, Formulation & Packaging)
Number of campaigns in the last year
Batch size range from min to max: Formulation/pelleting (in kg); Packaging units (in units)
Vertical integration of manufacturing in the last year
B61 Different process steps performed API
B62 Different process steps performed Formulation
B63 Different process steps performed Packaging
B64 Total number of process steps executed at your site. (Sum of the 3 metrics above)



Innovation structure
B65 Number of new drug introductions within the last 3 years (“new drugs” should be understood in the sense of new for the site)
B66 Number of launched stock keeping units (SKU) at the site within the last 3 years (new for the site)
B67 Number of inspections at site within the last 3 years from… regulatory body
B68 … headquarters
B69 … customer
Age of production technology in the last year
B70 Percentage of machines which are less than 3 years old
B71 Percentage of machines that are between 3 and 5 years old
B72 Percentage of machines that are between 6 and 10 years old
B73 Percentage of machines that are older than 10 years
Level of automation in the last year
B74 Percentage of machines that are manually operated
B75 Percentage of machines that are operated with IT-support
B76 Percentage of machines that are fully automated (without supervision/ from a control room)
Implementation of electronic batch records in the last year
B77 Range from “Completely” to “Not at all”
Customer structure in the last year
B78 Overall number of customers
B79 Percentage of internal customers (customer within the same manufacturing network)
B80 Percentage of customers delivered frequently
B81 Frequently means on average every … days.
B82 Number of orders received from your customers
Customers by regions (primary the location of production, not the registered office) in the last year.
B83 West Europe
B84 North America
B85 East Europe
B86 South and Middle America
B87 Middle East and Africa
B88 India
B89 China
B90a Asia (excl. India, China and Japan)
B90b Japan
B90c Australia and New Zealand
History of the plant
B91 The plant is an original plant and was founded by the company itself (yes/no)
B92 The plant was acquired during a merger or acquisition within… the last 3 years / the last 3-10 years / more than 10 years ago



Appendix 2.2: Questions and Definitions from St.Gallen OPEX Report – Cost and Headcount

Table 29: Appendix: Cost and Headcount figures from St.Gallen OPEX Questionnaire

UID | Metrics | Description

Costs and Headcounts

Cost structure
C01 | Sales | Please indicate the non-consolidated sales of your production site in the last year.
C02 | Cost of Goods Sold | On the income statement, the cost of purchasing raw materials and manufacturing finished products. Equal to the beginning inventory plus the cost of goods purchased during the last year minus the ending inventory.
C03 | Accounting principle | Please indicate the accounting principles on which the data is based (e.g. US GAAP).
C04 | Direct material costs | Cost for raw materials and preliminary products.
C05 | Indirect material costs | Cost for operating supplies as well as services.
C06 | Direct labor costs | Cost for employees directly involved in manufacturing and quality labs (see also FTE structure below).
C07 | Indirect labor costs | Cost for plant employees whose time is not charged to specific finished products (see also FTE structure below).
C08 | Costs for machines & tools | Cost for machines, equipment, tools, spare parts including costs for depreciation, electricity for the machines etc.
C09 | Costs for property and plant | Cost for property and plant including costs of depreciation and other costs for electricity, water etc.
C10 | Corporate allocations | Cost for corporate expenses charged to the plant.
C11 | Other costs | Cost for e.g. Sales & Clerical, Marketing, R&D located at the dedicated plant.
C12 | Maintenance costs | Total maintenance cost includes both internally and externally rendered services, also cost for spare parts and consumables used for maintenance.
C13 | Preventive maintenance cost | Cost for planned and condition-based maintenance activities.
C14 | Cost of quality | Overall costs for quality assurance (usually the total number from your cost center(s) QC/QA); percentage of cost due to outsourced quality activities.
C15 | Rework cost | Cost due to rework.
C16 | Destruction cost | Cost due to destruction.
Headcount structure
C17 | Direct labor | Sum of the following 6 metrics
Production labor: C18 API production; C19 Pharmaceutical production; C20 Packaging
Quality control: C21 Testing (“taking samples”) incoming; C22 Testing (“taking samples”) product testing; C23 Batch review and approval
C24 | Indirect labor | Sum of the following 17 metrics
Quality control: C25 Testing (“taking samples”) management; C26 Laboratories management; C27 Environmental monitoring; C28 Stability testing
Quality assurance: C29 Validation of process, equipment and method; C30 Quality planning
Maintenance: C31 Reactive (firefighting); C32 Basic care (e.g. lubrication, cleaning); C33 Preventive (calendar-based exchange of parts); C34 Predictive (condition-based exchange of parts); C35 Other maintenance
Other functions: C36 Production management; C37 Materials management (procurement and logistics); C38 Manufacturing engineering; C39 EH&S (environment, health and safety); C40 IT-support; C41 Miscellaneous (HR, finance, management)
C42 | Overall | Overall number of FTEs at the site
Employment structure
C43 Percentage of FTEs permanently employed by the company
C44 Percentage of FTEs temporarily employed by the company
C45 Percentage of FTEs temporarily employed by a temp agency

Appendix 2.3: Questions and Definitions from St.Gallen OPEX Report – Enabler

The following table shows the Enablers from the St.Gallen OPEX Questionnaire. Furthermore, Table 30 indicates which Enablers have been assigned to Quality Behavior and which have been assigned to Quality Maturity.

Table 30: Appendix: Enabler from St.Gallen OPEX Questionnaire

Quality Quality
Culture Culture None
UID Enabler Behavior Maturity
Preventive maintenance
D01 We have a formal program for maintaining our machines and equipment. x
Maintenance plans and checklists are posted closely to our machines and
x
D02 maintenance jobs are documented.
We emphasize good maintenance as a strategy for increasing quality and
x
D03 planning for compliance.
All potential bottleneck machines are identified and supplied with additional
x
D04 spare parts.
We continuously optimize our maintenance program based on a dedicated
x
D05 failure analysis.
Our maintenance department focuses on assisting machine operators per-
x
D06 form their own preventive maintenance.
Our machine operators are actively involved into the decision making pro-
x
D07 cess when we decide to buy new machines.
Our machines are mainly maintained internally. We try to avoid external
x
D08 maintenance service as far as possible.
Technology assessment and usage
D09 Our plant is situated at the leading edge of new technology in our industry. x
We are constantly screening the market for new production technology and
x
D10 assess new technology concerning its technical and financial benefit.
D11 We are using new technology very effectively. x
D12 We rely on vendors for all of our equipment. x
D13 Part of our equipment is protected by firm`s patents. x



D14 | Proprietary process technology and equipment helps us gain a competitive advantage. | x

Housekeeping
D15 | Our employees strive to keep our plant neat and clean. | x
D16 | Our plant procedures emphasize putting all tools and fixtures in their place. | x
D17 | We have a housekeeping checklist to continuously monitor the condition and cleanliness of our machines and equipment. | x

Process management
E01 | In our company direct and indirect processes are well documented. | x
E02 | We continuously measure the quality of our processes by using process measures (e.g. on-time-in-full delivery rate). | x
E03 | Our process measures are directly linked to our plant objectives. | x
E04 | In our company there are dedicated process owners who are responsible for planning, management, and improvement of their processes. | x
E05 | A large percentage of equipment on the shop floor is currently under statistical process control (SPC). | x
E06 | We make use of statistical process control to reduce variances in processes. | x
E07 | For root cause analysis we have standardized tools to get a deeper understanding of the influencing factors (e.g. DMAIC). | x
E08 | We operate with a high level of PAT implementation for real-time process monitoring and control. | x

Cross-functional product development
E09 | Manufacturing engineers (e.g. industrial engineers) are involved to a great extent in the development of a new drug formulation and the development of the necessary production processes. | x
E10 | In our company product and process development are closely linked to each other. | x x
E11 | Due to close collaboration between the R&D and manufacturing departments, we could significantly shorten our time for product launches ("scale-ups") in our plant. | x
E12 | For the last couple of years we have not had any delays in product launches at our plant. | x
E13 | For product and process transfers between different units or sites, standardized procedures exist which ensure a fast, stable and compliant knowledge transfer. | x

Customer involvement
E14 | We are frequently in close contact with our customers (e.g. e-mail, telephone, e-rooms). | x
E15 | Our customers frequently give us feedback on quality and delivery performance. | x
E16 | We regularly survey our customers' requirements. | x
E17 | We regularly conduct customer satisfaction surveys. | x
E18 | On-time delivery is our philosophy. | x
E19 | We have joint improvement programs with our customers to increase our performance. | x

Supplier quality management
E20 | Quality is our number one criterion in selecting suppliers. | x
E21 | We rank our suppliers; therefore we conduct supplier qualifications and audits. | x
E22 | We mostly use suppliers that we have validated. | x
E23 | For a large percentage of suppliers we do not perform any inspections of the incoming parts/materials. | x



E24 | Inspections of incoming materials are usually performed in proportion to the past quality performance or type of supplier. | x
E25 | Basically, we inspect 100% of our incoming shipments. | x
E26 | We have joint improvement programs with our suppliers to increase our performance. | x

Set-up time reduction
F01 | We are continuously working to lower set-up and cleaning times in our plant. | x
F02 | We have low set-up times for equipment in our plant. | x
F03 | Our crews practice set-ups regularly to reduce the time required. | x
F04 | To increase flexibility, we put high priority on reducing batch sizes in our plant. | x
F05 | We have managed to schedule a big portion of our set-ups so that the regular up-time of our machines is usually not affected. | x
F06 | Optimized set-up and cleaning procedures are documented as best-practice processes and rolled out throughout the whole plant. | x

Pull production
F07 | Our production schedule is designed to allow for catching up after production stoppages due to problems (e.g. quality problems). | x
F08 | We use a pull system (kanban squares, containers, or signals) for production control. | x
F09 | We mainly produce according to forecasts. | x
F10 | Suppliers are integrated and vendors fill our kanban containers, rather than filling our purchasing orders. | x
F11 | We value long-term associations with suppliers more than frequent changes in suppliers. | x
F12 | We depend on on-time delivery from our suppliers. | x
F13 | We deliver to our customers in a demand-oriented JIT way instead of a stock-oriented approach. | x
F14 | We mainly produce one unit when the customer orders one. We normally do not produce to stock. | x

Layout optimization
F15 | Our processes are located close together so that material handling and part storage are minimized. | x
F16 | Products are classified into groups with similar processing requirements to reduce set-up times. | x
F17 | Products are classified into groups with similar routing requirements to reduce transportation time. | x
F18 | The layout of the shop floor facilitates low inventories and fast throughput. | x
F19 | As we have classified our products based on their specific requirements, our shop floor layout can be characterized as separated into "mini-plants". | x
F20 | Currently our manufacturing processes are highly synchronized over all steps by one takt. | x
F21 | Currently our manufacturing processes from raw material to finished goods involve almost no interruptions and can be described as a full continuous flow. | x
F22 | At the moment we are working hard to reach the status of a full continuous flow with no interruption between raw material and finished goods. | x
F23 | We use "Value Stream Mapping" as a methodology to visualize and optimize processes. | x

Planning adherence
F24 | We usually meet our production plans every day. | x



F25 | We know the root causes of variance in our production schedule and are continuously trying to eliminate them. | x
F26 | To increase our planning adherence we share data with customers and suppliers based on a rolling production plan. | x
F27 | We have smoothly leveled our production capacity throughout the whole production process. | x
F28 | Our plant has flexible working shift models so that we can easily adjust our production capacity according to current demand changes. | x
F29 | A smoothly leveled production schedule is preferred to a high level of capacity utilization. | x

Direction setting
G01 | Our site has an exposed site vision and strategy that is closely related to our corporate mission statement. | x
G02 | Our vision, mission and strategy are broadly communicated and lived by our employees. | x
G03 | Goals and objectives of the manufacturing unit are closely linked to and consistent with corporate objectives. The site has a clear focus. | x
G04 | The overall objectives of the site are closely linked to the team or personal objectives of our shop-floor teams and employees. | x
G05 | Our manufacturing managers (head of manufacturing, site leader etc.) have a good understanding of how the corporate/divisional strategy is formed. | x
G06 | Our manufacturing managers know exactly what the most important criteria for manufacturing jobs are (i.e. low costs, delivery, quality, etc.). | x

Management commitment and company culture
G07 | Site management (committee) empowers employees to continuously improve the processes and to reduce failure and scrap rates. | x
G08 | Site management (committee) is personally involved in improvement projects. | x
G09 | There is too much competition and too little cooperation between the departments. | x
G10 | Communication is made via official channels. | x
G11 | The site has an open communication culture. There is a good flow of information between the departments and the different management levels. | x
G12 | We are informed early enough about innovations (e.g. new products, new SKUs, new technologies). | x
G13 | Problems (e.g. complaints, etc.) are always traced back to their origin to identify root causes and to prevent making the same mistakes twice. | x
G14 | The achievement of high quality standards is primarily the task of our QA/QC departments. | x
G15 | Our employees continuously strive to reduce any kind of waste in every process (e.g. waste of time, waste of production space, etc.). | x
G16 | Command and control is seen as the most effective leadership style rather than an open culture. | x

Employee involvement and continuous improvement
G17 | We have implemented tools and methods to deploy a continuous improvement process. | x
G18 | Our employees are involved in writing policies and procedures (from the site vision down to standard operating procedures). | x
G19 | Shop-floor employees actively drive suggestion programs (not exclusively linked to a suggestion system in place). | x
G20 | Our work teams cannot take significant actions without supervisors' or middle managers' approval. | x
G21 | Our employees have the authority to correct problems when they occur. | x



G22 | Occurring problems should be solved by supervisors. | x
G23 | Supervisors include their employees in solving problems. | x
G24 | Our plant forms cross-functional project teams to solve problems. | x
G25 | The company takes care of the employees. | x
G26 | We have organized production employees into teams in production areas. For each team there is one dedicated team member who is responsible for supervisory tasks. | x
G27 | We have organized production employees into teams in production areas. For team leadership we have an additional supervisory level in our organization. | x

Functional integration and qualification
G28 | Each of our employees within our work teams (in case workers are organized as teams) is cross-trained so that they can fill in for others when necessary. | x
G29 | At our plant we have implemented a formal program to increase the flexibility of our production workers. Employees rotate to maintain their qualification. | x
G30 | In our company there are monthly open feedback meetings. | x
G31 | The information from these official feedback meetings is used systematically in further training. | x
G32 | We continuously invest in training and qualification of our employees. We have a dedicated development and qualification program for our workers. | x

Standardization and simplification
H01 | We emphasize standardization as a strategy for continuously improving our processes, machines and products. | x
H02 | We use our documented operating procedures to standardize our processes (e.g. set-ups). | x
H03 | Optimized operating procedures (e.g. shortened set-ups) are documented as best-practice processes and rolled out throughout the whole plant. | x
H04 | Standardized functional descriptions have reduced the period of vocational training for new employees. | x
H05 | We use standardized machines and equipment (e.g. standardized machine design, standardized spare parts etc.) to achieve a high up-time of our machines. | x
H06 | By using standardized machines and fixtures we could significantly lower our material costs for spare parts. | x

Visual management
H07 | Performance charts at each of our production processes (e.g. packaging) indicate the annual performance objectives. | x
H08 | Technical documents (e.g. maintenance documents) and workplace information (e.g. standardized inspection procedures, team structures) are posted on the shop floor and are easily accessible and visible for all workers. | x
H09 | Charts showing the current performance status (e.g. current scrap rates, current up-times etc.) are posted on the shop floor and visible for everyone. | x
H10 | Charts showing current takt times and schedule compliance (e.g. Andon boards) are posted on the shop floor and visible for everyone. | x



Appendix 2.4: Questions and Definitions from St.Gallen OPEX Report – Performance Metrics

Table 31: Appendix: Performance Metrics from St.Gallen OPEX Questionnaire

UID | Metric | Description (X = reported separately for API, Formulation & Packaging)

Performance of the plant

TPM performance
I29a-c | Setup and Cleaning | The time spent for setup and cleaning as a percentage of the scheduled time. [%] | X
I30a-c | Dedicated equipment | The percentage of your equipment that is dedicated to one product. [%] | X
I31a-c | Unplanned maintenance | Unplanned maintenance work as a percentage of the overall time spent on maintenance work. [%] | X
I32a-c | Shift model | Number of shifts per day (Mon-Sun).
I33a-c | Shift length | Average length of one shift in hours. [h] (Mon-Sun)
I34a-c | Loading = Scheduled Time / Calendar Time | Scheduled Time: the time during which the equipment was scheduled or expected to operate during the period analyzed. Calendar Time: usually 365 days, 8760 hours. | X
I35a-c | (OEE) Availability = (Scheduled Time - Downtime) / Scheduled Time | Downtime: breakdowns (unplanned downtime) plus setup downtime. | X
I36a-c | (OEE) Performance = (Amount Produced x Ideal Cycle Time) / Available Time | Amount Produced: the number of units produced during the period. Ideal Cycle Time: the designed or optimum cycle time. Available Time: the time the machine actually ran (scheduled time minus downtime). | X
I37a-c | (OEE) Quality = (Input - Defects) / Input | Input: the number of units started through the process. Defects: the number of defective units (even if they were subsequently salvaged). | X
I38a-c | Overall Equipment Effectiveness | OEE = (OEE) Availability x (OEE) Performance x (OEE) Quality (see the illustrative sketch after Table 31). | X
TQM performance
I01 | Complaint rate (customer) | Number of justified complaints as a percentage of all customer orders delivered.
I02 | Yield | Real achieved output in pharmaceutical production (input minus material losses, weighing, sediments etc.).
I03 | RFT | Total number of batches produced without document errors or exception reports as a percentage of the total number of batches produced.
I04 | Rejected batches | Number of rejected batches as a percentage of all batches produced.
I05 | Scrap rate | Average difference between 100% and real achieved output in packaging operations.
I06 | Complaint rate (supplier) | Number of complaints as a percentage of all deliveries received (from your supplier).
I07 | Release time | Average time from sampling to release of finished products, including all waiting times.
I08 | Deviations | Number of deviations per month that arise from raw materials purchased, production components (equipment) and product/process specifications.
I09 | Deviation closure time | Average deviation closure time in days.



JIT performance
I10 | Days on hand (DOH) | Average inventory less write-downs x 365, divided by the cost of goods sold (COGS) (see the illustrative sketch after Table 31).
I11 | Service level - delivery (OTIF) | Perfect order fulfillment: percentage of orders shipped on time from your site (+/- 1 day of the agreed shipment day), in the right quantity (+/- 3% of the agreed quantity) and in the right quality to your customer.
I12 | Service level - supplier | Perfect order fulfillment: percentage of orders shipped on time to your site (+/- 1 day of the agreed shipment day), in the right quantity (+/- 3% of the agreed quantity) and in the right quality from your supplier.
I13 | Forecast accuracy | Actual orders received compared to the annual sales forecast.
I14 | Production schedule accuracy | Number of production orders released as scheduled as a percentage of all production orders released within your freezing period.
I15 | Priority orders | Number of priority orders as a percentage of all orders produced.
I16 | Production freeze period | Freezing period in which you do not allow any changes to your production schedule.
I17 | Replacement time to customer | Response time for short-term delivery to the customer for goods not on stock (supplier delivery time plus your production time).
I18-I20 | Cycle time | Cycle time (from weighing to packaging), reported in three bands: < 15 days (I18), 15-30 days (I19) and > 30 days (I20); e.g. 30% of all products have a cycle time of 15-30 days and 70% of all products have a cycle time of more than 30 days.
I21 | Raw material turns | Annual cost of raw materials purchased divided by the average raw material inventory.
I22 | WIP turns | Annual cost of raw materials purchased plus annual cost of conversion, divided by the average work-in-process inventory.
I23 | Finished goods turns | Annual cost of goods sold divided by the average finished goods inventory.
I24 | Average order lead time | Average time between a customer placing an order and receiving delivery.
I25a-c | Average production lead time | Average time in days from receiving the raw material to release of products in API production. (Separated in waiting time, production & QA/QC.)
I26a-c | Average production lead time | Average time in days from receiving the raw material to release of finished products in pharmaceutical production. (Separated in waiting time, production & QA/QC.)
I27a-c | Average changeover time | Average time in hours spent between different products for setting up and cleaning the equipment. (Separated in API, Formulation & Packaging.)
I28a-c | Changeovers | Average number of changeovers performed per month, including changing lots and changing formats. (Separated in API, Formulation & Packaging.)

EMS performance
I39 | Management layers | Number of management levels between production workers and the highest-ranking manager at the site (e.g. worker - supervisor - department manager - site leader = 4 levels).
I40 | Management span of control | The average number of employees directly reporting to supervisors.
I41 | Group work | Percentage of production workers organized in self-directed teams in terms of e.g. holiday planning and team meetings.
I42 | Functional integration | Number of production workers qualified to work on 3 or more technologies/functional areas as a percentage of all workers.
I43 | Suggestions (quantity) | Average number of suggestions per employee in the last year.
I44 | Suggestions (financial impact) | Estimated total savings due to suggestions that were implemented.
I45 | Employee fluctuation | Employees leaving your site due to terminations, expired work contracts, retirements etc. as a percentage of all employees.
I46 | Sick leave | Total time of employees absent (e.g. sick leave) as a percentage of the total working time.
I47 | Overtime | Hours worked in paid overtime (excluding overtime compensated with free time) in the last year as a percentage of the overall working time.
I48 | Training | Number of training days per employee (all kinds of training, off- and on-the-job) in the last year.



I49 | Level of qualification | Number of workers with prior work-related qualification/education as a percentage of the total number of workers at your site.
I50 | Level of safety | Reportable incidents due to accidents and safety, on average per month, that are internally (on site) reported.
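Several of the metrics defined in Table 31 are simple ratios. As a minimal illustration of how they combine, the following Python sketch computes Loading and OEE (I34-I38) as well as two of the JIT inventory ratios (I10, I23); all function names and input figures are hypothetical examples for illustration, not study data.

    # Illustrative computation of ratios defined in Table 31 (I34-I38, I10, I23).
    # All variable names and input figures are hypothetical, not study data.

    def oee(scheduled_h, downtime_h, amount_produced, ideal_cycle_h,
            input_units, defects, calendar_h=8760):
        """Return Loading (I34) and Overall Equipment Effectiveness (I38) as fractions."""
        loading = scheduled_h / calendar_h                             # I34: Loading
        available_h = scheduled_h - downtime_h                         # time the machine actually ran
        availability = available_h / scheduled_h                       # I35: Availability
        performance = (amount_produced * ideal_cycle_h) / available_h  # I36: Performance
        quality = (input_units - defects) / input_units                # I37: Quality
        return loading, availability * performance * quality           # I38: OEE = A x P x Q

    def days_on_hand(avg_inventory_less_writedowns, annual_cogs):
        """I10: average inventory expressed in days of cost of goods sold."""
        return avg_inventory_less_writedowns * 365 / annual_cogs

    def finished_goods_turns(annual_cogs, avg_fg_inventory):
        """I23; raw material turns (I21) and WIP turns (I22) follow the same pattern."""
        return annual_cogs / avg_fg_inventory

    loading, oee_value = oee(scheduled_h=6000, downtime_h=600,
                             amount_produced=900_000, ideal_cycle_h=0.005,
                             input_units=920_000, defects=18_000)
    print(f"Loading {loading:.1%}, OEE {oee_value:.1%}")
    print(f"DOH {days_on_hand(12_000_000, 60_000_000):.0f} days, "
          f"FG turns {finished_goods_turns(60_000_000, 7_500_000):.1f}")

With these example figures the sketch yields a Loading of roughly 68%, an OEE of roughly 74%, 73 days on hand and 8.0 finished goods turns, illustrating how an OEE well below Loading can coexist with high individual factors.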

Appendix 3: SPSS Output – MLR – Impact of C-Categories on PQS Effectiveness

Figure 22 shows the SPSS output of the MLR with the Operational Stability Score (OS) and the Supplier Reliability Score (SR) as predictor variables and Service Level Delivery (OTIF) as the dependent variable.

Figure 22: Appendix: MLR Impact of C-Categories on PQS Effectiveness
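For readers without SPSS, a minimal sketch of an equivalent regression in Python (statsmodels) is shown below; the file name and column labels ("OS", "SR", "OTIF") are assumptions for illustration, not the study's actual variable names.

    # Minimal re-creation of the MLR behind Figure 22, assuming a site-level
    # dataset with columns "OS", "SR" and "OTIF" (names are illustrative).
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("site_scores.csv")      # hypothetical file: one row per site
    X = sm.add_constant(df[["OS", "SR"]])    # predictors: OS and SR scores
    model = sm.OLS(df["OTIF"], X, missing="drop").fit()
    print(model.summary())                   # coefficients, R^2 and p-values, as in SPSS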



Figure 23: Appendix: MLR Impact of Supplier Reliability (SR) on Operational Stability (OS)



Appendix 4: SPSS Output – MLR – Impact of Performance Metrics on PQS Effectiveness

Reference chapter 6.2.5.3

Figure 24: Appendix: MLR - Enter method



Figure 25: Appendix: MLR - Stepwise method



Figure 26: Appendix: MLR - Backward method
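SPSS's Enter method (Figure 24) fits all predictors at once, while the Stepwise (Figure 25) and Backward (Figure 26) methods add or remove predictors iteratively based on their significance. As a rough illustration of the Backward method, the sketch below removes the least significant predictor until all remaining p-values fall below a removal threshold; the 0.10 threshold, data frame and column names are assumptions for illustration, and SPSS's exact entry/removal criteria may differ.

    # Rough sketch of backward elimination (cf. Figure 26); the threshold and
    # column names are illustrative assumptions, not SPSS's exact defaults.
    import pandas as pd
    import statsmodels.api as sm

    def backward_eliminate(df, dependent, predictors, p_remove=0.10):
        predictors = list(predictors)
        while predictors:
            X = sm.add_constant(df[predictors])
            fit = sm.OLS(df[dependent], X, missing="drop").fit()
            p_values = fit.pvalues.drop("const")     # ignore the intercept
            worst = p_values.idxmax()                # least significant predictor
            if p_values[worst] <= p_remove:          # all remaining predictors significant
                return fit
            predictors.remove(worst)                 # otherwise drop it and refit
        return None                                  # no predictor survived

    # Hypothetical call with performance-metric columns:
    # fit = backward_eliminate(df, "PQS_Effectiveness", ["Yield", "RFT", "OEE", "DOH"])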



Appendix 5: Correlation table Compliance Metrics and Performance Metrics

Reference chapter 6.2.6.3

Correlations

Performance metric columns (left to right): Unplanned Maintenance, OEE Average, Rejected batches (absolute), Yield, Scrap rate, Release Time, Deviations, Deviation closure time, Customer Complaint Rate, Service Level - Delivery (OTIF).

Number of CAPAs
  Pearson Correlation: .088 | .401 | -.371 | .518 | .217 | -.459 | -.287 | -.212 | -.079 | -.025
  Sig. (2-tailed): .774 | .174 | .192 | .125 | .456 | .182 | .393 | .466 | .787 | .949
  N: 13 | 13 | 14 | 10 | 14 | 10 | 11 | 14 | 14 | 9

Number of non-critical overdue CAPAs
  Pearson Correlation: -.578* | .376 | -.400 | .378 | .472 | -.502 | -.087 | -.230 | -.073 | -.810**
  Sig. (2-tailed): .038 | .205 | .156 | .281 | .088 | .139 | .800 | .429 | .805 | .008
  N: 13 | 13 | 14 | 10 | 14 | 10 | 11 | 14 | 14 | 9

# of observations of a health authority inspection
  Pearson Correlation: -.366 | -.033 | -.120 | -.267 | -.101 | .068 | .072 | .322 | -.186 | -.211
  Sig. (2-tailed): .218 | .915 | .684 | .456 | .730 | .852 | .832 | .262 | .525 | .586
  N: 13 | 13 | 14 | 10 | 14 | 10 | 11 | 14 | 14 | 9

# of observations per internal audit
  Pearson Correlation: -.271 | -.413 | .580* | -.267 | -.609* | .575 | .433 | .371 | -.158 | -.017
  Sig. (2-tailed): .371 | .161 | .030 | .456 | .021 | .082 | .184 | .192 | .589 | .965
  N: 13 | 13 | 14 | 10 | 14 | 10 | 11 | 14 | 14 | 9

Number of recalls
  Pearson Correlation: .601* | .118 | .195 | -.522 | -.052 | .289 | -.207 | -.139 | .280 | .224
  Sig. (2-tailed): .030 | .701 | .505 | .122 | .859 | .417 | .541 | .637 | .333 | .562
  N: 13 | 13 | 14 | 10 | 14 | 10 | 11 | 14 | 14 | 9

*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
b. Cannot be computed because at least one of the variables is constant.

Figure 27: Appendix: Correlation table Compliance Metrics and Performance Metrics
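The table pairs each compliance metric with each performance metric via Pearson's r, a two-tailed significance test, and a pairwise sample size N. A minimal Python sketch of the same computation is shown below; the file and column names are placeholders for illustration, not the study's actual variable labels.

    # Sketch of the pairwise Pearson correlations in Figure 27 (pandas + scipy).
    # File and column names are placeholders for illustration.
    import pandas as pd
    from scipy import stats

    compliance = ["Number_of_CAPAs", "Noncritical_overdue_CAPAs", "Number_of_recalls"]
    performance = ["OEE_Average", "Yield", "Rejected_batches", "OTIF"]

    df = pd.read_csv("metrics.csv")                  # hypothetical site-level dataset
    for c in compliance:
        for p in performance:
            pair = df[[c, p]].dropna()               # pairwise deletion, as SPSS does
            r, p_value = stats.pearsonr(pair[c], pair[p])
            flag = "**" if p_value < .01 else "*" if p_value < .05 else ""
            print(f"{c} vs {p}: r = {r:.3f}{flag}, p = {p_value:.3f}, N = {len(pair)}")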

Appendix 6: Cultural Excellence Subelements


Reference chapter 6.5.3.3

Figure 28: Appendix: Implementation Level of Quality Behavior and Maturity for OTIF HP vs. OTIF LP

Figure 29: Appendix: Quality Behavior and Maturity for OTIF HP vs. OTIF LP t-Test Output

Figure 30: Appendix: Engagement Metrics Score for OTIF HP vs. OTIF LP



Figure 31: Appendix: Engagement Metrics Score for OTIF HP vs. OTIF LP t-Test Output
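Figures 29 and 31 report independent-samples t-tests comparing OTIF high performers (HP) with low performers (LP). A minimal sketch of such a comparison in Python is shown below; the grouping column "OTIF_group" and the score column are illustrative assumptions, not the study's actual labels.

    # Sketch of an HP-vs-LP comparison as reported in Figures 29 and 31 (scipy).
    # The columns "OTIF_group" and "Engagement_Metrics_Score" are illustrative.
    import pandas as pd
    from scipy import stats

    df = pd.read_csv("sites.csv")                    # hypothetical site-level dataset
    hp = df.loc[df["OTIF_group"] == "HP", "Engagement_Metrics_Score"].dropna()
    lp = df.loc[df["OTIF_group"] == "LP", "Engagement_Metrics_Score"].dropna()

    # Welch's t-test (no equal-variance assumption); SPSS reports both variants.
    t_stat, p_value = stats.ttest_ind(hp, lp, equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}, n_HP = {len(hp)}, n_LP = {len(lp)}")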

Appendix 7: OPEX Enabler Categories Implementation for OS HP vs. LP

Reference chapter 6.5.4.3

Figure 32: Appendix: OPEX Enabler Categories Implementation for OS HP vs. LP

Figure 33: Appendix: OPEX Enabler Categories t-Test Output



Funding for this report was made possible, in part, by the Food and Drug Administration through grant [1U01FD005675-01]. The views expressed in written materials or publications and by speakers and moderators do not necessarily reflect the official policies of the Department of Health and Human Services; nor does any mention of trade names, commercial practices, or organization imply endorsement by the United States Government.

