System Identification
Arun K. Tangirala
Module 1
Lecture 2
Arun K. Tangirala, System Identification, July 27, 2013
Contents of Lecture 2
Examples

Qualitative:
- Increase in coolant flow rate reduces temperature
- Strain is directly proportional to stress
- Increase in fuel flow to an engine increases the speed

Quantitative:
(i) y(t) = b u(t) + c
(ii) y[k] = a1 y[k-1] + a2 y[k-2] + b u[k-2]
(iii) y(t) = A e^(b u(t-D))
(iv) dy(t)/dt + a1 y(t) = b u(t)
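Model (ii) above is a linear difference equation and can be simulated directly by recursion. A minimal sketch, with illustrative (assumed) coefficient values and a unit-step input:

```python
# Simulate model (ii): y[k] = a1*y[k-1] + a2*y[k-2] + b*u[k-2],
# with zero initial conditions. Coefficient values are assumptions
# chosen only so that the response is stable.

def simulate_model_ii(u, a1, a2, b):
    """Recursively compute the output of the second-order difference equation."""
    y = [0.0] * len(u)
    for k in range(len(u)):
        y_1 = y[k - 1] if k >= 1 else 0.0
        y_2 = y[k - 2] if k >= 2 else 0.0
        u_2 = u[k - 2] if k >= 2 else 0.0
        y[k] = a1 * y_1 + a2 * y_2 + b * u_2
    return y

u = [1.0] * 20                                   # unit-step input
y = simulate_model_ii(u, a1=0.5, a2=-0.2, b=1.0)
# Steady state: y_ss = b / (1 - a1 - a2) * u_ss = 1 / 0.7 ≈ 1.4286
```

The recursion also makes the two-sample input delay in model (ii) explicit: the output does not respond until u[k-2] becomes available.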
Approaches to modelling
The focus of this course and the general identification literature is on the development of quantitative models.
As we learnt in the previous lecture, there are basically two approaches to modelling: (i) from fundamentals (first-principles models) and (ii) from experiments (empirical models).
For several applications in practice, empirical models are very useful due to lack of
sufficient process knowledge and the flexibility in model structure.
System Identification
The subject of system identification is concerned with
development of (black-box) models from experimental data, with
scope for incorporating any a priori process knowledge
- Given the complexity of processes and that industries routinely collect large amounts of data, it is sensible to build data-based models.
- The empirical approach is also favoured in disturbance modelling, which involves building time-series models.
- First-principles models are very useful for off-line applications (e.g., simulations).
- In fact, simplified / reduced-order versions are used in on-line applications.
- First-principles models also contain some empirical correlations.
Recall
[Block diagram: the input signal enters the actuators, which drive the process; sensors measuring the process output yield the measured responses and disturbances. Measurable disturbances act on the process, and sensor noise corrupts the measured output.]
Overall Model
The overall model developed through identification is a composite model. The deterministic part is driven by a physical input, while the stochastic part of the model is driven by a shock wave (fictitious and unpredictable).
[Block diagram: a fictitious, random shock drives the stochastic part of the model, while the physical (exogenous) inputs drive the deterministic part; the two outputs add to give the observed process response. Time-series modelling concepts are used to build the stochastic models.]
A good identification exercise separates the deterministic and stochastic parts with reasonable accuracy; neither part should be contained in the other.
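A small simulation illustrates this composite structure; the first-order deterministic model, the AR(1) disturbance and their coefficients below are assumptions for illustration only, not forms prescribed here:

```python
# Observed response = deterministic part (driven by the physical input)
#                   + stochastic part (driven by a fictitious random shock).
import random

random.seed(0)
N = 200
u = [1.0] * N                                     # physical (exogenous) input: unit step
e = [random.gauss(0.0, 0.1) for _ in range(N)]    # fictitious, unpredictable shock

y_det = [0.0] * N                                 # assumed first-order deterministic model
v = [0.0] * N                                     # assumed AR(1) stochastic disturbance
for k in range(1, N):
    y_det[k] = 0.8 * y_det[k - 1] + 0.2 * u[k - 1]
    v[k] = 0.5 * v[k - 1] + e[k]

y = [y_det[k] + v[k] for k in range(N)]           # observed process response
```

Identification works in the opposite direction: given only u and y, it must recover both the deterministic and the stochastic sub-models.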
The exercise of identification involves a systematic and iterative procedure starting from acquisition
of good data to a careful validation of the identified model.
Identification Workflow
Primarily one finds three stages: data generation and acquisition, model development, and a critical validation test.

[Workflow diagram: prior process knowledge and measurable disturbances feed into data generation and acquisition; the data is then visualized and pre-processed; non-parametric analysis guides the selection of candidate models, which are estimated using optimization criteria; validation tests close the loop by feeding back to the earlier stages.]
Points to remember
Data Acquisition
Data is food for identification. The quality of the final model depends on the quality of the data; hence great care must be taken in generating it. The nature of the input, the sampling rate and the sample size are the primary influencing factors.
A vast theory exists on the design of experiments, particularly the input design, i.e.,
what kind of excitation is best for a given process. The nature of input is tied to
the end-use of the model - whether it is eventually used for control, fault detection
or in simulation.
The theory of input design also allows us to obtain a preliminary idea of the model complexity that can be supported by a given data set. The basic tenet is that the excitation in the input should be such that its effect on the measured output is larger than that of sensor noise / unmeasured disturbances.
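One widely used excitation in practice is the pseudo-random binary sequence (PRBS), which packs persistent excitation into a two-level signal. A minimal sketch using a linear-feedback shift register; the register length, feedback taps and signal levels are illustrative assumptions:

```python
# Generate a PRBS with an n-bit linear-feedback shift register (LFSR).
# With primitive feedback taps, the sequence repeats with period 2**n - 1.

def prbs(n_bits=4, taps=(3, 2), length=15, levels=(-1.0, 1.0)):
    """Return a two-level PRBS; taps are 0-indexed register positions to XOR."""
    state = [1] * n_bits                      # any non-zero seed works
    out = []
    for _ in range(length):
        feedback = state[taps[0]] ^ state[taps[1]]
        out.append(levels[state[-1]])         # map the output bit to a signal level
        state = [feedback] + state[:-1]       # shift the register
    return out

u = prbs()   # one full period of a 4-bit PRBS (15 samples)
```

Over one full period the sequence is nearly balanced (eight samples at one level, seven at the other), one of the properties that makes PRBS inputs attractive for identification.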
Data pre-processing
Often the data has to be subjected to quality checks and a pre-processing step before
presenting it to the model estimation algorithm. Some of the common factors that
affect data quality include outliers, missing data and high-levels of noise.
Outliers are data points that do not conform to the rest of the data, largely due to sensor malfunctions and/or abrupt, brief process excursions. Detecting and handling outliers can be very complicated and challenging, primarily because there is no strict mathematical definition of an outlier and their character varies from process to process. A few reasonably good statistical methods have emerged over the last few decades in this context, and the subject continues to evolve in search of robust, universal methods.
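As one concrete instance of such statistical methods, a median-based (Hampel-type) rule flags points that deviate from the median by more than a few robust standard deviations; the threshold of 3 below is a conventional assumption, not a universal rule:

```python
# Flag samples whose robust z-score (based on the median absolute
# deviation, MAD) exceeds a threshold.
import statistics

def mad_outliers(x, threshold=3.0):
    """Return indices of points whose robust z-score exceeds the threshold."""
    med = statistics.median(x)
    mad = statistics.median(abs(v - med) for v in x)
    scale = 1.4826 * mad          # makes MAD consistent with sigma for Gaussian data
    if scale == 0:
        return []
    return [i for i, v in enumerate(x) if abs(v - med) / scale > threshold]

data = [10.1, 9.9, 10.0, 10.2, 55.0, 9.8, 10.1]   # 55.0 mimics a sensor spike
print(mad_outliers(data))   # → [4]
```

The median and MAD are barely perturbed by the spike itself, which is exactly why this rule is preferred over a mean/standard-deviation test.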
The issue of missing data is prevalent in several applications. Intermittent sensor malfunctioning, power disruptions, non-uniform sampling and data transfer losses are some of the common causes. Several methods have been devised to handle this issue in data analysis and identification.
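The simplest such method for a uniformly sampled record is linear interpolation across each gap; a minimal sketch, assuming missing samples are marked `None` and the record's endpoints are observed:

```python
# Fill interior gaps (runs of None) by linear interpolation between
# the nearest observed neighbours.

def interpolate_missing(x):
    """Linearly interpolate interior None entries; endpoints must be observed."""
    y = list(x)
    i = 0
    while i < len(y):
        if y[i] is None:
            j = i
            while y[j] is None:          # find the end of the gap
                j += 1
            left, right = y[i - 1], y[j]
            gap = j - (i - 1)
            for k in range(i, j):        # fill the gap proportionally
                y[k] = left + (right - left) * (k - (i - 1)) / gap
            i = j
        i += 1
    return y

series = [1.0, None, None, 4.0, 5.0, None, 7.0]
print(interpolate_missing(series))   # → [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```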
Visualization
Visualizing data is a key step in information extraction and signal analysis. The value
of information obtained from visual inspection of data at each stage of identification
is immense.
For a given data set and estimation algorithm, the type, structure and the order of
the model completely determine its predictability. The challenge in identification is
usually experienced at this stage.
There is usually more than one model that can explain the data. However, a careful
adherence to certain guidelines enables the user to discover a good working model.
In choosing a candidate model, specifically one should take cognizance of the following:
- Careful scrutiny of the data to obtain preliminary insights into the input-output delay, type of model (e.g., linear/non-linear), order (e.g., first/second-order), etc.
- Estimation of non-parametric models, i.e., models that do not assume any structure. Non-parametric models provide reliable estimates of (i) delays, (ii) step, impulse and frequency responses (process bandwidth) and (iii) the disturbance spectrum, without any assumptions on the process dynamics.
- Choosing an estimation algorithm that is commensurate with the type and structure of the model (e.g., least squares algorithms provide unique solutions when applied to linear predictive models).
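The last point can be made concrete with a tiny example: a two-term impulse-response (FIR) model is linear in its parameters, so least squares yields a unique solution via the normal equations. The true coefficients and the binary input below are assumptions for illustration:

```python
# Estimate y[k] ≈ g0*u[k] + g1*u[k-1] by least squares, solving the
# 2x2 normal equations in closed form on noise-free synthetic data.
import random

random.seed(1)
g0_true, g1_true = 2.0, 0.5
u = [random.choice([-1.0, 1.0]) for _ in range(200)]      # binary excitation
y = [g0_true * u[k] + g1_true * (u[k - 1] if k else 0.0)
     for k in range(200)]

# Accumulate sum(phi*phi^T) and sum(phi*y) with phi = [u[k], u[k-1]]
s00 = s01 = s11 = b0 = b1 = 0.0
for k in range(1, 200):
    p0, p1 = u[k], u[k - 1]
    s00 += p0 * p0; s01 += p0 * p1; s11 += p1 * p1
    b0 += p0 * y[k]; b1 += p1 * y[k]

det = s00 * s11 - s01 * s01          # nonzero for a persistently exciting input
g0_hat = (s11 * b0 - s01 * b1) / det
g1_hat = (s00 * b1 - s01 * b0) / det
```

With a persistently exciting input the normal-equation matrix is invertible, which is precisely why the least-squares solution is unique for linear predictive models.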
The model selection step is iteratively improved by the feedback obtained from
the model validation stage. In addition, one can incorporate any available process
knowledge to impose a specific model structure.
Arun K. Tangirala System Identification July 27, 2013 41
Module 1 References Lecture 2
Model Validation
Model validation is an integral part of any model development exercise, be it an empirical or a first-principles approach. The model is tested for its predictive ability, reliability (variability in parameter estimates) and complexity (overfitting).
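A standard test of predictive ability is to check whether the residuals are white, i.e., their autocorrelation is insignificant at all non-zero lags. A minimal sketch, using synthetic white residuals as a stand-in for actual model errors:

```python
# Compute the normalized autocorrelation of the residuals and compare
# the first 20 lags against the approximate 95% band ±1.96/sqrt(N).
import math
import random

random.seed(2)
N = 500
residuals = [random.gauss(0.0, 1.0) for _ in range(N)]   # stand-in for model errors

mean = sum(residuals) / N
c0 = sum((r - mean) * (r - mean) for r in residuals) / N

def acf(lag):
    """Normalized autocorrelation of the residuals at the given lag."""
    s = sum((residuals[k] - mean) * (residuals[k - lag] - mean)
            for k in range(lag, N))
    return (s / N) / c0

band = 1.96 / math.sqrt(N)
insignificant = [lag for lag in range(1, 21) if abs(acf(lag)) <= band]
# For white residuals, nearly all of the first 20 lags fall inside the band.
```

For a validated model one would apply the same check to the actual one-step-ahead prediction errors; significant correlation at some lag points to unmodelled dynamics.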
Model Refinement
The outcome of the diagnostic tests constitutes the feedback for the previous stages.
When a model does not meet any of the aforementioned requirements, it immediately calls for improvements at one or more of the previous steps. A developed model can be unsatisfactory for several reasons, e.g., poor-quality data, an inappropriate choice of model type, structure or order, or an ill-suited estimation algorithm. The user should carefully interpret the results of the diagnostic tests and associate the failure of the model with one or more of these causes.