
WMO Guidelines on Surface Station Data Quality

Assurance for Climate Applications

Revision history

Date | Version | Description | Author
08/06/2016 | 1.0 | Initial draft | L. Chambers, M. Flannery, J. Flannery, P. Hechler
20/06/2016 | 2.0 | Table of QC tests added | L. Chambers
19/07/2016 | 2.1 | Expanded draft | M. Flannery, J. Flannery, L. Chambers
30/08/2016 | 3.0 | Table of QC tests refined, annex on test details added, case studies added, text revised. Introduction added, text revised | L. Chambers, P. Hechler
04/10/2016 | 4.0 | Expanded and reordered draft | J. Flannery, L. Chambers
13/10/2016 | 5.0 | Expanded and revised content | M. Flannery, J. Flannery
07/11/2016 | 5.3 | Expanded and revised content | M. Flannery, J. Flannery, L. Chambers
24/05/2017 | 6.0 | Draft completion for wider expert review | R. Spengler, P. Hechler
20/08/2017 | 6.1 | Reordered and revised content | P. Hechler
05/04/2019 | 6.2 | Review, some reorganising. Added section on automated QC to Ch 2. | W. Wright & P. Hechler
Xx/xx/2018 | 6.2 | Updated content and intent to significantly simplify annexes | …

Table of Contents
1. Preface
2. Principles and general requirements for Assuring High Quality of Data for Climate Applications
2.1 Overarching principles
2.2 Requirements underpinning the adequacy of observations for climate purposes
2.3 Requirements for technical monitoring
2.4 Requirements underpinning the adequacy of climate data management practices
2.5 General considerations in designing a QC system

3. Elements of data life cycle quality assurance


3.1 Overview
3.2 Observation
3.2.1 Observation Standards
3.2.2 Observations Metadata
3.2.3 Quality Control at station level
3.2.4 Maintenance
3.2.5 Technical monitoring
3.3 Data delivery and ingest
3.3.1 Data delivery system
3.3.2 Electronic data ingest
3.3.3 Systemic network issues
3.4 Central data base
3.4.1 Introduction
3.4.2 Dealing with high-frequency data
3.4.3 Value-added quality control
3.4.4 Future aspects of quality control
3.4.5 Source and quality flags
3.5 Final archive
3.6 Disaster mitigation
4. Role of Quality Assurance Manager
REFERENCES

ANNEX 1: Quality Control Tests - Overview


ANNEX 2: Quality Control Tests - Details

1. Preface
This document replaces the WMO Technical Document WMO/TD-N° 111 (WCP-85), Guidelines on
the Quality Control of Surface Climatological Data of May 1986. It provides a relatively high-level
overview of the principles underpinning effective quality assurance of climate data, and
considerations for operational quality assurance and control (QA/QC) of meteorological data from
surface observing stations at various stages of the data lifecycle. It then proposes in the Annexes a
range of recommended QC tests, classified (Annex 1) into what we believe should be regarded as
Mandatory, Recommended and Optional practices. The document is intended to be further
developed and expanded over time to respond to new developments and to cover other sources of
climatological data.

Data homogenization, although part of the climate data quality control and assurance continuum, is
discussed separately in the Guidelines on Climate Metadata and Homogenization (WMO/TD-
No.1186) and therefore not covered within this document. Other value-adding processes such as
infilling of missing data and disaggregation of cumulative precipitation or evaporation totals, are also
beyond the scope of this Guidance.

This document focusses on National Meteorological and Hydrological Services’ (NMHS) surface
observational data; however, the underlying philosophy and principles apply to other entities with a
quality control focus. The Guidelines are closely aligned with the WMO Climate Data Management
System (CDMS) Specifications (WMO-No. 1131), which specify the functionality required to
effectively manage climate data across the complete data life cycle from station installation to final
archive.

In this document, Quality Assurance refers to the processes for maintaining a satisfactory level of
quality in a dataset or collection, so that data available to potential users is sufficiently reliable and
complete to be used with confidence. QA is an end-to-end process that extends from the climate
archive back through data transmission channels to the point of observation. The quality assurance
process should help diagnose whether any systemic patterns of error are present at any point during
this data cycle, arising from e.g., poor observational technique, problems with measuring
technology or its software, maintenance failures, any upstream processing issues (such as issues
within the Central Processing Unit of an AWS), etc. Feedback to observation providers, and
rectification of issues identified, are also part of the process requirements. All these should be
components of an entity’s Quality Management System as per ISO 9001 1. Quality Control in this
document refers to the tools and practices employed to verify whether a reported data value is
representative of what was intended to be measured, and has not been contaminated by unrelated
factors (cf. WMO-No. 100, Guide to Climatological Practices). It is the process of ensuring that errors
in the data, or in the continuity of the data, are detected, by checking the data to assess
representativeness in time, space and internal consistency, and flagging any potential errors or
inconsistencies.

1
ISO 9001:2015 Quality Management Systems – Requirements
Although most climate services require high quality time-series data, no attempt is made to tailor
quality control for any specific climate application such as climate change monitoring or specific
services, which may have additional, value-adding requirements (such as homogenization, infilling of
missing data, etc.). Finally, it should be noted that the quality control tests (and broader QA
procedures) need to be adapted to the specific climate conditions of a country, as well as fine-tuned
to existing and planned observational and IT infrastructure and to available human resource
capacities.

2. Principles and general requirements for Assuring High Quality of Data for Climate Applications

The following highlights generic practices regarded as essential to ensure that the surface climate
record of a NMHS is of high quality. More detailed guidance is provided in chapter 3.

2.1 Overarching Principles:

• Establish a quality assurance process that covers the entire life cycle of data, from
observation to final archive
• Ensure that comprehensive documentation of all methods and practices employed to ensure
quality and traceability across the entire data life cycle is established, maintained and
accessible
• Establish arrangements with data providers to ensure that the requirements underpinning the
adequacy of observations for climate purposes are followed and maintained, including
adherence to the ten principles of climate monitoring (WMO-No. 1185).
• Ensure that objective QC tests and procedures are used, and that the outcomes of the tests
for each piece of data are clearly flagged.
• A copy of the originally-read or received data must always be kept
• Each NMHS should have a data quality assurance manager responsible for monitoring the
complete process of quality assurance across the data life cycle.
• Quality assurance processes require feedback on sources of error. It is recommended there
be a regular exchange of information between the quality assurance manager and personnel
responsible for the NMHS's observing systems so that sources of systemic errors can be
addressed.

2.2 Requirements underpinning adequacy of observations for climate applications:
● The observing site location has to be representative of the defined environment.
● Each observing site has to be classified according to the Guide to Meteorological Instruments
and Methods of Observation (WMO, 2014).

● Automatic Weather Stations (AWS) should have data loggers (automatic data logging) that
can be used to restore information to the Climate Data Management System and allow
automatic re-computation of derived parameters.
● Data loggers have to meet the technical requirements set by the NMHS and be tested and
accepted for use.
● Pre-deployment testing, calibration and acceptance testing should be performed to ensure that
the instrumentation sensors or AWS software and sensors meet specified requirements.
● Observation manuals must be an integral part of process documentation.
● Observers must undergo regular training, including in any changes to observational practices
● Ensure regular checking of stations, including site visits at appropriate intervals, to assess the
need for maintenance or change of sensors and to maintain adequate observing conditions at
the site. Recalibration checks of sensors should be performed on a regular basis.
● Changes in sensor hardware (such as type or manufacturer), or changes to an observation
site, require parallel measurements to be made (from both the current and the new sensor,
or the current and new site) over a period of at least one year 2. Both the original and parallel
data must be archived and retained.
● NMHSs should develop minimum quality guidelines for external data providers, including
metadata requirements.

2.3 Requirements for technical monitoring

• Monitoring of dataflow and ingest errors. An alerting system is needed to ensure that
circumstances in which the data can be corrupted or inadvertently overwritten or deleted
cannot occur or, if they do occur, can rapidly be detected and corrected. These features
require user acceptance testing before being made operational, and regular analysis of the
outcomes of quality control processes. There should also be automated processes for
ensuring that data expected are actually received, that time-stamping is accurate, and that
ingested data are not unrealistic.
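As an illustration of such automated receipt monitoring, the following minimal Python sketch checks a stream of timestamped reports against an expected reporting schedule; the ten-minute interval, example dates and function names are illustrative assumptions rather than prescribed practice.

```python
from datetime import datetime, timedelta

def missing_report_times(received_times, start, end, interval=timedelta(minutes=10)):
    """Return the expected report times within [start, end) for which no
    observation was received. The 10-minute reporting interval and the
    function name are illustrative assumptions, not a WMO-prescribed schedule."""
    received = set(received_times)
    expected, missing = start, []
    while expected < end:
        if expected not in received:
            missing.append(expected)
        expected += interval
    return missing

# Example: one 10-minute report is absent from the expected sequence.
start = datetime(2016, 7, 1, 0, 0)
received = [start + timedelta(minutes=10 * i) for i in range(6) if i != 3]
print(missing_report_times(received, start, start + timedelta(hours=1)))
```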

• Semi-automatic real-time monitoring of technical equipment (AWS, sensors, data
transmission) should be implemented so that possible malfunctions can quickly be
identified and addressed (failure to do so has been known to lead to significant losses of
data from the climate record that may not be detected until months later).

• Data recovery processes. Where alerts reveal that data are missing or somehow corrupted,
there must be standard operating procedures to recover the original version of the data in
question; for instance, data from an AWS should be buffered, or recoverable from a data
logger, with a retention period of sufficient length. Prompt action is required, because
upstream databases, where the data arriving at a NMHS are often first stored, or AWS data
loggers, usually retain data for a finite period, typically only a few days. The data are then
overwritten. Consequently, if failure to adequately track these processes means that data
losses are not revealed for some time, a significant loss to the climate record could result.

2
GCOS recommends at least two years of overlap trials. The rationale for one year's testing is to ensure that
the full seasonal range of conditions over a year is sampled, and two years helps reduce the impact of
conducting the testing during what could be an anomalous year, as well as providing additional
intercomparison data.

• Any problems with sensors, and any data dropouts, must be addressed as soon as possible,
documented, and communicated to the data users.
• It is the responsibility of the data provider to investigate and fix any such processes that
result in lost or corrupt data, and proper internal protocols should be in place to ensure this.

2.4 Requirements underpinning the adequacy of data management practices
Notes:
(1) The WMO Climate Data Management System Specifications (WMO, 2014) provide detailed
information on appropriate data management and quality management practices.
(2) The Manual on High Quality – Global Data Management Framework for Climate, a part of
the WMO Regulatory material, sets out standards and recommended practices for climate
data stewardship.

● From a data traceability perspective, it is important to be able to identify the source of the
data and to provide provenance (i.e., a record of changes made to the data) from sensor to
archive
● Quality flags are an essential component of data management and are used to provide a
measure of confidence in the veracity of the data.
● QC procedures should be documented in detail and made available to data users, preferably
online.
● Any alteration made to measured or observed values has to be flagged and documented as
part of the metadata.
● Data typically pass through various quality control stages (as described in Section 3). The
outcomes of QC tests at each stage should be documented in the quality level of the
dataset. Thus, for instance, quality flags may be assigned for data at point of observation, on
ingest, and following delayed-mode QC within a CDMS.
● Values that have been changed must not be deleted, but stored along with their quality
assessment3.

3
Of course, it is incumbent on NMHSs to ensure that the climate record presented to users is of the highest
standard possible, and capable of reliably supporting climate services. Erroneous values would not appear in
this public version of the climate record, but must be available if required for public enquiry, and/or to assist in
detecting and addressing measurement or software problems.
● When retrieving measured/observed data, data users should have access to all available
quality information (explanations included).
● Metadata supports the data, provides provenance information, can provide assistance in
investigating suspect data, and is essential for the application of homogenization. Therefore
retention of metadata is a mandatory practice. It is important to have metadata readily
available to the quality control operator, preferably a single click away.

2.5 General considerations in designing a QC system.

QC procedures may be applied manually, automatically or semi-automatically, at each stage of the
QC process. For instance, a manual observer about to record an observation electronically may be
prompted to check the proposed value if algorithms suggest the value is unlikely. Similarly, on-ingest
checks may automatically eliminate values failing an extended range check (although a copy of the
value itself should be retained).

In so-called delayed mode QC - which takes place after ingest into a CDMS, and is so named because
the QC may occur at some time after the observation has been made - any or all of manual,
automatic and semi-automatic checks may be applied. As the names suggest, manual QC refers to
checks made to the data by QC operators without electronic intervention. It is labour-intensive,
subjective, and depends heavily on the knowledge and experience of QC operators. Manual
techniques may be suitable where there are few stations and little access to QC diagnostic software.
On the other hand, automated QC refers to the process of scanning the data and assigning quality
flags fully automatically, i.e. without manual intervention. It is the fastest way of processing
incoming data. Currently, however, the lack of manual intervention means there is an increased risk
of erroneous values being missed, or good data being unfairly flagged as suspect (false positives).
There will also be a tendency to screen out extremes, which is not desirable given increasing
scientific interest in changes in extremes under global warming scenarios. Nevertheless, automated
approaches are suitable, and arguably essential, for processing high-frequency data, such as one
minute data from an AWS.
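As a minimal illustration of fully automated QC, the Python sketch below applies a simple range check and assigns flags rather than deleting values, so that genuine extremes remain available (as suspect) for later review; the limits and flag labels are illustrative assumptions.

```python
def flag_range(values, lower, upper):
    """Assign a quality flag to each value: 'ok' inside the climatological
    limits, 'suspect' outside them. Values are never altered or removed;
    the limits and flag names are illustrative, not prescribed by WMO."""
    return ["suspect" if (v < lower or v > upper) else "ok" for v in values]

# Example: daily maximum temperatures (degrees C) with one implausible value.
obs = [31.2, 29.8, 75.0, 33.1]
print(list(zip(obs, flag_range(obs, lower=-10.0, upper=55.0))))
```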

Semi-automated techniques may be thought of as a compromise between manual and fully-
automatic. Here the data are subject to an automated check, and then values flagged as suspect via
this process are subject to follow-up manual investigation by trained quality control staff, often
utilising diagnostic tools such as available satellite images or radar imagery in the case of rainfall.

Other concepts that may also become relevant include sequential techniques and machine learning.

3. Elements of data life cycle Quality Assurance

3.1 Overview

Quality assurance across the data lifecycle can be broken down into five major components
(Figure 1), which are described below.

Figure 1. Major components in the data life cycle: (1) Observation, (2) Data Delivery & Ingest,
(3) Central Data Base, (4) Final Archive, (5) Risk Mitigation.

Step 1: Observation

This component includes data quality aspects associated with the observation of the meteorological elements
as well as with the operation of the observing station, either manned or automated. Keywords of the
component include observational metadata, e.g. site characteristics, instrument metadata and installation
specifications; data capture, whether automatic or manual; and requirements and suggestions in the context of
maintenance and technical monitoring.

Step 2: Data delivery & ingest

This component comprises data quality monitoring associated with the data delivery system, from the
observation site to the data base including data ingest. It refers to the performance of the data delivery system
either through hardcopy digitisation or electronic ingest, and in terms of QC it includes issues associated with
data ingest and systemic network issues.

Step 3: Central data base

This component typically comprises the extensive area of (central) data management and is where so-called
“delayed mode” quality control occurs, using either manual, semi-automated or automated quality control
tests.

Step 4: Final Archive

This component includes the requirements for final archiving, either in a suitable hardcopy repository or in an
electronic archive.

Step 5: Disaster mitigation

This component includes protection against loss of data through, for example, electronic backups and scans of
hardcopy records.

3.2 Observation

This section considers data quality aspects associated with the observation of meteorological
elements as well as with the operation of the observing station, either manned or automated.

3.2.1 Observation Standards

The search for a site, and the decision whether to install an observing station at that location, lay
the basis for the data quality obtainable there (relative to the observation requirement). The site has
to satisfy data users' requirements for as long a period as possible, i.e. the observation conditions
should remain largely unchanged over an extended period of time. The site should be
representative of the meteorological conditions of a larger area.

Standards and guidance on meteorological observations are provided in e.g.:

• Manual on the Global Observing System (WMO, 2011)


• Guide to the Global Observing System (WMO, 2010)
• Manual on the WMO Integrated Global Observing System (WMO, 2015)
• Guide to the WMO Integrated Global Observing System (WMO, 2017)
• Guide to Meteorological Instruments and Methods of Observation (WMO, 2014)
• Guide to Climatological Practices (WMO, 2011)
• Challenges in the Transition from Conventional to Automatic Meteorological Observing
Networks for Long-term Climate Records (WMO, 2017)

Critical elements of observing standards and practices from a quality control and quality assurance
perspective regarding climate applications and services are:

• Sensor pre-deployment processes. The sensors need to meet specified standards including:
i. Manufacturer's test specifications
ii. NMHS's or external review of sensor's capacity to deliver to standards that cover the
expected climatology (It is recommended that sensor testing replicates the expected
field range of the parameter)
iii. WMO standards for sensors
iv. Calibration?

• AWS pre-deployment processes. The AWS need to meet specified standards including:
i. Documentation and encoding specifications for AWS algorithms (where available) to
be maintained as part of the metadata by the NMHS

ii. Test bed and field trials to ensure AWS platform is fit for purpose (i.e. robust, reliable
and serviceable), and to ensure compliance with the NMHS or future WMO standards
iii. Data logger capacity dependent on NMHS capacity to recover data.

All existing sites have to be classified according to the WMO Technical Commission for Instruments
and Methods of Observation (CIMO) classification (WMO, 2014) for each parameter observed there;
the classification has to be included in the station’s metadata.

3.2.2 Observations Metadata

It is a requirement that observations metadata is documented and stored in the NMHS metadata
system according to WMO and NMHS standard practices (WIGOS Metadata Standard).

For the purposes of data quality control it is recommended that observations metadata contains
information such as:
o Site and exposure including:
▪ Latitude, longitude and elevation;
▪ Skyline survey if possible (including description);
▪ Site photographs for North, South, East and West;
▪ Local environment with site survey (polar diagram) including height and distance of
instrumentation to trees, buildings, etc.
o Instrument / sensor information including:
▪ Serial number;
▪ Calibration checks;
▪ Deployment / serviceability history.

Absence of complete, correct and up-to-date observations metadata would compromise the entire
Quality Management process!

3.2.3 Quality control at station level

This section applies to both manually performed observations and electronic automatically
generated observations from unmanned stations.

It is a great advantage if the first stage of quality control is already carried out at the station in real
time. The raw data of all sensors are then available on-site and – in the case of manned stations – the
observers can also perform a consistency check against other parameters. This prevents data that
are clearly erroneous from being written to the data archive and incorporated into products.

Manual observations

All observers must receive uniform instructions for meteorological observation which they are
required to follow. These instructions are an integral part of process documentation and
compliance. If there are any changes to these, the observers must be informed accordingly.
Observers should undergo refresher training at regular intervals.

Records which observers have made manually should be sent to a central office (usually the NMHS)
immediately after the end of the month (the station should maintain a copy).

For parameters that are either fully or partially manually read, the following checks should be
performed:

● Consistency checks against other manual observations


● Range and constraint checks against reasonable climatological norms
● Consistency checks against the station record, possibly performed automatically as mobile
data entries are made into the system
● Checks of any calculations performed on observations (such as relative humidity computation)
● For manned sites independent checks are often performed by the officer-in-charge or
another observer.

Examples of these checks are presented in Annexes 1 and 2.

Automatically generated observations

For fully automatically generated observations (AWS) the following checks should be performed:

● Constraint checks (database)


● Constraint checks (climatological values)
● Consistency checks against other parameters at the station (internal consistency)

For examples of these checks see Annexes 1 and 2.
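The sketch below illustrates one simple internal consistency check of the kind listed above, verifying that the reported dew point does not exceed the air temperature; the parameter names and tolerance are assumptions for illustration only.

```python
def consistent_dew_point(air_temp_c, dew_point_c, tolerance=0.1):
    """Internal consistency: dew point should not exceed air temperature.
    A small tolerance allows for sensor resolution; both the rule and the
    tolerance value are illustrative."""
    return dew_point_c <= air_temp_c + tolerance

print(consistent_dew_point(18.4, 12.0))   # True  -> passes the check
print(consistent_dew_point(18.4, 21.5))   # False -> flag as suspect
```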

All outcomes of the automatic quality checks are stored as a quality flag, which is transmitted to the
regional or central office together with the corresponding data value.

For electronic automatically generated observations from unmanned stations it is recommended
that the data logger capacity is sufficient to store data during communications outages and that, for
critical sites, a robust data recovery method, for instance satellite polling, is considered.

In the event of technical problems with a sensor, data logger or the AWS, the systems should
automatically generate "technical alarms" which can be retrieved, analyzed and acted upon by a
central office.
3.2.4 Maintenance

Regular visits to all stations are required to check the site conditions and - even more importantly -
to carry out maintenance work, including the change of sensors.

In case of sensor changes at station level, recalibration is required and the calibration results must
be included in the metadata.

In case of sensor changes in an observing network, parallel measurements are required (current
versus new sensor) over a period of at least one year at all stations affected, or at a selection of
climatically-representative reference stations, before introducing the new sensor to the entire
network.
The results from these parallel measurements must be documented in the metadata, and thus be
accessible to data users for further analysis. It is important to record the exact date of the sensor
change for each site.

3.2.5 Technical monitoring

This section applies to both semi-automatic and fully automatic stations and addresses a semi-
automatic technical monitoring system for automated station networks to detect and intervene
regarding malfunctions of sensors, data loggers and communication systems.

Errors in sensors and data loggers can be detected rapidly by real-time monitoring of technical
parameters. This reduces the likelihood of missing data, or of faulty data continuing to be processed
over a lengthy period of time, to the detriment of the time series. Gaps in time series become
smaller and the technical staff can rapidly intervene to solve the problem.

It is highly recommended to establish such a monitoring system based on the following technical
capabilities:

- Intelligent sensors are able to detect whether their power supply, the internal calibration
mechanism or any internal system such as heating are operating correctly. If this is not the
case, corresponding signals are sent to the data logger and from there to the monitoring
software (control center) during data retrieval. Ideally, such an error in the sensor is made
visible as a quality flag attached to the measured values during signal processing in the data
logger; thus, it can be detected easily.
- Both the data logger and the data processing unit at the station may malfunction, which can
be detected from the fact that the measured values are no longer encoded or that retrieval
of data from the station is no longer possible.
- Technical monitoring of data communication provides rapid information about any error
that might exist. This helps to explain why certain data may be missing and, if necessary, to
initiate measures for mending the technical problem.

All technical errors should be stored in a database to which data users have access. The database
should contain the following information at least:

o Type of error
 Sensor
 Data logger / AWS
 Communication line
o Beginning of the error
o End of the error
o Measures to be taken to mend the error (Text)
o Has the error caused any data gaps?
o Who dealt with the error?
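By way of illustration, such a technical-error record could be represented as follows before being written to the database; the field names mirror the list above but are otherwise assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class TechnicalError:
    """One entry in the technical-monitoring error database; the field
    names mirror the list above but are otherwise illustrative."""
    error_type: str          # "sensor", "data logger / AWS" or "communication line"
    station_id: str
    start: datetime
    end: Optional[datetime]  # None while the fault is still open
    measures_taken: str
    caused_data_gap: bool
    handled_by: str

fault = TechnicalError("sensor", "014401", datetime(2016, 7, 24, 11, 30),
                       None, "Field technician dispatched", True, "duty engineer")
print(fault)
```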

3.3 Data delivery and ingest

This section considers data quality issues associated with the data delivery system from the
observation site to the (central) data base and data ingest.

3.3.1 Data delivery system

Many NMHSs receive hardcopy station records that are then digitised by NMHS staff. As part of the
QA/QC process it is recommended that:
o The record is imaged;
o Multiple key-entry digitisation is used to control for digitisation errors (double keying is
recommended)
o An operator ID be attached as metadata to the digitised record. This enables auditing of
the performance of operators and assists with training;
o Digitisation date/time to be appended to the digitised record. This enables auditing of
changes;
o Digitisation-level QC is used to reduce the incidence of errors. For digitisation into a CDMS,
ingest-level QC algorithms should include:
▪ Range and consistency checks
▪ Checks to ensure data for the station, time and date do not already exist in the
database;
▪ Metadata checks to check that the station name correlates to the assigned
station number;
▪ Arithmetic checks of the parameter calculation (for instance calculation of dew
point using dry bulb and wet bulb temperature values);
▪ In the case of cumulative variables such as rainfall and evaporation, arithmetic
summation of daily values to ensure consistency with the recorded monthly total

3.3.2 Electronic data ingest

For ingest level QC it is recommended that tests be confined to identifying data that can be
scientifically supported as incorrect through domain range checks. If additional QC analysis is
performed at ingest it is recommended that this doubtful data be flagged for assessment through
the NMHS’s delayed-mode QC system (see next section).

Recommendations:

o The ingest reliably decodes data and metadata for all parameters
o Ingest QC software is separate from the software used to ingest the data to reduce
problems associated with upgrades and updates
o Disaster mitigation – there is the capacity to re-ingest recovered data where necessary
o Readily serviceable software that is accommodating of new data types, data formats, and
data flows
o Error logs are used to highlight potential problems with the data

3.3.3 Systemic network issues
Even in a fully automated system there is still a requirement for analysis of, and appropriate tools to
detect, systemic network issues. In addition, any changes to the system need user acceptance
testing and testing in a development environment, modelling worst-case and highest-impact-on-quality
scenarios, with the tests and results being fully documented. Ultimately, end-to-end (sensor to
archive) testing is required.

Ideally there is a process where identified data issues are documented. Should an issue be observed
repeatedly, it may indicate a systemic problem.

Systemic network issues for a given parameter may be identified by a higher than usual frequency of
data flagged as ‘suspect’.
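One simple way of detecting such a situation is to compare each station's recent suspect-flag rate against its long-term rate, as in the sketch below; the data structures and the threshold factor are illustrative assumptions.

```python
def stations_with_elevated_suspect_rate(recent_counts, baseline_rates, factor=3.0):
    """Return stations whose recent fraction of 'suspect' flags exceeds
    their baseline fraction by a given factor. recent_counts maps station
    -> (n_suspect, n_total); baseline_rates maps station -> long-term
    suspect fraction. The factor of 3 is an illustrative threshold."""
    flagged = []
    for station, (n_suspect, n_total) in recent_counts.items():
        if n_total == 0:
            continue
        rate = n_suspect / n_total
        if rate > factor * baseline_rates.get(station, 0.01):
            flagged.append((station, rate))
    return flagged

recent = {"014401": (42, 300), "086071": (2, 300)}
baseline = {"014401": 0.02, "086071": 0.02}
print(stations_with_elevated_suspect_rate(recent, baseline))
```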

3.4 Central data base

This section looks at details of quality control within the CDMS, as well as rules, procedures and
tests, and strategies for data presentation.

There has to be a consistent quality control system that applies to all existing sensors and
observation data. The QC processes differ depending on the source of the data, the parameter
measured and the temporal resolution. Depending on the NMHS’s structure, QC is performed by a
central unit or at a regional level. A general recommendation cannot be given here. At a minimum,
however, there should be a final QC which is carried out using a uniform procedure, even if
preceded by a regional QC check.

3.4.1 Introduction
Quality control tests within the central climate database (CDMS) can be performed at various stages
of the data life cycle, from collection to central processing. For example, the Climate-based range
test (Annexes 1, 2) is generally performed post ingest while the Domain test (Annexes 1,2) is often
performed at ingest level but can also be used post ingest to assist with network monitoring.

The quality control tests can be grouped into five main test types:

- Constraint tests,
- Consistency tests,
- Heuristic tests,
- Data provision tests and
- Statistical tests.

The sensitivity of a test is often determined by whether the test is automated or semi-automated.
Automated QC can result in lower confidence data and may either overestimate errors or
underestimate extremes. However, automated QC is suited to high frequency data – due to limited
value changes over short time periods and high volumes of data to process.

Allowing for Site Differences

Site differences need to be accounted for when running the quality control tests as these differences
can influence the Quality Monitoring Parameters (A QMP is a stored value that acts as a constraint,
normally for a given meteorological element/station/month of year/test type, and which may be
statistically generated. It is compared against a meteorological reading) and normalisation constants
(see Annex 2). This is particularly important for NMHSs that operate over a wide range of climate
zones and elevations.

Coastal stations have vastly different climate ranges to continental stations and this can impact on
the expected number of false errors if the same test parameters are used. For NHMSs covering
different climate zones, there are two main approaches for dealing with site differences and
differing false error rates. One option is to have separate procedures for each zone, with quality
monitoring parameters and normalization constants determined for each zone. Alternatively a single
system could be used in a semi-automated procedure, with trained operators identifying false
errors related to climatic (including seasonal) variability between neighbouring sites.
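As an illustration of how a quality monitoring parameter might be derived for one station, element and calendar month, the sketch below computes limits from a set of historical values; the use of mean plus or minus a number of standard deviations, and the constant chosen, are illustrative assumptions to be tuned per climate zone and element.

```python
import statistics

def monthly_qmp(historical_values, n_std=4.0):
    """Derive illustrative lower/upper limits for one station, element and
    calendar month as mean +/- n_std standard deviations of the historical
    record. The choice of 4 standard deviations is an assumption; NMHSs
    tune such constants per climate zone and element."""
    mean = statistics.fmean(historical_values)
    std = statistics.pstdev(historical_values)
    return mean - n_std * std, mean + n_std * std

# Example: January daily maximum temperatures (degrees C) for one station.
january_tmax = [31.5, 33.0, 29.8, 35.2, 32.1, 30.6, 34.4, 28.9]
print(monthly_qmp(january_tmax))
```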

Differences in site elevation can also impact the expected number of false errors if the same test
parameters are used. Quality control operators can be used to check if neighbouring stations are at
significantly different elevation. Having sites at differing elevations can impact on the quality control
tests that can be performed, for example, the use of nearest neighbour based tests may not be

appropriate. Comparing data in normalized form, however, can solve the challenge of differences in
site elevation.

The length of the climate record can also impact on the QMPs. For example, QMPs generated by a
site that has 30 years of record will be significantly more representative than a site that has only 3
years of record. When running spatial statistics, for example, if the neighbours are relatively new
stations with short records, then this will influence the test performance. Metadata will assist
operators in determining the length of record and expected effectiveness of the test. The climate-
based range test (Annexes 1, 2) compares against climate record extremes, and therefore a site with
a short period of record is expected to trigger suspect values more frequently.

The density of the observation network impacts on any quality control test that uses spatial
statistics, e.g. spatial variability test and Barnes spatial test (Annexes 1, 2). Limits may need to be
placed on spatial algorithms as to the acceptable distance of a neighbouring station from the target
station; this will be determined by a number of factors including climate zone, topography, length of
record and the parameter to be tested. For very remote sites or very sparse networks the first day
of the month may need to be triggered as suspect to ensure manual checking for errors, and the
range of tests available may be limited. Alternative methods, and parameters, may be necessary to
validate any suspect observations. These methods include the use of satellite images, hydrology,
radar, river heights and streamflow. It may also be possible to contact the observer directly to
confirm the recorded quantity, preferably in near-real time.
Where the skill set exists, multivariate tests can be used to assist the quality control operator. For
example, if a heavy daily rainfall total occurred on a given day these observations could be
compared with hourly rainfall observations, present and past weather reported, cloud types
associated with heavy rainfall, humidity, other rainfall measuring instruments (e.g. pluviometer or
rainfall intensity charts), radar images, satellite images, numerical weather forecasts, impact
analyses / media reports etc. This requires the quality control operator to have a good local climate
knowledge and that the alternative parameters and images are readily accessible.

3.4.2 Dealing with high-frequency data

Nowadays, many NMHSs operate AWSs that collect data at high temporal resolutions (1 minute,
10 minutes) and transmit them at short intervals from the observing station to headquarters. At
the national level, there are large differences regarding the use and processing of
these data. With regard to climate data, this is a great opportunity to start automatic, real-time
Quality Control monitoring directly at the station, and perform further levels of QC later in near real
time at the central unit. In this way, many major errors are excluded from the climate datasets (daily
values, monthly values) because they have already been flagged and/or corrected in the high-
resolution data.

Climate datasets are calculated from data of high temporal resolution. Every time that high-
resolution data are corrected, all datasets that are derived from these data have to be re-calculated.
This type of QC, as well as the recalculation, should only be possible within a period of at most
20 days (calculation of monthly values and generation of certain products).

For these high-frequency data, the original data values must be stored too, which requires more
complex storage systems and data models. The quality control of high frequency data is highly time-
consuming as the decision about the correctness of automatically tested data must always be taken
manually. This is why some NMHSs have returned to performing near-real time quality control on
the basis of hourly values. High-frequency data are only given a closer look if the hourly values
appear to be doubtful.

Where possible it is recommended that an automated QC process be used for all high-frequency
data parameters and that QC testing include:

▪ Completeness tests
▪ Consistency tests
▪ Spike tests
▪ Rapid change tests
▪ Flat line tests
▪ Domain tests

When dealing with high-frequency data an appropriate analysis window needs to be defined, e.g. for
data received at a 1 minute frequency a 10 minute analysis window for a real time air temperature
data feed can be displayed on the web, while other parameters, such as evaporation data, may be
analysed once per day.
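The sketch below illustrates two of the tests listed above, a spike test and a flat line test, applied to one-minute values; the window lengths and thresholds are illustrative assumptions that would be tuned per element and network.

```python
def spike_flags(values, threshold=3.0):
    """Flag value i as a spike if it departs from both neighbours by more
    than 'threshold' in opposite directions (rise then fall, or fall then
    rise). The threshold of 3.0 units is illustrative."""
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        d1 = values[i] - values[i - 1]
        d2 = values[i] - values[i + 1]
        if abs(d1) > threshold and abs(d2) > threshold and d1 * d2 > 0:
            flags[i] = True
    return flags

def flat_line(values, max_run=60):
    """Return True if the value is unchanged for more than 'max_run'
    consecutive samples (e.g. 60 one-minute values); illustrative limit."""
    run = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False

one_minute_temps = [22.1, 22.2, 31.9, 22.2, 22.3]
print(spike_flags(one_minute_temps))        # the 31.9 value is flagged
print(flat_line([15.0] * 90, max_run=60))   # True -> possible stuck sensor
```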

The risks of fully automated tests can be mitigated by multivariate (heuristic) analysis.

Post event QC

There may be cases where a significant and unexpected change in the high frequency data
parameter, such as a data spike, warrants further investigation to assess the broader context of the
sudden change.

The event is evaluated after its occurrence using multivariate analysis to confirm whether the event
is representative of the meteorological conditions at the time, for instance a severe weather event
or localised phenomena such as a microburst or wildfire, or a spike generated by an internal voltage
surge.

Example: Impact of a wildfire. Station (014401; Warruwi Airport, Australia) was flagged as suspect in
the climate test with respect to maximum temperature on 24 July 2016. Initially it appeared to be a
high frequency (one minute) temperature spike, however, the spike lasted beyond one minute
(11:37 to 11:45); a grass fire is expected to be the reason. Upper panel: time series of hourly
temperatures overlaid with one minute temperature data (pink line). Solid green line highlights
calculated daily maximum temperature values and red line daily minimum temperatures. Suspect
values highlighted in magenta. Middle panel: Comparison of the one minute data across multiple
parameters. Lower panel: Comparison of daily maximum temperature values for 'nearby' sites.
Warruwi (dark green) has an unexpectedly high maximum temperature on the 24 July.

3.4.3 Value-added quality control


Value-added quality control in this context refers to data homogenisation. Homogenisation is the
adjustment of climate records to remove, if necessary, non-climatic factors so that the resultant data
reflects unbiased variations due to real climate processes.

WMO (2003) provides guidance on dealing with homogeneity problems. The four key steps
commonly used are:

● Analysis of metadata and quality control: This step looks for changes in the measurements
as well as what quality control procedures have been performed.
● Creation of a reference time series: Generally uses a weighted average of neighbouring
stations but other techniques, such as principal component analysis can also prove useful.
● Breakpoint detection: A search for in-homogeneities in the difference between the
reference time series and the candidate time series.
● Data adjustment: Decide on which breakpoints are accepted as in-homogeneities through
comparison with available metadata and expert judgement. The assessed discontinuities in
the data are then corrected to match the conditions of its most recent homogeneous
section.
It is important to document every adjustment made to the data and to always preserve the original
data.
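A highly simplified sketch of the breakpoint detection step is given below, assuming annual means for a candidate series and a reference series are already available; operational homogenisation relies on formal statistical tests, so this plain difference-series scan is illustrative only.

```python
import statistics

def largest_shift(candidate, reference, min_segment=3):
    """Scan the candidate-minus-reference difference series and return the
    index at which splitting the series gives the largest jump in segment
    means, together with the size of that jump. Purely illustrative; it
    performs no significance testing."""
    diff = [c - r for c, r in zip(candidate, reference)]
    best_idx, best_jump = None, 0.0
    for i in range(min_segment, len(diff) - min_segment + 1):
        jump = abs(statistics.fmean(diff[i:]) - statistics.fmean(diff[:i]))
        if jump > best_jump:
            best_idx, best_jump = i, jump
    return best_idx, best_jump

candidate = [15.1, 15.3, 15.0, 15.2, 16.1, 16.0, 16.2, 16.1]  # shift after year 4
reference = [15.0, 15.2, 15.1, 15.1, 15.2, 15.0, 15.3, 15.1]
print(largest_shift(candidate, reference))  # index 4, jump of about 0.9
```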

3.4.4 Future aspects of quality control


The quality control processes described here are mainly based on algorithms that use ground
surface data, with verification performed visually by the operator using, among other sources,
satellite and radar data and products.
Future automatic QC processes have the potential to integrate these types of remote sensing data as
well as any derived products directly into the verification algorithms. Based on remote sensing data,
the following parameters could be verified fully automatically, in particular for areas where
observing stations are sparse so that spatial comparison with other surface stations is not possible:

• Solar radiation
• Cloud cover
• Snow cover
• Temperature (daily mean)
• Precipitation (daily sum)

Numerical Weather Prediction products (e.g. First Guess fields) provide valuable assistance for
near-real time QC of measured/observed data, helping to detect and rapidly correct faulty data.

3.4.5 Source and quality flags


Source flags identify the method of data delivery, for example, a way of distinguishing between
manual and automated observations. These flags can be used to compare the same observation
received via different methods, e.g., daily rainfall sent electronically compared with the same
observation record on a hard copy manuscript. These values are not always the same and may
require further investigation. The source flags can also be used to compare values from co-located
instruments, for example, comparing a manual daily rainfall observation with an automated rain
gauge reading, and with third-party datasets.

Quality flags identify the confidence we have in the correctness of the observation. Quality flags are
set from the automated QC at station (AWS, datalogger), during the ingest process and during semi-
automated quality control and can also represent data tiers, e.g., third party data which may not be
expected to be of the same quality due to reduced metadata availability and differing
instrumentation types and maintenance routines.
An example of determining quality flags: If a manually read maximum thermometer value did not
compare favourably with the neighbours and the quality control operator suspected a reading error
then the data may be flagged as a gross estimate or in more extreme circumstances as a suspect
value.

Quality control flags can relate specifically to the type of quality control applied, i.e., ingest quality
control; automated quality control with automated flagging of data; or semi-automated quality
control with multivariate operator analysis.

Changes in flag values need to be maintained in an audit trail and these changes need to be available
for manual analysis.

The flagging model adopted should not be complex; it is recommended to use a series of flag subsets
to indicate various metadata rather than a single large listing of flags that encompasses every
combination of metadata scenarios.
Planning of the quality flag system will dictate how future-proof the system will be. The flagging system
should be flexible enough to accommodate future changes in technology, networks, etc. The use of
flag subsets enables:

● Efficient modification of the flag subset when required, for instance adding an additional flag
to indicate a new source of data to an existing flag series, or adding an extra flag series to
describe the quality of instrument exposure or the specific quality control already
performed by an external party for third-party data.
● Retrospective quality control to take place as the quality control system improves.
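One possible representation of such flag subsets is sketched below; the subset names and codes are illustrative assumptions, the point being that each subset can be extended independently rather than enumerating every combination of metadata in a single list.

```python
# Illustrative flag subsets; the subset names and codes are assumptions,
# not a WMO-prescribed vocabulary. Each subset can grow independently.
SOURCE_FLAGS = {0: "manual observation", 1: "AWS", 2: "third-party network"}
QC_STAGE_FLAGS = {0: "not assessed", 1: "ingest QC", 2: "automated QC", 3: "semi-automated QC"}
QUALITY_FLAGS = {0: "not assessed", 1: "good", 2: "suspect", 3: "gross estimate", 4: "erroneous"}

# A stored value then carries one code from each subset rather than a
# single flag drawn from every possible combination.
observation = {"station": "014401", "element": "tmax", "value": 37.2,
               "source": 1, "qc_stage": 3, "quality": 2}
print(QUALITY_FLAGS[observation["quality"]], "after", QC_STAGE_FLAGS[observation["qc_stage"]])
```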

WMO (2011) gives examples of quality flags covering:

• Type of data code


• Acquisition method code
• Validation stage code

This is not intended to be an exhaustive list, but it can be used as a basis.

Quality flags allow for audit trails. This includes monitoring of quality control operators,
accountability and retention of original records. For example, if a training issue was identified for a
particular operator it is possible through the use of quality flags to recover all corrections applied by
that operator which relate to that training issue, recover the original data, and reprocess the data.
This relies upon the audit trail being in a relational database; SQL can then be used to correlate the
audit trail results with the corrections applied to specific parameters in the climate database. This
type of monitoring is very important to ensure uniformity in regards to the processing of climate
data and ensuring data provenance and transparency of processes. All analyses need to be
scientifically supported before data are flagged as suspect. If the operator cannot make a decision
then it is recommended that the data be retained as is.

The use of quality flags is a mandatory component of quality control. It provides a level of quality
assurance for data users. It ensures that suspect or erroneous data are not deleted but rather
flagged. It provides data users with the capacity to choose the level of data quality appropriate for
their needs.

Tiered data

Data held by a NMHS may belong to different networks and may be expected to vary in quality, due
to exposure of the instrument or instrument set up, etc., and / or the amount of metadata available.
For this reason data may be considered to be tiered, i.e. belonging to a particular class of network,
e.g. NMHS-operated or operated by external parties. As data quality may vary according to data tier,
it is recommended that appropriate flagging is used to indicate the data tier and to ensure
appropriate processing is performed. Processes may include the cleaning of data.

3.5 Final Archive

For electronic archives at the point of final archive it is important to ensure that:

● The product has a full audit trail of changes
● The archive retains the original electronic message
● Provenance can be viewed
● The data are fit for purpose
● The data are of known quality, and the quality tests that have been applied are documented
● The data cannot be changed

Regular technology migration is needed to ensure that final archives remain accessible in perpetuity.

For hard copy storage, records management should provide a controlled environment with:
● Acid free storage
● A full inventory of records held
● Appropriate handling procedures
● Record preservation
● Adequate security

All paper documents have to be subject to digitization (imaging and keying; cf. WMO, 2016).

3.6 Disaster Mitigation

Disaster mitigation is concerned with ensuring that data can be recovered in the event of loss or a
disaster. For databases or electronic data this includes ensuring there are multiple copies of the data
and off-site backups. For paper records it includes appropriate storage and digitisation.

4. Role of Quality Assurance Manager

With each NMHS having its own procedures and structures, quality assurance managers play a very
important role. Each NMHS should have a data quality assurance manager position established. This
person should know the details of the process(es) along the data life cycle (cf. chapter 3 above) and
have an overview of the state of the quality assurance process, even if the tasks of quality assurance
and quality control within the NMHS are distributed to different organizational areas. Quality
assurance managers should regularly exchange their experiences internationally.

The following list provides examples of items to be considered under the charge of the data quality
assurance manager to ensure data quality in the recording, delivery and archiving of data (the tasks
may be done by different sections of the NMHS or by external parties); metadata and
documentation from these items should be available to the QC manager to ensure an informed QC
process.

● Ensuring instrument technology is fit for purpose:
▪ Instruments are assessed with performance and acceptance testing
documented before field trials;
▪ The field trials of instruments with test documentation, test conditions and
results to be completed within tolerance before the instrument is deployed
operationally;
▪ For instruments superseding older technology, parallel field trials are conducted
for a minimum of 1 year (recommended 2 years) to quantify the performance
differences between the instruments;
▪ The results of the parallel field trials have to be stored and be available to data
users/customers;
▪ The instrument is sited operationally to WMO or country specifications, with
the instrument calibration tested against an instrument standard (where
applicable) before deployment into service;
● For instruments already in operational service:
▪ Ensure instrument outputs are within tolerance through a regular inspection
program;
▪ Ensure observations are made at the required time interval;
▪ Regularly document and monitor the exposure of the instruments at the site
noting any significant changes that will affect the quality of the data (this
information is transferred to the NMHS metadata system);
● For manually recorded observation:
▪ Ensure that the field books are checked for accuracy by staff at the station; for sites
with more than one observer this is usually done by the station officer in charge;
▪ For manually recorded observations, metadata relating to instrument or data
quality or issues is noted in the station record of observations (field book);
▪ As far as practical ensure observers and observing networks are employing
standard practices and procedures (i.e. consistent practices and observational
methods).
▪ Ensure observations are of sufficient quality using a recognised suite of tools for
QC.
● Data ingest and archive:
▪ Responsible for ensuring adequate metadata and an audit trail for the data
exists;
▪ Feedback to site / engineer around site / instrument issues;
▪ Full receipt of observations;
▪ Responsible for data recovery in the event of loss of data received through electronic
means;
▪ Has information and knowledge about backup procedures;
▪ Ensure data received is accurately digitised and archived.
● Data quality:
▪ Responsible for ensuring the complete, continuous and timely running of all quality
control procedures and tests;
▪ Ensure the consistent flagging of all data;
▪ Responsible for documenting the quality information in the metadata.

REFERENCES
World Meteorological Organization, 2003: Guidelines on Climate Metadata and Homogenization
(WMO/TD-No. 1186). Geneva.

------, 2010: Guide to the Global Observing System (WMO-No. 488). Geneva

------, 2011: Manual on the Global Observing System (WMO-No. 544). Geneva

------, 2011: Guide to Climatological Practices (WMO-No. 100). Geneva

------, 2014: Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8). Geneva

------, 2014: WMO Climate Data Management System Specifications (WMO-No. 1131). Geneva

------, 2015: Manual on the WMO Integrated Global Observing System (WMO-No. 1160). Geneva

------, 2016: Guideline on Best Practices for Climate Data Rescue (WMO-No. 1182). Geneva

------, 2017: Guide to the WMO Integrated Global Observing System (WMO-No. 1165). Geneva.

------, 2017: Challenges in the Transition from Conventional to Automatic Meteorological Observing
Networks for Long-term Climate Records (WMO-No. 1202). Geneva.

ANNEX 1: Quality Assurance and Quality Control Tests – Overview

Table 1: Selected quality assurance and quality control tests and suggested importance (M =
Mandatory, R = Recommended, O = Optional).

For further test details see Annex 2

Sources (Annexes 1, 2): QMS Test Specification (Bureau of Meteorology, Australia; internal
document), WMO (2011), WMO (2014)

Constraint Tests
Range of tests to ensure that observations are technically and scientifically plausible based upon theoretical and climatological limits, sensor hardware specifications or data base limits.

Name of Test | Short Description | Notes | M/R/O
Sensor-based range test | Detects observations that are outside the range of theoretical limits or sensor hardware specifications | | M
Data-base limit test | Detects values that are outside the range of the storage system | Applied during ingest into the storage system | M
Domain test | Determines if the meteorological value is within the realms of scientific possibility | | M

Consistency Tests
Range of tests to ensure that inconsistent, unlikely or impossible records are either rejected or flagged as suspect. A manual investigation may then assess the validity of the suspect values.

Name of Test | Short Description | Notes | M/R/O
Three hourly test | Tests for consistency between 3-hourly values and daily values | | M
Daily Min vs Ground Min test | Compares the daily minimum and daily ground minimum temperatures | | O
Hourly MSLP and SLP difference test | Tests for a significant change in the difference between MSLP and SLP over two consecutive recordings | Test can be performed using hourly, 3-hourly, etc. observations | O
Precipitation multi source comparison test | Checks if data from one source are consistent with data from another source | The data sources may be different instruments or the same instrument via different communication paths | R
Zero precipitation spatial test (1) | Checks for instances when precipitation is recorded at one site but not at the neighbouring sites, and vice versa | Typically, this indicates precipitation being recorded on the wrong day | M
Insufficient neighbours test | Checks if there is a sufficient number of neighbouring stations to perform spatial tests | Test generally performed on the first day of the month and for precipitation data | R
Precipitation period test | Tests for overlap and underlap, i.e. the record's period value is larger or smaller than the actual null precipitation reported dates | Designed for rainfall precipitation records | R
Tracking tests | Compares how two elements or two neighbours rise and fall together | Both are expected to rise and fall together. Very effective | R
Maximum Air Temperature Consistency test | Checks for consistency of daily maximum air temperature with sub-daily observations | | R
Solar hour – Astronomical test | Tests the difference between the sunshine duration and the calculated day length | | M
Consistency test between Air Temperature and Wet Bulb temperature | Compares the air temperature with the wet-bulb temperature | | M
Present weather and air temperature consistency test | Air temperature is compared to depositing rime and freezing precipitation | | M
Wet bulb vs dew point consistency test | Tests the difference between wet bulb and dew point temperatures | Fails the test if the difference is < 0. Used with manual observations | O
Wet bulb vs dew point test | Wet bulb temperature is tested against dew point temperature by recalculating dew point temperature from the wet bulb air temperature | Used with manual observations | O
Dew point air temperature consistency test | Tests if dew point temperature is less than or equal to air temperature | | R
Visibility consistency test | Tests for visibility consistency with present weather code against phenomena flags (fog, sandstorm, mist, dust storm) | Horizontal visibility. Used with manual observations | R
Total cloud amount consistency test | Tests for consistency of total cloud amount against various elements | Used with manual observations | R
Daily minimum versus ground minimum test | Checks the consistency of daily minimum taken from the screen versus ground minimum taken from the ground | | O
Daily phenomena flag tests | These tests check the consistency of the daily phenomena flags with codes in the sub-daily tables and various other daily elements | | R
Present Weather consistency tests | Checks the consistency of the present weather codes in the sub-daily tables and various other daily elements | | R
Soil temperature tests | Check the consistency of soil temperatures at various depths | | R

Heuristic Tests
Set of tests that rely on experience and knowledge of observation processes, techniques and instrumentation to detect inconsistent, unlikely or impossible records and flag them as suspect. A manual investigation may then assess the validity of the suspect values.

Name of Test | Short Description | Notes | M/R/O
Relative humidity dry wet bulb test | Test to determine if the wet bulb wick has dried out | | R

Data Provision Tests
Set of tests to ensure that observations that do not match the expected schedule of observations are either rejected or flagged as suspect.

Name of Test | Short Description | Notes | M/R/O
Observation received from future test | Compares local clock time of observation with time received | | R
Period gap test | Tests if the periods match the records that are present | Applies to daily observations | M
Large period test | Tests if periods are excessively large (many days) or less than 1 day | Applies to daily records | R

Statistical Tests
Set of tests that statistically analyse historical data to detect inconsistent, unlikely or impossible records and flag them as suspect. A manual investigation may then assess the validity of the suspect values.

Name of Test | Short Description | Notes | M/R/O
Climate-based range test | Compares the meteorological value with the climate upper and lower values | Thresholds can be calculated to take into account seasonal variations in the observations | M
Flat line test | Checks the size of a run of meteorological values which are the same, i.e. if the parameter is unchanged over time | | R
Rapid change test | Checks the difference between the previous and current observation | | R
Spike test | Compares a given meteorological observation with the previous and next values | Similar to the rapid change test; however, this test looks for a rise up and then down (or down then up) | R
Frequency test | Checks for instances of excessive rounding of a value | Applies to manual observations where an operator rounds rather than interpolates values | O
Spatial variability test | Daily climatology of the differences between the station of interest and nearby stations is compared | | O
Barnes spatial test | Compares the meteorological value with the surrounding data values using a Barnes analysis | Barnes analysis used to weight the values from nearby stations | R
Linear regression spatial test | Compares the meteorological value with the surrounding data values using linear regression | | O
Linear regression spatial test variability test | Uses the variability of the neighbouring stations to calculate the limits rather than the standard error estimates | Variation on the Linear regression spatial test and the Spatial variability test | O
Linear regression multi-day test | Compares the climatology of the differences between the station of interest and nearby stations over a multi-day period | Similar to the Linear regression spatial test but applies to multiple days | R
Max / Min test | Tests that the difference between the maximum and minimum temperature is realistic (i.e. greater than zero and less than an upper bound) | Bounds are derived from climatology using a minimum of 5 years of data, preferably 30 years | R
ANNEX 2: Quality Control Test Details

1. Constraint Tests

Range of tests to ensure that observations are technically and scientifically plausible based upon
theoretical and climatological limits, sensor hardware specifications or data base limits.

1.1. Sensor-based range test
Brief Description: Detects observations that are outside the range of theoretical limits or sensor
hardware specifications.

Parameters test applies to: Temperature, humidity, barometric pressure, wind

Detailed Description: The sensor-based range test is generally performed at the station for automated observations. The limits are usually set by the manufacturer, often with input from the NMHS. The sensor samples at one-second intervals and, if a value exceeds the specified limit, that value is excluded. A minimum number of valid samples is required before a value is transmitted; the number of valid samples may vary from one manufacturer to another. From these valid samples a maximum, minimum and mean are extracted. Testing also occurs for manual observations, for example, when resetting liquid-in-glass thermometers the reset values are compared with the ambient air temperature. At field stations with both manual and automated instruments, liquid-in-glass readings are compared with temperature probe readings. Tolerances would be applied.
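
A minimal sketch of how such a station-level filter might look is given below; the sensor limits, the one-minute aggregation window and the minimum sample count are illustrative only and would in practice come from the instrument specification and the NMHS.

```python
def aggregate_one_minute(samples, lower, upper, min_valid=45):
    """Filter 1-second samples against sensor limits and derive min/max/mean.

    samples     : raw 1-second readings for the aggregation period
    lower/upper : sensor hardware limits (instrument specific)
    min_valid   : minimum number of in-range samples required before a value
                  is reported (varies between manufacturers)
    Returns (minimum, maximum, mean) or None if too few valid samples.
    """
    valid = [s for s in samples if s is not None and lower <= s <= upper]
    if len(valid) < min_valid:
        return None  # not enough valid samples: do not transmit a value
    return min(valid), max(valid), sum(valid) / len(valid)

# Example: one minute of air temperature samples containing an impossible spike
readings = [21.4] * 59 + [842.0]          # 842 degC exceeds the sensor limit
print(aggregate_one_minute(readings, lower=-80.0, upper=60.0))
```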

Frequency of Test: At observation frequency

1.2 Database range test
Brief Description: Detects observations that are outside the range of database acceptance (database
technical constraints).

Parameters test applies to: all parameters stored in database

Detailed Description: The database range test is generally performed at the ingest system, in the quality control software and at the database. The limits are set broadly, to allow storage of data from all over the world received via the GTS. The aim is to reject values created by oversight, e.g. during correction or ingest of data into the database. Examples: temperature > 80 °C, negative pressure.

Frequency of Test: At QC-software and ingest system

1.3 Domain test

Brief Description: Determines if the meteorological value is within the realm of scientific possibility

Parameters test applies to: temperature, humidity, barometric pressure, wind, etc.

Detailed Description: The test checks if an observed value lies between the 0.3rd and 99.7th percentiles of all values from the past 30 years. The test applies to both manual and automated observations. Where an observation is an accumulated value, the observation is divided by the accumulation period so that it is tested against a value per day.
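
For illustration, the check could be sketched as below; the 0.3/99.7 percentile bounds follow the description above, while the use of numpy and the synthetic 30-year record are assumptions made for the example.

```python
import numpy as np

def domain_test(value, historical_values, period_days=1):
    """Flag a value as suspect if it falls outside the 0.3-99.7 percentile
    range of the historical record (e.g. the past 30 years of observations).

    Accumulated observations are divided by the accumulation period so the
    comparison is made against a per-day value.
    """
    per_day = value / period_days
    low, high = np.percentile(historical_values, [0.3, 99.7])
    return "suspect" if not (low <= per_day <= high) else "pass"

# Example with synthetic daily maximum temperatures
rng = np.random.default_rng(0)
history = rng.normal(loc=25.0, scale=6.0, size=30 * 365)   # roughly 30 years of values
print(domain_test(24.0, history))   # pass
print(domain_test(58.0, history))   # suspect
```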

Frequency of Test:

2. Consistency Tests

A range of tests to ensure that inconsistent, unlikely, or impossible records are either rejected or
flagged as suspect. A manual investigation may then assess the validity of the suspect values.

2.1. Three hourly test

Brief Description: Tests for consistency between 3-hourly values and daily values.

Parameters test applies to: Daily maximum and minimum air temperature, dry bulb temperature,
daily maximum wind gust speed, hourly wind speed;

Detailed Description: For daily minimum temperature - the difference between the value and the
lowest 3 hourly temperature over the previous 24 hours is calculated and tested.
For daily maximum temperature - the difference between the value and the highest 3 hourly
temperature over the next 24 hours is calculated and tested.
For daily max wind gust – the difference between the value and the highest 3 hourly wind speed
over the previous 24 hours (00:00:00 to 23:59:59) is calculated and tested.
Frequency of Test: Daily

2.2. Daily Min vs Ground Min test

Brief Description: Compares the daily minimum and daily ground minimum temperatures.

Parameters test applies to: Daily terrestrial minimum temperature, daily minimum air temperature;

Detailed Description: The daily minimum temperature being tested should be warmer than, or equal to, the ground minimum temperature after a normalisation constant has been subtracted from it.
False positives can be generated by this test for sites that have a high level of heavy metals in the
soil (for instance iron oxide).

Frequency of Test: Daily

Test Order: Test run after the air minimum temperature has been assessed

2.3. Hourly MSLP and SLP difference test

Brief Description: Test for significant change in difference between mean sea level pressure (MSLP)
and station level pressure (SLP) over two consecutive recordings.

Parameters test applies to: Hourly MSLP and hourly SLP

Detailed Description: Test for significant change in difference between mean sea level pressure
(MSLP) and station level pressure (SLP) over two consecutive recordings. Test can be performed
using hourly, 3 hourly, etc. observations. This test checks for transposition and calculation errors.
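
A minimal sketch of the check is shown below; the 0.3 hPa tolerance on the change in the MSLP minus SLP difference is illustrative, and the operational threshold would be set by the NMHS.

```python
def mslp_slp_difference_test(prev_mslp, prev_slp, mslp, slp, tolerance=0.3):
    """Flag a report as suspect when the difference between MSLP and SLP
    changes significantly between two consecutive observations.

    tolerance is the allowed change in (MSLP - SLP) in hPa; a larger jump
    points to a transposition or reduction-to-MSL calculation error.
    """
    change = abs((mslp - slp) - (prev_mslp - prev_slp))
    return "suspect" if change > tolerance else "pass"

# Station roughly 100 m above mean sea level, so MSLP - SLP is about 12 hPa
print(mslp_slp_difference_test(1013.2, 1001.3, 1012.8, 1000.9))  # pass
print(mslp_slp_difference_test(1013.2, 1001.3, 1021.8, 1000.9))  # suspect (likely transposition)
```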

Frequency of Test: For every observation

2.4. Precipitation multi source comparison test

Brief Description: Checks if data from one source is consistent with data from another source. The
data sources may be different instruments or the same instrument from different data delivery
paths or ingests.

Parameters test applies to: Precipitation

Detailed Description: This test checks to see that the data at given data source is consistent with
data from another co-located data source. The two data sources may be from different instruments
or may be the same instrument from different data delivery paths or ingests.
The test applies equally whether the comparison is between different co-located instruments or between the same instrument reported via different data delivery sources. Where it is the same instrument, the two values are expected to be identical. Where a discrepancy exists and it is not possible to make an objective decision as to which value is correct, the data are flagged accordingly. Consequently the same pass criteria are used whether the instrument is the same or different. Other tests and analyses (e.g. of the spatial and temporal distribution) will aid in determining the most likely correct value. Other analysis outside of this test should be used to identify stations with consistent anomalies.

Frequency of Test:

2.5. Zero precipitation spatial test

Brief Description: Checks for instances when precipitation is recorded at one site but not at the
neighbouring sites and vice versa. Typically, this indicates precipitation being recorded on the wrong
day.

Parameters test applies to: Daily precipitation

Detailed Description: This test checks for instances where a precipitation value is recorded but all the neighbours report zero, and vice versa. This typically occurs when the value is recorded on the wrong day. Such cases are not picked up by the general spatial tests, because those tests are not applied to low rainfall readings owing to the spatial variability associated with showers.
The test is designed to have a low false failure rate rather than to raise every potential suspect.
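
For illustration, a simple form of the check might look like the following sketch; requiring all neighbours to report, and testing only the strict zero/non-zero contrast, reflects the low false failure rate sought above.

```python
def zero_precipitation_spatial_test(target, neighbours):
    """Flag daily precipitation as suspect when the target and all of its
    neighbours disagree about whether any rain fell at all.

    target     : precipitation at the station of interest (mm), or None
    neighbours : list of neighbour precipitation values (mm); Nones are ignored
    """
    reported = [n for n in neighbours if n is not None]
    if target is None or not reported:
        return "not tested"
    if target > 0 and all(n == 0 for n in reported):
        return "suspect"          # rain at the target only: possibly the wrong day
    if target == 0 and all(n > 0 for n in reported):
        return "suspect"          # rain everywhere except the target
    return "pass"

print(zero_precipitation_spatial_test(14.2, [0.0, 0.0, 0.0, 0.0]))  # suspect
print(zero_precipitation_spatial_test(0.0, [1.2, 0.0, 3.4]))        # pass
```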

Frequency of Test: Run daily but analysed in monthly blocks

2.6. Insufficient neighbours test

Brief Description: Checks if there is a sufficient number of neighbouring stations to perform spatial
tests. Test generally performed on the first day of the month and for precipitation data.

Parameters test applies to: Precipitation

Detailed Description: A check for cases where there are insufficient neighbours to perform the suite
of spatial tests. This will ‘fail’ and prompt the QC operators to manually check the data. This test is
generally performed on the 1st day of the month (other days in the month are not tested) – the QC
operator will check all the days for the month.
Note the test should be performed when the precipitation amount is 0, >0 or null.

Frequency of Test: Run on first day of the month only

Test Order: Run before other Precipitation spatial tests

2.7. Precipitation period test

Brief Description: A consistency test assessing overlap and underlap for rainfall accumulations.

Parameters test applies to: Daily precipitation

Detailed Description: A record's period value (number of days the rainfall record is accumulated
over) is either larger or less than the actual number of days that the precipitation data is reported as
accumulated. This test is specifically designed for rainfall precipitation records. There are two parts
in this test: Overlap and Underlap.

Example Overlap:

When a record's period value is larger than actual null precipitation reported dates. For example, if
the period value is reported as 3 days on 4th of May, but rainfall (a non-null value) was last reported
on the 2nd of May, the actual period gap (rainfall accumulation) is only two days (target date
inclusive). For this example the rainfall value on the 2nd of May would be assessed for accuracy
(rainfall can be recorded on the incorrect day). If both the rainfall values on the 2nd and 4th of May are assessed as correct, then it is likely that the period is in error. Another possibility is that the 4th of May value is the aggregate of the 2nd to the 4th of May. To determine the error, the QC operator will use spatial data and other tools such as radar to confirm the analysis. The station record will be assessed to ensure data were correctly digitised.

Example Underlap:

When a record's period value is less than the actual number of null precipitation reported dates. For example, if the period of rainfall accumulation is reported as 2 days on the 6th of May, but the last recorded non-null precipitation record is on the 3rd of May, the actual period gap is 3 days (target date inclusive).

Frequency of Test:

2.8. Tracking tests

Brief Description: Compares how two elements or two neighbours both rise and fall together, when
both are expected to rise and fall together. Very effective tests with low false error rates.

Parameters test applies to:

● Mean Sea Level Pressure and Station Pressure


● Terrestrial Minimum Temperature and Minimum Air Temperature
● Soil Temperatures at depth compared to other Soil Temperatures at a different depth
● Dry Bulb Temperature, Wet Bulb Temperature and Dewpoint Temperature (particularly for
manually read thermometers)
● Daily Evaporation and Daily Precipitation
● Under 3m Wind Run with the 10m Wind Run values

Spatial correlation tracking to neighbours: This may be visually assessed by the QC operator using a time-series neighbours plot on a GUI rather than identified by a computer algorithm. It applies to many parameters, particularly for networks with good spatial coverage; the following elements are generally assessed:
● Hourly Dry Bulb Temperature
● Hourly Dew Point Temperature
● MSLP and SLP
● Wind Speed

Detailed Description: The tracking test looks at how two elements, or the same element at two neighbouring stations, rise and fall together. The test examines how an element corresponds to another element, or to its neighbour, over time, but uses a very simple premise: the two series are expected to rise and fall together.
In assessing the anomalies produced by this test the QC operator must take into consideration the
prevailing meteorological conditions including local effects, and the spatial density of the network.

Frequency of Test: Dependent on observation frequency for both parameters

2.9. Maximum Air Temperature Consistency test

Brief Description: Checks for consistency between daily maximum air temperature with sub-daily
observations.

Parameters test applies to: Maximum Air Temperature

Detailed Description: Checks the daily maximum air temperature for consistency with the hourly (or 3-hourly) air temperature observations, i.e. no value recorded in the hourly or 3-hourly records should be greater than the recorded maximum temperature. A reading error of 0.5 degrees is allowed. The test can be run with as few as one observation per day, but its effectiveness improves with a greater frequency of sub-daily observations.
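
A sketch of the comparison, using the 0.5 degree reading tolerance mentioned above:

```python
def max_temperature_consistency_test(daily_max, sub_daily_temps, tolerance=0.5):
    """Flag the daily maximum air temperature as suspect when any hourly or
    3-hourly air temperature exceeds it by more than the reading tolerance."""
    observed = [t for t in sub_daily_temps if t is not None]
    if daily_max is None or not observed:
        return "not tested"
    return "suspect" if max(observed) > daily_max + tolerance else "pass"

print(max_temperature_consistency_test(31.2, [24.0, 27.5, 30.9, 28.3]))  # pass
print(max_temperature_consistency_test(31.2, [24.0, 33.4, 30.9, 28.3]))  # suspect
```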
Frequency of Test: Daily

2.10. Solar hour – Astronomical test

Brief Description: Tests the difference between the sunshine duration and the calculated day length
in hours.

Parameters test applies to: sunshine duration

Detailed Description: The difference between the recorded sunshine duration and the calculated astronomical day length is computed. The test is failed if the difference is greater than zero, i.e. if more sunshine is recorded than is astronomically possible.
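
A sketch of the comparison is given below. The day length is derived from the standard sunset hour angle formula with an approximate (Cooper) solar declination; an NMHS would normally use its own astronomical routines and may allow a small tolerance for instrument and timing effects.

```python
import math

def day_length_hours(latitude_deg, day_of_year):
    """Approximate astronomical day length (hours) from latitude and date,
    using the Cooper approximation for solar declination."""
    decl = math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)
    lat = math.radians(latitude_deg)
    cos_h0 = -math.tan(lat) * math.tan(decl)
    cos_h0 = max(-1.0, min(1.0, cos_h0))        # clamp for polar day / polar night
    return 2 * math.degrees(math.acos(cos_h0)) / 15.0

def solar_hour_test(sunshine_hours, latitude_deg, day_of_year):
    """Fail if the recorded sunshine duration exceeds the possible day length."""
    return "fail" if sunshine_hours > day_length_hours(latitude_deg, day_of_year) else "pass"

print(round(day_length_hours(-35.0, 172), 1))   # mid-winter day length at 35 deg S
print(solar_hour_test(13.5, -35.0, 172))        # fail: more sunshine than daylight
```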
Frequency of Test: Daily

2.11. Consistency test between Air temperature and Wet Bulb temperature

Brief Description: Compares the air temperature with the wet bulb temperature.

Parameters test applies to: Dry Bulb temperature and Wet Bulb temperature

Detailed Description: The dry-bulb and wet-bulb temperatures are compared; if the wet-bulb temperature, after allowing for a tolerance, is greater than the air (dry-bulb) temperature, the test fails.
Note that while the tolerance in this test applies to manually read liquid in glass thermometers, the
same algorithm is usually applied to electronic temperature sensor data. While it is recognised that
the tolerance for electronic temperature sensors should be much lower, it is convenient in a hybrid
network to use the same algorithm due to difficulties in discriminating instrument type in the
Quality Control stage.
Frequency of Test: Hourly

2.12. Present Weather and Air Temperature consistency test

Brief Description: Air temperature is compared to depositing rime and freezing precipitation.

Parameters test applies to: Manually observed Dry Bulb temperature and phenomena

Detailed Description: The air temperature is compared with the reported phenomena; reports of depositing rime or freezing precipitation are flagged as suspect when the air temperature is inconsistent with such phenomena (e.g. well above freezing).


Frequency of Test: Hourly

2.13. Wet bulb vs dew point consistency test

Brief Description: Tests the difference between wet bulb and dew point temperatures.

Parameters test applies to: Wet Bulb Temperature and Dew Point Temperature

Detailed Description: Tests the difference between wet bulb and dew point temperatures. The test
fails if the wet bulb temperature is colder than the dew point temperature. This test applies to
manually read thermometer readings.

Frequency of Test: As per observation frequency

2.14. Wet bulb vs dew point test

Brief Description: Wet bulb temperature is tested against dew point temperature

Parameters test applies to: Wet Bulb Temperature and Dew point temperature

Detailed Description: The wet bulb temperature is tested against the dew point temperature by recalculating the dew point temperature from the wet bulb and dry bulb (air) temperatures: the test passes when the recalculated dew point temperature minus 1 is less than the observed dew point temperature, and the observed dew point temperature is less than the recalculated dew point temperature plus 1. This test is used for manual observations where the dew point temperature is calculated by the observer. It has greater sensitivity than the wet bulb vs dew point consistency test (2.13) and could be run after that test to check for incorrectly calculated dew points that nevertheless pass it. For dew point values that fail this test the QC operator should recalculate the dew point using the observation method in use for that country.
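
The sketch below illustrates the recalculation. The Magnus saturation vapour pressure coefficients and the psychrometer coefficient are illustrative; each NMHS should substitute the psychrometric formula that matches its own observation practice (aspirated versus screen psychrometer, water versus ice bulb).

```python
import math

def saturation_vapour_pressure(t_c):
    """Saturation vapour pressure (hPa) over water, Magnus-type approximation."""
    return 6.112 * math.exp(17.62 * t_c / (243.12 + t_c))

def dew_point_from_wet_bulb(t_dry, t_wet, pressure_hpa=1013.25, a=6.6e-4):
    """Recalculate the dew point (deg C) from dry- and wet-bulb temperatures
    using a simple psychrometer equation e = es(Tw) - A * p * (T - Tw)."""
    e = saturation_vapour_pressure(t_wet) - a * pressure_hpa * (t_dry - t_wet)
    ln_term = math.log(e / 6.112)
    return 243.12 * ln_term / (17.62 - ln_term)

def wet_bulb_dew_point_test(t_dry, t_wet, t_dew_observed, pressure_hpa=1013.25):
    """Fail if the observer's dew point differs from the recalculated dew point
    by more than 1 degree (the tolerance given in the description above)."""
    t_dew_calc = dew_point_from_wet_bulb(t_dry, t_wet, pressure_hpa)
    return "pass" if abs(t_dew_observed - t_dew_calc) < 1.0 else "fail"

print(wet_bulb_dew_point_test(t_dry=25.0, t_wet=18.0, t_dew_observed=14.5))  # pass
```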

Frequency of Test: As per observations frequency

2.15. Dew point air temperature consistency test

Brief Description: Tests if dew point temperature is less than or equal to air temperature.

Parameters test applies to: Dew point temperature and air temperature

Detailed description: Tests if the dew point temperature is less than or equal to the air temperature. The dew point temperature can validly equal the air temperature only when the air is saturated; situations where this can occur include precipitation and fog events. It is important that the QC operator assesses whether an equal dew point temperature is the result of saturated conditions and not due to observer error in calculating the dew point.

Frequency of Test: As per observations frequency

2.16. Visibility consistency test

Brief Description: Test for visibility consistency with present weather code

Parameters test applies to: present weather and visibility

Detailed Description: Tests the visibility for consistency with the present weather code and phenomena flags (fog, sandstorm, mist, dust storm). The test applies to horizontal visibility and is used with manual observations. Examples of unlikely/impossible combinations (to be adapted to different climates): visibility above 1000 m with fog; visibility above 1000 m or below 200 m with slight or moderate sand or dust storm; visibility above 10 km with mist, etc.
Frequency of Test: As per observations frequency

2.17. Total cloud amount consistency test

Brief Description: Test for consistency of total cloud amount against various elements. Used with
manual observations.

Parameters test applies to: total cloud amount, present weather

Detailed Description: Test for consistency of total cloud amount against cloud height and present
weather.
Examples of unlikely/impossible combinations: total cloud amount less than total low (mid, high)
level cloud amount, total cloud amount invisible and precipitation trails (virga), cloudless and
precipitation etc.
Frequency of Test: As per observational frequency

2.18. Daily minimum versus ground minimum test (Range Test)

Brief Description: Checks the temporal consistency of daily minimum taken from the screen versus
ground minimum taken from the ground.

Parameters test applies to: Daily minimum air temperature and the daily minimum terrestrial
(ground) temperature

Detailed Description: This test checks the consistency of the daily minimum temperature, which is taken in the screen, against the ground minimum temperature, which is taken on the ground. It typically picks up issues in the ground minimum, as that measurement can be problematic if not done correctly or if bubbles occur in the thermometer.
Frequency of Test: Daily

2.19. The daily phenomena flag test

Brief Description: These tests check the consistency of the daily phenomena flags.

Parameters test applies to: Phenomena including hail, snow, thunderstorm, dust storm, strong wind,
gales, frost, mist, haze, fog etc.

Detailed Description: The tests check the consistency of the daily phenomena flags (days with hail, fog, etc.) with codes in the sub-daily tables and various other daily elements. They will detect instances where a phenomenon has been reported as a synoptic present weather code but has not been flagged in the midnight-to-midnight phenomena category, and vice versa.

Examples for suspect combinations (may need to be adapted to different climate conditions): snow
flag and minimum air temperature above 9 °C, no thunderstorm flag but thunderstorm observed
during the day, no dust/haze/fog flag but visibility below related thresholds during the day, frost flag
but minimum air temperature above 4 °C etc.

Frequency of Test: Daily

2.20. Present weather consistency tests

Brief Description: These tests check the consistency of the present weather codes in the sub-daily
tables and various other daily elements.

Parameters test applies to: present weather, and temperature etc

Detailed description: The test flags unlikely combinations of present weather and temperature, such as freezing precipitation or fog depositing rime codes together with warm air temperatures. More combinations can be included.

Frequency of Test: Daily

2.21. Soil temperature test

Brief Description: Check the consistency of soil temperatures at various depths.

Parameters test applies to: Soil temperature readings at depths of 5 cm, 10 cm, 20 cm, 50 cm and 1 m

Detailed Description: The test looks at various combinations of soil temperatures at different depths,
e.g. (may need to be adapted to different climate):

Target Element | Test Expression (Suspect if true)
Soil temperature at 5 cm depth (SOIL 5 temp) | (SOIL 5 temp = SOIL 10 temp AND SOIL 5 temp = SOIL 20 temp AND SOIL 5 temp = SOIL 50 temp AND SOIL 5 temp = SOIL 100 temp) OR SOIL 5 temp - SOIL 10 temp > 16 OR SOIL 10 temp - SOIL 5 temp > 6
Soil temperature at 10 cm depth (SOIL 10 temp) | SOIL 10 temp - SOIL 20 temp > 10 OR SOIL 20 temp - SOIL 10 temp > 6
Soil temperature at 20 cm depth (SOIL 20 temp) | SOIL 50 temp - SOIL 20 temp > 6 OR SOIL 20 temp - SOIL 50 temp > 7
Soil temperature at 50 cm depth (SOIL 50 temp) | SOIL 50 temp - SOIL 100 temp > 7 OR SOIL 100 temp - SOIL 50 temp > 4

Note: Soil temperatures are in degrees Celsius and thermometer depth is in cm
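
The combinations in the table above could be coded as in the following sketch; the thresholds are those listed in the table and may need to be adapted to the local climate.

```python
def soil_temperature_test(t5, t10, t20, t50, t100):
    """Return the list of suspect soil temperature checks from the table above.
    All temperatures are in degrees Celsius; depths are in cm."""
    suspects = []
    if t5 == t10 == t20 == t50 == t100:
        suspects.append("all depths identical")
    if t5 - t10 > 16 or t10 - t5 > 6:
        suspects.append("5 cm vs 10 cm difference")
    if t10 - t20 > 10 or t20 - t10 > 6:
        suspects.append("10 cm vs 20 cm difference")
    if t50 - t20 > 6 or t20 - t50 > 7:
        suspects.append("20 cm vs 50 cm difference")
    if t50 - t100 > 7 or t100 - t50 > 4:
        suspects.append("50 cm vs 100 cm difference")
    return suspects or ["pass"]

print(soil_temperature_test(t5=28.0, t10=24.5, t20=22.0, t50=19.5, t100=18.0))  # pass
print(soil_temperature_test(t5=41.0, t10=24.5, t20=22.0, t50=19.5, t100=18.0))  # 5 cm spike
```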

Frequency of Test: As per observation frequency

3. Heuristic Tests
Set of tests (usually multivariate) that rely on experience and knowledge of observation processes,
techniques and instrumentation to detect inconsistent, unlikely or impossible records and flag
them as suspect. A manual investigation may then assess the validity of the suspect values.

3.1. Relative humidity dry wet bulb test

Brief Description: Test to determine if wet bulb wick has dried out or is contaminated.

Parameters test applies to: Wet bulb temperature, Dry bulb temperature

Detailed Description: This test looks for a wet bulb wick that has dried out by examining the 15:00
observations where the relative humidity is > 90%. To reduce the false error rate, cases where there
is precipitation or fog at the same time are eliminated by examining the past and present weather
codes.
The test will be performed only if:

● The observation time is 15:00 (or the equivalent time where the maximum thermometer is
read in the afternoon)
● The elevation < 1100 m (it is not possible to distinguish the effects of orographic cloud vs
drying wet bulbs using this test).

Frequency of Test: daily for 15:00 observation (or equivalent)

Future directions for heuristic testing


Although currently untested, the use of numerical weather prediction forecasts in quality control of
data offers great promise. Site-specific weather forecasts could be used to determine the historical
difference (anomalies) between forecasted and observed values. If the difference between the
forecast and the observation is larger than a certain threshold, the observation could be flagged as
suspect. The advantage of this approach is that the magnitude of the differences is generally small
and hence the threshold value above which an observation would be considered suspect can be
determined fairly tightly. The disadvantage of the method is that if the forecast is wrong then a
correct observation may be flagged as suspect.

4. Data Provision Tests
Set of tests to ensure that observations that do not match the expected schedule of observations
are either rejected or flagged as suspect

4.1. Observation received from future test

Brief Description: Usually presented as a report for instances where the observer has sent the
observation ahead of time.

Parameters test applies to: Compares local clock time of observation with time received.

Detailed Description: Usually presented as a report for instances where the observer has sent the
observation ahead of time. The region will set the limit constituting the allowable ‘earliness’ of the
observation, for instance a 30 minute limit will accept data from observations reported up to 30
minutes earlier than the official observation time. It is recommended that a log identifying
consistently early observations received from a station be maintained to highlight deficiencies in
observational practice; these logs can be used to prioritise training requirements.

Frequency of Test: As generated

4.2. Period gap test

Brief Description: Tests if the periods (day accumulations) match the records that are present.

Parameters test applies to: daily parameters

Detailed Description: The period gap test applies to daily observations and simply tests to see if the
periods (day accumulations) match the records that are present. For instance, a synoptic message is received on the Monday morning for a station that has not recorded data over the weekend. The maximum and minimum temperature values in the data base for the Monday will need to be checked, as they will be multi-day values even though they are archived as single-day values.
Frequency of Test: Daily

4.3. Large period test

Brief Description: Tests if periods (day accumulations) are excessively large (a period of several days)
for parameters where a large gap between daily readings may affect the integrity of the data; for
instance if rainfall is not read for 5 days in a hot climate, the rainfall total is likely to be reduced by
evaporation.

Parameters test applies to: Daily rainfall and evaporation readings

Detailed Description: The large period test picks up unreasonably large periods. For rainfall and
evaporation a smaller period gap may be acceptable depending on the NMHS’s climate and user
requirements. This test can be used by the NMHS to set the day accumulation limit where rainfall
and evaporation totals detected by this test may be considered suspect.
Frequency of Test: Daily

5. Statistical Tests
Set of tests that statistically analyse historical data to detect inconsistent, unlikely or impossible
records and flag them as suspect. A manual investigation may then assess the validity of the
suspect values.

5.1. Climate-based range test

Brief Description: Compares the meteorological value with the climatological upper and lower values.
Thresholds can be calculated to take into account seasonal variations in the observations.

Parameters test applies to: Generally daily parameters but can also be used for sub daily elements if
their QMP has been calculated.

Detailed Description: Compares the meteorological value with the climate upper and lower values.
Thresholds can be calculated to take into account seasonal variations in the observations. The signal for this test relies on having a sufficient number of years of record (at least 30 years is recommended) to ensure a QMP that represents climate variation. Hence, this test will be noisy for new sites and sites with an incomplete temporal record; in this instance the QMP of a nearby station with a similar climate can be used. The Normalisation Constant is used to determine the ratio of false positive errors detected by this test and can be adjusted accordingly.

Frequency of Test: Daily and sub daily

5.2. Flat line tests

Brief Description: Checks the length of a run of meteorological values which remain consistently the
same, i.e. if the parameter is unchanged over time.

Parameters test applies to: Daily and sub daily parameters

Detailed Description: Flat lines may occur with faulty equipment or when data are incorrectly keyed in during manual entry. It is important to note that this test is of limited value for some parameters that have small variability, such as soil temperatures taken well below the surface level.
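
A sketch of a run-length check follows; the maximum allowed run is treated as a per-element threshold (soil temperatures at depth, for example, would need a much longer allowed run than air temperature), and the value shown here is illustrative.

```python
def flat_line_test(values, max_run):
    """Flag a series as suspect if any value repeats unchanged for more than
    max_run consecutive observations (missing values break the run)."""
    run, longest = 1, 1
    for prev, cur in zip(values, values[1:]):
        if prev is not None and cur is not None and cur == prev:
            run += 1
        else:
            run = 1
        longest = max(longest, run)
    return "suspect" if longest > max_run else "pass"

hourly_temps = [18.2, 18.2, 18.2, 18.2, 18.2, 18.2, 18.2, 18.2, 17.9]
print(flat_line_test(hourly_temps, max_run=6))   # suspect: 8 identical values in a row
```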

Frequency of Test: As per observation

5.3. Rapid Change test

Brief Description: Checks the difference between the currently observed value and the previous
value for significance.

Parameters test applies to: Sub daily parameters generally with observational frequency of three
hours or less

Detailed Description: The rapid change test picks up problems where there has been a rapid rise or fall in the data with time. The test assumes that the rapid change is not due to meteorological variability, which is why it loses sensitivity for longer observation time intervals.

It can be shown that parameter variability is not constant across stations, months or hours of the day. This temporal and spatial variation can justify the calculation of discrete QMPs 4 at the station, month and hour level.

When the time period between the observed value and the next or previous value is greater than 3 hours, the test is generally not performed, because its sensitivity is reduced by the variability that results from changing meteorological conditions.
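
A sketch of the check is given below, assuming the allowed step is a QMP looked up for the station, element, month and hour as described above; the 8.0 degree step used in the example is an illustrative stand-in for that QMP.

```python
def rapid_change_test(previous, current, hours_apart, max_step, max_gap_hours=3):
    """Flag the current observation as suspect when it differs from the previous
    observation by more than the allowed step (QMP) for that station/element.

    The test is skipped when the time between observations exceeds max_gap_hours,
    because normal meteorological variability then dominates."""
    if previous is None or current is None or hours_apart > max_gap_hours:
        return "not tested"
    return "suspect" if abs(current - previous) > max_step else "pass"

# Illustrative QMP: largest plausible 1-hourly temperature change for this station/month/hour
qmp_step = 8.0
print(rapid_change_test(21.5, 23.0, hours_apart=1, max_step=qmp_step))   # pass
print(rapid_change_test(21.5, 35.0, hours_apart=1, max_step=qmp_step))   # suspect
```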

Frequency of Test: As per observation

4
For some parameters and sites the magnitude of the rise and fall may not be equal, consistent with a skewed distribution. Calculating QMPs with discrete thresholds for rise and fall results in more sensitive tests.
5.4. Spike test

Brief Description: Compares a given meteorological observation with the previous and next values.

Parameters test applies to: Sub daily data

Detailed Description: Similar to the rapid change test, however, this test looks for rise up and then
down (or down then up). The QMP limits on the spike test are smaller than the rapid change test.
When the time period between the observed value and the next or previous value is greater than 3
hours, the test is not performed. This test performs best on high-frequency data with observation intervals of 20 minutes or less.
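
A sketch follows the same pattern as the rapid change test but requires a departure from both the previous and the following value in opposite directions; the 5.0 degree spike threshold is an illustrative stand-in for the QMP.

```python
def spike_test(previous, current, following, max_spike,
               gap_before=1, gap_after=1, max_gap_hours=3):
    """Flag the middle observation as suspect when it departs from both the
    previous and following values by more than max_spike, in opposite directions."""
    if None in (previous, current, following):
        return "not tested"
    if gap_before > max_gap_hours or gap_after > max_gap_hours:
        return "not tested"
    up_spike = current - previous > max_spike and current - following > max_spike
    down_spike = previous - current > max_spike and following - current > max_spike
    return "suspect" if up_spike or down_spike else "pass"

print(spike_test(18.4, 29.0, 18.9, max_spike=5.0))   # suspect: isolated spike
print(spike_test(18.4, 20.1, 21.6, max_spike=5.0))   # pass: steady rise
```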

Frequency of Test: As per observation

5.5. Frequency test

Brief Description: Checks for instances of excessive rounding of a value. Applies to manual
observations where an operator rounds to the graduation marks on the instrument rather than
estimating the value between the instrument graduations.

Parameters test applies to: Usually applies to manual reading of thermometers and rain gauges

Detailed Description: The frequency of the last digit (the first decimal place for elements such as
temperature) is examined over a number of days and stored as a QMP. If the frequency of one of the
values is ‘high’ then the test is failed. The frequency is expressed in terms of a percentage of the
values analysed (the values must be non-null). Note that rounded values are not assessed as suspect
but are given a flag to indicate a lower level of confidence. This test is important for identifying
training issues.
The test will be performed for a whole month – and assigned to a nominated element and date.
The test will not be performed unless at least 75% of the values exist (and are non-null) over the
period.
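
A sketch of the last-digit frequency check over a month of values is given below; the 75% completeness requirement follows the description above, while the 30% share used to define a "high" frequency is illustrative.

```python
from collections import Counter

def frequency_test(values, expected_count, max_share=0.30, min_completeness=0.75):
    """Flag a month of manual readings when one final digit (first decimal place)
    dominates, suggesting rounding to instrument graduations.

    values         : non-null observations for the month
    expected_count : number of observations scheduled for the month
    max_share      : share of one final digit above which the test fails (illustrative)
    """
    if len(values) < min_completeness * expected_count:
        return "not tested"
    last_digits = [int(round(abs(v) * 10)) % 10 for v in values]
    most_common_share = Counter(last_digits).most_common(1)[0][1] / len(last_digits)
    return "fail" if most_common_share > max_share else "pass"

rounded_month = [round(20 + d * 0.5, 1) for d in range(30)]   # values ending only in .0 or .5
print(frequency_test(rounded_month, expected_count=31))       # fail
```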
Frequency of Test: Monthly

5.6. Spatial variability test

Brief Description: Daily climatology of the differences between the station of interest and nearby
stations is compared.

Parameters test applies to: Daily and sub daily parameters

Detailed Description: The estimate is calculated using a simple standard deviation of the values from the neighbours, excluding the target station; this is added to and subtracted from the neighbours' mean value to give the limits against which the target value is compared.
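
A sketch of the neighbour-based limits is shown below; the factor of three standard deviations is illustrative and would operationally be tuned to the desired false alarm rate.

```python
import statistics

def spatial_variability_test(target, neighbours, n_std=3.0):
    """Flag the target value as suspect when it lies outside the neighbours'
    mean plus/minus n_std times the neighbours' standard deviation.
    The target station is excluded from the neighbour statistics."""
    obs = [v for v in neighbours if v is not None]
    if target is None or len(obs) < 3:
        return "not tested"
    mean = statistics.mean(obs)
    spread = statistics.stdev(obs)
    lower, upper = mean - n_std * spread, mean + n_std * spread
    return "suspect" if not (lower <= target <= upper) else "pass"

print(spatial_variability_test(31.4, [30.8, 32.1, 29.9, 31.0]))   # pass
print(spatial_variability_test(45.0, [30.8, 32.1, 29.9, 31.0]))   # suspect
```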

Frequency of Test: As per observation frequency

5.7. Barnes spatial test

Brief Description: Compares the value of the meteorological element with the surrounding data values. A Barnes analysis is used to weight nearby stations 5.
Parameters test applies to: Daily and sub daily parameters

Detailed Description: The aim of the Barnes Analysis is to estimate a value at a target station, using
the actual values from all points surrounding using a weighting function. Nearby points should have
more influence and are therefore allocated a greater weighting.
The basic principle of the scheme is that a value of the variable under consideration is calculated at each of a number of network points 6; the network points are station locations. At each station location there is the observed value of the variable and also the interpolated value based on data from nearby observations, with an observation's influence on a point being determined by a weighting function. At the point P under analysis there will be the observed value and the interpolated value, but the observed value is not used to derive the interpolated values at nearby stations; it is used only for comparison against the interpolated value at P.
Constraint: This test loses sensitivity for comparison stations beyond a set distance. NMHSs will need to determine the maximum range for the test. In Australia, data-sparse regions are relatively flat with a homogeneous climate, so the range (R) is set to 200 km and stations beyond this distance are excluded from the analysis.
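
For illustration, the sketch below uses a single-pass, Gaussian-weighted form of the Barnes scheme to estimate the target value from its neighbours and compares it with the observation; the length scale, the 200 km range and the acceptance tolerance are illustrative, with operational values coming from the NMHS's own QMPs.

```python
import math

def barnes_estimate(neighbour_values, distances_km,
                    length_scale_km=75.0, max_range_km=200.0):
    """Barnes-style estimate at the target: a distance-weighted mean of
    neighbour values with Gaussian weights w = exp(-(r/L)^2).
    The target's own observation is not included."""
    weight_sum, weighted_sum = 0.0, 0.0
    for value, r in zip(neighbour_values, distances_km):
        if value is None or r > max_range_km:
            continue
        w = math.exp(-(r / length_scale_km) ** 2)
        weight_sum += w
        weighted_sum += w * value
    return weighted_sum / weight_sum if weight_sum > 0 else None

def barnes_spatial_test(target, neighbour_values, distances_km, tolerance=4.0):
    """Flag the target as suspect when it departs from the Barnes estimate
    by more than the tolerance."""
    estimate = barnes_estimate(neighbour_values, distances_km)
    if target is None or estimate is None:
        return "not tested"
    return "suspect" if abs(target - estimate) > tolerance else "pass"

temps = [22.4, 23.1, 21.8, 24.0]
dists = [15.0, 42.0, 60.0, 110.0]
print(round(barnes_estimate(temps, dists), 1))
print(barnes_spatial_test(22.0, temps, dists))    # pass
print(barnes_spatial_test(31.0, temps, dists))    # suspect
```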

Frequency of Test: Daily and sub daily

5
Other types of spatial analysis are under consideration and may be used in a later phase.
6
This network can be a grid where the grid values are interpolated back to the station locations. In this
description of the algorithm the network is simply the location of the stations so the interpolation is done
directly to the station locations. The end effect of the two methods is the same but the network of stations
version is faster. The grid version is convenient if the grid is available from other sources.
5.8. Linear regression spatial test

Brief Description: Compares the meteorological value with the surrounding data values using linear
regression.

Parameters test applies to: Daily precipitation, can be adapted for other parameters

Detailed Description: Linear Regression Spatial Analysis 7 produces an estimate for an element at a target location and time based on how well the neighbours have correlated with the target over time. Linear regression is used to calculate this correlation. The correlation produces a weighting that is used to give more influence to stations which have provided a better fit than others.

Frequency of Test: Daily however the test can be adapted to other observational frequency

7
Adapted from a method described in Hubbard, K.G. et al., 2005, “Performance of Quality Assurance
Procedures for an Applied Climate Information System”, Journal of Atmospheric and Oceanic Technology 22:
105-112.
5.9. Linear regression spatial variability test

Brief Description: Uses the variability of the neighbouring stations to calculate the test limits rather
than the standard error estimates. This test is a variation on the linear regression spatial test and
spatial variability test.

Parameters test applies to: Daily rainfall but can be adapted to other parameters

Detailed Description: This test uses the variability of the neighbours (at the same observation time)
to calculate the test limits rather than the standard error estimates. This is designed to give an
estimate of how reliable the neighbours are in calculating the estimate. This test is well suited to
precipitation with changes in the variability of a set of observations in a given region.
Constraint: This test loses sensitivity for comparison stations beyond a set elevation difference from the target station. NMHSs will need to determine the maximum range for the test. In Australia, the elevation difference assumed to give a homogeneous climate is set to a maximum of 300 m.

Frequency of Test: Daily however can be adapted to other observational frequencies

5.10. Linear regression multi-day test

Brief Description: Compares the climatology of the differences between the station of interest and
nearby stations over a multi-day period. Similar to linear regression spatial test but applies to
multiple days. This test is sensitive for detecting rainfall stations that regularly miss reading the rain
gauge.

Parameters test applies to: Daily rainfall, can be adapted to other parameters

Detailed Description: Although similar to the linear regression spatial test, the nc values etc. need to
be defined differently. For accumulations, precipitation is the sum of the neighbours, maximum
temperature is the max of the neighbours and minimum temperature is the min of the neighbours.
The neighbour values that are included must all have a period of ‘1’ 8 and the sum of the periods for
these records should equal the period for the target.
Frequency of Test: Daily however can be adapted to other observational frequencies

8
It would be possible to include neighbour values where the neighbour periods are > 1 (as long as their period
does not extend beyond the target period) but they are excluded here as these stations traditionally have
poorer quality readings.
5.11. Max / Min test

Brief Description: Tests that the difference between the maximum and minimum temperature is realistic (i.e. greater than zero and less than an upper bound). Bounds are derived from climatology using a minimum of 5 years of data, preferably 30 years. This test is effective at detecting misreads of manually read thermometers, which may be out by 5 or 10 degrees Celsius.

Parameters test applies to: Daily Maximum and Minimum air temperature values

Detailed Description: The difference between the maximum and minimum temperature is
calculated. The test is failed if the difference is less than zero (i.e. the maximum is less than the
minimum) or exceeds an upper bound.
If the difference is less than 0 the test results in an error but if it exceeds the upper bound then the
test results in a suspect value.
Frequency of Test: Daily

