Weather-type dependent homogenization of the daily
Zwanenburg/De Bilt temperature series

Theo Brandsma

Royal Netherlands
Meteorological Institute (KNMI)
PO Box 201, 3730 AE De Bilt, The Netherlands
E-mail theo.brandsma@knmi.nl

ABSTRACT

Inhomogeneities arising from
the use of five different methods to calculate
daily mean temperatures are studied. The inhomogeneities
are artificially imposed on the 20th century
part of the Zwanenburg/De Bilt (1706-present)
temperature time series in the Netherlands.
Three approaches for removing the inhomogeneities
from the daily temperature data are compared.
In the first approach monthly adjustments are
derived. In the second approach daily adjustments
(for each calendar day) are derived that preserve
the monthly adjustments. The third approach
uses the so-called objective Lamb weather types
for deriving weather-type dependent daily adjustments.
The homogenized and non-homogenized daily temperatures
are compared using, among others, the day-to-day
variability and the number of days above the
90th and below the 10th percentile. It is shown
that the method for calculating the daily mean
temperatures is of crucial importance. Also
the choice of the common period used for calculating
adjustments is rather important. Furthermore,
there is no benefit in using daily adjustments
derived from monthly adjustments. Finally, there
is some benefit in using atmospheric circulation
for weather-type dependent homogenization. The
use of an adequate method to calculate the daily
mean temperatures is, however, much more important.

1. INTRODUCTION

Several climate change and variability
studies have identified the need for homogenized
daily time series in order to study the frequency
and intensity of extreme climatic events. However,
homogenization of daily climate series requires
the development of new methodologies, because
the traditional monthly-based adjustments are
too coarse for daily data. In a review of homogenization
methods for climate data, Peterson et al. (1998)
already suggested that future work on homogenization
should include improvement of adjustment methodologies,
adjustments of daily data, and evaluating the
impact of adjustments on extreme events.

Many studies have been undertaken
to homogenize annual or monthly climatological
time series (Peterson et al., 1998; HMS, 1996;
Szalai et al., 1998; Slonosky et al., 1999;
Vincent, 1998; Vincent and Gullett, 1999; Karl
and Williams, 1987; Jones et al., 1986). In
most studies inhomogeneities are traced in annual
series and, subsequently, monthly adjustments
are determined and applied to homogenize the
series. Recently, in the EU-project IMPROVE
(Improved understanding of past climate variability
from early daily European instrumental sources),
an attempt was made to derive daily adjustments
by using daily measurements of cloudiness.

Inhomogeneities in climate time
series arise from non-climatic factors like
changes in station location, changes in methods
to calculate means, changes in observation practices,
changes in instruments and in station environment.
Each of these changes may require a separate
homogenization strategy. The changes may cause
stepwise and/or gradual biases in the climatological
time series, making these series unrepresentative
of the climate of the area concerned.

In the present paper, inhomogeneities
caused by changes in methods to calculate daily
mean temperature are artificially imposed on
the 20th century hourly data set of De Bilt
in the Netherlands. Three approaches for removing
these inhomogeneities are compared. Two of them
are traditional approaches invoking monthly
(12 values) or daily (365 values) adjustments
to the data. The third approach exploits the potential
of the objective Lamb weather types for deriving
weather-type dependent daily adjustments for
daily temperature data. The study
is part of a long-term project aiming at homogenizing
the complete Zwanenburg/De Bilt climatological
time series (1706-present) on a daily basis.

2. METHODOLOGY

2.1 Introduction

The Zwanenburg/De Bilt time series
consists of six stations pasted together (Delft:
1706-1727; Rijnsburg: 1727-1734; Zwanenburg:
1735-1800 and 1811-1848; Haarlem: 1801-1810;
Utrecht: 1849-1897; and De Bilt: 1898-present).
For the last 150 years, the daily mean temperatures
are calculated using the arithmetic mean of
24 hourly values. For earlier dates daily mean
temperature was mostly calculated from three
temperature readings per day, where the time
of measurement may vary. The inhomogeneities
in the monthly time series, caused by using
different methods to calculate the daily mean
temperature, were usually removed by calculating
monthly or daily adjustments for a common period.
The adjustments were applied such that the older
periods were adapted to the most recent period.

In this paper, we imitated the
above-mentioned process to study the homogenization
of daily temperature data, using the hourly
temperatures of De Bilt (1901-1995). A reference
temperature T_{ref} is defined as the arithmetic
mean of 24 hourly values. Five alternative methods
for calculating daily mean temperature are defined.
A transition from one method to another in a time
series may introduce inhomogeneities and biases
with respect to T_{ref}. Three approaches for homogenization
are compared. The first two of these approaches
follow from the current practice for homogenizing
temperature series on a daily or monthly basis.
In the third approach, daily weather types are
used to find weather-type dependent adjustments.

The advantage of using manipulated
modern daily data of De Bilt (1901-1995) is
that the common period equals in fact the whole
record length (1901-1995). Therefore, the errors
can also be calculated for the whole record
length.
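
As a minimal sketch of how such an inhomogeneity arises, the fragment below computes T_{ref} as the mean of 24 hourly values and compares it with the mean of three fixed-hour readings. The observation hours (08, 14 and 20 UTC), the function names and the synthetic diurnal cycle are assumptions made for illustration only; they are not the definitions of T_{1},...,T_{5} in Table 1.

```python
import math

def reference_mean(hourly):
    """T_ref: arithmetic mean of the 24 hourly values of one day."""
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    return sum(hourly) / 24.0

def three_reading_mean(hourly, hours=(8, 14, 20)):
    """Mean of three fixed-hour readings (the hours are an illustrative assumption)."""
    return sum(hourly[h] for h in hours) / len(hours)

# Synthetic day: 10 degC mean, 5 degC amplitude, maximum at 15 UTC (made-up numbers).
day = [10.0 + 5.0 * math.cos(2.0 * math.pi * (h - 15) / 24.0) for h in range(24)]

# Error of the alternative method with respect to the reference for this day.
error = three_reading_mean(day) - reference_mean(day)
```

For this synthetic day the three readings oversample the warm part of the cycle, so the error is positive; with real data the sign and size depend on season and weather.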

2.2 Weather types

The so-called objective Lamb
weather types were used as a measure for daily
weather. The objective Lamb weather classification
scheme, also known as the Jenkinson scheme,
was initially developed (Jenkinson and Collison,
1977) for the UK and the North Sea. From daily
MSLP data on a regular 5° latitude by 10° longitude
grid, extending back to December 1880, the following
air flow indices were derived daily for both
areas: (1) the direction of the flow; (2) the
strength of the flow; and (3) the total shear
vorticity. The latter is a measure of the rotation
of the atmosphere. Positive vorticity corresponds
to a low pressure area (cyclonic) and negative
vorticity corresponds to a high pressure area
(anticyclonic). The values of the three air-flow
indices determine the weather type of the day
considered. Sub-division gives a total of 27
weather types: anticyclonic, cyclonic, 8 directional
types, 16 hybrid types and one type denoted
as undefined. It was shown by Jones et al. (1993),
that for the UK area the resulting scheme quite
well reproduced the seasonal counts of the basic
Lamb Weather Types. An important advantage of
this objective scheme with respect to the latter
is that it can be applied to other parts of
Europe as well. As shown in Figure 1, we centered
the grid near the Netherlands at 50°N, 5°E to
obtain weather types for De Bilt.
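
The step from the three air-flow indices to one of the 27 types can be sketched as follows. The decision rules and the threshold of 6 units for weak flow are the ones commonly quoted for the Jenkinson scheme; they are not stated in this paper and should be read as assumptions, as should the function and argument names. The computation of the indices from the gridded MSLP data is not shown.

```python
import math

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def lamb_type(westerly, southerly, vorticity, weak=6.0):
    """Classify one day from its westerly flow, southerly flow and total shear vorticity.

    The rules below (|Z| < F, |Z| > 2F, weak-flow threshold) are assumed, not taken
    from the text of this paper.
    """
    strength = math.hypot(westerly, southerly)
    if strength < weak and abs(vorticity) < weak:
        return "UN"  # weak flow and weak vorticity: undefined type
    # Compass bearing the flow comes from (270 degrees = westerly flow).
    bearing = math.degrees(math.atan2(-westerly, -southerly)) % 360.0
    direction = DIRECTIONS[int((bearing + 22.5) // 45.0) % 8]
    rotation = "C" if vorticity > 0 else "A"
    if abs(vorticity) < strength:
        return direction            # one of the 8 directional types
    if abs(vorticity) > 2.0 * strength:
        return rotation             # purely cyclonic or anticyclonic
    return rotation + direction     # one of the 16 hybrid types

example = lamb_type(10.0, 0.0, 15.0)  # strong westerly flow with positive vorticity
```

Two pure types, eight directional types, sixteen hybrids and the undefined type give the 27 classes mentioned above.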

2.3 Definition of methods to calculate daily
mean temperature

Table 1 defines the reference
temperature T_{ref} and the five alternative methods
for calculating the daily mean temperature.
The function f in this table is the model
function of Parton and Logan (1981), which describes
the temperature at an arbitrary time of the
day using the times of sunrise and sunset and
T_{x} and T_{n} (the daily maximum and minimum
temperature) and their times of occurrence. For
the diurnal course of the temperatures, the
function consists of a sine function during
the day and an exponential function during the
night.

Figure 1:
Grid points of mean sea-level pressure used for
the calculation of the objective Lamb weather
types for De Bilt.

The model function of Parton
and Logan (1981) can be rearranged such that
T_{x} and T_{n} are found from two temperature measurements
at arbitrary times of the day. Van Engelen
and Geurts (1983) applied this strategy and
used a third measurement for a final correction.
We also adopted this strategy.
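
The shape of such a diurnal model can be illustrated with a simplified sketch: a sine segment from the time of the minimum until sunset, followed by an exponential decay during the night. The parameter names a, b and c, their default values and the exact form of the two segments are assumptions for illustration; this is not the adapted model used in this study, and the inversion from two or three readings to T_{x} and T_{n} is not shown.

```python
import math

def diurnal_temperature(hour, t_min, t_max, sunrise, sunset, a=1.8, b=2.2, c=0.5):
    """Temperature at a given hour from a sine (day) plus exponential (night) curve.

    a: lag of the maximum after solar noon (h); b: night-time decay coefficient;
    c: lag of the minimum after sunrise (h). All three are placeholder values.
    """
    day_length = sunset - sunrise
    night_length = 24.0 - day_length
    t_min_time = sunrise + c

    def daytime(h):
        m = h - t_min_time  # hours since the time of the minimum
        return t_min + (t_max - t_min) * math.sin(math.pi * m / (day_length + 2.0 * a))

    if t_min_time <= hour <= sunset:
        return daytime(hour)
    t_sunset = daytime(sunset)        # temperature at which the night curve starts
    n = (hour - sunset) % 24.0        # hours since sunset
    return t_min + (t_sunset - t_min) * math.exp(-b * n / night_length)

# Hourly curve for a day with sunrise at 06 and sunset at 18 (made-up values).
curve = [diurnal_temperature(h, 5.0, 15.0, 6.0, 18.0) for h in range(24)]
```

Averaging such a curve over 24 hours gives an estimate of the daily mean from T_{x} and T_{n} alone, which is the role the model function plays in Table 1.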

2.4 Homogenization approaches

Three approaches (a,b,c) were
compared to remove the inhomogeneities introduced
by the changes in methods to calculate daily
mean temperature. An extra subscript a, b or
c was added to T_{1},...,T_{5} to denote an approach.

Approach (a)
Monthly adjustments were calculated and applied
to T_{1},...,T_{5} using the arbitrarily chosen artificial
common period 1991-1995 with T_{ref} in the hourly
data set. The resulting homogenized temperatures
were denoted T_{1a},...,T_{5a}. In practice, common
periods of 1 to 5 years have been used to calculate
monthly adjustments. Monthly adjustments may,
however, be rather sensitive to the choice of
the overlapping period. Figure 2 illustrates
this for T_{1}-T_{ref} using the 19 non-overlapping
5-year periods in 1901-1995. Of course, for
shorter common periods the monthly adjustments
are even more sensitive to the choice of the
period.
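
Approach (a) can be sketched as follows: for each calendar month, the mean difference between T_{ref} and the alternative temperature over the common period is computed and then added to the whole series. The record layout, the function names and the toy numbers are assumptions for illustration.

```python
from collections import defaultdict

def monthly_adjustments(records, common_years):
    """Mean of (t_ref - t_alt) per calendar month over the common period.

    records: iterable of (year, month, t_alt, t_ref) tuples, one per day.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, t_alt, t_ref in records:
        if year in common_years:
            sums[month] += t_ref - t_alt
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}

def apply_monthly(records, adjustments):
    """Add the adjustment of the matching month to every daily value."""
    return [t_alt + adjustments[month] for _, month, t_alt, _ in records]

# Toy records: (year, month, alternative temperature, reference temperature).
records = [(1990, 1, 1.0, 2.0), (1991, 1, 1.0, 1.5),
           (1992, 1, 2.0, 2.5), (1991, 7, 15.0, 14.0)]
adj = monthly_adjustments(records, range(1991, 1996))
homogenized = apply_monthly(records, adj)
```

In the toy data the 1990 value is adjusted by 0.5 although its own error is 1.0, which is the sensitivity to the choice of common period discussed above.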

The temperatures T_{1},...,T_{5} may have an extra subscript
a, b, or c indicating the homogenization approach
(for further explanation see Section 2.4).
a: monthly adjustments are applied (derived from the arbitrarily chosen common period 1991-1995)
b: daily adjustments (for each calendar day) are applied (derived from the monthly adjustments)
c: daily adjustments are applied (derived from the weather types)

Approach (b)
Daily adjustments were calculated and applied
to T_{1},...,T_{5} for each calendar day using the monthly
adjustments. An iterative cubic spline interpolation
procedure was used that preserves the monthly
adjustments as proposed by Harzallah and Sadourny
(1995). The resulting homogenized temperatures
were denoted T_{1b},...,T_{5b}. An example of this approach
is shown in Figure 3 for T_{1}.
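
The idea of an iterative, mean-preserving interpolation can be sketched as below. For brevity the sketch uses periodic linear interpolation between mid-month nodes instead of the cubic spline used in this study; only the iteration on the node values is the same in spirit. All names and the example adjustments are assumptions for illustration.

```python
import bisect

MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
STARTS = [sum(MONTH_LENGTHS[:i]) for i in range(12)]
CENTRES = [s + n / 2.0 for s, n in zip(STARTS, MONTH_LENGTHS)]

def interpolate_daily(nodes):
    """Periodic linear interpolation of 12 mid-month node values to 365 days."""
    xs = [CENTRES[-1] - 365.0] + CENTRES + [CENTRES[0] + 365.0]
    ys = [nodes[-1]] + list(nodes) + [nodes[0]]
    daily = []
    for day in range(365):
        x = day + 0.5  # centre of the calendar day
        j = bisect.bisect_right(xs, x) - 1
        w = (x - xs[j]) / (xs[j + 1] - xs[j])
        daily.append((1.0 - w) * ys[j] + w * ys[j + 1])
    return daily

def monthly_means(daily):
    """Mean of the daily values within each calendar month."""
    return [sum(daily[s:s + n]) / n for s, n in zip(STARTS, MONTH_LENGTHS)]

def mean_preserving_daily(monthly, n_iter=50):
    """Shift the node values until the daily curve reproduces the monthly means."""
    nodes = list(monthly)
    for _ in range(n_iter):
        means = monthly_means(interpolate_daily(nodes))
        nodes = [v + (m - mu) for v, m, mu in zip(nodes, monthly, means)]
    return interpolate_daily(nodes)

# Made-up monthly adjustments (degC), January to December.
monthly = [0.5, 0.4, 0.2, -0.1, -0.3, -0.4, -0.5, -0.3, 0.0, 0.2, 0.4, 0.6]
daily = mean_preserving_daily(monthly)
```

Each iteration adds the remaining difference between the target and the realized monthly mean to the node value, so the daily curve converges to one whose monthly means equal the monthly adjustments.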

Approach (c)
Daily adjustments were calculated using the
weather types and applied to T_{1b},...,T_{5b}. The
resulting homogenized temperatures were denoted
T_{1c},...,T_{5c}. For each month and each weather type,
the mean of the errors T_{1b}-T_{ref},...,T_{5b}-T_{ref}
was calculated (for the whole period 1901-1995). These mean
errors were subsequently used to adjust T_{1b},...,T_{5b},
giving T_{1c},...,T_{5c}.
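
A minimal sketch of approach (c): the mean error is computed per combination of calendar month and weather type and then subtracted from each daily value. The record layout, the function names and the toy numbers are assumptions for illustration.

```python
from collections import defaultdict

def weather_type_errors(days):
    """Mean of (t_b - t_ref) per (month, weather type).

    days: iterable of (month, wtype, t_b, t_ref) tuples, one per day.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, wtype, t_b, t_ref in days:
        sums[(month, wtype)] += t_b - t_ref
        counts[(month, wtype)] += 1
    return {key: sums[key] / counts[key] for key in sums}

def apply_weather_type(days, mean_error):
    """Subtract the mean error of the matching month and weather type."""
    return [t_b - mean_error[(month, wtype)] for month, wtype, t_b, _ in days]

# Toy data: three January days, two anticyclonic (A) and one cyclonic (C).
days = [(1, "A", 1.0, 0.0), (1, "A", 3.0, 1.0), (1, "C", 0.0, 0.5)]
errors = weather_type_errors(days)
t_c = apply_weather_type(days, errors)
```

Because the errors are estimated and applied on the same days, the mean error of the adjusted values is zero within every month and weather type.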

Figure 4 shows the frequency
of the weather types for the 1901-1995 period.
The anticyclonic type is denoted by (A), the
cyclonic by (C), the directional types by their
respective wind direction (N, NE, E, SE, S,
SW, W, NW), and the hybrid types by combinations
of A or C with the directional types. UN denotes
that the type is undefined and MIS that the
type could not be calculated because of missing
values of the mean sea level pressure. The
frequency of weather types ranges between 24.5%
(A) and 0.7% (CNE and CN).

Figure 5 shows for T_{1b} that
the mean error (T_{1b}-T_{ref}) per weather type depends
on the season. Therefore, the relationship between
the weather types and the errors was calculated
for each month separately. Because some of the
weather types occur on only 0.7% of the days,
the whole series (1901-1995) was used to derive
the relationships between the weather types
and the errors. Obviously, in practice such
an ideal situation does not occur.

Figure 3:
Monthly (bars, 1991-1995) and daily adjustments
for T_{1}. The monthly adjustment is one
of the realizations of Figure 2. The daily adjustments
were derived from the monthly values using an
iterative cubic spline interpolation procedure
that preserves the monthly adjustments.

Figure 4:
Frequency of weather types for the period 1901-1995 in De Bilt. The weather types are defined in the text.

Figure 5:
Mean error (T_{1b}-T_{ref}) per weather type for the period 1901-1995 for the winter and summer half of the year in De Bilt.

3. RESULTS

To compare the homogenized and non-homogenized temperatures with T_{ref}, we used three
measures: (1) a direct comparison of the daily values using the bias (BIAS) and root mean
squared error (RMSE); (2) a comparison of the day-to-day variability; and (3) a comparison
of the number of days below the 10th and above the 90th percentiles. The results for
these measures are successively presented below.

3.1 Direct comparison of daily values

Table 2 presents the BIAS and RMSE, with respect to T_{ref}.
The table shows that the non-homogenized temperatures T_{3}, T_{4}
and T_{5} have a relatively small BIAS compared to
T_{1} and T_{2}. It is also seen that applying the monthly corrections
(subscript a) is especially advantageous for T_{1} and T_{2}.
On the other hand, the BIAS becomes larger instead of smaller for T_{3}.
Apparently, the fact that the common period 1991-1995 is not representative for the whole
period becomes noticeable here. The weather type correction (subscript c) automatically
reduces the BIAS to zero because the reference and common period are identical (1901-1995).
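
For reference, the two measures can be written out as below, with the error defined as the temperature under test minus T_{ref}, following the sign used for the errors in Section 2.4. The function names and the toy values are assumptions for illustration.

```python
import math

def bias(series, reference):
    """Mean error of a daily series with respect to the reference series."""
    return sum(s - r for s, r in zip(series, reference)) / len(reference)

def rmse(series, reference):
    """Root mean squared error with respect to the reference series."""
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(series, reference)) / len(reference))

# Toy series: errors of +1, 0 and -1 degC cancel in BIAS but not in RMSE.
b = bias([1.0, 2.0, 3.0], [0.0, 2.0, 4.0])
r = rmse([1.0, 2.0, 3.0], [0.0, 2.0, 4.0])
```

The toy example shows why both measures are reported: a series can have zero BIAS and still differ substantially from the reference on individual days.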

Table 2:
BIAS (°C) and RMSE (°C) with respect to T_{ref} for homogenized (subscripts a, b and c) and
non-homogenized temperatures (defined in Table 1 and in Section 2.4) for De Bilt
(1901-1995). T_{1} to T_{3} represent different arithmetic means;
T_{4} and T_{5} indicate the Parton-Logan model.

Except for T_{1} and T_{2}, Table 2 shows that the effect of using
daily or monthly adjustments on RMSE is small. There is no apparent positive effect of
using daily adjustments based on monthly adjustments (subscript b), as compared to using
only monthly adjustments, for both BIAS and RMSE. The positive effect of the weather type
correction on RMSE is noticeable for all temperatures. This effect is, however, much
smaller than the mutual differences in RMSE caused by using different methods for
calculating the daily mean temperature. Finally, there is a clear positive effect,
both in terms of BIAS and RMSE, of using the 22 UTC temperatures (T_{2},
T_{5}) instead of the 20 UTC temperatures (T_{1}, T_{4}).

Because it is known that there are some inhomogeneities in the De Bilt series, even in
the period 1901-1995, the calculations were also carried out for the relatively
homogeneous period 1971-1995. Because the differences with the results above were
small, only the 1901-1995 period is discussed in the remainder of this paper.

3.2 Day-to-day variability

An important measure of daily temperature variability is the so-called day-to-day
variability. To calculate the day-to-day variability the absolute values of the lag-1
differences were calculated for the homogenized and non-homogenized temperatures and
averaged for each month.

Table 3:
Percentage differences from the day-to-day variability of T_{ref} for a
selection of homogenized (subscripts a, b and c) and non-homogenized temperatures
(defined in Table 1 and in Section 2.4), calculated for De Bilt (1901-1995).
T_{1} to T_{3} represent different arithmetic means;
T_{4} and T_{5} indicate the Parton-Logan model.

Table 3 presents the day-to-day variability for each month for a selection of the
homogenized and non-homogenized temperatures. The results are expressed as
percentage differences from the day-to-day variability of T_{ref}, calculated
as 100(T_{x}-T_{ref})/T_{ref}, where x denotes the temperature series concerned
and the symbols stand for the day-to-day variability of the respective series.
The values for the simulations using monthly and daily adjustments
are not shown because they hardly differ from the corresponding values for
T_{1},...,T_{5}. In general, the table shows a small decrease of
the percentage differences because of the weather type correction. As in Section 3.1,
this effect is much smaller than the mutual differences in RMSE caused by using
different methods for calculating the daily mean temperature. Especially the good
performance of T_{5} and, to a lesser extent, T_{4} is striking.
For T_{1}, T_{2} and T_{3} the day-to-day variability is
overestimated. This is a well-known phenomenon that occurs when fewer than 24 hourly
values are averaged to calculate the daily mean temperature.
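
The measure used in this section can be sketched as follows for the daily values of a single month; the function names and the toy values are assumptions for illustration.

```python
def day_to_day_variability(temps):
    """Mean absolute lag-1 difference of consecutive daily temperatures."""
    diffs = [abs(b - a) for a, b in zip(temps, temps[1:])]
    return sum(diffs) / len(diffs)

def percentage_difference(v_series, v_ref):
    """Difference of a variability value from the reference value, in percent."""
    return 100.0 * (v_series - v_ref) / v_ref

# Toy month of four days: lag-1 differences are 2, 1 and 0 degC.
v = day_to_day_variability([1.0, 3.0, 2.0, 2.0])
p = percentage_difference(1.1, 1.0)
```

A positive percentage means that the series under test is more variable from day to day than T_{ref}.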

3.3 Occurrence of extreme events

Extreme events were calculated by counting the number of days in a month above the
90th percentile and days below the 10th percentile as calculated for the reference
temperature. First, the percentiles were calculated for each calendar day by linearly
interpolating between order statistics of T_{ref} for that day (95 values).
Second, to reduce the effect of sampling variability, smooth approximations of the
percentiles were used instead of the raw values. Smoothing was done with the so-called
supersmoother (Härdle, 1990).
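
The first step and the counting can be sketched as below. The plotting position used for the interpolation, p(n-1), is one common convention and an assumption here, since the paper does not specify it; the smoothing step with the supersmoother is omitted. The function names and toy values are likewise illustrative.

```python
def percentile(values, p):
    """Percentile by linear interpolation between order statistics.

    Uses the position p * (n - 1) in the sorted sample (an assumed convention).
    """
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])

def count_extremes(temps, p10, p90):
    """Number of days below the 10th and above the 90th percentile thresholds.

    temps, p10 and p90 are aligned day by day (thresholds per calendar day).
    """
    below = sum(1 for t, th in zip(temps, p10) if t < th)
    above = sum(1 for t, th in zip(temps, p90) if t > th)
    return below, above

# Toy example: three days compared with constant thresholds of 1 and 9 degC.
counts = count_extremes([0.0, 5.0, 10.0], [1.0] * 3, [9.0] * 3)
```

In the study the thresholds come from T_{ref} and the counts are made for each temperature series in turn, so that differences in the counts reflect the calculation method and the adjustment.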

Table 4 presents the BIAS and RMSE for the monthly counts. Results are given for
a selected number of homogenized and non-homogenized temperatures. The table shows
that the effect of the weather type correction is, in general, a decrease of both
BIAS and RMSE. The performance of T_{5c} is clearly the best. It is
noteworthy that the RMSE of days above the 90th percentile is always lower than
the RMSE of days below the 10th percentile.

Table 4:
BIAS and RMSE of the monthly number of days above the 90th and below the
10th percentiles, with respect to T_{ref}, for a selected number of
simulations for De Bilt (1901-1995). T_{1} and T_{3} represent
different arithmetic means; T_{4} and T_{5} indicate the
Parton-Logan model.

4. DISCUSSION

In this paper we compared three methods for removing inhomogeneities using hourly
data of De Bilt (1901-1995). The inhomogeneities were artificially imposed on the
De Bilt series using five methods for calculating daily mean temperature. The
homogenized and non-homogenized temperatures were compared with the reference
temperature. It appeared that the benefit of using daily temperature adjustments,
derived from the monthly adjustments, is negligible. The length and choice of the
common period, used to calculate the monthly adjustments, are much more important.

For inhomogeneities caused by using several methods for calculating daily mean
temperature, the potential of the objective Lamb weather types for weather-type
dependent homogenization is present but small. A drawback of the weather types may
be that they are a representation of the atmospheric circulation at only one fixed
time of the day. Another problem is that they are not available for the remote past.
Possibly the use of wind speed and cloudiness may be more successful. These two
variables are often measured three times a day, also before the year 1900.

The adapted model of Parton and Logan (1981) to calculate daily mean temperatures
has good predictive skill. This model may also be further improved, e.g. by taking
into account the last observation of the previous day and the first observation of
the next day.

5. CONCLUSIONS AND RECOMMENDATIONS

One source of inhomogeneities in temperature time series may be the use of several
methods for calculating daily mean temperatures. In this paper these inhomogeneities
were studied using the hourly data for De Bilt (1901-1995). It was demonstrated that
the method for calculating the daily mean temperatures is of crucial importance. It is
also shown that the choice of the common period used for calculating adjustments is
rather important. Furthermore, there is no benefit in using daily adjustments derived
from monthly adjustments. Finally, there is some benefit in using the objective Lamb
weather types for weather-type dependent homogenization. The use of an adequate
method to calculate the daily mean temperatures is, however, much more important.

For future studies dealing with homogenization of daily temperature time series, we
recommend a further exploration of weather-dependent homogenization. For this purpose,
other variables like cloudiness, wind speed and direction may be used. It may also
be profitable to develop a weather-dependent version of the model of Parton and Logan
(1981).

REFERENCES

Engelen, A.F.V. van, and H.A.M. Geurts, Historische weerkundige waarnemingen,
Deel III: Een rekenmodel dat het verloop van de temperatuur over een etmaal
berekent uit drie termijnmetingen van temperatuur, KNMI, De Bilt, 44 pp., 1983 (in Dutch).
Härdle, W., Applied Nonparametric Regression, Cambridge University Press, Cambridge, 333 pp., 1990.
Harzallah, A., and R. Sadourny, Internal versus SST-forced atmospheric variability as simulated by an
atmospheric general circulation model, Journal of Climate, 8: 474-495, 1995.
HMS (Hungarian Meteorological Service), Proceedings of the First seminar for homogenization of surface
meteorological data, 6-12 October 1996, Budapest, Hungary, 144 pp., 1996.
Jones, P.D., M. Hulme, and K.R. Briffa, A comparison of Lamb circulation types with an objective
classification scheme, Int. J. Climatol., 13: 655-663, 1993.
Jones, P.D., S.C.B. Raper and T.M.L. Wigley, Southern Hemisphere surface air temperature variations: 1851-1984,
Journal of Climate and Applied Meteorology, 25: 1213-1230, 1986.
Jenkinson, A.F., and F.P. Collison, An initial climatology of gales over the North Sea,
Synoptic Climatology Branch Memorandum no. 62, Meteorological Office, Bracknell (unpublished).
Available from the National Meteorological Library, Meteorological Office, Bracknell, U.K., 1977.
Karl, T.R. and C.N. Williams, An approach to adjusting climatological time series for discontinuous
inhomogeneities, Journal of Climate and Applied Meteorology, 26: 1744-1763, 1987.
Parton, W.J., and J.A. Logan, A model for diurnal variation in soil and air temperature, Agric.
Meteorol., 23: 205-216, 1981.
Peterson, T.C., D.R. Easterling, T.R. Karl, P. Groisman, N. Nicholls, N. Plummer, S. Torok, I. Auer, R. Boehm, D.
Gullett, L. Vincent, R. Heino, H. Tuomenvirta, O. Mestre, T. Szentimrey, J. Salinger, E.J. Førland, I. Hanssen-Bauer,
H. Alexandersson, P.D. Jones and D. Parker, Homogeneity adjustments of in situ atmospheric climate data: a review,
International Journal of Climatology, 18: 1493-1517, 1998.
Slonosky, V.C., P.D. Jones and T.D. Davies, Homogenization techniques for European monthly mean surface
pressure series, Journal of Climate, 12: 2658-2672, 1999.
Szalai, S., T. Szentimrey and C. Szinell (eds.), Proceedings of the Second seminar for homogenization of
surface meteorological data, 9-13 November 1998, Budapest, Hungary, 213 pp., 1998.
Vincent, L.A., A technique for the identification of inhomogeneities in Canadian temperature series,
Journal of Climate, 11: 1094-1104, 1998.
Vincent, L.A. and D.W. Gullett, Canadian historical and homogeneous temperature datasets for
climate change analysis, International Journal of Climatology, 19: 1375-1388, 1999.