DAPS - Data intercomparison


Within DAPS we perform an intercomparison experiment of data assimilation methods for paleoclimatic applications. We invite all groups who use such methods to participate. First analyses of the results will start in January 2019, so data should be uploaded before then if possible. Nevertheless, this is an open process and later submissions are welcome.

For the intercomparison, northern hemispheric local temperature pseudo-reconstructions for the period 1900 – 2000 will be assimilated. The data assimilation simulations will be compared using a common validation setup. The restriction to the northern hemisphere and to the period after 1900 CE ensures that the surface climate is known relatively well, which is crucial both for providing pseudo-reconstructions with a realistic climate signal and for validation. The HadCRUT4 gridded temperature observations will be used to generate the pseudo-reconstructions as well as for validation. In addition, gridded sea level pressure observations, atmospheric circulation indices, atmospheric reanalyses and reconstructions of ocean heat content will be used for validation.

If you have questions or comments please email Martin Widmann.

Protocol for data assimilation simulations

Simulation period 1900 – 2000 CE.

Data to be assimilated

Near-surface temperature pseudo-reconstructions for the northern hemisphere with annual resolution at 69 locations (Fig. 1). The locations are those of the PAGES2k proxy network at 1500 CE (PAGES 2k Consortium 2013), but with only one pseudo-reconstruction if several PAGES2k proxies are available within one HadCRUT4 gridcell. The pseudo-reconstructions are constructed by adding white noise to the annual mean HadCRUT4 temperatures (Morice et al. 2012), with a ratio of signal to noise standard deviation of 0.5.
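The construction described above can be sketched as follows. This is only an illustration, not the code used to produce the distributed files: the function name, the Gaussian form of the white noise, the seed and the synthetic input series are all assumptions. The only quantity taken from the protocol is the signal-to-noise standard deviation ratio of 0.5, which means the noise standard deviation is twice that of the temperature series.

```python
import random
import statistics


def make_pseudo_reconstruction(signal, snr=0.5, seed=0):
    """Add white noise to an annual temperature series (hypothetical helper).

    snr is the ratio of signal to noise standard deviation, so the noise
    standard deviation is std(signal) / snr. Gaussian noise is an assumption.
    """
    rng = random.Random(seed)
    noise_sd = statistics.pstdev(signal) / snr
    noise = [rng.gauss(0.0, noise_sd) for _ in signal]
    pseudo = [s + n for s, n in zip(signal, noise)]
    return pseudo, noise


# Synthetic stand-in for one grid cell's annual means, 1900-2000 (101 values).
# Real input would be the HadCRUT4 annual means at a proxy location.
rng = random.Random(1)
signal = [0.005 * i + rng.gauss(0.0, 0.3) for i in range(101)]
pseudo, noise = make_pseudo_reconstruction(signal)
```

Returning the noise alongside the pseudo-reconstruction mirrors the fact that both are provided to participants.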

The pseudo-reconstructions to be assimilated and the noise data are available from our shared directory. Please contact us to get access to it.

Fig. 1: Locations of the temperature pseudo-reconstructions to be used as input for data assimilation.

Assimilation methods

The participating groups can use any assimilation method of their choice, including online, transient offline, and stationary offline approaches. The specific methods may include Kalman and Particle filters, but also other methods. 

For methods with a transient setup the CMIP5 forcings are recommended, but other choices can be made. For stationary setups the method used to generate the background ensemble can be freely chosen.

Because noise is added and no rescaling is applied, the total variance of the pseudo-reconstructions will be higher than the variance of the temperature observations. Leaving the noisy series unscaled in this way has been recommended for model-data comparisons (Smerdon 2012, Sundberg et al. 2012, Moberg et al. 2015). However, a potential rescaling of these pseudo-reconstructions, for instance to the variance of the HadCRUT4 temperatures at the respective locations, is part of the definition of a specific assimilation approach and can thus be performed by the individual groups if they consider it appropriate.
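To make the size of this variance inflation concrete, the following short derivation may help. The symbols are introduced here for illustration only: \sigma_s is the standard deviation of the HadCRUT4 series at a location, \sigma_n that of the added noise, and the noise is assumed to be uncorrelated with the temperature signal.

```latex
\frac{\sigma_s}{\sigma_n} = 0.5
\;\Rightarrow\; \sigma_n = 2\,\sigma_s ,
\qquad
\sigma_{\mathrm{pseudo}}^2 = \sigma_s^2 + \sigma_n^2 = 5\,\sigma_s^2 ,
\qquad
\frac{\sigma_s}{\sigma_{\mathrm{pseudo}}} = \frac{1}{\sqrt{5}} \approx 0.45
```

Under these assumptions the pseudo-reconstructions have about five times the variance of the observations, and a group choosing to rescale to the HadCRUT4 variance would multiply the anomalies by roughly 0.45. The same factor is the expected correlation between a pseudo-reconstruction and the underlying temperature series.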

Likewise, the decision on whether to assimilate absolute pseudo-reconstructions or anomalies, i.e. how to handle model biases, is part of the definition of the assimilation approach and can be decided by the participating groups.

Variables to be uploaded for validation

For the common validation, the participating groups should upload two different netCDF files and a description of the assimilation method including references. To get access to the directory where this information can be uploaded, please contact us.

The first netCDF file should contain the data assimilation-based reconstructions. For ensemble-based methods such as non-degenerate Particle Filters and Kalman Filters, the appropriately weighted ensemble mean should be uploaded. The following variables should be provided using the names given below:

- 2m air temperature: tas
- 850 hPa air temperature: ta850
- 500 hPa air temperature: ta500
- sea level pressure: psl
- 850 hPa geopotential height: zg850
- 500 hPa geopotential height: zg500
- precipitation: pr
- ocean heat content (if available): ohc
- AMOC strength (if available): amoc

The first dimension of each variable should be time (monthly resolution if possible; annual is also fine), followed by latitude and longitude where applicable, on the native model grid. In addition to these variables, a time series of the ensemble variance or another appropriate measure of the uncertainty should be provided for each variable, using the same variable name as above but ending with “_var”.
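As an illustration of what a weighted ensemble mean and the matching “_var” field could look like, here is a minimal sketch in plain Python. The function name and the toy numbers are hypothetical, and the variance shown is the weighted population variance, which is only one of several reasonable uncertainty measures; groups are free to use another.

```python
def weighted_mean_and_variance(members, weights):
    """Return the weighted ensemble mean and variance at each time step.

    members: list of ensemble members, each a list of values over time.
    weights: one non-negative weight per member (e.g. particle weights);
             they are normalised here so that they sum to one.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w = [x / total for x in weights]
    n_time = len(members[0])
    mean = [sum(wi * m[t] for wi, m in zip(w, members)) for t in range(n_time)]
    var = [
        sum(wi * (m[t] - mean[t]) ** 2 for wi, m in zip(w, members))
        for t in range(n_time)
    ]
    return mean, var


# Three hypothetical members, two time steps; the third member has half the weight.
members = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]
tas, tas_var = weighted_mean_and_variance(members, [1.0, 1.0, 2.0])
```

With equal weights this reduces to the ordinary ensemble mean and variance. For gridded fields the same calculation would be applied at every grid point.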

In addition to the results from the data assimilation simulations, the same variables should be uploaded in a second netCDF file for standard transient forced simulations without data assimilation, so that the skill of the method can be estimated. If ensemble simulations are available, the ensemble mean along with the ensemble variance and the ensemble size should be uploaded. In this case, the variables holding the ensemble means use the same names as above, and the ensemble variances use the variable name plus the extension “_var”.
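Since both files follow the same naming convention, a small check before uploading may save a round trip. The sketch below is hypothetical and not part of the protocol: it treats the seven atmospheric variables as expected and ohc and amoc as optional, and it works on a plain list of variable names, which would in practice be read from the netCDF file with whatever library a group already uses.

```python
EXPECTED = ["tas", "ta850", "ta500", "psl", "zg850", "zg500", "pr"]
OPTIONAL = ["ohc", "amoc"]  # listed in the protocol as "if available"


def check_variable_names(names):
    """Return a list of problems found in a collection of variable names."""
    present = set(names)
    problems = []
    for name in EXPECTED:
        if name not in present:
            problems.append(f"missing variable: {name}")
    for name in EXPECTED + OPTIONAL:
        if name in present and f"{name}_var" not in present:
            problems.append(f"missing uncertainty variable: {name}_var")
    return problems


# Hypothetical file contents: complete except for the uncertainty of pr.
names = ["tas", "ta850", "ta500", "psl", "zg850", "zg500", "pr"]
names += [f"{n}_var" for n in names if n != "pr"]
problems = check_variable_names(names)
```

An empty list means the names match the convention. The check says nothing about dimensions, units or grids.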

Validation methods

The data assimilation simulations and standard forced simulations will be validated against the HadCRUT4 gridded temperature observations, the HadSLP gridded sea level pressure, atmospheric reanalyses (e.g. ERA-Interim, ERA-20C), atmospheric circulation indices, and reconstructions of ocean heat content.

A basic validation will be performed jointly for all uploaded simulations by a DAPS validation group and will include:

- Global, hemispheric and continental mean temperatures with annual and lower resolution (e.g. 5-yr and 10-yr Hamming filter) for seasons and the whole year. The time series will be compared visually, and correlations and RMSE will be calculated.
- Trend analysis (distributions and trends for a given time)
- Precipitation over India and the western US
- ENSO (e.g. SSTs in NINO3.4 region)
- NAM/SAM/NAO (based on model PC1)
- Ocean heat content (same analysis as for atmospheric temperatures)
- AMOC (limited validation due to lack of observations, but will be included in joint diagnostics)
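The basic time-series diagnostics named in the list (Hamming-filtered series, correlation and RMSE) can be sketched as below. This is an illustrative stand-alone version, not the code of the DAPS validation group: the function names, the treatment of the series ends (only fully overlapped positions are returned) and the toy data are assumptions.

```python
import math


def hamming_smooth(series, width):
    """Smooth with a normalised Hamming window; the series ends are dropped."""
    if width < 2:
        raise ValueError("width must be at least 2")
    weights = [
        0.54 - 0.46 * math.cos(2 * math.pi * k / (width - 1)) for k in range(width)
    ]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * x for w, x in zip(weights, series[i:i + width]))
        for i in range(len(series) - width + 1)
    ]


def correlation(x, y):
    """Pearson correlation of two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Toy annual means: the "simulation" equals the "observations" plus a 0.1 offset.
obs = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.7, 0.6, 0.9, 0.8]
sim = [o + 0.1 for o in obs]
obs_smooth = hamming_smooth(obs, 5)
sim_smooth = hamming_smooth(sim, 5)
r = correlation(obs_smooth, sim_smooth)
error = rmse(obs_smooth, sim_smooth)
```

The toy example shows why both measures are reported: a constant offset leaves the correlation untouched but shows up directly in the RMSE.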

The participating groups are encouraged to validate further variables. The uploaded simulations will be available to the contributing groups.


Moberg, A., Sundberg, R., Grudd, H. and Hind, A., 2015. Statistical framework for evaluation of climate model simulations by use of climate proxy data from the last millennium – Part 3: Practical considerations, relaxed assumptions, and using tree-ring data to address the amplitude of solar forcing. Climate of the Past, 11(3), p.425.

Morice, C.P., Kennedy, J.J., Rayner, N.A. and Jones, P.D., 2012. Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set. Journal of Geophysical Research: Atmospheres, 117(D8).

PAGES 2k Consortium, 2013. Continental-scale temperature variability during the past two millennia. Nature Geoscience, 6(5), pp.339-346.

Smerdon, J.E., 2012. Climate models as a test bed for climate reconstruction methods: pseudoproxy experiments. Wiley Interdisciplinary Reviews: Climate Change, 3(1), pp.63-77.

Sundberg, R., Moberg, A. and Hind, A., 2012. Statistical framework for evaluation of climate model simulations by use of climate proxy data from the last millennium – Part 1: Theory. Climate of the Past, 8(4), pp.1339-1353.