This page lists observational datasets commonly used in the CVDP, with links to their sources and information on strengths, limitations and applicability from the Climate Data Guide. The CVDP does not distribute the data.
Before use, the data must be put into a format that the CVDP can read in. Specifically, the data need to be in netCDF format, the file name needs to end with syntax indicating the dates present in the file ("YYYYMM-YYYYMM.nc"), and the variable name in the file must be listed in the CVDP functions.ncl file. For more information, refer to the README file that is distributed with the codebase.
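As an illustration of the file-naming requirement above, the following is a minimal sketch that checks whether a file name ends with a "YYYYMM-YYYYMM.nc" date range. The function name `parse_date_range` and the example file names are hypothetical and not part of the CVDP. The README remains the authoritative description of what the package accepts.

```python
import re

# Assumed pattern: the file name ends with "YYYYMM-YYYYMM.nc".
DATE_SUFFIX = re.compile(r"(\d{4})(\d{2})-(\d{4})(\d{2})\.nc$")


def parse_date_range(filename):
    """Return ((start_year, start_month), (end_year, end_month)),
    or None if the name does not end with a valid date range."""
    match = DATE_SUFFIX.search(filename)
    if match is None:
        return None
    y0, m0, y1, m1 = (int(g) for g in match.groups())
    # Months must be real calendar months.
    if not (1 <= m0 <= 12 and 1 <= m1 <= 12):
        return None
    # The start date must not come after the end date.
    if (y0, m0) > (y1, m1):
        return None
    return (y0, m0), (y1, m1)


# Hypothetical file names, for illustration only.
print(parse_date_range("ersst_v5.sst.185401-201912.nc"))  # ((1854, 1), (2019, 12))
print(parse_date_range("ersst_v5.sst.nc"))                # None
```

A check like this only covers the file name. The netCDF format and the variable name inside the file still need to be verified separately.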
TS (Sea Surface Temperature)
The NOAA Extended Reconstruction Sea Surface Temperature (ERSST) provides global, spatially complete SST data at a monthly timestep for 1854-present. Version 5 is based upon statistical interpolation of the ICOADS release 3.0 data. Argo float data are used in the recent years (since ~2000). The data are distributed as anomalies; climatologies (absolute values) are available for 1971-2000. The creators of version 5 took some steps to alleviate the overly smooth and damped anomaly fields in ERSSTv4. ERSST forms the basis for NOAA's merged global land-ocean surface temperature analysis.
The Hadley Centre Global Sea Ice and Sea Surface Temperature (HadISST) data set combines monthly, globally complete fields of SST and sea ice concentration for 1871-present. HadISST uses reduced space optimal interpolation applied to SSTs from the Marine Data Bank (mainly ship tracks) and ICOADS through 1981 and a blend of in-situ and adjusted satellite-derived SSTs from 1982 onwards. The "bucket correction" was applied to SSTs for 1871-1941. HadISST is primarily intended to be used as boundary conditions for atmospheric models.
The NOAA Extended Reconstruction Sea Surface Temperature (ERSST) provides global, spatially complete SST data at a monthly timestep for 1854-present. Version 3 is based upon statistical interpolation of the ICOADS release 2.4 data. Version 3 includes satellite AVHRR SST data for 1985 onwards. Version 3b does not include satellite data due to a cold bias in the satellite-derived SSTs that proved difficult to correct.
The COBE data set is a spatially complete, interpolated 1°x1° SST product for 1891 to present. It combines SSTs from ICOADS release 2.0, the Japanese Kobe collection, and reports from ships and buoys. Data are gridded using optimal interpolation. As in HadISST, data up to 1941 were bias-adjusted using the "bucket correction." Prior to the interpolation analyses, data were also subject to quality control using a-priori thresholds, and nearby observations were combined.
HadSST3 provides monthly SST anomalies on a 5°x5° grid for 1850-present. The anomalies are computed relative to a 30-year climatology spanning 1961-90. Coverage is global but there is no interpolation; thus, missing data occur in the final product. The primary input data are from ICOADS release 2.5. Bias adjustments to the ICOADS SSTs account for changes in measurement methods (e.g. engine room intake, bucket measurements, or buoy data).
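Several of the SST products above are distributed as anomalies relative to a base-period climatology. The sketch below shows the basic arithmetic for a single monthly time series: average each calendar month over the base period, then subtract that average from every value. It is a simplified illustration, not the procedure used by any of the data providers. The function names and the toy two-year series are hypothetical, and missing values are represented by `None`.

```python
def monthly_climatology(values, start_year, base_start, base_end):
    """Mean of each calendar month over base_start..base_end (inclusive).

    values: monthly series beginning in January of start_year;
            None marks a missing month.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(values):
        year = start_year + i // 12
        if v is not None and base_start <= year <= base_end:
            sums[i % 12] += v
            counts[i % 12] += 1
    # A calendar month with no data in the base period has no climatology.
    return [s / c if c else None for s, c in zip(sums, counts)]


def to_anomalies(values, climatology):
    """Subtract the matching calendar-month climatology from each value."""
    anomalies = []
    for i, v in enumerate(values):
        clim = climatology[i % 12]
        anomalies.append(None if v is None or clim is None else v - clim)
    return anomalies


# Toy series: two years starting in January 1961 (illustrative values only).
series = [10.0 + m for m in range(12)] + [12.0 + m for m in range(12)]
clim = monthly_climatology(series, 1961, 1961, 1962)
anoms = to_anomalies(series, clim)
print(anoms[0], anoms[12])  # -1.0 1.0
```

Adding a climatology back to an anomaly series recovers absolute values, which is how anomaly products can be combined with a separately supplied climatology.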
TAS (Surface Air Temperature)
NASA Goddard's Global Surface Temperature Analysis (GISTEMP) combines land surface air temperatures primarily from the GHCN-M version 3 with SSTs of the ERSSTv3b analysis into a comprehensive global surface temperature data set spanning 1880 to the present at monthly resolution, on a 2x2 degree latitude-longitude grid. As such, it is one of the main data sets used to monitor global and regional temperature variability and trends.
NOAA's Merged Land-Ocean Surface Temperature Analysis (MLOST) combines land surface air temperatures primarily from the Global Historical Climatology Network, Monthly (GHCN-M) version 3 with SSTs of the ERSSTv3b analysis into a comprehensive global surface temperature data set spanning 1880 to the present at monthly resolution, on a 5x5 degree latitude-longitude grid. As such, it is one of the main data sets used to monitor global and regional temperature variability and trends.
Extending back to 1850 and frequently updated, HadCRUT4 is the longest data set of its type. HadCRUT4 is a combination of the global land surface temperature data set, CRUTEM4, and the global SST data set, HadSST3. HadCRUT4 differs from the most closely comparable products (e.g. NASA GISTEMP and NOAA MLOST) in that no interpolation is performed.
The Berkeley Earth Surface Temperatures (BEST) are a set of data products, originally a gridded reconstruction of land surface air temperature records spanning 1701-present, and now including an 1850-present merged land-ocean data set that combines the land analysis with an interpolated version of HadSST3. In contrast to other data sets incorporating records from roughly 5000-7000 land stations, the Berkeley data set incorporates approximately 37,000 records. This is due in part to the use of additional databases beyond GHCN, and in part to the methodology, which allows short, fragmented timeseries to be incorporated into the statistical model.
PSL (Sea Level Pressure)
ERA5, the successor to ERA-Interim, provides global, hourly estimates of atmospheric variables, at a horizontal resolution of 31 km and 137 vertical levels from the surface to 0.01 hPa. Produced by ECMWF, ERA5 presently extends back to 1979; it will ultimately be extended back to 1950. ERA5 represents 10 years of progress made in modeling and data assimilation since the production of ERA-Interim.
ERA-20C is ECMWF's first atmospheric reanalysis of the 20th century, covering 1900-2010. It assimilates observations of surface pressure and surface marine winds only. A coupled Atmosphere/Land-surface/Ocean-waves model is used to reanalyse the weather by assimilating these surface observations.
CERA-20C is a global, coupled reanalysis spanning 1901-2010 with a focus on low-frequency climate variability. Similar to ERA-20C, the surface observations assimilated include surface pressures from the International Surface Pressure Databank v3.2.6 and ICOADS v2.5.1, and surface winds over the oceans from ICOADS v2.5.1. Upper-air and satellite data are omitted. In contrast to ERA-20C, CERA-20C makes use of a newer assimilation system that simultaneously ingests atmospheric and ocean observations (temperature and salinity from EN4) into a coupled Earth system model. The air-sea coupling leads to a more balanced system, without the spurious trends in surface heat fluxes evident in products like ORA-20C and ERA-20C.
PR (Precipitation)
The GPCC provides gridded gauge-analysis products derived from quality controlled station data. The Full Data Reanalysis Product is recommended for global and regional water balance studies, calibration/validation of remote sensing based rainfall estimations and verification of numerical models. The product is not bias corrected for systematic gauge measuring errors. However, the GPCC provides estimates for that error as well as the number of gauges used on the grid.
Data from rain gauge stations, satellites, and sounding observations have been merged to estimate monthly rainfall on a 2.5-degree global grid from 1979 to the present. The careful combination of satellite-based rainfall estimates provides the most complete analysis of rainfall available to date over the global oceans, and adds necessary spatial detail to the rainfall analyses over land. In addition to the combination of these data sets, estimates of the uncertainties in the rainfall analysis are provided as a part of the GPCP products.
SIC (Sea Ice Concentration)
Sea Ice Concentration data from NASA Goddard and NSIDC based on Bootstrap algorithm
Bootstrap sea ice refers to a well-known algorithm used to estimate sea ice concentration from passive microwave brightness temperatures. This latest version of the Bootstrap data set (produced by J. Comiso and distributed by NSIDC as NSIDC 0079, version 2) has been completely reprocessed so that SMMR and SSMI data are inter-calibrated with AMSR-E data. The Bootstrap data set provides a long-term, consistently interpreted and calibrated record for studies of climate variability and change, but users should be aware of uncertainties and possible biases.
Sea Ice Concentration data from NASA Goddard and NSIDC based on NASA Team algorithm
NASA Team sea ice refers to a well-known algorithm used to estimate sea ice concentration from passive microwave brightness temperatures. The NASA Team data (produced at NASA Goddard and distributed by NSIDC as NSIDC 0051) are very widely used and are a key input into other data sets including NSIDC's Sea Ice Index, NSIDC's Near-Real-Time ice concentrations (NISE1 & NSIDC 0081), HadISST, and NOAA OI. Therefore, users should be aware of the strengths and limitations of the algorithm and the data sets derived from it.