sylas
December 7th 2008, 09:19 AM
Detailed weather records at various stations in the USA and other parts of the world have been maintained for hundreds of years. Though this was not the original intent, the resulting data makes up a historical record that might be used to track long term changes in climate over the whole globe. Such a data resource is now potentially of great value in the study of Earth's climate.
There are serious questions about how much this data can reveal, or how reliable it can be. It was never envisaged that the data would be used for such a large application. The records from individual stations are often patchy, and over time all kinds of changes take place that might affect the readings obtained. There are many sources of error that can distort readings, and any trends obtained in this way must take into account the quality of the data and its consistency over time.
The quality and validity of these historical trends have been called into question in a number of other threads, in particular relating to temperature records, and this thread is here for discussion of the issues surrounding the matter.
(1) Thread scope: the land temperature record for continental USA
The scope of the thread is discussion of the land temperature record in the continental USA. Other aspects of climate debates do not belong here, except when they are directly related to the main topic in some way.
There are other threads that are wide open to more general topics. As thread originator I am asking that this thread maintain a clear and tight focus on the continental USA land temperature record. Let's keep this thread focused, substantive, and civil. Anyone is welcome to join in, with questions, answers, or comments that relate to the topic.
Since the topic has appeared a number of times in previous threads, it may sometimes be useful to supply links to relevant posts as background, or to give quotes from posts in other threads; but the thread of discussion here must be self-contained and comprehensible by reading the text available in this thread alone. Links are good for additional background or as references, but not as substitutes for readable questions or answers.
You can augment a quote tag like this -- [quote=sylas, in thread 'Global Warming';2513271] -- if you would like to indicate that it is from another thread.
The thread is a tad messy, as shadowmaster notes. And I have issues at home which mean I won't always be as quick as others. So when I have a more detailed reply for you, I will simply make it a new thread entirely, and post a shorter summary with link here. The new thread will have a limited focus, and will be open to participation from anyone. …
As a general caution when looking at data, remain aware of whether you are looking at temperature data in Fahrenheit or Celsius. Most scientific work, especially for international use, is based on Celsius; but much of the local US material uses Fahrenheit.
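To make the caution concrete, here is a minimal Python sketch of the two conversions that come up. The function names are my own and are not taken from any dataset tooling. The point worth noticing is that an absolute temperature needs the 32 degree offset, whereas an anomaly (a temperature difference) only needs the 5/9 scale factor.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def anomaly_f_to_c(anomaly_f):
    """Convert a temperature difference (anomaly) from F to C.

    A difference has no zero-point offset, so only the scale changes.
    """
    return anomaly_f * 5.0 / 9.0
```

Mixing the two up is an easy mistake to make when comparing a US plot in Fahrenheit with an international one in Celsius.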
(2) Background information and data sources
There are two major collections of historical temperature data for the continental USA that have been a focus for discussion in previous threads, and a couple of other potentially relevant records.
(2.1) USHCN. The United States Historical Climatology Network (http://cdiac.ornl.gov/epubs/ndp/ushcn/newushcn.html) is a collection of historical records from 1221 weather stations across the 48 states of the continental USA. The stations chosen for inclusion in this set are mostly rural stations, with some in more urban settings. 820 are classified rural; 288 are associated with small towns with a population of less than 50,000; and 113 are classified urban, associated with cities with a population of 50,000 or more.
For these stations, historical records have been obtained from a number of sources, and digitized. The freely downloadable data includes daily recorded information for each station, and a number of monthly temperature time series. There are series for min, max, and mean temperature, and in each case four series are available, each applying a further correction for systematic errors.
A series based on the raw data. (AREAL)
A series corrected for systematic errors arising from time of observation bias. (TOBS)
A series corrected for discontinuities in the station history, such as a movement in the site, or a change in the instruments. Missing values are also added based on neighbouring stations. (FILNET)
A series corrected for an urban heat island systematic bias. (URBAN)
A major question arising in the previous threads is the validity of these corrections.
Monthly data, and daily data, can be obtained either as complete files by ftp, or through a graphical user interface, which will also allow plots to be produced. It's a good idea to read the documentation before using the data. You can find the data from the menu down the left hand side of the USHCN page (http://cdiac.ornl.gov/epubs/ndp/ushcn/newushcn.html).
(2.2) GHCN. The Global Historical Climatology Network (http://www.ncdc.noaa.gov/oa/climate/ghcn-monthly/index.php) is a larger set of land stations that covers the whole globe. USHCN stations are a strict subset. The GHCN also includes an additional 620 stations in the continental USA, which are relevant for this thread. The additional stations are mostly from a dataset (TD-3280) for airports, with a smaller number of additions from other sources.
The GHCN uses USHCN as a data source, and produces its own data files, which can be obtained by ftp from the NCDC (ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2/). A mapping between GHCN ids and USHCN ids is available in a file ushcn.tbl (http://code.google.com/p/open-gistemp/source/browse/trunk/input_files/ushcn.tbl?r=5). More convenient for a quick look is the graphical user interface at GISS (http://data.giss.nasa.gov/gistemp/station_data/) which allows one to plot various data for different stations.
(2.3) Co-operative Observers. There are thousands of other small weather stations all across the continental USA as part of the Cooperative Observer Program (http://www.nws.noaa.gov/om/coop/what-is-coop.html) of the National Weather Service. This information is used in various ways, but not usually (as far as I am aware) by groups looking at global trends, or trends across the USA.
(2.4) USCRN. The US Climate Reference Network (http://www.ncdc.noaa.gov/crn/programoverview.html) is a recent development. It is a network of high quality stations across the USA. This is planned to give better historical data for future researchers; but there is no associated historical data now that can be used to look at trends over the past century.
(3) Inferred history of mean temperature change across the USA
Information from combined sources is used by a number of different research groups to infer trends in mean temperature change over the last century and up to the present. The final calculated data product is usually a grid over the whole world, with monthly values at each grid cell. Values are given not as temperatures, but as anomalies, or the change in temperature. Given a long temperature record for a given land station, a baseline monthly average for that station is obtained, and then the entire series for the station is made into an "anomaly" by subtracting from all monthly temperatures the mean value at that station and for that month. Then it is the anomalies which are combined for a global or regional average anomaly.
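The anomaly calculation described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name, the `(year, month, temp)` record layout, and the choice of baseline years are all hypothetical, and real research groups handle missing months and baseline selection with more care.

```python
from collections import defaultdict


def monthly_anomalies(records, base_start, base_end):
    """Turn one station's monthly temperatures into anomalies.

    records: list of (year, month, temp) tuples for a single station.
    base_start, base_end: inclusive baseline years (an assumed parameter).
    Returns a list of (year, month, anomaly), where the anomaly is the
    temperature minus that station's baseline mean for the same month.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, temp in records:
        if base_start <= year <= base_end:
            sums[month] += temp
            counts[month] += 1
    baseline = {month: sums[month] / counts[month] for month in sums}
    # Months with no baseline data cannot be converted, so they are dropped.
    return [(y, m, t - baseline[m]) for y, m, t in records if m in baseline]


example = [
    (1951, 1, 0.0), (1952, 1, 2.0), (2000, 1, 3.0),
    (1951, 7, 20.0), (1952, 7, 22.0), (2000, 7, 24.0),
]
result = monthly_anomalies(example, 1951, 1952)
```

Because each month is compared only with the same month at the same station, a cold mountain station and a warm coastal one can be averaged together without the difference in their absolute temperatures dominating the result.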
A trend for the whole continental USA can be obtained by averaging across the appropriate grid cells. The NASA climate group in the USA also produces some figures for the USA in particular, and interested individuals can also calculate for themselves such means from the grids given by other research groups. It's a bit of work, but there's also nothing stopping any individual from using weather station records themselves to calculate anomalies for the USA. I've done this, and reported my results here at TWeb some months ago. (msg #56 of "The Last Warming")
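For anyone wanting to try the grid-cell averaging themselves, here is a minimal sketch. It assumes an equal-angle latitude/longitude grid, where a cell's area is proportional to the cosine of its central latitude, so cells are weighted accordingly. The function name and the input layout are hypothetical, and the weighting is a common convention rather than a description of what any particular research group does.

```python
import math


def regional_mean_anomaly(cells):
    """Area-weighted mean anomaly over a set of grid cells.

    cells: list of (lat_center_deg, anomaly) pairs; anomaly may be None
    for a cell with no data in that month.
    Returns the cos(latitude)-weighted mean, or None if no cell has data.
    """
    total = 0.0
    weight_sum = 0.0
    for lat, anomaly in cells:
        if anomaly is None:
            continue  # skip empty cells rather than treating them as zero
        weight = math.cos(math.radians(lat))
        total += weight * anomaly
        weight_sum += weight
    if weight_sum == 0.0:
        return None
    return total / weight_sum
```

Over the latitude range of the continental USA the weights do not differ enormously, so an unweighted mean would give a similar but not identical answer.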
The NASA climate group's USA anomaly trend can be downloaded in ascii here (http://data.giss.nasa.gov/gistemp/graphs/Fig.D.txt). Here is a plot of the trend as of Jan 2008.
[Attached image 60536: plot of the NASA USA anomaly trend as of Jan 2008]
(4) Correcting historical data for systematic error
A major question that has been raised about the inference of trends is the role of error corrections. Historical temperature data is not used directly as recorded, but only after various corrections have been applied to account for systematic sources of error.
It is therefore relevant to examine these corrections to see if they are legitimate.
(4.1) Time of observation bias
The most important correction to historical data is for a "time of observation bias". The times at which temperature observations are made (the operating schedule) vary over the operating life of a station. For some stations there are multiple temperature readings per day, and this allows a reasonable estimate of the mean temperature. For most of the historical data, however, stations only record one temperature reading per day. Information about the operating schedule is available as meta-data.
It would obviously be incorrect to infer trends directly from raw data when temperatures in different years are based on readings made at a different time of day. The solution has been to apply a "time of observation" correction. Using information from first order stations, which take multiple readings of temperature throughout the day, a mean "climatology" is obtained. This gives the mean difference between readings at a certain time (date and hour) and the average temperature on that day. From this, a suitable correction is applied to compensate for the observing schedule.
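The idea behind the correction can be sketched as follows. This is a deliberately simplified illustration and not the actual USHCN procedure: it builds the offsets by hour only (ignoring the date dependence mentioned above), and the function names and data layout are my own assumptions.

```python
def hourly_offsets(reference_days):
    """Build a simple hour-of-day "climatology" from a reference station.

    reference_days: list of days, each a list of 24 hourly temperatures
    from a station that records throughout the day.
    Returns offsets, where offsets[h] is the mean over all days of
    (reading at hour h minus that day's mean temperature).
    """
    totals = [0.0] * 24
    for day in reference_days:
        day_mean = sum(day) / 24.0
        for hour in range(24):
            totals[hour] += day[hour] - day_mean
    return [total / len(reference_days) for total in totals]


def correct_reading(reading, obs_hour, offsets):
    """Estimate the daily mean from a single reading taken at obs_hour."""
    return reading - offsets[obs_hour]


# Two artificial reference days with a simple linear rise through the day.
reference = [[float(h) for h in range(24)],
             [float(h) + 10.0 for h in range(24)]]
offsets = hourly_offsets(reference)
estimate = correct_reading(30.0, 17, offsets)
```

If a station moves its observation from late afternoon to early morning, the offset applied changes sign, which is why an uncorrected record would show a spurious step at the time the schedule changed.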
(4.2) Station history adjustment
Meta-data for stations can also record other major events. There may be changes in the instruments used, or in the location of a station. These introduce small discontinuities, or discrete jumps, in the temperature data obtained before and after the relevant event. There is also an attempt to use meta-data to identify such discontinuities, measure their effect, and compensate. One particular correction, called MMTS, is related to a change in instrumentation.
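To show what compensating for a discrete jump means in the simplest possible terms, here is a toy sketch. It assumes the date of the change is already known from the meta-data, estimates the jump as the difference in mean anomaly before and after, and shifts the earlier segment to match. The function name is hypothetical, and this is not the method used for the USHCN files. In particular, this naive version would also remove any genuine climate change between the two segments; a real procedure has to separate the two, for example by comparing with neighbouring stations, which this sketch does not attempt.

```python
def adjust_for_step(series, change_index):
    """Remove a step change at a known point in a station series.

    series: list of anomalies in time order.
    change_index: index of the first value recorded after the documented
    change (station move, new instrument, and so on).
    Returns (adjusted_series, estimated_jump).
    """
    before = series[:change_index]
    after = series[change_index:]
    if not before or not after:
        return list(series), 0.0  # nothing to compare against
    jump = sum(after) / len(after) - sum(before) / len(before)
    # Shift the earlier segment so both segments share the same mean level.
    adjusted = [value + jump for value in before] + list(after)
    return adjusted, jump


adjusted, jump = adjust_for_step([1.0, 1.0, 2.0, 2.0], 2)
```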
(4.3) Urban heat island effects
Urban environments can show a significant difference in mean temperature by comparison with rural environments. If the land around a weather station has become increasingly urbanized over time, then this may show up as a spurious warming trend, which ought to be removed.
There is a range of ways in which this effect is identified and compensated for. For most of the USHCN stations, the effect is pretty negligible, as they are sited well away from large urban environments. But in some cases it may be significant, and this is a topic to consider also in the thread.
(4.4) The effect of corrections
The combined effect of all the various corrections has generally been to give a small increase in the warming effect by comparison with what would be obtained from raw data. Nevertheless, it cannot be presumed on this basis that the corrections must be a distortion. The validity of the management of systematic errors has to be considered on its own merits. Here are plots, from documentation of USHCN data (http://cdiac.ornl.gov/epubs/ndp/ushcn/ndp019.html), which show the annual USA anomaly as obtained by RAW, TOBS, FILNET and URBAN data files. The SHAP and MMTS files are intermediate between TOBS and FILNET, and are described in the documentation. The second graph shows the differences between the anomaly plots, which makes it easier to see the additional trend.
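[Attached images 60535 and 60539: annual USA anomaly from the RAW, TOBS, FILNET and URBAN files, and the differences between them]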
Perhaps counter-intuitively, historical reconstructions can change over time, as new records become digitized and available, or as error correction or other processing is refined. In msg #1056 of "Global Warming" grmorton provides an animation that contrasts the record before and after a revision to the processing of historical data at NASA. The updates are documented in "A closer look at United States and global surface temperature change" (http://pubs.giss.nasa.gov/abstracts/2001/Hansen_etal.html); the major change is precisely that they started to take account of the time of observation bias and other systematic errors listed above.
In this animation, bear in mind that the difference between 1999 and 2008 includes almost another ten years of data. To help focus on the differences made by error corrections only, I've attempted to edit the animation to remove this extra data, and just compare the updates. Original and edited animations are both shown.
[Attached images 60537 and 60538: original and edited animations]
(5) Unidentified errors that may distort results
As well as systematic errors that might be identified from meta-data, there can also be problems with the siting or management of a weather station, which might not be recognized simply from available data. It's sometimes possible to identify large deviations in a temperature record and use this to flag potential problems, but there are likely to be many other issues that lead to inaccuracies from some stations.
Several examples have been proposed in various threads. A frequently cited example is an air conditioner outlet located near a thermometer. There's potential for spurious heating from the air conditioner to distort results. Many such examples have been given by grmorton, and a major reference often used to consider defects at individual stations is surfacestations.org (http://www.surfacestations.org/). This is a volunteer effort aiming to make its own classifications for as many of the USHCN and GHCN stations as possible, and it is widely referenced by people who are dubious of the value of trends obtained from weather stations in the USA.
Here's an example provided by grmorton of a badly located station with a nearby air conditioner, taken from msg #299 of thread "Global Cooling Anyone?"
[Attached image 60540: badly located station with a nearby air conditioner]
Note that even if it were possible to examine closely every weather station in the network, we could still have no assurance of identifying problems like this, because we are interested in historical data, and observing stations now tells us nothing about similar issues in the past.
One way to check whether such poorly managed stations are causing a problem is to repeat calculations of the mean USA anomaly using only a small subset of stations for which such problems are expected to be minimal. If the same trend is obtained, this indicates that there is no significant systematic bias associated with using larger datasets. The larger datasets are still useful because they allow for better resolution of differences across the region. If we were only interested in a mean US anomaly, then a more sparse set of observations would be adequate. In the future, the USCRN can serve this purpose. For trends over the last century, we can still identify a subset of stations that are less likely to be afflicted with major distortions. I've done this myself in msg #56 of "The Last Warming" using only stations listed as "class-1" by the surfacestations group, and have shown that the effect on the USA trend is negligible. Here is my plot repeated from that post.
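[Attached image 60532: USA trend using only "class-1" stations, repeated from msg #56 of "The Last Warming"]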
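The subset check described above can be sketched like this. It is a minimal illustration and not the calculation I used for the plot: it takes a plain unweighted mean over stations rather than a gridded average, and the function names, the `{station_id: {year: anomaly}}` layout, and the station ids in the example are all hypothetical.

```python
def annual_mean(anomalies_by_station, station_ids):
    """Mean anomaly per year over the chosen stations.

    anomalies_by_station: {station_id: {year: anomaly}}.
    Returns {year: mean over the chosen stations reporting that year}.
    """
    sums, counts = {}, {}
    for station_id in station_ids:
        for year, value in anomalies_by_station[station_id].items():
            sums[year] = sums.get(year, 0.0) + value
            counts[year] = counts.get(year, 0) + 1
    return {year: sums[year] / counts[year] for year in sums}


def slope_per_decade(series):
    """Ordinary least squares slope of {year: value}, in units per decade."""
    years = sorted(series)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(series[y] for y in years) / n
    sxx = sum((y - mean_x) ** 2 for y in years)
    sxy = sum((y - mean_x) * (series[y] - mean_y) for y in years)
    return 10.0 * sxy / sxx


# Made-up numbers purely to show the calling pattern.
data = {
    "A": {2000: 0.0, 2001: 0.1, 2002: 0.2},
    "B": {2000: 0.2, 2001: 0.3, 2002: 0.4},
}
trend_all = slope_per_decade(annual_mean(data, ["A", "B"]))
trend_subset = slope_per_decade(annual_mean(data, ["A"]))
```

If the trend from the well-sited subset agrees with the trend from the full set to within the scatter, poor siting is not driving the result.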
(6) Statistical significance of trends
Finally, another criticism sometimes raised is that because there are variations in temperature from year to year, a trend in mean temperature is insignificant. This argument is based on quantifying the variation for stations, and quantifying trends, and giving various statistical arguments for whether or not a trend is significant.
This argument is distinct from questions about the accuracy of data. It can be engaged by using standard mathematical concepts of significance and trend analysis.
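As a starting point for that kind of discussion, here is a minimal sketch of a least squares trend with its standard error and t-statistic. The function name is my own. It makes the textbook assumption that the residuals are independent from year to year; temperature series are usually autocorrelated, so this simple version will tend to overstate significance and should be treated as a first look only.

```python
import math


def trend_with_t_stat(years, values):
    """Least squares trend of values against years.

    Returns (slope, std_err, t_stat), where std_err is the standard error
    of the slope assuming independent residuals, and t_stat = slope / std_err.
    Needs at least three points.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(years, values))
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, std_err, t_stat


slope, std_err, t_stat = trend_with_t_stat([0, 1, 2, 3], [0.0, 1.0, 1.0, 2.0])
```

The t-statistic is then compared against a t distribution with n - 2 degrees of freedom. Large year to year variation inflates the standard error, but a long enough record can still give a significant trend.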
I'll be adding some posts myself to delve into some aspects more deeply; and I hope others will join in to identify matters they think most important, or to present their perspective.
Cheers -- Sylas