
Thread: Exploring the GHCN Data Part One: Getting the Raw Data

  1. #1
    tWebber Leonhard's Avatar
    Join Date
    Jan 2014
    Location
    Denmark - Jutland
    Faith
    Catholic
    Gender
    Male
    Posts
    4,684
    Amen (Given)
    879
    Amen (Received)
    2616

    Exploring the GHCN Data Part One: Getting the Raw Data

    Following some discussions here in Civics, which is apparently where all climate-change-related discussion takes place on this forum, Mountain Man and others have insisted that the GHCN tampered with the data, adjusting it and "faking it" in various ways.

    This claim has been thrown around a lot in various articles that cast doubt on whether the science of Global Warming is driven by science or by political bias. Typically it is done by taking a graph of the "raw" temperature, subtracting it from the "adjusted" temperature, and showing that the adjusted reconstruction tends to make past temperatures colder and recent temperatures warmer.
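As a rough illustration of that "adjusted minus raw" comparison, here is a minimal sketch. The years and temperatures are made-up toy numbers, not GHCN values, and `linear_trend` is a hypothetical helper doing an ordinary least-squares fit.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Toy numbers for illustration only (deg C); not real station data.
years = [1900, 1950, 2000]
raw = [11.0, 11.2, 11.6]
adjusted = [10.8, 11.2, 11.8]

# The series the skeptic articles usually plot: adjusted minus raw.
adjustment = [a - r for a, r in zip(adjusted, raw)]

# A positive slope means the adjustments cool the past relative to the present.
slope = linear_trend(years, adjustment)  # deg C per year
```

A trend in this difference series only shows that the adjustments change the trend. By itself it says nothing about whether the adjustments are justified.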

    This has led to suspicion that the various climate groups who have done these temperature reconstructions have their thumb on the scale, biasing it towards higher temperatures.

    I've countered that the adjustments were well founded and done with good reason. They are there to remove bias introduced by shifts in observation time, by changes of instruments, and even by urban heat island or urban cold island effects. And the raw data, at any rate, has always been openly accessible to the public, which was what the scientists argued in court during audits of their work.

    The data had always been publicly available.

    This is obvious since even climate change skeptics have used this data in their own arguments.

    So with that in mind, I've decided to see if I can reproduce their temperature reconstruction. However, I will confine myself to the US Land Temperature Record. I admit this is simply because of a lack of time. I have a full-time job now, and there are only a couple of hours I can dedicate to this project every week.

    This week, after asking around a bit, I found a link to the FTP server where all the climate data is stored, for versions 1, 2 and 3. It has both the adjusted gridded data and the raw data, as well as links to some articles detailing what kinds of adjustments the scientists have done.

    Here it is in all its unvarnished glory. ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3

    The data set I've got acts as a summary of the monthly records of all the individual land stations: their maximum, minimum and average readings. The data is unadjusted, at least by the GHCN.
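For anyone who wants to follow along, here is a minimal sketch of reading one line of a monthly data file. It assumes the fixed-width layout I understand the v3 README to describe (11-character station ID, 4-digit year, 4-character element such as TAVG/TMAX/TMIN, then twelve 5-character values in hundredths of a degree C, each followed by three flag characters, with -9999 meaning missing). Check the README on the server before relying on it. `make_line` and the station ID are hypothetical, there only to build a sample line.

```python
from typing import List, NamedTuple, Optional

MISSING = -9999  # assumed missing-value code; verify against the README


class MonthlyRecord(NamedTuple):
    station_id: str
    year: int
    element: str                      # e.g. "TAVG", "TMAX", "TMIN"
    values_c: List[Optional[float]]   # 12 monthly values in deg C, None if missing


def parse_line(line: str) -> MonthlyRecord:
    """Parse one fixed-width line under the assumed column layout."""
    station_id = line[0:11]
    year = int(line[11:15])
    element = line[15:19]
    values = []
    for month in range(12):
        start = 19 + month * 8          # 5 value chars + 3 flag chars per month
        raw = int(line[start:start + 5])
        values.append(None if raw == MISSING else raw / 100.0)
    return MonthlyRecord(station_id, year, element, values)


def make_line(station_id, year, element, hundredths):
    """Hypothetical helper: build a sample line with blank flags."""
    body = "".join(f"{v:5d}   " for v in hundredths)
    return f"{station_id}{year:4d}{element}{body}"


sample = make_line(
    "42500000001", 1990, "TAVG",
    [-150, 20, 510, 1100, 1640, 2130, 2400, 2310, 1890, 1220, 560, -9999],
)
record = parse_line(sample)
```

The three flag characters after each value are skipped here. A real run would probably want to keep at least the quality-control flag.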

    I might one day get access to the raw data from the National Meteorological Services for comparison, but until that day I'll settle for the GHCN's data.

    They've only done quality control on this data, removing stations that displayed repeating numbers for an entire day, or that suddenly spiked much higher or lower than stations around them. Unfortunately, I won't be able to do any verification on this step until I have access to the raw data from the National Meteorological Services, which will be a while.

    I'll attempt to replicate the temperature reconstruction from the raw data. If all goes well, I can start to apply the adjustments they made, either through a set of successive corrections for the biases mentioned at the beginning of this post or, as they did it, through the pair-wise homogenization algorithm. We'll see how far I get.

    For next Sunday, I should have a quick-and-dirty averaging of the data, and I'll attempt a gridded averaging as well, all with nice colorful charts.
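To make the difference between the two kinds of averaging concrete, here is a small sketch under my own assumptions: 5-degree latitude/longitude cells, each cell weighted by the cosine of its centre latitude, and made-up station coordinates and temperatures. It is not the gridding scheme any of the climate groups use, and real reconstructions average anomalies rather than absolute temperatures.

```python
import math
from collections import defaultdict


def simple_mean(stations):
    """Plain average over stations; dense regions dominate the result."""
    temps = [temp for _, _, temp in stations]
    return sum(temps) / len(temps)


def gridded_mean(stations, cell_deg=5.0):
    """Average stations within each grid cell, then average the cells,
    weighting each cell by cos(latitude) as a stand-in for its area."""
    cells = defaultdict(list)
    for lat, lon, temp in stations:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(temp)

    weighted_sum = 0.0
    weight_total = 0.0
    for (lat_idx, _), temps in cells.items():
        cell_mean = sum(temps) / len(temps)
        centre_lat = (lat_idx + 0.5) * cell_deg
        weight = math.cos(math.radians(centre_lat))
        weighted_sum += weight * cell_mean
        weight_total += weight
    return weighted_sum / weight_total


# Made-up example: three stations clustered in one cell, one station alone.
stations = [
    (40.1, -105.2, 10.0),
    (40.3, -105.4, 10.0),
    (40.2, -105.1, 10.0),
    (32.5, -96.8, 20.0),
]
plain = simple_mean(stations)     # the cluster counts three times
gridded = gridded_mean(stations)  # the cluster counts once
```

The clustered stations pull the plain average towards their value, while the gridded average treats them as a single cell.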

    Then in the following weeks, I'll go into the specific adjustments made.

  2. #2
    tWebber Leonhard's Avatar
    Yikes, I didn't mean to post this on News Desk, can someone move it to Civics?

  3. #3
    Evolution is God's ID rogue06's Avatar
    Join Date
    Jan 2014
    Location
    Southeastern U.S. of A.
    Faith
    Christian
    Gender
    Male
    Posts
    40,509
    Amen (Given)
    899
    Amen (Received)
    15620
    Quote Originally Posted by Leonhard View Post
    Yikes, I didn't mean to post this on News Desk, can someone move it to Civics?
    Done


  4. #4
    tWebber Teallaura's Avatar
    Join Date
    Jan 2014
    Location
    In my house.
    Faith
    Christian
    Gender
    Female
    Posts
    12,696
    Amen (Given)
    6034
    Amen (Received)
    4616
    Quick question - I haven't been following this closely so I'm unclear - is the argument that the weighting is deliberately skewed or that it is based on invalid assumptions? I would assume the latter (which could be used to skew inappropriately) - and if so, your approach confuses me. The former would show itself pretty easily, I would think - in fact, it would be indicative of gross incompetence (only an idiot would fake results that way - it's too easy to disprove).

    Okay, so it's not a quick question - I don't get how you prove the methodology by replicating it (your second option) - or did I miss something?

  5. #5
    tWebber Starlight's Avatar
    Join Date
    Feb 2014
    Location
    New Zealand
    Faith
    Atheist
    Gender
    Male
    Posts
    7,772
    Amen (Given)
    2546
    Amen (Received)
    1542
    Teallaura, my understanding is that when standard maintenance is carried out on a weather station (replacement of thermometer, replacement of cladding around the station etc) this action can bias the temperature reading at that station (e.g. the new thermometer might be off by 0.1 degree compared to the old one, the new metal cladding might get a bit warmer than the old wooden cladding etc). So in order to be able to meaningfully compare temperatures recorded by the pre-maintenance weather station with the post-maintenance one, they write a little algorithm that goes over the data the weather station is reporting and identifies the systematic adjustments that need to be made.

    The way this is typically done is by comparison to surrounding weather stations. If on average the old weather station read 0.2 degrees higher than the 5 nearby weather stations over the 1 year period leading up to the maintenance, and post-maintenance it reads on average 0.1 degrees lower than the 5 nearby weather stations over the 1 year period following maintenance, then that suggests the maintenance has made a systematic 0.3 degree difference in the temperature measurement. So if you want to remove the effect of maintenance from the question of whether temperatures have gone up or down at that weather station over the long term (i.e. to study climate change), you will want to adjust the raw data by 0.3 degrees, either pre-maintenance or post-maintenance, before you graph it. That way you're comparing apples to apples, and any temperature increases you see aren't just an artifact of old thermometer technologies being gradually replaced around the world with newer ones. This is known as "adjusted" data, as compared to "raw" data, which is what the station itself is outputting.
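The arithmetic in that example can be sketched as follows. This is only the neighbour-difference idea described above, with made-up readings and a known maintenance date. It is not the actual pair-wise homogenization algorithm, and `estimate_step` is a hypothetical helper.

```python
def mean(xs):
    return sum(xs) / len(xs)


def estimate_step(station, neighbours, break_index):
    """Estimate the jump at break_index in (station - neighbour average).

    station: list of readings; neighbours: list of equally long lists.
    Returns the mean difference after the break minus the mean before it.
    """
    diffs = []
    for i, value in enumerate(station):
        neighbour_mean = mean([series[i] for series in neighbours])
        diffs.append(value - neighbour_mean)
    return mean(diffs[break_index:]) - mean(diffs[:break_index])


# Made-up readings: four periods, maintenance between the 2nd and 3rd.
base = [10.0, 11.0, 12.0, 11.5]
neighbours = [[b + off for b in base] for off in (0.0, 0.5, -0.5, 1.0, -1.0)]
station = [10.2, 11.2, 11.9, 11.4]   # +0.2 before, -0.1 after

step = estimate_step(station, neighbours, break_index=2)   # about -0.3

# Remove the step from the post-maintenance readings so both halves line up.
adjusted = station[:2] + [value - step for value in station[2:]]
```

After the adjustment the station sits about 0.2 degrees above its neighbours for the whole record, so any remaining trend comes from the weather rather than the maintenance.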

    The climate change denial crowd occasionally assert that if you look at some carefully preselected weather stations and use raw data rather than adjusted data, then it looks like the temperature isn't increasing over time (because if you look hard enough you can find a handful of weather stations where maintenance has reduced the post-maintenance reported temperatures by enough that it outweighs the climate change effect). They use this as evidence for their belief that the algorithms that do the "adjustments" on the data do something nefarious beyond merely correcting for instrumentation changes, because apparently scientists are all trying to create this global warming conspiracy through using adjustments to data that could be trivially debunked by other scientists (none of whom choose to do so, because they're all in on it of course, paid off as they are by Big Science). Leonhard has downloaded the raw data for all US land weather stations and is going to check the climate-change-deniers' theory that the raw data shows cooling and only the "adjusted" data shows warming, and also have a look at what the adjustments are.

  6. #6
    tWebber Teallaura's Avatar
    Quote Originally Posted by Starlight View Post
    SL, I am not reading this. I'm asking Leo a specific question concerning his methodology so I'm clear on what he's trying. You cannot answer for him and I don't need an overview at this time. I don't mean to be unkind about it - I just don't want to debate and/or muck up his thread.


    ETA: Okay, so I got a skim of the first sentence - I'm almost positive that's not what he's talking about. Pretty sure he refuted that specific claim a long time ago.

  7. #7
    tWebber Leonhard's Avatar
    Quote Originally Posted by Teallaura View Post
    Quick question - haven't been following this closely so I'm unclear - is the argument that the weighting is deliberately skewed or based on invalid assumptions?
    I apologize for the semi-long response to the quick question.

    The arguments that I've seen here on tweb have primarily been accusations of outright fraud. Making up results. "Putting their thumb on the scales". The evidence is usually that the adjusted data, minus the raw data, shows a tendency to make past temperatures colder and recent temperatures warmer. Some, like Anthony Watts, have argued that some of the land stations are susceptible to heat island effects and that this hasn't been adequately adjusted for; he has provided his own crowd-sourced list of weather stations that he considers proper. I'll be looking at that one in the end.

    Okay, so it's not a quick question - I don't get how you prove the methodology by replicating it (your second option) - or did I miss something?
    If the assumptions used to correct for biases had been questioned, then we could talk about those. The claims I've seen are about fraud though, so I'm addressing those.

    I was in a thread here on tweb where the reconstructions were questioned. When I pointed out that the biases the researchers tried to address were well-known and well accounted for (I can't remember if I linked to the articles discussing them though), I was accused of "hand-waving", and then got about two dozen links to articles from Breitbart et al. accusing the researchers of data fraud.

    The argument is that the scientists are using software to fake global warming data. Now it's not clear what this means, but I take it as meaning that if you actually addressed the documented biases in the data (time of observation bias, instrumentation shift bias, location bias, non-gridded vs gridded weighted bias) you still wouldn't get heating. That the heating is fake and made up.

    That's mainly the claim I'm going to examine.

    I'm also doing this series to counter the claims I've often heard that the data doesn't exist, or the scientists have deleted it, and that they haven't provided public access to their methodology or their software.

    It was true at one time that they didn't provide that software, as that typically isn't normal scientific practice (other researchers are meant to replicate the work - not merely run a program other researchers have built to verify it works on their computers). During one court trial against the CRU researchers following the leaking of internal emails, they argued that the software was easy enough to replicate by anyone.

    Which is what I'm trying to show.

    Also I can show that the GHCN, for instance (and Berkeley BEST), have provided the software they wrote and even included instructions for how you run it.

    So to reiterate, I'm mostly countering the claim that the adjustments were pure fraud (as you say - it's easy to show this isn't the case, hence why I'm doing it). Though I'm also going to deal with a supposed problem of heat islands, by only using the weather stations that Anthony Watts's crowd-sourced list says are good. And I'm showing that the data, methodology and software (at least in the case of the GHCN v.2 and v.3 data sets) have been publicly accessible.

    And I'm satisfying my own personal curiosity in how these things work.
    Last edited by Leonhard; 01-15-2018 at 07:56 AM.

  8. #8
    tWebber Mountain Man's Avatar
    Join Date
    Jan 2014
    Location
    United States
    Faith
    Christian
    Gender
    Male
    Posts
    11,422
    Amen (Given)
    4684
    Amen (Received)
    4363
    What guarantee is there that the raw data itself hasn't been tampered with? That's one of the charges I've read: that data downloaded over subsequent years shows different numbers without the alterations being documented, so the only way you would know about it is if a third party archived the data.

    https://wattsupwiththat.com/2015/07/...t-any-further/

  9. #9
    tWebber Leonhard's Avatar
    Quote Originally Posted by Mountain Man View Post
    What guarantee is there that the raw data itself hasn't been tampered with? That's one of the charges I've read: that data downloaded over subsequent years shows different numbers without the alterations being documented, so the only way you would know about it is if a third party archived the data.

    https://wattsupwiththat.com/2015/07/...t-any-further/
    It's an interesting link, and it's something I've been thinking about as well.

    I don't think there are any guarantees for anything in this world. However, I personally think outright fraud with the raw data would be easy enough to show: just request the weather station data itself, and if it wildly diverges, then there's something up. Granted, the raw data used comes from many sources, and even Anthony Watts admits he hasn't done a proper tracking of it. So there's not really any evidence that anything bad is afoot here.

    It depends in the end whether you're committed to Global Warming being a conspiracy I guess.

    But I understand the need for verification, and I'd definitely want to get that too, but I also don't wanna get sidetracked with what I'm doing now.

    I would personally like to examine it, Mountain Man, but it's an enormous project to verify the whole shebang. Neither the GHCN, nor GISS, nor BEST, nor HadCRU have denied that the raw datasets they receive have changed slightly (roughly 0.1C adjustments here and there, as can be seen in the link). The argument has been that the alterations are due to updates when more accurate temperature data becomes available.

    The GHCN gets their data from the National Meteorological Services, so any quote-unquote tampering with the raw data they receive would come from there. The archived data of the individual weather stations is also available, but I haven't gotten around to that comparison yet.
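If anyone wants to check this kind of change themselves, the comparison is simple once you have two archived downloads. Here is a minimal sketch, assuming each snapshot has already been loaded into a dict keyed by (station ID, year, month). The keys, values and the `changed_values` helper are all hypothetical.

```python
def changed_values(old, new, tolerance=0.0):
    """Return {key: new - old} for entries present in both snapshots
    whose values differ by more than the tolerance (deg C)."""
    changes = {}
    for key in old.keys() & new.keys():
        delta = new[key] - old[key]
        if abs(delta) > tolerance:
            changes[key] = delta
    return changes


# Made-up snapshots: (station_id, year, month) -> monthly mean in deg C.
snapshot_2015 = {
    ("STATION_A", 1950, 1): -1.5,
    ("STATION_A", 1950, 2): 0.2,
    ("STATION_B", 1950, 1): 4.0,
}
snapshot_2018 = {
    ("STATION_A", 1950, 1): -1.4,   # revised upwards by about 0.1
    ("STATION_A", 1950, 2): 0.2,    # unchanged
    ("STATION_C", 1950, 1): 7.3,    # only in the newer download
}

changes = changed_values(snapshot_2015, snapshot_2018, tolerance=0.05)
dropped = snapshot_2015.keys() - snapshot_2018.keys()
added = snapshot_2018.keys() - snapshot_2015.keys()
```

This only tells you what changed between downloads. Whether a given change is a documented correction is a separate question that the code can't answer.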

    I might do that in the future. I definitely wanna do it. At least for a subset. Ask around and see what kind of documentation exists for the changes of various values. But as I only have a few hours I'm trying to be realistic with what I can do. And right now that's examining the particular kinds of adjustments made by the GHCN.

    In other words, one step at a time. For the next couple of weeks I'll just try to reproduce a simple temperature reconstruction from the raw data and add in the corrections for biases.

    I should be able to at least answer your question of "whether all adjustments favor global warming". I have a feeling they don't. One researcher told me that there's even more warming in the raw data than in the adjusted data!

    But we'll see.
    Last edited by Leonhard; 01-15-2018 at 04:48 PM.

  10. #10
    tWebber Teallaura's Avatar
    Quote Originally Posted by Leonhard View Post
    Okay, thanks. I get you now.
