Tuesday, January 07, 2014

Some of our data is missing ...

But there is nothing, no data, no documentation, not a single wire chamber photograph, not a single collision publicly available of which you could make sense. And this is a scandal. ~ Alexander Unzicker, The Higgs Fake: How Particle Physicists Fooled the Nobel Committee

I read Unzicker's book not too long ago and was amazed at just how ticked off he is with particle physics. In general, I tend to agree with him, especially in the matter of the Nobel Committee falling all over itself to give the prize out to Higgs for the possible discovery of what is maybe the Higgs boson.

For perspective, Einstein received his Nobel about six years after expounding the General Theory of Relativity, and even then he got it for the photoelectric effect, despite the fact that solar eclipse data had verified the warping of space as predicted by Einstein and the fact that perturbations in Mercury's orbit were also explained by his theory of gravity.

But when I reached the section where Unzicker started ranting about how much raw data was gone, I thought he might be indulging in hyperbole. After all, the raw data, especially for published research, can be valuable for detailed assessment of experimental accuracy and methodology. It was partly the lack of raw data that brought down Jan Hendrik Schön, which I've discussed before. Eugenie Reich brings out the point in her book Plastic Fantastic: How the Biggest Fraud in Physics Shook the Scientific World that suspicions began to arise when it was noticed that some graphs and tables in different papers from the genius looked exactly alike. The suspicions began to turn into certainties when Schön couldn't produce his raw data.

Then along came this article, which is downright scary. Seems that a survey by Current Biology found that over 75% of biological study data from 1991 to 2011 has gone away. Well, you say, what the heck, they got what they needed out of the numbers, so who needs them?

Well, how's this for a reason to keep the raw information? Seems that even 40-year-old data can provide insights, in this case, into the deposition of moon dust. Amazingly, NASA had lost all the data. Fortunately, one of the researchers kept a backup copy of the tapes.

You want another reason? I'll give you a practical example. This involves mundane consumer products, yet it could have cost a lot of time and money trying to figure out what had changed in the product—when nothing had.

Back when I worked in the razor blade business, a large customer started complaining that one of our products was nowhere near as good as it appeared to be years ago. So we ran tests, meaning we had people shave with the product, and, lo and behold, the result was not as good as it had been a few years before. I'll spare you the gory details of how we went in circles for some time and how my boss ended up leaving the company, in part because of this problem, in part because of some other things. Once he was gone, I was free to take a close look at that old data that he had kept back. These were shaving tests he had run and summarized, and he had insisted there was nothing in that data that was important. Unfortunately, it turned out he had played a little fast and loose with the results, combining some rating levels to make it look like the product was on a par with competitors. Turned out the old data looked just like our new data. In other words, the product hadn't gotten any worse; it had always been kind of crappy.

So why did the customer perceive that the product had declined? The reality was that he wanted a price cut and figured this was a good way to get one. The happy part of the story is that we were able to figure out how to make the product better. The sad part is that the customer still went with someone cheaper (who wasn't as good as our new product). The bottom line is that, had we known nothing had changed, we wouldn't have wasted a ton of people-hours on a niche product and would have either immediately given him the cut or told him (politely) to go suck rocks.

The reality is that old data can be very revealing when revisited. In addition, there are probably terabytes of data that have never seen the light of day. Astronomers periodically find fascinating things in old Hubble pictures that have never been properly analyzed because of the glut of information available. If they don't see what they were looking for, they don't look at the data any further. In a couple of cases, that means they missed actual images of planets orbiting distant stars. Who knows what else is hidden in the old pictures, assuming they haven't all been tossed?

Now, the Large Hadron Collider, in fact any collider, turns out huge numbers of collisions. Today, these are analyzed by computers looking for a particular event that fits a model. Well, if the model is wrong, the event might mean something totally different. We don't know that for certain, but, because data is just being chucked, it will be very difficult to go back and evaluate the possibility.

And that ain't science.
