Quote:
Originally Posted by Dave2042
Good points. I have always felt that there is a danger of missing the wood for the trees in discussions of statistical significance and sigmas.
The reasons, as I see it, that sigma is so critical in particle physics are:
- It is generally not known what is 'expected', or predicted by an accepted theory. Indeed, whether the 'blip' is real is a driver of whether we like one theory over another, given the endless variants of possible theory.
- You can simply do more runs and drive sigma up as high as you like, or to zero, and settle the matter.
This is not like that:
- We strongly expect gravitational radiation to exist, for widely accepted theoretical reasons, as well as having the second 'simultaneous' signal telling us this is not just a glitch.
- We can't just do more runs. This is what we have and we need to work with it.
This is more than just an appeal to Bayesian stats. My point is a more fundamental one about what sigma is. All it tells you is something about the likelihood the signal could have been generated randomly. This number operates in total blindness to any theoretical background or extraneous but relevant information. That's kind of fair in much of modern particle physics, but certainly not here.
The worst (but I feel clearest) example of this misunderstanding is the one where we look at climate stats and say that the probability of the rising temperature data being random is 5%, and conclude there is only a 95% chance we are causing global warming. In fact we know (in the usual sense of the word) that it's warming and we are doing it, from basic science. The uncertainty exists only in relation to our ability to measure the current effect above a lot of background noise.
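Good post. To make the central point concrete: a sigma level only gives the tail probability of the data under the null hypothesis, and turning that into "the probability the effect is real" requires a prior, which is exactly where the climate example goes wrong. A minimal sketch (scipy assumed; the priors and likelihoods are invented illustrative numbers, not real climate or LIGO figures):

```python
# What "n sigma" does and does not tell you.
# Requires scipy; assumes the usual one-sided Gaussian convention.
from scipy.stats import norm

# 1) Sigma -> p-value: the chance of a fluctuation at least this large
#    if there is NO signal (the null hypothesis).
for n_sigma in (2, 3, 5):
    p_value = norm.sf(n_sigma)  # one-sided upper-tail probability
    print(f"{n_sigma} sigma -> p = {p_value:.2e}")

# 2) Sigma does NOT give the probability the effect is real; that needs
#    a prior. Invented numbers for the climate example in the quote:
p_data_given_null = 0.05  # "the rising-temperature data could be random at 5%"
p_data_given_real = 0.80  # how likely such data is if the effect is real
prior_real = 0.99         # strong prior from basic physics, as the quote argues

posterior_real = (p_data_given_real * prior_real) / (
    p_data_given_real * prior_real + p_data_given_null * (1 - prior_real))
print(f"P(effect real | data) = {posterior_real:.4f}")  # ~0.999, not 0.95
```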
It's rare to find scientific articles (especially those based on experimental investigation) that include all three main sources of error - systematic, statistical and theoretical - i.e. graphs with data points showing three error bars, as in the sketch below.
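The data and error magnitudes here are invented purely for illustration (matplotlib assumed):

```python
# Sketch: one data series drawn with three separate error bars
# (statistical, systematic, theoretical). All numbers are invented.
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
errors = {
    "theoretical": np.full(4, 0.5),  # widest bars, drawn first
    "systematic":  np.full(4, 0.3),
    "statistical": np.array([0.1, 0.1, 0.2, 0.2]),
}

for (label, err), cap in zip(errors.items(), (8, 5, 2)):
    plt.errorbar(x, y, yerr=err, fmt="none", capsize=cap, label=label)
plt.plot(x, y, "ko")
plt.legend()
plt.show()
```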
And there is the common confusion between precision (the scatter of repeated measurements) and accuracy (how close the result sits to the true value).
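A quick simulation (all numbers invented) shows that you can have either one without the other:

```python
# Precision = scatter of repeated measurements; accuracy = closeness
# of the result to the true value. Simulated data, invented numbers.
import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0

samples = {
    "precise but inaccurate": rng.normal(10.8, 0.05, 1000),  # biased, tight
    "accurate but imprecise": rng.normal(10.0, 0.80, 1000),  # unbiased, noisy
}
for name, data in samples.items():
    bias = data.mean() - true_value  # accuracy
    spread = data.std()              # precision
    print(f"{name}: bias = {bias:+.3f}, spread = {spread:.3f}")
```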
I recall attending a session at a conference some years back where the author presented a graph with only 3 data points and no error bars. He then fitted some sort of parabolic curve to these 3 points and extracted a value for a maximum. I asked him how he chose the type of curve and what a regression coefficient could even mean with just 3 data points. He answered that each data point involved about two months' work, and that he knew from other research that the relationship was non-linear.
One can fit just about any curve to 3 points; a parabola has three free parameters, so it passes through any 3 points exactly, with zero degrees of freedom left over. It would be like claiming a linear relationship from just two data points (see the sketch below).
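A quick numpy sketch with random "data" makes the point: a quadratic interpolates any 3 points exactly, so the fit quality carries no information about the underlying relationship:

```python
# A parabola (3 parameters) passes exactly through ANY 3 points,
# so the "fit" is perfect by construction, whatever the data.
import numpy as np

rng = np.random.default_rng(1)
x = np.array([1.0, 2.0, 3.0])

for trial in range(3):
    y = rng.uniform(0, 10, size=3)    # three arbitrary "measurements"
    coeffs = np.polyfit(x, y, deg=2)  # 3 parameters fitted to 3 points
    max_err = np.abs(np.polyval(coeffs, x) - y).max()
    print(f"y = {np.round(y, 2)}, max fit error = {max_err:.1e}")  # ~1e-15
```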
If you look at the original Hubble data you can see a general upward trend, but the scatter was large, and the galaxies observable at the time, with the equipment and techniques available, were nearby or very large ones. Hubble nevertheless fitted a linear relationship and extracted a value for the Hubble constant. It took many decades to obtain data from distant galaxies showing that the Universe was not only expanding but that this expansion was accelerating.
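A toy illustration of why that scatter matters (synthetic data, not Hubble's actual 1929 measurements; the true slope and noise level are invented):

```python
# Hubble-style fit v = H0 * d on synthetic, heavily scattered data for
# nearby galaxies only. Shows how scatter feeds into the slope error.
import numpy as np

rng = np.random.default_rng(2)
true_H0 = 70.0                              # km/s/Mpc, assumed for the toy
d = rng.uniform(0.5, 2.0, size=24)          # nearby galaxies, distances in Mpc
v = true_H0 * d + rng.normal(0, 150, 24)    # large peculiar-velocity scatter

coeffs, cov = np.polyfit(d, v, deg=1, cov=True)
slope, slope_err = coeffs[0], np.sqrt(cov[0, 0])
print(f"fitted H0 = {slope:.0f} +/- {slope_err:.0f} km/s/Mpc")
# The upward trend is real, but the slope is poorly constrained.
```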
Error bars from all sources, and their justification, are very important in the scientific profession, and are normally taken for granted.