  #19  
Old 06-12-2011, 08:04 PM
CraigS
Unpredictable

 
Join Date: Jul 2010
Location: Australia
Posts: 3,023
Quote:
Originally Posted by Robh
Craig, I agree with you.

If you have two independent random variables x and y with means mean1 and mean2, and you add the variables to form x + y, the new mean = mean1 + mean2.
The standard deviations add in quadrature, i.e. new sd = sqrt(sd1^2 + sd2^2).
In the article, given 2.5 sigma and 3.5 sigma,
he proposes that the new sd = sqrt(2.5^2 + 3.5^2) ≈ 4.3, i.e. a combined 4.3 sigma.

However, this is different to simply combining two lots of data.
Here the combined mean will be (m*mean1 + n*mean2)/(m + n), where m and n are the sizes of the two populations.
The corresponding calculation for the new sd involves sd1 and sd2 together with terms that reflect the population size of each experiment.

The author is mistakenly using the x + y scenario.

Regards, Rob.
Cool explanation, Rob.
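
To put rough numbers on the x + y case Rob describes, here's a quick numpy sketch of my own (the 2.5 and 3.5 sigma values are the ones quoted above; the means and the sample size are made up purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
# two independent variables with the quoted sigmas; the means are arbitrary here
x = rng.normal(loc=10.0, scale=2.5, size=1_000_000)   # mean1 = 10, sd1 = 2.5
y = rng.normal(loc=20.0, scale=3.5, size=1_000_000)   # mean2 = 20, sd2 = 3.5

s = x + y
print(s.mean())   # ~30.0 = mean1 + mean2
print(s.std())    # ~4.30 = sqrt(2.5^2 + 3.5^2), the quadrature sum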

Thanks for confirming what, I must admit, is now just a remnant recollection from a past life that required some involved data analysis.
There are also many other treatments that may be applied to the datasets before combining them (eg: normalisations, weighting factors, skewing corrections ... and stuff I have no idea about). These treatments can easily rule out the seemingly 'obvious' handling of derived distribution statistics. (As an aside, I've lost track of how many times I've seen people attempting to average averages arithmetically.)
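
For what it's worth, here's a sketch of the difference Rob points out between pooling two datasets and just adding variables, and of the average-of-averages pitfall. The sample sizes m and n and the data are entirely made up for illustration:

import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(loc=10.0, scale=2.5, size=1000)   # dataset 1: m = 1000
b = rng.normal(loc=20.0, scale=3.5, size=250)    # dataset 2: n = 250
m, n = len(a), len(b)

pooled = np.concatenate([a, b])
# Pooled mean = (m*mean1 + n*mean2)/(m + n), as Rob gives it
print(pooled.mean(), (m*a.mean() + n*b.mean()) / (m + n))
# Arithmetically averaging the two averages gives a different (wrong)
# answer whenever m != n
print((a.mean() + b.mean()) / 2)
# The pooled sd also depends on m and n, not just on sd1 and sd2
print(pooled.std())

With unequal sample sizes the pooled numbers shift toward the bigger dataset, which is exactly why the population sizes have to come into it.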

Still, it all depends on at which processing step they attempt to combine the two datasets. Until they present it all, we should view it with a 'steely eye', eh?
I'm sure the folk dealing with the information know exactly what they're doing ... perhaps it was just the journo at work this time, eh?

Cheers