
Drowning in data


Argonavis
22-11-2007, 08:06 AM
I have yet to hear whether the proposed Square Kilometre Array telescope will be sited in WA or South Africa, but the back-end problems are highlighted here:


http://canadafreepress.com/index.php/article/555

quote: "Consider that, within the next decade, astronomers are expecting to be processing 100,000 terabyte’s every hour at the Square Kilometer Array telescope. That’s 10 million gigabytes. And please don’t ask how long it took to do the math on all of that. "

This is the problem with modern astronomy - data storage and manipulation.

"So CSIRO - Australia’s Commonwealth Scientific and Industrial Research Organization--are starting a new research program, entitled ‘Terabyte Science’, which hopes to help science deal with the mass amount of data that will soon be commonplace.

“CSIRO recognizes that, for its science to be internationally competitive, the organization needs to be able to analyze large volumes of complex, even intermittently available, data from a broad range of scientific fields,” says program leader, Dr John Taylor, from CSIRO Mathematical and Information Sciences.

“This will need major developments in computer infrastructure and computational tools. It involves IT people, mathematicians and statisticians, image technologists, and other specialists from across CSIRO all working together in a very focused way,” he says. "

This is often how science works - one area will prompt developments in another. This is often the "use" of astronomy: it not only trains PhDs in the science of astronomy, it also requires solving cutting-edge IT and engineering problems, and those solutions can easily flow on to the rest of society.

My high school music teacher put it well when he said that there are no boxes, no isolated islands of knowledge; all knowledge is interdependent, each field feeding off the others.

Campus Dweller
22-11-2007, 08:57 AM
I remember reading somewhere that NASA has thousands of old magnetic tapes of data from early space probes that were never analysed.

xelasnave
22-11-2007, 09:13 AM
Well, I noticed some time ago that there is so much data coming in, and yet it seems that whether any of it ever gets pulled out and looked at is left to chance... an impression only.

The net is the really big hope, as with the net information can flow so much faster, yet forming an overview seems to be getting more difficult... I read a fair bit, and what staggers me is the complexity within any field... just looking at the various forms of maths had me reading the headings night after night without even getting a glimpse of what each category could provide... no doubt there are people who can manage more information than me, yet the same problem must present itself to them at some stage...

I agree with your music teacher's observation, yet knowing it does not seem to take us further, as to make sense of the incoming data one needs someone to specialise in a limited field... they then think they know it all... which, in respect of the speciality, maybe they do, but linking all the knowledge into an overview seems to me close to impossible... each specialist needs to be able to condense their research of many years into a short simple statement so it can flow with other simple statements... not easy... humans like to sound as if they know all about what they specialise in, and therefore making a simple statement is not that simple. How can you ask someone who has spent years on a subject to put it in a sentence that sums it up correctly and without qualification of the finer points? State your life in a sentence of no more than 30 words... not easy.

On the bright side, at least we are accumulating data, and with the net so much of it is accessible for all to think about what it may be telling us... but forming overviews is, I think, a problem that will be harder than gaining the data. How to link the data without the boxes, and make it all fit one larger picture, I wonder.

alex

Kal
22-11-2007, 10:52 AM
The only way to deal with that amount of data would be real-time analysis: discard the majority of it and archive the relevant parts, or archive just the results of the real-time analysis.
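
Roughly what I mean, as a toy sketch in Python (made-up threshold and random data, nothing to do with any real SKA pipeline):

import numpy as np

def triage(chunks, threshold=5.0):
    # Cheap real-time statistic: peak relative to the noise floor.
    for chunk in chunks:
        snr = chunk.max() / (chunk.std() + 1e-12)
        if snr > threshold:
            yield ("full", chunk)                           # archive in full
        else:
            yield ("summary", (chunk.mean(), chunk.std()))  # keep only a summary

# Toy stream of random "voltage" chunks.
stream = (np.random.randn(4096) for _ in range(100))
kept_full = sum(1 for tag, _ in triage(stream) if tag == "full")
print(f"archived {kept_full} of 100 chunks in full")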

It is a massive amount of data - if you used 10 Gb/s fibre channel you would still need tens of thousands of cards just to handle that raw throughput!
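
Back-of-the-envelope (decimal units, assuming one 10 Gb/s card each):

TB_PER_HOUR = 100_000                              # the article's projected rate
bits_per_second = TB_PER_HOUR * 10**12 * 8 / 3600  # about 2.2e14 bits/s
cards = bits_per_second / 10**10                   # 10 Gb/s per card
print(f"{bits_per_second / 10**12:.0f} Tb/s -> {cards:,.0f} cards")
# 222 Tb/s -> 22,222 cards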

higginsdj
22-11-2007, 01:20 PM
The trouble is that the data has uses other than the immediate - e.g. I take images to do minor planet photometry, but how many variable stars or transients (new MPs/comets) might there be in those images? I have kept all my images (well, except that small bunch I accidentally lost when I archived it) and will one day write some code to auto-catalogue and display them (like the DSS).
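
Something along these lines is what I have in mind for the first pass - just scrape the FITS headers into a CSV index (a sketch only; it assumes the astropy library, and the header keywords vary with capture software):

import csv
from pathlib import Path
from astropy.io import fits  # assumption: astropy is installed

def build_catalogue(image_dir, out_csv="catalogue.csv"):
    # Index basic header keywords from every FITS image in a directory.
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "object", "date_obs", "exptime", "filter"])
        for path in sorted(Path(image_dir).glob("*.fit*")):
            with fits.open(path) as hdul:
                hdr = hdul[0].header
                writer.writerow([path.name,
                                 hdr.get("OBJECT", ""),
                                 hdr.get("DATE-OBS", ""),
                                 hdr.get("EXPTIME", ""),
                                 hdr.get("FILTER", "")])

build_catalogue("archive/2007")  # hypothetical directory name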

Cheers

Gargoyle_Steve
26-11-2007, 04:32 AM
Seems like another project crying out for the SETI@home (http://setiathome.berkeley.edu/) approach.

"What is SETI@home?
SETI@home is a scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI). You can participate by running a free program that downloads and analyzes radio telescope data. "

(from Wikipedia)
"With over 5.2 million participants worldwide, the project is the distributed computing project with the most participants to date. Since its launch on May 17, 1999, the project has logged over two million years of aggregate computing time. On September 26, 2001, SETI@home had performed a total of 1021 floating point operations. It is acknowledged by the Guinness World Records as the largest computation in history (Newport 2005). With over 1.6 million computers in the system, as of October 16, 2007, SETI@home has the ability to compute over 274 TeraFLOPS (http://en.wikipedia.org/wiki/TeraFLOPS) "