by Greg Emmerich, UW Madison M.S. Biotechnology Program. Advanced Biotechnology: Global Perspectives. Thesis Paper. April 16th, 2013.
The Digital Revolution has created a knowledge-based society reliant upon a high-tech global economy. The pace of innovation has been exponential, leaving some to wonder what possibilities the future may hold.
Big Data is the term for collections of data sets that are too large and complex for traditional hands-on data management and processing. The term comes from the realm of information technology, but scientists in a growing number of fields are encountering situations that fit the category. Astronomy, genetics, and proteomics are a few of the fields beginning to feel pressure to manage their data effectively.
Setting up a system that can process Big Data in a reasonable amount of time poses numerous technical challenges. Machine learning algorithms show great potential for teasing out hidden relationships among data sets and making predictions, but these analyses are typically iterative, so they require distributed computing clusters whose tasks can communicate intermediate results to one another.
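To make the idea of communicating intermediate results between tasks concrete, the following is a minimal sketch and not a description of any particular system. It assumes a one-dimensional k-means clustering as the example algorithm (an illustrative choice, not one named in this paper) and uses threads on a single machine as stand-ins for cluster nodes. The names `partial_sums` and `kmeans_1d` are hypothetical. In each round, every worker summarizes its own partition, and the combined summary is sent back out as the starting point for the next round.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sums(partition, centroids):
    """One worker's task: assign each local point to its nearest centroid
    and return per-centroid sums and counts (the intermediate result)."""
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for x in partition:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        sums[nearest] += x
        counts[nearest] += 1
    return sums, counts


def kmeans_1d(partitions, centroids, rounds=10):
    """Driver: combine the workers' intermediate results each round and
    share the updated centroids with every worker for the next round."""
    centroids = list(centroids)
    # Threads stand in for cluster nodes (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        for _ in range(rounds):
            results = list(pool.map(lambda p: partial_sums(p, centroids), partitions))
            for i in range(len(centroids)):
                total = sum(s[i] for s, _ in results)
                count = sum(c[i] for _, c in results)
                if count:  # leave a centroid unchanged if no point chose it
                    centroids[i] = total / count
    return centroids


if __name__ == "__main__":
    # Two partitions of made-up values, as if stored on two different nodes.
    data = [[1.0, 2.0, 10.0], [3.0, 11.0, 12.0]]
    print(kmeans_1d(data, [0.0, 5.0]))
```

Only the small per-centroid sums and counts travel between workers and driver; the raw data never leaves its partition. On a real cluster, the cost of exchanging these summaries every round is one of the reasons iterative analyses are harder to scale than single-pass ones.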