Center for Comprehensive Informatics and Department of Biomedical Informatics Special Presentation
Speaker: Michael Franklin, PhD, Professor of Computer Science, University of California, Berkeley
Title: AMPLab: Making Sense of Big Data with Algorithms, Machines and People
When: Wednesday, December 7, 2011, 10:00-11:00am
Where: Emory University, 208 White Hall (across the street from the Math/CS Building)
Organizations of all types are trying to figure out how to extract value from their
data. The key challenge is that the massive scale and diversity of
the continuous flood of information we are faced with breaks existing
technologies. State-of-the-art Machine Learning algorithms do not scale
to massive data sets. Existing data analytics frameworks cope poorly with
incomplete and dirty data and cannot process heterogeneous multi-format
information. Current large-scale processing architectures struggle with
diversity of programming models and job types and do not support the rapid
marshalling and unmarshalling of resources to solve
specific problems. All of these limitations lead to a Scalability
Dilemma: beyond a point, our current systems tend to perform worse as they
take on more data, more processing resources, and more people,
exactly the opposite of what should happen.
To address these issues, a group of researchers from machine learning, systems, databases, and networking at Berkeley started a new five-year research effort called the AMPLab, where AMP stands for "Algorithms, Machines, and People". AMPLab envisions a world where massive data, cloud computing, communication, and people resources can be continually, flexibly, and dynamically brought to bear on a range of hard problems by huge numbers of people connected to the cloud via increasingly powerful mobile and other client devices. The team is developing a new data analytics stack that implements this vision. AMPLab's research is supported in part by 19 leading technology companies, including founding sponsors Google and SAP.
This talk will give an overview of the AMPLab motivation and research agenda and discuss several of the lab’s initial projects. One such project, CrowdDB, is developing infrastructure to support hybrid cloud/crowd query answering systems that leverage the very different skills and the performance, reliability, and cost characteristics of large groups of machines and large groups of people. More information is at http://amplab.cs.berkeley.edu.
About the Speaker:
Michael Franklin is a Professor of Computer Science at UC Berkeley, focusing on new approaches for data management and data analysis. At Berkeley he directs the Algorithms, Machines and People Laboratory (AMPLab). He is founder and CTO of Truviso, Inc., a real-time data analytics company that enables customers to quickly make sense of diverse, high-speed, continuous streams of information. He is a Fellow of the Association for Computing Machinery, and a recipient of the National Science Foundation CAREER award and the ACM SIGMOD "Test of Time" award. He was recently awarded the Outstanding Advisor Award from the Computer Science Graduate Student Association at Berkeley. He is currently serving as a committee member on the US National Academy of Sciences study on Analysis of Massive Data. He received his Ph.D. from the University of Wisconsin in 1993.