MathCS Seminar

Title: Reading Between the Lines of Datacenter Logs
Seminar: Computer Science
Speaker: Dr. Nosayba El-Sayed of MIT
Date: 2018-02-20 at 1:00PM
Venue: Atwood Chemistry Building Room 240
Designing datacenters that are reliable, energy-efficient, and capable of delivering high performance and high utilization is a nontrivial problem facing scientists, businesses, and governments alike. In this talk, I will demonstrate how analyzing large datasets from different organizations helped us uncover interesting (and often surprising) patterns in the behavior of systems and applications on these large-scale platforms. I will show how real-world data helped us tackle critical questions, such as how temperature affects server reliability at places like Google and how well users configure the computing jobs they submit to shared clusters (spoiler alert: not very well!). Finally, I will demonstrate how simple machine learning techniques can accurately predict job failures in datacenters using data that is easily collected on current platforms.

