CS700: Graduate Seminar in Computer Science & Informatics

Managing Execution of Bioinformatics Applications on the Grid

Grid computing has gained popularity as an emerging architecture for next-generation high-performance distributed computing. Its goal is to provide ubiquitous access to distributed HPC resources shared among multiple organizations through "virtualization" and "aggregation." However, the available resources are heterogeneous in their availability as well as in their hardware and software characteristics. This heterogeneity creates dependencies between applications and resources, which show up as variable application performance across the available resources and lead to user dissatisfaction and poor resource utilization. Scheduling applications across multiple heterogeneous resources, often referred to as metascheduling, is a challenging task: it must account for resource heterogeneity and availability, load balancing, and application-specific characteristics to achieve the desired performance and utilization. This talk provides an overview of the issues involved in metascheduling, describes some of the techniques developed to address them, and presents the application of these techniques to a bioinformatics application and a statistics application, along with performance results.

Enis Afgan is a Ph.D. candidate in Computer Science in the Collaborative Computing Laboratory. His research focuses on grid and distributed computing.