CS700: Graduate Seminar in Computer Science & Informatics

Leveraging Semantic, Temporal and Geographical Meta-data in Textual Data Analysis and Retrieval
Alex Kotov, Department of Mathematics and Computer Science

Over the past decade, the World Wide Web has undergone a fundamental transformation from an information resource into a social phenomenon, often referred to as Web 2.0. In addition to its profound cultural and societal impact, this transformation has presented many exciting research challenges, such as how to provide efficient access to collaboratively generated on-line knowledge repositories and how to leverage the valuable meta-data provided by social media services. In the first part of this talk, I will present a natural language question-based information retrieval framework for exploratory queries, which is particularly effective for entity-centric knowledge repositories such as Wikipedia, and an information retrieval model that effectively leverages semantic meta-data in the form of ConceptNet, a collaboratively created on-line knowledge base. This direction of my research bridges the existing gap between information retrieval, natural language processing and knowledge management. Then, I will focus on textual data analysis and retrieval methods that leverage temporal and geographic meta-data provided by Web 2.0 resources. In particular, I will present a probabilistic model that combines a Markov-Modulated Poisson Process with dynamic programming to discover entities with temporally correlated bursts in multi-lingual Web news streams, and a probabilistic topic model which, given a geo-tagged document collection, detects not only global topics common to all locations but also local topics specific to certain locations. I will also present an information retrieval model that leverages geographical meta-data to bridge the implicit geospatial focus of social media documents and the information needs of social media users, enabling effective geographically focused search in social media services such as Twitter.

Biographical Sketch:
Dr. Alexander Kotov is currently a Post-Doctoral Scholar and Adjunct Assistant Professor at Emory University, working with Professor Eugene Agichtein in the Intelligent Information Access Laboratory. He received a PhD from the University of Illinois at Urbana-Champaign in 2011, under the supervision of Professor ChengXiang Zhai. His general research interests are in large-scale textual data analysis: information retrieval, textual data mining and natural language processing. Dr. Kotov is a recipient of the DAAD/Siemens Fellowship and the Best Reviewer award at EMNLP-CoNLL 2012, and has appeared on the UIUC Incomplete List of Teachers Ranked as Excellent. He was also a research intern at Microsoft Research and Yahoo!