Eugene Agichtein
Education
- Ph.D. Computer Science, May 2005. Columbia
University, New York, New York.
Dissertation: Extracting Relations from Large Text Collections (advisor:
Luis Gravano).
- M.S. Computer Science, May 2000. Columbia
University, New York, New York.
- B.S. Engineering, May 1998. The Cooper
Union, New York, New York.
Professional Employment
- September 2006 - present: Assistant Professor.
Mathematics and Computer Science Department, Emory University
- May-June 2007: Visiting Researcher.
Yahoo! Research, Santa Clara, CA
- September 2004 - August 2006: Postdoctoral
Researcher. Microsoft Research, Redmond, WA
- 1998 - 2004: Research Assistant. Computer
Science Department, Columbia University, New York, NY
- Summer 2003: Research Intern. Microsoft
Research, Redmond, WA
- Summer 2001: Research Intern. IBM Almaden Research
Center, San Jose, CA
- Summer 2000: Research Intern. NEC Research
Institute, Princeton, NJ
- 1997-1998: Junior Research Scientist.
The Proteus Project, New York University Computer Science Department, New
York, NY
Honors and Awards
- Microsoft Research "Beyond Search" Award,
2007
- Best Paper Award, ACM International
Conference on Management of Data (SIGMOD 2006)
- Best Student Paper Award,
19th IEEE International Conference on Data
Engineering (ICDE 2003)
- Award for Exemplary Service to the Computer Science
Department, Columbia University, 2002
- Award for Excellence in Teaching, School of Engineering and
Applied Science, Columbia University, 2001
- Full tuition scholarship, The Cooper Union,
1994-1998
Patent
- U.S. Patent 7,269,545: E. Agichtein and S.
Lawrence. Method for retrieving answers from an information retrieval
system.
Tutorials
- Towards Web-Scale Information Extraction,
Eugene Agichtein, webcast, ACM SIGKDD Web Seminar, March 2007
- Scalable Information Extraction and
Integration,
Eugene Agichtein and Sunita Sarawagi, presented at the ACM International Conference on
Knowledge Discovery and Data Mining (KDD), 2006
Invited Papers
- E. Agichtein, Web Information Extraction and User
Information Needs: Towards Closing the Gap, in the IEEE
Data Engineering Bulletin issue on Web-Scale Data, Systems, and Semantics, December 2006
- E. Agichtein, Scaling Information Extraction to Large Document Collections,
in the IEEE Data Engineering Bulletin
issue on Searching and Mining Literature Digital Libraries, December 2005
Journal Papers
- S. Sahay, S. Mukherjea, E. Agichtein, E. V. Garcia, S. Navathe,
and A. Ram, Discovering Semantic Biomedical
Relations Utilizing the Web, to appear in the ACM Transactions on
Knowledge Discovery from Data (TKDD), special issue on Bioinformatics,
2008
- P. Ipeirotis, E. Agichtein, P. Jain, and L.
Gravano, Towards a Query Optimizer for Text-Centric
Tasks, in ACM
Transactions on Database Systems (TODS), vol. 32, no. 4, 2007
- E. Agichtein, S. Lawrence and L. Gravano, Learning to Find Answers to Questions on
the Web, in ACM
Transactions on Internet Technology (TOIT) Special Issue on "Machine
Learning for the Internet", 2004
- H. Yu and E. Agichtein, Extracting Synonymous Gene and Protein
Terms from Biological Literature, in
Bioinformatics, 2003 (also in Proc. of ISMB 2003)
- F. M. Torres, E. Agichtein, L. Grinberg, G.
Yu, and R. Q. Topper, A note on the application of the
"Boltzmann simplex"-Simulated Annealing algorithm to global optimizations of
argon and water clusters, Journal of Molecular Structure
(THEOCHEM), 1997
Papers in Refereed Conferences
- Y. Liu, J. Bian, and E. Agichtein,
Predicting Information Seeker Satisfaction in Community Question Answering,
in Proc. of the ACM SIGIR International Conference on Research and
Development in Information Retrieval (SIGIR), 2008 (17% accepted)
- J. Bian, Y. Liu, E. Agichtein and H. Zha.
Finding the Right Facts in the Crowd: Factoid Question Answering over Social
Media, in Proc. of the International World Wide Web Conference
(WWW), 2008 (11% accepted)
- Y. Liu and E. Agichtein,
You've Got Answers: Towards Personalized Models for Predicting Success in
Community Question Answering (short paper),
in Proc. of the Annual
Meeting of the Association for Computational Linguistics (ACL), 2008
(25% accepted)
- B. Li, Y. Liu, and E.
Agichtein, CoCQA: Co-Training Over Questions and Answers for Predicting
Question Subjectivity Orientation (full paper), in Proc. of Conference
on Empirical Methods in Natural Language Processing (EMNLP), 2008 (21%
accepted)
- E. Agichtein, C. Castillo, D. Donato, A.
Gionis, G. Mishne, Finding High Quality Content in Social Media, in Proc. of the ACM
Web Search and Data Mining Conference (WSDM), 2008 (16% accepted)
- C. Clarke, E. Agichtein, S. T. Dumais, and
R. W. White, The Influence of Caption Features on
Clickthrough Patterns in Web Search, in Proc. of the ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR), 2007 (18%
accepted)
- E. Agichtein, C. Burges, and E. Brill, Question Answering over
Implicitly Structured Web Content, in Proc. of the IEEE/WIC/ACM
Conference on Web Intelligence (WI), 2007 (17% accepted)
- P. Jurczyk and E. Agichtein, Discovering Authorities in Question
Answer Communities Using Link Analysis (short paper), in Proc. of the ACM Conference on Information and Knowledge
Management (CIKM), 2007 (26% accepted)
- E. Agichtein, E. Brill, and S. T. Dumais, Improving Web Search Ranking by
Incorporating User Behavior Information, in Proc. of the ACM SIGIR Conference on Research and Development in
Information Retrieval (SIGIR), 2006 (19% accepted)
- E. Agichtein, E. Brill, S. T. Dumais, and R.
Ragno, Learning User Interaction Models for
Predicting Web Search Result Preferences, in Proc. of the ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR), 2006 (19%
accepted)
- P. Ipeirotis, E. Agichtein, P. Jain, and L.
Gravano, To Search or to Crawl: Towards a Query
Optimizer for Text-Centric Tasks, in Proc. of the ACM Conference on
Management of Data (SIGMOD), Best Paper Award, 2006 (13% accepted)
- E. Agichtein and Z. Zheng, Identifying “Best Bet” Web Search Results
by Mining Past User Behavior (short paper),
in Proc. of the ACM International Conference on Knowledge Discovery and Data
Mining, (KDD), Industrial Applications track, 2006 (24% accepted)
- E. Agichtein, Confidence Estimation Methods for
Partially Supervised Relation Extraction (short paper), in Proc. of the SIAM Conference on Data Mining (SDM), 2006
(30%
accepted)
- E. Agichtein and S. Cucerzan, Predicting Accuracy of Extracting
Information from Unstructured Text Collections, in Proc. of the ACM Conference on Information and Knowledge
Management (CIKM), 2005 (18% accepted)
- E. Agichtein and V. Ganti, Mining Reference Tables for Automatic
Text Segmentation, in Proc. of the ACM
International Conference on Knowledge Discovery and Data Mining (KDD),
2004 (12% accepted)
- E. Eskin and E. Agichtein, Combining Text Mining and Sequence
Analysis to Discover Protein Functional Regions, in Proc. of the Pacific Symposium on Biocomputing (PSB), 2004
(28%
accepted)
- E. Agichtein and L. Gravano, Querying Text Databases for Efficient
Information Extraction, in Proc. of the IEEE
International Conference on Data Engineering (ICDE), Best Student Paper Award, 2003
(14% accepted)
- H. Yu and E. Agichtein, Extracting Synonymous Gene and Protein
Terms from Biological Literature, in Proc. of the
Conference on Intelligent Systems for Molecular Biology
(ISMB), 2003 (15% accepted)
- E. Agichtein, S. Lawrence and L. Gravano, Learning Search Engine Specific Query
Transformations for Question Answering, in the 10th World Wide Web Conference (WWW), 2001
(20%
accepted)
- E. Agichtein and L. Gravano, Snowball: Extracting Relations from Large
Plain-Text Collections, in the 5th
ACM International Conference on Digital Libraries (ACM DL), 2000 (33%
accepted)
Papers in Refereed Workshops and Poster and Demonstration Sessions
- Q. Guo, E. Agichtein, C. Clarke and A. Ashkan.
Understanding "Abandoned" Ads: Towards Personalized Commercial Intent
Inference via Mouse Movement Analysis, in Proc. of the SIGIR
2008 Workshop on Information Retrieval in Advertising (IRA), 2008
- A. Ashkan, C. Clarke, E. Agichtein and Q.
Guo. Characterizing Query Intent From Ad Clickthrough Data, in Proc.
of the SIGIR 2008 Workshop on Information Retrieval in Advertising
(IRA), 2008
- J. Bian, Y. Liu, E. Agichtein and H. Zha,
A Few Bad Votes Too Many? Towards Robust Ranking in Social Media, in Proc. of the WWW Workshop on Adversarial Information Retrieval (AIRWeb),
2008
- Q. Guo and E. Agichtein, Exploring
Client-Side Instrumentation for Personalized Search Intent Inference:
Preliminary Experiments, in Proc. of the AAAI 2008 Workshop on
Intelligent Techniques for Web Personalization and Recommender Systems (ITWP),
2008
- B. Li, Y. Liu, A. Ram, E. V. Garcia, and E.
Agichtein, Subjectivity Analysis for Questions in QA Communities
(poster), in Proc. of the ACM SIGIR International Conference
on Research and Development in Information Retrieval (SIGIR), 2008
- Y. Liu, E. Agichtein, On the Evolution of
the Yahoo! Answers QA Community (poster), in Proc. of the
ACM SIGIR International Conference on Research and Development in
Information Retrieval (SIGIR), 2008
- Q. Guo, E. Agichtein, Exploring Mouse
Movements for Inferring Query Intent (poster), in Proc. of the ACM SIGIR International Conference on Research and Development in
Information Retrieval, 2008
- P. Jurczyk and E. Agichtein. HITS on
Question Answer Portals: an Exploration of Link Analysis for Author Ranking
(poster), in Proc. of the ACM SIGIR International Conference on
Research and Development in Information Retrieval, 2007
- L. Xiong and E. Agichtein. Towards
Privacy-Preserving Query Log Publishing, in Proc. of the Query Log
Analysis: Social and Technological Challenges Workshop at WWW 2007
- S. Sahay, E. Agichtein, E. V. Garcia, B. Li,
and A. Ram. Semantic Annotation and Inference for Medical Knowledge
Discovery, in Proc. of NSF Symposium on Next Generation Data Mining
Techniques (NGDM), 2007
- E. Agichtein and S. Cucerzan, Predicting Extraction Performance by
Using Context Language Models, in
Proc. of the SIGIR Workshop on Methodologies and Evaluation of Lexical Cohesion
Techniques in Real-World Applications (SIGIR ELECTRA), 2005
- E. Agichtein, S. Cucerzan, and E. Brill, Analysis of Factoid Questions for
Effective Relation Extraction (poster), in Proc. of the ACM SIGIR 2005
- E. Agichtein, P. Ipeirotis, and L. Gravano,
Modeling Query-Based Access to Text
Databases, in
Proc. of the Sixth International Workshop on the Web and Databases (WebDB), 2003 (25%
accepted)
- E. Agichtein, C. T. H. Ho, V. Josifovski, and
J. Gerhardt. Extracting Relations from XML Documents, in
Springer Lecture Notes in Computer Science (LNCS), Volume 2814, "Conceptual
Modeling for Novel Application Domains"; also in the International Workshop
on XML Schema and Data Management (XSDM), 2003
- E. Agichtein and Luis Gravano, QXtract: A Building Block for Efficient
Information Extraction from Plain-Text Databases (demo), in the ACM International Conference on
Management of Data (SIGMOD), 2003
- E. Agichtein, L. Gravano, J. Pavel, V. Sokolova, A.
Voskoboynik. Snowball: A Prototype System for
Extracting Relations from Large Text Collections (demo), in the ACM International Conference on Management of Data
(SIGMOD), 2001
- A. Borthwick, J.
Sterling, E. Agichtein, and Ralph Grishman. Exploiting Diverse Knowledge Sources via
Maximum Entropy in Named Entity Recognition, in the Sixth Workshop on
Very Large Corpora, 1998
- E. Agichtein, E. Eskin and L. Gravano. Combining Strategies for Extracting
Relations from Text Collections, in the ACM SIGMOD Workshop on Data Mining and Knowledge
Discovery (DMKD), 2000
Other Publications and Abstracts
- S. Sahay, B. Li, E. V. Garcia, E. Agichtein,
and A. Ram, Domain Ontology Construction from
Biomedical Text, to appear in Proc. of the 2007 International Conference on
Artificial Intelligence (ICAI), 2007
- S. Cucerzan and E. Agichtein, Factoid Question Answering over
Unstructured and Structured Content on the Web at TREC 2005, in the proceedings of the TREC 2005
conference
- E. Agichtein, Extracting Relations From Large Text
Collections, Ph.D. Thesis, Columbia University, 2005
- A. Borthwick, J. Sterling, E.
Agichtein, and R. Grishman, NYU: Description of the MENE Named Entity
System as used in MUC-7, in the proceedings of the 7th Message
Understanding Conference (MUC-7)
Invited Talks
- May 2008: Searching Social Media: Yahoo!
Research, New York
- May 2008: User Behavior Modeling for
Searching the Web and Social Media: Centers for Disease Control and Prevention (CDC),
Atlanta, GA
- April 2008: User Behavior Modeling for
Searching the Web and Social Media: Invited talk at the SIAM Data Mining
Conference (SDM 2008)
- October 2007: Patterns in User Behavior in
Web Search and Social Media: University of Waterloo, Information
Retrieval seminar
- February 2007: Patterns in Search: Mining Web Search User Behavior: Georgia Institute of Technology, GVU Center brownbag
- January 2007: Patterns in Web Search: Yahoo! Research, Santa Clara
- November 2006: Information Access and
Knowledge Discovery in Unstructured Data: Emory University, School of Public Health,
Biostatistics Seminar
- Spring 2006: Surfacing Information in Large
Unstructured Datasets: Emory University, Georgia Tech
Research Institute (GTRI), University of California-Riverside, University of
Maryland-College Park (Computer Science department), University of Maryland Computational Linguistics
and Information Processing Colloquium, New York
University (Proteus research group), Microsoft Research
- 2004: Extracting Relations from Large Text
Collections: Florida International University, College of William and
Mary, Microsoft Research (job talk), IBM Almaden Research Center, Rice University
- 2001: Finding Answers to Questions on the Web:
Machine Learning Workshop at Snowbird
TEACHING EXPERIENCE
Emory University
- Fall 2008: CS572 – Information Retrieval and Web Search
- Fall 2008: CS171 – Introduction to Computer Science II: Elementary Data Structures and
Algorithms
- Spring 2008: CS171 – Introduction to Computer Science II: Elementary Data Structures and Algorithms
- Fall 2007: CS571 – Natural Language Processing
- Spring 2007: CS171 – Introduction to Computer Science II: Elementary Data Structures and
Algorithms
- Fall 2006: CS584 – Information Retrieval and Web Search
Other Educational Activities
- Ph.D. thesis committee: Dawid Kurzyniec,
Emory University Mathematics and Computer Science Department, Feb. 2007
- Ph.D. thesis proposal committee: Dawid Kurzyniec,
Emory University Mathematics and Computer Science Department, Dec. 2006
- Fall 2001: Head Teaching Assistant, “Database Systems” course, Columbia
University
- Spring 1999: Teaching Assistant, “Advanced Database Systems” course,
Columbia University
PROFESSIONAL SERVICE
Conference and Workshop Organization
- Co-Chair: CIKM 2008 Workshop on
Searching Social Media (SSM 2008)
- Co-Chair: SIGIR 2008 Workshop on
Information Retrieval in Advertising (IRA 2008)
- Tutorials Co-Chair (Information Retrieval):
Annual Meeting of the Association for Computational Linguistics (ACL 2008)
- Senior PC Member:
ACM International Conference on Research and
Development in Information Retrieval (SIGIR 2008)
- Senior PC Member, ACM International Conference on Research and
Development in Information Retrieval (SIGIR 2007)
Conference Technical Program Committee Service
- PC Member, ACM International Conference on
Web Search and Data Mining (WSDM 2009, WSDM 2008)
- PC Member, Conference on Empirical Methods
in Natural Language Processing (EMNLP 2009, EMNLP 2008, EMNLP 2007)
- PC Member, Annual European Conference on
Information Retrieval (ECIR 2009, ECIR 2008)
- PC Member, ACM International Conference on Knowledge
Discovery and Data Mining (KDD 2008, KDD 2006)
- PC Member, AAAI Conference on Artificial
Intelligence, Web track (AAAI 2008)
- PC Member, Annual Meeting of the Association
for Computational Linguistics (ACL 2008, COLING/ACL 2006)
- PC Member, IEEE International Conference on
Data Engineering (ICDE 2008)
- PC Member, International Joint Conference on
Natural Language Processing (IJCNLP 2008)
- PC Member, The Annual Conference of the North American
Chapter of the Association for Computational Linguistics (NAACL-HLT 2007)
- PC Member, International Conference on
Knowledge Capture (K-CAP 2007, K-CAP 2005)
- PC Member, Pacific Symposium on Biocomputing
(PSB 2007)
- PC Member, ACM International Conference on Research and
Development in Information Retrieval (SIGIR 2006)
- PC Member, IEEE International Conference on Data Mining
(ICDM 2006, ICDM 2005)
- PC Member, European Conference on Principles and Practice
of Knowledge Discovery in Databases (ECML/PKDD 2006)
- PC Member, Intelligent Systems in Molecular Biology
Conference (ISMB 2006, ISMB 2005, ISMB 2004)
Other Peer Reviewing Service
- Program Committee Member - Workshops and
Poster sessions: WSDM 2009 Workshop on Mining Click Data (WSCD09), ICDE 2008 Workshop on Ranking in Databases (DBRank
2008), SIGIR 2007 Workshop on Learning to Rank for Information
Retrieval, WWW 2007 Workshop on Query Log
Analysis: Social and Technological Challenges (QueryLogs 2007), ICDE 2007 Workshop on Text Data
Mining and Management (TDMM 2007), IJCAI-2007 Workshop on Analytics for
Noisy Unstructured Text Data (AND 2007), Bar Ilan Symposium on Foundations
of Artificial Intelligence (BISFAI 2007), CIKM 2006 International Workshop on Health Information
and Knowledge Management (HIKM 2006), International Conference on Data Warehousing
and Knowledge Discovery (DaWaK 2006), Next Generation Information Technologies and
Systems Workshop (NGITS 2006), International Conference on Knowledge
Science, Engineering and Management (KSEM 2006), Event Extraction and Synthesis Workshop at AAAI 2006,
Semantic Mining in BioMedicine Symposium (SMBM 2004), International World-Wide Web Conference
(WWW) Posters Track (WWW 2002).
- Journal and Conference Referee:
Foundations and Trends in Information Retrieval, ACM
Transactions on Information Systems (ACM TOIS), ACM Transactions on Database
Systems (ACM TODS), International Journal on Very Large Data Bases (VLDB Journal),
IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE), IEEE
Internet Computing (IEEE IC), IEEE International Conference on Data
Engineering (IEEE ICDE), ACM International Conference on Management of Data
(ACM SIGMOD), International Conference on Very Large Data Bases (VLDB).
UNIVERSITY SERVICE
- Fall 2007-present: Department Graduate Committee
- Fall 2006-present: Computer Science Ph.D. Admissions Committee
Last updated: August 2008.