HIDE: Health Information DE-identification

Project Overview
Health informatics is receiving a tremendous amount of attention nationally and locally as a strategic area of technological development in the 21st century. Recent discussions have addressed the development of national and regional health information networks, as well as the bench-to-bedside translation of genomic information into practice. Recent provisions for standardizing health care transactions will make it faster and easier to share health information. However, such data sharing has been stymied by concerns about the privacy, security, and quality of the data. The objective of this project is to develop a configurable and integrated Health Information DE-identification (HIDE) framework for publishing and sharing health data while preserving data privacy. The project has three research thrusts.

  • 1. Novel techniques for de-identifying unstructured (text) data will be explored, along with their integration with techniques for de-identifying structured (relational) data.
  • 2. Important user needs will be investigated, and algorithms that take them into account during de-identification will be developed.
  • 3. New models and techniques will be developed for de-identifying data from multiple distributed data sources while preserving privacy for both data subjects and data providers.

    The envisioned outcome of the project is a suite of algorithms and techniques, together with a set of open source software tools, that will allow medical information service providers and computer science researchers to manage and share privacy-constrained data more effectively and efficiently. While the project focuses on the health domain, the resulting algorithms and techniques will be applicable to other application domains as well.


  • Acknowledgement
    This research is supported in part by a Career Enhancement Fellowship from the Woodrow Wilson Foundation and by URC and ITSC grants from Emory. Any opinions, findings, conclusions, or recommendations expressed in the project material are those of the authors and do not necessarily reflect the views of the sponsors.