
SOM Data Management: Discovery and Sharing

De-Identifying Your Data

When a dataset is too sensitive to share in its entirety, consider how to create a version that is safe to share. Doing so involves de-identification or anonymisation: removing all data that could be used to identify individual participants in a research project, thereby protecting their privacy. This may require hiring a professional statistician, a cost that can be written into a grant.
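
As a rough illustration of what basic de-identification steps can look like in practice, the Python sketch below drops direct identifiers, replaces the participant ID with a salted pseudonym, and coarsens a visit date to the year. All column names (name, email, phone, address, participant_id, visit_date) and both function names are hypothetical, chosen only for this example. It is a minimal sketch and not a complete or certified de-identification procedure.

```python
import hashlib
import secrets

# Hypothetical column names, for illustration only.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def pseudonymize(value: str, salt: bytes) -> str:
    """Return a short salted SHA-256 code; the same input always maps to the same code."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()[:12]


def deidentify(records, salt):
    """Drop direct identifiers, pseudonymize the ID, and coarsen dates to the year."""
    cleaned = []
    for record in records:
        # Build a new row so the original record is left unchanged.
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        row["participant_id"] = pseudonymize(str(row["participant_id"]), salt)
        # Keep only the year of an ISO-format date (YYYY-MM-DD).
        row["visit_year"] = row.pop("visit_date")[:4]
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    # The salt must be kept secret and stored apart from the shared data.
    salt = secrets.token_bytes(16)
    raw = [
        {"participant_id": 101, "name": "Jane Doe", "email": "jane@example.org",
         "visit_date": "2021-03-14", "systolic_bp": 128},
    ]
    print(deidentify(raw, salt))
```

Removing obvious columns is rarely sufficient on its own: combinations of remaining fields (for example age, postal code, and a rare diagnosis) can still single out a participant, which is one reason a statistician's review may be needed before sharing.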

Why Open Science?

Data Repository

The BioMedical Informatics Coordinating Committee (BMIC) has identified the following as desirable characteristics of data repositories:

  • Persistent unique identifiers
  • Long-term sustainability
  • Metadata
  • Curation & quality assurance
  • Maximally open access
  • Tracking data re-use
  • Free of charge
  • Secure
  • Common format
  • Provenance

Additional considerations for repositories involving human data:

  • Fidelity to consent
  • Restricted use compliant
  • Privacy
  • Plan for breach
  • Download audit and control
  • Clear use guidance
  • Retention guidance
  • Plan for use violations
  • Request review


Source: Huerta MF. Strategic Approaches to Data Science & Open Science: Research Data Management. Presented at: Research Data Management Symposium; 2019 Dec 5; New York, NY.

These repositories are selected from the much longer list available via NIH.

These generalist repositories are affiliated with NIH, which suggests depositing in a generalist repository if a domain-specific repository cannot be found. The list is from NIH.

Using Northwell Data for Research

There are two options for Northwell employees who wish to obtain EHR data for research purposes.

  1. Fill out this form from Quantitative Intelligence. Under "Service Requested," select "Data Request for Research."
  2. Visit the Analytic Resource Center, where you can request access to various dashboards and reports on Northwell Healthcare Analytics. These may be especially useful for Quality Improvement projects.

Data Discovery Resources

Choosing the Right Repository for Your Data

There are several considerations you may need to account for when choosing a repository for your data.

  • Does your funding agency or publisher specify a repository to use?
  • If they do not specify a repository, do they have guidelines? For instance, the NIH directs researchers to seek out a discipline-specific repository.
  • How large is your data? Many repositories have limits on the amount of free storage provided. Do you need to budget for storage?
  • Is the data sensitive or under embargo? What controls do you need for your data?
  • Who funds the repository? Is it sustainable? Is it endorsed by a scholarly or professional group?

Data Management and Sharing Policies

Contact Us

For questions or comments, email us at

Hofstra University
