NLM-Scrubber

Provided by:
U.S. National Library of Medicine - Lister Hill National Center for Biomedical Communications


NLM-Scrubber is a free clinical text de-identification tool designed and developed at the U.S. National Library of Medicine.

Narrative clinical reports contain a rich set of clinical knowledge that could be invaluable for clinical research. However, they usually also contain personal identifiers that are considered protected health information and are subject to use restrictions and privacy risks. Computational de-identification seeks to remove all identifiers from such narrative text, producing de-identified documents that can be used in research while protecting patient privacy. It uses natural language processing (NLP) tools and techniques to recognize patient-related, individually identifiable information in the text (e.g., names, addresses, telephone numbers, and social security numbers) and redact it. In this way, patient privacy is protected and clinical knowledge is preserved.
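
To make the recognize-and-redact idea concrete, here is a minimal Python sketch of purely deterministic, pattern-based redaction. It is an illustration only and does not represent how NLM-Scrubber is implemented: the patterns, the `redact` function, and the bracketed placeholder labels are all assumptions made up for this example, and it covers only U.S.-style telephone and social security numbers.

```python
import re

# Hypothetical patterns for illustration only; these are NOT NLM-Scrubber's rules.
PATTERNS = [
    # Social security number written as 123-45-6789
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    # U.S. phone number such as (301) 555-0142 or 301-555-0142
    ("PHONE", re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s*|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    )),
]


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed category label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Pt called from (301) 555-0142; SSN 123-45-6789 on file. BP 120/80."
print(redact(note))
```

Note that the clinical value (the blood pressure reading) is left untouched while the two identifiers are replaced. Fixed patterns like these cannot reliably find names or addresses, which is one reason a real de-identification tool would need to combine them with other methods.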

NLM-Scrubber is capable of de-identifying many kinds of clinical reports with high accuracy. The software design uses a number of deterministic and probabilistic pattern recognition algorithms and various computational linguistic methods. The application accepts narrative reports in plain text or in HL7 format. 

The application software includes the Visual Tagging Tool (VTT), an editor for visualization and markup that we use to produce the gold standards against which the tool is tested. Although designed specifically for tagging personally identifiable, protected health information, VTT has been made publicly available to the greater NLP community for expanded lexical tagging and text annotation.

Language Coverage
English (Latin)

Get Started with the service

Free of charge

Support

Helpdesk: scrubber-problems@mail.nih.gov

Documentation: NLM-Scrubber publications

Access the service

Request - Downloadable : https://scrubber.nlm.nih.gov/files/

Tool