

1.4.4 De-identification of patients' data

The process for removing protected health information (PHI) from the MIMIC II database is fully described in Neamatullah et al. (6). A labeled subset of the data, together with a public version of the code, can be found on PhysioNet at: http://www.physionet.org/physiotools/deid/.

Figure 1.5 illustrates the de-identification process. Briefly, the salient points for the user of our database are:

Figure 1.5: De-identification process

Examples of a de-identified nursing progress note and a discharge summary can be found in Figures 1.6 and 1.7, respectively. Note that a few of the de-identified sections of the nursing note are false positives, and a small fraction of the clinical information may have been lost. However, all dates and names (the only PHI in this document) were caught by our algorithm. Note also the high prevalence of abbreviations such as S/O (sign out), D/C'd (discontinued, or discharged), Neo (neosynephrine), NSR (normal sinus rhythm), F/E (fluid and electrolytes), GI (gastrointestinal), HEME (hematology), ID (infectious disease), A (assessment), P (plan), etc. Note also the low degree of structure in the nursing note, which is broken into only a few categories: S/O, F/E, NEURO, GI, HEME, ID, RESP, SKIN, ACCESS, SOCIAL, A, and P. The boldface type has been added to the figure to highlight these categories; it is not present in the notes themselves.
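As a rough illustration of how these category labels might be used, the following Python sketch splits a note into its sections. It is only a sketch under stated assumptions: the function name split_note_sections is hypothetical, and the code assumes that each label starts a line and is followed by a colon, which the notes in the database do not guarantee. Only the labels and expansions given above are used.

```python
import re

# Category labels listed in the text above. The "LABEL:" line-start layout
# assumed here is hypothetical; real notes are less regular.
SECTION_LABELS = ["S/O", "F/E", "NEURO", "GI", "HEME", "ID",
                  "RESP", "SKIN", "ACCESS", "SOCIAL", "A", "P"]

# Expansions given in the text above (only the unambiguous ones).
ABBREVIATIONS = {
    "S/O": "sign out",
    "F/E": "fluid and electrolytes",
    "GI": "gastrointestinal",
    "HEME": "hematology",
    "ID": "infectious disease",
    "A": "assessment",
    "P": "plan",
    "NSR": "normal sinus rhythm",
    "Neo": "neosynephrine",
}

# Try longer labels first so a long label is never shadowed by a short one.
_label_alternatives = "|".join(
    re.escape(label)
    for label in sorted(SECTION_LABELS, key=len, reverse=True)
)
_HEADER = re.compile(r"^(%s):[ \t]*" % _label_alternatives, re.MULTILINE)


def split_note_sections(note):
    """Return a list of (label, text) pairs in order of appearance."""
    matches = list(_HEADER.finditer(note))
    sections = []
    for i, m in enumerate(matches):
        # A section runs until the next recognized label or the end of the note.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        sections.append((m.group(1), note[m.end():end].strip()))
    return sections
```

Calling split_note_sections on a note and looking up each returned label in ABBREVIATIONS (for example with ABBREVIATIONS.get(label, label)) gives a readable section title where an expansion is known. Text that precedes the first recognized label is ignored by this sketch.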

Figure 1.6: Example of a de-identified progress note. Sub-headings have been capitalized and set in boldface type for easier reading. Removed text is denoted by square brackets. True positives are colored green; false positives are colored red.

Figure 1.7: Example of a section of a de-identified discharge summary. All de-identified elements are denoted by square brackets. No false positives exist in this example.
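Because removed text is denoted by square brackets, a user may wish to locate or count the de-identified elements in a note. The Python sketch below shows one way this could be done. The function names are hypothetical, and the code assumes that brackets do not nest, do not span lines, and are not used for any other purpose in the text. The exact content placed inside the brackets is not specified here, so the pattern accepts anything.

```python
import re

# Assumption: a removed item is any non-nested "[...]" span on a single line.
_PLACEHOLDER = re.compile(r"\[[^\[\]\n]*\]")


def find_removed_spans(text):
    """Return (start, end, placeholder) for each bracketed span in text."""
    return [(m.start(), m.end(), m.group(0))
            for m in _PLACEHOLDER.finditer(text)]


def mask_removed_spans(text, token="[REMOVED]"):
    """Replace every bracketed span with one uniform token."""
    return _PLACEHOLDER.sub(token, text)
```

Collapsing every placeholder to a single token, as mask_removed_spans does, can be convenient before text processing so that the varied bracket contents are not treated as distinct words. Note that a search of this kind cannot tell true positives from false positives; it only finds what the de-identification step marked.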


djscott 2011-09-07