

1.4.3 Subject ID - Case ID matching

Because the MIMIC II data are collected from different sources, they must be matched to a unique patient and temporally aligned. The bedside monitor-generated data included a unique identifier (the Case_ID), assigned automatically by the monitor, and fields for the patient's name (first and last) and medical record number (MRN). Nurses manually entered the name and MRN into the networked central station when a patient was admitted. Unfortunately, in approximately 30% of cases, one or more identifier fields were not completed for admitted patients. Moreover, human errors are likely to exist in the manually recorded names and MRNs. The CareVue clinical information system also included a unique patient identifier (which maps to our ICUstay_ID) for each ICU stay of a patient. The subject's CareVue data also include identifying information, such as the patient's name and MRN, which was automatically input from the hospital-wide information system when the patient was admitted to a unit.

When waveform files included the patient's identifying information (name and MRN), the physiologic data records (indexed by a Case_ID) were matched to the corresponding clinical information records from CareVue. The merging process had two stages. The first stage matched names and medical record numbers (when available and accurately recorded) from the monitor-generated data records to those of the clinical data records from CareVue. The second stage compared the physiologic trends from the higher-resolution monitoring data (approximately 1 sample per minute) with the nurse-validated vital sign trends in the clinical information system, which were sampled hourly.
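The two-stage merge described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for MIMIC II. The record classes, the choice of heart rate as the compared signal, the whitespace and case normalization, the mean-absolute-difference measure, and the `max_distance` threshold are all assumptions made for illustration. The sketch also assumes that the second stage acts as a confirmation of a first-stage identifier match.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class MonitorRecord:
    """Hypothetical monitor-generated record, indexed by Case_ID."""
    case_id: str
    name: Optional[str]            # None when the field was left blank
    mrn: Optional[str]
    heart_rate: Dict[int, float]   # minute offset -> value (about 1 per minute)


@dataclass
class ClinicalRecord:
    """Hypothetical CareVue record, indexed by ICUstay_ID."""
    icustay_id: str
    name: str
    mrn: str
    heart_rate: Dict[int, float]   # minute offset -> nurse-validated hourly value


def normalize(text: str) -> str:
    # Assumed normalization: drop all whitespace and ignore case.
    return "".join(text.split()).upper()


def identifiers_match(mon: MonitorRecord, clin: ClinicalRecord) -> bool:
    """Stage 1: name and MRN must both be present and agree."""
    if not mon.name or not mon.mrn:
        return False
    return (normalize(mon.name) == normalize(clin.name)
            and normalize(mon.mrn) == normalize(clin.mrn))


def trend_distance(mon: MonitorRecord, clin: ClinicalRecord) -> Optional[float]:
    """Stage 2: mean absolute difference at the charted times that both
    records share. Returns None when the records do not overlap in time."""
    diffs = [abs(mon.heart_rate[t] - value)
             for t, value in clin.heart_rate.items()
             if t in mon.heart_rate]
    return mean(diffs) if diffs else None


def match(mon: MonitorRecord,
          candidates: List[ClinicalRecord],
          max_distance: float = 5.0) -> Optional[str]:
    """Return the ICUstay_ID of the first candidate that passes both stages."""
    for clin in candidates:
        if not identifiers_match(mon, clin):
            continue
        distance = trend_distance(mon, clin)
        if distance is not None and distance <= max_distance:
            return clin.icustay_id
    return None
```

In this sketch, a monitor record with a blank name or MRN never passes the first stage, which mirrors the condition that matching was attempted only when the identifying information was available.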

Briefly, determination of trend similarity included four stages:

However, some of the matches may be incorrect. After manual review, we believe we have caught most of the inconsistencies, but anomalies may still be present.


djscott 2011-09-07