PhysioNet/CinC Challenge 2016: Training Sets

This database holds the records used in the PhysioNet/CinC Challenge 2016. See the Challenge page for more details. The database is also described in:

Liu et al. An open access database for the evaluation of heart sound algorithms. Physiol Meas. 2016 Nov 21;37(12):2181-2213 https://www.ncbi.nlm.nih.gov/pubmed/27869105

Please cite this publication and also include the standard citation for PhysioNet when referencing this material:

Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23):e215-e220 [Circulation Electronic Pages; http://circ.ahajournals.org/cgi/content/full/101/23/e215]; 2000 (June 13).

Heart sound recordings were sourced from several contributors around the world and collected in either clinical or nonclinical environments, from both healthy subjects and pathological patients. The Challenge training set consists of five databases (A through E) containing a total of 3,126 heart sound recordings, lasting from 5 seconds to just over 120 seconds. You can browse these files or download the entire training set as a zip archive (169 MB).

Within each database, every record name begins with the same letter, followed by a number that is sequential but randomly assigned. Files from the same patient are therefore unlikely to be numerically adjacent. The training and test sets have been divided so that they form two mutually exclusive populations (i.e., no recordings from the same subject/patient are in both the training and test sets). Moreover, two data sets have been placed exclusively in either the training or the test database, to ensure there are ‘novel’ recording types and to reduce overfitting on the recording methods. Both the training set and the test set may be enriched after the close of the unofficial phase. The test set is unavailable to the public and will remain private for the purpose of scoring.
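As a rough illustration of the naming scheme described above, the following Python sketch groups record names by their leading database letter. The exact name pattern (one lowercase letter followed by digits, such as "a0001") and the function name `group_by_database` are assumptions made for this example, not part of the dataset specification.

```python
import re
from collections import defaultdict

# Assumed pattern: one lowercase database letter followed by digits (e.g. "a0001").
RECORD_NAME = re.compile(r"^([a-z])(\d+)$")


def group_by_database(record_names):
    """Group record names by their leading database letter."""
    groups = defaultdict(list)
    for name in record_names:
        match = RECORD_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected record name: {name!r}")
        groups[match.group(1)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical record names, used only to exercise the function.
    print(group_by_database(["a0001", "b0002", "a0003"]))
```

Because the numbers are randomly assigned, adjacent record numbers say nothing about which subject a recording came from; the grouping above recovers only the source database.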

Participants may note the existence of a validation dataset in the data folder. This data is a copy of 300 records from the training set, and will be used to validate entries before their evaluation on the test set. More detail will be provided in the scoring section below.

The heart sound recordings were collected from different locations on the body. The four typical locations are the aortic, pulmonic, tricuspid and mitral areas, but a recording could come from any one of nine different locations. In both the training and test sets, heart sound recordings were divided into two types: normal and abnormal. The normal recordings were from healthy subjects and the abnormal ones were from patients with a confirmed cardiac diagnosis. The patients suffer from a variety of illnesses (which we do not provide on a case-by-case basis), but typically these are heart valve defects and coronary artery disease. The heart valve cases include mitral valve prolapse, mitral regurgitation, aortic stenosis and valvular surgery. All the recordings from the patients were generally labeled as abnormal; we do not provide a more specific classification for them. Please note that both the training and test sets are unbalanced, i.e., the number of normal recordings does not equal the number of abnormal recordings. You will have to take this into account when you train and test your algorithms.
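One common way to account for an unbalanced training set is to weight each class inversely to its frequency. The sketch below is a minimal example of that idea, not a method prescribed by the Challenge; the label strings "normal" and "abnormal" and the function name `balanced_class_weights` are assumptions for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical labels: three normal recordings and one abnormal recording.
    labels = ["normal", "normal", "normal", "abnormal"]
    print(balanced_class_weights(labels))
```

With these weights the rarer class contributes as much to a weighted loss as the more common one. Whether that is appropriate depends on the scoring metric in use.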

The healthy subjects and the pathological patients each include both children and adults. Each subject/patient may have contributed between one and six heart sound recordings. The recordings last from several seconds to more than one hundred seconds. All recordings have been resampled to 2,000 Hz and are provided in .wav format. Each recording contains only one PCG lead.
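The properties stated above (2,000 Hz sampling rate, a single PCG lead) can be checked per file with Python's standard `wave` module. This is a minimal sketch; the function name `describe_recording` and the returned dictionary keys are illustrative choices, and the file path in the usage example is hypothetical.

```python
import wave

EXPECTED_RATE_HZ = 2000  # sampling rate stated in the dataset description


def describe_recording(path):
    """Read a .wav header and return its sampling rate, channel count and duration."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    if rate != EXPECTED_RATE_HZ:
        raise ValueError(f"{path}: expected {EXPECTED_RATE_HZ} Hz, found {rate} Hz")
    if channels != 1:
        raise ValueError(f"{path}: expected a single channel, found {channels}")
    return {"rate_hz": rate, "channels": channels, "duration_s": frames / rate}


if __name__ == "__main__":
    import sys

    # Usage: python describe.py path/to/recording.wav  (path is hypothetical)
    if len(sys.argv) > 1:
        print(describe_recording(sys.argv[1]))
```

Only the header is read, so the check is cheap enough to run over the whole training set before any feature extraction.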

Please note that due to the uncontrolled environment of the recordings, many recordings are corrupted by various noise sources, such as talking, stethoscope motion, breathing and intestinal sounds. Some recordings were difficult or even impossible to classify as normal or abnormal.

Name              Last modified      Size
validation.zip    2016-03-18 15:33   18M
validation/       2016-03-20 00:01   -
training.zip      2016-05-18 17:17   181M
training-f/       2016-09-02 16:43   -
training-a/       2016-09-03 00:02   -
training-b/       2017-04-25 16:13   -
training-d/       2017-04-25 16:13   -
training-e/       2017-04-25 16:13   -
training-c/       2018-11-09 13:29   -

Updated Friday, 28 October 2016 at 16:58 EDT

PhysioNet is supported by the National Institute of General Medical Sciences (NIGMS) and the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number 2R01GM104987-09.