Logistic Regression-HSMM-based Heart Sound Segmentation

The new PhysioNet website is available at: https://physionet.org. We welcome your feedback.

This code is used in the sample submission of the 2016 PhysioNet/CinC Challenge.

1 July 2017: The journal Physiological Measurement has published a special issue on “Recent advances in heart sound analysis” which includes extened articles and updates on many of the top scoring entries in the competition, and an extensive analysis of the Logistic Regression-HSMM-based Heart Sound Segmentation software package, available on this page.

A detailed description of this segmentation algorithm and its application can be found in:

Springer DB, Tarassenko L, Clifford GD. Logistic Regression-HSMM-based Heart Sound Segmentation. IEEE Transactions on Biomedical Engineering, 2016 Apr; 63(4):822-32. doi: 10.1109/TBME.2015.2475278. Epub 2015 Sep 1.

Liu C, Springer D, Clifford GD. Performance of an open-source heart sound segmentation algorithm on eight independent databases. Physiological Measurements, 2017 August; 38(8):1730 –1745. PDF

Please cite the above publications when referencing this material, and also include the standard citation for PhysioNet:

Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. "PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals," Circulation 101(23):e215-e220 [Circulation Electronic Pages; http://circ.ahajournals.org/content/101/23/e215.full]; 2000 (June 13).


Heart sound segmentation refers to the detection of the exact positions of the first (S1) and second (S2) heart sounds in a heart sound recording or phonocardiogram (PCG). These sounds originate at the beginning of mechanical systole and diastole respectively [1], with S1 occuring immediately after the R-peak (ventricular depolarisation) of the electrocardiogram (ECG), while S2 occurs at approximately at the end-T-wave of the ECG (the end of ventricular depolarisation) [2] (See Figure 1).

Heart sound segmentation is an essential step in the automatic analysis of heart sound recordings, as it allows the analysis of the periods between these sounds for the presence of clicks and murmurs; sounds often indicative of pathology [3]. Segmentation becomes a difficult task when PCG recordings are corrupted by in-band noise. Common noise sources include endogenous or ambient speech, motion artefacts and physiological sounds such as intestinal and breathing sounds. Other physiological sounds of interest, such as murmurs, clicks, splitting of the heart sounds and additional S3 and S4 sounds, can also complicate the identification of the S1 and S2 sounds. For example, Figure 1 shows a PCG (alongside an ECG) with a loud mid-systolic click indicative of mitral valve prolapse, which could be confused for S1 or S2.

[Example of an ECG-labelled PCG]

Figure 1: Example of an ECG-labelled PCG, with the ECG, PCG and four states of the heart cycle (S1, systole, S2 and diastole) shown. A mid-systolic click can be seen in each of the heart cycles, labelled as "m". The R-peak and end-T-wave are labelled as references for defining the approximate positions of S1 and S2 respectively.

While threshold-based segmentation methods have shown modest success, probabilistic models, such as hidden Markov models (HMMs), have recently been shown to surpass the capabilities of previous methods. Segmentation performance is further improved when a priori information about the expected duration of the states is incorporated into the model, such as in a hidden semi-Markov model (HSMM) [4]. An excellent introduction to HMMs can be found in [5].

The code on this page extends the work of [4] by implementing such a HSMM for segmentation, but extended with the use of logistic regression for emission probability estimation which was found to significantly improve segmentation accuracy. In addition, we implement a modified Viterbi algorithm for decoding the most-likely sequence of states. This modified Viterbi algorithm overcomes the limitation of the standard duration-dependant Viterbi algorithm where the start and end of detected states have to coincide with the start and end of the PCG recording. More details can be found in the paper referenced above.

Explanation of the code

The code at the bottom of this page can be used to both train and execute the HSMM-based segmentation of PCGs. The step-by-step operations of the code are explained below.

Training the HSMM

The HSMM training is performed within the trainSpringerSegmentationAlgorithm.m file. Within this file, the basic steps are:

  1. Extract the features from the training set of recordings (getSpringerPCGFeatures.m). These include the:
    • Homomorphic envelope
    • Hilbert envelope
    • Power spectral density envelope
    • Discrete wavelet transform envelope
  2. Label the PCGs based on the supplied R-peak and end-T-wave locations (labelPCGStates.m).
  3. Train the three parameter matrices for the HSMM (trainBandPiMatricesSpringer.m).

Running the trained HSMM

The three trained parameter matrices from above are used to segment a PCG in runSpringerSegmentationAlgorithm.m. The basic operations of the segmentation are:

  1. Extract the features from a test set recording (getSpringerPCGFeatures.m).
  2. Estimate the heart rate based on the autocorrelation (getHeartRateSchmidt.m).
  3. Use the extended duration-dependant Viterbi algorithm to derive the most likely sequence of states from the PCG (lviterbiDecodePCG_Springer.m).
    • The duration distributions for the four states used in the Viterbi algorithm are computed in get_duration_distributions.m.

An example of the output of runSpringerSegmentationAlgorithm.m is shown in blue at the top of Figure 2.

Important options for the code are stored in default_Springer_HSMM_options.m. These include:

If you set springer_options.use_mex = true; in the options file, please be sure to create the MEX executable from the source file by running: mex viterbi_Springer.c before running the code.

[Example of a segmented PCG]

Figure 2: Example of a segmented PCG using the code on this page. The four HSMM-detected states of the heart cycle (S1, systole, S2 and diastole) are shown at the top of the plot in blue - this is the output of the runSpringerSegmentationAlgorithm.m file. The red dotted line shows the labelled states, derived from the R-peak and end-T-wave positions in the labelPCGStates.m file. These labelled states are only used for training the algorithm. The ECG is shown in red at the bottom of the plot, with the positions of the R-peaks (*) and the end-T-waves (o) indicated. Only the PCG is used to derive the detected states in blue.

Quick start guide and example usage

An example of how the model is first trained on a small example set of data, and then used to segment a test recording is given in run_Example_Springer_Script.m. This should run out-of-the-box (provided you have set the default_Springer_HSMM_options.m correctly). If you set springer_options.use_mex = true; in the options file, please be sure to create the MEX executable from the source file by running: mex viterbi_Springer.c before running the code.

This code loads a few records from the example data. Alongside each recording are the R-peak and end-T-wave annotations which are used to label the locations of the heart sounds in the training set.

The HSMM is trained on five of the six example recordings, and applied to the sixth. Running this script should produce a plot similar to that shown in Figure 3.

[Output of run_Example_Springer_Script.m]

Figure 3: The output when running run_Example_Springer_Script.m. This shows the PCG signal in black, with the HSMM derived states shown in red.

The code in run_Example_Springer_Script.m will give you a clear example of how to apply these method to new datasets. Tested April 4 2016.

Example Data

The example_data.mat file contains the full dataset from the paper mentioned at the top of this page. Included in this .mat file is a struct with the audio files, annotations, binary diagnosis (1 = presence of pathology) and the "patient number" for each record.

The example_data.example_audio_data is sampled at 1000 Hz, and contains recordings from various locations on the chest. While there are 135 patients, there are 792 recordings. This is due to multiple recordings per patient, as well as some recordings being split at positions of inconsistent annotations.

The example_data.example_annotations are the positions of the R-peak (first column of the cell array) and the end-T-wave positions (second column of the cell array), sampled at 50 Hz. As explained in the associated publication, these annotations are computed based on the agreement between five different automatic R-peak and end-T-wave detectors. They are not human-derived annotations.

The example_data.binary_diagnosis cell array shows which recordings were from subjects with no pathological heart damage as assessed by echocardiography (binary_diagnosis=0), and those with some sort of heart pathology (most commonly mitral valve prolapse - binary_diagnosis = 1).

The example_data.patient_number cell array indicates which subject each recording was captured from. Again, there are multiple recordings per patient due to multiple auscultation positions and splitting of recordings due to inconsistent annotations.


  1. Douglas, G., Nicol, E.F., and Robertson, C., Macleod’s Clinical Examination. Elsevier Health Sciences, 2009.
  2. Tilkian, A.G. and Conover, M.B., Understanding heart sounds and murmurs: with an introduction to lung sounds. Elsevier Health Sciences, 2001.
  3. Leatham, A. Auscultation of the heart and phonocardiography. Churchill Livingstone: 1975.
  4. Schmidt, S.E.; Holst-Hansen, C.; Graff, C.; Toft, E.; Struijk, J.J. Segmentation of heart sound recordings by a duration-dependent hidden markov model. Physiol Meas 2010, 31, 513-529.
  5. Rabiner, L., "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
Icon  Name                                       Last modified      Size  Description
[PARENTDIR] Parent Directory - [   ] COPYING 2016-04-06 18:39 34K GNU General Public License [IMG] Figure1.png 2016-04-06 19:09 78K [IMG] Figure2.png 2016-04-06 19:11 84K [IMG] Figure3.png 2016-04-06 19:12 79K [TXT] Hilbert_Envelope.m 2016-04-04 15:50 1.9K [TXT] Homomorphic_Envelope_with_Hilbert.m 2016-04-04 15:50 3.4K [   ] Liu_HSMM_Heartsound_Algo_IOP_July_2017.pdf 2017-09-06 12:43 1.1M [   ] SpringerSegmentationCode.zip 2016-04-07 09:21 73M [TXT] butterworth_high_pass_filter.m 2016-04-04 15:50 2.8K [TXT] butterworth_low_pass_filter.m 2016-04-04 15:50 2.8K [TXT] default_Springer_HSMM_options.m 2016-04-04 15:50 1.6K [   ] example_data.mat 2016-04-04 15:50 73M [TXT] expand_qt.m 2016-04-04 15:50 2.2K [DIR] fullsizedfigures/ 2016-04-06 19:07 - [TXT] getDWT.m 2016-04-04 15:50 2.5K [TXT] getHeartRateSchmidt.m 2016-04-04 15:50 3.7K [TXT] getSpringerPCGFeatures.m 2016-04-04 15:50 4.1K [TXT] get_PSD_feature_Springer_HMM.m 2016-04-04 15:50 2.5K [TXT] get_duration_distributions.m 2016-04-04 15:50 3.7K [TXT] labelPCGStates.m 2016-04-04 15:50 6.9K [TXT] normalise_signal.m 2016-04-04 15:50 1.4K [TXT] runSpringerSegmentationAlgorithm.m 2016-04-04 15:50 2.7K [TXT] run_Example_Springer_Script.m 2016-04-04 15:50 2.2K [TXT] schmidt_spike_removal.m 2016-04-04 15:50 4.3K [TXT] trainBandPiMatricesSpringer.m 2016-04-04 15:50 5.9K [TXT] trainSpringerSegmentationAlgorithm.m 2016-04-04 15:50 3.7K [TXT] viterbiDecodePCG_Springer.m 2016-04-04 15:50 15K [TXT] viterbi_Springer.c 2016-04-04 15:50 13K

Questions and Comments

If you would like help understanding, using, or downloading content, please see our Frequently Asked Questions.

If you have any comments, feedback, or particular questions regarding this page, please send them to the webmaster.

Comments and issues can also be raised on PhysioNet's GitHub page.

Updated Friday, 28 October 2016 at 16:58 EDT

PhysioNet is supported by the National Institute of General Medical Sciences (NIGMS) and the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number 2R01GM104987-09.