Non-contact heart rate detection using an adaptive signal recovery algorithm

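Zhou Shuang1,2, Yang Xuezhi1,2, Jin Jing1,2, Fang Shuai1,2, Liu Xuenan1,2 (1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009; 2. Anhui Key Laboratory of Industrial Safety and Emergency Technology, Hefei 230009)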

Abstract
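Objective Heart rate is an important physiological parameter that reflects the cardiovascular condition and psychological state of the human body. Recent studies have shown that imaging photoplethysmography can estimate heart rate without contacting the body, using a consumer-grade camera to capture changes in facial skin color. In real environments, however, interference from facial motion degrades the accuracy of heart rate detection. In recent years, researchers in China and abroad have proposed several methods for removing motion noise, but none performs satisfactorily. To solve this problem, a new method that is robust to facial motion interference is proposed. Method The subject's face is first detected and tracked. The facial region is then divided into blocks, and the chrominance features of each block are extracted to build the raw blood volume pulse matrix. An adaptive signal recovery algorithm separates a low-rank matrix from the raw blood volume pulse matrix, and the desired blood volume pulse signal is constructed from it. Finally, heart rate is estimated from the power spectral density. Result With ambient light as the light source, face videos of 30 subjects were captured with a webcam for experimental analysis. The results show that the heart rate measured by the proposed method correlates strongly with the reference values: the Pearson correlation coefficient is r=0.990 2 in static scenarios and r=0.960 5 in dynamic scenarios. Compared with the latest method, the error rate in dynamic scenarios is reduced by 53.90% and the correlation is improved by 7.46%. In addition, in a 10 min heart rate detection experiment, the measured values remain in good agreement with the reference values. Conclusion The proposed method outperforms existing non-contact heart rate detection techniques, effectively removes the interference caused by facial motion, and detects heart rate stably over long periods.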
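The low-rank separation step can be illustrated with a minimal sketch. The abstract does not specify the adaptive signal recovery algorithm, so the code substitutes principal component pursuit, a standard low-rank plus sparse decomposition, as a stand-in. The function names, the synthetic blood volume pulse matrix, and the parameter defaults are all assumptions made for illustration.

```python
import numpy as np


def shrink(x, tau):
    """Soft-thresholding: move every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def low_rank_plus_sparse(M, max_iter=1000, tol=1e-7):
    """Split M into a low-rank part L and a sparse part S.

    Principal component pursuit solved by an augmented-Lagrangian loop.
    This is a stand-in for the paper's recovery step, not a
    reimplementation of it; lam and mu are commonly used defaults.
    """
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))        # weight on the sparse term
    mu = m * n / (4.0 * np.abs(M).sum())  # penalty parameter
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                  # dual variable
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * shrink(sig, 1.0 / mu)) @ Vt
        # Sparse update: shrink individual entries.
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S


def synthetic_bvp_matrix(n_blocks=20, n_frames=300, fs=30.0, seed=0):
    """Hypothetical data: one shared pulse per block plus sparse spikes."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_frames) / fs
    pulse = np.sin(2 * np.pi * 1.2 * t)           # 72 beats per minute
    gains = rng.uniform(0.5, 1.5, size=n_blocks)  # per-block pulse strength
    clean = np.outer(gains, pulse)                # rank-1 matrix
    spikes = np.zeros_like(clean)
    hit = rng.random(clean.shape) < 0.05          # 5% of entries distorted
    spikes[hit] = rng.choice([-3.0, 3.0], size=int(hit.sum()))
    return clean, clean + spikes


clean, raw = synthetic_bvp_matrix()
recovered, outliers = low_rank_plus_sparse(raw)
print("rank of clean matrix:", np.linalg.matrix_rank(clean))
print("rank of raw matrix:", np.linalg.matrix_rank(raw))
print("relative error of raw matrix:",
      np.linalg.norm(raw - clean) / np.linalg.norm(clean))
print("relative error after recovery:",
      np.linalg.norm(recovered - clean) / np.linalg.norm(clean))
```

In this toy setting every block carries a scaled copy of the same pulse, so the clean matrix has rank 1, and the scattered spikes raise the rank of the raw matrix. Real facial motion is likely to distort contiguous frames rather than isolated entries, which this sketch does not model.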
Keywords
Non-contact heart rate detection using self-adaptive signal recovery algorithm

Zhou Shuang1,2, Yang Xuezhi1,2, Jin Jing1,2, Fang Shuai1,2, Liu Xuenan1,2(1.School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China;2.Anhui Key Laboratory of Industrial Safety and Emergency Technology, Hefei 230009, China)

Abstract
Objective Heart rate is an important physiological parameter that reflects the cardiovascular condition and mental state of the human body. Traditional heart rate detection techniques require pressure sensors or optical sensors attached to the skin. However, contact between the sensors and the skin is inconvenient for subjects, especially those with skin diseases, which makes traditional techniques difficult to use in daily life. With improvements in bio-image information processing technology, non-contact heart rate detection based on imaging photoplethysmography has become an active research focus. The color of the human epidermis changes subtly with the rhythm of the heartbeat. These changes are invisible to the human eye but can be captured by a webcam and used to estimate heart rate. Current non-contact heart rate detection technologies focus on facial skin because of its dense capillary distribution and the significant progress in face tracking technology. However, in realistic environments involving facial motion, the accuracy of the detected heart rate fails to meet requirements. In recent years, various methods, such as independent principal component analysis, adaptive filtering, and wavelet transform, have been proposed to address this problem. After initial chrominance signals are obtained from the different channels of a video, the independent principal component analysis algorithm extracts mutually independent source signals from them, one of which represents the heartbeat pulsation. However, the source signals obtained are still corrupted by substantial noise, and the algorithm is complicated, which hinders its widespread application. Adaptive filtering is generally based on the least mean square algorithm. 
In principle, noise mixed into the chrominance signal is removed by adaptively adjusting the filter parameters, regardless of the noise characteristics. In practice, however, this method removes only Gaussian white noise and not the sharp noise caused by motion interference. The wavelet transform method decomposes the original blood volume pulse signal into a series of frequency bands, selects the components that lie within the heart rate band, and combines them into a desired signal from which heart rate is subsequently estimated. Nevertheless, the power spectral density of the sharp noise caused by motion interference covers the entire heart rate band, so the wavelet transform cannot remove it completely. Overall, existing non-contact heart rate detection technologies fail to filter out the sharp noise caused by facial motion effectively, and their results are inaccurate. Method To tackle these problems, this work proposes a novel method that is robust to interference from facial motion. Discriminative response map fitting and the Kanade-Lucas-Tomasi algorithm are used to detect and track the facial region in the video. Some facial areas (e.g., the eyes) contain no information about heart rate. The face is therefore divided into several sub-regions that are analyzed separately, which mitigates the impact of uninformative areas. Chrominance features of each sub-region are extracted from the video to build the raw blood volume pulse matrix. In theory, the ideal blood volume pulse matrix should be low-rank. In practice, however, a change of expression at a given moment may abruptly distort the blood volume pulse at the corresponding time. This distortion of local elements increases the rank of the blood volume pulse matrix. 
This work applies the adaptive signal recovery algorithm to the corrupted blood volume pulse matrix to discard the abnormal elements arising from facial motion and reconstruct the low-rank matrix. From this matrix, sub-regions rich in heart rate information are selected to compose the desired blood volume pulse signal. Heart rate is then calculated from the power spectral density of this signal. Result About 211 videos from 30 subjects are recorded under natural ambient lighting. Three series of experiments are conducted to verify the validity of the proposed method: experiments on heart rate accuracy in static and dynamic scenarios, experiments on stability under different video durations and frame rates, and a 10-minute experiment on long-term heart rate monitoring. The accuracy experiments show that the Pearson correlation coefficient between the estimated heart rate and the ground truth is 0.990 2 in static scenarios and 0.960 5 in dynamic scenarios, so the estimated heart rate closely approximates the ground truth. Compared with the newest method, our method decreases the error rate in dynamic scenarios by 53.90% and increases the Pearson correlation coefficient by 7.46%. The stability experiments show that the proposed method performs well as long as the video duration is longer than 8 seconds or the frame rate is higher than 20 frames per second. The long-term monitoring experiment shows that the heart rate measured by the proposed method fluctuates similarly to the ground truth. Conclusion The experiments show that the proposed method outperforms state-of-the-art methods: it removes disturbances caused by facial expressions and increases the accuracy of heart rate detection. However, under sudden illumination changes, the accuracy of the proposed method remains limited, which we plan to improve in future work.
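The final step, reading heart rate off the power spectral density, can be sketched as follows. This is a minimal periodogram-based illustration under stated assumptions, not the authors' implementation: the function name, the 0.7 to 4.0 Hz search band, the Hann window, and the synthetic test signal are all hypothetical choices.

```python
import numpy as np


def estimate_heart_rate(bvp, fs, band=(0.7, 4.0)):
    """Return heart rate in beats per minute from the PSD peak of a pulse signal.

    band is an assumed plausible range (42 to 240 beats per minute),
    not a value taken from the paper.
    """
    bvp = np.asarray(bvp, dtype=float)
    x = (bvp - bvp.mean()) * np.hanning(bvp.size)  # remove DC, reduce leakage
    psd = np.abs(np.fft.rfft(x)) ** 2              # periodogram, unnormalized
    freqs = np.fft.rfftfreq(bvp.size, d=1.0 / fs)  # bin frequencies in Hz
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_hz = freqs[in_band][np.argmax(psd[in_band])]
    return 60.0 * peak_hz


def synthetic_pulse(bpm, fs=30.0, seconds=20.0, noise=0.5, seed=0):
    """Hypothetical test signal: a sinusoidal pulse plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(seconds * fs)) / fs
    return np.sin(2 * np.pi * (bpm / 60.0) * t) + noise * rng.standard_normal(t.size)


fs = 30.0
bvp = synthetic_pulse(72.0, fs=fs)
print(f"estimated heart rate: {estimate_heart_rate(bvp, fs):.1f} beats per minute")
```

The frequency bins of a T-second clip are spaced 1/T Hz apart, so a 20 s clip resolves heart rate in steps of 3 beats per minute, and shorter clips give coarser steps unless the spectral peak is interpolated.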
Keywords
