Qi Gang, Yang Xuezhi, Wu Xiu, Huo Liang (School of Computer and Information, Hefei University of Technology, Hefei 230009)
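Objective Heart rate is an important indicator that directly reflects human health, and video-based non-contact heart rate detection has broad application prospects in medicine and health care. However, existing video-based methods are ill-suited to complex real-world scenes, mainly because they ignore shaking of the target in the video and spatial scale features, so the blood volume pulse signal is extracted inaccurately and detection accuracy is unsatisfactory. To overcome these shortcomings, a non-contact heart rate detection method that resists interference from face shaking is proposed. Method The method comprises three main steps. First, to address the way target shaking disturbs face region selection, discriminative response map fitting is used to detect the face region and the feature points of the main facial organs in a reference image, and tilt correction is introduced into face tracking for the first time, yielding a face video in which shaking interference is suppressed. Then, taking differences in spatial scale into account, a color magnification method performs spatio-temporal processing on the shake-suppressed face video to extract a clean blood volume pulse signal. Finally, in view of the small-sample problem, heart rate is estimated by a frequency-domain analysis based on iterative interpolation of Fourier coefficients. Result Videos were collected in a cooperative setting with a still face and a non-cooperative setting with a shaking face, and the heart rate detection results were analyzed quantitatively. The accuracy of the method in the two settings is 97.84% and 97.30%, respectively. Compared with classical and recent methods, accuracy improves by more than 1% in the cooperative setting and by more than 7% in the non-cooperative setting, showing excellent performance. Conclusion A heart rate detection method based on face video processing is proposed. By effectively analyzing the shaking interference and scale characteristics of the face, it extracts a clean blood volume pulse signal and improves the accuracy and robustness of heart rate detection.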
Heart rate detection for non-cooperative shaking face
Qi Gang, Yang Xuezhi, Wu Xiu, Huo Liang (School of Computer and Information, Hefei University of Technology, Hefei 230009, China)
Objective Heart rate is an important indicator that directly reflects human health. Heart rate detection is used throughout medicine, for example in physical examinations, major surgery, and postoperative care. Heart rate detection based on face video processing has recently made non-contact measurement possible, without complex operations or any sense of restraint for the subject. However, existing methods perform poorly in complex realistic scenes, such as those with a shaking target. When the face shakes during face detection, the facial region of interest is selected inaccurately. Such methods also disregard spatial scale features, which are important for extracting the blood volume pulse (BVP) signal. The results of current methods are consequently inadequate. To address these problems, a new non-contact heart rate detection method based on face video processing is proposed to reduce the influence of face shaking and improve precision. Method Our method consists of three major steps. First, we process the video with a robust face detection and tracking model to obtain a refined face video in which facial shaking is suppressed. Because the widely used Viola-Jones face detection model produces an incorrect face area when the face is tilted across consecutive frames, discriminative response map fitting is used to detect the important feature points needed to track the correct face area. In the first frame, we mark 66 landmark points on the facial organs and outline (eyes, nose, mouth, and facial shape) together with the four vertices of the facial rectangle. These feature points are then fed into the Kanade-Lucas-Tomasi tracking model to calculate the facial rectangle in subsequent frames. According to the oblique angle of each facial rectangle, the corresponding face image is rotated to a vertical position.
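The tilt-correction step can be illustrated with a minimal geometric sketch. It assumes the tracker returns the four rectangle vertices in the order top-left, top-right, bottom-right, bottom-left, and that the oblique angle is measured along the top edge. The function names `tilt_angle`, `rotate_points`, and `upright_rectangle` are hypothetical and not taken from the paper. The sketch rotates only the vertices; in a full pipeline the same angle would drive an image warp, which is not shown here.

```python
import math


def tilt_angle(top_left, top_right):
    """Angle in radians between the rectangle's top edge and the x-axis."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    return math.atan2(dy, dx)


def rotate_points(points, center, angle):
    """Rotate 2-D points about `center` by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cx, cy = center
    return [
        (cx + cos_a * (x - cx) - sin_a * (y - cy),
         cy + sin_a * (x - cx) + cos_a * (y - cy))
        for x, y in points
    ]


def upright_rectangle(vertices):
    """Return (tilt angle, vertices rotated back so the top edge is horizontal).

    Assumed vertex order: top-left, top-right, bottom-right, bottom-left.
    """
    cx = sum(x for x, _ in vertices) / 4.0
    cy = sum(y for _, y in vertices) / 4.0
    angle = tilt_angle(vertices[0], vertices[1])
    # Undo the tilt by rotating through the opposite angle about the centre.
    return angle, rotate_points(vertices, (cx, cy), -angle)


# Usage: tilt an axis-aligned rectangle by 20 degrees, then correct it.
rect = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
tilted = rotate_points(rect, (2.0, 1.0), math.radians(20.0))
angle, corrected = upright_rectangle(tilted)
```

The same formulas hold in image coordinates, where the y-axis points down, as long as one convention is used consistently for both measuring and undoing the angle.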
Second, the corrected face video is handled by a space-time processing algorithm that amplifies color variations in the video, separating its spatial scale characteristics and isolating the frequency range of blood volume changes. We average the chrominance of the skin under the eyes to obtain a clean BVP signal. Finally, because the BVP signal is a small sample, frequency domain analysis is combined with iterative Fourier coefficient interpolation to estimate heart rate. The iteration is performed 1,000 times for improved accuracy. Result The proposed method is tested on two face video libraries, one of still face videos and one of shaking face videos. Each library contains 60 10-second videos from 20 participants (12 men and 8 women). We quantitatively compare the typical method of Poh, the more recent method of Liu, and our method. The overall accuracies of our method on still and shaking face videos are 97.84% and 97.30%, respectively. Relative to the compared methods, accuracy increases by more than 1% on still face videos and by more than 7% on shaking face videos. Conclusion Video-based heart rate detection in complex realistic scenes is affected by facial shaking, which significantly reduces accuracy. Neglecting spatial scale characteristics and the small-sample problem also degrades detection performance. Hence, this study proposes a novel heart rate detection method for complex realistic scenes. We detect and track important facial feature points to analyze facial shaking effectively and correct the facial tilt. After space-time processing that selects a proper spatial scale, a clean BVP signal is extracted and heart rate is calculated iteratively. Experimental results indicate that our method achieves high accuracy and adapts well to cases involving facial shaking.
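To make the small-sample frequency estimation concrete, the following sketch shows one common form of iterative interpolation on Fourier coefficients: the half-bin estimator usually attributed to Aboutanios and Mulgrew. The abstract does not state which variant the authors use, so this choice is an assumption, as are the function names `dft_at` and `estimate_heart_rate` and the decision to search all positive bins except DC. A 10-second clip gives a raw spectral resolution of only 0.1 Hz (6 beats per minute), which is why refinement between bins matters.

```python
import cmath
import math


def dft_at(samples, k):
    """DFT coefficient of `samples` at a (possibly fractional) bin index k."""
    n_total = len(samples)
    return sum(
        x * cmath.exp(-2j * math.pi * k * n / n_total)
        for n, x in enumerate(samples)
    )


def estimate_heart_rate(bvp, fps, iterations=2):
    """Estimate heart rate in beats per minute from a BVP signal.

    Assumed scheme: coarse peak search on integer bins, then repeated
    refinement using the coefficients half a bin either side of the
    current estimate.
    """
    n_total = len(bvp)
    mean = sum(bvp) / n_total
    centered = [x - mean for x in bvp]  # remove the DC component

    # Coarse estimate: strongest positive-frequency bin, excluding DC.
    coarse = max(range(1, n_total // 2),
                 key=lambda k: abs(dft_at(centered, k)))

    delta = 0.0  # fractional-bin correction
    for _ in range(iterations):
        upper = dft_at(centered, coarse + delta + 0.5)
        lower = dft_at(centered, coarse + delta - 0.5)
        delta += 0.5 * ((upper + lower) / (upper - lower)).real

    freq_hz = (coarse + delta) * fps / n_total
    return 60.0 * freq_hz


# Usage: a synthetic 10-second, 30 fps signal at 77 beats per minute.
fps = 30
true_bpm = 77.0
bvp = [math.cos(2 * math.pi * (true_bpm / 60.0) * n / fps + 0.3)
       for n in range(10 * fps)]
bpm = estimate_heart_rate(bvp, fps, iterations=10)
```

The abstract reports 1,000 iterations; the count is left as a parameter here, with a small default only to keep the example fast.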