Published: 2019-08-16
DOI: 10.11834/jig.180643
2019 | Volume 24 | Number 8

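Image Understanding and Computer Vision

3D gaze estimation using eyeball optical center calibration and distance correction

Zhang Yuanhui, Duan Chengjie, Zhu Junjiang, He Yuchen

College of Electrical and Mechanical Engineering, China Jiliang University, Hangzhou 310018

Received: 2018-12-10; Revised: 2019-01-18

Supported by: National Natural Science Foundation of China (61801454); Zhejiang Provincial Natural Science Foundation of China (LY19F010007, LQ18F010006, LQ19F030007)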

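About the authors: Zhang Yuanhui (first author), born in 1982, male, Ph.D., associate professor. His research interests include computer vision, signal processing, and intelligent robot control. E-mail: zyh@cjlu.edu.cn;
Duan Chengjie, male, master's student. His research interest is computer vision. E-mail: xxpcb2018@163.com;
Zhu Junjiang, male, Ph.D., lecturer. His research interest is signal processing. E-mail: zjj602@yeah.net;
He Yuchen, male, Ph.D., lecturer. His research interest is big-data classification and diagnosis methods. E-mail: yche@cjlu.edu.cn.

CLC number: TP391.4

Document code: A

Article ID: 1006-8961(2019)08-1369-12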

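Abstract

Objective In 3D gaze estimation based on the intersection of binocular lines of sight, manual measurement of the 3D coordinates of the eyeball optical centers introduces considerable error, and the estimated 3D point of regard deviates strongly along the depth direction. We therefore propose eyeball optical center calibration and distance correction to improve the 3D gaze estimation model. Method First, an image processing algorithm extracts the PCCR (pupil center cornea reflection) vectors of the left and right eyes, and a second-order polynomial mapping function yields the 2D planar point of regard of each eye. Second, an eyeball optical center calibration method obtains the 3D coordinates of the optical centers, which avoids the error introduced by manual measurement. Third, the planar points of regard are combined with the optical centers to obtain the lines of sight of both eyes, and their intersection gives a preliminary 3D point of regard. Finally, because the result jitters strongly in the depth direction, depth-direction data filtering and a Z-plane intercepting correction are applied to the 3D point of regard. Result Tests in two workspaces of different sizes show that, at working distances of 30~50 cm, the angular error is 0.7° and the distance error is 17.8 mm; at working distances of 50~130 cm, the angular error is 1.0° and the distance error is 117.4 mm. Compared with other 3D gaze estimation methods under the same test space conditions, both the angular error and the distance error are markedly reduced. Conclusion The proposed calibration method obtains the 3D coordinates of the eyeball optical centers conveniently and accurately, avoids the error of manual measurement, and clearly reduces the angular error. The proposed depth-direction data filtering and Z-plane intercepting correction effectively suppress jitter in the results and clearly reduce the distance error.

Keywords

binocular lines of sight; 2D point of regard; 3D point of regard; eyeball optical center; 3D coordinate calibration; data filtering; distance correction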

3D gaze estimation using eyeball optical center calibration and distance correction

Zhang Yuanhui, Duan Chengjie, Zhu Junjiang, He Yuchen

College of Electrical and Mechanical Engineering, China Jiliang University, Hangzhou 310018, China

Supported by: National Natural Science Foundation of China (61801454)

Abstract

Objective Gaze estimation can be divided into 2D and 3D gaze estimation. 2D gaze estimation based on polynomial mapping uses only the pupil center cornea reflection (PCCR) vector of a single eye to calculate the 2D (x, y) point of regard (POG) in a plane. 3D gaze estimation based on the intersection of binocular lines of sight uses the PCCR vectors of both eyes and the 3D coordinates of the left and right eyeball optical centers (the points from which the lines of sight are emitted) to calculate the 3D (x, y, z) POG in space. In this process, manual measurement of the 3D coordinates of the eyeball optical centers introduces measurement error, and the 3D gaze estimation results deviate strongly in the depth direction. Building on the traditional binocular line-of-sight intersection method, we propose two main improvements. First, a calibration method replaces manual measurement for obtaining the 3D coordinates of the eyeball optical centers. Second, depth-direction data filtering and a Z-plane intercepting correction method are used to correct the 3D gaze estimation results.
Method First, the subject gazes at nine marked points on a calibration plane placed at a first distance from the eyes, and an infrared camera in front of the subject captures eye images. An image processing algorithm extracts the PCCR vectors of both eyes, and the mapping functions of both eyes on the first plane are solved from the second-order polynomial mapping between the PCCR vectors and the marked points. Second, the calibration plane is moved to a second distance and the subject gazes at the nine marked points again. The mapping functions give the 2D POGs of both eyes on the plane at the first distance, and each marked point at the second distance is connected to the corresponding left and right 2D POGs at the first distance. The resulting lines converge near two points, one per eye, and the equivalent intersection points of these lines are the calibrated 3D coordinates of the eyeball optical centers. Third, 3D gaze estimation is performed. A space coordinate system is established with the calibration plane as the XY plane and depth as the Z axis, and the lines of sight of both eyes are calculated from the left and right planar 2D POGs and the 3D coordinates of the eyeball optical centers. According to the principle of binocular vision, the two lines of sight intersect at one point in space, which is the rough 3D POG. Because of calculation and measurement errors the two lines are generally skew, so the midpoint of their common perpendicular is taken as the intersection. Finally, because the result jitters strongly in the depth direction, the rough result is corrected by the proposed depth-direction data filtering and Z-plane intercepting correction method. The sequence of depth values (Z coordinates) is filtered first, and the filtered depth defines a plane perpendicular to the Z axis. This plane cuts the left and right lines of sight at two points, and the midpoint of these two points gives the corrected X and Y coordinates. After filtering and correction, a more accurate 3D POG is obtained.
Result We test the proposed method in two workspaces of different sizes. In the small workspace (24×18×20 cm³), where the working distance in the depth direction is 30~50 cm, the average angular error is 0.7° and the average Euclidean distance error is 17.8 mm. In the large workspace (60×36×80 cm³), where the working distance in the depth direction is 50~130 cm, the average angular error is 1.0° and the average Euclidean distance error is 117.4 mm. Compared with other 3D gaze estimation methods under the same test distance conditions, the proposed method considerably reduces both the angular and the distance errors.
Conclusion The proposed calibration method obtains the 3D coordinates of the eyeball optical centers conveniently and accurately. It avoids the error introduced by manual measurement and significantly reduces the angular error of the 3D POG. The proposed depth-direction data filtering and Z-plane intercepting correction method reduces the jitter of the 3D POG in the depth direction and significantly reduces its distance error. The method is of practical significance for applications of 3D gaze estimation.

Key words

binocular lines of sight; 2D gaze; 3D gaze; eyeball optical center; 3D coordinates calibration; data filter; distance correction

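0 Introduction

The point of regard (POG) is the point where a person's line of sight lands when observing an object. Gaze estimation can be divided into planar and 3D gaze estimation. Planar gaze estimation, also called 2D gaze estimation, assumes that the line of sight lands within one specific plane and therefore carries only 2D motion information, as in web browsing. 3D gaze estimation must also account for the depth of the line of sight and covers 3D motion, as when a driver observes the traffic environment^[1]. Depth information generally has to be derived from features of both eyes, so 3D gaze estimation is more complex and its error is generally larger than that of 2D gaze estimation.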

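Research on 3D gaze estimation has so far been carried out mainly outside China, with relatively little domestic work. 3D gaze devices are either head-mounted^[2-4] or remote (non-head-mounted)^[5-7]. A head-mounted device fixes the image acquisition hardware on the head, which keeps the relative position between the user and the camera constant, simplifies subsequent image processing, and yields higher gaze accuracy. A remote device requires no contact between the hardware and the user, so it is less intrusive and more comfortable. Use environments are either real^[5, 8] or virtual^[9-10]: real environments support applications such as human-computer interaction and driver assistance, and virtual environments support applications such as immersive virtual reality. Gaze research began with 2D gaze estimation and was extended to 3D space from there. Liu et al.^[11] used a Tobii Pro X60 eye tracker in a 3D environment, capturing the test scene with a scene camera to estimate the subject's POG in that environment; the method, however, actually estimates a point on the 2D image captured by the scene camera. Mujahidin et al.^[5] used a commercial eye tracker to obtain the on-screen POG of each eye and then computed the 3D POG from the intersection of the left and right lines of sight. This is a typical approach that combines the 2D POGs of both eyes for 3D estimation; because the calibration plane is the computer screen, the working range is confined to the space between the user and the screen, and the observed object must not occlude the eye tracker. Mlot et al.^[8] sought higher 3D gaze accuracy by capturing eye images with a head-mounted device and calibrating the planar POG at several depths, with as many as 16 calibration points per depth; the calibration is therefore tedious, and the interpupillary distance is a fixed value that does not adapt to the user. Wibirama et al.^[9] tested 3D gaze in a virtual environment with Nvidia 3D Vision stereoscopic glasses. Zhao et al.^[12] tracked the pupil with the Leap Motion gesture-tracking device, tracked body motion with a body position tracker, and extended the planar mapping model to 3D for human-computer interaction in immersive virtual reality. Li et al.^[13] used a head-mounted device and a gaze vector method for 3D gaze estimation and applied it to interaction with a small robotic arm.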

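The test space for 3D gaze estimation is usually a cuboid region^[5, 7-8], described by the length and width of the calibration plane and the distance the plane can move in depth; other types of test space also exist. Pichitwong et al.^[14] used a test space made up of one horizontal plane and one vertical plane, divided each plane into nine regions, and measured gaze accuracy in each region. Leroux et al.^[15] used a semicircular region with height as the test space; the test targets were marked grid points on a horizontal plane and real objects of a certain height.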

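In view of the current state of research and the strengths and weaknesses of the existing approaches, this paper estimates the 3D POG with a remote (non-head-mounted) device. A low-cost USB camera, rather than an expensive commercial eye tracker, provides the planar POG. A new calibration procedure obtains the 3D coordinates of the optical centers of both eyes, which makes locating the eyes more convenient and accurate. The 3D POG is computed from the intersection of the binocular lines of sight and is then filtered and corrected to improve the estimation accuracy.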

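1 3D gaze estimation method

The proposed method estimates the 3D POG from the intersection of the lines of sight of the left and right eyes. First, 2D gaze calibration is performed on a plane at a specific distance from the eyes, which gives the planar 2D POG of each eye. Next, a space coordinate system is established and, together with the true 3D coordinates of the left and right eyeball optical centers (the points in the eyeballs from which the lines of sight are emitted), the line equations of the left and right 3D lines of sight are obtained. Finally, according to the principle of binocular vision, the intersection of the two lines of sight in space is the required 3D POG. Because of detection and calculation errors, the computed lines of sight are usually skew and do not intersect, so the midpoint of the shortest segment between them is used as the intersection. The 3D coordinates of the eyeball optical centers can be roughly located by manual measurement, but manual measurement is tedious and error-prone. This paper therefore proposes a calibration method that determines the 3D coordinates of the eyeball optical centers and replaces manual measurement. In addition, measurement noise and the 3D error induced by 2D gaze errors make the final 3D POG fluctuate strongly. To address this, this paper proposes data filtering combined with a $Z$-plane intercepting correction: the depth values (the $Z$ direction) are filtered first, and the results in the other two directions ($X$ and $Y$) are then corrected.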

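1.1 Intersection of binocular lines of sight

The geometric principle of the proposed method is shown in Fig. 1. The two spheres represent the left and right eyes, and the two crosses represent their 2D POGs on the calibration plane. Connecting each eyeball optical center to its 2D POG and extending the line gives the line of sight of that eye. Ideally the two lines of sight intersect at one point in space, but the computed lines are generally skew, as shown in the enlarged detail inside the dashed circle in Fig. 1. Two skew lines have a shortest connecting segment, which is perpendicular to both of them. The midpoint of this segment is taken as the intersection of the binocular lines of sight, that is, the estimated 3D POG.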

Fig. 1 Geometric principle of 3D gaze estimation

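The planar 2D POG is estimated by second-order polynomial mapping. First, an image processing algorithm extracts the PCCR (pupil center cornea reflection) vector of each eye under infrared illumination. As shown in Fig. 2, if the pupil center is at $\left(x_{\mathrm{p}}, y_{\mathrm{p}}\right)$ and the center of the corneal reflection glint is at $\left(x_{\mathrm{r}}, y_{\mathrm{r}}\right)$, the two components of the PCCR vector are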

$ v_{x}=x_{\mathrm{p}}-x_{\mathrm{r}}, v_{y}=y_{\mathrm{p}}-y_{\mathrm{r}} $

(1)

Fig. 2 Pupil center and corneal reflection glint

The second-order polynomial mapping functions are

$ \begin{array}{c}{x_{2 \mathrm{d}}=f_{1}\left(v_{x}, v_{y}\right)=} \\ {a_{0}+a_{1} v_{x}+a_{2} v_{y}+a_{3} v_{x}^{2}+a_{4} v_{x} v_{y}+a_{5} v_{y}^{2}}\end{array} $

(2)

$ \begin{array}{c}{y_{2 \mathrm{d}}=f_{2}\left(v_{x}, v_{y}\right)=} \\ {b_{0}+b_{1} v_{x}+b_{2} v_{y}+b_{3} v_{x}^{2}+b_{4} v_{x} v_{y}+b_{5} v_{y}^{2}}\end{array} $

(3)

where $a_{0}, \cdots, a_{5}, b_{0}, \cdots, b_{5}$ are the calibration coefficients and $\left(x_{2 \mathrm{d}}, y_{2 \mathrm{d}}\right)$ is the planar 2D POG.
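As an illustration of Eqs. (1) to (3), the following minimal sketch evaluates the mapping for one eye and fits the six coefficients of one coordinate from calibration samples. The paper does not state how the coefficients are solved; the least-squares fit through the normal equations, and the helper names `poly_features`, `fit_mapping`, and `solve_linear`, are assumptions made for this example only.

```python
def pccr_vector(pupil, glint):
    """Eq. (1): pupil center minus corneal-reflection center (image pixels)."""
    return pupil[0] - glint[0], pupil[1] - glint[1]


def poly_features(vx, vy):
    """Second-order terms in the order used by Eqs. (2) and (3)."""
    return [1.0, vx, vy, vx * vx, vx * vy, vy * vy]


def map_pccr_to_plane(vx, vy, a, b):
    """Evaluate Eqs. (2) and (3); a and b hold a0..a5 and b0..b5."""
    feats = poly_features(vx, vy)
    x2d = sum(c * f for c, f in zip(a, feats))
    y2d = sum(c * f for c, f in zip(b, feats))
    return x2d, y2d


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting.

    Assumes A is square and non-singular (no check is made).
    """
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mapping(vectors, targets):
    """Least-squares fit of six coefficients (assumed solver, not from the paper).

    vectors: list of (vx, vy) PCCR vectors, one per calibration sample.
    targets: list of known plane coordinates (all x, or all y) of the markers.
    """
    F = [poly_features(vx, vy) for vx, vy in vectors]
    # Normal equations: (F^T F) c = F^T t
    FtF = [[sum(row[i] * row[j] for row in F) for j in range(6)] for i in range(6)]
    Ftt = [sum(row[i] * t for row, t in zip(F, targets)) for i in range(6)]
    return solve_linear(FtF, Ftt)
```

Calling `fit_mapping` once with the marker x coordinates and once with the marker y coordinates gives the coefficient sets for Eq. (2) and Eq. (3) of one eye; at least six samples with distinct PCCR vectors are needed, and the nine markers described later provide some redundancy.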

The midpoint of the shortest segment between two skew lines in space is found as shown in Fig. 3.

Fig. 3 Midpoint of the shortest segment between two skew lines

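$\boldsymbol{P}_{0}$ and $\boldsymbol{Q}_{0}$ are the spatial positions of the left and right eyes, $\boldsymbol{P}_{1}$ and $\boldsymbol{Q}_{1}$ are the spatial positions of the left and right planar 2D POGs after the distance coordinate has been added, and $\boldsymbol{P}_{2}$ and $\boldsymbol{Q}_{2}$ are the two endpoints of the shortest segment between the left and right lines of sight.

Introduce the vectors $\boldsymbol{u}, \boldsymbol{v}, \boldsymbol{w}$, and $\boldsymbol{w}_{0}$, and let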

$ \left\{ {\begin{array}{*{20}{l}} {\mathit{\boldsymbol{u}} = {\mathit{\boldsymbol{P}}_1} - {\mathit{\boldsymbol{P}}_0}}\\ {\mathit{\boldsymbol{v}} = {\mathit{\boldsymbol{Q}}_1} - {\mathit{\boldsymbol{Q}}_0}}\\ {\mathit{\boldsymbol{w}} = {\mathit{\boldsymbol{P}}_2} - {\mathit{\boldsymbol{Q}}_2}}\\ {{\mathit{\boldsymbol{w}}_0} = {\mathit{\boldsymbol{P}}_0} - {\mathit{\boldsymbol{Q}}_0}} \end{array}} \right. $

(4)

From the geometry in Fig. 3,

$ \boldsymbol{P}_{2}=\boldsymbol{P}_{0}+m \cdot \boldsymbol{u}, \boldsymbol{Q}_{2}=\boldsymbol{Q}_{0}+n \cdot \boldsymbol{v} $

(5)

where $m$ and $n$ are scalar coefficients. Substituting Eq. (5) into the third line of Eq. (4) gives

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{w}} = {\mathit{\boldsymbol{P}}_2} - {\mathit{\boldsymbol{Q}}_2} = }\\ {{\mathit{\boldsymbol{P}}_0} - {\mathit{\boldsymbol{Q}}_0} + m \cdot \mathit{\boldsymbol{u}} - n \cdot \mathit{\boldsymbol{v}} = }\\ {{\mathit{\boldsymbol{w}}_0} + m \cdot \mathit{\boldsymbol{u}} - n \cdot \mathit{\boldsymbol{v}}} \end{array} $

(6)

Because the shortest segment is perpendicular to both lines of sight, and the dot product of perpendicular vectors is zero,

$ \left\{ {\begin{array}{*{20}{l}} {\mathit{\boldsymbol{w}} \cdot \mathit{\boldsymbol{u}} = 0}\\ {\mathit{\boldsymbol{w}} \cdot \mathit{\boldsymbol{v}} = 0} \end{array}} \right. $

(7)

Substituting Eq. (6) into Eq. (7) and rearranging gives

$ \left\{\begin{array}{l}{(\boldsymbol{u} \cdot \boldsymbol{u}) \cdot m-(\boldsymbol{v} \cdot \boldsymbol{u}) \cdot n=-\boldsymbol{w}_{0} \cdot \boldsymbol{u}} \\ {(\boldsymbol{u} \cdot \boldsymbol{v}) \cdot m-(\boldsymbol{v} \cdot \boldsymbol{v}) \cdot n=-\boldsymbol{w}_{0} \cdot \boldsymbol{v}}\end{array}\right. $

(8)

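Solving this linear system gives $m$ and $n$. Substituting them into Eq. (5) gives the 3D coordinates of ${\mathit{\boldsymbol{P}}_2}$ and ${\mathit{\boldsymbol{Q}}_2}$, and the midpoint of ${\mathit{\boldsymbol{P}}_2}$ and ${\mathit{\boldsymbol{Q}}_2}$ is the required point.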
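The derivation in Eqs. (4) to (8) can be sketched in a few lines of code. The function below is an illustrative implementation, not the authors' code: it solves the 2×2 system of Eq. (8) by Cramer's rule and returns the midpoint of $\boldsymbol{P}_{2}$ and $\boldsymbol{Q}_{2}$. The function name and the parallel-line tolerance are assumptions.

```python
def sub(a, b):
    """Component-wise difference of two 3D points."""
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(a[i] * b[i] for i in range(3))


def gaze_intersection(P0, P1, Q0, Q1):
    """Midpoint of the shortest segment between the two lines of sight.

    P0, Q0: left and right eyeball optical centers.
    P1, Q1: left and right 2D POGs lifted to 3D on the calibration plane.
    """
    u = sub(P1, P0)          # Eq. (4)
    v = sub(Q1, Q0)
    w0 = sub(P0, Q0)

    # Coefficients of Eq. (8), written as [[a11, a12], [a21, a22]] [m, n]^T = [b1, b2]^T
    a11, a12 = dot(u, u), -dot(v, u)
    a21, a22 = dot(u, v), -dot(v, v)
    b1, b2 = -dot(w0, u), -dot(w0, v)

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:     # assumed tolerance; parallel lines have no unique solution
        raise ValueError("lines of sight are parallel")

    m = (b1 * a22 - a12 * b2) / det   # Cramer's rule
    n = (a11 * b2 - a21 * b1) / det

    P2 = [P0[i] + m * u[i] for i in range(3)]   # Eq. (5)
    Q2 = [Q0[i] + n * v[i] for i in range(3)]
    return [(P2[i] + Q2[i]) / 2.0 for i in range(3)]
```

For example, with hypothetical eyes at (-32, 0, -300) and (32, 0, -300) and planar POGs at (-8, 0, 0) and (8, 0, 0), the two lines meet exactly and the function returns a point at depth 100 on the midline between the eyes.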

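1.2 Calibration of the 3D coordinates of the user's eyeball optical centers

The principle of calibrating the 3D coordinates of the eyeball optical centers is shown in Fig. 4. The upper-left corner of the calibration plane in its initial position is chosen as the origin of the 3D coordinate system.

Fig. 4 Calibration principle of the eyeball optical center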

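First, the calibration plane is placed at a distance $d_{1}$ in front of the user, who gazes in turn at several marked points on the plane, such as the two cross-shaped markers $M_{1}$ and $M_{2}$ in Fig. 4 (in practice more points should be used to improve calibration accuracy). The mapping function to the 2D POG on this plane is then computed by second-order polynomial mapping. Next, the calibration plane is moved to a distance $d_{1}+d_{2}$ in front of the user, and the user gazes at the marked points again. With the 2D mapping function from the previous step, the left- and right-eye planar POGs on the plane at distance $d_{1}$ can be computed for each marker, that is, ${P_{1{\rm{l}}}}$ and ${P_{{\rm{1r}}}}$ in Fig. 4. Here ${P_{1{\rm{l}}}}$ is the left-eye 2D POG computed on the plane at distance $d_{1}$ while the user gazes at marker $M_{1}^{\prime}$ on the moved plane, ${P_{{\rm{1r}}}}$ is the corresponding right-eye 2D POG, and the digit in the subscript is the index of the marker. Finally, each marker on the moved plane is connected to its corresponding left- and right-eye 2D POGs. These lines are the lines of sight of the left and right eyes, and the intersection of the lines belonging to each eye gives the 3D coordinates of that eye's optical center, that is, $L_{\mathrm{est}}$ and $R_{\mathrm{est}}$ in Fig. 4.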

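Because of errors in the 2D planar gaze estimation and in the calculation, the computed lines of sight do not intersect exactly at one point but converge near a point. Their equivalent intersection must therefore be found, that is, the 3D point whose total distance to all the lines is smallest. The equivalent intersection can be computed with the algorithm in Ref. [16].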
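The algorithm of Ref. [16] is not reproduced in this paper. As a stand-in, the sketch below uses a common closed-form variant that minimizes the sum of squared perpendicular distances to the lines (not the sum of distances), which leads to a 3×3 linear system. The function names and the choice of criterion are assumptions for illustration and may differ from the method of Ref. [16].

```python
import math


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def equivalent_intersection(lines):
    """Point minimizing the sum of squared distances to a set of 3D lines.

    lines: iterable of (point, direction) pairs; at least two lines must be
    non-parallel, otherwise the system is singular.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for point, direction in lines:
        norm = math.sqrt(sum(c * c for c in direction))
        d = [c / norm for c in direction]
        for i in range(3):
            for j in range(3):
                # Entry (i, j) of the projector I - d d^T onto the plane normal to the line
                proj = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += proj
                b[i] += proj * point[j]
    return solve_linear(A, b)
```

In the calibration described above, each line would be defined by a marker on the moved plane (as `point`) and the vector from that marker to the corresponding 2D POG on the first plane (as `direction`), with one call per eye.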

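1.3 Data filtering and $Z$-plane intercepting correction

Because of noise and measurement bias, the estimated 3D POG contains some error. The experiments show that the depth value ($Z$ coordinate) of the 3D POG jitters strongly around the target value, whereas the angular error of the 3D gaze direction is small. Based on this behavior, this paper corrects the raw 3D POG with depth-direction data filtering and a $Z$-plane intercepting correction. The depth values ($Z$ coordinates) of a sequence of 3D POG results are first filtered with a mean filter or a median filter, and the filtered depth is used as the $Z$ coordinate of the corrected 3D POG. The $X$ and $Y$ coordinates of the raw 3D POG are then corrected by the $Z$-plane intercepting method shown in Fig. 5.

Fig. 5 $Z$-plane intercepting correction method for the 3D POG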

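In Fig. 5, $L_{1}$ and $L_{2}$ are the left and right lines of sight, and $P_{0}$ is the target POG, which lies on the target plane (plane 0) at depth coordinate $z_{0}$. $P_{1}$ is the raw 3D POG obtained from the line-of-sight intersection, with depth coordinate $z_{1}$; its error in the $Z$ direction is $z_{0}-z_{1}$, and its Euclidean distance error is the length of segment $P_{0}P_{1}$. Filtering yields a new depth value $z_{2}$. Because mean and median filtering move jittering data points closer to the target value, $z_{2}$ is closer to $z_{0}$ than $z_{1}$ is. A virtual plane (plane 2) is then generated at this depth, shown as the darker rectangle in the enlarged detail of Fig. 5. This plane cuts the left and right lines of sight at two points ${P_{{\rm{l}}}}$ and ${P_{{\rm{r}}}}$, and their midpoint $P_{2}$ is taken as the corrected 3D POG. The corrected POG has a $Z$-direction error of $z_{0}-z_{2}$ and a Euclidean distance error equal to the length of segment $P_{0} P_{2}$. Comparison shows that $\left|P_{0} P_{2}\right| < \left|P_{0} P_{1}\right|$, so the error is reduced. The figure illustrates the case in which the raw 3D POG falls in front of the target; the correction works analogously when it falls behind.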
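The interception step can be sketched as follows, assuming each line of sight is given by an eyeball optical center and the corresponding 2D POG lifted to 3D, and that the filtered depth $z_{2}$ is already available. The function names are hypothetical.

```python
def point_on_line_at_z(eye, pog, z):
    """Point where the line from `eye` through `pog` crosses the plane Z = z."""
    dz = pog[2] - eye[2]
    if dz == 0:
        raise ValueError("line of sight is parallel to the Z plane")
    t = (z - eye[2]) / dz
    return [eye[i] + t * (pog[i] - eye[i]) for i in range(3)]


def z_plane_correction(left_eye, left_pog, right_eye, right_pog, z_filtered):
    """Corrected 3D POG: midpoint of the two intercepts on the plane Z = z_filtered."""
    p_l = point_on_line_at_z(left_eye, left_pog, z_filtered)    # P_l in Fig. 5
    p_r = point_on_line_at_z(right_eye, right_pog, z_filtered)  # P_r in Fig. 5
    return [(p_l[0] + p_r[0]) / 2.0, (p_l[1] + p_r[1]) / 2.0, z_filtered]
```

Because both intercepts lie on the same plane, the corrected point keeps the filtered depth exactly, and only its $X$ and $Y$ coordinates depend on the two lines of sight.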

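2 Experimental design and comparative results

2.1 Experimental setup and error evaluation

The experimental setup is shown in Fig. 6 and consists of three parts: a head support, an image acquisition device, and a calibration mechanism. The head support helps the subject keep the head still, which the experiment requires throughout testing. The height of the chin rest on the head support can be adjusted manually through a lead-screw mechanism to keep the subject comfortable. The image acquisition device is a USB camera with a ring of infrared LEDs around the lens and an infrared filter on the lens, and it captures infrared images of the subject's eyes. The camera is fixed in front of and below the head support so that it does not block the subject's view of the calibration board. The calibration board is mounted on a custom-built slide rail driven by a DC motor with an encoder and a belt mechanism. A microcontroller (STM32) with external buttons moves the board by specified distances so that the accuracy of 3D gaze estimation can be tested at different depths.

Fig. 6 Experimental setup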

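The structure of the experiment is shown in Fig. 7 and consists of two main parts, image acquisition and data processing. Image acquisition captures infrared eye images of the subject for calibration and 3D gaze testing; data processing comprises preprocessing and result computation. Eight subjects were tested in two workspaces of different sizes. The large workspace is 60 cm × 36 cm × 80 cm: the calibration board is placed at 50 cm, 70 cm, 90 cm, 110 cm, and 130 cm from the subject, and the calibration plane measures 60 cm × 36 cm (see the right panel of Fig. 6). The small workspace is 24 cm × 18 cm × 20 cm: the calibration board is placed at 30 cm, 35 cm, 40 cm, 45 cm, and 50 cm from the subject, and the calibration plane measures 24 cm × 18 cm. In both cases there are nine calibration points arranged on a 3 × 3 grid: the calibration plane is divided evenly into nine cells, and the center of each cell is a marked point. Because a person's line of sight normally points forward and slightly downward, the top of the calibration plane is set roughly level with the subject's horizontal line of sight. Following audio prompts, the subject gazes at each marked point in turn for about 2~3 s, and the camera automatically captures 20 eye images per point. Of the five working distances, the first is used to calibrate the 2D mapping function, the second and third are used to calibrate the 3D positions of the eyeball optical centers, and the data from all five are used for 3D gaze testing. To reduce computational cost and data redundancy, both calibrations use only the first 5 of the 20 images. In preprocessing, eye feature points are extracted in C++ with the OpenCV computer vision library; the mapping functions, the eyeball optical centers, and the data filtering are then computed in MATLAB; finally, 3D gaze estimation and error analysis are carried out, and the results of all subjects are aggregated.

Fig. 7 System structure of the experimental design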

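The error metrics are the Euclidean distance error and the angular error, each evaluated by the mean and standard deviation over multiple trials. The Euclidean distance error $E R R_{\mathrm{Euc}}$ is the straight-line distance between the target POG $\left(x_{a}, y_{a}, z_{a}\right)$ and the estimated 3D POG $\left(x_{\mathrm{e}}, y_{\mathrm{e}}, z_{\mathrm{e}}\right)$, that is,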

$ E R R_{\mathrm{Euc}}=\sqrt{\left(x_{a}-x_{\mathrm{e}}\right)^{2}+\left(y_{a}-y_{\mathrm{e}}\right)^{2}+\left(z_{a}-z_{\mathrm{e}}\right)^{2}} $

(9)

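The angular error $E R R_{\mathrm{Ang}}$ is the angle between the line from the midpoint of the two eyes to the target POG (vector $\mathit{\boldsymbol{a}}$) and the line from the midpoint of the two eyes to the estimated POG (vector $\mathit{\boldsymbol{b}}$), that is,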

$ E R R_{\mathrm{Ang}}=\arccos \left(\frac{\boldsymbol{a} \cdot \boldsymbol{b}}{|\boldsymbol{a}| \cdot|\boldsymbol{b}|}\right) $

(10)
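A minimal sketch of the two metrics in Eqs. (9) and (10) is given below. The function names are assumptions, and the result of Eq. (10) is converted to degrees to match the units used in the tables. The cosine is clamped to [-1, 1] to guard against rounding slightly outside the domain of `acos`.

```python
import math


def euclidean_error(target, estimate):
    """Eq. (9): straight-line distance between target and estimated POG."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(target, estimate)))


def angular_error_deg(target, estimate, left_eye, right_eye):
    """Eq. (10): angle at the midpoint of the two eyes, in degrees."""
    mid = [(l + r) / 2.0 for l, r in zip(left_eye, right_eye)]
    a = [t - m for t, m in zip(target, mid)]      # midpoint -> target POG
    b = [e - m for e, m in zip(estimate, mid)]    # midpoint -> estimated POG
    norm_a = math.sqrt(sum(c * c for c in a))
    norm_b = math.sqrt(sum(c * c for c in b))
    cos_angle = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_angle = max(-1.0, min(1.0, cos_angle))    # guard against rounding
    return math.degrees(math.acos(cos_angle))
```

An estimate that lies on the ray from the eye midpoint through the target has zero angular error however far it is from the target in depth, which is why the depth jitter described in Section 1.3 mainly affects the Euclidean metric.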

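2.2 Comparison experiment for eyeball optical center calibration

Taking subject $A$ as an example, manual measurement in the small workspace gives left and right eyeball optical centers of (88, 0, -300) and (152, 0, -300), with an interpupillary distance of 64 mm. The proposed calibration method was then used to calibrate the 3D coordinates of the optical centers, with the result shown in Fig. 8: the two black circles are the calibrated positions of the left and right optical centers, and the two black squares are the manually measured positions. The calibrated left and right optical centers are (97.2, 21.2, -289.5) and (165.4, 22.9, -269.9), with an interpupillary distance of 71 mm. As Fig. 8 shows, the optical center positions obtained by calibration are more accurate.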
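The two interpupillary distances quoted above can be reproduced from the coordinates as 3D Euclidean distances (in mm). The short check below uses only the values stated in the text; the variable names are illustrative.

```python
import math

# Optical centers of subject A in mm, as reported in the text
manual_left, manual_right = (88.0, 0.0, -300.0), (152.0, 0.0, -300.0)
calib_left, calib_right = (97.2, 21.2, -289.5), (165.4, 22.9, -269.9)

# 3D distance between the two optical centers
ipd_manual = math.dist(manual_left, manual_right)
ipd_calib = math.dist(calib_left, calib_right)

# Depth offset between the two calibrated optical centers
depth_offset = abs(calib_right[2] - calib_left[2])

print(round(ipd_manual, 1), round(ipd_calib, 1), round(depth_offset, 1))
```

The manual measurement places both eyes at the same depth and height, whereas the calibrated centers differ by about 19.6 mm in depth, so the calibrated distance is a true 3D distance and not only a horizontal offset.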

Fig. 8 Calibration result of the eyeball optical centers

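The 3D gaze estimation errors obtained with the optical center coordinates from the two approaches are listed in Table 1 and Table 2; none of these results has been filtered or corrected. With calibrated optical center coordinates, both the Euclidean distance error and the angular error decrease to varying degrees across the test depths, and the angular error decreases the most. The mean Euclidean distance error falls from 27.6 mm to 24.5 mm, and the mean angular error falls from 1.0° to 0.5°.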

Table 1 Error of 3D gaze estimation with eyeball optical center coordinates obtained by manual measurement

$Z$-axis depth/mm	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
300+0	2.6	3.9	11.1	12.5	0.4	2.6	0.1
300+50	3.2	6.4	17.0	19.2	0.7	4.6	0.2
300+100	4.4	10.5	25.4	28.8	1.0	7.8	0.2
300+150	5.6	12.9	30.7	35.3	1.5	10.2	0.4
300+200	8.3	16.0	35.7	42.4	1.6	11.0	0.2
Mean	4.8	9.9	24.0	27.6	1.0	7.2	0.2

Table 2 Error of 3D gaze estimation with eyeball optical center coordinates obtained by calibration

$Z$-axis depth/mm	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
300+0	2.4	3.1	9.6	10.9	0.5	2.4	0.1
300+50	2.9	4.0	14.4	15.7	0.5	4.8	0.1
300+100	3.0	4.8	19.9	21.4	0.5	5.2	0.2
300+150	4.8	5.6	32.2	33.6	0.6	12.7	0.3
300+200	6.1	5.5	39.3	40.9	0.6	15.3	0.2
Mean	3.8	4.6	23.1	24.5	0.5	8.1	0.2

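2.3 Comparison experiment for data filtering and $Z$-plane intercepting correction

Continuing with subject $A$, the $Z$ (depth) component of the 3D POG results was filtered in two ways. Fig. 9(a) to (i) show the filtering results for the nine calibration points at the second test depth ($Z$ = 50 mm). The red horizontal dash-dot line is the ideal value, the black polyline is the $Z$ coordinate of the 3D POG before filtering, the green circles are the mean-filter results, and the blue asterisks are the median-filter results. The mean filter uses a window of 5 points: the filtered value at a data point is the average of that point, the two points before it, and the two points after it. For the first two and last two points of the sequence, the filtered values of the third point and the third-to-last point are used, respectively. The median filter also uses a window of 5 points and works in the same way, except that the median of the 5 values is taken as the filtered value.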
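The two filters, including the edge rule described above, can be sketched as follows. The helper name `window_filter` is an assumption; the window width of 5 and the copying of the third and third-to-last filtered values to the sequence ends follow the description in the text.

```python
from statistics import mean, median


def window_filter(values, reducer, width=5):
    """Centered sliding-window filter with the edge rule described in the text.

    Interior points get reducer(window); the first and last width // 2 points
    copy the filtered value of the nearest interior point.
    """
    half = width // 2
    n = len(values)
    if n < width:
        raise ValueError("sequence is shorter than the filter window")
    out = [0.0] * n
    for i in range(half, n - half):
        out[i] = reducer(values[i - half:i + half + 1])
    for i in range(half):
        out[i] = out[half]                    # first points copy the third point
        out[n - 1 - i] = out[n - 1 - half]    # last points copy the third-to-last
    return out


def mean_filter(values):
    """5-point mean filter of a depth (Z) sequence."""
    return window_filter(values, mean)


def median_filter(values):
    """5-point median filter of a depth (Z) sequence."""
    return window_filter(values, median)
```

With 20 images per marker, each depth sequence has 20 samples, so 16 interior values are filtered directly and the remaining four are copied from their neighbors.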

Fig. 9 Results of the two filters

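Fig. 9 shows that the two filters differ little, with the mean filter performing slightly better overall. Both filters suppress the jitter, smooth the data, and bring it closer to the ideal value.

The $Z$-plane intercepting correction was then applied to the raw 3D POG results of subject $A$ in the small workspace, using the calibrated eyeball optical centers. Fig. 10 visualizes the change in error before and after correction for the nine marked points at the third depth (for each point, the 12th sample of its filtered sequence is used). The red and blue dots are the eyeball optical centers, the asterisks are the planar 2D POGs, the red and blue lines are the left and right lines of sight, and the intersections of the green grid lines are the target POGs. The circles at the ends of the short magenta dashed lines are the 3D POGs before correction, and the circles at the ends of the short black solid lines are the 3D POGs after correction. The distance error is clearly smaller after correction.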

Fig. 10 Result of the $Z$-plane intercepting correction method

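The results after correction with each of the two filters are listed in Table 3 and Table 4. Compared with Table 2, where neither depth-direction filtering nor the $Z$-plane intercepting correction is applied, the corrected 3D POG shows a further reduction in Euclidean distance error with both filters (median and mean, respectively): from 10.9 mm to 7.0 mm and 7.0 mm at a depth of 300 mm, from 21.4 mm to 16.5 mm and 15.1 mm at 400 mm, and from 40.9 mm to 34.3 mm and 35.1 mm at 500 mm.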

Table 3 Error after median filtering and $Z$-plane intercepting correction

$Z$-axis depth/mm	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
300+0	2.0	2.2	5.6	7.0	0.5	4.6	0.1
300+50	2.4	3.3	11.0	12.2	0.5	6.0	0.1
300+100	2.5	4.0	15.0	16.5	0.5	7.1	0.2
300+150	4.6	4.9	28.6	29.9	0.6	13.2	0.3
300+200	5.9	4.4	32.8	34.3	0.6	16.7	0.2
Mean	3.5	3.8	18.6	20.0	0.5	9.5	0.2

Table 4 Error after mean filtering and $Z$-plane intercepting correction

$Z$-axis depth/mm	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
300+0	1.9	2.1	5.8	7.0	0.5	4.5	0.1
300+50	2.3	3.0	9.6	10.9	0.5	6.8	0.1
300+100	2.3	3.8	13.6	15.1	0.5	8.1	0.2
300+150	4.3	4.5	27.0	28.1	0.6	13.8	0.3
300+200	5.8	4.4	33.9	35.1	0.6	16.4	0.2
Mean	3.3	3.6	18.0	19.2	0.5	9.9	0.2

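2.4 Final experimental results

3D gaze estimation was performed with the proposed eyeball optical center calibration and the depth-direction data filtering with $Z$-plane intercepting correction. Because the two filters differ little, the mean filter, which performs slightly better, was used for the final results. The results of the eight subjects were averaged; Table 5 and Table 6 give the error statistics for the small workspace (24 cm × 18 cm × 20 cm) and the large workspace (60 cm × 36 cm × 80 cm), respectively.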

Table 5 Average error in the small workspace

$Z$-axis depth/mm	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
300+0	2.2	2.9	6.2	7.8	0.5	4.6	0.2
300+50	2.9	3.3	9.4	11.0	0.6	5.9	0.2
300+100	3.6	4.1	14.3	16.2	0.7	7.5	0.2
300+150	4.8	5.0	21.2	23.2	0.7	11.8	0.3
300+200	6.2	6.6	28.3	30.9	0.8	15.1	0.2
Mean	3.9	4.4	15.9	17.8	0.7	9.0	0.2

Table 6 Average error in the large workspace

$Z$-axis depth/mm	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
500+0	8.6	9.1	22.0	26.6	0.6	19.6	0.2
500+200	15.6	16.0	58.8	65.0	0.9	41.6	0.3
500+400	23.6	22.3	106.8	114.1	1.0	63.2	0.4
500+600	28.3	29.0	164.6	172.6	1.1	75.7	0.4
500+800	30.6	31.5	201.1	208.7	1.2	114.7	0.4
Mean	21.3	21.6	110.7	117.4	1.0	63.0	0.3

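The results of the proposed method are compared with the methods of Hennessey et al.^[7] and Mlot et al.^[8]. Hennessey et al.^[7] marked eye movement information with multiple infrared light sources; their test space was 30 cm × 23 cm × 25 cm, with the calibration plane placed at 17.5 cm, 22.5 cm, 27.5 cm, 32.5 cm, 37.5 cm, and 42.5 cm from the subject. Mlot et al.^[8] calibrated the 2D gaze mapping function at 5 different distances; their test space was 20 cm × 20 cm × 20 cm, with the calibration plane placed at 20 cm, 25 cm, 30 cm, 35 cm, and 40 cm from the subject. To keep the experimental conditions consistent, only the data from the small workspace at depths of 300~400 mm are compared with the data of Refs. [7-8] at the corresponding distances. The results in Table 7 show that the proposed method gives clearly lower Euclidean distance and angular errors than the methods of Refs. [7-8].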

Table 7 Comparison of error results between the proposed method and the methods of Refs. [7-8]

Method	Mean $X$ error/mm	Mean $Y$ error/mm	Mean $Z$ error/mm	Mean Euclidean error/mm	Mean angular error/(°)	SD of Euclidean error/mm	SD of angular error/(°)
This paper	2.9	3.4	10.0	11.7	0.6	6.0	0.2
Ref. [7]	12.6	11.9	29.0	36.8	-	27.0	-
Ref. [8]	2.8	3.7	14.5	16.7	1.3	8.3	0.6
Note: bold indicates the best result; "-" indicates that the result is not reported in Ref. [7].

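3 Conclusion

To address the difficulty and inaccuracy of measuring the eyeball optical center coordinates in 3D gaze estimation, this paper determines the optical centers by calibration, which can replace manual measurement; compared with the experimental results of Mlot et al., the proposed method yields a smaller angular error. To address the strong jitter of the 3D POG, the proposed depth-direction data filtering and $Z$-plane intercepting correction refine the raw results and further reduce the Euclidean distance error. Compared with other 3D gaze estimation schemes, the proposed method locates the two eyes more conveniently and accurately, estimates the POG with higher accuracy, and supports a larger workspace. The method is of practical significance for applications of 3D gaze estimation.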

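The proposed method computes the 3D POG from planar POGs obtained with a 2D mapping function, so the subject's head must remain still. Future work will focus on dynamic calibration that accommodates head movement and on accurate 3D gaze estimation in larger workspaces.

References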

[1] Takagi K, Kawanaka H, Bhuiyan M S, et al. Estimation of a three-dimensional gaze point and the gaze target from the road images[C]//Proceedings of the 14th International IEEE Conference on Intelligent Transportation Systems. Washington, DC: IEEE, 2011: 526-531.[DOI: 10.1109/itsc.2011.6083129]

[2] Shimizu S, Fujiyoshi H. Acquisition of 3D gaze information from eyeball movements using inside-out camera[C]//Proceedings of the 2nd Augmented Human International Conference. Tokyo, Japan: ACM, 2011: 6.[DOI: 10.1145/1959826.1959832]

[3] Lidegaard M, Hansen D W, Krüger N. Head mounted device for point-of-gaze estimation in three dimensions[C]//Symposium on Eye Tracking Research and Applications. Safety Harbor, Florida: ACM, 2014: 83-86.[DOI: 10.1145/2578153.2578163]

[4] Abbott W W, Faisal A A. Ultra-low-cost 3D gaze estimation:an intuitive high information throughput compliment to direct brain-machine interfaces[J]. Journal of Neural Engineering, 2012, 9(4): 046016. [DOI:10.1088/1741-2560/9/4/046016]

[5] Mujahidin S, Wibirama S, Nugroho H A, et al. 3D gaze tracking in real world environment using orthographic projection[C]//Proceedings of 2016 Conference on Fundamental and Applied Science for Advanced Technology (ConFAST 2016). Yogyakarta, Indonesia: AIP Conference Proceedings, 2016, 1746(1): 020072.[DOI: 10.1063/1.4953997]

[6] Panev S, Manolova A. Improved multi-camera 3D eye tracking for human-computer interface[C]//Proceedings of the 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications. Warsaw, Poland: IEEE, 2015: 276-281.[DOI: 10.1109/idaacs.2015.7340743]

[7] Hennessey C, Lawrence P. Noncontact binocular eye-gaze tracking for point-of-gaze estimation in three dimensions[J]. IEEE Transactions on Biomedical Engineering, 2009, 56(3): 790–799. [DOI:10.1109/tbme.2008.2005943]

[8] Mlot E G, Bahmani H, Wahl S, et al. 3D gaze estimation using eye vergence[C]//Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies-Volume 5: HEALTHINF. Rome, Italy: Scitepress, 2016: 125-131.[DOI: 10.5220/0005821201250131]

[9] Wibirama S, Nugroho H A, Hamamoto K. Evaluating 3D gaze tracking in virtual space:a computer graphics approach[J]. Entertainment Computing, 2017, 21: 11–17. [DOI:10.1016/j.entcom.2017.04.003]

[10] Wang R I, Pelfrey B, Duchowski A T, et al. Online 3D gaze localization on stereoscopic displays[J]. ACM Transactions on Applied Perception (TAP), 2014, 11(1): 3. [DOI:10.1145/2593689]

[11] Liu C C, Herrup K, Shi B E. Remote gaze tracking system for 3D environments[C]//Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Seogwipo, South Korea: IEEE, 2017: 1768-1771.[DOI: 10.1109/embc.2017.8037186]

[12] Zhao X C, Pan S H, Wang Y P, et al. Eye gaze tracking in 3D immersive environments[J]. Journal of System Simulation, 2018, 30(6): 2027–2035. [赵新灿, 潘世豪, 王雅萍, 等. 沉浸式3维视线追踪算法研究[J]. 系统仿真学报, 2018, 30(6): 2027–2035. ] [DOI:10.16182/j.issn1004731x.joss.201806004]

[13] Li S P, Zhang X L, Webb J D. 3-D-gaze-based robotic grasping through mimicking human visuomotor function for people with motion impairments[J]. IEEE Transactions on Biomedical Engineering, 2017, 64(12): 2824–2835. [DOI:10.1109/tbme.2017.2677902]

[14] Pichitwong W, Chamnongthai K. 3-D gaze estimation by stereo gaze direction[C]//Proceedings of the 13th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology. Chiang Mai, Thailand: IEEE, 2016: 1-4.[DOI: 10.1109/ecticon.2016.7561491]

[15] Leroux M, Achiche S, Raison M. Assessment of accuracy for target detection in 3D-space using eye tracking and computer vision[J]. PeerJ Preprints, 2017, 5: e2718v1. [DOI:10.7287/peerj.preprints.2718]

[16] Zhang Y H, Wei W, Yu D, et al. Shadow based single camera vision system calibration[J]. Journal of Image and Graphics, 2009, 14(9): 1895–1899. [张远辉, 韦巍, 虞旦, 等. 基于影子的乒乓球机器人单目视觉系统标定[J]. 中国图象图形学报, 2009, 14(9): 1895–1899. ] [DOI:10.11834/jig.20090929]