A Contour Detection Method Combining Dual Visual Pathways and Scale Information Fusion

Du Shirong1, Fan Yingle2, Cai Zhefei2, Fang Tao1 (1. Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou / School of Electronic Information, Hangzhou Dianzi University, Hangzhou; 2. Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou)

Abstract
Objective Considering that image information is represented at multiple scales along the visual pathway, and to accomplish contour detection for natural-scene images with diverse contrast distributions, this paper proposes a new contour detection method based on the integration of scale information across the dual visual pathways. Method First, parallel M and P pathways sensitive to luminance information and to color information, respectively, are constructed; receptive fields of different scales simulate the coarse and fine perception of stimuli by ganglion cells, and luminance-contrast and color-difference information guides the adaptive fusion of the responses at different scales so that luminance contours and color contours are fully extracted. Next, multi-scale orientation-difference encoding in the lateral geniculate nucleus (LGN) is combined with multi-scale orientation-selective inhibition to build a salient-contour extraction model that enhances contour regions and suppresses background texture. Finally, the processed luminance and color contours are fed forward to area V1, where a dual-channel response weight-adjustment model integrates the information from the M and P pathways to further enrich the contours. Result The proposed algorithm is validated on the BSDS500 and NYUD image datasets; on the BSDS500 dataset the average optimal P index reaches 0.74, an improvement of 4%-13% over biologically inspired detection methods such as SCO, BAR, and SED, and the resulting contour maps are more continuous and accurate. Conclusion By exploiting the M/P dual-pathway mechanism and the encoding of luminance and color information in the front-end visual pathway to process and extract contour information, the method effectively detects contours in natural images, particularly subtle contour edges, and offers a new perspective for studying visual information mechanisms in higher-level cortical areas.
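The scale-adaptive receptive-field fusion described above can be illustrated with a minimal sketch, assuming difference-of-Gaussians (DoG) receptive fields and a local standard-deviation contrast measure; the scale values, window size, and normalized-contrast weighting below are illustrative assumptions rather than the exact formulation used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, uniform_filter

def dog_response(channel, sigma, ratio=1.6):
    """Difference-of-Gaussians response approximating a ganglion-cell receptive field."""
    return gaussian_filter(channel, sigma) - gaussian_filter(channel, ratio * sigma)

def local_contrast(channel, win=7):
    """Local standard deviation as a simple luminance/color contrast measure."""
    m = uniform_filter(channel, win)
    m2 = uniform_filter(channel * channel, win)
    return np.sqrt(np.maximum(m2 - m * m, 0.0))

def adaptive_scale_fusion(channel, sigma_fine=1.0, sigma_coarse=3.0, win=7):
    """Blend fine- and coarse-scale responses; higher local contrast favors the fine scale."""
    fine = dog_response(channel, sigma_fine)
    coarse = dog_response(channel, sigma_coarse)
    w = local_contrast(channel, win)
    w = w / (w.max() + 1e-8)          # normalized contrast as the per-pixel fusion weight
    return w * fine + (1.0 - w) * coarse
```

Applied separately to the L (luminance) channel and to the a/b chromatic channels of a Lab image, such a routine would yield the luminance- and color-contour responses that the later stages operate on.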
Keywords
A contour detection method combining dual visual pathways and scale information fusion

(Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou)

Abstract
Objective Contour information is a low-level visual feature of the target object, and its extraction and use support higher-level visual tasks such as object detection and image segmentation. For complex images, contour detection based on biological vision mechanisms can extract object contours quickly. However, existing methods perceive primary contour information with a single-scale receptive field template or a simple fusion of multi-scale templates, ignoring the dynamic nature of the receptive field scale and making it difficult to extract contours accurately in complex scenes. Considering the serial and parallel transmission and integration of visual information in the M and P dual visual pathways, we propose a new contour detection method based on the fusion of scale information across the dual visual pathways.

Method First, we introduce the Lab color space, which is close to the physiological characteristics of human vision, to extract color-difference and luminance information from the image; compared with the conventional RGB color space, it better matches how the human eye perceives visual information. Because the receptive field scale of ganglion cells varies with the size of local stimuli to meet the demands of different visual tasks and scenes, and a smaller receptive field perceives detail more finely, two receptive fields of different scales simulate the coarse and fine perception of stimuli by ganglion cells, and color-difference and luminance-contrast information guides the adaptive fusion of the large- and small-scale responses to highlight contour details. Second, considering that receptive fields of different scales in the lateral geniculate nucleus (LGN) perceive orientation information differently, the standard deviation of the optimal orientations perceived at multiple scales is introduced as the encoding weight of the orientation difference, modulating the suppression weights of texture regions. Local contrast information then guides the lateral inhibition strength of the non-classical receptive field according to the orientation difference between center and surround; through the cooperative integration of the two, contour regions are enhanced and background textures are suppressed. Finally, to simulate the complementary fusion of color and luminance information in area V1, a weight-association model integrating contrast information is proposed: fusion weight coefficients derived from local color contrast and luminance contrast drive the complementary fusion of the M- and P-pathway information flows, enriching contour details.
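A hedged sketch of the orientation-difference-guided surround suppression described above: edge energy and preferred orientation are estimated with Gaussian derivative filters, the surround orientation is averaged in a double-angle representation, and suppression is strongest where center and surround orientations agree (texture) and weakest where they differ (isolated contours). The Gaussian surround used here in place of an annular non-classical receptive field, and the filter and gain parameters, are illustrative assumptions, not the paper's exact model.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def oriented_energy_and_angle(gray, sigma=1.5):
    """Edge energy and orientation from first-order Gaussian derivatives."""
    gx = gaussian_filter(gray, sigma, order=(0, 1))   # derivative along x (columns)
    gy = gaussian_filter(gray, sigma, order=(1, 0))   # derivative along y (rows)
    return np.hypot(gx, gy), np.arctan2(gy, gx)

def surround_suppression(energy, angle, sigma_s=6.0, alpha=1.0):
    """Subtract surround energy, weighted by center/surround orientation similarity."""
    # Energy-weighted average surround orientation via the double-angle representation.
    cs = gaussian_filter(np.cos(2.0 * angle) * energy, sigma_s)
    sn = gaussian_filter(np.sin(2.0 * angle) * energy, sigma_s)
    surround_angle = 0.5 * np.arctan2(sn, cs)
    surround_energy = gaussian_filter(energy, sigma_s)
    # Similar orientations -> strong (iso-orientation) inhibition; dissimilar -> weak.
    similarity = 0.5 * (np.cos(2.0 * (angle - surround_angle)) + 1.0)
    return np.maximum(energy - alpha * similarity * surround_energy, 0.0)
```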
Result We compare our model with three methods based on biological vision mechanisms and one deep-learning model, namely SCSI, SED, BAR, and PiDiNet. On the BSDS500 dataset, quantitative evaluation uses the ODS, OIS, and AP indices together with PR curves, and five images are selected to compare the detection performance of each method. The experimental results show that our model performs better overall than the other models. Compared with SCSI, SED, and BAR, the ODS index (higher is better) increases by 4.45%, 2.94%, and 4.45%, respectively; the OIS index (higher is better) increases by 2.82%, 5.80%, and 8.96%, respectively; and the AP index (higher is better) increases by 7.25%, 4.23%, and 5.71%, respectively. Compared with the deep-learning-based PiDiNet model, our model falls somewhat short on these indices, but it requires no pre-training on data, is biologically interpretable, and has a small overall computational cost. Four further images from the NYUD dataset are used to compare the detection performance of each model, with visual comparison in terms of false detection rate, missed detection rate, and an overall performance index. In addition, a series of ablation experiments clearly demonstrates the contribution of each module to the overall performance.

Conclusion In this paper, the M/P dual-pathway mechanism and the encoding of luminance and color information in the front-end visual pathway are used to process and extract contour information. The method performs contour detection on natural images effectively, especially for subtle contour edges, and provides a new perspective for studying visual information mechanisms in the higher-level cortex.
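The final M/P fusion step described in the Method can likewise be sketched under simple assumptions: per-pixel weights derived from local luminance contrast and local color contrast decide which pathway's contour response dominates at each location. The contrast measure and the normalized weighting below are illustrative, not the weight-association model itself.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_std(x, win=7):
    """Local standard deviation used as a contrast estimate."""
    m = uniform_filter(x, win)
    m2 = uniform_filter(x * x, win)
    return np.sqrt(np.maximum(m2 - m * m, 0.0))

def fuse_pathways(lum_contour, col_contour, lum_channel, col_channel, win=7, eps=1e-8):
    """Contrast-weighted complementary fusion of M- (luminance) and P- (color) pathway contours."""
    c_lum = local_std(lum_channel, win)
    c_col = local_std(col_channel, win)
    w_lum = c_lum / (c_lum + c_col + eps)   # higher luminance contrast -> trust the M pathway more
    return w_lum * lum_contour + (1.0 - w_lum) * col_contour
```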
Keywords
