Contour perception based on a multipath convolutional neural network

Tan Ming-ming, Fan Ying-le (Hangzhou Dianzi University)

Abstract
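Objective Drawing on the global and local processing mechanisms of visual information flow, this paper proposes a new contour perception method based on a multipath convolutional neural network. Method Gaussian pyramid scale decomposition yields a low-resolution sub-image that represents the global contour in the visual information. A two-dimensional Gaussian derivative function simulates the orientation selectivity of the classical receptive field and yields boundary response sub-images that describe detail features. A multipath convolutional neural network is then constructed: a sub-network with sparse coding characteristics (Sparse Coded Sub-network, Sparse-Net) rapidly detects the global contour, and a sub-network with redundancy-enhanced coding characteristics (Redundancy Enhanced Coding Sub-network, Redundancy-Net) extracts local detail features. The responses of the paths are fused and encoded so that global perception and local detection of the contour response are combined into a refined contour perception result. Result Experiments use the BSDS500 dataset provided by the Berkeley Computer Vision Group. On a GTX1080Ti, Sparse-Net detects global contours at 42 images/s, 35 times the 1.2 images/s of the HFL method. After Sparse-Net and Redundancy-Net are fused, the optimal dataset scale (ODS), optimal image scale (OIS), and average precision (AP) scores are 0.806, 0.824, and 0.846 respectively, better than those of Holistically-nested Edge Detection (HED) and Richer Convolutional Features for Edge Detection (RCF). The results show that the proposed method effectively highlights the main contours and suppresses textured backgrounds. Conclusion Applying multipath convolutional neural networks to contour perception helps further the understanding of visual perception mechanisms and is significant for reducing the black-box nature of convolutional neural networks.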
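The Gaussian pyramid scale decomposition named in the Method can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the kernel or the number of levels, so the 5-tap binomial kernel, the edge replication, and the function names `blur_1d`, `pyramid_down`, and `gaussian_pyramid` are all assumptions.

```python
def blur_1d(row, kernel):
    """Filter a 1-D sequence with a symmetric kernel, replicating edge pixels."""
    r = len(kernel) // 2
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)  # clamp index at the borders
            acc += w * row[j]
        out.append(acc)
    return out


def pyramid_down(img, kernel=(1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)):
    """One pyramid step: separable low-pass blur, then keep every second pixel.

    The binomial kernel is an assumed approximation of a Gaussian.
    """
    rows = [blur_1d(r, kernel) for r in img]               # horizontal pass
    cols = [blur_1d(list(c), kernel) for c in zip(*rows)]  # vertical pass
    blurred = [list(r) for r in zip(*cols)]                # transpose back
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(img, levels):
    """Return [img, level 1, ..., level `levels`], each half the previous size."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(pyramid_down(pyramid[-1]))
    return pyramid
```

For an 8 x 8 input and two levels, the sketch returns images of size 8 x 8, 4 x 4, and 2 x 2; the coarsest level plays the role of the low-resolution sub-image that carries the global contour.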
Keywords
Contour perception based on multipath convolution neural network

Tan Ming-ming, Fan Ying-le (Hangzhou Dianzi University)

Abstract
Objective To introduce the global and local processing mechanisms of visual information flow, a visual information encoding and decoding model is constructed from the correlation between visual neural coding and contour perception, and a contour perception method based on a multi-path convolutional neural network is proposed. Method Gaussian pyramid scale decomposition is used to obtain low-resolution sub-images that characterize the global contour of the visual information. A two-dimensional Gaussian derivative is used to simulate the orientation selectivity of classical receptive fields and obtain boundary response sub-images that describe details. A multi-path convolutional neural network is constructed, in which a sparse coding sub-network (Sparse-Net) performs fast detection of the global contour and a redundancy-enhanced coding sub-network (Redundancy-Net) extracts local details. The responses of the multi-path network are then fused and encoded to integrate global perception and local detection of the contour response and obtain a refined contour perception result. Result With the BSDS500 image database provided by the Berkeley Computer Vision Group as the experimental object, the detection speed of Sparse-Net on a GTX1080Ti reaches 42 images/s, 35 times the 1.2 images/s of the HFL method. After Sparse-Net and Redundancy-Net are fused, the optimal dataset scale (ODS), optimal image scale (OIS), and average precision (AP) scores are 0.806, 0.824, and 0.846 respectively, better than those of the Holistically-nested Edge Detection (HED) and Richer Convolutional Features for Edge Detection (RCF) methods, which analyze side-output feature maps and use progressive encoding, decoding, and feature fusion from the shallow to the deep layers of the network to learn fine contour features and achieve end-to-end contour detection.
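The difference between the ODS and OIS scores reported above can be illustrated with a small sketch. It is a simplification under stated assumptions: each image is summarized by one (true positive, false positive, false negative) count per threshold, thresholds are not interpolated, and the names `f_measure` and `ods_ois` are hypothetical. The BSDS benchmark itself tracks separate match counts for precision and recall.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall computed from match counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ods_ois(counts):
    """counts[i][t] = (tp, fp, fn) for image i at threshold index t."""
    n_thresholds = len(counts[0])
    # ODS: a single threshold is shared by the whole dataset, so counts are
    # pooled over images per threshold and the best pooled F-measure is kept.
    ods = max(
        f_measure(*[sum(img[t][k] for img in counts) for k in range(3)])
        for t in range(n_thresholds)
    )
    # OIS: every image picks its own best threshold, then counts are pooled.
    best = [max(img, key=lambda c: f_measure(*c)) for img in counts]
    ois = f_measure(*[sum(c[k] for c in best) for k in range(3)])
    return ods, ois


# Toy data: two images, two thresholds each (made-up counts).
toy = [
    [(8, 2, 2), (5, 0, 5)],
    [(4, 6, 1), (4, 1, 1)],
]
ods, ois = ods_ois(toy)
```

In the toy data the two images prefer different thresholds, so the per-image choice (OIS) scores higher than the single shared threshold (ODS), the same ordering as the reported 0.824 versus 0.806.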
The results show that the proposed method not only effectively highlights the main contour and suppresses the textured background, but also improves the efficiency of contour detection. Conclusion Although a convolutional neural network can be explained by visual mechanisms in some respects (for example, the convolution operation corresponds to the topological mapping of retinal visual information, and the pooling operation is related to the complex and simple cells in the visual pathway), it remains on the whole a black-box model that depends heavily on massive samples. The actual visual pathway is not a simple serial transmission of information but a fusion of local and global characteristics of multi-channel visual information flow in the visual cortex. Accordingly, a Gaussian pyramid decomposition model is constructed to realize sparse coding of the spatial scale of visual information and obtain low-resolution sub-images that represent the global characteristics, and lateral inhibition by non-classical receptive fields in the lateral geniculate region is then used to achieve isotropic suppression of background information. Considering the information-processing ability of the primary visual cortex in the optic radiation region, a classical receptive field with orientation selectivity is set up, and a two-dimensional Gaussian derivative model is constructed to process the visual information by orientation, yielding the boundary response sub-images that represent local features.
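The orientation-selective classical receptive field modeled with a two-dimensional Gaussian derivative can be sketched as a small filter bank. This is an illustrative sketch only: the abstract gives neither the scale nor the number of orientations, so `sigma=1.0`, the eight orientations, the 3-sigma kernel radius, and all function names are assumptions. The kernel is the first derivative of a Gaussian taken along direction theta, that is, -(x cos(theta) + y sin(theta)) / sigma^2 times the Gaussian.

```python
import math


def gaussian_derivative_kernel(sigma, theta, radius=None):
    """First derivative of a 2-D Gaussian along the direction theta (radians)."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # assumed truncation at 3 sigma
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            g = math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            row.append(-(x * c + y * s) / sigma ** 2 * g)
        kernel.append(row)
    return kernel


def response_at(img, kernel, cy, cx):
    """Correlate the kernel with the image patch centred on (cy, cx)."""
    r = len(kernel) // 2
    return sum(
        kernel[dy + r][dx + r] * img[cy + dy][cx + dx]
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
    )


def dominant_orientation(img, cy, cx, sigma=1.0, n_orient=8):
    """Return the derivative direction with the strongest absolute response."""
    thetas = [k * math.pi / n_orient for k in range(n_orient)]
    responses = [
        abs(response_at(img, gaussian_derivative_kernel(sigma, t), cy, cx))
        for t in thetas
    ]
    best = max(range(n_orient), key=responses.__getitem__)
    return thetas[best], responses


# A vertical step edge: intensity changes along x only.
step = [[1.0 if x >= 5 else 0.0 for x in range(9)] for _ in range(9)]
theta, responses = dominant_orientation(step, 4, 4)
```

On the vertical step edge the strongest response comes from the derivative taken along x (theta = 0), and the derivative along y responds with essentially zero, which is the orientation selectivity the boundary response sub-images rely on.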
Considering the layer-by-layer perception of the local details and global information of external stimuli in the primary and higher visual cortex, a multi-path convolutional neural network is constructed. The fast detection path consists of the sub-network Sparse-Net, which contains pooling units and realizes sparse coding of the global image contour; the detail detection path consists of the sub-network Redundancy-Net, which contains dilated convolution units and realizes redundancy-enhanced coding of local image details. Finally, the feedback and fusion that the higher visual cortex applies to the visual information flow are simulated: the responses of the multi-path network are fused and encoded to combine global perception and local detection of the contour response, and the refined contour perception result is obtained. Contour perception based on a multi-path convolutional neural network helps further the understanding of visual perception mechanisms and is of great significance for weakening the black-box characteristics of convolutional neural networks. Taking the perception of subject contours in natural scene images with complex textured backgrounds as an example, simulating the neural coding mechanism by which multiple paths cooperate in the primary visual pathway helps clarify the intrinsic mechanisms of the visual system and their specific application in visual perception, and offers a new idea for subsequent image understanding and analysis based on visual mechanisms.
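The contrast between the two paths, pooling units in Sparse-Net and dilated convolution units (also called atrous or "hole" convolution) in Redundancy-Net, can be illustrated in one dimension. The sketch below is not the paper's architecture: the layer sizes, the odd kernel length, the zero padding, and the function names are assumptions chosen to show only that pooling discards resolution while dilation widens the receptive field at full resolution.

```python
def max_pool_1d(x, size=2, stride=2):
    """Max pooling: the output is shorter than the input (a sparser code)."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


def dilated_conv_1d(x, weights, dilation=1):
    """Same-length correlation with zero padding; taps sit `dilation` apart.

    Assumes an odd number of weights so the padding is symmetric.
    """
    k = len(weights)
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(w * padded[i + j * dilation] for j, w in enumerate(weights))
        for i in range(len(x))
    ]


def receptive_field(kernel_size, dilation):
    """Input samples spanned by one output of a single dilated layer."""
    return dilation * (kernel_size - 1) + 1


signal = [float(v) for v in range(8)]
pooled = max_pool_1d(signal)                              # length 4
dense = dilated_conv_1d(signal, [1.0, 1.0, 1.0], dilation=2)  # length 8
```

Pooling halves the eight samples to four, whereas the three-tap convolution with dilation 2 keeps all eight outputs and each output spans five input samples, which is one way to read the "sparse" versus "redundancy-enhanced" coding of the two sub-networks.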
Keywords