A literature review for neural networks-based encoding models of biological visual system

Yajing Zheng; Zhaofei Yu; Tiejun Huang

doi:10.11834/jig.220461

Brain-inspired vision


2023, Vol. 28, No. 2, pp. 335-357
Received: 2022-05-23
Revised: 2022-10-12
Accepted: 2022-10-19
Published in print: 2023-02-16

Yajing Zheng, Zhaofei Yu, Tiejun Huang. A literature review for neural networks-based encoding models of biological visual system[J]. Journal of Image and Graphics, 2023, 28(2): 335-357. DOI: 10.11834/jig.220461.

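Summary

Research on the biological visual system has long been an important source of inspiration for computer vision algorithms. Many computer vision algorithms correspond to biological vision research to varying degrees, ranging from purely functional inspiration to physical models used to explain biological observations. The classic idea conveyed from visual neuroscience to the computer vision community is the hierarchical processing structure of the visual cortex, and this hierarchical structure is precisely what inspired the design of artificial neural networks. Deep neural networks now dominate fields such as computer vision and machine learning, and many neuroscientists have begun to apply them to computational modeling of the biological visual system. Their multi-layer structure, combined with training by error backpropagation, allows them to fit the vast majority of functions. As a result, deep neural networks not only learn the mapping between visual stimuli and neuronal responses, yielding the best-performing models to date, but their internal units even learn representations of the subunits of the biological visual system. This paper introduces neural network-based encoding models of the visual system in two parts: the early stages of the visual system, such as the retina, and the higher visual cortex, such as visual area 4 (V4) and the inferior temporal (IT) cortex. The main contents are: 1) concepts and definitions of visual system models; 2) neural network prediction models of the primary visual system; 3) task-driven encoding models of the higher visual cortex. Finally, the paper introduces the latest neural encoding models based on unsupervised learning and discusses the technical challenges and possible future directions of neural network-based encoding models of the visual system.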

Abstract

The biological visual system, an important part of the brain's nervous system, has evolved over hundreds of millions of years. About 70% of the information that humans obtain from the outside world comes from vision. Its complex functions depend on the visual pathways and the visual cortex and on the mechanisms by which they operate. In tasks such as real-time sensor data processing, perception and motion control, human vision remains more capable and more energy-efficient than machine vision systems. Learning effectively from the ingenious design of the biological visual system, in order to realize a more advanced machine vision paradigm, is still a challenge.

Research on the biological visual system has long been a key source of inspiration for computer vision algorithms. The classic idea that visual neuroscience has conveyed to the computer vision community is the hierarchical processing structure of the visual cortex, and artificial neural networks (ANNs) were in turn inspired by this hierarchical design. The visual system is mainly composed of the eyes (retina), the lateral geniculate nucleus and the visual cortex (including the primary visual cortex and the extrastriate cortex). The human visual cortex and related areas account for about 1/3 of the area of the cerebral cortex. They extract, process and integrate visual information and are involved in higher brain functions such as learning, memory, decision-making and emotion. In object recognition, for example, the human brain can effectively identify thousands of objects, whereas this remains a challenging problem for machines.

In recent years, deep neural networks (DNNs) have come to dominate computer vision and machine learning. Their multi-layer structure, combined with back-propagation training, allows them to fit a wide range of functions. The function of the biological visual system can be regarded as learning a mapping between external visual information and internal neuronal representations, and the multi-layer design of neural networks is itself derived from the biological visual system. DNNs are currently the most accurate models of the mapping between visual stimuli and neuronal responses, and their internal units can even learn representations of the subunits of the visual system. Hierarchical DNNs can also predict neural responses in the visual cortex (e.g., V2 and the inferior temporal cortex). Furthermore, recent unsupervised learning methods have been applied to modeling the visual cortex. On the way toward artificial general intelligence (AGI), research on ANNs and the exploration of brain function and structure can benefit each other.

This review focuses on neural network-based encoding models of the visual system, covering its early stages, such as the retina, and the higher visual cortex (e.g., the IT area). The main contents are: 1) concepts and definitions of visual system models; 2) neural network prediction models of the primary visual system; and 3) goal-driven encoding models of the higher visual cortex. The latest research on unsupervised learning is also reviewed, and the technical challenges and future development directions of neural network-based encoding models are discussed.

Keywords

references

Antolík J, Hofer S B, Bednar J A and Mrsic-Flogel T D. 2016. Model constrained by visual hierarchy improves prediction of neural responses to natural scenes. PLoS Computational Biology, 12(6): #e1004927 [DOI: 10.1371/journal.pcbi.1004927]

Bakhtiari S, Mineault P, Lillicrap T, Pack C C and Richards B A. 2021. The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning. Advances in Neural Information Processing Systems, 34, 25164-25178.

Bashivan P, Kar K and DiCarlo J J. 2019. Neural population control via deep image synthesis. Science, 364(6439): #eaav9436 [DOI: 10.1126/science.aav9436]

Batty E, Merel J, Brackbill N, Heitman A, Sher A, Litke A, Chichilnisky E J and Paninski L. 2017. Multilayer recurrent network models of primate retinal ganglion cell responses//Proceedings of the 5th International Conference on Learning Representations. Toulon, France: [s. n.].

Bengio Y. 2009. Learning Deep Architectures for AI. Hanover, USA: Now Publishers Inc.

Bock D D, Lee W C A, Kerlin A M, Andermann M L, Hood G, Wetzel A W, Yurgenson S, Soucy E R, Kim H S and Reid R C. 2011. Network anatomy and in vivo physiology of visual cortical neurons. Nature, 471(7337): 177-182 [DOI: 10.1038/nature09802]

Boon M and Knuuttila T. 2009. Models as epistemic tools in engineering sciences//Meijers A, ed. Philosophy of Technology and Engineering Sciences. Amsterdam: Elsevier: 693-726 [ DOI: 10.1016/B978-0-444-51667-1.50030-6 http://dx.doi.org/10.1016/B978-0-444-51667-1.50030-6 ]

Boussaoud D, Desimone R and Ungerleider L G. 1991. Visual topography of area TEO in the macaque. Journal of Comparative Neurology, 306(4): 554-575 [DOI: 10.1002/cne.903060403]

Briggs F and Usrey W M. 2008. Emerging views of corticothalamic function. Current Opinion in Neurobiology, 18(4): 403-407 [DOI: 10.1016/j.conb.2008.09.002]

Burian R M. 1997. Exploratory experimentation and the role of histochemical techniques in the work of Jean Brachet, 1938-1952. History and Philosophy of the Life Sciences, 19(1): 27-45

Cadena S A, Denfield G H, Walker E Y, Gatys L A, Tolias A S, Bethge M and Ecker A S. 2019. Deep convolutional models improve predictions of macaque V1 responses to natural images. PLoS Computational Biology, 15(4): #e1006897 [DOI: 10.1371/journal.pcbi.1006897]

Carandini M, Demb J B, Mante V, Tolhurst D J, Dan Y, Olshausen B A, Gallant J L and Rust N C. 2005. Do we know what the early visual system does? Journal of Neuroscience, 25(46): 10577-10597 [DOI: 10.1523/JNEUROSCI.3726-05.2005]

Chaudhuri R, Knoblauch K, Gariel M A, Kennedy H and Wang X J. 2015. A large-scale circuit mechanism for hierarchical dynamical processing in the primate cortex. Neuron, 88(2): 419-431 [DOI: 10.1016/j.neuron.2015.09.008]

Chen T, Kornblith S, Norouzi M and Hinton G. 2020a. A simple framework for contrastive learning of visual representations//Proceedings of the 37th International Conference on Machine Learning. Vienna, Austria: JMLR. org: 1597-1607

Chen X L, Fan H Q, Girshick R and He K M. 2020b. Improved baselines with momentum contrastive learning[EB/OL]. [2020-03-09] . https://arxiv.org/pdf/2003.04297.pdf https://arxiv.org/pdf/2003.04297.pdf

Chichilnisky E J. 2001. A simple white noise analysis of neuronal light responses. Network, 12(2): 199-213 [DOI: 10.1080/713663221]

Choksi B, Mozafari M, VanRullen R and Reddy L. 2021. Multimodal neural networks better explain multivoxel patterns in the hippocampus[EB/OL]. [2021-12-11] . https://arxiv.org/pdf/2201.11517.pdf https://arxiv.org/pdf/2201.11517.pdf

Cichy R M and Kaiser D. 2019. Deep neural networks as scientific models. Trends in Cognitive Sciences, 23(4): 305-317 [DOI: 10.1016/j.tics.2019.01.009]

Cichy R M, Khosla A, Pantazis D, Torralba A and Oliva A. 2016. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence. Scientific Reports, 6(1): #27755 [DOI: 10.1038/srep27755]

Conwell C, Mayo D, Buice M A, Katz B, Alvarez G A and Barbu A. 2021. Neural regression, representational similarity, model zoology and neural taskonomy at scale in rodent visual cortex. Advances in Neural Information Processing Systems, 34, 5590-5607.

Cox D D. 2014. Do we understand high-level vision? Current Opinion in Neurobiology, 25: 187-193 [DOI: 10.1016/j.conb.2014.01.016]

Cudeiro J and Sillito A M. 2006. Looking back: corticothalamic feedback and early visual processing. Trends in Neurosciences, 29(6): 298-306 [DOI: 10.1016/j.tins.2006.05.002]

DiCarlo J J, Zoccolan D and Rust N C. 2012. How does the brain solve visual object recognition? Neuron, 73(3): 415-434 [DOI: 10.1016/j.neuron.2012.01.010]

Eichner H, Klug T and Borst A. 2009. Neural simulations on multi-core architectures. Frontiers in Neuroinformatics, 3: #21 [DOI: 10.3389/neuro.11.021.2009]

Elman J L. 1990. Finding structure in time. Cognitive Science, 14(2): 179-211 [DOI: 10.1207/s15516709cog1402_1]

Enroth-Cugell C and Robson J G. 1984. Functional characteristics and diversity of cat retinal ganglion cells. Basic characteristics and quantitative description. Investigative Ophthalmology and Visual Science, 25(3): 250-267

Feest U. 2012. Exploratory experiments, concept formation, and theory construction in psychology//Feest U and Steinle F, eds. Scientific Concepts and Investigative Practice. Berlin: Walter de Gruyter, 3: 167-190 [DOI: 10.1515/9783110253610.167]

Felleman D J and van Essen D C. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1): 1-47 [DOI: 10.1093/CERCOR/1.1.1-A]

Field G D and Chichilnisky E J. 2007. Information processing in the primate retina: circuitry and coding. Annual Review of Neuroscience, 30: 1-30 [DOI: 10.1146/annurev.neuro.30.051606.094252]

Fukushima K and Miyake S. 1982. Neocognitron: a self-organizing neural network model for a mechanism of visual pattern recognition//Competition and Cooperation in Neural Nets. Kyoto, Japan: Springer: 267-285 [ DOI: 10.1007/978-3-642-46466-9_18 http://dx.doi.org/10.1007/978-3-642-46466-9_18 ]

Geirhos R, Narayanappa K, Mitzkus B, Thieringer T, Bethge M, Wichmann F A and Brendel W. 2021. Partial success in closing the gap between human and machine vision. Advances in Neural Information Processing Systems, 34, 23885-23899

Gelfert A. 2016. Exploratory uses of scientific models//Gelfert A, ed. How to Do Science with Models. Cham: Springer: 71-99 [ DOI: 10.1007/978-3-319-27954-1_4 http://dx.doi.org/10.1007/978-3-319-27954-1_4 ]

Gilbert C D. 2013. The constructive nature of visual processing. Principles of Neural Science, 5: 556-576

Gilbert C D, Hirsch J A and Wiesel T N. 1990. Lateral interactions in visual cortex. Cold Spring Harbor Symposia on Quantitative Biology, 55: 663-677 [DOI: 10.1101/sqb.1990.055.01.063]

Girshick R, Donahue J, Darrell T and Malik J. 2016. Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(1): 142-158 [DOI: 10.1109/TPAMI.2015.2437384]

Gollisch T and Meister M. 2010. Eye smarter than scientists believed: neural computations in circuits of the retina. Neuron, 65(2): 150-164 [DOI: 10.1016/j.neuron.2009.12.009]

Güçlü U and van Gerven M A J. 2015. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. Journal of Neuroscience, 35(27): 10005-10014 [DOI: 10.1523/JNEUROSCI.5023-14.2015]

Gur M. 2015. Space reconstruction by primary visual cortex activity: a parallel, non-computational mechanism of object representation. Trends in Neurosciences, 38(4): 207-216 [DOI: 10.1016/j.tins.2015.02.005]

Haβ J, Blaschke S, Rammsayer T and Herrmann J M. 2008. A neurocomputational model for optimal temporal processing. Journal of Computational Neuroscience, 25(3): 449-464 [DOI: 10.1007/s10827-008-0088-4]

Hassabis D, Kumaran D, Summerfield C and Botvinick M. 2017. Neuroscience-inspired artificial intelligence. Neuron, 95(2): 245-258 [DOI: 10.1016/j.neuron.2017.06.011]

Hastie T, Tibshirani R and Friedman J. 2009. Unsupervised learning//The Elements of Statistical Learning. 2nd ed. New York, USA: Springer: #485 [ DOI: 10.1007/978-0-387-84858-7_14 http://dx.doi.org/10.1007/978-0-387-84858-7_14 ]

He K M, Zhang X Y, Ren S Q and Sun J. 2016. Identity mappings in deep residual networks//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 630-645 [ DOI: 10.1007/978-3-319-46493-0_38 http://dx.doi.org/10.1007/978-3-319-46493-0_38 ]

Helmstaedter M, Briggman K L, Turaga S C, Jain V, Seung H S and Denk W. 2013. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature, 500(7461): 168-174 [DOI: 10.1038/nature12346]

Higgins I, Chang L, Langston V, Hassabis D, Summerfield C, Tsao D and Botvinick M. 2020. Unsupervised deep learning identifies semantic disentanglement in single inferotemporal neurons[EB/OL]. [2020-06-25] . https://arxiv.org/pdf/2006.14304.pdf https://arxiv.org/pdf/2006.14304.pdf

Hochreiter S and Schmidhuber J. 1997. Long short-term memory. Neural Computation, 9(8): 1735-1780 [DOI: 10.1162/neco.1997.9.8.1735]

Hong H, Yamins D L K, Majaj N J and DiCarlo J J. 2016. Explicit information for category-orthogonal object properties increases along the ventral stream. Nature Neuroscience, 19(4): 613-622 [DOI: 10.1038/nn.4247]

Huang T J, Zheng Y J, Yu Z F, Chen R, Li Y, Xiong R Q, Ma L, Zhao J W, Dong S W, Zhu L, Li J N, Jia S S, Fu Y H, Shi B X, Wu S and Tian Y H. 2022. 1 000×faster camera and machine vision with ordinary devices. Engineering [DOI: 10.1016/j.eng.2022.01.012]

Hubel D H and Wiesel T N. 1962. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 160(1): 106-154 [DOI: 10.1113/jphysiol.1962.sp006837]

Issa E B, Cadieu C F and DiCarlo J J. 2018. Neural dynamics at successive stages of the ventral visual stream are consistent with hierarchical error signals. Elife, 7: #e42870 [DOI: 10.7554/eLife.42870]

Jaiswal A, Babu A R, Zadeh M Z, Banerjee D and Makedon F. 2020. A survey on contrastive self-supervised learning. Technologies, 9(1): #2 [DOI: 10.3390/technologies9010002]

James W, Burkhardt F, Bowers F and Skrupskelis I K. 1890. The Principles of Psychology: Vol. 1. London: Macmillan

Jones H E, Andolina I M, Ahmed B, Shipp S D, Clements J T C, Grieve K L, Cudeiro J, Salt T E and Sillito A M. 2012. Differential feedback modulation of center and surround mechanisms in parvocellular cells in the visual thalamus. Journal of Neuroscience, 32(45): 15946-15951 [DOI: 10.1523/JNEUROSCI.0831-12.2012]

Kar K, Kubilius J, Schmidt K, Issa E B and DiCarlo J J. 2019. Evidence that recurrent circuits are critical to the ventral stream's execution of core object recognition behavior. Nature Neuroscience, 22(6): 974-983 [DOI: 10.1038/s41593-019-0392-5]

Kastner D B and Baccus S A. 2014. Insights from the retina into the diverse and general computations of adaptation, detection, and prediction. Current Opinion in Neurobiology, 25: 63-69 [DOI: 10.1016/j.conb.2013.11.012]

Khaligh-Razavi S M and Kriegeskorte N. 2014. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Computational Biology, 10(11): #e1003915 [DOI: 10.1371/journal.pcbi.1003915]

Kharitonov E, Rivière M, Synnaeve G, Wolf L, Mazaré P E, Douze M, and Dupoux E. 2021. Data augmenting contrastive learning of speech representations in the time domain//2021 IEEE Spoken Language Technology Workshop (SLT). Shenzhen, China: IEEE: 215-222 [ DOI: 10.1109/SLT48900.2021.9383605 http://dx.doi.org/10.1109/SLT48900.2021.9383605 ]

Kietzmann T C, McClure P and Kriegeskorte N. 2018. Deep neural networks in computational neuroscience. BioRxiv: #133504 [ DOI: 10.1101/133504 http://dx.doi.org/10.1101/133504 ]

Kim J S, Greene M J, Zlateski A, Lee K, Richardson M, Turaga S C, Purcaro M, Balkam M, Robinson A, Behabadi B F, Campos M, Denk W and Seung H S. 2014. Space-time wiring specificity supports direction selectivity in the retina. Nature, 509(7500): 331-336 [DOI: 10.1038/nature13240]

Kindel W F, Christensen E D and Zylberberg J. 2017. Using deep learning to reveal the neural code for images in primary visual cortex[EB/OL]. [2022-06-19] . https://arxiv.org/pdf/1706.06208.pdf https://arxiv.org/pdf/1706.06208.pdf

Kingma D P and Welling M. 2019. An introduction to variational autoencoders[EB/OL]. [2019-12-11] . https://arxiv.org/pdf/1906.02691.pdf https://arxiv.org/pdf/1906.02691.pdf

Kisiel T. 1973. Scientific discovery: logical, psychological, or hermeneutical?//Carr D and Casey E S, eds. Explorations in Phenomenology. Dordrecht: Springer: 263-284 [ DOI: 10.1007/978-94-010-1999-6_12 http://dx.doi.org/10.1007/978-94-010-1999-6_12 ]

Klindt D A, Ecker A S, Euler T and Bethge M. 2017. Neural system identification for large populations separating "what" and "where". Advances in Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc. : 3509-3519

Konkle T and Alvarez G A. 2021. Beyond category-supervision: computational support for domain-general pressures guiding human visual system representation[EB/OL]. [2022-05-08]. http://biorxiv.org/content/10.1101/2020.06.15.15324703

Kriegeskorte N. 2015. Deep neural networks: a new framework for modeling biological vision and brain information processing. Annual Review of Vision Science, 1(1): 417-446 [DOI: 10.1146/annurev-vision-082114-035447]

Krizhevsky A. 2009. Learning Multiple Layers of Features from Tiny Images. University of Toronto

Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. Burlington: Morgan Kaufmann Publishers: 1097-1105

Kubilius J, Wagemans J and Op de Beeck H P. 2014. A conceptual framework of computations in mid-level vision. Frontiers in Computational Neuroscience, 8: #158 [DOI: 10.3389/fncom.2014.00158]

LeCun Y and Bengio Y. 1998. Convolutional networks for images, speech, and time series//Arbib M A, ed. The Handbook of Brain Theory and Neural Networks. Cambridge, USA: The MIT Press: 255-258

LeCun Y and Misra I. 2021. Self-supervised learning: The dark matter of intelligence[EB/OL]. [2022-05-23] . https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence

LeCun Y, Bengio Y and Hinton G. 2015. Deep learning. Nature, 521(7553): 436-444 [DOI: 10.1038/nature14539]

Leroux S, Molchanov P, Simoens P, Dhoedt B, Breuel T and Kautz J. 2018. IamNN: iterative and adaptive mobile neural network for efficient image classification [EB/OL]. [2018-04-26] . https://arxiv.org/pdf/1804.10123.pdf https://arxiv.org/pdf/1804.10123.pdf

Li X, Jie Z Q, Feng J S, Liu C S and Yan S C. 2018. Learning with rethinking: recurrently improving convolutional neural networks through feedback. Pattern Recognition, 79: 183-194 [DOI: 10.1016/j.patcog.2018.01.015]

Liao Q L and Poggio T. 2016. Bridging the gaps betweenresidual learning, recurrent neural networks and visual cortex [EB/OL]. [2016-04-13] . https://arxiv.org/pdf/1604.03640.pdf https://arxiv.org/pdf/1604.03640.pdf

Lindsay G W. 2015. Feature-based attention in convolutional neural networks [EB/OL]. [2022-12-09] . https://arxiv.org/pdf/1511.06408.pdf https://arxiv.org/pdf/1511.06408.pdf

Lindsey J, Ocko S A, Ganguli S and Deny S. 2019. A unified theory of early visual representations from retina to cortex through anatomically constrained deep CNNs [EB/OL]. [2022-01-03] . https://arxiv.org/pdf/1901.00945.pdf https://arxiv.org/pdf/1901.00945.pdf

Linsley D, Kim J, Veerabadran V, Windolf C and Serre T. 2018. Learning long-range spatial dependencies with horizontal gated recurrent units. Advances in Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc. : 152-164

Liu J K and Gollisch T. 2015. Spike-triggered covariance analysis reveals phenomenological diversity of contrast adaptation in the retina. PLoS Computational Biology, 11(7): #e1004425 [DOI: 10.1371/journal.pcbi.1004425]

Liu J K, Schreyer H M, Onken A, Rozenblit F, Khani M H, Krishnamoorthy V, Panzeri S and Gollisch T. 2017. Inference of neuronal functional circuitry with spike-triggered non-negative matrix factorization. Nature Communications, 8(1): #149 [DOI: 10.1038/s41467-017-00156-9]

Lotter W, Kreiman G and Cox D. 2017. Deep predictive coding networks for video prediction and unsupervised learning [EB/OL]. [2022-03-01] . https://arxiv.org/pdf/1605.08104.pdf https://arxiv.org/pdf/1605.08104.pdf

Machens C K, Wehr M S and Zador A M. 2004. Linearity of cortical receptive fields measured with natural sounds. Journal of Neuroscience, 24(5): 1089-1100 [DOI: 10.1523/JNEUROSCI.4445-03.2004]

Mahendran A and Vedaldi A. 2015. Understanding deep image representations by inverting them//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 5188-5196 [ DOI: 10.1109/CVPR.2015.7299155 http://dx.doi.org/10.1109/CVPR.2015.7299155 ]

Maheswaranathan N, McIntosh L T, Kastner D B, Melander J, Brezovec L, Nayebi A, Wang J L, Ganguli S and Baccus S A. 2018. Deep learning models reveal internal structure and diverse computations in the retina under natural scenes. bioRxiv: #340943 [ DOI: 10.1101/340943 http://dx.doi.org/10.1101/340943 ]

Malach R, Levy I and Hasson U. 2002. The topography of high-order human object areas. Trends in Cognitive Sciences, 6(4): 176-184 [DOI: 10.1016/S1364-6613(02)01870-3]

Marblestone A H, Wayne G and Kording K P. 2016. Toward an integration of deep learning and neuroscience. Frontiers in Computational Neuroscience, 10: #94 [DOI: 10.3389/fncom.2016.00094]

Markov N T, Ercsey-Ravasz M, Van Essen D C, Knoblauch K, Toroczkai Z and Kennedy H. 2013. Cortical high-density counterstream architectures. Science, 342(6158): #1238406 [DOI: 10.1126/science.1238406]

Marmarelis P Z and Naka K I. 1972. White-noise analysis of a neuron chain: an application of the Wiener theory. Science, 175(4027): 1276-1278 [DOI: 10.1126/science.175.4027.1276]

McFarland J M, Cui Y W and Butts D A. 2013. Inferring nonlinear neuronal computation based on physiologically plausible inputs. PLoS Computational Biology, 9(7): #e1003143 [DOI: 10.1371/journal.pcbi.1003143]

McIntosh L, Maheswaranathan N, Sussillo D and Shlens J. 2018. Recurrent segmentation for variable computational budgets//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City, USA: IEEE: 1761-1770 [ DOI: 10.1109/CVPRW.2018.00216 http://dx.doi.org/10.1109/CVPRW.2018.00216 ]

McIntosh L T, Maheswaranathan N, Nayebi A, Ganguli S and Baccus S A. 2016. Deep learning models of the retinal response to natural scenes. Advances in Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc. : 1369-1377

Merabet L, Desautels A, Minville K and Casanova C. 1998. Motion integration in a thalamic visual nucleus. Nature, 396(6708): 265-268 [DOI: 10.1038/24382]

Merolla P A, Arthur J V, Alvarez-Icaza R, Cassidy A S, Sawada J, Akopyan F, Jackson B L, Imam N, Guo C, Nakamura Y, Brezzo B, Vo I, Esser S K, Appuswamy R, Taba B, Amir A, Flickner M D, Risk W P, Manohar R and Modha D S. 2014. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science, 345(6197): 668-673 [DOI: 10.1126/science.1254642]

Michaelis C, Bethge M and Ecker A S. 2018. One-shot segmentation in clutter//Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR: 3549-3558

Milner A D and Goodale M A. 2008. Two visual systems re-viewed. Neuropsychologia, 46(3): 774-785 [DOI: 10.1016/j.neuropsychologia.2007.10.005]

Mineault P J, Bakhtiari S, Richards B A and Pack C C. 2021. Your head is there to move you around: goal-driven models of the primate dorsal pathway. Advances in Neural Information Processing Systems, 34, 28757-28771

Mishkin M, Ungerleider L G and Macko K A. 1983. Object vision and spatial vision: two cortical pathways. Trends in Neurosciences, 6: 414-417 [DOI: 10.1016/0166-2236(83)90190-X]

Mordvintsev A, Olah C and Tyka M. 2015. Inceptionism: going deeper into neural networks. URL https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

Nandy A S, Sharpee T O, Reynolds J H and Mitchell J F. 2013. The fine structure of shape tuning in area V4. Neuron, 78(6): 1102-1115 [DOI: 10.1016/j.neuron.2013.04.016]

Nayebi A, Bear D, Kubilius J, Kar K, Ganguli S, Sussillo D, DiCarlo J J and Yamins D L K. 2018. Task-driven convolutional recurrent models of the visual system. Advances in Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc. : 5295-5306

Nayebi A, Kong N C L, Zhuang C X, Gardner J L, Norcia A M and Yamins D L K. 2021. Unsupervised models of mouse visual cortex. bioRxiv [ DOI: 10.1101/2021.06.16.448730 http://dx.doi.org/10.1101/2021.06.16.448730 ]

None. 2013. Focus on neurotechniques. Nature Neuroscience, 16(7): #771 [DOI: 10.1038/nn0713-771]

O'Connor D H, Fukui M M, Pinsk M A and Kastner S. 2002. Attention modulates responses in the human lateral geniculate nucleus. Nature Neuroscience, 5(11): 1203-1209 [DOI: 10.1038/nn957]

Paninski L. 2003. Convergence properties of some spike-triggered analysis techniques. Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press: 189-196

Pillow J W, Shlens J, Paninski L, Sher A, Litke A M, Chichilnisky E J and Simoncelli E P. 2008. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454(7207): 995-999 [DOI: 10.1038/nature07140]

Pinto N and Cox D D. 2012. GPU metaprogramming: a case study in biologically inspired machine vision//GPU Computing Gems Jade Edition. Amsterdam: Elsevier: 457-471

Pinto N, Cox D D and DiCarlo J J. 2008. Why is real-world visual object recognition hard? PLoS Computional Biology, 4(1): #e27 [DOI: 10.1371/journal.pcbi.0040027]

Pinto N, Doukhan D, DiCarlo J J and Cox D D. 2009. A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLoS Computational Biology, 5(11): #e1000579 [DOI: 10.1371/journal.pcbi.1000579]

Plesser H E, Eppler J M, Morrison A, Diesmann M and Gewaltig M O. 2007. Efficient parallel simulation of large-scale neuronal networks on clusters of multiprocessor computers//Proceedings of the 13th International Euro-Par Conference on Parallel Processing. Rennes, France: Springer: 672-681 [ DOI: 10.1007/978-3-540-74466-5_71 http://dx.doi.org/10.1007/978-3-540-74466-5_71 ]

Potjans T C and Diesmann M. 2014. The cell-type specific cortical microcircuit: relating structure and activity in a full-scale spiking network model. Cerebral Cortex, 24(3): 785-806 [DOI: 10.1093/cercor/bhs358]

Radford A, Kim J W, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G and Sutskever I. 2021. Learning transferable visual models from natural language supervision [EB/OL]. [2021-02-26] . https://arxiv.org/pdf/2103.00020.pdf https://arxiv.org/pdf/2103.00020.pdf

Rajalingham R, Issa E B, Bashivan P, Kar K, Schmidt K and DiCarlo J J. 2018. Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks. Journal of Neuroscience, 38(33): 7255-7269 [DOI: 10.1523/JNEUROSCI.0388-18.2018]

Ramachandram D and Taylor G W. 2017. Deep multimodal learning: a survey on recent advances and trends. IEEE Signal Processing Magazine, 34(6): 96-108 [DOI: 10.1109/MSP.2017.2738401]

Rao R P N and Ballard D H. 1999. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1): 79-87 [DOI: 10.1038/4580]

Riesenhuber M and Poggio T. 1999. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11): 1019-1025 [DOI: 10.1038/14819]

Rolls E T and Milward T. 2000. A model of invariant object recognition in the visual system: learning rules, activation functions, lateral inhibition, and information-based performance measures. Neural Computation, 12(11): 2547-2572 [DOI: 10.1162/089976600300014845]

Rowekamp R J and Sharpee T O. 2017. Cross-orientation suppression in visual area V2. Nature Communications, 8: #15739 [DOI: 10.1038/ncomms15739]

Rucci M and Victor J D. 2015. The unsteady eye: an information-processing stage, not a bug. Trends in Neurosciences, 38(4): 195-206 [DOI: 10.1016/j.tins.2015.01.005]

Sahani M and Linden J F. 2003. How linear are auditory cortical responses? Advances in Neural Information Processing Systems. Cambridge, United States: MIT Press: 125-132

Samek W, Wiegand T and Müller K R. 2017. Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models [EB/OL]. [2022-08-28] . https://arxiv.org/pdf/1708.08296.pdf https://arxiv.org/pdf/1708.08296.pdf

Scholte H S. 2018. Fantastic DNimals and where to find them. NeuroImage, 180: 112-113 [DOI: 10.1016/j.neuroimage.2017.12.077]

Schrimpf M, Kubilius J, Hong H, Majaj N J, Rajalingham R, Issa E B, Kar K, Bashivan P, Prescott-Roy J, Geiger F, Schmidt K, Yamins D L K and DiCarlo J J. 2020. Brain-score: which artificial neural network for object recognition is most brain-like? BioRxiv: #407007 [ DOI: 10.1101/407007 http://dx.doi.org/10.1101/407007 ]

Schwartz O, Pillow J W, Rust N C and Simoncelli E P. 2006. Spike-triggered neural characterization. Journal of Vision, 6(4): 484-507 [DOI: 10.1167/6.4.13]

Serre T, Wolf L, Bileschi S, Riesenhuber M and Poggio T. 2007. Robust object recognition with cortex-like mechanisms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(3): 411-426 [DOI: 10.1109/TPAMI.2007.56]

Simoncelli E P and Olshausen B A. 2001. Natural image statistics and neural representation. Annual Review of Neuroscience, 24: 1193-1216 [DOI: 10.1146/annurev.neuro.24.1.1193]

Simonyan K, Vedaldi A and Zisserman A. 2014. Deep inside convolutional networks: visualising image classification models and saliency maps [EB/OL]. [2022-04-19] . https://arxiv.org/pdf/1312.6034.pdf https://arxiv.org/pdf/1312.6034.pdf

Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2015-04-10] . https://arxiv.org/pdf/1406.1556.pdf https://arxiv.org/pdf/1406.1556.pdf

Souihel S and Cessac B. 2021. On the potential role of lateral connectivity in retinal anticipation. Journal of Mathematical Neuroscience, 11: 3 [ DOI: 10.1186/s13408-020-00101-z]

Spoerer C J, McClure P and Kriegeskorte N. 2017. Recurrent convolutional neural networks: a better model of biological object recognition. Frontiers in Psychology, 8: #1551 [DOI: 10.3389/fpsyg.2017.01551]

Steffen L, Reichard D,Weinland J, Kaiser J, Roennau A and Dillmann R. 2019. Neuromorphic stereo vision: a survey of bio-inspired sensors and algorithms. Frontiers in Neurorobotics, 13: #28 [DOI: 10.3389/fnbot.2019.00028]

Steinle F. 1997. Entering new fields: exploratory uses of experimentation. Philosophy of Science, 64(S4): S65-S74 [DOI: 10.1086/392587]

Sterrett S G. 2014. The morals of model-making. Studies in History and Philosophy of Science Part A, 46: 31-45 [DOI: 10.1016/j.shpsa.2013.11.006]

Storrs K R, Anderson B L and Fleming R W. 2021. Unsupervised learning predicts human perception and misperception of gloss. Nature Human Behaviour, 5(10): 1402-1417 [DOI: 10.1038/s41562-021-01097-6]

Temam O and Héliot R. 2011. Implementation of signal processing tasks on neuromorphic hardware//Proceedings of 2011 International Joint Conference on Neural Networks. San Jose, USA: IEEE: 1120-1125 [DOI: 10.1109/IJCNN.2011.6033349]

Ukita J, Yoshida T and Ohki K. 2018. Characterization of nonlinear receptive fields of visual neurons by convolutional neural network. bioRxiv: #348060 [DOI: 10.1101/348060]

Ungerleider L G and Haxby J V. 1994. "What" and "where" in the human brain. Current Opinion in Neurobiology, 4(2): 157-165 [DOI: 10.1016/0959-4388(94)90066-3]

Vance P J, Das G P, Kerr D, Coleman S A, McGinnity T M, Gollisch T and Liu J K. 2018. Bioinspired approach to modeling retinal ganglion cells using system identification techniques. IEEE Transactions on Neural Networks and Learning Systems, 29(5): 1796-1808 [DOI: 10.1109/TNNLS.2017.2690139]

van der Maaten L and Hinton G. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research, 9: 2579-2605

Van Essen D C. 2003. Organization of visual areas in macaque and human cerebral cortex//Chalupa L M and Werner J S, eds. The Visual Neurosciences. Cambridge, MA: MIT Press, 1: 507-521

Vintch B, Movshon J A and Simoncelli E P. 2015. A convolutional subunit model for neuronal responses in macaque V1. Journal of Neuroscience, 35(44): 14829-14841 [DOI: 10.1523/JNEUROSCI.2815-13.2015]

Walker E Y, Sinz F H, Froudarakis E, Fahey P G, Muhammad T, Ecker A S, Cobos E, Reimer J, Pitkow X and Tolias A S. 2018. Inception in visual cortex: in vivo-silico loops reveal most exciting images. bioRxiv: #506956 [DOI: 10.1101/506956]

Waters C K. 2007. The nature and context of exploratory experimentation: an introduction to three case studies of exploratory research. History and Philosophy of the Life Sciences, 29(3): 275-284

Whiteway M R, Socha K, Bonin V and Butts D A. 2018. Characterizing the nonlinear structure of shared variability in cortical neuron populations using neural networks. bioRxiv: #407858 [DOI: 10.1101/407858]

Xu T, Zhan J Y, Garrod O G B, Torr P H S, Zhu S C, Ince R A A and Schyns P G. 2018. Deeper interpretability of deep networks [EB/OL]. [2022-11-20]. https://arxiv.org/pdf/1811.07807.pdf

Yamins D, Hong H, Cadieu C and DiCarlo J J. 2013. Hierarchical modular optimization of convolutional networks achieves representations similar to macaque IT and human ventral stream. Advances in Neural Information Processing Systems. Lake Tahoe, Nevada: Curran Associates Inc.: 3093-3101

Yamins D L K and DiCarlo J J. 2016. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19(3): 356-365 [DOI: 10.1038/nn.4244]

Yamins D L K, Hong H, Cadieu C F, Solomon E A, Seibert D and DiCarlo J J. 2014. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences of the United States of America, 111(23): 8619-8624 [DOI: 10.1073/pnas.1403112111]

Yan Q, Zheng Y J, Jia S S, Zhang Y C, Yu Z F, Chen F, Tian Y H, Huang T J and Liu J K. 2020. Revealing fine structures of the retinal receptive field by deep-learning networks. IEEE Transactions on Cybernetics, 52(1): 39-50 [DOI: 10.1109/TCYB.2020.2972983]

Yosinski J, Clune J, Nguyen A, Fuchs T and Lipson H. 2015. Understanding neural networks through deep visualization [EB/OL]. [2022-06-22]. https://arxiv.org/pdf/1506.06579.pdf

Yu Z F, Liu J K, Jia S S, Zhang Y C, Zheng Y J, Tian Y H and Huang T J. 2020. Toward the next generation of retinal neuroprosthesis: visual computation with spikes. Engineering, 6(4): 449-461 [DOI: 10.1016/j.eng.2020.02.004]

Zamir A R, Wu T L, Sun L, Shen W B, Shi B E, Malik J and Savarese S. 2017. Feedback networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1808-1817 [DOI: 10.1109/CVPR.2017.196]

Zeiler M D and Fergus R. 2014. Visualizing and understanding convolutional networks//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 818-833 [DOI: 10.1007/978-3-319-10590-1_53]

Zeiler M D, Taylor G W and Fergus R. 2011. Adaptive deconvolutional networks for mid and high level feature learning//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain: IEEE: 2018-2025 [DOI: 10.1109/ICCV.2011.6126474]

Zheng Y J, Jia S S, Yu Z F, Liu J K and Huang T J. 2021a. Unraveling neural coding of dynamic natural visual scenes via convolutional recurrent neural networks. Patterns, 2(10): #100350 [DOI: 10.1016/j.patter.2021.100350]

Zheng Y J, Zheng L X, Yu Z F, Shi B X, Tian Y H and Huang T J. 2021b. High-speed image reconstruction through short-term plasticity for spiking cameras//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 6354-6363 [DOI: 10.1109/CVPR46437.2021.00629]

Zhou B L, Bau D, Oliva A and Torralba A. 2019. Interpreting deep visual representations via network dissection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(9): 2131-2145 [DOI: 10.1109/TPAMI.2018.2858759]

Zhou B L, Khosla A, Lapedriza A, Oliva A and Torralba A. 2015. Object detectors emerge in deep scene CNNs [EB/OL]. [2022-04-15]. https://arxiv.org/pdf/1412.6856.pdf

Zhuang C X, Yan S M, Nayebi A, Schrimpf M, Frank M C, DiCarlo J J and Yamins D L K. 2021. Unsupervised neural network models of the ventral visual stream. Proceedings of the National Academy of Sciences of the United States of America, 118(3): #e2014196118 [DOI: 10.1073/pnas.2014196118]