Published: 2018-08-16 DOI: 10.11834/jig.170545 2018 | Volume 23 | Number 8 Image Analysis and Recognition

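Received: 2017-10-25; Revised: 2018-03-07 Funding: National Natural Science Foundation of China (61502331, 11701410) First author: Sheng Jiachuan, born in 1982, female, associate professor; received a Ph.D. in computer application technology from Tianjin University in 2013; her research focuses on image processing and pattern recognition. E-mail: jiachuansheng@tjufe.edu.cn; Li Yuzhi, female, lecturer; her research focuses on machine learning and multimedia processing. E-mail: liyuzhi@tjufe.edu.cn. CLC number: TP301.6 Document code: A Article ID: 1006-8961(2018)08-1193-14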


Learning artistic objects for improved classification of Chinese paintings
Sheng Jiachuan, Li Yuzhi
School of Science and Technology, Tianjin University of Finance and Economics, Tianjin 300222, China
Supported by: National Natural Science Foundation of China (61502331, 11701410)

# Abstract

Objective Existing research on art classification relies primarily on feature extraction followed by feature-based classification. Although such methods achieve a certain level of success, their major weakness is that classification performance depends heavily on how effectively the features describe the content of Chinese paintings. Because traditional Chinese artists tend to rely on popular objects, such as figures, trees, flowers, birds, mountains, horses, and houses, to express their artistic feelings and emotions, this study explores a new artistic-object-based approach to classifying traditional Chinese paintings. In this way, automated classification can be integrated with the perception, understanding, and interpretation of artistic expressions and emotions via the segmented artistic objects. The approach could also be developed further into an interactive object-based classification approach for other forms of painting. Compared with the existing state of the art, one advantage of the proposed approach over feature- or content-based ones is that objects provide direct and integrated artistic expression within paintings.

Method The proposed method processes traditional Chinese paintings in three stages: 1) interactive art object segmentation; 2) description and characterization of art objects via a convolutional neural network (CNN), the most popular deep learning unit; and 3) SVM-based classification and fusion across all art objects. Specifically, an iterative linear clustering algorithm constructs superpixels from the differences in color and position between individual pixels. By maximizing the similarity within the neighborhood of those superpixels, a sequence of objects can be segmented, and an interactive scheme allows users to add, revise, and interact with the content of paintings to achieve the best possible balance between subjective demand and objective art description. A CNN-based deep learning unit then describes those objects, so that classification can be carried out for each individual art object. Finally, an SVM unit fuses all these classifications by considering each individual object within the given window, which is influenced and initialized through the training process.

Result Extensive experiments are carried out in four phases, each considering one factor: the number of artists, comparison with the existing state of the art, benchmarking against content-based classification, and the contribution of the CNN alone. Experimental results show that the proposed algorithm 1) outperforms several existing representative approaches, including MHMM and the fusion-based method; 2) achieves effective fusion of all the different object classifications, including the CNN and SVM units; 3) captures artistic emotions through the segmented art objects; and 4) shows potential for interactive classification of Chinese paintings via segmentation of artistic objects.

Conclusion This study proposes the computerized classification and recognition of art styles based on artistic objects in paintings rather than on whole paintings. Experimental results reveal that the proposed algorithm outperforms the existing representative benchmarks. The method can serve as an important tool for the computerized management of traditional Chinese paintings, providing a range of techniques for the effective and efficient digitization, manipulation, understanding, perception, and interpretation of traditional Chinese art and its legacy.

# Key words

artistic object segmentation; classification of Chinese paintings; convolutional neural network; fusion algorithm; deep learning; superpixel segmentation

# 1 Related algorithms

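The SuperLattice[12] algorithm uses a greedy strategy: at each step it splits the image along horizontal and vertical paths where the boundary cost map is lowest, thereby producing superpixels. The method preserves a regular image topology and yields a regular superpixel grid with good segmentation accuracy and stability, and the number of superpixels can be specified by the user. However, the quality of the resulting superpixels depends heavily on the quality of the image's boundary map.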

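MeanShift[15] is a nonparametric iterative algorithm that uses a probability density function to drive each center point to converge to the point of maximum density. The superpixels it produces are regular in shape, and the method performs well in stability and noise robustness. However, it is slow, offers no control over the number of superpixels, and suffers from over-segmentation.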
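The mode-seeking iteration behind mean shift can be illustrated with a minimal sketch. It assumes a flat (uniform) kernel with a fixed `bandwidth`; the function name, parameters, and toy data are illustrative and are not taken from [15].

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-6, max_iter=200):
    """Move `start` toward a local density maximum of `points` (flat kernel, assumed)."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Samples within `bandwidth` of the current center form the window.
        dist = np.linalg.norm(points - center, axis=1)
        window = points[dist <= bandwidth]
        if len(window) == 0:
            break  # empty window: nothing to shift toward
        new_center = window.mean(axis=0)
        converged = np.linalg.norm(new_center - center) < tol
        center = new_center
        if converged:
            break
    return center

# Toy 1-D data with two clusters; the start point lies near the left one.
samples = np.array([[0.0], [0.2], [0.4], [5.0], [5.2]])
mode = mean_shift_mode(samples, start=[0.5], bandwidth=1.0)
```

Because each center simply climbs to the nearest density peak, the number of resulting modes (and hence superpixels) is an outcome of the data and bandwidth rather than a user input, which matches the limitation noted above.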

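TurboPixel[16] is a level-set method based on geometric flow. It first selects initial seed points and then grows the region around each seed through a curvature evolution model and a skeletonization process, yielding grid-like superpixels. Its running time is positively correlated with image size, the number of generated superpixels can be specified by the user, the superpixels are regular in shape and preserve the contour structure of the image, and the under-segmentation problem is mitigated. However, the shape of the generated superpixels cannot be controlled, and for high-resolution images the algorithm cannot deliver fast, high-quality segmentation.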

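SLIC[17] is a clustering-based superpixel segmentation algorithm that generates superpixels in a five-dimensional space formed by each pixel's LAB color and its position. The method first initializes the cluster centers and moves each one to the lowest-gradient position within its neighborhood. It then assigns matching pixels to each cluster center within a 2S×2S neighborhood of that center, computes the distance between each new cluster center and the previous one, and uses a threshold to decide whether the cluster centers need to be reset. The resulting superpixels are uniform in size and regular in shape and adhere well to boundaries. The number of superpixels can be controlled, and that number is the algorithm's only input parameter.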
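A single assignment pass of the kind described above might be sketched as follows. This is a simplified illustration, not the implementation of [17]: the function name `slic_assign` is hypothetical, the compactness weight `m` is an assumed fixed constant that balances color distance against spatial distance, each center's color is taken from the pixel at its seed position, and the gradient-based seed adjustment and the center-update loop are omitted.

```python
import numpy as np

def slic_assign(lab, centers, S, m=10.0):
    """Label each pixel with its nearest center, searching only a
    2S x 2S window around every center (one assignment pass)."""
    h, w, _ = lab.shape
    labels = -np.ones((h, w), dtype=int)
    best = np.full((h, w), np.inf)          # best distance found so far
    for k, (cy, cx) in enumerate(centers):
        y0, y1 = max(cy - S, 0), min(cy + S, h)
        x0, x1 = max(cx - S, 0), min(cx + S, w)
        ys, xs = np.mgrid[y0:y1, x0:x1]
        # Color distance in LAB and spatial distance in pixels.
        d_color = np.linalg.norm(lab[y0:y1, x0:x1] - lab[cy, cx], axis=2)
        d_space = np.hypot(ys - cy, xs - cx)
        # Combined 5-D distance; m is an assumed compactness weight.
        dist = np.sqrt(d_color**2 + (m * d_space / S) ** 2)
        closer = dist < best[y0:y1, x0:x1]
        best[y0:y1, x0:x1][closer] = dist[closer]
        labels[y0:y1, x0:x1][closer] = k
    return labels

# Toy "LAB" image: dark left half, bright right half, one seed in each.
img = np.zeros((4, 8, 3))
img[:, 4:, 0] = 100.0
labels = slic_assign(img, [(2, 2), (2, 6)], S=4)
```

Restricting the search to a 2S×2S window is what keeps the cost of each pass proportional to the number of pixels rather than to pixels times clusters.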

# 2.2 Interactive artistic object segmentation

 $\zeta (\boldsymbol{S}, \boldsymbol{T}) = \sum\limits_{\eta = 1}^{4\;096} {\sqrt {H_S^\eta \times H_T^\eta } }$ (5)
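Equation (5) has the form of a Bhattacharyya coefficient between two 4 096-bin histograms. The sketch below shows one way it could be computed. It assumes that each region is described by a normalized color histogram obtained by quantizing three 8-bit channels to 16 levels each (16 × 16 × 16 = 4 096 bins); the quantization scheme and both function names are assumptions for illustration, since only the bin count appears in the equation.

```python
import numpy as np

def color_histogram(pixels, levels=16):
    """Normalized histogram over levels**3 bins (16**3 = 4096, assumed scheme)."""
    pixels = np.asarray(pixels, dtype=np.int64).reshape(-1, 3)
    q = pixels * levels // 256              # quantize 0..255 to 0..levels-1
    idx = (q[:, 0] * levels + q[:, 1]) * levels + q[:, 2]
    hist = np.bincount(idx, minlength=levels**3).astype(float)
    return hist / hist.sum()

def similarity(hist_s, hist_t):
    """Equation (5): sum over all bins of sqrt(H_S * H_T)."""
    return float(np.sum(np.sqrt(hist_s * hist_t)))

# Toy regions: pure red, pure blue, and an even mix of the two.
h_red = color_histogram([[255, 0, 0]] * 10)
h_blue = color_histogram([[0, 0, 255]] * 10)
h_mix = color_histogram([[255, 0, 0]] * 5 + [[0, 0, 255]] * 5)

same = similarity(h_red, h_red)       # identical histograms
disjoint = similarity(h_red, h_blue)  # no shared bins
partial = similarity(h_red, h_mix)    # half of the mass is shared
```

With normalized histograms the measure lies between 0 (no overlapping bins) and 1 (identical distributions), so a larger value indicates more similar regions.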

The MSRMAO algorithm proceeds as follows:

1) Merge the artistic object regions $\boldsymbol{F}$ of the Chinese painting:

(1) For each region $\boldsymbol{P} \in \boldsymbol{F}$, label its adjacent regions as $\bar{\boldsymbol{R}}_F = \{ \boldsymbol{A}_i \}, i = 1, 2, \cdots, p$.

(2) For each $\boldsymbol{A}_i \notin \boldsymbol{F}$, label its adjacent regions as $\bar{\boldsymbol{R}}_{A_i} = \{ \boldsymbol{R}_j^{A_i} \}, j = 1, 2, \cdots, k$. Clearly, $\boldsymbol{P} \in \bar{\boldsymbol{R}}_{A_i}$.

(3) Compute $\zeta (\boldsymbol{A}_i, \boldsymbol{R}_j^{A_i})$. If $\zeta (\boldsymbol{A}_i, \boldsymbol{P}) = \max\limits_{j = 1, 2, \cdots, k} \zeta (\boldsymbol{A}_i, \boldsymbol{R}_j^{A_i})$, merge $\boldsymbol{P}$ and $\boldsymbol{A}_i$, that is, $\boldsymbol{P} = \boldsymbol{P} \cup \boldsymbol{A}_i$; otherwise, do not merge.

(4) Update $\boldsymbol{F}$ and $\boldsymbol{M}$.

(5) If no new mergeable region is found for $\boldsymbol{F}$, go to step 2); otherwise, return to step (1).

2) Merge the background regions $\boldsymbol{B}$:

(1) For each region $\boldsymbol{Q} \in \boldsymbol{B}$, label its adjacent regions as $\bar{\boldsymbol{R}}_Q = \{ \boldsymbol{N}_i \}, i = 1, 2, \cdots, q$.

(2) For each $\boldsymbol{N}_i \notin \boldsymbol{B}$, label its adjacent regions as $\bar{\boldsymbol{R}}_{N_i} = \{ \boldsymbol{R}_j^{N_i} \}, j = 1, 2, \cdots, k$. Clearly, $\boldsymbol{Q} \in \bar{\boldsymbol{R}}_{N_i}$.

(3) Compute $\zeta (\boldsymbol{N}_i, \boldsymbol{R}_j^{N_i})$. If $\zeta (\boldsymbol{N}_i, \boldsymbol{Q}) = \max\limits_{j = 1, 2, \cdots, k} \zeta (\boldsymbol{N}_i, \boldsymbol{R}_j^{N_i})$, merge $\boldsymbol{Q}$ and $\boldsymbol{N}_i$, that is, $\boldsymbol{Q} = \boldsymbol{Q} \cup \boldsymbol{N}_i$; otherwise, do not merge.

(4) Update $\boldsymbol{B}$ and $\boldsymbol{M}$.

(5) If no new mergeable region is found for $\boldsymbol{B}$, proceed to the next step; otherwise, return to step (1).

3) Merge the unlabeled regions $\boldsymbol{M}$:

(1) For each unlabeled region $\boldsymbol{O} \in \boldsymbol{M}$, label its adjacent regions as $\bar{\boldsymbol{R}}_O = \{ \boldsymbol{G}_i \}, i = 1, 2, \cdots, o$.

(2) For each $\boldsymbol{G}_i \notin \boldsymbol{B}$ with $\boldsymbol{G}_i \notin \boldsymbol{F}$, label its adjacent regions as $\bar{\boldsymbol{R}}_{G_i} = \{ \boldsymbol{R}_j^{G_i} \}, j = 1, 2, \cdots, k$. Clearly, $\boldsymbol{O} \in \bar{\boldsymbol{R}}_{G_i}$.

