Journal of Jinling Institute of Technology, 2023 (1): 12-19

Research on a Human Image Synthesis Method Based on Multi-Scale Guided Attention

Authors:
WU Cheng; GE Bin; ZHENG Hai-jun; YANG Zhen-wen
Affiliation:
School of Computer Science and Engineering, Anhui University of Science and Technology
Keywords:
generative adversarial networks; multi-scale feature; feature transformation; human image synthesis; guided attention mechanism
Abstract:
To address the incompleteness and blurring of human images synthesized by existing generative adversarial networks, a human image synthesis method based on multi-scale feature extraction and pose-guided feature transformation is proposed. A deep convolutional neural network extracts multi-scale features from both the image and the pose, capturing rich semantic information. A guided attention mechanism is injected into the feature transformation at each scale, so that pose information guides the correct transfer and transformation of texture features. A Markovian discriminator (PatchGAN) serves as the discriminator, strengthening the discrimination of image texture details. The method is evaluated on the DeepFashion dataset. Quantitatively, it achieves a structural similarity (SSIM) of 0.7729, a peak signal-to-noise ratio (PSNR) of 19.0604, a Fréchet inception distance (FID) of 11.4765, and a learned perceptual image patch similarity (LPIPS) of 0.2092. Qualitatively, the synthesized human images have better visual quality than those produced by traditional methods. The proposed method effectively addresses the problems of incompleteness and blurring and improves the quality of the synthesized human images.
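
The evaluation code behind the reported scores is not given on this page. As a minimal sketch of how one of the reported metrics, PSNR, is conventionally defined, the following pure-Python function compares a reference image with a synthesized one. The function name `psnr`, the flat-list pixel format, and the 8-bit peak value of 255 are assumptions made for illustration, not details taken from the paper.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    Both images are flat sequences of pixel intensities; this input format
    and the default 8-bit peak value are assumptions of this sketch.
    """
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the synthesized image is closer to the reference pixel by pixel, which is why it is usually reported alongside perceptual metrics such as LPIPS and FID rather than on its own.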
