Graduation Project (Thesis) Foreign Literature Translation
A Survey of Image Synthesis and Editing with Generative Adversarial Networks
Xian Wu, Kun Xu, and Peter Hall
Abstract: This paper presents a survey of image synthesis and editing with Generative Adversarial Networks (GANs). GANs consist of two deep networks, a generator and a discriminator, which are trained in a competitive way. Due to the power of deep networks and the competitive training manner, GANs are capable of producing reasonable and realistic images, and have shown great capability in many image synthesis and editing applications. This paper surveys recent GAN papers regarding topics including, but not limited to, texture synthesis, image inpainting, image-to-image translation, and image editing.
Key words: image synthesis; image editing; constrained image synthesis; generative adversarial networks; image-to-image translation
1 Introduction
With the rapid development of the Internet and digital capture devices, huge volumes of images have become readily available. There is now widespread demand for tasks that synthesize and edit images, such as removing unwanted objects from wedding photographs, adjusting the colors of landscape images, and turning photographs into artwork (or vice versa). These and other problems have attracted significant attention within both the computer graphics and computer vision communities. A variety of methods have been proposed for image/video editing and synthesis, including texture synthesis[1-3], image inpainting[4-6], image stylization[7,8], image deformation[9,10], and so on. Although many methods have been proposed, intelligent image synthesis and editing remains a challenging problem. This is because these traditional methods are mostly based on pixels[1,4,11], patches[8,10,12], and low-level image features[3,13], and lack high-level semantic information.

Xian Wu and Kun Xu are with TNList and the Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China. E-mail: xukun@tsinghua.edu.cn. Peter Hall is with the Department of Computer Science, University of Bath, Bath, UK. To whom correspondence should be addressed. Manuscript received: 2017-11-15; accepted: 2017-11-20
In recent years, deep learning techniques have led to breakthroughs in computer vision. Trained on large-scale data, deep neural networks substantially outperform previous techniques with regard to the semantic understanding of images. They achieve state-of-the-art performance in various tasks, including image classification[14-16], object detection[17,18], and image segmentation[19,20].
Deep learning has also shown great ability in content generation. In 2014, Goodfellow et al.[21] proposed a generative model called Generative Adversarial Networks (GANs). GANs contain two networks, a generator and a discriminator. The discriminator tries to distinguish fake images from real ones; the generator produces fake images and tries to fool the discriminator. Both networks are jointly trained in a competitive way. The resulting generator is able to synthesize plausible images. GAN variants have since achieved impressive results in a variety of image synthesis and editing applications.
In this survey, we cover recent papers that leverage GANs for image synthesis and editing applications.
This survey discusses the ideas, contributions, and drawbacks of these networks. It is structured as follows. Section 2 provides a brief introduction to GANs and related variants. Section 3 discusses applications in image synthesis, including texture synthesis, image inpainting, and face and human image synthesis. Section 4 discusses applications in constrained image synthesis, including general image-to-image translation, text-to-image, and sketch-to-image. Section 5 discusses applications in image editing and video generation. Finally, Section 6 provides a summary and discusses the current challenges and limitations of GAN-based methods.
2 Generative Adversarial Networks
GANs were proposed by Goodfellow et al.[21] in 2014. They contain two networks, a generator G and a discriminator D. The generator tries to create fake but plausible images, while the discriminator tries to distinguish fake images (produced by the generator) from real images. Formally, the generator G maps a noise vector z in the latent space to an image, G(z) → x, and the discriminator D(x) → [0, 1] classifies an image as real (i.e., output close to 1) or fake (i.e., output close to 0).
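To make the two roles concrete, below is a minimal PyTorch sketch of a generator and discriminator as just described. The 100-dimensional latent vector, the 28 x 28 grayscale output, and the layer widths are illustrative assumptions, not choices taken from this paper.

import torch
import torch.nn as nn

class Generator(nn.Module):
    # Maps a noise vector z in the latent space to a (flattened) image: G(z) -> x.
    def __init__(self, z_dim=100, img_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    # Maps an image to a score in [0, 1]: D(x) -> close to 1 for real, close to 0 for fake.
    def __init__(self, img_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)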
To train the networks, the loss function is formulated as

\min_{G}\max_{D}\; \mathbb{E}_{x\sim X}\left[\log D(x)\right] + \mathbb{E}_{z\sim Z}\left[\log\left(1 - D\left(G(z)\right)\right)\right] \qquad (1)

where X denotes the set of real images and Z denotes the latent space. The loss function in Eq. (1) is referred to as the adversarial loss. The two networks are trained in a competitive fashion with back propagation. The structure of GANs is illustrated in Fig. 1.
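The alternating optimization implied by Eq. (1) might look as follows. This is a sketch assuming the Generator and Discriminator classes above and a hypothetical real_batches iterable of flattened real images; following common practice, the generator step uses the non-saturating loss (maximizing log D(G(z))) rather than minimizing log(1 - D(G(z))) directly.

import torch

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = torch.nn.BCELoss()

for x_real in real_batches:  # real_batches is an assumed data iterator
    b = x_real.size(0)
    ones, zeros = torch.ones(b, 1), torch.zeros(b, 1)

    # Discriminator step: push D(x_real) toward 1 and D(G(z)) toward 0,
    # i.e., ascend E[log D(x)] + E[log(1 - D(G(z)))].
    z = torch.randn(b, 100)
    x_fake = G(z).detach()  # detach so this step does not update G
    loss_d = bce(D(x_real), ones) + bce(D(x_fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: push D(G(z)) toward 1 to fool the discriminator.
    z = torch.randn(b, 100)
    loss_g = bce(D(G(z)), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()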
Compared with other generative models such as Variational AutoEncoders (VAEs)[22], images generated by GANs are usually less blurred and more realistic. It has also been theoretically proven that an optimal GAN exists: the generator perfectly produces images that match the distribution of real images, and the discriminator cannot tell them apart from real ones.
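For reference, this optimality result from Goodfellow et al.[21] can be restated briefly (a sketch of the well-known result, not text recovered from the hidden remainder of the excerpt). For a fixed generator G, the inner maximization in Eq. (1) is achieved by

D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}

where p_{\mathrm{data}} and p_{g} denote the distributions of real and generated images, respectively; at the global optimum, p_{g} = p_{\mathrm{data}}, and hence D^{*}(x) = 1/2 for every x.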