A huge leap in image generation, but is BigGAN really that good?

Editor's note: During this year's National Day holiday, Lunzhi introduced a paper then under blind review at ICLR 2019: BigGAN, a collaboration between researchers at Heriot-Watt University and DeepMind. According to the experimental results, it raised the Inception Score (IS) by more than 100 points, from 52.52 to 166.3, a huge leap for image generation. But as we all know, the images shown in a paper are usually the best of the best, and media coverage tends to beautify or even "deify" the results. So is BigGAN really that good?

When I first saw these images, I was honestly surprised. Not because of anything hidden in the images themselves, but because they were all generated by a neural network called BigGAN; every one of them is fake. I had never before seen generated images that look so much like photographs.

The above 8 pictures are taken from the BigGAN paper, Large Scale GAN Training for High Fidelity Natural Image Synthesis (arXiv:1809.11096); interested readers can read it in full. A few months ago, the paper caused a huge stir in the machine learning community: it not only generated high-resolution 512x512 images, but also achieved a record-high Inception Score on the standard benchmark. While people marveled at the enormous compute the team could command (512 TPUv3 cores), they couldn't help but wonder: is BigGAN cheating? Is it simply copying images from the training set?
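The Inception Score mentioned above rewards a generator whose samples are both individually recognizable and collectively diverse. A minimal sketch of the metric, assuming we already have class probabilities p(y|x) from an Inception classifier for each generated image (real evaluations run the images through Inception-v3; here the probability matrix is a stand-in):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (n_images, n_classes) rows of p(y|x), each row summing to 1.

    IS = exp( E_x [ KL( p(y|x) || p(y) ) ] ),
    where p(y) is the marginal class distribution over the generated set.
    """
    probs = np.asarray(probs, dtype=np.float64)
    p_y = probs.mean(axis=0)  # marginal p(y) over all generated images
    # Per-image KL divergence between the conditional and the marginal
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

# A generator whose samples are confident AND cover all classes scores high:
sharp = np.eye(4)                  # 4 images, each firmly a different class
print(inception_score(sharp))      # close to 4.0, the maximum for 4 classes

# A generator that always emits the same uncertain distribution scores low:
dull = np.full((4, 4), 0.25)
print(inception_score(dull))       # close to 1.0, the minimum
```

The two toy cases show why a jump from 52.52 to 166.3 on ImageNet's 1000 classes is so striking: the score only rises when samples are simultaneously sharp and varied.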

To find out, many researchers searched the original ImageNet to test this suspicion, and they ultimately concluded that these images were indeed generated by BigGAN itself.

Although this confirmed that BigGAN is "honest", a skeptic might still reasonably object that one reason the paper's results are so impressive is that the images were cherry-picked. Just a few days ago, a BigGAN demo was released on TF Hub, and many people who have tried it will have noticed the problem: the model performs very well on common objects, such as dogs and simple landscapes, which tend to be monolithic and structurally simple, but it is terrible at generating more complex, varied scenes such as crowds.

So what does the imperfect side of BigGAN look like? Here are some generated images released by the researchers:

There is no doubt that these three pictures show clocks, but they differ from real objects. They look more like clocks from a dream: strange letters and redundant hands. To be fair, these are common problems in BigGAN's output: it cannot learn the various letters and characters in the dataset, and a GAN has no built-in ability to count, so we often see spiders with too many legs, frogs with too many eyes, and sometimes even a train with two locomotives.

As for humans... compared with other GANs that can generate diverse images, BigGAN is actually pretty good at generating human figures. But we are humans, and we are very good at spotting what is "off" in the faces and bodies of our own species, so the following results are still pretty unsettling.

If we take a quick look at a series of images generated by BigGAN, we can see that many of them have an eerie beauty. For example, when generating the landscape images below, the model follows the composition, light, and shadow it learned from the dataset, but when these materials from different samples are blended together, the result feels at once familiar and strange.

When it tries to "replicate" various man-made devices (washing machines? furnaces?), the resulting images look artistic, like exaggerated, richly detailed cutscenes from a movie.

What's more, BigGAN can also mimic soft focus in macro photography, a technique that deliberately reduces the sharpness of the lens to obtain a softer look. As shown in the picture below, we can't tell what the objects in the picture are, but they have a strong painterly quality.

Even the most ordinary things come out of BigGAN as if through a filter, rendered beautiful and unforgettable.

Is this art? For computer vision tasks, these "imaginative" distortions are exactly where BigGAN falls short; after all, its goal is to generate images that are extremely realistic and, at the same time, as diverse as possible. It is not creating; it is merely modeling the data it sees: ImageNet, a huge general-purpose dataset used to train all kinds of image-processing algorithms.

However, we should also realize that the researchers' meticulous curation of BigGAN's outputs is itself an act of art, and so is this article. You can tell a story this way, or make an unforgettable, beautiful film; it all depends on the dataset you collect and the outputs you choose. In the future, algorithms like BigGAN will transform human art, not by replacing human artists, but by becoming a powerful new collaborative tool.
