GAN Training Failure
This work borrows heavily from the PyTorch DCGAN Tutorial and the NVIDIA paper on progressive GANs.
One area of computer vision I’ve been wanting to explore is GANs. So when my wife and I moved into a home that had some extra wall space, I realized I could create a network to make some wall art and avoid a trip to Bed Bath & Beyond (two birds with one code!).
What are GANs?
GANs (Generative Adversarial Networks) work using two competing neural networks: one that creates forged images (the generator), and another that takes in the forgeries along with real examples of art and attempts to classify each as either real or fake (the discriminator). The networks then train in alternation, with the generator getting better at making fakes and the discriminator getting better at detecting them. At the end of the process, you hopefully have a generator that can create authentic-looking art from random noise. This method can be applied to generate more than images. In her book You Look Like a Thing and I Love You, Janelle Shane discusses using GANs to make everything from cookie recipes to pickup lines (which is where the book gets its name).
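To make the tug-of-war concrete, here is a small sketch of the two losses in plain Python. It assumes the binary cross-entropy objective used in the PyTorch DCGAN tutorial (this article does not state its loss explicitly), and the function names and sample scores are made up for illustration.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single prediction in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(real_scores, fake_scores):
    """The discriminator wants real images scored 1 and fakes scored 0."""
    real = sum(bce(s, 1.0) for s in real_scores) / len(real_scores)
    fake = sum(bce(s, 0.0) for s in fake_scores) / len(fake_scores)
    return real + fake

def generator_loss(fake_scores):
    """The generator wants its fakes scored as real (target 1)."""
    return sum(bce(s, 1.0) for s in fake_scores) / len(fake_scores)

# Hypothetical discriminator outputs: the probability that an image is real.
real_scores = [0.9, 0.8]
weak_fakes = [0.1, 0.2]    # the discriminator spots these easily
strong_fakes = [0.6, 0.7]  # these fool it more often

print("D loss vs weak fakes:  ", discriminator_loss(real_scores, weak_fakes))
print("D loss vs strong fakes:", discriminator_loss(real_scores, strong_fakes))
print("G loss, weak fakes:    ", generator_loss(weak_fakes))
print("G loss, strong fakes:  ", generator_loss(strong_fakes))
```

As the fakes improve, the generator’s loss falls and the discriminator’s loss rises, which is the pressure that keeps both networks improving.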
If you don’t know what GANs are, I suggest reading this PyTorch article for an in-depth explanation.
Challenges
Creating a GAN model that generates satisfactory results comes with several difficulties, which I’ll need to address in my project.
Data. Like all neural networks, GANs need a lot of data; however, they appear to have an even more voracious appetite than most. Most GAN projects I’ve read about have leveraged tens or hundreds of thousands of images. In contrast, my dataset is only a few thousand images that I was able to pull from a Google image search. In terms of style, I’d love to end up with something that resembles a Rothko, but I’ll settle for generic Bed Bath & Beyond.
Training time. In NVIDIA’s paper on progressive GANs, they trained their network for days using multiple GPUs. In my case, I’ll be using Google Colab and hoping the free-tier hardware will be good enough.
Mode Collapse. Besides being the name of my new dubstep project, mode collapse is what happens when the generated images lose their variety and begin to converge. Essentially, the generator sees that a few images are doing well at fooling the discriminator and decides to make all its output look like those few images.
Image Resolution. The larger the desired image, the larger the network needed. So how high a resolution will I need? Well, the recommended number of pixels per inch for digital prints is 300, so if I want something I can hang in a 12x15" frame, I’ll need a final resolution of 3,600 x 4,500 pixels, or about 16.2 million pixels in total! I obviously won’t be able to build a model at that high a resolution, but for this experiment I’ll say that’s the goal and see where I end up. To help with this, I’ll also be using a progressive GAN approach. This was pioneered by NVIDIA: they first trained a model at a low resolution and then progressively added the extra layers needed to increase the image resolution. You can think of it as wading into the pool instead of diving directly into the deep end. In their paper they were able to generate celebrity images at a resolution of 1024 x 1024 pixels (my target is only about 15x that pixel count).
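The print-resolution arithmetic is easy to check by hand. The sketch below works through a 12x15 inch print at 300 pixels per inch and compares the total pixel count with a 1024 x 1024 image; the helper name `print_pixels` is just for illustration.

```python
def print_pixels(width_in, height_in, ppi=300):
    """Return (width_px, height_px, total_px) for a print at a given pixel density."""
    width_px = width_in * ppi
    height_px = height_in * ppi
    return width_px, height_px, width_px * height_px

width_px, height_px, total_px = print_pixels(12, 15)
reference_px = 1024 * 1024  # pixel count of a 1024 x 1024 image

print(f"{width_px} x {height_px} = {total_px:,} pixels")
print(f"{total_px / reference_px:.1f}x the pixels of a 1024 x 1024 image")
```

Pixel density applies to each side of the print, so the total pixel count grows with the square of the pixels-per-inch figure.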
Getting into the Code
My full code can be found on GitHub. The main things I want to show in this article are the generator and the discriminator.
The Discriminator. My discriminator looks like any other image classification network. The unique thing about this class is that it takes the number of layers (based on the image size) as a parameter. This allows me to do the “progressive” part of progressive GANs without having to rewrite my classes each time I increase the image size.
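import torch.nn as nn

# N_CHANNELS and N_DISC_CHANNELS are module-level constants defined in the full code.

class Discriminator(nn.Module):
    def __init__(self, ngpu, n_layers):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.n_layers = n_layers

        # makes the desired number of convolutional layers
        # first layer: image channels -> feature channels, halves the image size
        self.layers = nn.ModuleList(
            [nn.Conv2d(N_CHANNELS, N_DISC_CHANNELS * 2, 4, 2, 1, bias=False)])
        # middle layers: each one halves the image size again
        self.layers.extend(
            [nn.Conv2d(N_DISC_CHANNELS * 2, N_DISC_CHANNELS * 2, 4, 2, 1, bias=False)
             for i in range(self.n_layers - 2)])
        # last layer: collapses the remaining 4x4 feature map to a single score
        self.layers.append(nn.Conv2d(N_DISC_CHANNELS * 2, 1, 4, 1, 0, bias=False))

        # transformations
        self.batch2 = nn.BatchNorm2d(N_DISC_CHANNELS * 2)
        self.LeakyReLU = nn.LeakyReLU(0.2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i == 0:
                # first layer: activation only, no batch norm
                x = self.LeakyReLU(x)
            elif layer.out_channels == N_DISC_CHANNELS * 2:
                # middle layers: batch norm, then activation
                x = self.batch2(x)
                x = self.LeakyReLU(x)
            else:
                # last layer: squash the score into (0, 1)
                x = self.sigmoid(x)
        return x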
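To see how the layer count ties to the image size, the sketch below traces the feature-map side length through the discriminator using the standard convolution output-size formula (dilation 1). It mirrors the kernel, stride, and padding values in the class above; the helper names are hypothetical and nothing here touches PyTorch itself.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a square convolution (Conv2d formula, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def discriminator_sizes(image_size, n_layers):
    """Side length of the feature map after each discriminator layer."""
    sizes = [image_size]
    # The first layer plus the middle layers: kernel 4, stride 2, padding 1.
    for _ in range(n_layers - 1):
        sizes.append(conv_out(sizes[-1], 4, 2, 1))
    # The final layer: kernel 4, stride 1, no padding.
    sizes.append(conv_out(sizes[-1], 4, 1, 0))
    return sizes

print(discriminator_sizes(32, 4))
print(discriminator_sizes(64, 5))
```

Each stride-2 layer halves the image, and the final layer only yields a single 1x1 score when it receives a 4x4 map, which is why a 32 pixel image pairs with 4 layers and each doubling of the image size needs one more.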
The Generator. The generator is essentially the reverse of the discriminator. It takes a vector of random values as noise and uses transposed convolutional layers to scale up the noise into an image. The more layers I have, the larger the final image.
import torch.nn as nn

# GEN_INPUT_SIZE, N_GEN_CHANNELS and N_CHANNELS are module-level constants defined in the full code.

class Generator(nn.Module):
    def __init__(self, ngpu, n_layers):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.n_layers = n_layers

        # makes the desired number of transposed convolutional layers
        # first layer: expands the noise vector into a 4x4 feature map
        self.layers = nn.ModuleList(
            [nn.ConvTranspose2d(GEN_INPUT_SIZE, N_GEN_CHANNELS * 2, 4, 1, 0, bias=False)])
        # middle layers: each one doubles the image size
        self.layers.extend(
            [nn.ConvTranspose2d(N_GEN_CHANNELS * 2, N_GEN_CHANNELS * 2, 4, 2, 1, bias=False)
             for i in range(self.n_layers - 3)])
        # last two layers: narrow the channels down to the image channels
        self.layers.extend(
            [nn.ConvTranspose2d(N_GEN_CHANNELS * 2, N_GEN_CHANNELS, 4, 2, 1, bias=False),
             nn.ConvTranspose2d(N_GEN_CHANNELS, N_CHANNELS, 4, 2, 1, bias=False)])

        # other transformations
        self.batch1 = nn.BatchNorm2d(N_GEN_CHANNELS)
        self.batch2 = nn.BatchNorm2d(N_GEN_CHANNELS * 2)
        self.relu = nn.ReLU(True)
        self.tanh = nn.Tanh()

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
            if layer.out_channels == N_GEN_CHANNELS * 2:
                x = self.batch2(x)
                x = self.relu(x)
            elif layer.out_channels == N_GEN_CHANNELS:
                x = self.batch1(x)
                x = self.relu(x)
            else:
                # output layer: squash pixel values into (-1, 1)
                x = self.tanh(x)
        return x
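The same kind of trace works for the generator, this time with the transposed-convolution output-size formula (no output padding, dilation 1). The sketch assumes the noise enters as a 1x1 spatial map, as in the PyTorch DCGAN tutorial; the helper names are hypothetical.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output side length of a square transposed convolution
    (ConvTranspose2d formula with no output padding, dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_sizes(n_layers):
    """Side length of the feature map after each generator layer."""
    sizes = [1]  # assumption: noise is shaped (batch, GEN_INPUT_SIZE, 1, 1)
    # The first layer: kernel 4, stride 1, no padding (1x1 -> 4x4).
    sizes.append(conv_transpose_out(sizes[-1], 4, 1, 0))
    # Every remaining layer: kernel 4, stride 2, padding 1 (doubles the size).
    for _ in range(n_layers - 1):
        sizes.append(conv_transpose_out(sizes[-1], 4, 2, 1))
    return sizes

print(generator_sizes(4))
print(generator_sizes(7))
```

With 4 layers the output is 32 pixels on a side, and each extra layer doubles it, so 7 layers reach 256.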
Testing the Network
Before I dive into trying to generate abstract art, I first want to test my network to make sure things are set up correctly. To do this, I’m going to run the network on a dataset of images from another GAN project and then see if I get similar results. The animeGAN project is a good fit for this use case. For their project they used 143,000 images of anime characters’ faces to create a generator that makes new characters. After downloading their dataset, I ran my model for 100 epochs with a target image size of 32 pixels, and voilà!

The results are actually better than I expected. With these results, I’m confident that my network is set up correctly and I can move to my dataset.
Training
Now it’s time to finally train the model on the art data. My initial image size is going to be a meager 32 pixels. I’ll train at this size for a while, after which I’ll add an additional layer to the generator and discriminator to double the image size to 64. Then it’s just rinse and repeat until I get to a satisfactory image resolution. But how do I know when to progress to the next size? A lot of work has been done on this question; I’m going to take the simple approach of training until I hit a GPU usage limit from Google, and then manually checking the results. If they look like they need more time, I’ll wait a day (so the usage limit is lifted) and train another round.
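The size-doubling schedule can be written down as a short loop. The sketch below assumes the relationship implied by the classes above (a 32 pixel image needs 4 layers, and every doubling adds one); `layers_for_size` and `progressive_schedule` are hypothetical helpers, not functions from the project’s code.

```python
def layers_for_size(image_size):
    """Layer count for a square image whose side is a power of two (16 or larger)."""
    if image_size < 16 or image_size & (image_size - 1):
        raise ValueError("image_size must be a power of two and at least 16")
    # 16 -> 3, 32 -> 4, 64 -> 5, ...
    return image_size.bit_length() - 2

def progressive_schedule(start_size, final_size):
    """Yield (image_size, n_layers) for each stage, doubling the size each time."""
    size = start_size
    while size <= final_size:
        yield size, layers_for_size(size)
        size *= 2

for size, n_layers in progressive_schedule(32, 256):
    print(f"train at {size}x{size} with n_layers={n_layers}")
```

At each stage, the pair would be passed to the Generator and Discriminator constructors before training resumes at the new size.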

32 Pixel Results. My first set of results looks great. Not only is there no sign of mode collapse, but the generator even picked up on the fact that some images include a frame.

64 and 128 Pixel Results. The 64 pixel results also turned out pretty well; however, by the time I increased the size to 128 pixels I was starting to see signs of mode collapse in the generator results.

256 Pixel Results. By the time I got to this image size, mode collapse had reduced the results to only about 3 or 4 types of images. I suspect this may have to do with my limited dataset. By the time I got to this resolution I only had about 1000 images, and it’s possible that the generator is just mimicking a few of the images in that collection.
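I judged mode collapse by eye, but a rough number can help when comparing batches. The sketch below is one simple heuristic and not something from my project: it averages the Euclidean distance between every pair of flattened images in a batch, so a collapsed batch scores close to zero. The tiny three-value "images" are made-up stand-ins for real generator output.

```python
import math
from itertools import combinations

def mean_pairwise_distance(samples):
    """Average Euclidean distance between every pair of flattened samples."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Hypothetical flattened images (three pixel values each).
varied_batch = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
collapsed_batch = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.51]]

print("varied:   ", mean_pairwise_distance(varied_batch))
print("collapsed:", mean_pairwise_distance(collapsed_batch))
```

A score that keeps shrinking from one training round to the next is a hint that the generator is settling on a handful of outputs, though raw pixel distance is only a crude proxy for visual variety.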

Conclusion
In the end, my progressive GAN model didn’t progress very far. However, I am still amazed by what a fairly simple network was able to create. It was shocking when it generated anime faces or when it placed some of its generated paintings in frames. I understand why people consider GANs one of the greatest machine learning breakthroughs in recent years. For now, this was just my “hello world” introduction to GANs, but I’ll probably be coming back.
Translated from: https://towardsdatascience.com/i-tried-and-failed-to-use-gans-to-create-art-but-it-was-still-worth-it-c392bcd29f39