I Tried (and Failed) to Use GANs to Create Art, but It Was Still Worth It

This work borrows heavily from the PyTorch DCGAN Tutorial and the NVIDIA paper on progressive GANs.

One area of computer vision I’ve been wanting to explore is GANs. So when my wife and I moved into a home that had some extra wall space, I realized I could create a network to make some wall art and avoid a trip to Bed Bath & Beyond (two birds with one code!).

What are GANs?

GANs (Generative Adversarial Networks) work using two competing neural networks: one that creates forged images (the generator), and another that takes in the forged images along with real examples of art and attempts to classify each one as either real or fake (the discriminator). The networks then iterate, the generator getting better at making fakes and the discriminator getting better at detecting them. At the end of the process, you hopefully have a generator that can randomly create authentic-looking art. This method can be applied to generate more than images. In her book You Look Like a Thing and I Love You, Janelle Shane discusses using GANs to make everything from cookie recipes to pick-up lines (which is where the book gets its name).
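To make the tug-of-war concrete, here is a minimal sketch of the two losses, assuming the standard binary cross-entropy formulation used in DCGAN-style training. The function names and the sample probabilities are illustrative assumptions, not taken from the project code.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))


def discriminator_loss(real_probs, fake_probs):
    """The discriminator wants real images scored near 1 and fakes near 0."""
    real_term = sum(bce(p, 1) for p in real_probs) / len(real_probs)
    fake_term = sum(bce(p, 0) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term


def generator_loss(fake_probs):
    """The generator wants its fakes scored near 1, i.e. mistaken for real."""
    return sum(bce(p, 1) for p in fake_probs) / len(fake_probs)


# Hypothetical discriminator outputs: the probability that each image is real.
real_probs = [0.9, 0.8]
weak_fakes = [0.1, 0.2]    # fakes the discriminator spots easily
strong_fakes = [0.6, 0.7]  # fakes that partly fool it

print(discriminator_loss(real_probs, weak_fakes))
print(generator_loss(weak_fakes), generator_loss(strong_fakes))
```

Better fakes lower the generator's loss and raise the discriminator's, which is the pressure that drives both networks to improve.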

If you don’t know what GANs are, I suggest reading this PyTorch article for an in-depth explanation.

Challenges

Creating a GAN model that generates satisfactory results comes with several difficulties, which I’ll need to address in my project.

Data. Like all neural networks, GANs need a lot of data; however, they appear to have an even more voracious appetite than most. Most GAN projects I’ve read about have leveraged tens or hundreds of thousands of images. In contrast, my dataset is only a few thousand images that I was able to pull from a Google image search. In terms of style, I’d love to end up with something that resembles a Rothko, but I’ll settle for generic Bed Bath & Beyond.

Training time. In NVIDIA’s paper on progressive GANs, they trained their network for days using multiple GPUs. In my case I’ll be using Google Colab and hoping the free-tier hardware will be good enough.

Mode Collapse. Besides being the name of my new dubstep project, mode collapse is what happens when the generated images lose their variety and begin to converge on a handful of outputs. Essentially the generator sees that a few images are doing well at fooling the discriminator and decides to make all its output look like those few images.
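One rough way to put a number on this is to measure how far apart the generated samples are from each other. The sketch below is a simple heuristic, not something used in this project: it averages the Euclidean distance over all pairs in a batch, and the toy pixel values are made up. It needs at least two samples and Python 3.8 or later for `math.dist`.

```python
import math
from itertools import combinations


def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of flattened samples."""
    pairs = list(combinations(samples, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Toy "images" flattened to 4 pixel values each (made-up numbers).
varied_batch = [[0.1, 0.9, 0.3, 0.5], [0.8, 0.2, 0.7, 0.1], [0.4, 0.4, 0.9, 0.6]]
collapsed_batch = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.51], [0.5, 0.5, 0.49, 0.5]]

print(mean_pairwise_distance(varied_batch))
print(mean_pairwise_distance(collapsed_batch))
```

A value that keeps shrinking from one training round to the next would be a hint that the outputs are converging, although eyeballing a grid of samples is still the more reliable check.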

Image Resolution. The larger the desired image, the larger the network needed to produce it. So how high a resolution will I need? Well, the recommended number of pixels per inch for digital prints is 300, so if I want something I can hang in a 12x15" frame I’ll need a final resolution of 3,600 x 4,500 pixels, or about 16.2 million pixels! I obviously won’t be able to build a model at that high a resolution, but for this experiment I’ll say that’s the goal and I’ll see where I end up. To help with this, I’ll also be using a progressive GAN approach. This was pioneered by NVIDIA, who first trained a model at a low resolution and then progressively added the extra layers needed to increase the image resolution. You can think of it as wading into the pool instead of diving directly into the deep end. In their paper they were able to generate celebrity images at a resolution of 1024 x 1024 pixels (my target has only about 15x as many pixels).
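The print-size arithmetic is easy to check: each side in inches is multiplied by the pixels per inch, and the total pixel count is the product of the two sides. A quick sketch (the function name is just for illustration):

```python
def print_resolution(width_in, height_in, ppi=300):
    """Pixel dimensions needed to print at the given size and pixels per inch."""
    return width_in * ppi, height_in * ppi


width_px, height_px = print_resolution(12, 15)
total_px = width_px * height_px
celeb_px = 1024 * 1024  # pixel count of a 1024 x 1024 image

print(width_px, height_px)            # 3600 4500
print(total_px)                       # 16200000
print(round(total_px / celeb_px, 1))  # 15.4
```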

Getting into the Code

My full code can be found on GitHub. The main things I want to show in this article are the generator and the discriminator.

The Discriminator. My discriminator looks like any other image classification network. The unique thing about this class is that it takes the number of layers (based on the image size) as a parameter. This allows me to do the “progressive” part of the progressive GAN without having to rewrite my classes each time I increase the image size.

```python
import torch.nn as nn

# N_CHANNELS and N_DISC_CHANNELS are constants that must be defined before
# this class is used; their definitions are not shown in this excerpt.


class Discriminator(nn.Module):
    def __init__(self, ngpu, n_layers):
        super().__init__()
        self.ngpu = ngpu
        self.n_layers = n_layers

        # makes the desired number of convolutional layers
        self.layers = nn.ModuleList([
            nn.Conv2d(N_CHANNELS, N_DISC_CHANNELS * 2, 4, 2, 1, bias=False)
        ])
        self.layers.extend([
            nn.Conv2d(N_DISC_CHANNELS * 2, N_DISC_CHANNELS * 2, 4, 2, 1, bias=False)
            for _ in range(self.n_layers - 2)
        ])
        self.layers.append(nn.Conv2d(N_DISC_CHANNELS * 2, 1, 4, 1, 0, bias=False))

        # transformations
        self.batch2 = nn.BatchNorm2d(N_DISC_CHANNELS * 2)
        self.LeakyReLU = nn.LeakyReLU(0.2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i == 0:
                # first layer: activation only, no batch norm
                x = self.LeakyReLU(x)
            elif layer.out_channels == N_DISC_CHANNELS * 2:
                # hidden layers all share the single BatchNorm2d module
                x = self.batch2(x)
                x = self.LeakyReLU(x)
            else:
                # final layer: squash the output to a real/fake probability
                x = self.sigmoid(x)
        return x
```

The Generator. The generator is essentially the reverse of the discriminator. It takes a vector of random values as noise and uses transposed convolutional layers to scale up the noise into an image. The more layers I have, the larger the final image.

```python
import torch.nn as nn

# GEN_INPUT_SIZE, N_GEN_CHANNELS and N_CHANNELS are constants that must be
# defined before this class is used; their definitions are not shown in this excerpt.


class Generator(nn.Module):
    def __init__(self, ngpu, n_layers):
        super().__init__()
        self.ngpu = ngpu
        self.n_layers = n_layers

        # makes the desired number of transposed convolutional layers
        self.layers = nn.ModuleList([
            nn.ConvTranspose2d(GEN_INPUT_SIZE, N_GEN_CHANNELS * 2, 4, 1, 0, bias=False)
        ])
        self.layers.extend([
            nn.ConvTranspose2d(N_GEN_CHANNELS * 2, N_GEN_CHANNELS * 2, 4, 2, 1, bias=False)
            for _ in range(self.n_layers - 3)
        ])
        self.layers.extend([
            nn.ConvTranspose2d(N_GEN_CHANNELS * 2, N_GEN_CHANNELS, 4, 2, 1, bias=False),
            nn.ConvTranspose2d(N_GEN_CHANNELS, N_CHANNELS, 4, 2, 1, bias=False),
        ])

        # other transformations
        self.batch1 = nn.BatchNorm2d(N_GEN_CHANNELS)
        self.batch2 = nn.BatchNorm2d(N_GEN_CHANNELS * 2)
        self.relu = nn.ReLU(True)
        self.tanh = nn.Tanh()

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
            if layer.out_channels == N_GEN_CHANNELS * 2:
                # wide hidden layers all share the batch2 module
                x = self.batch2(x)
                x = self.relu(x)
            elif layer.out_channels == N_GEN_CHANNELS:
                # second-to-last layer
                x = self.batch1(x)
                x = self.relu(x)
            else:
                # final layer: map pixel values into [-1, 1]
                x = self.tanh(x)
        return x
```
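To see how the layer count maps to image size, the sketch below traces the spatial size through both networks using the standard output-size formulas for `Conv2d` and `ConvTranspose2d` (dilation 1, no output padding). It is plain Python, the helper names are hypothetical, and it assumes the generator's noise input is 1x1 spatially, as in the DCGAN tutorial; the kernel, stride, and padding values are read off the two classes.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a Conv2d layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Spatial output size of a ConvTranspose2d layer (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def generator_sizes(n_layers):
    """Sizes after each generator layer, assuming a 1x1 noise input."""
    sizes = [conv_transpose_out(1, 4, 1, 0)]  # first layer: kernel 4, stride 1, padding 0
    for _ in range(n_layers - 1):             # the rest: kernel 4, stride 2, padding 1
        sizes.append(conv_transpose_out(sizes[-1], 4, 2, 1))
    return sizes


def discriminator_sizes(image_size, n_layers):
    """Sizes after each discriminator layer for a square input image."""
    sizes = []
    size = image_size
    for _ in range(n_layers - 1):          # kernel 4, stride 2, padding 1 halves the size
        size = conv_out(size, 4, 2, 1)
        sizes.append(size)
    sizes.append(conv_out(size, 4, 1, 0))  # final layer: kernel 4, stride 1, padding 0
    return sizes


print(generator_sizes(4))          # [4, 8, 16, 32]
print(discriminator_sizes(32, 4))  # [16, 8, 4, 1]
```

With these settings each extra layer doubles the generator's output and lets the discriminator absorb an image twice as large, so a 32-pixel image corresponds to `n_layers = 4` and a 64-pixel image to `n_layers = 5`.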

Testing the Network

Before I dive into trying to generate abstract art, I first want to test my network to make sure things are set up correctly. To do this I’m going to run the network on a dataset of images from another GAN project and then see if I get similar results. The animeGAN project is a good fit for this use case. For their project they used 143,000 images of anime characters’ faces to create a generator that makes new characters. After downloading their dataset, I ran my model for 100 epochs with a target image size of 32 pixels, and voila!

[Figure: Results from my GAN model]

The results are actually better than I expected. With these results, I’m confident that my network is set up correctly and I can move to my dataset.

Training

Now it’s time to finally train the model on the art data. My initial image size is going to be a meager 32 pixels. I’ll train at this size for a while, after which I’ll add a layer to both the generator and the discriminator to double the image size to 64. Then it’s just rinse and repeat until I get to a satisfactory image resolution. But how do I know when to progress to the next size? A lot of work has been done on this question; I’m going to take the simple approach of training until I hit Google’s GPU usage limit and then manually checking the results. If they look like they need more time, I’ll wait a day (so the usage limit is lifted) and train another round.
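The size-doubling schedule can be sketched in a few lines. This assumes the relationship image size = 2 ** (n_layers + 1), which follows from the kernel, stride, and padding settings in the two classes; the function names are hypothetical and the actual training call is left out.

```python
import math


def layers_for_size(image_size):
    """Layers needed for a square image, assuming size = 2 ** (n_layers + 1)."""
    if image_size < 16:
        # the generator class always builds at least three layers
        raise ValueError("image_size must be at least 16")
    n_layers = int(math.log2(image_size)) - 1
    if 2 ** (n_layers + 1) != image_size:
        raise ValueError("image_size must be a power of two")
    return n_layers


def progressive_schedule(start_size, final_size):
    """Yield (image_size, n_layers) for each stage, doubling the size each time."""
    size = start_size
    while size <= final_size:
        yield size, layers_for_size(size)
        size *= 2


for size, n_layers in progressive_schedule(32, 256):
    # a real run would train at this size before moving on
    print(size, n_layers)
```

For the sizes used here this gives 4 layers at 32 pixels, 5 at 64, 6 at 128, and 7 at 256.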

[Figure: Hello darkness my old friend]

32 Pixel Results. My first set of results looks great. Not only is there no sign of mode collapse, the generator even picked up on the fact that some images include a frame.

[Figure: Generated images at size 32]

64 and 128 Pixel Results. The 64 pixel results also turned out pretty well; however, by the time I increased the size to 128 pixels I was starting to see signs of mode collapse in the generator results.

[Figure: Starting to see identical output]

256 Pixel Results. By the time I got to this image size, mode collapse had reduced the results to only about 3 or 4 types of images. I suspect this has to do with my limited dataset. At this resolution I only had about 1,000 images to train on, and it’s possible that the generator is just mimicking a few of the images in that collection.

[Figure: Mode collapse]

Conclusion

In the end my progressive GAN model didn’t progress very far. However, I am still amazed by what a fairly simple network was able to create. It was shocking when it generated anime faces and when it placed some of its generated paintings in frames. I understand why people consider GANs one of the greatest machine learning breakthroughs in recent years. For now this was just my hello-world introduction to GANs, but I’ll probably be coming back.

Translated from: https://towardsdatascience.com/i-tried-and-failed-to-use-gans-to-create-art-but-it-was-still-worth-it-c392bcd29f39
