對于fcn,經常要使用到Deconvolution進行上采樣。對于caffe使用者,使用Deconvolution上采樣,其參數往往直接給定,不需要通過學習獲得。
給定參數的方式很有意思,可以通過兩種方式實現,但是這兩種方式并非完全等價,各有各的價值。
第一種方式: 通過net_surgery給定,
這種方式最開始出現在FCN中。https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/voc-fcn32s/solve.py
代碼如下:
import caffe
import surgery, scoreimport numpy as np
import os
import systry:import setproctitlesetproctitle.setproctitle(os.path.basename(os.getcwd()))
except:passweights = '../ilsvrc-nets/vgg16-fcn.caffemodel'# init
caffe.set_device(int(sys.argv[1]))
caffe.set_mode_gpu()solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from(weights)# surgeries (這里就是對于反卷積層的參數進行初始化)
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)# scoring
val = np.loadtxt('../data/segvalid11.txt', dtype=str)for _ in range(25):solver.step(4000)score.seg_tests(solver, False, val, layer='score')
上采樣的函數:
# make a bilinear interpolation kerneldef upsample_filt(self,size):factor = (size + 1) // 2if size % 2 == 1:center = factor - 1else:center = factor - 0.5og = np.ogrid[:size, :size]return (1 - abs(og[0] - center) / factor) * \(1 - abs(og[1] - center) / factor)# set parameters s.t. deconvolutional layers compute bilinear interpolation# N.B. this is for deconvolution without groupsdef interp_surgery(self,net, layers):for l in layers:print lm, k, h, w = net.params[l][0].data.shape #僅僅修改w,不需要修改bias,其為0print("deconv shape:\n")print m, k, h, w if m != k and k != 1:print 'input + output channels need to be the same or |output| == 1'raiseif h != w:print 'filters need to be square'raisefilt = self.upsample_filt(h)print(filt)net.params[l][0].data[range(m), range(k), :, :] = filt
第二種方式:直接在Deconvolution中給定參數weight_filler,即:
代碼如下:
layer {name: "fc8_upsample"type: "Deconvolution"bottom: "fc8"top: "fc8_upsample"param {lr_mult: 0decay_mult: 0}param {lr_mult: 0decay_mult: 0}convolution_param {num_output: 1kernel_size: 16stride: 8pad: 3weight_filler { # 這里相當于上面的直接賦值type: "bilinear"}}
}
weight_filler初始化成雙線性就等價于直接按照上面的方式賦值。
看起來好像以上兩種方法一樣,但是實際上有不同。主要區別在對于num_output>1的情形。
比如對于一個輸入是2個通道的map,希望對其進行上采樣,自然我們希望分別對于map放大即可。如果使用Deconvolution,則shape大小為2,2,16,16(設其大小為16*16).不考慮bias項。
假設按照上面的方式初始化,則對于第一種方法,得到結果:
[0,0,:,:]:
[[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]]
[0,1,:,:]:
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[1,0,:,:]:
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[1,1,:,:]:
[[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]]
而第二種方式全部都是[0,0,:,:]這樣的矩陣。
以上兩種方法應該是第一種對的。因為Deconvolution 其實與卷積類似,按照第一種結果才能分別單獨地對map上采樣,而采用第二種則將會得到兩個相同的map。(因為綜合了兩個輸入map的信息)
因此結論: 對于多個輸入輸出的Deconvolution,采用方法1,對于單個輸入的,方法1,2通用。
附上Deconvolution的官方編碼:
說明:
以上的稱述有點瑕疵,其實caffe已經解決了上述的問題,我之前沒有好好留意。 關鍵就在group這個選項。
如果num_output>1,則填上group: c 再加上weight_filler: { type: “bilinear” },即可完成初始化。