Faster R-CNN Study Notes: Generating Proposals with the RPN

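Continuing from the previous post, "Faster R-CNN Study Notes: The Full RPN Training Process", and assuming the RPN has already been trained, let's look at how the trained RPN is used to generate proposals.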

The network is defined in rpn_test.pt:

name: "VGG_CNN_M_1024"
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 224
  dim: 224
}
input: "im_info"
input_shape {
  dim: 1
  dim: 3
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0005
    beta: 0.75
    k: 2
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0005
    beta: 0.75
    k: 2
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}

#========= RPN ============

layer {
  name: "rpn_conv/3x3"
  type: "Convolution"
  bottom: "conv5"
  top: "rpn/output"
  convolution_param {
    num_output: 256
    kernel_size: 3 pad: 1 stride: 1
  }
}
layer {
  name: "rpn_relu/3x3"
  type: "ReLU"
  bottom: "rpn/output"
  top: "rpn/output"
}
layer {
  name: "rpn_cls_score"
  type: "Convolution"
  bottom: "rpn/output"
  top: "rpn_cls_score"
  convolution_param {
    num_output: 18   # 2(bg/fg) * 9(anchors)
    kernel_size: 1 pad: 0 stride: 1
  }
}
layer {
  name: "rpn_bbox_pred"
  type: "Convolution"
  bottom: "rpn/output"
  top: "rpn_bbox_pred"
  convolution_param {
    num_output: 36   # 4 * 9(anchors)
    kernel_size: 1 pad: 0 stride: 1
  }
}
layer {
  bottom: "rpn_cls_score"
  top: "rpn_cls_score_reshape"
  name: "rpn_cls_score_reshape"
  type: "Reshape"
  reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }
}

#========= RoI Proposal ============

layer {
  name: "rpn_cls_prob"
  type: "Softmax"
  bottom: "rpn_cls_score_reshape"
  top: "rpn_cls_prob"
}
layer {
  name: 'rpn_cls_prob_reshape'
  type: 'Reshape'
  bottom: 'rpn_cls_prob'
  top: 'rpn_cls_prob_reshape'
  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}
layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rois'
  top: 'scores'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}

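Again borrowing the figure from reference [1], the network is drawn as follows. It turns out to be essentially the same as the RPN network used for training.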




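As shown above, a 224*224 image passes through the five convolutional layers, giving a 13*13 feature map with 512 channels (conv5). The 3*3 convolution rpn_conv/3x3 then reduces this to 256 feature maps of size 13*13 (you can also think of it as a single 13*13*256 feature map, where 256 is the number of channels). Two 1*1 convolutions then output the 13*13*18 rpn_cls_score and the 13*13*36 rpn_bbox_pred. rpn_cls_score is reshaped in preparation for the softmax.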
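
The 13*13 size can be checked by hand from the layer parameters in rpn_test.pt. The sketch below assumes Caffe's usual rounding rules (convolution rounds the output size down, pooling rounds it up). The helper names conv_out, pool_out and rpn_feature_size are made up for illustration and are not part of py-faster-rcnn.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output side length of a convolution (assumed: Caffe rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output side length of a pooling layer (assumed: Caffe rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def rpn_feature_size(size):
    """Trace one side of the input through conv1..conv5 of rpn_test.pt."""
    size = conv_out(size, kernel=7, stride=2)          # conv1
    size = pool_out(size, kernel=3, stride=2)          # pool1
    size = conv_out(size, kernel=5, stride=2, pad=1)   # conv2
    size = pool_out(size, kernel=3, stride=2)          # pool2
    for _ in range(3):                                 # conv3, conv4, conv5
        size = conv_out(size, kernel=3, stride=1, pad=1)
    return size


print(rpn_feature_size(224))  # 13
```

The intermediate sizes are 109, 54, 26 and 13, and the three 3*3 convolutions with pad 1 leave the size unchanged.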


Next, softmax is applied to rpn_cls_score_reshape to produce rpn_cls_prob, which is then reshaped back to give rpn_cls_prob_reshape.
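
To see why the reshape is needed, here is a minimal NumPy sketch of the Reshape, Softmax, Reshape sequence. It assumes the softmax runs over axis 1 (Caffe's default) and uses random scores in place of a real network output; rpn_cls_prob_sketch is a hypothetical helper name.

```python
import numpy as np


def rpn_cls_prob_sketch(cls_score, num_anchors=9):
    """Mimic Reshape -> Softmax -> Reshape on a (1, 2*A, H, W) score blob."""
    n, c, h, w = cls_score.shape
    assert c == 2 * num_anchors
    # Reshape {0, 2, -1, 0}: (1, 2*A, H, W) -> (1, 2, A*H, W)
    reshaped = cls_score.reshape(n, 2, -1, w)
    # Softmax over axis 1, i.e. over the (bg, fg) pair of each anchor
    shifted = reshaped - reshaped.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    prob = exp / exp.sum(axis=1, keepdims=True)
    # Reshape {0, 18, -1, 0}: back to (1, 2*A, H, W)
    return prob.reshape(n, 2 * num_anchors, -1, w)


scores = np.random.randn(1, 18, 13, 13).astype(np.float32)
prob = rpn_cls_prob_sketch(scores)
print(prob.shape)  # (1, 18, 13, 13)
```

After the round trip, channel k holds the background probability of anchor k and channel k + 9 holds its foreground probability, so the two sum to 1 at every location.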


Finally, rpn_cls_prob_reshape (1*18*13*13), rpn_bbox_pred (1*36*13*13) and im_info (1*3) are fed into the proposal layer, which outputs rois and scores.

layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rois'
  top: 'scores'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}
Let's look at proposal_layer, starting with setup:

    def setup(self, bottom, top):
        # parse the layer parameter string, which must be valid YAML
        layer_params = yaml.load(self.param_str_)

        self._feat_stride = layer_params['feat_stride']
        anchor_scales = layer_params.get('scales', (8, 16, 32))
        self._anchors = generate_anchors(scales=np.array(anchor_scales))
        self._num_anchors = self._anchors.shape[0]

        if DEBUG:
            print 'feat_stride: {}'.format(self._feat_stride)
            print 'anchors:'
            print self._anchors

        # rois blob: holds R regions of interest, each is a 5-tuple
        # (n, x1, y1, x2, y2) specifying an image batch index n and a
        # rectangle (x1, y1, x2, y2)
        top[0].reshape(1, 5)

        # scores blob: holds scores for R regions of interest
        if len(top) > 1:
            top[1].reshape(1, 1, 1, 1)
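This is similar to setup in anchor_target_layer.py: it sets the shapes of the top blobs and generates the base anchors, i.e. the anchors for the top-left cell of the feature map.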
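
As a rough illustration of what the base anchors look like, the following sketch enumerates 3 aspect ratios times 3 scales around a 16*16 reference box. Only the scales default (8, 16, 32) comes from the code above; base_size=16, ratios=(0.5, 1, 2) and all helper names are assumptions made for this example, and the function is a simplified stand-in rather than the repository's generate_anchors.

```python
import numpy as np


def _whctrs(anchor):
    """Return width, height and center (x, y) of an (x1, y1, x2, y2) box."""
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    return w, h, anchor[0] + 0.5 * (w - 1), anchor[1] + 0.5 * (h - 1)


def _mkanchors(ws, hs, x_ctr, y_ctr):
    """Build boxes with the given widths and heights around one center."""
    ws = ws[:, np.newaxis]
    hs = hs[:, np.newaxis]
    return np.hstack((x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1),
                      x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1)))


def generate_anchors_sketch(base_size=16, ratios=(0.5, 1, 2),
                            scales=(8, 16, 32)):
    ratios = np.array(ratios, dtype=np.float64)
    scales = np.array(scales, dtype=np.float64)
    base = np.array([0, 0, base_size - 1, base_size - 1], dtype=np.float64)
    # Step 1: keep the area roughly fixed and vary the aspect ratio.
    w, h, x_ctr, y_ctr = _whctrs(base)
    ws = np.round(np.sqrt(w * h / ratios))
    hs = np.round(ws * ratios)
    ratio_anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    # Step 2: enlarge every ratio anchor by each scale.
    out = []
    for anchor in ratio_anchors:
        w, h, x_ctr, y_ctr = _whctrs(anchor)
        out.append(_mkanchors(w * scales, h * scales, x_ctr, y_ctr))
    return np.vstack(out)


anchors = generate_anchors_sketch()
print(anchors.shape)  # (9, 4)
```

All nine boxes share the center (7.5, 7.5), the middle of the first 16*16 cell, which is why they can later be moved to any other cell by adding a multiple of feat_stride.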

    def forward(self, bottom, top):
        # Algorithm:
        #
        # for each (H, W) location i
        #   generate A anchor boxes centered on cell i
        #   apply predicted bbox deltas at cell i to each of the A anchors
        # clip predicted boxes to image
        # remove predicted boxes with either height or width < threshold
        # sort all (proposal, score) pairs by score from highest to lowest
        # take top pre_nms_topN proposals before NMS
        # apply NMS with threshold 0.7 to remaining proposals
        # take after_nms_topN proposals after NMS
        # return the top proposals (-> RoIs top, scores top)

        assert bottom[0].data.shape[0] == 1, \
            'Only single item batches are supported'

        cfg_key = str(self.phase) # either 'TRAIN' or 'TEST'
        pre_nms_topN  = cfg[cfg_key].RPN_PRE_NMS_TOP_N
        post_nms_topN = cfg[cfg_key].RPN_POST_NMS_TOP_N
        nms_thresh    = cfg[cfg_key].RPN_NMS_THRESH
        min_size      = cfg[cfg_key].RPN_MIN_SIZE

        # the first set of _num_anchors channels are bg probs
        # (the first 9 channels are background, the rest are foreground predictions)
        # the second set are the fg probs, which we want
        scores = bottom[0].data[:, self._num_anchors:, :, :]
        bbox_deltas = bottom[1].data
        im_info = bottom[2].data[0, :]

        if DEBUG:
            print 'im_size: ({}, {})'.format(im_info[0], im_info[1])
            print 'scale: {}'.format(im_info[2])

        # 1. Generate proposals from bbox deltas and shifted anchors
        height, width = scores.shape[-2:]

        if DEBUG:
            print 'score map size: {}'.format(scores.shape)

        # Enumerate all shifts
        shift_x = np.arange(0, width) * self._feat_stride
        shift_y = np.arange(0, height) * self._feat_stride
        shift_x, shift_y = np.meshgrid(shift_x, shift_y)
        shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                            shift_x.ravel(), shift_y.ravel())).transpose()

        # Enumerate all shifted anchors:
        #
        # add A anchors (1, A, 4) to
        # cell K shifts (K, 1, 4) to get
        # shift anchors (K, A, 4)
        # reshape to (K*A, 4) shifted anchors
        A = self._num_anchors
        K = shifts.shape[0]
        anchors = self._anchors.reshape((1, A, 4)) + \
                  shifts.reshape((1, K, 4)).transpose((1, 0, 2))
        anchors = anchors.reshape((K * A, 4))

        # Transpose and reshape predicted bbox transformations to get them
        # into the same order as the anchors:
        #
        # bbox deltas will be (1, 4 * A, H, W) format
        # transpose to (1, H, W, 4 * A)
        # reshape to (1 * H * W * A, 4) where rows are ordered by (h, w, a)
        # in slowest to fastest order
        # (this transformation makes the layout match the shape of anchors)
        bbox_deltas = bbox_deltas.transpose((0, 2, 3, 1)).reshape((-1, 4))

        # Same story for the scores:
        #
        # scores are (1, A, H, W) format
        # transpose to (1, H, W, A)
        # reshape to (1 * H * W * A, 1) where rows are ordered by (h, w, a)
        # (this transformation makes the layout match the shape of anchors)
        scores = scores.transpose((0, 2, 3, 1)).reshape((-1, 1))

        # Convert anchors into proposals via bbox transformations,
        # producing the predicted (x1, y1, x2, y2)
        proposals = bbox_transform_inv(anchors, bbox_deltas)

        # 2. clip predicted boxes to image
        proposals = clip_boxes(proposals, im_info[:2])

        # 3. remove predicted boxes with either height or width < threshold
        # (NOTE: convert min_size to input image scale stored in im_info[2])
        keep = _filter_boxes(proposals, min_size * im_info[2])
        proposals = proposals[keep, :]
        scores = scores[keep]

        # 4. sort all (proposal, score) pairs by score from highest to lowest
        # 5. take top pre_nms_topN (e.g. 6000)
        order = scores.ravel().argsort()[::-1]
        if pre_nms_topN > 0:
            order = order[:pre_nms_topN]
        proposals = proposals[order, :]
        scores = scores[order]

        # 6. apply nms (e.g. threshold = 0.7)
        # 7. take after_nms_topN (e.g. 300)
        # 8. return the top proposals (-> RoIs top)
        keep = nms(np.hstack((proposals, scores)), nms_thresh)
        if post_nms_topN > 0:
            keep = keep[:post_nms_topN]
        proposals = proposals[keep, :]
        scores = scores[keep]

        # Output rois blob
        # Our RPN implementation only supports a single input image, so all
        # batch inds are 0
        # each row of rois is (n, x1, y1, x2, y2); the boxes generated here
        # are in the scaled (network input) image coordinates
        batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
        blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
        top[0].reshape(*(blob.shape))
        top[0].data[...] = blob

        # [Optional] output scores blob
        if len(top) > 1:
            top[1].reshape(*(scores.shape))
            top[1].data[...] = scores
In forward, all anchors are generated first, and the predicted offsets are then applied to those anchors to produce the proposals.
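
The two steps can be tried in isolation with the sketch below. shift_anchors follows the broadcasting trick used in forward. decode_boxes is an illustrative version of the usual (dx, dy, dw, dh) box parameterization and is only assumed to behave like bbox_transform_inv, whose source is not shown in this post. Both function names are hypothetical.

```python
import numpy as np


def shift_anchors(base_anchors, height, width, feat_stride=16):
    """Copy the A base anchors to every cell of an H x W feature map."""
    shift_x, shift_y = np.meshgrid(np.arange(width) * feat_stride,
                                   np.arange(height) * feat_stride)
    shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                        shift_x.ravel(), shift_y.ravel())).transpose()
    # (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4); rows end up in (h, w, a) order
    all_anchors = base_anchors[np.newaxis, :, :] + shifts[:, np.newaxis, :]
    return all_anchors.reshape(-1, 4)


def decode_boxes(anchors, deltas):
    """Apply (dx, dy, dw, dh) deltas to (x1, y1, x2, y2) anchors."""
    widths = anchors[:, 2] - anchors[:, 0] + 1.0
    heights = anchors[:, 3] - anchors[:, 1] + 1.0
    ctr_x = anchors[:, 0] + 0.5 * widths
    ctr_y = anchors[:, 1] + 0.5 * heights
    # dx, dy move the center in units of the anchor size;
    # dw, dh rescale width and height in log space
    pred_ctr_x = deltas[:, 0] * widths + ctr_x
    pred_ctr_y = deltas[:, 1] * heights + ctr_y
    pred_w = np.exp(deltas[:, 2]) * widths
    pred_h = np.exp(deltas[:, 3]) * heights
    return np.stack((pred_ctr_x - 0.5 * pred_w, pred_ctr_y - 0.5 * pred_h,
                     pred_ctr_x + 0.5 * pred_w, pred_ctr_y + 0.5 * pred_h),
                    axis=1)


base = np.array([[0, 0, 15, 15]], dtype=np.float64)  # one toy base anchor
anchors = shift_anchors(base, height=2, width=3)
proposals = decode_boxes(anchors, np.zeros_like(anchors))
print(anchors.shape, proposals.shape)  # (6, 4) (6, 4)
```

With 9 base anchors and a 13*13 feature map, shift_anchors would return 13 * 13 * 9 = 1521 boxes, which is why bbox_deltas and scores are transposed into the same (h, w, a) row order before decoding.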

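Next come several pruning steps and NMS to remove duplicates. The layer returns the proposals with the highest foreground scores together with the corresponding scores. Note that the generated proposals are expressed relative to the network input, i.e. in the scaled image coordinates.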
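
The pruning steps (clip, drop small boxes, keep the top scores, NMS) can be sketched in plain NumPy as follows. This is a simplified illustration with a slow greedy NMS, not the repository's implementation. The default values follow the "e.g." numbers in the comments above, except min_size=16, which is an assumption, and all function names are hypothetical.

```python
import numpy as np


def clip_to_image(boxes, im_h, im_w):
    boxes = boxes.copy()
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, im_w - 1)  # x1, x2
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, im_h - 1)  # y1, y2
    return boxes


def filter_small(boxes, min_size):
    ws = boxes[:, 2] - boxes[:, 0] + 1
    hs = boxes[:, 3] - boxes[:, 1] + 1
    return np.where((ws >= min_size) & (hs >= min_size))[0]


def greedy_nms(boxes, scores, thresh):
    """Return kept indices, highest score first."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= thresh]  # drop boxes overlapping box i too much
    return np.array(keep, dtype=np.int64)


def select_proposals(boxes, scores, im_h, im_w, min_size=16,
                     pre_nms_top_n=6000, post_nms_top_n=300, nms_thresh=0.7):
    boxes = clip_to_image(boxes, im_h, im_w)
    keep = filter_small(boxes, min_size)
    boxes, scores = boxes[keep], scores[keep]
    order = scores.argsort()[::-1][:pre_nms_top_n]
    boxes, scores = boxes[order], scores[order]
    keep = greedy_nms(boxes, scores, nms_thresh)[:post_nms_top_n]
    return boxes[keep], scores[keep]


boxes = np.array([[0, 0, 99, 99], [5, 5, 104, 104],
                  [200, 200, 299, 299], [10, 10, 12, 12]], dtype=np.float64)
scores = np.array([0.9, 0.8, 0.7, 0.95])
kept_boxes, kept_scores = select_proposals(boxes, scores, im_h=300, im_w=300)
print(kept_scores)  # [0.9 0.7]
```

In this toy input the tiny fourth box is removed by the size filter despite having the highest score, and the second box is suppressed because its IoU with the first is about 0.82, above the 0.7 threshold.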



Now let's go back to train_faster_rcnn_alt_opt and look at the step 'Stage 1 RPN, generate proposals':

    mp_kwargs = dict(
            queue=mp_queue,
            imdb_name=args.imdb_name,
            rpn_model_path=str(rpn_stage1_out['model_path']),
            cfg=cfg,
            rpn_test_prototxt=rpn_test_prototxt)
    p = mp.Process(target=rpn_generate, kwargs=mp_kwargs)
    p.start()
    rpn_stage1_out['proposal_path'] = mp_queue.get()['proposal_path']
    p.join()

rpn_generate loads the test network with the RPN weights trained in the previous step. imdb_proposals then uses this network and the imdb to generate the RPN proposals.
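
The pattern used here, running a stage in a child process and passing its result back through a queue, can be reproduced with the standard library alone. In the sketch below generate_stub is a hypothetical stand-in for rpn_generate that only builds a file name, and run_stage is likewise invented for this example.

```python
import multiprocessing as mp


def generate_stub(queue, imdb_name, rpn_model_path):
    """Hypothetical stand-in for rpn_generate: do the work, report a path."""
    proposal_path = '{}_{}_proposals.pkl'.format(rpn_model_path, imdb_name)
    queue.put({'proposal_path': proposal_path})


def run_stage(imdb_name, rpn_model_path):
    """Run one stage in a child process and return what it reports."""
    queue = mp.Queue()
    p = mp.Process(target=generate_stub,
                   kwargs=dict(queue=queue, imdb_name=imdb_name,
                               rpn_model_path=rpn_model_path))
    p.start()
    result = queue.get()  # read before join() so the child can flush its data
    p.join()
    return result['proposal_path']
```

Calling run_stage('voc_2007_trainval', 'rpn_stage1') from a script's `if __name__ == '__main__':` block returns the path produced by the child. Running each stage in its own process also means that whatever the stage allocated is released when the child exits.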

imdb_proposals is defined in generate.py:

def im_proposals(net, im):
    """Generate RPN proposals on a single image."""
    blobs = {}
    blobs['data'], blobs['im_info'] = _get_image_blob(im)
    net.blobs['data'].reshape(*(blobs['data'].shape))
    net.blobs['im_info'].reshape(*(blobs['im_info'].shape))
    blobs_out = net.forward(
            data=blobs['data'].astype(np.float32, copy=False),
            im_info=blobs['im_info'].astype(np.float32, copy=False))

    scale = blobs['im_info'][0, 2]
    boxes = blobs_out['rois'][:, 1:].copy() / scale
    scores = blobs_out['scores'].copy()
    return boxes, scores

def imdb_proposals(net, imdb):
    """Generate RPN proposals on all images in an imdb."""

    _t = Timer()
    imdb_boxes = [[] for _ in xrange(imdb.num_images)]
    for i in xrange(imdb.num_images):
        im = cv2.imread(imdb.image_path_at(i))
        _t.tic()
        imdb_boxes[i], scores = im_proposals(net, im)
        _t.toc()
        print 'im_proposals: {:d}/{:d} {:.3f}s' \
              .format(i + 1, imdb.num_images, _t.average_time)
        if 0:
            dets = np.hstack((imdb_boxes[i], scores))
            # from IPython import embed; embed()
            _vis_proposals(im, dets[:3, :], thresh=0.9)
            plt.show()

    return imdb_boxes
Note that im_proposals contains:

  boxes = blobs_out['rois'][:, 1:].copy() / scale
So the proposals produced by the RPN are divided by the scale factor, which brings them back to the scale of the original image.
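
A small numeric example makes the rescaling concrete. The helper image_scale below is hypothetical and assumes the common convention of resizing the short side to 600 pixels while capping the long side at 1000. The actual values come from the configuration and _get_image_blob, which are not shown in this post.

```python
import numpy as np


def image_scale(im_h, im_w, target_size=600, max_size=1000):
    """Scale that brings the short side to target_size, capped by max_size."""
    short_side, long_side = min(im_h, im_w), max(im_h, im_w)
    scale = float(target_size) / short_side
    if round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return scale


scale = image_scale(375, 500)                  # 600 / 375 = 1.6
rois = np.array([[0., 160., 80., 480., 320.]]) # (n, x1, y1, x2, y2), scaled image
boxes = rois[:, 1:].copy() / scale             # drop n, go back to original pixels
print(boxes.shape)  # (1, 4)
```

Here the box (160, 80, 480, 320) on the 600*960-scaled input maps back to approximately (100, 50, 300, 200) on the original 375*500 image.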

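imdb_boxes is a list with one entry per image. Each entry is an R*4 array, where R is the number of proposals for that image and each row is (x1, y1, x2, y2); the batch-index column of rois has already been dropped by the [:, 1:] slice.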


References:

1. http://blog.csdn.net/zy1034092330/article/details/62044941

2. https://www.zhihu.com/question/35887527/answer/140239982




