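I've recently been processing a batch of data, 10^8 records in all. A full run was going to take about a month, and the program was using only a single CPU (something I had never noticed, argh).
Then a senior labmate pointed out that I could split the 10^8 records into ten chunks of 10^7 each and run them as ten parallel jobs. Damn, the whole thing finished in three days. How on earth did I not think of that earlier?!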
From now on, whenever I run a single-task program, I'll make sure to check how much CPU it is actually using.
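There is also a simple way to get parallel processing: split the original file into pieces, submit each piece as its own job, and let the queue scheduler run them in parallel for you. Go easy on me, folks; this trick really is pretty useful.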
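If no queue scheduler is at hand, the same split-and-run idea can be tried on a single multi-core machine. The following is only a minimal sketch that swaps the scheduler for Perl's built-in `fork` and `wait`; `run_parallel`, `$worker`, and `$max_jobs` are hypothetical names chosen for illustration and are not part of the script in this post.

```perl
use strict;
use warnings;

# Run $worker->($chunk) once per chunk, keeping at most $max_jobs child
# processes alive at a time. Returns the number of chunks whose child failed.
sub run_parallel {
    my ($worker, $max_jobs, @chunks) = @_;
    my ($running, $failed) = (0, 0);

    for my $chunk (@chunks) {
        # All slots busy: block until one child finishes.
        if ($running >= $max_jobs) {
            wait();
            $failed++ if $? != 0;
            $running--;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child process: handle exactly one chunk, then exit.
            my $ok = eval { $worker->($chunk); 1 };
            warn $@ unless $ok;
            exit($ok ? 0 : 1);
        }
        $running++;
    }

    # Collect the children that are still running.
    while ($running > 0) {
        wait();
        $failed++ if $? != 0;
        $running--;
    }
    return $failed;
}
```

Calling `run_parallel(\&process_file, 10, @chunk_files)`, where `process_file` and `@chunk_files` stand for your own processing routine and list of split files, would keep up to ten chunks in flight at once. On a shared cluster the queue scheduler remains the better choice, since it also spreads the jobs across machines.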
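The crummy script below (a Perl script, by the way) does the file splitting. Here it splits a file so that every 4000 records go into one file and every 1000 files go into one folder. Enough chatter, here is the code: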
#!/usr/bin/perl -w
# Program name: filter_pro.pl
# Author      : bbsunchen
# Contact     : bbsunchen at gm*il.com
# Date        : 11/10/2011
# Last Update : 11/10/2011
# Reference   : Please cite our following papers when you are using this script.

# Description : Split a file of two-line records (title line + data line) into
#               files of 4000 records each, with 1000 files per folder.

#===============================================================================================================
use warnings;
use strict;
use Getopt::Long;
use Cwd qw(abs_path);
use File::Basename qw(dirname);

my %opts;
GetOptions(\%opts, "dir:s");
my $usage = <<"USAGE";
    Program: $0
    INPUT:
        -dir        full path of file

    OUTPUT:
        numbered folders next to the input file, each holding up to 1000 .fa files
USAGE
die $usage unless ($opts{dir} && -e $opts{dir});

# Output folders are created next to the input file.
my $cwd;
if ($opts{dir} =~ m{^/})
{
    $cwd = dirname($opts{dir});
}
else
{
    $cwd = dirname(abs_path($opts{dir}));
}

open(my $in, '<', $opts{dir}) or die "Cannot open $opts{dir}: $!";
my $seq_num = 0;        # number of input lines read so far
my $title = "";
my $current_file = "";  # path of the output file that is currently open
my $out;
while (my $line = <$in>)
{
    $seq_num++;
    # Odd lines are titles: hold on to them until the data line arrives.
    if ($seq_num % 2 != 0)
    {
        $title = $line;
        next;
    }

    # An even line completes a record. Count records from 0 so that every
    # output file receives exactly 4000 of them.
    my $record_index = $seq_num / 2 - 1;
    my $file_name = int($record_index / 4000);
    my $path_name = int($file_name / 1000);
    my $temp_path = "$cwd/$path_name";
    unless (-d $temp_path)
    {
        mkdir($temp_path, 0775) or die "Cannot create $temp_path: $!";
    }

    # Reopen the output handle only when the target file changes,
    # instead of opening and closing it for every record.
    my $target = "$temp_path/$file_name.fa";
    if ($target ne $current_file)
    {
        close($out) if $out;
        open($out, '>>', $target) or die "Cannot open $target: $!";
        $current_file = $target;
    }
    print {$out} $title, $line;
}
close($out) if $out;
close($in);
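
The rule of 4000 records per file and 1000 files per folder comes down to two integer divisions, which are easy to check in isolation. Below is a minimal sketch; `locate_record` is a hypothetical helper name, not something taken from the script. It counts records from 1 and subtracts one before dividing, so that every file holds exactly `$per_file` records.

```perl
use strict;
use warnings;

# Map a 1-based record number to (folder index, file index).
sub locate_record {
    my ($record_no, $per_file, $per_dir) = @_;
    my $file_index = int(($record_no - 1) / $per_file);  # which file
    my $dir_index  = int($file_index / $per_dir);        # which folder
    return ($dir_index, $file_index);
}

# Records 1..4000 share file 0, and record 4001 starts file 1.
printf "record %d -> folder %d, file %d\n", $_, locate_record($_, 4000, 1000)
    for (1, 4000, 4001, 4_000_001);
```

By this arithmetic, 10^8 records would end up in 25,000 files spread over 25 folders.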