An automatic Baidu mp3 download tool
Thursday, April 3, 2008, 22:06
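A while ago Fu Liang wrote a tool for downloading songs from Baidu (http://fuliang.javaeye.com/blog/176323), and I thought it was quite good. I haven't been in the mood to read these past couple of days, so I figured I might as well write some code, and I wrote a version of my own in Perl. It is still very basic and has no graphical front end. Just change $oldname to the name of the song you want (taking it as a command-line argument would probably be better; maybe later), and the script will search for it and download it automatically.

Among the many available download sources, I pick the one with the fastest connection, measured with ping, whereas Fu Liang lets the user choose. I discussed this with him, and I still think it is better not to make the user choose; nobody is going to pick a slow source on purpose.

The download is done with libcurl, a download library that can be used for multi-threaded downloading. wget is actually very good already, and I have always used it, but it does not support multi-threaded downloads. If I get the chance I would like to make wget multi-threaded, as a small contribution to open source. The code is as follows:

#!/usr/bin/perl -w
use strict;
use threads;
use threads::shared;    # makes the dependency of the ": shared" variables explicit
use Thread::Queue;
use Encode;
use URI;
use LWP::UserAgent;
use HTML::SimpleLinkExtor;
use WWW::Curl::Easy;

open STDERR, '>', 'log' or die "$!";

my $oldname = "千里之外";    # change this to the name of the song you want

# Baidu expects the query GBK-encoded and percent-escaped
my $songname = decode('UTF-8', $oldname);
$songname = encode('GBK', $songname);
$songname =~ s/([^a-zA-Z0-9_\.\-\~\/:\\])/uc sprintf("%%%02x", ord($1))/eg;
my $url = "http://mp3.baidu.com/m?f=ms&tn=baidump3&ct=134217728&lf=&rn=&word=".$songname."&lm=0";

my $web = LWP::UserAgent->new();
$web->timeout(60);
$web->agent("Mozilla/5.0");
$web->max_size(1000000);

my $response = $web->get($url);
if (not $response->is_success) {
    print "get the $url failed\n";
    exit;
}
my $content = $response->content;
$content = decode('GBK', $content);
$content =~ s/\r//g;

# Follow every search result that points at an mp3 and collect the real file link
my %alllinks = ();
my $extor = HTML::SimpleLinkExtor->new();
$extor->parse($content);
for my $new_url ($extor->a) {
    if ($new_url =~ /\.mp3,,/) {
        my $response = $web->get($new_url);
        next if (not $response->is_success);
        my $content = $response->content;
        $content = decode('GBK', $content);
        my $extractor = HTML::SimpleLinkExtor->new();
        $extractor->parse($content);
        my $link = ($extractor->a)[0];
        next unless defined $link;    # the page had no link at all
        $alllinks{encode('UTF-8', $link)} = 1;
    }
}

# Queue all candidate links and let the worker threads ping their hosts
my $url_queue = Thread::Queue->new();
my @threads = ();
my $thread_cnt = 10;
for my $link (keys %alllinks) {
    $url_queue->enqueue($link);
}

my $best_url : shared = "";
my $min_time : shared = 30000000;
for (1..$thread_cnt) {
    push @threads, threads->new(
        sub {
            while (defined(my $url = $url_queue->dequeue_nb)) {
                my $host = eval { URI->new($url)->host };
                # only plain host names are passed on to the shell
                next unless defined $host and $host =~ /^[A-Za-z0-9.-]+$/;
                my $response = `ping -c 1 -W 30 $host`;
                if ($response =~ m/time=([^ ]*)/) {
                    lock $min_time;    # also guards $best_url
                    if ($1 < $min_time) {
                        $min_time = $1;
                        $best_url = $url;
                    }
                }
            }
        }
    );
}
for (@threads) {
    $_->join;
}
print "MIN_TIME=$min_time, BEST_URL=$best_url\n";
die "no reachable download source found\n" if $best_url eq "";

# Download from the fastest source; headers and body are collected in memory
my $curl = WWW::Curl::Easy->new();
my (@head, @body);
$curl->setopt(CURLOPT_HEADERFUNCTION, \&write_callback);
$curl->setopt(CURLOPT_WRITEFUNCTION, \&write_callback);
$curl->setopt(CURLOPT_WRITEHEADER, \@head);    # response headers end up in @head
$curl->setopt(CURLOPT_FILE, \@body);           # response body ends up in @body
$curl->setopt(CURLOPT_URL, $best_url);
die "curl perform failed!" if ($curl->perform() != 0);

open my $out, '>', "$oldname.mp3" or die "$!";
binmode $out;    # mp3 data is binary
print $out join("", @body);
close $out;

# Append each chunk received by libcurl to the array passed as context
sub write_callback {
    my ($chunk, $variable) = @_;
    push @{$variable}, $chunk;
    return length($chunk);
}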
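The post mentions that taking the song name from the command line would probably be better than editing $oldname by hand. Here is a minimal sketch of how that might look. It assumes the terminal passes arguments as UTF-8, and the helper name build_search_url is hypothetical; it is not part of the original script. It performs the same GBK encoding and percent-escaping step as the script, then prints the resulting search URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not in the original script): turn a UTF-8 song name
# into the Baidu mp3 search URL used by the script above.
sub build_search_url {
    my ($name_utf8) = @_;
    # Convert the UTF-8 bytes to GBK bytes, as the search page expects
    my $gbk = encode('GBK', decode('UTF-8', $name_utf8));
    # Percent-escape every byte that is not an unreserved ASCII character
    $gbk =~ s/([^A-Za-z0-9_.~-])/sprintf("%%%02X", ord($1))/eg;
    return "http://mp3.baidu.com/m?f=ms&tn=baidump3&ct=134217728&lf=&rn=&word=$gbk&lm=0";
}

# Use the first command-line argument if given, otherwise the original default
my $oldname = @ARGV ? $ARGV[0] : "千里之外";
print build_search_url($oldname), "\n";
```

Saved under an assumed file name such as baidu_url.pl, it could be run as `perl baidu_url.pl 千里之外`; the rest of the script would then use $oldname exactly as before.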