An Automatic Baidu MP3 Download Tool
Thursday, April 3, 2008, 22:06
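A while back Fu Liang wrote a tool for downloading songs from Baidu (http://fuliang.javaeye.com/blog/176323), and I thought it was quite good. I haven't been in the mood to read these past couple of days, so I decided to write some code instead and put together my own version in Perl. It is still very basic and has no graphical front end: change $oldname to the name of the song you want (a command-line argument would probably be better, but that can wait) and the script searches for the song and downloads it automatically.

Of the many download sources on offer, I pick the one with the fastest connection, measured with ping, whereas Fu Liang lets the user choose which one to use. I discussed this with him, and I still think the user should not have to choose; after all, nobody would pick a slow source on purpose.

The download itself goes through libcurl, a download library that supports multiple concurrent transfers. wget is actually already very good, and it is what I have always used, but it does not support multi-threaded downloads. When I get the chance I would like to add multi-threading to wget, as a small contribution to open source. The code follows: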


#!/usr/bin/perl -w

use strict;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Request::Common;
use LWP::Simple;
use Encode;
use HTML::SimpleLinkExtor;
use URI;
use threads;
use threads::shared;   # provides the ": shared" attribute and lock() used below
use Thread::Queue;
use WWW::Curl::Easy;

open STDERR,'> log' or die "$!";
my $oldname = "千里之外"; # change this to the name of the song you want to download
my $songname = decode('UTF-8', $oldname);
$songname = encode('GBK', $songname);
$songname =~ s/([^a-zA-Z0-9_\.\-\~\/:\\])/sprintf("%%%02X", ord($1))/eg;

my $url = "http://mp3.baidu.com/m?f=ms&tn=baidump3&ct=134217728&lf=&rn=&word=".$songname."&lm=0";
my $web = LWP::UserAgent->new();
$web->timeout(60);
$web->agent("Mozilla/5.0");
$web->max_size(1000000);

my $response = $web->get($url);
if(not $response->is_success)
{
    print "failed to fetch $url: ", $response->status_line, "\n";
    exit 1;
}
my $content = $response->content;
$content = decode('GBK',$content);
$content =~ s/\r//g;

my %alllinks = ();
my $extor = HTML::SimpleLinkExtor->new();
$extor->parse($content);
for my $new_url ($extor->a)
{
    if($new_url =~ /\.mp3,,/)
    {
        my $response = $web->get($new_url);
        next if(not $response->is_success);
        my $content = $response->content;
        $content = decode('GBK',$content);
        my $extractor = HTML::SimpleLinkExtor->new();
        $extractor->parse($content);
        my $link = ($extractor->a)[0];
        next if(not defined $link);   # the result page contained no links
        $alllinks{encode('UTF-8',$link)} = 1;
    }
}

my $url_queue = Thread::Queue->new();
my @threads = ();
my $thread_cnt = 10;
for my $link (keys %alllinks)
{
    $url_queue->enqueue($link);
}

my $best_url : shared = "";
my $min_time : shared = 30000000;
for (1..$thread_cnt)
{
    push @threads,threads->new(
        sub
        {
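            while(defined(my $url = $url_queue->dequeue_nb))
            {
                my $uri = URI->new($url);
                next if(not $uri->can('host'));   # skip schemes that have no host part
                my $host = $uri->host;
                # only plain host names are ever passed to the shell
                next if(not defined $host or $host !~ /^[A-Za-z0-9.\-]+$/);
                my $response = `ping -c 1 -W 30 $host`;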
                if($response =~ m/time=([^ ]*)/)
                {
                    lock $min_time;
                    if($1 < $min_time)
                    {
                        $min_time = $1;
                        $best_url = $url;
                    }
                }
            }
        }
    );
}
for (@threads)
{
    $_->join;
}

if($best_url eq "")
{
    print "no reachable download source found\n";
    exit 1;
}
print "MIN_TIME=$min_time, BEST_URL=$best_url\n";

my $curl = WWW::Curl::Easy->new();
my (@head, @body);
$curl->setopt(CURLOPT_HEADERFUNCTION, \&write_callback);
$curl->setopt(CURLOPT_WRITEFUNCTION, \&write_callback);
# CURLOPT_HTTPHEADER sets the headers of the request; the target for the
# response headers is CURLOPT_WRITEHEADER.
$curl->setopt(CURLOPT_WRITEHEADER, \@head);
$curl->setopt(CURLOPT_FILE, \@body);
$curl->setopt(CURLOPT_URL, $best_url);
die "curl perform failed!" if($curl->perform() != 0);
$content = join("", @body);
open OUT, '>', "$oldname.mp3" or die "$!";
binmode OUT;   # the MP3 is binary data
print OUT $content;
close OUT;

sub write_callback
{
    my ($chunk,$variable)=@_;
    push @{$variable}, $chunk;
    return length($chunk);
}
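
The post notes that a command-line argument would be nicer than editing $oldname by hand. Below is a minimal sketch of that idea, not part of the original script. The helper names gbk_query_escape and search_url are made up for illustration, and the sketch assumes the terminal passes the song name as UTF-8 bytes. Unlike the script above, it escapes every byte outside the unreserved set, including "/" and ":".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not in the original script): convert a UTF-8 song
# name to GBK and percent-encode it for the "word" query parameter.
sub gbk_query_escape {
    my ($utf8_name) = @_;
    my $gbk = encode('GBK', decode('UTF-8', $utf8_name));
    # Escape every byte except letters, digits, "_", ".", "~" and "-".
    $gbk =~ s/([^A-Za-z0-9_.~-])/sprintf("%%%02X", ord($1))/eg;
    return $gbk;
}

# Hypothetical helper: build the same search URL the script above uses.
sub search_url {
    my ($utf8_name) = @_;
    return "http://mp3.baidu.com/m?f=ms&tn=baidump3&ct=134217728&lf=&rn=&word="
         . gbk_query_escape($utf8_name) . "&lm=0";
}

# Take the song name from the command line; fall back to the name that is
# hard-coded in the original script when no argument is given.
my $oldname = @ARGV ? $ARGV[0] : "千里之外";
print search_url($oldname), "\n";
```

With this in place, the rest of the script could stay as it is, with $url taken from search_url($oldname).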

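The "pick the fastest source by ping" step can also be tried on its own. The following is a rough sketch under a few assumptions: the helper names ping_output, ping_time_ms and fastest are invented here, the ping flags are the ones used in the script above, and the output format is assumed to contain "time=... ms" as Linux ping prints it. ping_output passes the command as a list so that no shell is involved; the demonstration at the bottom uses canned output and placeholder URLs, so it needs no network access.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run ping without a shell and return its output.
sub ping_output {
    my ($host) = @_;
    open(my $ping, '-|', 'ping', '-c', '1', '-W', '30', $host) or return '';
    local $/;                        # slurp the whole output at once
    my $output = <$ping>;
    close $ping;
    return defined $output ? $output : '';
}

# Hypothetical helper: extract the round-trip time in milliseconds from
# ping output, or undef when there was no reply.
sub ping_time_ms {
    my ($output) = @_;
    return $output =~ /time[=<]([0-9.]+)\s*ms/ ? $1 + 0 : undef;
}

# Hypothetical helper: given url => time pairs (undef means unreachable),
# return the URL with the smallest time together with that time.
sub fastest {
    my (%time_for) = @_;
    my ($best_url, $min_time);
    for my $url (sort keys %time_for) {
        my $time = $time_for{$url};
        next if !defined $time;
        ($best_url, $min_time) = ($url, $time)
            if !defined($min_time) || $time < $min_time;
    }
    return ($best_url, $min_time);
}

# Demonstration on canned ping output (placeholder hosts and addresses).
my %time_for = (
    'http://a.example/song.mp3' =>
        ping_time_ms("64 bytes from 192.0.2.1: icmp_seq=1 ttl=52 time=48.2 ms"),
    'http://b.example/song.mp3' =>
        ping_time_ms("64 bytes from 192.0.2.2: icmp_seq=1 ttl=52 time=12.5 ms"),
    'http://c.example/song.mp3' =>
        ping_time_ms("1 packets transmitted, 0 received, 100% packet loss"),
);
my ($best_url, $min_time) = fastest(%time_for);
print "MIN_TIME=$min_time, BEST_URL=$best_url\n";
```

In a real run each value would come from ping_time_ms(ping_output($host)) instead of a canned string. Note that ping measures latency, which is only a rough stand-in for download speed.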