I am analyzing the source code of many websites, a large network with thousands of pages. I want to search each page and count how many times a keyword occurs.
To fetch the pages I use curl and pipe the output into grep, which did not work as expected, so I want to use the -c option. Could the page fetching be done entirely with Perl?
For example:
cat RawJSpiderOutput.txt | grep parsed | awk -F " " '{print $2}' | xargs -I replaceStr curl replaceStr?myPara=en | perl -lne '$c++ while /myKeywordToSearchFor/g; END { print $c }'
Explanation: the text file above contains both usable and unusable URLs. "grep parsed" keeps the usable ones. With awk I select the second column, which holds just the usable URL. So far so good. Now the actual question: I fetch each page's source with curl (also appending some parameters) and pipe the whole source of each page into Perl to count the occurrences of "myKeywordToSearchFor". I would prefer to do this entirely in Perl, if that is possible.
Thanks!
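An aside on the grep attempt above: "grep -c" counts matching lines, not total occurrences, so it undercounts whenever the keyword appears more than once on a line. A minimal shell sketch of the difference, with a made-up keyword ("key") and input:

```shell
# grep -c counts matching LINES: prints 2 here, even though "key" occurs 3 times
printf 'key key\nnothing\nkey\n' | grep -c key

# grep -o emits each match on its own line, so wc -l counts every occurrence: 3
printf 'key key\nnothing\nkey\n' | grep -o key | wc -l
```

This is why a per-occurrence counter (like the perl -lne '$c++ while /.../g' in the pipeline) and grep -c can disagree.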
Posted on 2011-12-06 23:12:39
This uses Perl only (untested):
use strict;
use warnings;
use File::Fetch;

my $count = 0;

open my $SPIDER, '<', 'RawJSpiderOutput.txt' or die $!;
while (<$SPIDER>) {
    chomp;
    if (/parsed/) {
        my $url = (split)[1];   # second whitespace-separated column
        $url .= '?myPara=en';

        # Fetch the page to a local file; fetch() returns the path on success.
        my $ff      = File::Fetch->new(uri => $url);
        my $fetched = $ff->fetch or die $ff->error;

        open my $FETCHED, '<', $fetched or die $!;
        while (<$FETCHED>) {
            $count++ if /myKeyword/;
        }
        close $FETCHED;
        unlink $fetched;
    }
}
print "$count\n";

Posted on 2011-12-06 22:56:10
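One caveat with the loop above: "$count++ if /myKeyword/" counts matching lines, whereas the question's one-liner ("$c++ while /.../g") counts every occurrence. A minimal sketch of the difference, using made-up sample text:

```perl
use strict;
use warnings;

my $text = "key key\nnothing\nkey\n";   # hypothetical page source
my ($matching_lines, $occurrences) = (0, 0);
for my $line (split /\n/, $text) {
    $matching_lines++ if $line =~ /key/;     # counts lines containing the keyword
    $occurrences++ while $line =~ /key/g;    # counts every single match
}
print "$matching_lines $occurrences\n";      # prints "2 3"
```

If the goal is "number of times the keyword appears", the while-/g form is the one to use.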
Or try something like this,
perl -e 'while(<>){my @words = split " ";for my $word(@words){++$c if $word =~ /myKeyword/}}print "$c\n"'
that is,
while (<>)                     # as long as we're getting input (into "$_")
{ my @words = split ' ';       # split $_ (implicit) on whitespace, so we examine each word
  for my $word (@words)        # (and don't miss two keywords on one line)
  { if ($word =~ /myKeyword/)  # whenever it's found,
    { ++$c } } }               # increment the counter (auto-vivified)
print "$c\n"                   # and after end of file is reached, print the counter
Or, spelled out in a strict-friendly style:
use strict;
my $count = 0;
while (my $line = <STDIN>)     # except that <> is actually more magical than this
{ my @words = split ' ' => $line;
  for my $word (@words)
  { if ($word =~ /myKeyword/)
    { ++$count; } } }
print "$count\n";

https://stackoverflow.com/questions/8401109
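On the comment that <> is "more magical" than <STDIN>: the diamond operator reads each file named in @ARGV in turn, and falls back to STDIN only when @ARGV is empty. A small self-contained sketch (the temp file and its contents are invented for illustration):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file so the example needs no external data.
my ($FH, $tmp) = tempfile();
print $FH "myKeyword here\nplain line\nmyKeyword again myKeyword\n";
close $FH;

# Loading @ARGV makes <> read from that file instead of STDIN.
local @ARGV = ($tmp);
my $count = 0;
while (<>) {
    my @words = split ' ';
    $count++ for grep { /myKeyword/ } @words;   # per-word check, as in the answer
}
unlink $tmp;
print "$count\n";   # prints "3"
```

With several filenames in @ARGV, the same loop would run through all of them in sequence, which is handy for batches of spider output files.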