Q: Perl script to get specific words from a text file

Stack Overflow user

Asked 2021-03-05 23:53:12

3 answers, 73 views, 0 followers, 0 votes

I have a text file that contains data like this:

#alert tcp $EXTERNAL_NET $HTTP_PORTS -> $HOME_NET any (msg:"ET ACTIVEX Microsoft Whale Intelligent Application Gateway ActiveX Buffer Overflow-1"; flow:established,to_client; file_data; **content:"8D9563A9-8D5F-459B-87F2-BA842255CB9A"**; nocase; **content:"CheckForUpdates"**; nocase; distance:0; pcre:"/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*8D9563A9-8D5F-459B-87F2-BA842255CB9A/si";reference:url,dev.metasploit.com/redmine/projects/framework/repository/entry/modules/exploits/windows/browser/mswhale_checkforupdates.rb; reference:url,www.kb.cert.org/vuls/id/789121; reference:url,doc.emergingthreats.net/2010562; classtype:web-application-attack; sid:2010562; rev:6; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2010_07_30, deployment Perimeter, signature_severity Major, tag ActiveX, updated_at 2016_07_01;)

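I need to extract all the words in the fields named "content" and store them in another text file. I found this Perl code (I have no experience with Perl). Does it extract all of the fields, or only the first one?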

#!/local/bin/perl5 -w
# Description:
# Extract bit-pattern from content-part of Snort-rules.
# Choose rules that have only one content-part.
# Store distinct patterns only.
# Choose length of shortest and longest pattern to store.
$rulesdir = "/hom/geirni/www_docs/research/snort202_win32/Snort/rules";
@rulefiles = `ls $rulesdir/*.rules`;
$camfile = "camdata.txt";
#
$minLength = 4; # Bytes
$maxLength = 32;
#
# Find content-part of rules
for $rulefile(@rulefiles){
    #
    open(INFILE, "<".$rulefile) or die
      "Can't open ".$rulefile."\n";
    @rules = <INFILE>;
    close(INFILE);
    #
    for $rule(@rules){
        #
        $contentParts = 0;
        #
        if($rule =~ /content:/){
            @parts = split(/;/, $rule);
            for $part(@parts){
                if($part =~ /content:/){
                    $content = $part;
                    $contentParts++;
                    # Remove anything before content-part
                    $content =~ s/^.*content:.*?\"//i;
                    # Remove anything after content-part
                    $content =~ s/\"$.*//g;
                }
            }
        }
        #
        # Store content-part
        if ($contentParts == 1){
            push(@contents, $content);
        }
    }
}
#
#
#
# Convert content-strings to hex. Store only distinct patterns
for $content(@contents){
    #
    $pipe = 0; # hex patterns are limited by pipes; |00 bc 55|
    $char = ""; # Current character in content; ASCII or hex
    $pattern = ""; # Content converted to hex
    #
    # Loop through current content-string
    for ($i=0; $i<=length($content)-1; $i++){ # -1 for newline
        #
        $char = substr($content, $i, 1);
        #
        # Control over pipes
        if($char =~ /\|/){
            if(!$pipe){
                $pipe = 1;
            }
            else {
                $pipe = 0;
            }
            next; # Skip to next character
        }
        #
        # Convert to lowcase hex
        if(!$pipe){ # ASCII-value
            $pattern .= sprintf("%x", ord($char));
        }
        else { # hex-value
            $char =~ s/ //; # Remove blanks
            $pattern .= "\l$char";
        }
    }
    #
    # Store converted pattern
    if((length($pattern) >= $minLength*2) &&
       (length($pattern) <= $maxLength*2)){
        $hexPatterns{$pattern} = "dummyValue"; # Keys will be distinct
    }
}
#
#
#
# Print patterns, that have no subsets, to file
open(OUTFILE, ">".$camfile) or die
  "Can't open ".$camfile."\n";
#
@patterns = keys %hexPatterns;
$count = 0; # Count patterns that are written to file
#
HEXLOOP:
for($i=0; $i<=$#patterns; $i++){
    for($j=0; $j<=$#patterns; $j++){ # Search for subsets
        #
        next if($i==$j); # Do not compare a pattern with itself
        #
        next HEXLOOP if # Skip if subset is found
          ((length($patterns[$i]) <= length($patterns[$j])) &&
           ($patterns[$j] =~ /$patterns[$i]/));
    }
    print OUTFILE $patterns[$i]."\n";
    $count++;
}
#
close(OUTFILE);
#
#
#
# msg
print
  "\n".
  " Wrote ".$count." patterns to file: \"".$camfile."\"\n".
  "\n";

perl

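3 Answers

Stack Overflow user

Answered 2021-03-06 06:05:02

> I found this Perl code (I have no experience with Perl)

Rather than a Perl script whose comments you don't even dare to read, consider using: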

grep -Po 'content:".*?"' <text >another_text

To also strip the content: prefix, the quotes, and the dashes, you could use:

grep -Po '(?<=content:").*?(?=")' <text | tr -d - >another_text

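Votes: 1

Stack Overflow user

Answered 2021-03-06 07:36:23

The following Perl script prints the "content" data to the screen, one input line per output line. To store the data, redirect the output to a file.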

#!/usr/bin/env perl
#
# vim: ai ts=4 sw=4

use strict;
use warnings;
use feature 'say';

while( my $line = <> ) {
        my @array = $line =~ /content:"(.*?)"/g;
        say join "\t", @array;
}

Run the script as script.pl filename

Output

8D9563A9-8D5F-459B-87F2-BA842255CB9A    CheckForUpdates
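
Since the question asks for the values to be stored in another file, the same match can also write one value per line to an output handle instead of joining them with tabs. The following is only a sketch: the sub name extract_contents and the abbreviated rule text are made up for illustration, and it runs on in-memory handles so it needs no files.

```perl
use strict;
use warnings;

# Copy every content:"..." value read from $in to $out, one value per line.
# Returns the number of values written.
sub extract_contents {
    my ($in, $out) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        # In list context, /g returns every captured group on the line.
        for my $value ($line =~ /content:"(.*?)"/g) {
            print {$out} "$value\n";
            $count++;
        }
    }
    return $count;
}

# Abbreviated, hypothetical rule line (not the full rule from the question).
my $rules = 'alert tcp any any -> any any (content:"8D9563A9"; nocase; content:"CheckForUpdates"; sid:1;)' . "\n";
my $result = '';

# In-memory filehandles stand in for the real input and output files.
open my $in,  '<', \$rules  or die "Can't read sample: $!";
open my $out, '>', \$result or die "Can't write result: $!";
my $n = extract_contents($in, $out);
close $out;
close $in;

print $result;
```

For real files, the two open calls would take file names instead of scalar references, for example open my $in, '<', $infile.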

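Votes: 1

Stack Overflow user

Answered 2021-03-23 02:51:16

@Armali I would like to edit this part of the code so that it checks the same line again for other content parts, extracts them as well, and prints them on separate lines: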

#
    if($rule =~ /content:/){
        @parts = split(/;/, $rule);
        for $part(@parts){
            if($part =~ /content:/){
                $content = $part;
                $contentParts++;
                # Remove anything before content-part
                $content =~ s/^.*content:.*?\"//i;
                # Remove anything after content-part
                $content =~ s/\"$.*//g;
            }
        }
    }
    #
    # Store content-part
    if ($contentParts == 1){
        push(@contents, $content);
    }
}

}
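
One possible way to change the fragment above, offered as a sketch rather than a tested drop-in patch: collect every content value found in a rule instead of keeping only the last one and requiring $contentParts == 1. The sub name content_parts and the sample rule are hypothetical. The pattern also assumes that a value may be negated with a leading !, and that no content value contains a semicolon, since the line is still split on ;.

```perl
use strict;
use warnings;

# Return every content:"..." value found in one rule line, in order.
sub content_parts {
    my ($rule) = @_;
    my @found;
    for my $part (split /;/, $rule) {
        # \b keeps options such as uricontent: from matching.
        # The optional '!' (negated content) is an assumption about the input.
        push @found, $1 if $part =~ /\bcontent:\s*!?\s*"(.*?)"/i;
    }
    return @found;
}

# Abbreviated, hypothetical rule line modeled on the sample in the question.
my $rule = 'alert tcp any any -> any any (msg:"demo"; content:"8D9563A9-8D5F"; nocase; content:"CheckForUpdates"; nocase; sid:1;)';

# Print each content part on its own line.
print "$_\n" for content_parts($rule);
```

In the original script, the returned list could be appended with push(@contents, content_parts($rule)), which would replace the $contentParts == 1 check.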

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/66495616
