Perl script to extract specific words from a text file

Stack Overflow user
Asked 2021-03-05 23:53:12
Answers: 3, Views: 73, Followers: 0, Votes: 0

I have a text file that contains data like this:

#alert tcp $EXTERNAL_NET $HTTP_PORTS -> $HOME_NET any (msg:"ET ACTIVEX Microsoft Whale Intelligent Application Gateway ActiveX Buffer Overflow-1"; flow:established,to_client; file_data; **content:"8D9563A9-8D5F-459B-87F2-BA842255CB9A"**; nocase; **content:"CheckForUpdates"**; nocase; distance:0; pcre:"/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*8D9563A9-8D5F-459B-87F2-BA842255CB9A/si";reference:url,dev.metasploit.com/redmine/projects/framework/repository/entry/modules/exploits/windows/browser/mswhale_checkforupdates.rb; reference:url,www.kb.cert.org/vuls/id/789121; reference:url,doc.emergingthreats.net/2010562; classtype:web-application-attack; sid:2010562; rev:6; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2010_07_30, deployment Perimeter, signature_severity Major, tag ActiveX, updated_at 2016_07_01;)

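I need to extract all the words in the fields named "content" and store them in another text file. I found this Perl code (I have no experience with Perl). Does it extract all of the content fields, or only the first one?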

#!/local/bin/perl5 -w
# Description:
# Extract bit-pattern from content-part of Snort-rules.
# Choose rules that have only one content-part.
# Store distinct patterns only.
# Choose length of shortest and longest pattern to store.
$rulesdir = "/hom/geirni/www_docs/research/snort202_win32/Snort/rules";
@rulefiles = `ls $rulesdir/*.rules`;
$camfile = "camdata.txt";
#
$minLength = 4; # Bytes
$maxLength = 32;
#
# Find content-part of rules
for $rulefile(@rulefiles){
    #
    open(INFILE, "<".$rulefile) or die
      "Can't open ".$rulefile."\n";
    @rules = <INFILE>;
    close(INFILE);
    #
    for $rule(@rules){
        #
        $contentParts = 0;
        #
        if($rule =~ /content:/){
            @parts = split(/;/, $rule);
            for $part(@parts){
                if($part =~ /content:/){
                    $content = $part;
                    $contentParts++;
                    # Remove anything before content-part
                    $content =~ s/^.*content:.*?\"//i;
                    # Remove anything after content-part
                    $content =~ s/\"$.*//g;
                }
            }
        }
        #
        # Store content-part
        if ($contentParts == 1){
            push(@contents, $content);
        }
    }
}
#
#
#
# Convert content-strings to hex. Store only distinct patterns
for $content(@contents){
    #
    $pipe = 0; # hex patterns are limited by pipes; |00 bc 55|
    $char = ""; # Current character in content; ASCII or hex
    $pattern = ""; # Content converted to hex
    #
    # Loop through current content-string
    for ($i=0; $i<=length($content)-1; $i++){ # -1 for newline
        #
        $char = substr($content, $i, 1);
        #
        # Control over pipes
        if($char =~ /\|/){
            if(!$pipe){
                $pipe = 1;
            }
            else {
                $pipe = 0;
            }
            next; # Skip to next character
        }
        #
        # Convert to lowcase hex
        if(!$pipe){ # ASCII-value
            $pattern .= sprintf("%x", ord($char));
        }
        else { # hex-value
            $char =~ s/ //; # Remove blanks
            $pattern .= "\l$char";
        }
    }
    #
    # Store converted pattern
    if((length($pattern) >= $minLength*2) &&
       (length($pattern) <= $maxLength*2)){
        $hexPatterns{$pattern} = "dummyValue"; # Keys will be distinct
    }
}
#
#
#
# Print patterns, that have no subsets, to file
open(OUTFILE, ">".$camfile) or die
  "Can't open ".$camfile."\n";
#
@patterns = keys %hexPatterns;
$count = 0; # Count patterns that are written to file
#
HEXLOOP:
for($i=0; $i<=$#patterns; $i++){
    for($j=0; $j<=$#patterns; $j++){ # Search for subsets
        #
        next if($i==$j); # Do not compare a pattern with itself
        #
        next HEXLOOP if # Skip if subset is found
          ((length($patterns[$i]) <= length($patterns[$j])) &&
           ($patterns[$j] =~ /$patterns[$i]/));
    }
    print OUTFILE $patterns[$i]."\n";
    $count++;
}
#
close(OUTFILE);
#
#
#
# msg
print
  "\n".
  " Wrote ".$count." patterns to file: \"".$camfile."\"\n".
  "\n";

3 Answers

Stack Overflow user

Answered 2021-03-06 06:05:02

"I found this code in Perl (which I have no experience with)"

Rather than a Perl script whose comments you don't even dare to read, consider using:

grep -Po 'content:".*?"' <text >another_text

To also strip the content: prefix, the quotes, and the hyphens, you can use:

grep -Po '(?<=content:").*?(?=")' <text | tr -d - >another_text
Votes: 1

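Stack Overflow user

Answered 2021-03-06 07:36:23

The following Perl script prints the "content" data to the screen, one output line per input line. To store the data, redirect the output to a file.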

#!/usr/bin/env perl
#
# vim: ai ts=4 sw=4

use strict;
use warnings;
use feature 'say';

while( my $line = <> ) {
        my @array = $line =~ /content:"(.*?)"/g;
        say join "\t", @array;
}

Run the script as script.pl filename.

Output

8D9563A9-8D5F-459B-87F2-BA842255CB9A    CheckForUpdates
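
If the values should go straight into another text file, one per line, instead of being redirected from the screen, the same regular expression can be wrapped as follows. This is only a sketch: the sub name extract_contents and the two command-line file arguments are illustrative assumptions, not part of the answer above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return every quoted value that follows "content:" in one rule line.
# (extract_contents is a hypothetical helper name.)
sub extract_contents {
    my ($line) = @_;
    my @values = $line =~ /content:"(.*?)"/g;
    return @values;
}

# Assumed usage: script.pl rules.txt contents.txt
# Both file names are placeholders supplied by the caller.
my ($in_file, $out_file) = @ARGV;
if (defined $in_file && defined $out_file) {
    open my $in,  '<', $in_file  or die "Can't open $in_file: $!\n";
    open my $out, '>', $out_file or die "Can't open $out_file: $!\n";
    while (my $line = <$in>) {
        # One extracted value per output line.
        print {$out} "$_\n" for extract_contents($line);
    }
    close $in;
    close $out or die "Can't close $out_file: $!\n";
}
```

Like the script above, this takes everything between content:" and the next double quote, so a value containing an escaped quote would be cut short.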
Votes: 1

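Stack Overflow user

Answered 2021-03-23 02:51:16

@Armali I would like to edit this part of the code so that it also checks whether the same line contains further content parts, extracts them as well, and prints each one on a separate line: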

#
    if($rule =~ /content:/){
        @parts = split(/;/, $rule);
        for $part(@parts){
            if($part =~ /content:/){
                $content = $part;
                $contentParts++;
                # Remove anything before content-part
                $content =~ s/^.*content:.*?\"//i;
                # Remove anything after content-part
                $content =~ s/\"$.*//g;
            }
        }
    }
    #
    # Store content-part
    if ($contentParts == 1){
        push(@contents, $content);
    }
}

}
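
One possible way to do that, keeping the split-on-semicolon approach of the fragment above, is to push every content part as it is found instead of keeping only rules where $contentParts is 1. This is a hedged sketch, not a tested patch of the original script: the sub name content_parts and the sample rule are made up for illustration.

```perl
use strict;
use warnings;

# Collect every content value of one rule.
# (content_parts is a hypothetical helper name.)
sub content_parts {
    my ($rule) = @_;
    my @found;
    for my $part (split /;/, $rule) {
        # \b keeps options such as "uricontent:" from matching.
        next unless $part =~ /\bcontent:/;
        my $content = $part;
        # Remove everything up to and including the opening quote.
        $content =~ s/^.*?content:[^"]*"//s;
        # Remove the closing quote and anything after it.
        $content =~ s/".*$//s;
        push @found, $content;
    }
    return @found;
}

# Made-up sample rule with two content parts.
my $rule = 'alert tcp any any -> any any (msg:"demo"; '
         . 'content:"8D9563A9"; nocase; content:"CheckForUpdates"; sid:1;)';

# Print each content value on its own line.
print "$_\n" for content_parts($rule);
```

In the original script, the loop over @rules could then do push(@contents, content_parts($rule)) in place of the $contentParts == 1 check. Because the rule is split on semicolons first, a content value that itself contains a semicolon or an escaped quote would be truncated.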

Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/66495616
