I need help extracting the "BODY" part from a string in the following two cases:

Case 1:
Var1 =
Content-Type: text/plain; charset="UTF-8"
BODY
--000000000000ddc1610580816add
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
BODY56 text/html
--000000000000ddc1610580816add-

Case 2:
Var1=
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
BODY
--000000000000ddc1610580816add--
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
BODY56 text/html
--000000000000ddc1610580816add-

What I want to do:

If Var1 contains: Content-Type: text/plain; charset="UTF-8", then extract the text between Content-Type: text/plain; charset="UTF-8" and --000000000000ddc1610580816add

If Var1 contains:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
then extract the text between:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
and --000000000000ddc1610580816add--.

Here is my code. It needs fixing, if anyone can help:
if (index($body, "Content-Type: text\/plain; charset=\"UTF-8\"\n
Content-Transfer-Encoding: quoted-printable") != -1) {
$body =~ /Content-Type: text\/plain; charset="UTF-8"\n
Content-Transfer-Encoding: quoted-printable(.*?)--00.*/s ;
$body=$1;
}
elsif (index($body, "Content-Type: text\/plain; charset=\"UTF-8\"") != -1)
{
$body =~ /Content-Type: text\/plain; charset="UTF-8"(.*?)--00.*/s ;
$body=$1;
}

Posted on 2019-02-01 11:17:07

One solution: use the /m and /s modifiers (written together as /ms); see perlre.
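
As a quick illustration of what the two modifiers change, here is a minimal sketch. The sample string and the variable names are made up for this demonstration and are not part of the original answer.

```perl
use strict;
use warnings;

# Made-up two-line sample: a text line followed by a delimiter line.
my $text = "first line\n--boundary\n";

# Without /m, ^ only matches at the very start of the string.
my $no_m   = ($text =~ /^--/)  ? 1 : 0;
# With /m, ^ also matches right after every newline.
my $with_m = ($text =~ /^--/m) ? 1 : 0;

# Without /s, "." stops at a newline; with /s it also matches newlines.
my ($no_s)   = $text =~ /(.+)/;
my ($with_s) = $text =~ /(.+)/s;

print "^-- matches without /m: $no_m, with /m: $with_m\n";
print "(.+) captures ", length($no_s), " characters without /s and ",
      length($with_s), " with /s\n";
```

The regex in the script that follows relies on both effects: /m lets ^ anchor at the header and delimiter lines inside the string, and /s lets (.+) span a body of several lines.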
#!/usr/bin/perl
use strict;
use warnings;
my $regex = qr/\AContent-Type: [^\n]+\n(?:^Content-Transfer-Encoding: [^\n]+\n)?(.+)^--.+\Z/ms;
my $body;
my $input = <<'END_OF_STRING';
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

INPUT 1 BODY
--000000000000ddc1610580816add--
END_OF_STRING
($body) = ($input =~ $regex)
or die "mismatch in INPUT 1!\n";
print "INPUT 1 '${body}'\n";
$input = <<'END_OF_STRING';
Content-Type: text/plain; charset="UTF-8"

INPUT 2 BODY
--000000000000ddc1610580816add--
END_OF_STRING
($body) = ($input =~ $regex)
or die "mismatch in INPUT 2!\n";
print "INPUT 2 '${body}'\n";
exit 0;

Test run:
$ perl dummy.pl
INPUT 1 '
INPUT 1 BODY
'
INPUT 2 '
INPUT 2 BODY
'

Update: with the new input string provided by the OP:
#!/usr/bin/perl
use strict;
use warnings;
# multipart MIME content as single string
my $input = <<'END_OF_STRING';
--0000000000007bcdff05808169f5
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

BODY text/plain
--0000000000007bcdff05808169f5
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

BODY text/html
--0000000000007bcdff05808169f5
END_OF_STRING
# split into multiple parts at the separator
foreach my $part (split(/^--[^\n]+\n/ms, $input)) {
# skip empty parts
next if $part =~ /\A\s*\Z/m;
# split header and body
my($header, $body) = split("\n\n", $part, 2);
# Only match parts with text/plain content
# "Content-Type" must be matched case-insensitive
if ($header =~ m{^(?i)Content-Type(?-i):\s+text/plain[;\s]}ms) {
print "plain text BODY: '${body}'\n";
}
}
exit 0;

Test output:
$ perl dummy.pl
plain text BODY: 'BODY text/plain
'

https://stackoverflow.com/questions/54477772
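
The split-based approach from the update can also be wrapped in a small function and run against the two cases from the question. This is only a sketch: extract_plain_body is a hypothetical helper name, and the code assumes that headers and body are separated by an empty line, that lines end in "\n", and that no body line starts with "--".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns the body of the
# first text/plain part of a multipart string, or undef if there is none.
# Assumption: an empty line separates each part's headers from its body.
sub extract_plain_body {
    my ($mime) = @_;

    # Delimiter lines start with "--"; the closing one carries an extra "--".
    foreach my $part (split /^--[^\n]*\n?/m, $mime) {
        next if $part !~ /\S/;    # skip empty chunks

        my ($header, $body) = split /\n\n/, $part, 2;
        next unless defined $body;

        # Header names are case-insensitive.
        return $body if $header =~ m{^Content-Type:\s*text/plain\b}mi;
    }
    return;
}

# Case 1 from the question, with the header/body blank lines assumed.
my $case1 = <<'END_OF_STRING';
Content-Type: text/plain; charset="UTF-8"

BODY
--000000000000ddc1610580816add
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

BODY56 text/html
--000000000000ddc1610580816add--
END_OF_STRING

my $body = extract_plain_body($case1);
print defined $body ? "plain text BODY: '$body'\n" : "no text/plain part\n";
```

Because the header block is matched as a whole, the same function covers both cases from the question: it does not matter whether a Content-Transfer-Encoding line follows the Content-Type line.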