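I'm new to Perl and have been using Email::MIME to learn how to correctly parse emails that contain multiple parts. I've just run into another combination that my current code fails to read correctly.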
Content-Type: multipart/mixed; boundary="===============1811908679642194059=="
MIME-Version: 1.0
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--===============1811908679642194059==
Content-Type: multipart/signed; micalg=pgp-sha256;
protocol="application/pgp-signature";
boundary="lGJM242FL2E9Wh4auTNwQRWOeFI0Wj9mB"
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--lGJM242FL2E9Wh4auTNwQRWOeFI0Wj9mB
Content-Type: multipart/alternative;
boundary="------------CC2F0C038668F58F6EDEA0D2"
This is a multi-part message in MIME format.
--------------CC2F0C038668F58F6EDEA0D2
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
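=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
The text/plain part is the one I want, but reading the "text" component only gives me the "This is a multi-part..." line and nothing else. This is the code I developed for reading other emails with similar subparts, but it does not interpret this one correctly.
It appears to be related to the "body" function that is part of Email::MIME: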
This decodes and returns the body of the object as a byte string. For
top-level objects in multi-part messages, this is highly likely to be
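something like "This is a multi-part message in MIME format."
Which function in Email::MIME should I use to read this content type correctly?
How do I correctly identify the content type of this email? Is it "multipart/mixed", "text/plain", or "multipart/alternative"?
Do I even want to use the subparts method here?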
my @mailData;
my $msg = Email::MIME->new($buf);
# Only looks two levels below the top-level message
foreach my $part ( $msg->subparts ) {
    foreach my $sub_part ($part->subparts) {
        print $sub_part->content_type;
        if ($sub_part->content_type =~ m!text!) {
            @mailData = split( '\n', $sub_part->body);
        }
    }
}
The code above only ends up with "This is a multi message..." in the @mailData array.
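Judging by the headers shown above, the nesting is multipart/mixed, then multipart/signed, then multipart/alternative, then text/plain, so the text/plain leaf sits one level deeper than the two nested loops reach. The following is a minimal sketch of walking the tree recursively with subparts instead of a fixed number of loops. The sample message, its short boundary names, and the helper leaf_parts are hypothetical stand-ins made up for illustration, and body_str is used on the assumption that a character string (rather than raw bytes) is wanted.

```perl
use strict;
use warnings;
use Email::MIME;

# Hypothetical stand-in for the message above, with the same nesting:
# multipart/mixed -> multipart/signed -> multipart/alternative -> text/plain
my $raw = <<'EOM';
Content-Type: multipart/mixed; boundary="outer"
MIME-Version: 1.0

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--outer
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="signed"

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--signed
Content-Type: multipart/alternative; boundary="alt"

This is a multi-part message in MIME format.
--alt
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

=3D=3D=3D
Hello from the plain-text part

--alt--

--signed
Content-Type: application/pgp-signature; name="signature.asc"

(signature placeholder)

--signed--

--outer--
EOM

# Hypothetical helper: return [depth, part] pairs for every part that has
# no subparts of its own, however deeply it is nested.
sub leaf_parts {
    my ($part, $depth) = @_;
    my @subparts = $part->subparts;
    return ([ $depth, $part ]) unless @subparts;
    return map { leaf_parts($_, $depth + 1) } @subparts;
}

my $msg    = Email::MIME->new($raw);
my @leaves = leaf_parts($msg, 0);

# Show where each leaf sits in the tree.
for my $leaf (@leaves) {
    my ($depth, $part) = @$leaf;
    printf "%s%s\n", '  ' x $depth, ($part->content_type // '(no Content-Type)');
}

# Take the first text/plain leaf. body_str undoes the transfer encoding and
# decodes the bytes using the charset named in the Content-Type header.
my ($plain) = grep { ($_->[1]->content_type // '') =~ m{^text/plain}i } @leaves;
my $text     = $plain ? $plain->[1]->body_str : '';
my @mailData = split /\n/, $text;
```

Calling body on any of the multipart containers only returns their preamble, which would explain the "This is a multi-part message..." text; the useful content lives in the leaves.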
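Posted on 2017-03-18 04:46:20
For the past few days I've been working with Email::MIME, MIME::Parser and MIME::Entity to automate the processing of a large number of emails. I found that there is very little standardization in how the same email can be encoded, which made this much harder than I expected.
Here is an approach to handling the headers and body that has proven very reliable for me. Many thanks to everyone who helped along the way.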
#!/usr/bin/perl -w
use strict;
use MIME::Parser;
use MIME::Entity;
use Email::MIME;
# Read the email from STDIN
my $buf;
while (<STDIN>) {
    $buf .= $_;
}
# Keep decoded bodies in memory. With output_to_core(0), MIME::Parser writes
# each decoded part to disk, which is where the msg-NNNN-N.txt and
# signature-N.asc files came from.
my $parser = MIME::Parser->new;
$parser->extract_uuencode(1);
$parser->extract_nested_messages(1);
$parser->output_to_core(1);
# For reading headers
my $entity = $parser->parse_data($buf);
# For reading the body (of an mbox)
my $msg = Email::MIME->new($buf);
# Use MIME::Entity to read various headers.
my $subject = $entity->head->get('Subject');
my $from = $entity->head->get('From');
my $AdvDate = $entity->head->get('Date');
$AdvDate =~ s/\n//g; $subject =~ s/\n//g; $from =~ s/\n//g;
print "Subject: $subject\n";
print "From: $from\n";
print "Date: $AdvDate\n";
my @mailData;
# Walk through every part of the message. Take the first text/plain part whose
# charset is one of the expected ones and read its contents into @mailData.
# The first match typically appeared to be the primary one.
$msg->walk_parts(sub {
    my ($part) = @_;
    #warn($part->content_type . ": " . $part->subparts);
    # walk_parts visits every part, so the !@mailData guard makes later matches a no-op.
    if (!@mailData
        && $part->content_type =~ m{text/plain; charset="?(?:utf-8|us-ascii|windows-1252|iso-8859-1)"?}i) {
        #print $part->body;
        @mailData = split /\n/, $part->body;
    }
});
# manipulate the body of the message stored in mailData
foreach my $line (@mailData) {
    print "$line\n";
}
https://stackoverflow.com/questions/42823610
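As a side note, the header lookups in the script above can probably be done with the Email::MIME object alone, which would avoid parsing the message a second time with MIME::Parser. This is only a sketch under that assumption; the sample message and the %field hash are hypothetical and exist just to exercise the calls.

```perl
use strict;
use warnings;
use Email::MIME;

# Hypothetical message, used only to exercise the header lookups.
my $raw = <<'EOM';
Subject: Test advisory
From: Alice Example <alice@example.org>
Date: Sat, 18 Mar 2017 04:46:20 +0000
Content-Type: text/plain; charset=us-ascii

Body text.
EOM

my $msg = Email::MIME->new($raw);

# In scalar context header() returns the first matching header value
# (unfolded, without a trailing newline), or undef when it is absent.
my %field;
for my $name (qw(Subject From Date)) {
    my $value = $msg->header($name);
    $field{$name} = defined $value ? $value : '(missing)';
}

print "$_: $field{$_}\n" for qw(Subject From Date);
```

Because the values come back without trailing newlines, the s/\n//g cleanup should not be needed with this variant.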