I have an mhtml file in a local folder, stored at file_path, that contains user event logs. I am able to open it with the following code:
import email
from bs4 import BeautifulSoup

with open(file_path, 'r') as fp:
    message = email.message_from_file(fp)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            soup = BeautifulSoup(part.get_payload(decode=False), 'html.parser')
            parsed_data = soup.prettify()
            print(parsed_data)

Below is part of the output I get from the code above:
<!DOCTYPE html>
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
<link href="main.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript">
function zoomToggle(num)
{
var img = document.getElementById("ss-" + num);
if (img.className == "screenshot") {
img.className = "screenshot-thumb";
}
else {
img.className = "screenshot";
}
return false;
}
</script>
<title>
Recorded Steps
</title>
</head>
<body>
<!-- This is the recorded XML data that was used in generating this page. -->
<xml id="recordeddata">
<script id="myXML" type="text/xml">
<?xml version="1.0" encoding="UTF-8"?>
<Report>
<System MajorVersion="10" MinorVersion="0" ServicePackMajor="0" ServicePackMinor="0" BuildNumber="18362" Sku="101" Platform="2" />
<UserActionData>
<RecordSession SessionCount="1" StartTime="11:51:58 AM" StopTime="11:52:39 AM" ActionCount="11" MissedActionCount="0">
<EachAction ActionNumber="1" Time="11:52:01 AM" Pid="8316" ProgramId="0000f519feec486de87ed73cb92d3cac802400000000" FileId="0000c07130e269ebfeefcb7d893f01498bc96aa5ab24" FileVersion="10.0.18362.1 (WinBuild.160101.0800)" FileDescription="Windows Explorer" FileCompany="Microsoft Corporation" FileName="EXPLORER.EXE" CommandLine="EXPLORER.EXE">
<Description>User left click on "Word - 1 running window (button)"</Description>
<Action>Mouse Left Click</Action>
<CursorCoordsXY>1383,1061</CursorCoordsXY>
<ScreenCoordsXYWH>0,0,1920,1080</ScreenCoordsXYWH>
<UIAStack>
<Level AutomationId="Microsoft.Office.WINWORD.EXE.15" BoundingRectangle="1358,1030,62,50" ControlType="50000" Name="Word - 1 running window" LocalizedControlType="button" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskListWClass" ControlType="50021" FrameworkId="Win32" Name="Running applications" LocalizedControlType="tool bar" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskSwWClass" ControlType="50033" FrameworkId="Win32" Name="Running applications" LocalizedControlType="pane" />
<Level AutomationId="40965" BoundingRectangle="608,1030,952,50" ClassName="ReBarWindow32" ControlType="50033" FrameworkId="Win32" LocalizedControlType="pane" />
<Level BoundingRectangle="0,1030,1920,50" ClassName="Shell_TrayWnd" ControlType="50033" FrameworkId="Win32" Name="Taskbar" LocalizedControlType="pane" />
</UIAStack>
<ScreenshotFileName>screenshot0001.JPEG</ScreenshotFileName>
</EachAction>
<EachAction ActionNumber="2" Time="11:52:03 AM" Pid="17892" ProgramId="00062dde378ad4a45da1a2e77a67bb6a717100000000"
... and several more after that.
I am trying to retrieve the attributes of the EachAction tags, but I have not been able to find a way to do it. Can anyone help?
Thanks
Answered on 2020-06-03 17:39:55
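As mentioned in the comments, you can use this script to get the necessary data (in place of the txt variable I use below, you can pass your own HTML payload):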
from bs4 import BeautifulSoup

soup = BeautifulSoup(txt, 'html.parser')

# locate the <script> that holds the XML data
script = soup.select_one('script#myXML')

# parse the XML data; html.parser lower-cases tag and attribute names
xml_soup = BeautifulSoup(script.contents[0], 'html.parser')

# collect the wanted attributes of every <EachAction>
all_data = []
for each_action in xml_soup.select('EachAction'):
    all_data.append({'ActionNumber': each_action['actionnumber'],
                     'Time': each_action['time'],
                     'FileDescription': each_action['filedescription'],
                     'Action': each_action.find('action').get_text(strip=True)})

# print the data
for line in all_data:
    print('{:<30}{:<30}{:<30}{:<30}'.format(*line.values()))

Prints:
1                             11:52:01 AM                   Windows Explorer              Mouse Left Click
2                             12:52:01 AM                   Windows Explorer 1            Mouse Left Click 2
3                             10:52:01 AM                   Windows Explorer 2            Mouse Left Click 3

The HTML I used:
txt = '''<!DOCTYPE html>
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
<link href="main.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript">
function zoomToggle(num)
{
var img = document.getElementById("ss-" + num);
if (img.className == "screenshot") {
img.className = "screenshot-thumb";
}
else {
img.className = "screenshot";
}
return false;
}
</script>
<title>
Recorded Steps
</title>
</head>
<body>
<!-- This is the recorded XML data that was used in generating this page. -->
<xml id="recordeddata">
<script id="myXML" type="text/xml">
<?xml version="1.0" encoding="UTF-8"?>
<Report>
<System MajorVersion="10" MinorVersion="0" ServicePackMajor="0" ServicePackMinor="0" BuildNumber="18362" Sku="101" Platform="2" />
<UserActionData>
<RecordSession SessionCount="1" StartTime="11:51:58 AM" StopTime="11:52:39 AM" ActionCount="11" MissedActionCount="0">
<EachAction ActionNumber="1" Time="11:52:01 AM" Pid="8316" ProgramId="0000f519feec486de87ed73cb92d3cac802400000000" FileId="0000c07130e269ebfeefcb7d893f01498bc96aa5ab24" FileVersion="10.0.18362.1 (WinBuild.160101.0800)" FileDescription="Windows Explorer" FileCompany="Microsoft Corporation" FileName="EXPLORER.EXE" CommandLine="EXPLORER.EXE">
<Description>User left click on "Word - 1 running window (button)"</Description>
<Action>Mouse Left Click</Action>
<CursorCoordsXY>1383,1061</CursorCoordsXY>
<ScreenCoordsXYWH>0,0,1920,1080</ScreenCoordsXYWH>
<UIAStack>
<Level AutomationId="Microsoft.Office.WINWORD.EXE.15" BoundingRectangle="1358,1030,62,50" ControlType="50000" Name="Word - 1 running window" LocalizedControlType="button" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskListWClass" ControlType="50021" FrameworkId="Win32" Name="Running applications" LocalizedControlType="tool bar" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskSwWClass" ControlType="50033" FrameworkId="Win32" Name="Running applications" LocalizedControlType="pane" />
<Level AutomationId="40965" BoundingRectangle="608,1030,952,50" ClassName="ReBarWindow32" ControlType="50033" FrameworkId="Win32" LocalizedControlType="pane" />
<Level BoundingRectangle="0,1030,1920,50" ClassName="Shell_TrayWnd" ControlType="50033" FrameworkId="Win32" Name="Taskbar" LocalizedControlType="pane" />
</UIAStack>
<ScreenshotFileName>screenshot0001.JPEG</ScreenshotFileName>
</EachAction>
<EachAction ActionNumber="2" Time="12:52:01 AM" Pid="8316" ProgramId="0000f519feec486de87ed73cb92d3cac802400000000" FileId="0000c07130e269ebfeefcb7d893f01498bc96aa5ab24" FileVersion="10.0.18362.1 (WinBuild.160101.0800)" FileDescription="Windows Explorer 1" FileCompany="Microsoft Corporation" FileName="EXPLORER.EXE" CommandLine="EXPLORER.EXE">
<Description>User left click on "Word - 1 running window (button)"</Description>
<Action>Mouse Left Click 2</Action>
<CursorCoordsXY>1383,1061</CursorCoordsXY>
<ScreenCoordsXYWH>0,0,1920,1080</ScreenCoordsXYWH>
<UIAStack>
<Level AutomationId="Microsoft.Office.WINWORD.EXE.15" BoundingRectangle="1358,1030,62,50" ControlType="50000" Name="Word - 1 running window" LocalizedControlType="button" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskListWClass" ControlType="50021" FrameworkId="Win32" Name="Running applications" LocalizedControlType="tool bar" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskSwWClass" ControlType="50033" FrameworkId="Win32" Name="Running applications" LocalizedControlType="pane" />
<Level AutomationId="40965" BoundingRectangle="608,1030,952,50" ClassName="ReBarWindow32" ControlType="50033" FrameworkId="Win32" LocalizedControlType="pane" />
<Level BoundingRectangle="0,1030,1920,50" ClassName="Shell_TrayWnd" ControlType="50033" FrameworkId="Win32" Name="Taskbar" LocalizedControlType="pane" />
</UIAStack>
<ScreenshotFileName>screenshot0001.JPEG</ScreenshotFileName>
</EachAction>
<EachAction ActionNumber="3" Time="10:52:01 AM" Pid="8316" ProgramId="0000f519feec486de87ed73cb92d3cac802400000000" FileId="0000c07130e269ebfeefcb7d893f01498bc96aa5ab24" FileVersion="10.0.18362.1 (WinBuild.160101.0800)" FileDescription="Windows Explorer 2" FileCompany="Microsoft Corporation" FileName="EXPLORER.EXE" CommandLine="EXPLORER.EXE">
<Description>User left click on "Word - 1 running window (button)"</Description>
<Action>Mouse Left Click 3</Action>
<CursorCoordsXY>1383,1061</CursorCoordsXY>
<ScreenCoordsXYWH>0,0,1920,1080</ScreenCoordsXYWH>
<UIAStack>
<Level AutomationId="Microsoft.Office.WINWORD.EXE.15" BoundingRectangle="1358,1030,62,50" ControlType="50000" Name="Word - 1 running window" LocalizedControlType="button" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskListWClass" ControlType="50021" FrameworkId="Win32" Name="Running applications" LocalizedControlType="tool bar" />
<Level BoundingRectangle="610,1030,950,50" ClassName="MSTaskSwWClass" ControlType="50033" FrameworkId="Win32" Name="Running applications" LocalizedControlType="pane" />
<Level AutomationId="40965" BoundingRectangle="608,1030,952,50" ClassName="ReBarWindow32" ControlType="50033" FrameworkId="Win32" LocalizedControlType="pane" />
<Level BoundingRectangle="0,1030,1920,50" ClassName="Shell_TrayWnd" ControlType="50033" FrameworkId="Win32" Name="Taskbar" LocalizedControlType="pane" />
</UIAStack>
<ScreenshotFileName>screenshot0001.JPEG</ScreenshotFileName>
</EachAction>
</UserActionData>
</Report>
</script>
</body>
</html>'''

https://stackoverflow.com/questions/62158595
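For reference, the two steps in this thread (reading the text/html part of the MHTML file, then pulling the EachAction data out of the embedded XML) can also be sketched with only the standard library. This is a rough sketch under assumptions, not the method used in the answer above: it swaps BeautifulSoup for html.parser.HTMLParser plus xml.etree.ElementTree, so attribute names keep their original case (ActionNumber rather than actionnumber). It assumes the XML inside script#myXML is well-formed, and it uses get_payload(decode=True) on the assumption that the part may be base64 or quoted-printable encoded. The names ScriptXMLExtractor, html_parts and each_action_records are made up for illustration.

```python
import email
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class ScriptXMLExtractor(HTMLParser):
    """Collect the raw text inside <script id="myXML"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "myXML":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        # Script content may arrive in several pieces, so accumulate them.
        if self._inside:
            self.chunks.append(data)


def html_parts(mhtml_text):
    """Yield every text/html part of an MHTML document as decoded text."""
    message = email.message_from_string(mhtml_text)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            # decode=True undoes base64 / quoted-printable transfer encoding
            raw = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            yield raw.decode(charset, errors="replace")


def each_action_records(html):
    """Return one dict per <EachAction>: all attributes plus leaf child texts."""
    extractor = ScriptXMLExtractor()
    extractor.feed(html)
    extractor.close()
    xml_text = "".join(extractor.chunks).strip()
    if not xml_text:
        raise ValueError("no <script id='myXML'> block found")
    # Assumption: the embedded XML is well-formed and UTF-8, as it declares.
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = []
    for action in root.iter("EachAction"):
        record = dict(action.attrib)  # every attribute, original case
        for child in action:
            # keep simple children such as <Action>; skip nested ones like <UIAStack>
            if len(child) == 0 and child.text:
                record[child.tag] = child.text.strip()
        records.append(record)
    return records
```

With the file from the question, something like `for html in html_parts(open(file_path).read()): print(each_action_records(html))` should list one dict per action. If the real file contains XML that ElementTree rejects, the BeautifulSoup approach in the answer is the more forgiving choice.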