How to find XML elements by tag in Python
Stack Overflow user
Asked 2021-09-29 13:28:31
Answers: 2  Views: 137  Following: 0  Votes: 0

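I'm having trouble identifying the right elements in Python. What I really want to see is the most recently accessed file in recently-used.xbel. So I want to iterate over every file entry to find the one with the latest modified or the latest visited timestamp. This is what the XML file looks like: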

<?xml version="1.0" encoding="UTF-8"?>
<xbel version="1.0"
      xmlns:bookmark="http://www.freedesktop.org/standards/desktop-bookmarks"
      xmlns:mime="http://www.freedesktop.org/standards/shared-mime-info"
>
  <bookmark href="file:///tmp/google-chrome-stable_current_amd64.deb" added="2021-09-14T12:09:05Z" modified="2021-09-14T12:09:05Z" visited="2021-09-15T09:12:13Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.debian.binary-package"/>
        <bookmark:applications>
          <bookmark:application name="Firefox" exec="&apos;firefox %u&apos;" modified="2021-09-14T12:09:05Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Git/testprog" added="2021-09-15T09:12:13Z" modified="2021-09-15T09:12:13Z" visited="2021-09-15T09:12:13Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="inode/directory"/>
        <bookmark:applications>
          <bookmark:application name="code" exec="&apos;code %u&apos;" modified="2021-09-15T09:12:13Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/.local/share/recently-used.xbel" added="2021-09-15T09:51:57Z" modified="2021-09-15T09:51:57Z" visited="2021-09-15T09:51:57Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/x-xbel"/>
        <bookmark:applications>
          <bookmark:application name="code" exec="&apos;code %u&apos;" modified="2021-09-15T09:51:57Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///tmp/slack-desktop-4.19.2-amd64.deb" added="2021-09-15T11:45:49Z" modified="2021-09-15T11:45:49Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.debian.binary-package"/>
        <bookmark:applications>
          <bookmark:application name="Firefox" exec="&apos;firefox %u&apos;" modified="2021-09-15T11:45:49Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Downloads/google-chrome-stable_current_amd64.deb" added="2021-09-15T11:52:39Z" modified="2021-09-15T11:52:39Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.debian.binary-package"/>
        <bookmark:applications>
          <bookmark:application name="Firefox" exec="&apos;firefox %u&apos;" modified="2021-09-15T11:52:39Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Documents/libretest" added="2021-09-15T11:58:53Z" modified="2021-09-15T11:58:53Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/octet-stream"/>
        <bookmark:applications>
          <bookmark:application name="LibreOffice 6.4" exec="&apos;soffice %u&apos;" modified="2021-09-15T11:58:53Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Documents/libretest.odt" added="2021-09-15T11:58:53Z" modified="2021-09-15T15:42:04Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.oasis.opendocument.text"/>
        <bookmark:applications>
          <bookmark:application name="LibreOffice 6.4" exec="&apos;soffice %u&apos;" modified="2021-09-15T15:42:04Z" count="12"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Git/node-socket" added="2021-09-16T13:26:25Z" modified="2021-09-16T13:26:49Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="inode/directory"/>
        <bookmark:applications>
          <bookmark:application name="code" exec="&apos;code %u&apos;" modified="2021-09-16T13:26:49Z" count="2"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
</xbel>

In my code I tried to access bookmark:applications, without success.

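import xml.etree.ElementTree as ET
from pathlib import Path

home = str(Path.home())
root = ET.parse(home + '/.local/share/recently-used.xbel').getroot()
print(root)
print('lower')
for bookmark in root.iter('bookmark'):
    print(bookmark)
    # Fails: find() returns None because 'applications' is namespaced
    # and is not a direct child of <bookmark>
    for applications in bookmark.find('applications'):
        print(applications)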

What is the correct way to access bookmark:applications and find the most recently visited entry?


2 Answers

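Stack Overflow user

Accepted answer

Posted 2021-09-29 14:41:11

This works for accessing bookmark:applications, and the DataFrame helps you get the most recently visited or modified bookmark together with its application name.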

import xml.etree.ElementTree as ET
import pandas as pd

root = ET.parse('/content/sample.xml').getroot()
lst = []

# <bookmark> elements are direct, un-namespaced children of <xbel>
for bookmark in root.findall('bookmark'):
  bookmark_lst = []
  print(bookmark.attrib)
  bookmark_lst.append(bookmark.attrib['href'])
  bookmark_lst.append(bookmark.attrib['modified'])
  bookmark_lst.append(bookmark.attrib['visited'])
  # Namespaced tags look like '{uri}application', so match on the substring
  for ele in bookmark.iter():
    if 'application' in ele.tag:
      if 'name' in ele.attrib:
        bookmark_lst.append(ele.attrib['name'])
  lst.append(bookmark_lst)

df = pd.DataFrame(lst, columns=['href', 'modified', 'visited', 'application_name'])

# Convert the ISO 8601 strings to datetimes so min()/max() compare by time
df['modified'] = pd.to_datetime(df['modified'])
df['visited'] = pd.to_datetime(df['visited'])

least_recent_date = df['visited'].min()
most_recent_date = df['visited'].max()
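
If pandas is not available, the same lookup can probably be done with the standard library alone. The following is a minimal sketch, not part of the original answer: the inline `SAMPLE` document and the helper names `parse_timestamp` and `latest_bookmark` are made up for illustration, and the timestamp format is assumed to always match the `2021-09-16T13:26:26Z` form shown in the question.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical two-entry sample in the same shape as recently-used.xbel.
SAMPLE = """<xbel version="1.0">
  <bookmark href="file:///tmp/a.deb" modified="2021-09-14T12:09:05Z" visited="2021-09-15T09:12:13Z"/>
  <bookmark href="file:///home/test/Git/node-socket" modified="2021-09-16T13:26:49Z" visited="2021-09-16T13:26:26Z"/>
</xbel>"""


def parse_timestamp(value):
    """Parse a timestamp such as 2021-09-16T13:26:26Z into an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def latest_bookmark(root, attribute="visited"):
    """Return the <bookmark> element with the newest value of `attribute`."""
    # <bookmark> itself is not namespaced, so a plain tag name is enough.
    return max(
        root.findall("bookmark"),
        key=lambda bookmark: parse_timestamp(bookmark.attrib[attribute]),
    )


root = ET.fromstring(SAMPLE)
newest = latest_bookmark(root)
print(newest.attrib["href"], newest.attrib["visited"])
```

Passing `attribute="modified"` selects by modification time instead. For a real file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`.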
Votes: 1

Stack Overflow user

Posted 2021-09-29 20:29:30

from lxml import etree

NS = {"n": "http://www.freedesktop.org/standards/desktop-bookmarks"}

root = etree.parse("book.xml")
bookmarks = root.xpath("//bookmark")
most_recent_bookmark = max(
    bookmarks,
    key=lambda bmark: bmark.xpath(
        "string(.//n:application/@modified)",
        namespaces=NS,
    ),
)

print("Most recent href: " + most_recent_bookmark.xpath("string(@href)"))
print(
    "Most recent modified: "
    + most_recent_bookmark.xpath("string(.//n:application/@modified)", namespaces=NS)
)

Output:

Most recent href: file:///home/test/Git/node-socket
Most recent modified: 2021-09-16T13:26:49Z

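The problem you ran into is how to specify the namespace, which is represented by the bookmark: prefix in the original XML and by the n: prefix in the code sample. The xpath(), find(), and findall() functions all let you supply a namespace dictionary.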

https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.findall
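
As a rough illustration of the namespace dictionary with the standard library's `find()` and `findall()`, here is a small sketch. The inline `SAMPLE` is a hypothetical one-entry cut-down of the question's file, and the `NS` and `app_names` names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map. The prefix only has to match the paths used below;
# the URI has to match the one declared in the file.
NS = {"bookmark": "http://www.freedesktop.org/standards/desktop-bookmarks"}

# Hypothetical one-entry sample shaped like the question's file.
SAMPLE = """<xbel version="1.0"
      xmlns:bookmark="http://www.freedesktop.org/standards/desktop-bookmarks"
      xmlns:mime="http://www.freedesktop.org/standards/shared-mime-info">
  <bookmark href="file:///home/test/Git/node-socket" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="inode/directory"/>
        <bookmark:applications>
          <bookmark:application name="code" modified="2021-09-16T13:26:49Z" count="2"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
</xbel>"""

root = ET.fromstring(SAMPLE)
app_names = []

for bookmark in root.findall("bookmark"):
    # A bare tag name only matches direct children, so spell out the path
    # down to the namespaced element and pass the prefix map.
    applications = bookmark.find("info/metadata/bookmark:applications", NS)
    if applications is None:
        continue
    for application in applications.findall("bookmark:application", NS):
        app_names.append(application.get("name"))
        print(bookmark.get("href"), application.get("name"), application.get("modified"))
```

The un-prefixed call `bookmark.find("applications")` returns `None` on this data, which is why the loop in the question fails.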

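If, as you say, the namespace might change, you can use the XPath .//*[local-name() = 'application']/@modified, with no namespaces argument, in place of .//n:application/@modified. (That said, I would be surprised to see a producer change a namespace arbitrarily, since that is just asking for everything that consumes the data to break. The URL is as much a part of the node name as "application" is.)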
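
The `local-name()` function is available in lxml's XPath but not in the standard library's ElementTree. A comparable namespace-agnostic match in ElementTree is the `{*}` wildcard, which requires Python 3.8 or later. This is a different technique from the one named above, sketched here with a hypothetical sample that uses a made-up namespace URI to stand in for a changed one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the namespace URI is deliberately not the freedesktop one.
SAMPLE = """<xbel xmlns:b="urn:example:renamed-namespace">
  <bookmark href="file:///tmp/example">
    <info><metadata>
      <b:applications>
        <b:application name="code" modified="2021-09-16T13:26:49Z"/>
      </b:applications>
    </metadata></info>
  </bookmark>
</xbel>"""

root = ET.fromstring(SAMPLE)

# "{*}application" matches <application> in any namespace, or in none.
modified = [app.get("modified") for app in root.findall(".//{*}application")]
print(modified)
```

A plain `root.findall(".//application")` finds nothing here, because without the wildcard the tag is only matched when it has no namespace.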

Votes: 2
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/69377374
