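Novice needs help making code object-oriented.
I am trying to write a class with several methods for handling XML files. One of the methods should return a dictionary whose keys are the filenames of the embedded attachments and whose values are the encoded data strings.
I have managed to get this working outside of a class: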
import xml.etree.ElementTree as ET

tree = ET.parse('invoice.xml')
root = tree.getroot()
namespace = {
    'cac': 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2',
    'cbc': 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
    'ext': 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2',
    'ccts': 'urn:un:unece:uncefact:documentation:2',
    'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
}
attachments = {}
for document in root.findall('cac:AdditionalDocumentReference', namespace):
    filename = document.find('cbc:ID', namespace).text
    print(filename)
    # Find the embedded file
    for child in document.findall('cac:Attachment', namespace):
        attachment = child.find('cbc:EmbeddedDocumentBinaryObject', namespace).text
        attachments[filename] = attachment
But I have not been able to turn this into a class method: the method returns an empty dictionary. The code I am working on is:
import xml.etree.ElementTree as ET

class Invoice:
    """
    Common tasks in relation to EHF invoices.
    """
    namespace = {
        'cac': 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2',
        'cbc': 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
        'ext': 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2',
        'ccts': 'urn:un:unece:uncefact:documentation:2',
        'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
    }
    attachments = {}

    def __init__(self, invoice):
        """Initialize invoice attributes."""
        self.invoice = invoice
        # Dictionary for namespace used in EHF invoices
        self.namespace = self.namespace

    def encoded_attachments(self):
        """
        Return the embedded attachments from the EHF invoice in encoded form
        as a dictionary.
        Keys = filenames
        Value = base64 encoded files
        """
        for document in self.invoice.findall('cac:AdditonalDocumentReference', self.namespace):
            # Find filename
            filename = document.find('cbc:ID', self.namespace).text
            # Find the embedded file
            for child in document.findall('cac:Attachment', namespace):
                attachment = child.find('cbc:EmbeddedDocumentBinaryObject', self.namespace).text
                # Add filename and attachment to dictionary
                self.attachments[filename] = attachment
        return(self.attachments)

tree = ET.parse('invoice.xml')
root = tree.getroot()
ehf = Invoice(root)
attach_dict = ehf.encoded_attachments()
print(attach_dict)
I think I am missing something important about classes. Any help is much appreciated.
Edit:
Part of the XML file. The encoded data has been replaced with a dummy text string.
<?xml version="1.0" encoding="UTF-8"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
         xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2"
         xmlns:ccts="urn:un:unece:uncefact:documentation:2"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <cbc:CustomizationID>urn:cen.eu:en16931:2017#compliant#urn:fdc:peppol.eu:2017:poacc:billing:3.0</cbc:CustomizationID>
    <cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
    <cbc:ID>1060649</cbc:ID>
    <cbc:IssueDate>2020-01-23</cbc:IssueDate>
    <cbc:DueDate>2020-02-07</cbc:DueDate>
    <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
    <cbc:TaxPointDate>2020-01-23</cbc:TaxPointDate>
    <cbc:DocumentCurrencyCode>NOK</cbc:DocumentCurrencyCode>
    <cbc:BuyerReference>N/A</cbc:BuyerReference>
    <cac:AdditionalDocumentReference>
        <cbc:ID>invoice_attachment_filename.pdf</cbc:ID>
        <cbc:DocumentTypeCode>130</cbc:DocumentTypeCode>
        <cbc:DocumentDescription>CommercialInvoice</cbc:DocumentDescription>
        <cac:Attachment>
            <cbc:EmbeddedDocumentBinaryObject mimeCode="application/pdf" filename="1060649.pdf">BASE64ENCODEDTEXT</cbc:EmbeddedDocumentBinaryObject>
        </cac:Attachment>
    </cac:AdditionalDocumentReference>
</Invoice>
Posted on 2020-07-22 13:46:17
The answer is (drum roll...): everything is correct apart from one typo in the tag name. Compare the old and new code:
old: for document in root.findall('cac:AdditionalDocumentReference', namespace)
new: for document in self.invoice.findall('cac:AdditonalDocumentReference', self.namespace)
                                                    ^
By the way, you can omit the line self.namespace = self.namespace.
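A misspelled tag name does not raise an error in ElementTree, which is probably why the method quietly returned an empty dictionary. Here is a minimal sketch of that behaviour. The one-element document and the namespace URI urn:example:cac are placeholders made up for illustration, not taken from the real invoice:

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document. The element name mirrors the invoice excerpt,
# but the namespace URI is a made-up placeholder (an assumption).
NS = {'cac': 'urn:example:cac'}
root = ET.fromstring(
    '<Invoice xmlns:cac="urn:example:cac">'
    '<cac:AdditionalDocumentReference/>'
    '</Invoice>'
)

correct = root.findall('cac:AdditionalDocumentReference', NS)
misspelled = root.findall('cac:AdditonalDocumentReference', NS)

print(len(correct))     # 1
print(len(misspelled))  # 0, and no exception is raised
```

Because findall returns an empty list for the misspelled name, a for loop over it never executes its body, and nothing is ever added to the dictionary.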
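Posted on 2020-07-22 12:38:53
Inconsistent use of self: the first call passes the bare name namespace, while the second passes self.namespace.
for child in document.findall('cac:Attachment', namespace):
    attachment = child.find('cbc:EmbeddedDocumentBinaryObject', self.namespace).text
Posted on 2020-07-22 12:49:57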
You have made two mistakes here.
The first is that you are using class variables (read about them here: https://docs.python.org/3/tutorial/classes.html).
The second is the one gokaai pointed out.
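Both points can be seen in a small standalone sketch. The class names SharedState and OwnState, the method lookup, and the placeholder URI are hypothetical and exist only for this illustration. It assumes no module-level variable called namespace is defined:

```python
class SharedState:
    # Class attributes: one dict object shared by every instance.
    attachments = {}
    namespace = {'cbc': 'urn:example:cbc'}  # placeholder URI (assumption)

    def lookup(self):
        # A bare name inside a method is resolved in local, enclosing
        # function, global, and builtin scope, never in the class body,
        # so this raises NameError unless a global `namespace` exists.
        return namespace


class OwnState:
    def __init__(self):
        # Instance attribute: each object gets its own dict.
        self.attachments = {}


a, b = SharedState(), SharedState()
a.attachments['a.pdf'] = 'AAAA'
print(b.attachments)  # {'a.pdf': 'AAAA'}: b sees the entry added through a

c, d = OwnState(), OwnState()
c.attachments['c.pdf'] = 'CCCC'
print(d.attachments)  # {}

try:
    a.lookup()
except NameError as exc:
    print('NameError:', exc)
```

In the question's class the NameError never surfaces, because the misspelled tag name means the outer loop body, and with it the line that uses the bare name, is never reached.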
This should work:
import xml.etree.ElementTree as ET

class Invoice:
    """
    Common tasks in relation to EHF invoices.
    """
    def __init__(self, invoice):
        """Initialize invoice attributes."""
        self.invoice = invoice
        # Namespace prefixes, matching the xmlns declarations in the invoice
        self.namespace = {
            'cac': 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2',
            'cbc': 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
            'ext': 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2',
            'ccts': 'urn:un:unece:uncefact:documentation:2',
            'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
        }
        # Per-instance dictionary, so invoices do not share attachments
        self.attachments = {}

    def encoded_attachments(self):
        """
        Return the embedded attachments from the EHF invoice in encoded form
        as a dictionary.
        Keys = filenames
        Value = base64 encoded files
        """
        for document in self.invoice.findall('cac:AdditionalDocumentReference', self.namespace):
            # Find filename
            filename = document.find('cbc:ID', self.namespace).text
            # Find the embedded file
            for child in document.findall('cac:Attachment', self.namespace):
                # Add filename and attachment to dictionary
                self.attachments[filename] = child.find('cbc:EmbeddedDocumentBinaryObject', self.namespace).text
        return self.attachments
https://stackoverflow.com/questions/63034082