I have the following df with these columns:
DueDate
0 <cbc:DueDate>2020-10-18</cbc:DueDate>
1 <cbc:DueDate>2020-01-08</cbc:DueDate>
2 NaN
3 NaN
Streetname
0 <cbc:StreetName>Xerox GmbH</cbc:StreetName>
1 <cbc:StreetName>Rompslomp.nl B.V.</cbc:StreetName>
2 <cbc:StreetName>STAS picture</cbc:StreetName>
3 <cbc:StreetName>Rex International B.V.</cbc:StreetName>
PostalAdress
0 </cac:PostalAddress>
1 </cac:PostalAddress>
2 </cac:PostalAddress>
3 </cac:PostalAddress>
Name: PostalAdressClose, dtype: object
When I try to write this to a text file, I use the following code:
# xml document to be expanding with per row details
fac_doc_template = """<?xml version="1.0"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
<cbc:UBLVersionID>2.1</cbc:UBLVersionID>
<cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
<cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
{fac_details}"""
# per row details
# todo: expand for all of the column values you want
fac_details_xml_template = """{Streetname}
{DueDate}
"""
Then I use the code below to iterate over the rows and write each row to a separate file:
def series_to_fac_details_xml(s):
    return fac_details_xml_template.format(**s)

for index, row in df3.iterrows():
    details = series_to_fac_details_xml(row)
    with open(fr"C:\Users\Max12\Desktop\xml\pdfminer\UiPath\output\{index}.xml", "w") as f:
        f.write(fac_doc_template.format(fac_details=details))
I have one problem. I want NaN values to be skipped, but when I convert NaN to an empty string with:
df3 = df3.replace(np.nan, '', regex=True)
I get blank lines in the output file. The desired output is that, whenever a NaN occurs, writing continues straight on to the next column with no blank line in between. Can you help me?
Posted on 2020-12-18 05:52:58
Suppose you have this DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame({'DueDate': ['2020-01-01','2020-01-02',np.nan],
'Streetname':['Main Street 1', 'Main Street 2', 'Main Street 3']
})
df
>>>
      DueDate     Streetname
0  2020-01-01  Main Street 1
1  2020-01-02  Main Street 2
2         NaN  Main Street 3
Then you can replace the NaN values the way you already do, with df = df.replace(np.nan,'', regex=True).
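After that, I suggest using apply to build a new Series that holds each row in the format you want.
z = df.apply(lambda x: x['Streetname'] + ' ' + x['DueDate'], axis=1)
Later, you can call z.to_string(index=False) and write the result to your file. If you don't like the line breaks, you can strip them with z.to_string(index=False).replace('\n',''). I think this makes your code a bit cleaner, because you don't need to iterate over all the rows.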
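If the goal is to leave out missing values entirely, with no blank line or trailing space, one option is to skip the replace step and drop the NaN entries from each row before joining. This is only a minimal sketch: the helper name row_to_details and its columns parameter are made up for illustration, and the sample values are taken from the question.

```python
import numpy as np
import pandas as pd

# Sample data modelled on the question; the second row has no DueDate.
df = pd.DataFrame({
    "DueDate": ["<cbc:DueDate>2020-10-18</cbc:DueDate>", np.nan],
    "Streetname": [
        "<cbc:StreetName>Xerox GmbH</cbc:StreetName>",
        "<cbc:StreetName>STAS picture</cbc:StreetName>",
    ],
})


def row_to_details(row, columns=("Streetname", "DueDate")):
    """Join the non-missing values of a row, one per line, in `columns` order.

    Hypothetical helper: NaN entries are dropped, so they leave no blank line.
    """
    values = row[list(columns)].dropna()
    return "\n".join(str(v) for v in values)


# One details string per row; rows with NaN simply produce fewer lines.
details = [row_to_details(row) for _, row in df.iterrows()]
```

Each string in details could then be passed as fac_details to fac_doc_template.format(...) in place of the template-based details from the question.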
I really hope this helps and answers your question.
https://stackoverflow.com/questions/65348229