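I have some Python code that iterates over a large XML file and parses out certain values from comma-separated elements.
While the code extracts the results, I also need it to count them. How do I write this loop in my current code, and where should it go? Inside the loop, after my .split() call? Or after the loop?
My code: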
#Working python code
#Import libraries
import webbrowser
from lxml import etree

#Parses the file to get the tree paths
tree = etree.parse('g_d_anime.xml')
root = tree.getroot()

#Starts the XPATH using findall to verify the elements inside the tree are right
#Might not be necessary, but it's not a bad check to have
tree.findall('WorksXML/Work')

#Opens the output file for the results
h_file = open("xml_stats.html","w")

#Start writing the HTML lines
h_file.write("This block shows the producers and genres for this file.")
h_file.write('<br>')

#These loops first split the string, then select the specific value I want to pull out
for Producers in root.iter('Producers'):
    p = Producers.text.split(',')
    for producer in p:
        #if you require only to display only 'Nitroplus', put '==' in place of '!='
        if producer == 'Aniplex':
            print(p) #This prints all the values in the strings
            h_file.write('<li>' + str(producer) + '</li>') #This only writes the selected value
            h_file.write('<br>')

for Genres in root.iter('Genres'):
    g = Genres.text.split(',')
    for genre in g:
        #if you require only to display only 'Magic', put '==' in place of '!='
        if genre == 'Magic':
            print(g) #This prints all the values in the strings
            h_file.write('<li>'+ str(genre) + '</li>') #This only writes the selected value
            h_file.write('<br>')

#Should the counting loop for the above go here? Or within the above loop?
h_file.close()
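webbrowser.open_new_tab("xml_stats.html")

I don't know where to put the count loop, a .sum() call, or a counter line, if that would be simpler. The result should look like this:
Aniplex: 7
Magic: 9

The file has more than 1,500 records and is structured as follows: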
<WorksXML>
<Work>
<Title>Fullmetal Alchemist: Brotherhood</Title>
<Type>TV</Type>
<Episodes>64</Episodes>
<Status>Finished Airing</Status>
<Start_airing>2009-4-5</Start_airing>
<End_airing>2010-7-4</End_airing>
<Starting_season>Spring</Starting_season>
<Broadcast_time>Sundays at 17:00 (JST)</Broadcast_time>
<Producers>Aniplex,Square Enix,Mainichi Broadcasting System,Studio Moriken</Producers>
<Licensors>Funimation,Aniplex of America</Licensors>
<Studios>Bones</Studios>
<Sources>Manga</Sources>
<Genres>Action,Military,Adventure,Comedy,Drama,Magic,Fantasy,Shounen</Genres>
<Duration>24 min. per ep.</Duration>
<Rating>R</Rating>
<Score>9.25</Score>
<Scored_by>719706</Scored_by>
<Members>1176368</Members>
<Favorites>105387</Favorites>
<Description>"In order for something to be obtained, something of equal value must be lost." Alchemy is bound by this Law of Equivalent Exchange—something the young brothers Edward and Alphonse Elric only realize after attempting human transmutation: the one forbidden act of alchemy. They pay a terrible price for their transgression—Edward loses his left leg, Alphonse his physical body. It is only by the desperate sacrifice of Edward's right arm that he is able to affix Alphonse's soul to a suit of armor. Devastated and alone, it is the hope that they would both eventually return to their original bodies that gives Edward the inspiration to obtain metal limbs called "automail" and become a state alchemist, the Fullmetal Alchemist. Three years of searching later, the brothers seek the Philosopher's Stone, a mythical relic that allows an alchemist to overcome the Law of Equivalent Exchange. Even with military allies Colonel Roy Mustang, Lieutenant Riza Hawkeye, and Lieutenant Colonel Maes Hughes on their side, the brothers find themselves caught up in a nationwide conspiracy that leads them not only to the true nature of the elusive Philosopher's Stone, but their country's murky history as well. In between finding a serial killer and racing against time, Edward and Alphonse must ask themselves if what they are doing will make them human again... or take away their humanity.
</Description>
</Work>
...
</WorksXML>

I know the loop should look something like this:
for pr in producer:
    pr = 0
    if pr in producer:
        producer[pr] = producer[pr] + 1
    else:
        producer[pr] = 1

If anyone has a better way to write this, please share.
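Posted on 2021-12-03 08:22:17
Since you only want to count Aniplex and Magic, increment the counters inside the if blocks, then write the totals to the file after the loops: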
aniplex_count = 0
magic_count = 0

for Producers in root.iter("Producers"):
    p = Producers.text.split(",")
    for producer in p:
        if producer == "Aniplex":
            aniplex_count += 1
            h_file.write(
                "<li>" + str(producer) + "</li>"
            )
            h_file.write("<br>")

for Genres in root.iter("Genres"):
    g = Genres.text.split(",")
    for genre in g:
        if genre == "Magic":
            magic_count += 1
            h_file.write(
                "<li>" + str(genre) + "</li>"
            )
            h_file.write("<br>")

h_file.write(f"<h2>Aniplex: {aniplex_count}</h2>")
h_file.write(f"<h2>Magic: {magic_count}</h2>")

https://stackoverflow.com/questions/70211098
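
The question also asks whether there is a better way to write the dictionary-style counting loop. One possible sketch, not taken from the answer above, is to let collections.Counter tally every comma-separated value and then look up whichever names are of interest. To keep the example runnable without extra packages it uses the standard library's xml.etree.ElementTree in place of lxml (root.iter() is called the same way in both), and it runs against a tiny made-up sample instead of g_d_anime.xml. The names SAMPLE_XML and count_values are hypothetical and exist only for this illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Tiny made-up sample with the same shape as the file in the question.
SAMPLE_XML = """
<WorksXML>
  <Work>
    <Producers>Aniplex,Square Enix</Producers>
    <Genres>Action,Magic</Genres>
  </Work>
  <Work>
    <Producers>Aniplex, Nitroplus</Producers>
    <Genres>Magic,Drama</Genres>
  </Work>
  <Work>
    <Producers></Producers>
    <Genres>Comedy</Genres>
  </Work>
</WorksXML>
"""


def count_values(root, tag):
    """Count every comma-separated value found in elements named `tag`."""
    counts = Counter()
    for element in root.iter(tag):
        # element.text is None for an empty element, so fall back to "".
        for value in (element.text or "").split(","):
            value = value.strip()  # tolerate "A, B" as well as "A,B"
            if value:
                counts[value] += 1
    return counts


root = ET.fromstring(SAMPLE_XML)
producer_counts = count_values(root, "Producers")
genre_counts = count_values(root, "Genres")

# A Counter returns 0 for names it never saw, so no "if key in dict" check is needed.
print(f"Aniplex: {producer_counts['Aniplex']}")
print(f"Magic: {genre_counts['Magic']}")
```

With the real file, root would come from etree.parse('g_d_anime.xml').getroot() as in the question, and the two totals could be written to h_file before it is closed. The trade-off is that this counts every producer and genre, which is more work than two hard-coded counters but makes it easy to report on other names later.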