There is a file named "name.txt" with the following content:
<td>
<input class="name" value="Michael">
<input class="age" value="22">
<input class="location" value="hebei">
</td>
<td>
<input class="name" value="Jack">
<input class="age" value="23">
<input class="location" value="NewYo">
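</td>
Now I want to use pyquery to get all the input tags and then iterate over them,
using '.filter' to pick out the elements with class name and class age.
Finally, I want to get the values of name and age and write all the results to a file named 'name_file.txt'.
My code is as follows: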
# -*- coding: utf-8 -*-
from pyquery import PyQuery as pq
doc = pq(filename='name.txt')
input = doc('input')
for result in input.items():
    name_result = result.filter('.name')
    age_result = result.filter('.age')
    name = name_result.attr('value')
    age = age_result.attr('value')
    print "%s:%s" %(name,age)
    c = "%s:%s" %(name,age)
    f = file('name_file.txt','w')
    f.write(c)
    f.close()
But now I have run into two problems:
1. The result I get is not "Michael:22" but "Michael:None" and "None:22".
2. The content written to 'name_file.txt' is just 'None:None', not all the results I got.
Posted on 2017-07-08 13:02:42
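The first problem stems from looping over all the <input ... > elements (collected by doc('input')): each iteration sees a single input, and .filter() only narrows the current selection, so you get either the name or the age but never both. What you can do instead is loop over the individual <td> ... </td> blocks and use .find() to extract the matching descendants. It is a bit wasteful, but it stays close to your original idea: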
from pyquery import PyQuery as pq
doc = pq(filename='name.txt')  # open our document from `name.txt` file
for result in doc('td').items():  # loop through all <td> ... </td> items
    name_result = result.find('.name')  # grab a tag with class="name"
    age_result = result.find('.age')  # grab a tag with class="age"
    name = name_result.attr('value')  # get the name's `value` attribute value
    age = age_result.attr('value')  # get the age's `value` attribute value
    print("{}:{}".format(name, age))  # print it to the STDOUT as name:age
As for the second part: you open your name_file.txt file in write mode, write one line, and then close it on every iteration. Opening a file in write mode truncates everything in it, so each iteration overwrites the first line again. Try this instead:
from pyquery import PyQuery as pq
doc = pq(filename='name.txt')  # open our document from `name.txt` file
with open("name_file.txt", "w") as f:  # open name_file.txt for writing
    for result in doc('td').items():  # loop through all <td> ... </td> items
        name_result = result.find('.name')  # grab a tag with class="name"
        age_result = result.find('.age')  # grab a tag with class="age"
        name = name_result.attr('value')  # get the name's `value` attribute value
        age = age_result.attr('value')  # get the age's `value` attribute value
        print("{}:{}".format(name, age))  # print values to the STDOUT as name:age
        f.write("{}:{}\n".format(name, age))  # write to the file as name:age + a new line
Posted on 2017-07-08 13:26:59
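from pyquery import PyQuery as pq
doc = pq(filename='name.txt')  # the file from the question
body = doc.children('body')
f = open('name_file.txt', 'w')  # open the output file once, before the loop
for x in [result.html() for result in body.items('td')]:
    x = pq(x)  # re-parse the inner HTML of each <td>
    name = x('input').eq(0).attr('value')  # first input holds the name
    age = x('input').eq(1).attr('value')  # second input holds the age
    print("%s:%s" % (name, age))
    c = "%s:%s\n" % (name, age)  # newline keeps one record per line
    f.write(c)
f.close()  # close the file once, after the loop
You must not put the file-open statement inside the loop; otherwise every iteration overwrites the file with a single record.
Similarly, close the file after the loop, not after writing each record.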
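The truncation behaviour described in both answers can be reproduced with the standard library alone. The following is a minimal sketch, not part of the original answers: the helper names write_reopening_each_time, write_opening_once, and read_back are hypothetical, and the sample records simply reuse the values from the question. It writes into a temporary directory so no existing file is touched.

```python
import os
import tempfile

# Sample records, reusing the values from the question (illustrative only).
records = ["Michael:22", "Jack:23"]


def write_reopening_each_time(path, lines):
    """Anti-pattern: mode "w" truncates the file on every open,
    so only the record written last survives."""
    for line in lines:
        with open(path, "w") as f:
            f.write(line + "\n")


def write_opening_once(path, lines):
    """Open the file once before the loop; it is closed when the block exits."""
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")


def read_back(path):
    """Return the full contents of the file as a string."""
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "name_file.txt")

    write_reopening_each_time(path, records)
    overwritten = read_back(path)

    write_opening_once(path, records)
    complete = read_back(path)

print(repr(overwritten))  # 'Jack:23\n'
print(repr(complete))     # 'Michael:22\nJack:23\n'
```

Opening the file in append mode ("a") inside the loop would also keep every record, but it would keep adding to whatever a previous run left behind, so opening once in "w" mode is usually the simpler choice here.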
https://stackoverflow.com/questions/44982257