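I'm new to programming and Python, and I'm trying to get the number of available bikes at a given station in the DC bikeshare program. I believe the best way to do this is with BeautifulSoup. The good news is that the data appears to be available in a clean format here: https://www.capitalbikeshare.com/data/stations/bikeStations.xml
Here is an example of one station: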
<station>
<id>1</id>
<name>15th & S Eads St</name>
<terminalName>31000</terminalName>
<lastCommWithServer>1460217337648</lastCommWithServer>
<lat>38.858662</lat>
<long>-77.053199</long>
<installed>true</installed>
<locked>false</locked>
<installDate>0</installDate>
<removalDate/>
<temporary>false</temporary>
<public>true</public>
<nbBikes>7</nbBikes>
<nbEmptyDocks>8</nbEmptyDocks>
<latestUpdateTime>1460192501598</latestUpdateTime>
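</station>

I am looking for the value of <nbBikes>. I have the start of a Python script that will show the values for the first 5 stations (once I have this under control, I will handle picking the station I want), but it does not return any values. The script is below: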
# bikeShareParse.py - parses the capital bikeshare info page
import bs4, requests
url = "https://www.capitalbikeshare.com/data/stations/bikeStations.xml"
res = requests.get(url)
res.raise_for_status()
#create the soup element from the file
soup = bs4.BeautifulSoup("res.text", "lxml")
# defines the part of the page we are looking for
nbikes = soup.select('#text')
#limits number of results for testing
numOpen = 5
for i in range(numOpen):
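    print nbikes

I think my problem (aside from not knowing how to format code properly in a Stack Overflow question) is that nbikes = soup.select('#text') is not the right value. However, I can't seem to substitute anything for '#text' that returns any value at all, let alone the one I want.
Am I going about this the right way? If so, what am I missing?
Thanks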
Posted 2016-04-10 06:22:45
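This script pairs each station ID with its remaining bikes, as a list of (station_ID, bikes_remaining) tuples that dict() can turn into a dictionary. It is modified from the starting point here: http://www.plotsofdots.com/archives/68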
# from http://www.plotsofdots.com/archives/68
import xml.etree.ElementTree as ET
import urllib2
# fetch the feed with urllib2 and parse it with ElementTree
site = 'https://www.capitalbikeshare.com/data/stations/bikeStations.xml'
htm = urllib2.urlopen(site)
doc = ET.parse(htm)
# get the root tag
root = doc.getroot()
# empty lists for the station IDs and the available bikes
sID = []
embikes = []
# loop over every <station> element and extract the fields we are interested in
for station in root.findall('station'):
    sID.append(station.find('id').text)
    embikes.append(int(station.find('nbBikes').text))
# this just tests that the process above works, can be commented out
#print embikes
#print sID
# use zip to pair each station ID with its bike count (a list of tuples in Python 2)
prov = zip(sID, embikes)
print prov[0]

https://stackoverflow.com/questions/36519932
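The script in the answer targets Python 2 (urllib2 and the print statement). For readers on Python 3, here is a minimal sketch of the same ElementTree approach that returns a dictionary keyed by station ID. To keep it runnable without network access, it parses an inline sample instead of the live feed. The <stations> root element, the second station, and the name bikes_by_station are assumptions made for illustration; only station 1 comes from the question.

```python
import xml.etree.ElementTree as ET

# Inline sample in the feed's format. The <stations> root element and the
# second station are assumptions for illustration; station 1 is from the question.
SAMPLE_XML = """<stations>
  <station>
    <id>1</id>
    <name>15th &amp; S Eads St</name>
    <terminalName>31000</terminalName>
    <nbBikes>7</nbBikes>
    <nbEmptyDocks>8</nbEmptyDocks>
  </station>
  <station>
    <id>2</id>
    <name>Hypothetical Station</name>
    <terminalName>31001</terminalName>
    <nbBikes>3</nbBikes>
    <nbEmptyDocks>12</nbEmptyDocks>
  </station>
</stations>"""


def bikes_by_station(xml_text):
    """Map each station's <id> text to its <nbBikes> count as an int."""
    root = ET.fromstring(xml_text)
    return {
        station.find("id").text: int(station.find("nbBikes").text)
        for station in root.findall("station")
    }


bikes = bikes_by_station(SAMPLE_XML)
print(bikes["1"])  # prints 7
```

To run this against live data, the text returned by urllib.request.urlopen(url).read() could be passed to bikes_by_station in place of SAMPLE_XML, assuming the feed is still served at the URL from the question and keeps the same element names.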