I am using the following code to retrieve economic data from a website:
from bs4 import BeautifulSoup
from selenium import webdriver
url = 'https://www.fxstreet.com/economic-calendar'
driver = webdriver.Chrome()
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
for tr in soup.findAll('tr',{'class':['fxst-tr-event', 'fxst-oddRow', 'fxit-eventrow', 'fxst-evenRow', 'fxs_cal_nextEvent']}):
    event = tr.find('div', {'class': 'fxit-event-title'}).text
    currency = tr.find('div', {'class': 'fxit-event-name'}).text
    actual = tr.find('div', {'class': 'fxit-actual'}).text
    forecast = tr.find('div', {'class': 'fxit-consensus'}).text
    previous = tr.find('div', {'class': 'fxst-td-previous fxit-previous'}).text
    time = tr.find('div', {'class': 'fxit-eventInfo-time fxs_event_time'}).text
    volatility = tr.find('div', {'class': 'fxit-eventInfo-vol-c fxit-event-info-desktop'}).span['title']
    print(u'\t{}\t{}\t{}\t{}'.format(time, currency, event, volatility))
The output of the print statement is as follows:
23:30
AUD
AiG Performance of Construction Index (Jul)
Moderate volatility expected
23:50
JPY
JP Foreign Reserves (Jul)
Low volatility expected
24h
CAD
August Civic Holiday
No volatility expected
01:30
AUD
ANZ Job Advertisements (Jun)
Low volatility expected
n/a
CNY
Foreign Exchange Reserves (MoM) (Jul)
Low volatility expected
05:00
JPY
Coincident Index (Jun)Preliminar
Moderate volatility expected
05:00
Is it possible to format the output so that each event is printed on a single line, like this?
23:30 AUD AiG Performance of Construction Index (Jul) Moderate volatility expected
23:50 JPY JP Foreign Reserves (Jul) Low volatility expected
24h CAD August Civic Holiday No volatility expected
01:30 AUD ANZ Job Advertisements (Jun) Low volatility expected
n/a CNY Foreign Exchange Reserves (MoM) (Jul) Low volatility expected
05:00 JPY Coincident Index (Jun)Preliminary Moderate volatility expected
The end goal is to cut this output and paste it into an Excel file. Thanks in advance!
Posted on 2017-08-07 02:12:17
Try stripping the newlines like this:
from bs4 import BeautifulSoup
from selenium import webdriver
url = 'https://www.fxstreet.com/economic-calendar'
driver = webdriver.Chrome()
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
for tr in soup.findAll('tr',{'class':['fxst-tr-event', 'fxst-oddRow', 'fxit-eventrow', 'fxst-evenRow', 'fxs_cal_nextEvent']}):
    event = tr.find('div', {'class': 'fxit-event-title'}).text
    currency = tr.find('div', {'class': 'fxit-event-name'}).text
    actual = tr.find('div', {'class': 'fxit-actual'}).text
    forecast = tr.find('div', {'class': 'fxit-consensus'}).text
    previous = tr.find('div', {'class': 'fxst-td-previous fxit-previous'}).text
    time = tr.find('div', {'class': 'fxit-eventInfo-time fxs_event_time'}).text
    volatility = tr.find('div', {'class': 'fxit-eventInfo-vol-c fxit-event-info-desktop'}).span['title']
    # strip() removes the leading and trailing whitespace, including newlines
    print(u'\t{}\t{}\t{}\t{}'.format(time.strip(), currency.strip(), event.strip(), volatility.strip()))
This way, none of the strings will have leading or trailing newline characters.
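One caveat: strip() only removes whitespace at the start and end of a string, so a line break in the middle of a field would survive. Below is a minimal sketch of a stricter cleanup, assuming Python 3. The helper names clean_field and format_row are hypothetical, and the sample row is illustrative text modeled on the question's output, not scraped data.

```python
def clean_field(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) into one space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces.
    return " ".join(text.split())


def format_row(fields):
    """Join the cleaned fields with tabs so each event ends up on a single line."""
    return "\t".join(clean_field(field) for field in fields)


# Illustrative values with stray line breaks, shaped like the scraped text.
row = [
    "\n23:30\n",
    "\nAUD\n",
    "\nAiG Performance of\nConstruction Index (Jul)\n",
    "Moderate volatility expected",
]
print(format_row(row))
```

Tab-separated lines like this can usually be pasted into a spreadsheet with one field per column.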
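Posted on 2017-08-07 02:58:27
To complement the other answer: since you mentioned that "the end goal is to cut this output and paste it into an Excel file", you may also be interested in generating a .csv file from the data, so that it can easily be opened in Excel instead of being copy-pasted. After adding import csv, you would change the loop to: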
with open("data.csv", "w") as csv_file:
    # Create the writer once, before the loop, instead of once per row
    writer = csv.writer(csv_file, delimiter=',')
    for tr in soup.findAll('tr',{'class':['fxst-tr-event', 'fxst-oddRow', 'fxit-eventrow', 'fxst-evenRow', 'fxs_cal_nextEvent']}):
        event = tr.find('div', {'class': 'fxit-event-title'}).text
        currency = tr.find('div', {'class': 'fxit-event-name'}).text
        actual = tr.find('div', {'class': 'fxit-actual'}).text
        forecast = tr.find('div', {'class': 'fxit-consensus'}).text
        previous = tr.find('div', {'class': 'fxst-td-previous fxit-previous'}).text
        time = tr.find('div', {'class': 'fxit-eventInfo-time fxs_event_time'}).text
        volatility = tr.find('div', {'class': 'fxit-eventInfo-vol-c fxit-event-info-desktop'}).span['title']
        line = [time.strip(),currency.strip(),event.strip(),volatility.strip()]
        writer.writerow(line)
        print(line)
https://stackoverflow.com/questions/45538472
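For reference, here is a small self-contained sketch of the csv.writer pattern, assuming Python 3. The rows are sample values copied from the question's output rather than freshly scraped data, and the helper name rows_to_csv is hypothetical. It writes to an in-memory buffer so it can be run without a browser or a file.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows (lists of strings) to CSV text using csv.writer."""
    # newline="" disables newline translation so csv controls the line endings.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=",")
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Sample rows taken from the question's output.
rows = [
    ["23:30", "AUD", "AiG Performance of Construction Index (Jul)", "Moderate volatility expected"],
    ["n/a", "CNY", "Foreign Exchange Reserves (MoM) (Jul)", "Low volatility expected"],
]
print(rows_to_csv(rows))
```

When writing to a real file on Python 3, the csv module documentation recommends opening it with newline="", for example open("data.csv", "w", newline=""), which avoids blank lines between rows on Windows.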