I am trying to get the table from http://www.nasdaqomxnordic.com/shares/listed-companies/nordic-large-cap.
However, when I run either of the code snippets below, the call never finishes.
# Using requests
import requests

url = 'http://www.nasdaqomxnordic.com/shares/listed-companies/nordic-large-cap'
requests.get(url)  # never returns

# Using pandas
import pandas as pd

url = 'http://www.nasdaqomxnordic.com/shares/listed-companies/nordic-large-cap'
pd.read_html(url)  # never returns either

Replacing the URL with another one works fine; for example, https://en.wikipedia.org/wiki/List_of_S%26P_500_companies takes about a second.
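One way to make a hang like this visible: `requests` applies no timeout unless one is passed explicitly, so a server that never answers blocks the call indefinitely. The following is a minimal sketch, not a fix; the helper name `fetch_or_none` and the timeout values are assumptions chosen for illustration.

```python
import requests


def fetch_or_none(url, timeout=(5, 15)):
    """Return the response, or None if the server does not answer in time.

    `timeout` is (connect seconds, read seconds); the defaults here are
    illustrative assumptions, not recommended values.
    """
    try:
        return requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect and read timeouts.
        return None
```

If the server is simply withholding a response, `fetch_or_none(url)` should come back with `None` after roughly the timeout instead of blocking forever, which at least separates "slow" from "never answers".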
Posted on 2022-10-12 13:04:59
Here is one way to get that table:

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

# Send browser-like headers with the request.
headers = {
    'accept-language': 'en-US,en;q=0.9',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
}
url = 'http://www.nasdaqomxnordic.com/shares/listed-companies/nordic-large-cap'
r = requests.get(url, headers=headers)
# Pick the table by its id, then let pandas parse the markup.
table = bs(r.text, 'html.parser').select_one('table#listedCompanies')
df = pd.read_html(str(table))[0]
print(df)

Result in the terminal:
                      Name  Symbol  Currency          ISIN          Sector  ICB Code  Fact Sheet
  0                 TRATON    8TRA       SEK  DE000TRAT0N7     Industrials      5020         NaN
  1                    AAK     AAK       SEK  SE0011337708  Consumer Goods      4510         NaN
  2                ABB Ltd     ABB       SEK  CH0012221716     Industrials      5020         NaN
  3              Addtech B  ADDT B       SEK  SE0014781795     Industrials      5020         NaN
  4                   AFRY    AFRY       SEK  SE0005999836     Industrials      5010         NaN
...                    ...     ...       ...           ...             ...       ...         ...
249           Wallenstam B  WALL B       SEK  SE0017780133     Real Estate      3510         NaN
250  Wihlborgs Fastigheter    WIHL       SEK  SE0018012635     Real Estate      3510         NaN
251       Wärtsilä Oyj Abp   WRT1V       EUR  FI0009003727     Industrials      5020         NaN
252                YIT Oyj     YIT       EUR  FI0009800643     Industrials      5010         NaN
253         Zealand Pharma    ZEAL       DKK  DK0060257814     Health Care      2010         NaN

254 rows × 7 columns

https://stackoverflow.com/questions/74041882
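The parsing step from the answer can also be pulled into a small function so that a missing table fails with a clear message. This is only a sketch under stated assumptions: `table_from_html` is a hypothetical helper name, and the default selector is the `table#listedCompanies` used in the answer. `select_one` returns `None` when nothing matches, and `str(None)` passed on to `pd.read_html` would produce a confusing error. Wrapping the markup in `StringIO` avoids the deprecation warning that pandas 2.1 and later emit for literal HTML strings.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup


def table_from_html(html, css_selector='table#listedCompanies'):
    """Parse the first element matching css_selector into a DataFrame."""
    table = BeautifulSoup(html, 'html.parser').select_one(css_selector)
    if table is None:
        # Nothing matched, e.g. the page layout changed or the server
        # returned an error page instead of the listing.
        raise ValueError(f'no element matches {css_selector!r}')
    # read_html accepts file-like objects, so hand it the markup via StringIO.
    return pd.read_html(StringIO(str(table)))[0]
```

With the response `r` from the answer, `df = table_from_html(r.text)` would stand in for the `select_one` and `read_html` lines.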