I want to scrape the stock names from Euronext. The problem is that BeautifulSoup can't find the <td>...</td> tags that hold the stock names.
The page has:
<td class="stocks-name sorting_1" data-order="1000MERCIS"><a href="/en/product/equities/FR0010285965-ALXP/1000mercis/almil/quotes" data-order="1000MERCIS" data-title-hover="1000MERCIS">1000MERCIS</a></td>
I wrote:
import requests
from bs4 import BeautifulSoup

url = "https://live.euronext.com/en/products/equities/list"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
}
response = requests.get(url, headers=HEADERS).content
soup_data = BeautifulSoup(response, "lxml")
names = soup_data.find_all("td", attrs={"id":"stocks-name sorting_1"})
print(names)
I only got:
[]
Thanks
Posted on 2021-07-03 01:44:57
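A note on the selector in the question first. The cell shown there carries class="stocks-name sorting_1" and has no id attribute, so filtering on id cannot match even when the cell is present in the HTML. Here is a minimal sketch on a static copy of that cell (shortened, and wrapped in a table by me so the parser keeps the <td>; I use the built-in "html.parser" only to avoid extra dependencies). It does not touch the live page, and whether the live HTML contains these rows at all is a separate matter that this sketch does not test.

```python
from bs4 import BeautifulSoup

# Static copy of the cell from the question (href shortened), wrapped in a
# table for illustration only.
html = (
    '<table><tr>'
    '<td class="stocks-name sorting_1" data-order="1000MERCIS">'
    '<a href="/en/product/equities/FR0010285965-ALXP" '
    'data-order="1000MERCIS">1000MERCIS</a>'
    '</td>'
    '</tr></table>'
)

soup = BeautifulSoup(html, "html.parser")

# The filter from the question: looks for an id attribute, which the cell lacks.
by_id = soup.find_all("td", attrs={"id": "stocks-name sorting_1"})

# Filtering on one of the CSS classes instead matches the cell.
by_class = soup.find_all("td", class_="stocks-name")
names = [cell.a.get_text(strip=True) for cell in by_class]

print(by_id)
print(names)
```

Since "sorting_1" looks like a class added by the table's sorting script, matching on "stocks-name" alone is probably the more stable choice.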
import requests
import pandas as pd
from bs4 import BeautifulSoup


def main(url):
    # Query-string parameters: the market codes to include and the view to show.
    params = {
        "mics": "ALXB,ALXL,ALXP,XPAR,XAMS,XBRU,XLIS,XMLI,MLXB,ENXB,ENXL,TNLA,TNLB,XLDN,XESM,XMSM,XATL,VPXB,XOSL,XOAS,MERK",
        "display_datapoints": "dp_stocks",
        "display_filters": "df_stocks"
    }
    # Form fields sent in the POST body (table paging, ordering and search options).
    data = {
        "draw": "1",
        "columns[0][data]": "0",
        "columns[0][name]": "",
        "columns[0][searchable]": "true",
        "columns[0][orderable]": "true",
        "columns[0][search][value]": "",
        "columns[0][search][regex]": "false",
        "columns[1][data]": "1",
        "columns[1][name]": "",
        "columns[1][searchable]": "true",
        "columns[1][orderable]": "false",
        "columns[1][search][value]": "",
        "columns[1][search][regex]": "false",
        "columns[2][data]": "2",
        "columns[2][name]": "",
        "columns[2][searchable]": "true",
        "columns[2][orderable]": "false",
        "columns[2][search][value]": "",
        "columns[2][search][regex]": "false",
        "columns[3][data]": "3",
        "columns[3][name]": "",
        "columns[3][searchable]": "true",
        "columns[3][orderable]": "false",
        "columns[3][search][value]": "",
        "columns[3][search][regex]": "false",
        "columns[4][data]": "4",
        "columns[4][name]": "",
        "columns[4][searchable]": "true",
        "columns[4][orderable]": "false",
        "columns[4][search][value]": "",
        "columns[4][search][regex]": "false",
        "columns[5][data]": "5",
        "columns[5][name]": "",
        "columns[5][searchable]": "true",
        "columns[5][orderable]": "false",
        "columns[5][search][value]": "",
        "columns[5][search][regex]": "false",
        "columns[6][data]": "6",
        "columns[6][name]": "",
        "columns[6][searchable]": "true",
        "columns[6][orderable]": "false",
        "columns[6][search][value]": "",
        "columns[6][search][regex]": "false",
        "order[0][column]": "0",
        "order[0][dir]": "asc",
        "start": "0",
        "length": "100",
        "search[value]": "",
        "search[regex]": "false",
        "iDisplayLength": "2000",
        "iDisplayStart": "0",
        "sSortDir_0": "asc"
    }
    r = requests.post(url, params=params, data=data).json()
    allin = {'Names': []}
    # Each row's first field is an HTML snippet; the stock name is the link text.
    for x in r['aaData']:
        soup = BeautifulSoup(x[0], 'lxml')
        allin['Names'].append(soup.a.text)
    df = pd.DataFrame(allin)
    print(df)


if __name__ == "__main__":
    main('https://live.euronext.com/en/pd/data/stocks')

Output:
                   Names
0             1000MERCIS
1           2020 BULKERS
2                  2CRSI
3            2MX ORGANIC
4         2MX ORGANIC BS
...                  ...
1740         ZCI LIMITED
1741       ZENITH ENERGY
1742  ZENOBE GRAMME CERT
1743               ZWIPE
1744   ØRN SOFTWARE HLD.

[1745 rows x 1 columns]

https://stackoverflow.com/questions/68228077
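The form fields in the answer follow a regular pattern (six fields for each of seven columns, with only column 0 marked orderable), so the dictionary could also be generated instead of typed out. The following is only a sketch of that idea: build_datatables_payload and its parameters are hypothetical names of mine, not part of requests or of the Euronext endpoint, and the values simply reproduce the ones posted above.

```python
def build_datatables_payload(n_columns=7, page_length=100, legacy_length=2000):
    """Rebuild the form fields from the answer above (hypothetical helper)."""
    payload = {"draw": "1"}
    for i in range(n_columns):
        prefix = f"columns[{i}]"
        payload[f"{prefix}[data]"] = str(i)
        payload[f"{prefix}[name]"] = ""
        payload[f"{prefix}[searchable]"] = "true"
        # In the posted payload only the first column is orderable.
        payload[f"{prefix}[orderable]"] = "true" if i == 0 else "false"
        payload[f"{prefix}[search][value]"] = ""
        payload[f"{prefix}[search][regex]"] = "false"
    payload.update({
        "order[0][column]": "0",
        "order[0][dir]": "asc",
        "start": "0",
        "length": str(page_length),
        "search[value]": "",
        "search[regex]": "false",
        "iDisplayLength": str(legacy_length),
        "iDisplayStart": "0",
        "sSortDir_0": "asc",
    })
    return payload


payload = build_datatables_payload()
print(len(payload))
```

The result can be passed as data=payload to the requests.post call in place of the literal dictionary. The output above shows 1745 rows even though "length" is "100", which suggests the endpoint follows "iDisplayLength" rather than "length", but that is my inference from the output, not something I have confirmed.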