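I am using BeautifulSoup to scrape information from this website: https://www.gurufocus.com/insider/summary
There are two price columns with different values, but their cells use exactly the same element and class. Here is the markup of one such cell:
<td data-v-575fbbfb="" class="right-align number-field" data-column="Price" row-idx="0">
<span style="color: ">$2.12</span></td>
Here is part of my code: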
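from bs4 import BeautifulSoup
import requests
import pandas as pd

# fetch and parse the page (this step was not shown in the original snippet)
soup = BeautifulSoup(requests.get('https://www.gurufocus.com/insider/summary').content, 'html.parser')

price = []
for pr in soup.find_all('td', {'class': 'right-align number-field', 'data-column': 'Price'}):
    price.append(pr.text)
Does anyone know how to tell the two prices apart and collect them into two separate arrays?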
Posted on 2020-06-13 04:53:22
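You can use the built-in zip() function together with slicing. The matching cells alternate between the two columns, so the even-indexed cells belong to the first price column and the odd-indexed cells to the second.
For example: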
import requests
from bs4 import BeautifulSoup

url = 'https://www.gurufocus.com/insider/summary'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# all cells of both price columns, in document order
tds = soup.select('td[data-column="Price"]')

price_column_1, price_column_2 = [], []
# pair each even-indexed cell with the odd-indexed cell that follows it
for td_col1, td_col2 in zip(tds[::2], tds[1::2]):
    price_column_1.append(td_col1.text)
    price_column_2.append(td_col2.text)

# print to screen
for p1, p2 in zip(price_column_1, price_column_2):
    print('{:<10}{}'.format(p1, p2))
Prints:
$2.05 $2.12
$15.42 $14.79
$0.02 $0.02
$0.64 $0.63
$73.13 $76.89
$298.75 $308.05
$512.74 $517.77
$341.27 $357
$300.99 $311.13
$38.34 $39.02
$20.79 $21.72
$5.65 $5.37
$14.30 $14.43
$37.93 $36.24
$174.90 $177.79
$79.58 $83.49
$79.58 $83.49
$63.91 $66.56
$25.31 $25.90
$93.04 $95.37
$67.73 $72.59
$67.73 $71.59
$67.71 $71.55
$11.31 $10.93
$58.67 $60.62
$22.64 $25.21
$3.98 $4.01
$6.47 $6.25
$9.08 $8.84
$23.69 $23.79
$174.23 $178.10
$100.07 $99.75
$11.89 $12.01
$0.83 $0.83
$41.15 $25
$41.15 $25
$41.15 $25
$7.23 $4.73
$23.04 $21.27
$37.97 $35.57
Posted on 2020-06-13 05:01:05
You can also read the table directly with pandas and then refer to the columns by name:
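from io import StringIO

import pandas as pd
import requests

r = requests.get("https://www.gurufocus.com/insider/summary")
# read_html returns a list of DataFrames; take the first table with class "data-table"
# (StringIO avoids the deprecation warning newer pandas versions raise for literal HTML strings)
data = pd.read_html(StringIO(r.text), attrs={'class': 'data-table'})[0]
# give the two price columns distinct names
data.columns = [
    'Ticker', 'Links', 'Company', 'Price1', 'Insider Name', 'Insider Position',
    'Date', 'Buy/Sell', 'Insider Trading Shares', 'Shares Change', 'Price2',
    'Cost(000)', 'Final Share', 'Price Change Since Insider Trade (%)',
    'Dividend Yield %', 'PE Ratio', 'Market Cap ($M)', 'None'
]
print(data[["Price1", "Price2"]])
Output: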
Price1 Price2
0 $2.05 $2.12
1 $15.42 $14.79
2 $0.02 $0.02
3 $0.64 $0.63
4 $73.13 $76.89
5 $298.75 $308.05
6 $512.74 $517.77
7 $341.27 $357
8 $300.99 $311.13
9 $38.34 $39.02
10 $20.79 $21.72
11 $5.65 $5.37
12 $14.30 $14.43
13 $37.93 $36.24
14 $174.90 $177.79
15 $79.58 $83.49
16 $79.58 $83.49
17 $63.91 $66.56
18 $25.31 $25.90
19 $93.04 $95.37
20 $67.73 $72.59
21 $67.73 $71.59
22 $67.71 $71.55
23 $11.31 $10.93
24 $58.67 $60.62
25 $22.64 $25.21
26 $3.98 $4.01
27 $6.47 $6.25
28 $9.08 $8.84
29 $23.69 $23.79
30 $174.23 $178.10
31 $100.07 $99.75
32 $11.89 $12.01
33 $0.83 $0.83
34 $41.15 $25
35 $41.15 $25
36 $41.15 $25
37 $7.23 $4.73
38 $23.04 $21.27
39 $37.97 $35.57
https://stackoverflow.com/questions/62351936
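Both approaches above return the prices as strings such as `$2.05`. If numeric arrays are needed, a minimal sketch along the following lines may help. It starts from a flat list like the asker's `price`, splits it by position, and converts each entry. The sample values are copied from the first three rows of the output above. The helper `parse_price` is a hypothetical name, and stripping `,` as a thousands separator is an assumption about how the site formats large values. The split itself only holds if every row contributes exactly one cell to each price column.

```python
from decimal import Decimal

# Sample data: cell texts in document order, as a flat list like the
# asker's `price` would hold them (first three rows of the output above).
price = ['$2.05', '$2.12', '$15.42', '$14.79', '$0.02', '$0.02']


def parse_price(text):
    """Convert a string such as '$1,234.50' to a Decimal (hypothetical helper)."""
    # Assumption: '$' prefix and ',' as thousands separator.
    return Decimal(text.strip().lstrip('$').replace(',', ''))


# Even positions belong to the first price column, odd positions to the second.
price_column_1 = [parse_price(p) for p in price[::2]]
price_column_2 = [parse_price(p) for p in price[1::2]]

# Show both prices and the difference between them for each row.
for p1, p2 in zip(price_column_1, price_column_2):
    print(p1, p2, p2 - p1)
```

Decimal is used here instead of float so that currency values are not affected by binary rounding; `float` would work equally well if approximate values are acceptable.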