I am trying to automatically obtain the corresponding handle IDs from ARIN using a company name, for example "Google".
https://search.arin.net/rdap/?query=google*
My naive approach is to use requests and BeautifulSoup:
import requests
from bs4 import BeautifulSoup
html = 'https://search.arin.net/rdap/?query='
comp = 'google*'
r = requests.get(html + comp)
soup = BeautifulSoup(r.text, 'html.parser')
#example search
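search = soup.body.find_all(text = "Handle$")

However, when I use requests, I do not get the same output as when I simply open the page in Google Chrome. The HTML returned by requests is different, and I cannot access the corresponding handles.
Does anyone know how to modify the code?
Posted on 2021-08-11 14:47:21
The data you see on the page is loaded from an external API URL. You can use the requests module to emulate that request: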
import json
import requests
api_url = "https://rdap.arin.net/registry/entities"
params = {"fn": "google*"}
data = requests.get(api_url, params=params).json()
# pretty print the data:
print(json.dumps(data, indent=4))

Prints:
...
{
    "handle": "GF-231",
    "vcardArray": [
        "vcard",
        [
            [
                "version",
                {},
                "text",
                "4.0"
            ],
            [
                "fn",
                {},
                "text",
                "GOOGLE FIBER INC"
            ],
            [
                "adr",
                {
                    "label": "3425 MALONE DR\nCHAMBLEE\nGA\n30341\nUnited States"
                },
                "text",
                [
                    "",
                    "",
                    "",
                    "",
                    "",
                    "",
                    ""
                ]
            ],
            [
                "kind",
                {},
                "text",
                "org"
            ]
        ]
    ],
...
https://stackoverflow.com/questions/68744125
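
Since the original goal was the handles themselves rather than the full JSON, here is a minimal sketch of pulling them out of the API response. It assumes the entities are listed under the "entitySearchResults" key, as RDAP entity search responses normally are, and that each entity carries a "handle" and a "vcardArray" shaped like the output above. The helper names full_name, extract_handles and search_handles are made up for illustration.

```python
API_URL = "https://rdap.arin.net/registry/entities"  # URL taken from the answer


def full_name(entity):
    """Return the "fn" value from an entity's vCard, or None if absent."""
    vcard = entity.get("vcardArray", [])
    # vcardArray looks like ["vcard", [[name, params, type, value], ...]]
    properties = vcard[1] if len(vcard) > 1 else []
    for prop in properties:
        if len(prop) >= 4 and prop[0] == "fn":
            return prop[3]
    return None


def extract_handles(data):
    """Map each handle in an entity search response to its full name."""
    # Assumption: results are listed under "entitySearchResults".
    return {
        entity["handle"]: full_name(entity)
        for entity in data.get("entitySearchResults", [])
        if "handle" in entity
    }


def search_handles(pattern, timeout=10):
    """Query the API with the same "fn" parameter as the answer."""
    # Imported here so the parsing helpers above work without requests.
    import requests

    response = requests.get(API_URL, params={"fn": pattern}, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return extract_handles(response.json())
```

Calling search_handles("google*") should then give a dict keyed by handle, with an entry such as "GF-231" mapped to "GOOGLE FIBER INC" if the API still returns the record shown above.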