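This document lists the URLs and IP address ranges for several Microsoft services: https://docs.microsoft.com/en-us/microsoft-365/enterprise/urls-and-ip-address-ranges?view=o365-worldwide#exchange-online
My goal is to write a Python script that checks the last-updated date of this document. If the date changes (meaning some IPs have changed), I need to know immediately. I could not find any API for this, so I wrote the following script: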

    from bs4 import BeautifulSoup
    import requests
    import time
    import re

    url = "https://docs.microsoft.com/en-us/microsoft-365/enterprise/urls-and-ip-address-ranges?view=o365-worldwide#exchange-online"
    # set the headers as a browser
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    while True:
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.text, "html.parser")
        last_update_fra = soup.find(string=re.compile("01/04/2021"))
        time.sleep(60)
        soup = BeautifulSoup(requests.get(url, headers=headers).text, "html.parser")
        if soup.find(string=re.compile("01/04/2021")) == last_update_fra:
            print(last_update_fra)
            continue
        else:
            # send an email for notification
            pass

I am not sure this is the best approach, because whenever the date changes I also have to edit the script to use the new date. Can this be done with BeautifulSoup at all, or is there a better way?
Posted on 2021-02-11 16:57:57
BeautifulSoup is fine here. I don't even see an XHR request that carries the data.
I noticed a few things:
Code:

    import requests
    import time
    from bs4 import BeautifulSoup

    url = 'https://docs.microsoft.com/en-us/microsoft-365/enterprise/urls-and-ip-address-ranges?view=o365-worldwide#exchange-online'
    # set the headers as a browser
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    # last date seen on the page; empty until the first successful check
    last_update_fra = ''

    while True:
        time.sleep(60)
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        # the date is read from the first <time> element on the page
        found_date = soup.find('time').text
        if found_date == last_update_fra:
            print(last_update_fra)
            continue
        else:
            # store new date (this branch also runs on the first pass,
            # because last_update_fra starts out empty)
            last_update_fra = found_date
            # send an email for notification
            pass

https://stackoverflow.com/questions/66151005
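
One thing the loop in the answer leaves open is that `last_update_fra` lives only in memory, so the `else` branch runs on the very first pass and again after every restart. Below is a minimal sketch of one way around that, assuming it is acceptable to keep the last seen date in a small JSON file next to the script. The file name `last_update.json` and the helpers `load_last_date`, `save_last_date` and `detect_change` are hypothetical names chosen for illustration; they are not part of the original answer.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical location for the stored date; adjust as needed.
STATE_FILE = Path("last_update.json")


def load_last_date(path: Path) -> Optional[str]:
    """Return the previously stored date text, or None if nothing is stored yet."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("last_update")


def save_last_date(path: Path, date_text: str) -> None:
    """Overwrite the state file with the given date text."""
    path.write_text(json.dumps({"last_update": date_text}), encoding="utf-8")


def detect_change(found_date: str, path: Path = STATE_FILE) -> bool:
    """Store found_date and report whether it differs from an earlier stored date.

    The first run (no state file yet) records the date and returns False,
    so no notification is triggered just because the script started.
    """
    previous = load_last_date(path)
    if previous == found_date:
        return False
    save_last_date(path, found_date)
    return previous is not None
```

Inside the polling loop, `if detect_change(found_date):` could stand in for the `if`/`else` comparison, with the notification email sent only when it returns `True`.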