import requests
MSA_request=""">G1
MGCTLSAEDKAAVERSKMIDRNLREDGEKAAREVKLLLL
>G2
MGCTVSAEDKAAAERSKMIDKNLREDGEKAAREVKLLLL
>G3
MGCTLSAEERAALERSKAIEKNLKEDGISAAKDVKLLLL"""
q={"stype":"protein","sequence":MSA_request,"outfmt":"clustal"}
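r=requests.post("http://www.ebi.ac.uk/Tools/msa/clustalo/",data=q)
This is my script. I send this request to the website, but the result looks as if I had done nothing: the web service does not seem to have received my request. This approach has worked for other websites before. Could this page be showing a pop-up asking about the cookie agreement?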
Posted on 2017-03-08 04:51:59
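The form on the page you refer to has a separate URL, namely
http://www.ebi.ac.uk/Tools/services/web_clustalo/toolform.ebi
You can verify this with the DOM inspector in your browser. So, to keep using requests, you need to post to the correct page:
r=requests.post("http://www.ebi.ac.uk/Tools/services/web_clustalo/toolform.ebi",data=q)
This submits a job with your input data; it does not return the result directly. To check the result, extract the job ID from that response and then issue another request (without data) to
http://www.ebi.ac.uk/Tools/services/web_clustalo/toolresult.ebi?jobId=...
However, you should definitely check whether this kind of programmatic access is compatible with the site's TOS…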
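The job-ID extraction step can be sketched on its own. This assumes the title of the submission response has the form `<status>: <job id>`, which is what the example script below relies on. The helper name `parse_job_title` and the sample job ID are made up for illustration.

```python
def parse_job_title(title):
    """Split a page title of the assumed form '<status>: <job id>'."""
    status, sep, job_id = title.partition(":")
    if not sep:
        # No colon at all: the page is not in the format we assumed.
        raise ValueError("unexpected title format: %r" % title)
    return status.strip(), job_id.strip()


# Hypothetical title; real job IDs will look different.
status, job_id = parse_job_title("Job running: EXAMPLE-JOB-ID")
print(status, job_id)
```

Raising an error on an unexpected title makes a changed page layout fail loudly instead of producing a bogus job ID.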
Here is an example:
from lxml import html
import requests
import sys
import time
MSA_request=""">G1
MGCTLSAEDKAAVERSKMIDRNLREDGEKAAREVKLLLL
>G2
MGCTVSAEDKAAAERSKMIDKNLREDGEKAAREVKLLLL
>G3
MGCTLSAEERAALERSKAIEKNLKEDGISAAKDVKLLLL"""
q={"stype":"protein","sequence":MSA_request,"outfmt":"clustal"}
r = requests.post("http://www.ebi.ac.uk/Tools/services/web_clustalo/toolform.ebi",data = q)
tree = html.fromstring(r.text)
title = tree.xpath('//title/text()')[0]
#check the status and get the job id
status, job_id = map(lambda s: s.strip(), title.split(':', 1))
if status != "Job running":
    sys.exit(1)
#it might take some time for the job to finish
time.sleep(10)
#download the results
r = requests.get("http://www.ebi.ac.uk/Tools/services/web_clustalo/toolresult.ebi?jobId=%s" % (job_id))
#prints the full response
#print(r.text)
#isolate the alignment block
tree = html.fromstring(r.text)
alignment = tree.xpath('//pre[@id="alignmentContent"]/text()')[0]
print(alignment)
https://stackoverflow.com/questions/42657842
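A fixed `time.sleep(10)` may be too short for larger jobs. One possible alternative is to poll until the result is available. The sketch below is not part of the original answer: `wait_for`, `fetch`, and the retry parameters are hypothetical names, and how to recognise a finished job on the real result page is an assumption left to the caller.

```python
import time


def wait_for(fetch, attempts=10, delay=5.0, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Pause between attempts, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay)
    raise TimeoutError("no result after %d attempts" % attempts)
```

With the script above, `fetch` could request the `toolresult.ebi` URL and return the text of the `alignmentContent` block, or `None` when the XPath query finds nothing yet. Keeping the delay at several seconds avoids hammering the service.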