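I am trying to send a GET request to the DSpace 5.5 API to check whether an item with a given handle exists in DSpace.
When I tested it in a browser, it worked fine (return code 200, and I got the data for the item I searched for).
Then I tested sending the request with the Python 3 requests module in the Python console. Again, the DSpace API returned the correct response code (200) and JSON data in the response.
So I put the tested function into my script, and suddenly the DSpace API started returning error code 500. In the DSpace log I found the following error messages: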
org.dspace.rest.RestIndex @ REST Login Success for user: jakub.rihak@ruk.cuni.cz
2017-01-03 15:38:34,326 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
2017-01-03 15:38:34,474 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
2017-01-03 15:38:34,598 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
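According to the DSpace documentation, the request should look like this:
GET /handle/{handle-prefix}/{handle-suffix}
It goes to the handle API endpoint on our DSpace server, so the full request should be sent to https://dspace.cuni.cz/rest/handle/123456789/937 (you can test it yourself).
In the browser, I get the following response: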
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<item>
<expand>metadata</expand>
<expand>parentCollection</expand>
<expand>parentCollectionList</expand>
<expand>parentCommunityList</expand>
<expand>bitstreams</expand>
<expand>all</expand>
<handle>123456789/937</handle>
<id>1423</id>
<name>Komparace vývoje české a slovenské pravicové politiky od roku 1989 do současnosti</name>
<type>item</type>
<archived>true</archived>
<lastModified>2016-12-20 17:52:30.641</lastModified>
<withdrawn>false</withdrawn>
</item>
When testing in the Python console, my code looked like this:
from urllib.parse import urljoin
import requests

def document_in_dspace(handle):
    url = 'https://dspace.cuni.cz/rest/handle/'
    r_url = urljoin(url, handle)
    print(r_url)
    r = requests.get(r_url)
    if r.status_code == requests.codes.ok:
        print(r.text)
        print(r.reason)
        return True
    else:
        print(r.reason)
        print(r.text)
        return False

After calling this function in the Python console with document_in_dspace('123456789/937'), the response was:
https://dspace.cuni.cz/rest/handle/123456789/937
{"id":1423,"name":"Komparace vývoje české a slovenské pravicové politiky od roku 1989 do současnosti","handle":"123456789/937","type":"item","link":"/rest/items/1423","expand":["metadata","parentCollection","parentCollectionList","parentCommunityList","bitstreams","all"],"lastModified":"2016-12-20 17:52:30.641","parentCollection":null,"parentCollectionList":null,"parentCommunityList":null,"bitstreams":null,"archived":"true","withdrawn":"false"}
OK
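True
So I decided to use this function in my script (without any modification), but now the DSpace API returns response code 500 when the function is called.
The implementation details are as follows: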
def get_workflow_process(document):
    if document.document_in_dspace(handle=document.handle) is True:
        return 'delete'
    else:
        return None

wf_process = get_workflow_process(document)
log.msg("Document:", document.doc_id, "Workflow process:", wf_process)

The output is:
2017-01-04 11:08:45+0100 [-] DSPACE API response code: 500
2017-01-04 11:08:45+0100 [-] Internal Server Error
2017-01-04 11:08:45+0100 [-]
2017-01-04 11:08:45+0100 [-] False
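2017-01-04 11:08:45+0100 [-] Document: 28243 Workflow process: None
Could you give me some advice? What might be causing this, and how can I fix it? I am surprised that this works in the Python console but not in the actual script, and I cannot figure it out on my own. Thanks!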
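Answered on 2017-01-04 22:23:45
I think I figured it out. The problem was probably caused by trailing newline characters in the handle argument of the document_in_dspace function. The updated function looks like this: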
def document_in_dspace(handle):
    url = 'https://dspace.cuni.cz/rest/handle/'  # TODO: Move to config
    hdl = handle.rstrip()
    prefix, suffix = str(hdl).split(sep='/')
    r_url = url + prefix + '/' + suffix
    log.msg("DSpace API request url is:", r_url)
    r = requests.get(r_url, timeout=1)
    if r.status_code == requests.codes.ok:
        log.msg("DSPACE API response code:", r.status_code)
        log.msg("Document with handle", handle, "found in DSpace!")
        log.msg("Document handle:", handle)
        log.msg("Request:\n", r.request.headers)
        log.msg("\n")
        log.msg(r.reason)
        return True
    else:
        log.msg("DSPACE API response code:", r.status_code)
        log.msg("Document with handle", handle, "not found in DSpace!")
        log.msg("Document handle:", handle)
        log.msg("Request:\n", r.request.headers)
        log.msg("\n")
        log.msg(r.reason)
        return False

Basically, I call .rstrip() on the handle string to remove any trailing whitespace (including the newline), then split the handle into its prefix and suffix parts (just to be sure), and construct the request URL (r_url) by concatenating all the parts.
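To make the cause easier to see, here is a minimal sketch of the cleanup step on its own, without any network call. The helper name build_handle_url and the BASE_URL constant are hypothetical names introduced for illustration; only the URL itself comes from the post. Printing the handle with repr() shows the hidden trailing newline that an ordinary log line does not make obvious.

```python
# Base URL as given in the post; the constant name is an assumption.
BASE_URL = "https://dspace.cuni.cz/rest/handle/"


def build_handle_url(handle, base_url=BASE_URL):
    """Build the REST URL for a handle, ignoring surrounding whitespace.

    Hypothetical helper: it only mirrors the rstrip/split/concatenate
    steps described above and adds a basic format check.
    """
    cleaned = handle.strip()
    prefix, sep, suffix = cleaned.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError("expected 'prefix/suffix', got %r" % handle)
    return base_url + prefix + "/" + suffix


raw_handle = "123456789/937\n"       # handle with a trailing newline
print(repr(raw_handle))              # repr() makes the newline visible
print(build_handle_url(raw_handle))  # URL built from the cleaned handle
```

Logging repr(handle) in the script, instead of the bare string, would probably have exposed the difference between the console test and the script run right away.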
I will tidy up document_in_dspace later, but at least it now works as expected.
The output is as follows:
2017-01-04 15:06:16+0100 [-] Checking if document with handle 123456789/937
is in DSpace...
2017-01-04 15:06:16+0100 [-] DSpace API request url is: https://dspace.cuni.cz/rest/handle/123456789/937
2017-01-04 15:06:16+0100 [-] DSPACE API response code: 200
2017-01-04 15:06:16+0100 [-] Document with handle 123456789/937
found in DSpace!
2017-01-04 15:06:16+0100 [-] Document handle: 123456789/937
2017-01-04 15:06:16+0100 [-] Request:
{'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.11.1', 'Connection': 'keep-alive', 'Accept': '*/*'}
2017-01-04 15:06:16+0100 [-]
2017-01-04 15:06:16+0100 [-] OK
However, it appears that the DSpace API returns response code 500, rather than 404, when no item with the given handle exists in the repository.
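Given that observation, a caller may not want to treat every non-200 response as a plain "not found". The following is only a sketch under that assumption: the function name interpret_handle_status and the three result labels are hypothetical, and the mapping of 500 to "missing or error" reflects what was observed on this one DSpace 5.5 installation, not documented API behavior.

```python
from http import HTTPStatus


def interpret_handle_status(status_code):
    """Map an HTTP status code from the handle endpoint to a coarse result.

    Hypothetical helper. A 500 is grouped with 404 because this server
    appeared to answer 500 for missing items; a real server fault cannot
    be told apart from a missing item by the status code alone.
    """
    if status_code == HTTPStatus.OK:
        return "found"
    if status_code in (HTTPStatus.NOT_FOUND, HTTPStatus.INTERNAL_SERVER_ERROR):
        return "missing_or_error"
    return "unexpected"


for code in (200, 404, 500, 302):
    print(code, interpret_handle_status(code))
```

In document_in_dspace, the value of r.status_code could be passed to such a function, so that the log distinguishes "found" from "missing or server error" instead of reporting every failure as "not found in DSpace".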
https://stackoverflow.com/questions/41461547