Building boto3 XML for an MTurk HTMLQuestion

Stack Overflow user
Asked 2017-10-21 00:54:34
Answers: 2  Views: 997  Followers: 0  Votes: 4

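I'm trying to build XML using the HTMLQuestion data structure and boto3's create_hit function, for submission to Amazon's Mechanical Turk service. The documentation specifies the format the XML should follow.

I've created a class, TurkTaskAssembler, with methods that generate the XML and pass it to the MTurk platform through the API. I use the boto3 library to handle communication with Amazon.

The XML I generate appears to be malformed, because when I try to pass it through the API I get a validation error: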

>>> tta = TurkTaskAssembler("What color is the sky?")
>>> response = tta.create_hit_task()
>>> ParamValidationError: Parameter validation failed: Invalid type for parameter Question, value: <Element HTMLQuestion at 0x1135f68c0>, type: <type 'lxml.etree._Element'>, valid types: <type 'basestring'>

I then modified the create_question_xml method to convert the XML envelope to a string with the tostring method, but that produces a different error:

>>> tta = TurkTaskAssembler("What color is the sky?")
>>> tta.create_hit_task()
>>> ClientError: An error occurred (ParameterValidationError) when calling the CreateHIT operation: There was an error parsing the XML question or answer data in your request.  Please make sure the data is well-formed and validates against the appropriate schema. Details: cvc-elt.1.a: Cannot find the declaration of element 'HTMLQuestion'. (1508611228659 s)

I really don't know what I'm doing wrong, and I have very little experience with XML.

Here is all the relevant code:

import os
import boto3
from lxml.etree import Element, SubElement, CDATA, tostring
from .settings import mturk_access_key_id, mturk_access_secret_key

xml_schema_url = 'http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd'


class TurkTaskAssembler(object):

    def __init__(self, question):
        self.client = boto3.client(
            service_name='mturk',
            region_name='us-east-1',
            endpoint_url='https://mturk-requester-sandbox.us-east-1.amazonaws.com',
            aws_access_key_id=mturk_access_key_id,
            aws_secret_access_key=mturk_access_secret_key
        )
        self.question = question

    def create_question_xml(self):
        # questionFile = open(os.path.join(__location__, "question.xml"), "r")
        # question = questionFile.read()
        # return question
        XHTML_NAMESPACE = xml_schema_url
        XHTML = "{%s}" % XHTML_NAMESPACE
        NSMAP = {
            None : XHTML_NAMESPACE,
            'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
            }
        envelope = Element("HTMLQuestion", nsmap=NSMAP)

        html =  """
            <!DOCTYPE html>
            <html>
             <head>
              <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
              <script type='text/javascript' src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js'></script>
             </head>
             <body>
              <form name='mturk_form' method='post' id='mturk_form' action='https://www.mturk.com/mturk/externalSubmit'>
              <input type='hidden' value='' name='assignmentId' id='assignmentId'/>
              <h1>Answer this question</h1>
              <p>{question}</p>
              <p><textarea name='comment' cols='80' rows='3'></textarea></p>
              <p><input type='submit' id='submitButton' value='Submit' /></p></form>
              <script language='Javascript'>turkSetAssignmentID();</script>
             </body>
            </html>
            """.format(question=self.question)

        html_content = SubElement(envelope, 'HTMLContent')
        html_content.text = CDATA(html)
        xml_meta = """<?xml version="1.1" encoding="utf-8"?>"""
        return xml_meta + tostring(envelope, encoding='utf-8')

    def create_hit_task(self):
        response = self.client.create_hit(
            MaxAssignments=1,
            AutoApprovalDelayInSeconds=10800,
            LifetimeInSeconds=10800,
            AssignmentDurationInSeconds=300,
            Reward='0.05',
            Title='a title',
            Keywords='some keywords',
            Description='a description',
            Question=self.create_question_xml(),
        )
        return response

2 Answers

Stack Overflow user

Accepted answer

Posted 2017-10-21 20:33:56

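Why not simply put the XML in a separate file (as you did at first, but then commented out)? That saves you from pulling in several modules and writing a lot of code.

Using the template from the documentation you linked, create question.xml: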

<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
<!DOCTYPE html>
<html>
 <head>
  <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
  <script type='text/javascript' src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js'></script>
 </head>
 <body>
  <form name='mturk_form' method='post' id='mturk_form' action='https://www.mturk.com/mturk/externalSubmit'>
  <input type='hidden' value='' name='assignmentId' id='assignmentId'/>
  <h1>Answer this question</h1>
  <p>{question}</p>
  <p><textarea name='comment' cols='80' rows='3'></textarea></p>
  <p><input type='submit' id='submitButton' value='Submit' /></p></form>
  <script language='Javascript'>turkSetAssignmentID();</script>
 </body>
</html>
]]>
  </HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>

Then, in your create_question_xml() function:

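def create_question_xml(self):
    # Read the template, closing the file handle when done.
    with open("question.xml", "r") as question_file:
        template = question_file.read()
    # Fill in the {question} placeholder from the template.
    return template.format(question=self.question)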

That should be all you need.
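
One caveat with `str.format`: it treats every `{` and `}` in the template as placeholder syntax, so this approach breaks as soon as the HTML contains inline JavaScript or CSS. A minimal sketch of one way around that, assuming the placeholder in question.xml is rewritten as `$question`, uses `string.Template` and escapes the question text with `html.escape`. The name `render_question` and the shortened inline template are illustrative only and are not part of the original answer.

```python
import html
from string import Template

# Hypothetical stand-in for the contents of question.xml, with the
# placeholder written as $question instead of {question}.
QUESTION_TEMPLATE = """<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
<!DOCTYPE html>
<html>
 <body>
  <style>p { margin: 0; }</style>
  <p>$question</p>
 </body>
</html>
]]>
  </HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>
"""


def render_question(template_text, question):
    """Substitute the question text into the template.

    Braces in CSS/JS are left alone because Template only reacts to `$`.
    """
    # Escape the text so characters such as `<` or `&` display literally.
    safe_question = html.escape(question)
    return Template(template_text).substitute(question=safe_question)


xml = render_question(QUESTION_TEMPLATE, "Is 1 < 2?")
```

With this variant, a literal `$` anywhere else in the template has to be written as `$$`.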

Votes: 5

Stack Overflow user

Posted 2017-10-21 21:13:13

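I think you've mixed up the three formats Amazon lets you use. As far as I can tell, you went with HTMLQuestion. (The other two are ExternalQuestion and QuestionForm.)

To keep your question in the HTMLQuestion format, just use the simple example from the documentation; there's no need to wrap it in additional XML. Here is a fixed function: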

def create_question_html(self):
    # you can extract template into a file,
    # as @Mangohero1 suggested which would simplify code a bit.

    return """
    <HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
      <HTMLContent><![CDATA[
    <!DOCTYPE html>
    <html>
     <head>
      <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
      <script type='text/javascript' src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js'></script>
     </head>
     <body>
      <form name='mturk_form' method='post' id='mturk_form' action='https://www.mturk.com/mturk/externalSubmit'>
      <input type='hidden' value='' name='assignmentId' id='assignmentId'/>
      <h1>{question}</h1>
      <p><textarea name='comment' cols='80' rows='3'></textarea></p>
      <p><input type='submit' id='submitButton' value='Submit' /></p></form>
      <script language='Javascript'>turkSetAssignmentID();</script>
     </body>
    </html>
    ]]>
      </HTMLContent>
      <FrameHeight>450</FrameHeight>
    </HTMLQuestion>
    """.format(question=self.question)

Votes: 0
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/46859079
