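I have started using AWS SageMaker to develop my machine learning models, and I am trying to build a Lambda function that processes the responses from a SageMaker labeling job. I have created my own Lambda function, but when I try to read the contents of the event, the event dict is completely empty, so there is no data for me to read.
I have given the Lambda function's role sufficient permissions, including: AmazonS3FullAccess, AmazonSageMakerFullAccess, and AWSLambdaBasicExecutionRole.
I tried using this code for the post-annotation Lambda (for Python 3.6):
and also the one from this git repository:
but neither of them seems to work.
To create the labeling job I use boto3's SageMaker functions.
Here is the code that creates the labeling job: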
def create_labeling_job(client, bucket_name, labeling_job_name, manifest_uri, output_path):
    print("Creating labeling job with name: %s" % labeling_job_name)
    response = client.create_labeling_job(
        LabelingJobName=labeling_job_name,
        LabelAttributeName='annotations',
        InputConfig={
            'DataSource': {
                'S3DataSource': {
                    'ManifestS3Uri': manifest_uri
                }
            },
            'DataAttributes': {
                'ContentClassifiers': [
                    'FreeOfAdultContent',
                ]
            }
        },
        OutputConfig={
            'S3OutputPath': output_path
        },
        RoleArn='arn:aws:myrolearn',
        LabelCategoryConfigS3Uri='s3://' + bucket_name + '/config.json',
        StoppingConditions={
            'MaxPercentageOfInputDatasetLabeled': 100,
        },
        LabelingJobAlgorithmsConfig={
            'LabelingJobAlgorithmSpecificationArn': 'arn:image-classification'
        },
        HumanTaskConfig={
            'WorkteamArn': 'arn:my-private-workforce-arn',
            'UiConfig': {
                'UiTemplateS3Uri': 's3://' + bucket_name + '/templatefile'
            },
            'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-BoundingBox',
            'TaskTitle': 'Title',
            'TaskDescription': 'Description',
            'NumberOfHumanWorkersPerDataObject': 1,
            'TaskTimeLimitInSeconds': 600,
            'AnnotationConsolidationConfig': {
                'AnnotationConsolidationLambdaArn': 'arn:aws:my-custom-post-annotation-lambda'
            }
        }
    )
    return response

This is what I have for the Lambda function:
import json
from urllib.parse import urlparse

# S3Client and do_consolidation are helpers defined elsewhere (not shown in this post).

def lambda_handler(event, context):  # handler name assumed; the original snippet omits the def line
    print("Received event: " + json.dumps(event, indent=2))
    print("event: %s" % event)
    print("context: %s" % context)
    # The expected event has no "headers" key, so .get() avoids a KeyError here.
    print("event headers: %s" % event.get("headers"))
    parsed_url = urlparse(event['payload']['s3Uri'])
    print("parsed_url: ", parsed_url)
    labeling_job_arn = event["labelingJobArn"]
    label_attribute_name = event["labelAttributeName"]
    label_categories = None
    # The key is "labelCategories"; checking "label_categories" never matched.
    if "labelCategories" in event:
        label_categories = event["labelCategories"]
        # str() is needed because the value is a list (or None), not a string.
        print(" Label Categories are : " + str(label_categories))
    payload = event["payload"]
    role_arn = event["roleArn"]
    output_config = None  # Output s3 location. You can choose to write your annotation to this location
    if "outputConfig" in event:
        output_config = event["outputConfig"]
    # If you specified a KMS key in your labeling job, you can use the key to write
    # consolidated_output to s3 location specified in outputConfig.
    kms_key_id = None
    if "kmsKeyId" in event:
        kms_key_id = event["kmsKeyId"]
    # Create s3 client object
    s3_client = S3Client(role_arn, kms_key_id)
    # Perform consolidation
    return do_consolidation(labeling_job_arn, payload, label_attribute_name, s3_client)

I tried to debug the event object with:
print("Received event: " + json.dumps(event, indent=2))
But it only prints an empty dictionary: Received event: {}
I expect the output to look like this:
# Content of an example event:
{
    "version": "2018-10-16",
    "labelingJobArn": <labelingJobArn>,
    "labelCategories": [<string>],  # If you created labeling job using aws console, labelCategories will be null
    "labelAttributeName": <string>,
    "roleArn": "string",
    "payload": {
        "s3Uri": <string>
    },
    "outputConfig": "s3://<consolidated_output configured for labeling job>"
}
Finally, when I try to get the labeling job ARN with:
labeling_job_arn = event["labelingJobArn"]
I just get a KeyError (which makes sense, since the dictionary is empty).
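Because an empty event only shows up later as a bare KeyError, it can help to fail early with a message that names what is missing. The following is a minimal sketch, assuming the key names listed in the example event above; `REQUIRED_KEYS`, `missing_event_keys`, and `require_event_keys` are hypothetical names introduced for illustration, not part of any AWS API.

```python
# Hypothetical helper: the key names are taken from the example event in the question.
REQUIRED_KEYS = ("labelingJobArn", "labelAttributeName", "roleArn", "payload")


def missing_event_keys(event):
    """Return the required keys that are absent from the event."""
    missing = [key for key in REQUIRED_KEYS if key not in event]
    # The payload must also carry the S3 URI of the annotations file.
    payload = event.get("payload")
    if isinstance(payload, dict) and "s3Uri" not in payload:
        missing.append("payload.s3Uri")
    return missing


def require_event_keys(event):
    """Raise a descriptive error instead of a bare KeyError further down."""
    missing = missing_event_keys(event)
    if missing:
        raise ValueError("Incomplete event, missing keys: %s" % ", ".join(missing))
```

Calling `require_event_keys(event)` as the first line of the handler would turn the empty-dict case into an explicit error in the logs.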
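Posted on 2019-08-02 10:28:09
I found the problem: I needed to add the ARN of the role used by my Lambda function as a trusted entity on the role used for the SageMaker labeling job.
I went to Roles > MySagemakerExecutionRole > Trust Relationships and added: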
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::xxxxxxxxx:role/My-Lambda-Role",
                    ...
                ],
                "Service": [
                    "lambda.amazonaws.com",
                    "sagemaker.amazonaws.com",
                    ...
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
That made it work for me.
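The same trust-relationship change can be scripted rather than made in the console. This is a minimal sketch, assuming boto3 is installed and the caller may use iam:GetRole and iam:UpdateAssumeRolePolicy; `add_trusted_role` and `apply_trust_update` are hypothetical names, and editing only the first statement of the policy is an assumption that matches the single-statement document above.

```python
import copy
import json


def add_trusted_role(trust_policy, role_arn):
    """Return a copy of trust_policy whose first statement also trusts role_arn."""
    policy = copy.deepcopy(trust_policy)
    # Assumption: the first statement is the sts:AssumeRole statement to extend.
    principal = policy["Statement"][0].setdefault("Principal", {})
    aws = principal.get("AWS", [])
    # IAM allows a single principal to be a bare string; normalise to a list.
    if isinstance(aws, str):
        aws = [aws]
    if role_arn not in aws:
        aws = aws + [role_arn]
    principal["AWS"] = aws
    return policy


def apply_trust_update(role_name, lambda_role_arn):
    """Fetch the role's trust policy, add the Lambda role, and write it back."""
    import boto3  # imported lazily so add_trusted_role works without boto3

    iam = boto3.client("iam")
    # boto3 is expected to return the policy document already parsed into a dict.
    current = iam.get_role(RoleName=role_name)["Role"]["AssumeRolePolicyDocument"]
    updated = add_trusted_role(current, lambda_role_arn)
    iam.update_assume_role_policy(
        RoleName=role_name, PolicyDocument=json.dumps(updated)
    )
    return updated
```

For the roles in this answer, the call would look like `apply_trust_update("MySagemakerExecutionRole", "arn:aws:iam::xxxxxxxxx:role/My-Lambda-Role")`, with the account ID filled in.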
Posted on 2019-12-04 10:49:29
I did the same, but in the Labeled objects section I get a failed result, and inside the output object I get the following error from the post-annotation function:
    "annotation-case0-test3-metadata": {
        "retry-count": 1,
        "failure-reason": "ClientError: The JSON output from the AnnotationConsolidationLambda function could not be read. Check the output of the Lambda function and try your request again.",
        "human-annotated": "true"
    }
}
https://stackoverflow.com/questions/57273357
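The "could not be read" error above points at the shape of the value the post-annotation Lambda returns. As far as the Ground Truth documentation describes it, the function should return a JSON-serialisable list with one entry per data object, each carrying `datasetObjectId` and a `consolidatedAnnotation.content` dict keyed by the label attribute name. The sketch below assumes that format and assumes each worker answer arrives as a JSON string under `annotationData.content`; `build_consolidation_response` is a hypothetical helper, and keeping the first worker's answer is a placeholder, not a real consolidation strategy.

```python
import json


def build_consolidation_response(annotations, label_attribute_name):
    """Build the list a post-annotation Lambda returns, one entry per data object."""
    response = []
    for item in annotations:
        worker_answers = item.get("annotations", [])
        if not worker_answers:
            continue  # nothing to consolidate for this object
        # Placeholder strategy: keep the first worker's answer as-is.
        raw = worker_answers[0]["annotationData"]["content"]
        # The content is assumed to arrive as a JSON string; parse it to a dict.
        content = json.loads(raw) if isinstance(raw, str) else raw
        response.append({
            "datasetObjectId": item["datasetObjectId"],
            "consolidatedAnnotation": {
                "content": {label_attribute_name: content}
            },
        })
    return response
```

Whatever consolidation logic is used, the returned value has to survive `json.dumps`, so checking that locally before deploying is a cheap first step.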