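I am using AWS DMS to extract data from Aurora to S3, and I want the data written to S3 with a csvDelimiter of my choosing: ^A (control-A, octal \001). How can I do this? By default, when S3 is the DMS target, it uses "," as the delimiter: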
compressionType=NONE;csvDelimiter=,;csvRowDelimiter=\n;
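But I want to use the following instead:
compressionType=NONE;csvDelimiter='\001';csvRowDelimiter=\n;
However, this writes the delimiter into the output as literal text: I'\001'12345'\001'Abc'
I tried setting the target endpoint in the AWS DMS console with each of the delimiter values below, but none of them worked:
\\001 \u0001 '\u0001' \u01 \001
Actual result: I'\001'12345'\001'Abc'
Expected result: I^A12345^AAbc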
Posted on 2020-01-15 02:02:51
Here is what I did to solve this:
I set the delimiter on my target S3 endpoint using the AWS CLI. https://docs.aws.amazon.com/translate/latest/dg/setup-awscli.html
AWS CLI command:
aws dms modify-endpoint --endpoint-arn arn:aws:dms:us-west-2:000001111222:endpoint:OXXXXXXXXXXXXXXXXXXXX4 --endpoint-identifier dms-ep-tgt-s3-abc --endpoint-type target --engine-name s3 --extra-connection-attributes "bucketFolder=data/folderx;bucketname=bkt-xyz;CsvRowDelimiter=^D;CompressionType=NONE;CsvDelimiter=^A;" --service-access-role-arn arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role --s3-settings ServiceAccessRoleArn=arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role,BucketName=bkt-xyz,CompressionType=NONE
Output:
{
    "Endpoint": {
        "Status": "active",
        "S3Settings": {
            "CompressionType": "NONE",
            "EnableStatistics": true,
            "BucketFolder": "data/folderx",
            "CsvRowDelimiter": "\u0004",
            "CsvDelimiter": "\u0001",
            "ServiceAccessRoleArn": "arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role",
            "BucketName": "bkt-xyz"
        },
        "EndpointType": "TARGET",
        "ServiceAccessRoleArn": "arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role",
        "SslMode": "none",
        "EndpointArn": "arn:aws:dms:us-west-2:000001111222:endpoint:OXXXXXXXXXXXXXXXXXXXX4",
        "ExtraConnectionAttributes": "bucketFolder=data/folderx;bucketname=bkt-xyz;CompressionType=NONE;CsvDelimiter=\u0001;CsvRowDelimiter=\u0004;",
        "EngineDisplayName": "Amazon S3",
        "EngineName": "s3",
        "EndpointIdentifier": "dms-ep-tgt-s3-abc"
    }
}
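The ^A and ^D in the command above presumably stand for the literal control characters, which the \u0001 and \u0004 in the output suggest. If typing them into a terminal is awkward, one possible approach is to build them with printf from their octal codes. This is only a sketch: the variable names CSV_DELIM, ROW_DELIM and ATTRS are made up for illustration, and the aws call is left commented out so nothing is sent to AWS.

```shell
#!/bin/sh
# Build the delimiter characters from their octal codes instead of typing them.
CSV_DELIM=$(printf '\001')   # control-A (SOH), the field delimiter
ROW_DELIM=$(printf '\004')   # control-D (EOT), the row delimiter used in the answer

# Same attribute string as in the answer, with the real control characters embedded.
ATTRS="bucketFolder=data/folderx;bucketname=bkt-xyz;CsvRowDelimiter=${ROW_DELIM};CompressionType=NONE;CsvDelimiter=${CSV_DELIM};"

# Inspect the bytes before calling AWS: the delimiters should appear as 001 and 004.
printf '%s' "$ATTRS" | od -c

# Then pass the string to the CLI exactly as in the answer, for example:
# aws dms modify-endpoint --endpoint-arn <your endpoint ARN> \
#     --extra-connection-attributes "$ATTRS"
```

Quoting "$ATTRS" matters here, so that the shell passes the control characters through unchanged as part of a single argument.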
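Note: after running the AWS CLI command, the DMS console will not display the delimiter on the endpoint (it is invisible because it is a non-printable control character). Once the task has run, however, the delimiter appears in the data in the S3 files.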
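Because the delimiter is invisible, it can help to check a row of the output by mapping control-A to a printable character. The sketch below uses a hypothetical sample row modelled on the expected result in the question; a real check would read a file downloaded from the bucket instead.

```shell
#!/bin/sh
# Hypothetical sample row: I^A12345^AAbc, with real control-A bytes between the fields.
SAMPLE=$(printf 'I\00112345\001Abc')

# Replace control-A with a visible pipe so the field boundaries can be seen.
VISIBLE=$(printf '%s' "$SAMPLE" | tr '\001' '|')
echo "$VISIBLE"   # prints I|12345|Abc

# Extract the second field using control-A as the delimiter.
FIELD2=$(printf '%s' "$SAMPLE" | cut -d "$(printf '\001')" -f 2)
echo "$FIELD2"    # prints 12345
```

If the delimiter had been written as literal text instead, as in the question's actual result, tr would find nothing to replace and the row would print unchanged.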
https://stackoverflow.com/questions/57881152