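I have built two pipelines that implement the same functionality using different transformations.
Is there any benchmark for comparing the two pipelines in terms of efficiency and/or resource utilization?
Details: Pipeline 1 uses only 2 mapping data flows, one with 4 transformations and the other with 20 transformations. Pipeline 2 also uses 2 mapping data flows, one with 4 transformations and a second DF with 15 transformations plus a Databricks notebook.
I want to compare the two pipelines in terms of (1) efficiency, (2) resource utilization, and (3) cost.
Any suggestions?
Thanks.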
Posted on 2020-09-01 16:46:26
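I think you can compare the outputs of the pipeline runs; the output contains the values you are looking for.
Here is an example of the output from a pipeline execution: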
{
  "dataRead": 8192,
  "dataWritten": 612,
  "filesRead": 1,
  "sourcePeakConnections": 1,
  "sinkPeakConnections": 2,
  "rowsRead": 1,
  "rowsCopied": 1,
  "copyDuration": 12,
  "throughput": 0.667,
  "errors": [],
  "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US)",
  "usedDataIntegrationUnits": 4,
  "billingReference": {
    "activityType": "DataMovement",
    "billableDuration": [
      {
        "meterType": "AzureIR",
        "duration": 0.06666666666666667,
        "unit": "DIUHours"
      }
    ]
  },
  "usedParallelCopies": 1,
  "executionDetails": [
    {
      "source": {
        "type": "AzureBlobStorage",
        "region": "Central US"
      },
      "sink": {
        "type": "AzureSqlDatabase",
        "region": "East US"
      },
      "status": "Succeeded",
      "start": "2020-09-01T08:20:09.1734161Z",
      "duration": 12,
      "usedDataIntegrationUnits": 4,
      "usedParallelCopies": 1,
      "profile": {
        "queue": {
          "status": "Completed",
          "duration": 9
        },
        "transfer": {
          "status": "Completed",
          "duration": 3,
          "details": {
            "listingSource": {
              "type": "AzureBlobStorage",
              "workingDuration": 0
            },
            "readingFromSource": {
              "type": "AzureBlobStorage",
              "workingDuration": 0
            },
            "writingToSink": {
              "type": "AzureSqlDatabase",
              "workingDuration": 0
            }
          }
        }
      },
      "detailedDurations": {
        "queuingDuration": 9,
        "transferDuration": 3
      }
    }
  ],
  "dataConsistencyVerification": {
    "VerificationResult": "NotVerified"
  },
  "durationInQueue": {
    "integrationRuntimeQueue": 0
  }
}
The same details are also shown for each run in the portal.
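To compare two pipelines on these numbers, one option is to collect the output of every activity in a run and total the fields that matter. The following is only a rough sketch in Python: the helper names `billable_totals` and `summarize` are made up for illustration, and the field names are taken from the Copy activity sample above. Data flow and notebook activities report different fields, so the keys would likely need adjusting for them.

```python
from collections import defaultdict


def billable_totals(activity_outputs):
    """Sum billable duration per (meterType, unit) across activity outputs."""
    totals = defaultdict(float)
    for output in activity_outputs:
        billing = output.get("billingReference", {})
        for item in billing.get("billableDuration", []):
            totals[(item["meterType"], item["unit"])] += item["duration"]
    return dict(totals)


def summarize(activity_outputs):
    """Collect run time, queue time, and billable totals for one pipeline run."""
    return {
        # "copyDuration" is specific to the Copy activity sample above.
        "duration_s": sum(o.get("copyDuration", 0) for o in activity_outputs),
        "queue_s": sum(
            d.get("detailedDurations", {}).get("queuingDuration", 0)
            for o in activity_outputs
            for d in o.get("executionDetails", [])
        ),
        "billable": billable_totals(activity_outputs),
    }


# Trimmed copy of the sample output shown above.
sample = {
    "copyDuration": 12,
    "throughput": 0.667,
    "billingReference": {
        "activityType": "DataMovement",
        "billableDuration": [
            {"meterType": "AzureIR", "duration": 0.06666666666666667, "unit": "DIUHours"}
        ],
    },
    "executionDetails": [
        {"detailedDurations": {"queuingDuration": 9, "transferDuration": 3}}
    ],
}

pipeline_1 = summarize([sample])
print(pipeline_1)
```

Running `summarize` on the activity outputs of each pipeline gives two small dictionaries that can be compared side by side: total duration for efficiency, and billable duration per meter as a proxy for resource use and cost.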


https://stackoverflow.com/questions/63670577