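I am trying to extract a specific JSON object from a log file that contains multiple JSON objects mixed with plain text. In this case, I want the JSON that follows the text "Output payload". I have tried several approaches but could not extract the required JSON. The file format is: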
[2020-05-17 15:32:11.698000] INFO [worker-1] org.mule.api.processor.LoggerMessageProcessor [[cloudhub-us-claim-services-1-0-0-prod].post:/claims/{claimNumber}/predictionScores:experience-claims-predictionscore-api.config.7.771]: PredictionScoreAPILogger-7c506940-987d-11ea-9ef4-0a5226a8e24f:16634746: Initialization: Request successfully logged to mirror queue
[2020-05-17 15:32:12.190000] INFO [worker-1] org.mule.transformer.simple.MessagePropertiesTransformer [[cloudhub-us-claim-services-1-0-0-prod].experience-claims-predictionscore-api.prediction-details-claim-updates.stage1.839]: Property with key 'response', not found on message using 'null'. Since the value was marked optional, nothing was set on the message for this property
[2020-05-17 15:32:12.192000] DEBUG [worker-1] aiml.logging.debug [[cloudhub-us-claim-services-1-0-0-prod].experience-claims-predictionscore-api.prediction-details-claim-updates.stage1.839]: PredictionScoreAPILogger-7c506940-987d-11ea-9ef4-0a5226a8e24f:16634746:Datarobot API Call: Output payload received from Datarobot API: {
"prediction": "N",
"predictionScore": 0.0000629713,
"predictionExplanations": "lineItem : 0|feature: ADJER_CANNOT_COMPUTE_TWG_SUGGESTED_TIME_ZERO|Value: Y|strength: -1.4469371757,\nlineItem : 1|feature: ADJER_CANNOT_COMPUTE_TWG_SUGGESTED_PRICE|Value: Y|strength: -1.1968554807,\nlineItem : 2|feature: MONTHS_DIFF_CLAIM_REPAIR_FACILITY_FIRST_CLAIM|Value: 61|strength: -1.0681064444"
}
Posted on 2020-05-20 13:05:32
You can read the file as text and then parse it with a regex, like this:
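import re

# logfilepath is the path to your log file
with open(logfilepath, 'r') as logfile:
    log = logfile.read()

# Group 1: the "Output payload ...:" prefix; group 2: the JSON object text
objects = re.findall(r"(Output payload.*:\s?)(\{\s?[\s\S]+?\s?\})", log)
I tested the regex against your sample and it works, so this code should work as well. Once you have all the JSON objects, you can easily find the one you are looking for.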
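As a minimal sketch of that last step, the second capture group of each match can be passed to json.loads to get a dict. The sample text below is an abbreviated stand-in for the log in the question, and the names sample_log and payloads are only illustrative.

```python
import json
import re

# Abbreviated stand-in for the log in the question (illustrative only).
sample_log = """[2020-05-17 15:32:11.698000] INFO [worker-1] Initialization: Request successfully logged to mirror queue
[2020-05-17 15:32:12.192000] DEBUG [worker-1] aiml.logging.debug: Datarobot API Call: Output payload received from Datarobot API: {
"prediction": "N",
"predictionScore": 0.0000629713
}
"""

pattern = r"(Output payload.*:\s?)(\{\s?[\s\S]+?\s?\})"

# re.findall returns one (prefix, json_text) tuple per match.
payloads = [
    json.loads(json_text)
    for _prefix, json_text in re.findall(pattern, sample_log)
]

for payload in payloads:
    print(payload["prediction"], payload["predictionScore"])
```

Note that the lazy match stops at the first closing brace, so a payload with nested objects, or with a "}" inside a string value, would be cut short and json.loads would raise an error.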
Happy hacking :)
Edit: updated the regex to match the revised question. It now looks for the "Output payload" string.
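If the payloads can contain nested objects, a lazy brace match would stop too early. One possible alternative, offered here as an assumption rather than as part of the tested answer, is to use the regex only to find the opening brace after the marker and let json.JSONDecoder.raw_decode work out where the object ends. The function name extract_payloads and the sample text are hypothetical.

```python
import json
import re


def extract_payloads(log_text, marker="Output payload"):
    """Return every JSON object that follows `marker` on a log line."""
    decoder = json.JSONDecoder()
    results = []
    # Find the first "{" after the marker without leaving the current line.
    for match in re.finditer(re.escape(marker) + r"[^\n{]*\{", log_text):
        start = match.end() - 1  # index of the opening brace
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text comes after it.
            obj, _end = decoder.raw_decode(log_text, start)
        except json.JSONDecodeError:
            continue  # text after the marker is not valid JSON; skip it
        results.append(obj)
    return results


# Hypothetical sample with a nested object and a "}" inside a string.
sample_log = """[ts] DEBUG: Output payload received from API: {
"prediction": "N",
"details": {"score": 0.5, "note": "contains } brace"}
}
[ts] INFO: unrelated line {"other": 1}
"""

print(extract_payloads(sample_log))
```

Because the JSON parser itself decides where the object ends, braces inside strings and nested objects are handled without making the regex more complicated.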
https://stackoverflow.com/questions/61905358