I have a file that looks like this:
sort=SIZE:NumberDecreasing
FieldText=(((EQUAL{226742}:LocationId)) AND ())
FieldText=(((EQUAL{226742}:LocationId)) AND ((EQUAL{1}:LOD AND NOTEQUAL{1}:SCR AND EMPTY{}:RPDCITYID AND NOTEQUAL{1}:Industrial)))
FieldText=( NOT EQUAL{1}:ISSCHEME AND EQUAL{215629}:LocationId)
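sort=DEALDATE:decreasing
From this, I want the word before each colon (if a {} group sits directly before the colon, then the word before that group), followed by the word after the colon. Ideally these would be the only things left in the file, each on its own line.
The output would then look like: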
SIZE:NumberDecreasing
EQUAL:LocationId
EQUAL:LocationId
EQUAL:LOD
NOTEQUAL:SCR
EMPTY:RPDCITYID
NOTEQUAL:Industrial
EQUAL:ISSCHEME
EQUAL:LocationId
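DEALDATE:decreasing
My attempt so far is a find-and-replace: search for ^.?+ {0-9}:(a+) and replace it with ...\1:\2.
I intend to run it several times and then replace . with \n, after which I can remove the repeated line breaks.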
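Context: this is for a log analysis I am doing. I have already stripped the datestamps and reduced the queries to their sort and FieldText parameters.
I don't have the usual UNIX tools; I am working in a Windows environment.
The original log looks like this: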
03/11/2011 16:25:44 [9] ACTION=Query&summary=Context&print=none&printFields=DISPLAYNAME%2CRECORDTYPE%2CSTREET%2CTOWN%2CCOUNTY%2CPOSTCODE%2CLATITUDE%2CLONGITUDE&DatabaseMatch=Autocomplete&sort=RECORDTYPE%3Areversealphabetical%2BDRETITLE%3Aincreasing&maxresults=200&FieldText=%28WILD%7Bbournemou%2A%7D%3ADisplayName%20NOT%20MATCH%7BScheme%7D%3ARecordType%29 (10.55.81.151)
03/11/2011 16:25:45 [9] Returning 23 matches
03/11/2011 16:25:45 [9] Query complete
03/11/2011 16:25:46 [8] ACTION=GetQueryTagValues&documentCount=True&databaseMatch=Deal&minScore=70&weighfieldtext=false&FieldName=TotalSizeSizeInSquareMetres%2CAnnualRental%2CDealType%2CYield&start=1&FieldText=%28MATCH%7BBournemouth%7D%3ATown%29 (10.55.81.151)
03/11/2011 16:25:46 [12] ACTION=Query&databaseMatch=Deal&maxResults=50&minScore=70&sort=DEALDATE%3Adecreasing&weighfieldtext=false&totalResults=true&PrintFields=LocationId%2CLatitude%2CLongitude%2CDealId%2CFloorOrUnitNumber%2CAddressAlias%2A%2CEGAddressAliasID%2COriginalBuildingName%2CSubBuilding%2CBuildingName%2CBuildingNumber%2CDependentStreet%2CStreet%2CDependentLocality%2CLocality%2CTown%2CCounty%2CPostcode%2CSchemeName%2CBuildingId%2CFullAddress%2CDealType%2CDealDate%2CSalesPrice%2CYield%2CRent%2CTotalSizeSizeInSquareMetres%2CMappingPropertyUsetype&start=1&FieldText=%28MATCH%7BBournemouth%7D%3ATown%29 (10.55.81.151)
03/11/2011 16:25:46 [8] GetQueryTagValues complete
03/11/2011 16:25:47 [12] Returning 50 matches
03/11/2011 16:25:47 [12] Query complete
03/11/2011 16:25:51 [13] ACTION=Query&print=all&databaseMatch=locationidsearch&sort=RELEVANCE%2BPOSTCODE%3Aincreasing&maxResults=10&start=1&totalResults=true&minscore=70&weighfieldtext=false&FieldText=%28%20NOT%20LESS%7B50%7D%3AOFFICE%5FPERCENT%20AND%20EXISTS%7B%7D%3AOFFICE%5FPERCENT%20NOT%20EQUAL%7B1%7D%3AISSCHEME%29&Text=%28Brazennose%3AFullAddress%2BAND%2BHouse%3AFullAddress%29&synonym=True (10.55.81.151)
03/11/2011 16:25:51 [13] Returning 3 matches
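03/11/2011 16:25:51 [13] Query complete
The point of the whole exercise is to find out which fields are being queried and sorted on (and how we query or sort on them). The output could take a different form for that purpose; the exact format is not important.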
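Since the raw log lines are URL-encoded, the reduction to sort and FieldText parameters described above can itself be scripted. The following is only a minimal sketch in Perl: the helper names url_decode and extract_params are made up for illustration, and it assumes every query sits in a single whitespace-free ACTION=... token, as in the log excerpt above.

```perl
use strict;
use warnings;

# Percent-decode a URL-encoded string (e.g. %3A -> ":", %7B -> "{").
# Only %XX escapes are handled; nothing else is rewritten.
sub url_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Return the decoded sort= and FieldText= parameters of one raw log line
# as "name=value" strings. Lines without an ACTION= token yield nothing.
sub extract_params {
    my ($line) = @_;
    return () unless $line =~ /ACTION=(\S+)/;
    my @kept;
    for my $pair (split /&/, $1) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $value;    # e.g. the bare action name "Query"
        push @kept, "$name=" . url_decode($value)
            if $name =~ /^(?:sort|FieldText)$/i;
    }
    return @kept;
}

# Demo on one line taken from the log excerpt above.
my $sample = '03/11/2011 16:25:46 [8] ACTION=GetQueryTagValues&documentCount=True&databaseMatch=Deal&minScore=70&weighfieldtext=false&FieldName=TotalSizeSizeInSquareMetres%2CAnnualRental%2CDealType%2CYield&start=1&FieldText=%28MATCH%7BBournemouth%7D%3ATown%29 (10.55.81.151)';
print "$_\n" for extract_params($sample);
# prints: FieldText=(MATCH{Bournemouth}:Town)
```

Lines such as "Returning 23 matches" contain no ACTION= token and are skipped, so the same routine could be run over the whole log file.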
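Posted on 2011-11-11 13:00:58
The Perl program below does the job and includes your sample data in its source. It produces exactly the output you describe, including reporting NOT EQUAL{1}:ISSCHEME as EQUAL:ISSCHEME because of the intervening space.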
use strict;
use warnings;
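# For every line of the embedded sample data, print each WORD:WORD pair,
# skipping an optional {digits} group between the first word and the colon.
while (<DATA>) {
    print "$1:$2\n" while /(\w+) (?: \{\d*\} )? : (\w+)/xg;
}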
__DATA__
sort=SIZE:NumberDecreasing
FieldText=(((EQUAL{226742}:LocationId)) AND ())
FieldText=(((EQUAL{226742}:LocationId)) AND ((EQUAL{1}:LOD AND NOTEQUAL{1}:SCR AND EMPTY{}:RPDCITYID AND NOTEQUAL{1}:Industrial)))
FieldText=( NOT EQUAL{1}:ISSCHEME AND EQUAL{215629}:LocationId)
sort=DEALDATE:decreasing
Output
SIZE:NumberDecreasing
EQUAL:LocationId
EQUAL:LocationId
EQUAL:LOD
NOTEQUAL:SCR
EMPTY:RPDCITYID
NOTEQUAL:Industrial
EQUAL:ISSCHEME
EQUAL:LocationId
DEALDATE:decreasing
https://stackoverflow.com/questions/8067573
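Because the stated goal is to see which fields are queried and how, it may be more useful to keep a leading NOT with its operator and to count how often each pair occurs. The following is a sketch along those lines, not the answer's original program: the helper names operator_field_pairs and tally_pairs are invented for illustration, and the {...} group is assumed to hold any text without a closing brace (so that values such as {Bournemouth} from the raw log are also skipped).

```perl
use strict;
use warnings;

# Extract "OPERATOR:Field" pairs from one line, keeping a preceding NOT
# as part of the operator ("NOT EQUAL{1}:ISSCHEME" -> "NOT EQUAL:ISSCHEME").
sub operator_field_pairs {
    my ($line) = @_;
    my @pairs;
    while ($line =~ /( (?: \b NOT \s+ )? \w+ ) (?: \{ [^}]* \} )? : (\w+)/xg) {
        my ($operator, $field) = ($1, $2);
        $operator =~ s/\s+/ /g;    # collapse repeated spaces after NOT
        push @pairs, "$operator:$field";
    }
    return @pairs;
}

# Count how many times each pair occurs across all given lines.
sub tally_pairs {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        $count{$_}++ for operator_field_pairs($line);
    }
    return \%count;
}

# Demo on the sample data from the question.
my @sample = (
    'sort=SIZE:NumberDecreasing',
    'FieldText=(((EQUAL{226742}:LocationId)) AND ())',
    'FieldText=(((EQUAL{226742}:LocationId)) AND ((EQUAL{1}:LOD AND NOTEQUAL{1}:SCR AND EMPTY{}:RPDCITYID AND NOTEQUAL{1}:Industrial)))',
    'FieldText=( NOT EQUAL{1}:ISSCHEME AND EQUAL{215629}:LocationId)',
    'sort=DEALDATE:decreasing',
);
my $count = tally_pairs(@sample);

# Most frequent pairs first, ties broken alphabetically.
printf "%3d  %s\n", $count->{$_}, $_
    for sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count;
```

On the sample data this counts EQUAL:LocationId three times and reports NOT EQUAL:ISSCHEME as its own entry. It should be fed the reduced sort=/FieldText= lines rather than raw log lines, because a timestamp such as 16:25:44 also matches the word:word pattern.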