我试图通过添加table1和2来丰富我的field_to_enrich1数据,其中字段1-3是相同的,_time就在table1事件的_time之前。
为了根据注释澄清,“就在前面”,我指的是在当前事件的_time字段之前发生的第一个日志事件,其中字段1-3都是匹配的。
我在field1、field2、field3上做了一个左联接,但是我正在试图找出如何在这两个表之间进行_time关联。
我有两张桌子在斯普伦克,如下所示。
Table1
_time,field1,field2,field3,field4
2022-11-10 13:19:55.308,oepwy0s4mjt,n6u,field4_random_123
2022-11-10 13:19:56.308,6onbcity1n2,lwe,field4_random_456
2022-11-10 13:19:57.308,9rfkuntl7qx,2tc,field4_random_567
2022-11-10 13:19:58.308,fn44tlt6rtt,8tm,field4_random_234
2022-11-10 13:19:59.308,gj11nax4o68,lr3,field4_random_458
2022-11-10 13:20:00.308,mdgdj03sx9c,7pc,field4_random_124Table2
_time,field1,field2,field3,field_to_enrich1,field_to_enrich2
2022-11-10 13:19:55.108,oepwy0s4mjt,n6u,83zuyt8vdyFF,ljr5furt0mFF
2022-11-10 13:19:55.208,oepwy0s4mjt,n6u,83zuyt8vdy75,ljr5furt0mfs
2022-11-10 13:19:56.108,6onbcity1n2,lwe,yeg1lhraoeGG,ngmly4majhGG
2022-11-10 13:19:56.208,6onbcity1n2,lwe,yeg1lhraoef0,ngmly4majhom
2022-11-10 13:19:57.108,9rfkuntl7qx,2tc,pfe6vssh0qej,me4yghhmj26t
2022-11-10 13:19:57.208,9rfkuntl7qx,2tc,pfe6vssh0qej,me4yghhmj26t
2022-11-10 13:19:58.108,fn44tlt6rtt,8tm,8l06613lartf,bx5h3v9l1udg
2022-11-10 13:19:58.208,fn44tlt6rtt,8tm,8l06613lartf,bx5h3v9l1udg
2022-11-10 13:19:59.208,oepwy0s4mjt,n6u,asdfasdfasdf,asdfasdfasdf
2022-11-10 13:20:00.208,oepwy0s4mjt,n6u,oimlkmjhgggh,asdfiiiidddd下面是上面表的输出示例。
Table3
_time,field1,field2,field3,field_to_enrich1,field_to_enrich2
2022-11-10 13:19:55.308,oepwy0s4mjt,n6u,field4_random_123,83zuyt8vdy75,ljr5furt0mfs
2022-11-10 13:19:56.308,6onbcity1n2,lwe,field4_random_456,yeg1lhraoef0,ngmly4majhom
2022-11-10 13:19:57.308,9rfkuntl7qx,2tc,field4_random_567,pfe6vssh0qej,me4yghhmj26t
2022-11-10 13:19:58.308,fn44tlt6rtt,8tm,field4_random_234,8l06613lartf,bx5h3v9l1udg
2022-11-10 13:19:59.308,gj11nax4o68,lr3,field4_random_458,FILLNULL,FILLNULL2
2022-11-10 13:20:00.308,mdgdj03sx9c,7pc,field4_random_124,FILLNULL,FILLNULL2任何帮助都将不胜感激。
发布于 2022-11-10 21:35:05
我更喜欢避免join,因为它很昂贵,但没有其他选择。我们可以通过使用_time命令来丢弃给定字段集的所有最新事件,从而处理“在此之前”的"_time“要求。
<<your search for Table1>>
| fields _time,field1,field2,field3,field4
| join type=left field1,field2,field3 [
<<your search for Table2>>
| _time,field1,field2,field3,field_to_enrich1,field_to_enrich2
```Keep only the most recent event for each triplet```
| dedup field1,field2,field3
]
| fillnull value="FILLNULL" field_to_enrich1
| fillnull value="FILLNULL2" field_to_enrich2
| table _time,field1,field2,field3,field_to_enrich1,field_to_enrich2答案2:这是一些应该处理重复事件的丑陋之处--至少它适用于示例数据。注意:我删除了对'field3‘的引用,因为它没有包含在数据中。此外,我在示例中将_time更改为时间,以便在查询中可以使用_time。
| makeresults
| eval data="time,field1,field2,field4
2022-11-10 13:19:55.308,oepwy0s4mjt,n6u,field4_random_123
2022-11-10 13:19:56.308,6onbcity1n2,lwe,field4_random_456
2022-11-10 13:19:57.308,9rfkuntl7qx,2tc,field4_random_567
2022-11-10 13:19:58.308,fn44tlt6rtt,8tm,field4_random_234
2022-11-10 13:19:59.308,gj11nax4o68,lr3,field4_random_458
2022-11-10 13:20:00.308,mdgdj03sx9c,7pc,field4_random_124"
| eval _raw=data
| multikv forceheader=1
| eval _time=strptime(time,"%Y-%m-%d %H:%M:%S.%3N")
| sort - _time
```Above defines test data. Replace with your search for Table1```字段_time,field1,field2,field4
Define fields we'll need in the map command
etime=_time,efield1=field1,efield2=field2,efield4=field4
根据表1`中的预期行数更改the搜索的值
maxsearches=1000 search=,field1,field2,field_to_enrich1,field_to_enrich2
2022-11-10 13:19:55.108,oepwy0s4mjt,n6u,83zuyt8vdyFF,ljr5furt0mFF2022-11-10 13:19:55.208,oepwy0s4mjt,n6u,83zuyt8vdy75,ljr5furt0mfs2022-11-10 13:19:56.108,6onbcity1n2,lwe,yeg1lhraoeGG,ngmly4majhGG2022-11-10 13:19:56.208,6onbcity1n2,lwe,yeg1lhraoef0,ngmly4majhom2022-11-10 13:19:57.108,9rfkuntl7qx,2tc,pfe6vssh0qej,me4yghhmj26t2022-11-10 13:19:57.208,9rfkuntl7qx,2tc,pfe6vssh0qej,me4yghhmj26t2022-11-10 13:19:58.108,fn44tlt6rtt,8tm,8l06613lartf,bx5h3v9l1udg2022-11-10 13:19:58.208,fn44tlt6rtt,8tm,8l06613lartf,bx5h3v9l1udg2022-11-10 13:19:59.208,oepwy0s4mjt,n6u,asdfasdfasdf,asdfasdfasdf2022-11-10 13:20:00.208,oepwy0s4mjt,n6u,oimlkmjhgggh,asdfiiiidddd\"\x{e76f}\x{e76f}\x{e76f} _raw=data \ multikv forceheader=1
\"%Y-%m-%d %H:%M:%S.%3N\")
Replace with your search for Table2```
| sort - _time
```Look for the fields passed in from Table1```field1=$efield1$ field2=$efield2$ _time<$etime$
If get more than one, pick the first (most recent)
第1项
Use the values of _time and field4 from Table1
_time=$etime$,field4=\"$efield4$\“
If nothing was found in Table2 then assign values
_time=$etime$,field1=\"$efield1$\",field2=\"$efield2$\",field4=\"$efield4$\",field_to_enrich1=\“FILLNULL\\”,field_to_enrich2=\“FILLNULL 2\”
_time,field1,field2,field4,field_to_enrich1,field_to_enrich2"
https://stackoverflow.com/questions/74394193
复制相似问题