I want to use EmEditor to remove some strings from a file and keep only the parts I need.
The lines in the file look like this:
{"message":"{\"_\":\"user\",\"pFlags\":{\"contact\":true},\"user_flags\":2143,\"id\":702212125,\"access_hash\":\"914250561826\",\"first_name\":\"david\",\"last_name\":\"jones\",\"username\":\"david_d192\",\"phone\":\"051863329875\",\"status\":{\"_\":\"userStatusRecently\"}}","phone":"051863329875","version":"3","type":"unknown","token":"1556189892619764206","p_id":702212125,"username":"david_d192","type":"redis","user_flags":2143,"host":"win",from":"contacts"}
{"index": {"_type": "_doc", "_id": "36GG54F"}}
{"message":"{\"_\":\"user\",\"pFlags\":{\"contact\":true},\"user_flags\":2143,\"id\":702212125,\"access_hash\":\"914250561826\",\"first_name\":\"david\",\"last_name\":\"jones\",\"username\":\"david_d192\",\"phone\":\"051863329875\",\"status\":{\"_\":\"userStatusRecently\"}}","phone":"051863329875","version":"3","type":"unknown","token":"1556189892619764206","p_id":702212125,"username":"david_d192","type":"redis","user_flags":2143,"host":"win",from":"contacts"}
{"index": {"_type": "_doc", "_id": "36GG54F"}}
{"message":"{\"_\":\"user\",\"pFlags\":{\"contact\":true},\"user_flags\":2143,\"id\":702212125,\"access_hash\":\"914250561826\",\"first_name\":\"david\",\"last_name\":\"jones\",\"phone\":\"051863329875\",\"status\":{\"_\":\"userStatusRecently\"}}","phone":"051863329875","version":"3","type":"unknown","token":"1556189892619764206","p_id":702212125,"type":"redis","user_flags":2143,"host":"win",from":"contacts"}
{"index": {"_type": "_doc", "_id": "36GG54F"}}
I want to keep id, first_name, last_name, phone, and username (if it exists) from each line, like this =>
id:702212125 first_name:david last_name:jones phone:051863329875 username:david_d192,
id:702212125 first_name:david last_name:jones phone:051863329875 username:david_d192,
id:702212125 first_name:david last_name:jones phone:051863329875,
How can I do this?
Thanks
Posted on 2020-06-18 15:58:17
JSON parsing is the best way to do this (https://linuxconfig.org/how-to-parse-data-from-json-into-python). That said, regex can also get the job done the hard way; the patterns in this answer are in PCRE (PHP) flavor.
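As a rough illustration of the JSON-parsing route, here is a minimal Python sketch. It assumes each data line carries the user record as an escaped JSON string under the outer "message" key, as in the question's sample. Because the sample lines are not strictly valid JSON as posted (note the `,from":` near the end), the sketch pulls out only the "message" string literal and parses that, rather than parsing the whole line. The names `MESSAGE_RE`, `FIELDS`, and `extract_user` are made up for this example.

```python
import json
import re

# The value of the outer "message" key is itself a JSON string literal.
# This pattern captures that literal, including its surrounding quotes.
MESSAGE_RE = re.compile(r'"message":("(?:[^"\\]|\\.)*")')

# Fields to keep, in output order; "username" may be missing.
FIELDS = ("id", "first_name", "last_name", "phone", "username")


def extract_user(line):
    """Return 'key:value' pairs for one line, or None if it has no message."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None  # e.g. the {"index": ...} lines
    # First loads() unescapes the string literal, second parses the record.
    user = json.loads(json.loads(match.group(1)))
    return " ".join(f"{key}:{user[key]}" for key in FIELDS if key in user)
```

Calling `extract_user` on every line of the file and writing out the results that are not `None` would give one summary line per user record; the `{"index": ...}` lines are skipped because they have no "message" key.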
Get all ids:
(?<=id\":\s\")(\w+)(?=\")
See example: https://regex101.com/r/g5vfEd/1
Get all first names:
(?<=first_name\\\":\\\")(\w)+(?=\\)
See example: https://regex101.com/r/g5vfEd/2
Get all last names:
(?<=last_name\\\":\\\")(\w)+(?=\\)
See example: https://regex101.com/r/g5vfEd/3
Get all phone numbers:
(?<=phone\\\":\\\")(\w)+(?=\\)
See example: https://regex101.com/r/g5vfEd/4
Get all usernames (if present):
(?<=username\\\":\\\")(\w)+(?=\\)
See example: https://regex101.com/r/g5vfEd/5
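The first_name, last_name, phone, and username patterns above share one lookbehind/lookahead template, so they can be generated from the field name. The following Python sketch is an adaptation for Python's `re` module, not the PCRE patterns verbatim: it drops the capturing group and matches `\w+` directly so that `findall` returns whole values. The helper name `field_pattern` is hypothetical.

```python
import re


def field_pattern(name):
    """Build a pattern for the value that follows  name\\":\\"  in the escaped message."""
    # Lookbehind: the field name, then \" : \" (backslashes are literal in the data).
    # Lookahead: the backslash that starts the closing \" of the value.
    return re.compile(r'(?<=' + re.escape(name) + r'\\":\\")\w+(?=\\)')


# A shortened piece of the question's sample data, used only for illustration.
sample = r'\"first_name\":\"david\",\"last_name\":\"jones\",\"phone\":\"051863329875\"'

for field in ("first_name", "last_name", "phone", "username"):
    print(field, field_pattern(field).findall(sample))
```

A field that is absent, such as username in this shortened sample, simply yields an empty list.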
Full pattern to match everything:
id\\?\":\s?\"?(\w+),?[\\\"].*first_name\\\":\\"(\w+).*last_name\\\":\\\"(\w+).*phone\":\"(\d+).*(?=username)?\":\"(\w+).*
This returns 3 matches, each containing the following 5 groups (match 1 shown here):
Group 1. 85-94 702212125
Group 2. 145-150 david
Group 3. 169-174 jones
Group 4. 285-297 051863329875
Group 5. 454-462 contacts
Posted on 2020-06-18 20:59:36
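Since you have tagged regex and EmEditor, you can try this.
EmEditor version 19.1 and above supports regex named groups, written like this:
(?<id>expression)
and named backreferences of the form:
\k<id>
So, the steps:
Open Find and Replace (Ctrl-H). Check "Match Case" and select "Regular Expressions".
Find:
\\"id\\"[\\":]*(?<id>[^\\":,]*).*?\\"first_name\\"[\\":]*(?<first_name>[^\\":,]*).*?\\"last_name\\"[\\":]*(?<last_name>[^\\":,]*).*?\\"phone\\"[\\":]*(?<phone>[^\\":,]*)(.*?"username"[\\":]*(?<username>[^\\":,]*))?
Replace with:
id:\k<id>\tfirst_name:\k<first_name>\tlast_name:\k<last_name>\tphone:\k<phone>\tusername:\k<username>
Click the down arrow next to the "Extract" button and select "New Document".
Click the "Extract" button to output to a new tab-delimited file.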
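To try the same named-group idea outside EmEditor, here is a minimal Python sketch. It is a port, not EmEditor behavior: Python's `re` module spells named groups `(?P<name>...)` instead of `(?<name>...)`, and the sketch leaves the username field out when it is absent instead of emitting an empty value. The names `PATTERN`, `FIELDS`, and `to_tsv` are invented for this example.

```python
import re

# Same structure as the Find pattern, with Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r'\\"id\\"[\\":]*(?P<id>[^\\":,]*).*?'
    r'\\"first_name\\"[\\":]*(?P<first_name>[^\\":,]*).*?'
    r'\\"last_name\\"[\\":]*(?P<last_name>[^\\":,]*).*?'
    r'\\"phone\\"[\\":]*(?P<phone>[^\\":,]*)'
    # Optional part: the unescaped outer "username" key, if the line has one.
    r'(.*?"username"[\\":]*(?P<username>[^\\":,]*))?'
)

FIELDS = ("id", "first_name", "last_name", "phone", "username")


def to_tsv(line):
    """Return tab-separated 'name:value' pairs, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None  # e.g. the {"index": ...} lines
    # group(name) is None for the optional username when it did not take part.
    return "\t".join(
        f"{name}:{match.group(name)}"
        for name in FIELDS
        if match.group(name) is not None
    )
```

Applying `to_tsv` to each line and keeping the results that are not `None` mirrors the Extract step: only the matched records end up in the output.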
https://stackoverflow.com/questions/62444793