我必须用下面的内容解析多个文件的内容:
style=3D""><a href=3D"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ" style=3D"color:#3b599我必须提取https链接,但我的grep命令不能忽略新行return,并以中继结果结束:
命令
grep -r -m1 -oh "https://123456789.com/accounts/confirm_email*\s*[^ ]*" /folder/结果
https://123456789.com/accounts/confirm_email/19AbCDx=DESIDERED结果
https://123456789.com/accounts/confirm_email/19AbCDx=K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1MjkwODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQPS:'=‘字符不是(总是)链接的一部分,但在换行时它是文件的格式。
注意:https://123456789.com/accounts/confirm_email/是链接在所有文件中重复的唯一常量。
如果我添加-z选项,-m1选项被忽略,结果是:
https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"如果我在命令后面添加|head -3,但最后一行重复了http
命令
grep -r -oh -z "https://123456789.com/accounts/confirm_email*\s*[^ ]*" /folder/ |head-3
https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"https://123456789.com/accounts/confirm_email/19AbCDx=我怎么能排除它呢?
发布于 2019-03-20 22:47:07
man grep
-z, --null-data
Treat the input as a set of lines, each terminated by a zero
byte (the ASCII NUL character) instead of a newline. - -所以:
$ grep -z -r -m1 -oh "https://123456789.com/accounts/confirm_email*\s*[^ ]*" file输出:
https://123456789.com/accounts/confirm_email/19AbCDx=
K/bWFyY29A1234529zYW50dWNjaS5ldQ/?app_redirect=3DFalse&ndid=3DHMTU1Mjk=
wODY5OTA1MDk2NTptYXJjb0BtYXJjb3NhbnR1Y2NpLmV1Ojg1OQ"换行符仍然存在,但您可以使用tr -d \\n删除它们
https://stackoverflow.com/questions/55263189
复制相似问题