I need to find all files whose first line contains both of the strings "StockID" and "SellPrice".
Here are some example files:
1.csv:
StockID Dept Cat2 Cat4 Cat5 Cat6 Cat1 Cat3 Title Notes Active Weight Sizestr Colorstr Quantity Newprice StockCode DateAdded SellPrice PhotoQuant PhotoStatus Description stockcontrl Agerestricted
1 0 0 0 0 22 0 RAF Air Crew Oxygen Connector 50801 1 150 0 0 50866 2018-09-11 05:54:03 65 5 1
\r\nA wartime RAF aircrew oxygen hose connector.
\r\n
\r\nAir Ministry stamped with Ref. No. 6D/482, Mk IVA.
\r\n
\r\nBrass spring loaded top bayonet fitting for the 'walk around' oxygen bottle extension hose (see last photo).
\r\n
\r\nIn a good condition. 2 0
1 0 0 0 0 15 0 WW2 US Airforce Type Handheld Microphone 50619 1 300 1 0 50691 2017-12-06 09:02:11 20 9 1
\r\nWW2 US Airforce Handheld Microphone type NAF 213264-6 and sprung mounting Bracket No. 213264-2.
\r\n
\r\nType RS 38-A.
\r\n
\r\nMade by Telephonics Corp.
\r\n
\r\nIn a un-issued condition. 3 0
1 0 0 0 0 22 0 RAF Seat Type Parachute Harness 1 4500 1 0 50367 2016-11-04 12:02:26 155 8 1
\r\nPost War RAF Pilot Seat Type Parachute Harness.
\r\n
\r\nThis Irvin manufactured harness is 'new old' stock and is unissued.
\r\n
\r\nThe label states Irvin Harness type C, Mk10, date 1976.
\r\nIt has Irvin marked buckles and complete harness straps all in 'mint' condition.
\r\n
\r\nFully working Irvin Quick Release Box and a canopy release Irvin 'D-Ring' Handle.
\r\n
\r\nThis harness is the same style type as the WW2 pattern seat type, and with some work could be made to look like one.
\r\n
\r\nIdeal for the re-enactor or collector (Not sold for parachuting).
\r\n
\r\nTotal weight of 4500 gms. 3 0
2.csv:
id user_id organization_id hash name email date first_name hear_about
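1 2 15 Fairley teisjdaijdsaidja@domain.com 1129889679 John 0
I only want to find the files whose first line contains "StockID" and "SellPrice"; so in this example, I only want to output ./1.csv
I managed to get this far, but now I'm stuck: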
where=$(find ./backup -type f)
for x in $where; do
    head -n 1 "$x" | grep -w "StockID"
done
Posted on 2018-12-09 15:51:15
find + awk solution:
find ./backup -type f -exec \
    awk 'NR == 1{ if (/StockID.*SellPrice/) print FILENAME; exit }' {} \;
If the key words may appear in a different order, replace the pattern /StockID.*SellPrice/ with /StockID/ && /SellPrice/.
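To make the order-independent variant concrete, here is a minimal sketch that runs it against a throwaway directory. The temporary directory, the file names and the file contents are made up for illustration; only the find + awk command itself comes from the answer above.

```shell
# Build a scratch tree with three hypothetical sample files.
tmp=$(mktemp -d)
mkdir -p "$tmp/backup/sub"
printf 'StockID\tDept\tSellPrice\n1\t0\t65\n' > "$tmp/backup/1.csv"
printf 'id\tuser_id\tname\n1\t2\tFairley\n' > "$tmp/backup/2.csv"
# Same key words in the opposite order, one directory down.
printf 'SellPrice\tTitle\tStockID\n20\tMicrophone\t1\n' > "$tmp/backup/sub/3.csv"

# Test only the first line of each file, accept either key word order,
# and exit right away so the rest of the file is never processed.
matches=$(cd "$tmp" && find ./backup -type f -exec \
    awk 'NR == 1 { if (/StockID/ && /SellPrice/) print FILENAME; exit }' {} \; |
    LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$tmp"
```

With these sample files, both 1.csv and sub/3.csv are reported, while the original /StockID.*SellPrice/ pattern would miss sub/3.csv.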
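With a large number of files, a more efficient alternative is the following (it processes a batch of files per invocation, so the total number of awk invocations will be far smaller than the number of files found):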
find ./backup -type f -exec \
    awk 'FNR == 1 && /StockID.*SellPrice/{ print FILENAME }{ nextfile }' {} +
Posted on 2018-12-10 00:58:04
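With GNU grep or compatible:
grep -Hrnm1 '^' ./backup | sed -n '/StockID.*SellPrice/s/:1:.*//p'
The recursive grep prints the first line of every file in the form filename:1:line without reading the whole file (the -m1 flag makes it stop after the first match), and sed prints the filename of those records whose line part matches the pattern.
This will fail for file names that themselves contain :1: or newline characters, but that is a risk worth taking rather than setting up some slow find + awk combination that runs another process for each file.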
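The two stages of that pipeline, and the :1: caveat, can be seen in a small sketch. It assumes a grep that supports -H, -r, -n and -m (GNU grep does); the scratch directory and file names, including the deliberately awkward a:1:b.csv, are hypothetical.

```shell
# Build a scratch tree; a:1:b.csv is a made-up name chosen to trigger the caveat.
tmp=$(mktemp -d)
mkdir -p "$tmp/backup"
printf 'StockID\tDept\tSellPrice\n1\t0\t65\n' > "$tmp/backup/1.csv"
printf 'id\tuser_id\tname\n1\t2\tFairley\n' > "$tmp/backup/2.csv"
printf 'StockID\tSellPrice\n7\t20\n' > "$tmp/backup/a:1:b.csv"

# Stage 1: '^' matches every line and -m1 stops after the first match,
# so each file contributes one record of the form filename:1:first-line.
stage1=$(cd "$tmp" && grep -Hrnm1 '^' ./backup | LC_ALL=C sort)

# Stage 2: keep the matching records and cut everything from the first ":1:" on.
stage2=$(printf '%s\n' "$stage1" | sed -n '/StockID.*SellPrice/s/:1:.*//p')

printf '%s\n' "$stage1" "$stage2"
rm -rf "$tmp"
```

The last two lines printed are ./backup/1.csv and ./backup/a: the second name is truncated at its own :1:, which is exactly the failure mode described above.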
Posted on 2018-12-10 07:08:07
To avoid running one command per file and reading whole files, with GNU awk:
(unset -v POSIXLY_CORRECT; exec find backup/ -type f -exec gawk '
  /\<StockID\>/ && /\<SellPrice\>/ {print FILENAME}; {nextfile}' {} +)
Or with zsh:
set -o rematchpcre # where we know for sure \b is supported
for file (backup/**/*(ND.)) {
IFS= read -r line < $file &&
[[ $line =~ "\bStockID\b" ]] &&
[[ $line =~ "\bSellPrice\b" ]] &&
print -r $file
}
Or:
set -o rematchpcre
print -rl backup/**/*(D.e:'
IFS= read -r line < $REPLY &&
[[ $line =~ "\bStockID\b" ]] &&
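[[ $line =~ "\bSellPrice\b" ]]':)
Or with bash, on systems whose native extended regular expressions support the \< and \> word boundary operators (on other systems, you can also try [[:<:]]/[[:>:]] or \b):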
RE1='\<StockID\>' RE2='\<SellPrice\>' find backup -type f -exec bash -c '
for file do
IFS= read -r line < "$file" &&
[[ $line =~ $RE1 ]] &&
[[ $line =~ $RE2 ]] &&
printf "%s\n" "$file"
done' bash {} +
https://unix.stackexchange.com/questions/486931