Q: Recursively search for files whose first line contains a specific combination of strings

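Unix & Linux user

Asked 2018-12-09 15:16:11

4 answers · 182 views · 0 followers · 3 votes

I need to find all files whose first line contains both of the strings "StockID" and "SellPrice".

Here are some example files:

1.csv: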

StockID Dept    Cat2    Cat4    Cat5    Cat6    Cat1    Cat3    Title   Notes   Active  Weight  Sizestr Colorstr    Quantity    Newprice    StockCode   DateAdded   SellPrice   PhotoQuant  PhotoStatus Description stockcontrl Agerestricted
 1   0   0   0   0   22  0   RAF Air Crew Oxygen Connector   50801   1   150   0   0   50866   2018-09-11 05:54:03 65  5   1   
\r\nA wartime RAF aircrew oxygen hose connector.
\r\n
\r\nAir Ministry stamped with Ref. No. 6D/482, Mk IVA.
\r\n
\r\nBrass spring loaded top bayonet fitting for the 'walk around' oxygen bottle extension hose (see last photo).
\r\n
\r\nIn a good condition.    2   0
 1   0   0   0   0   15  0   WW2 US Airforce Type Handheld Microphone    50619   1   300   1   0   50691   2017-12-06 09:02:11 20  9   1   
\r\nWW2 US Airforce Handheld Microphone type NAF 213264-6 and sprung mounting Bracket No. 213264-2.
\r\n
\r\nType RS 38-A.
\r\n
\r\nMade by Telephonics Corp.
\r\n
\r\nIn a un-issued condition.    3   0
 1   0   0   0   0   22  0   RAF Seat Type Parachute Harness  1   4500      1   0   50367   2016-11-04 12:02:26 155 8   1   
\r\nPost War RAF Pilot Seat Type Parachute Harness.
\r\n
\r\nThis Irvin manufactured harness is 'new old' stock and is unissued.
\r\n
\r\nThe label states Irvin Harness type C, Mk10, date 1976.
\r\nIt has Irvin marked buckles and complete harness straps all in 'mint' condition.
\r\n
\r\nFully working Irvin Quick Release Box and a canopy release Irvin  'D-Ring' Handle.
\r\n
\r\nThis harness is the same style type as the WW2 pattern seat type, and with some work could be made to look like one.
\r\n
\r\nIdeal for the re-enactor or collector (Not sold for parachuting).
\r\n
\r\nTotal weight of 4500 gms.   3   0

2.csv:

id  user_id organization_id hash    name    email   date    first_name  hear_about
1   2   15   Fairley teisjdaijdsaidja@domain.com 1129889679  John    0

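I only want to find the files whose first line contains both "StockID" and "SellPrice", so in this example the only output should be ./1.csv

I managed to get this far, but now I'm stuck: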

where=$(find "./backup" -type f)
for x in $where; do
   head -1 "$x" | grep -w "StockID"
done

linux

awk

grep

find

head

4 Answers

Unix & Linux user

Accepted answer

Posted 2018-12-09 15:51:15

A find + awk solution:

find ./backup -type f -exec \
awk 'NR == 1{ if (/StockID.*SellPrice/) print FILENAME; exit }' {} \;

If the keywords may appear in either order, replace the pattern /StockID.*SellPrice/ with /StockID/ && /SellPrice/.
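
As a minimal sketch of that order-independent variant, the following builds a throwaway directory and runs the adjusted pattern against it. The sandbox layout and the file names a.csv, b.csv and c.csv are made up for illustration and are not from the question.

```shell
# Hypothetical sandbox: directory and file names are illustrative only.
tmp=$(mktemp -d)
mkdir -p "$tmp/backup/sub"
printf 'StockID\tDept\tSellPrice\n1\t0\t65\n' > "$tmp/backup/a.csv"
printf 'SellPrice\tDept\tStockID\n65\t0\t1\n' > "$tmp/backup/sub/b.csv"  # reversed order
printf 'id\tuser_id\nStockID\tSellPrice\n'    > "$tmp/backup/c.csv"      # keywords only on line 2

# Both keywords must appear on line 1, in any order.
matches=$(cd "$tmp" && find ./backup -type f -exec \
  awk 'NR == 1 { if (/StockID/ && /SellPrice/) print FILENAME; exit }' {} \; | sort)

printf '%s\n' "$matches"
rm -rf "$tmp"
```

Here a.csv and sub/b.csv should be listed, while c.csv is skipped because its keywords are not on the first line.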

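With a large number of files, a more efficient alternative is the following (it processes a batch of files at a time, so the total number of command invocations will be far smaller than the number of matching files):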

find ./backup -type f -exec \
awk 'FNR == 1 && /StockID.*SellPrice/{ print FILENAME }{ nextfile }' {} +
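
The difference between the two -exec terminators can be observed directly. This is a rough sketch, assuming a scratch directory with five made-up files; `sh -c 'echo invoked'` merely stands in for awk so that each child process leaves one line of output. Because one awk process now receives several files, the batched command above tests FNR (the per-file record number) rather than NR.

```shell
# Hypothetical scratch directory with five small files.
tmp=$(mktemp -d)
for i in 1 2 3 4 5; do printf 'header\n' > "$tmp/f$i.csv"; done

# "\;" starts one child process per file.
per_file=$(find "$tmp" -type f -exec sh -c 'echo invoked' sh {} \; | wc -l | tr -d ' ')

# "+" passes as many file names as fit on a single command line.
batched=$(find "$tmp" -type f -exec sh -c 'echo invoked' sh {} + | wc -l | tr -d ' ')

echo "per-file invocations: $per_file, batched invocations: $batched"
rm -rf "$tmp"
```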

Votes: 6

Unix & Linux user

Posted 2018-12-10 00:58:04

With GNU grep or compatible:

grep -Hrnm1 '^' ./backup | sed -n '/StockID.*SellPrice/s/:1:.*//p'

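The recursive grep prints the first line of each file as filename:1:line without reading the whole file (the -m1 flag should make it stop at the first match), and sed prints the filename of those entries whose line part matches the pattern.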

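This fails for file names that themselves contain :1: or newline characters, but that is a risk worth taking compared with setting up some slow find + awk combination that executes another process for each file.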
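
To see what each stage of that pipeline produces, it can be split in two. This is only a sketch, assuming GNU grep (or a compatible grep that supports -H, -r, -n and -m); the sandbox and its two files are made up for illustration.

```shell
# Hypothetical sandbox with one matching and one non-matching file.
tmp=$(mktemp -d)
mkdir "$tmp/backup"
printf 'StockID\tDept\tSellPrice\nrow\n' > "$tmp/backup/1.csv"
printf 'id\tuser_id\nrow\n'              > "$tmp/backup/2.csv"

# Stage 1: one "filename:1:first line" record per file.
first_lines=$(cd "$tmp" && grep -Hrnm1 '^' ./backup | sort)

# Stage 2: keep matching records and strip everything from ":1:" onwards.
found=$(printf '%s\n' "$first_lines" | sed -n '/StockID.*SellPrice/s/:1:.*//p')

printf '%s\n' "$first_lines"
printf 'found: %s\n' "$found"
rm -rf "$tmp"
```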

Votes: 1

Unix & Linux user

Posted 2018-12-10 07:08:07

To avoid running one command per file and reading whole files, with GNU awk:

(unset -v POSIXLY_CORRECT; exec find backup/ -type f -exec gawk '
  /\<StockID\>/ && /\<SellPrice\>/ {print FILENAME}; {nextfile}' {} +)

Or with zsh:

set -o rematchpcre # where we know for sure \b is supported
for file (backup/**/*(ND.)) {
  IFS= read -r line < $file &&
   [[ $line =~ "\bStockID\b" ]] &&
   [[ $line =~ "\bSellPrice\b" ]] &&
   print -r $file
}

Or:

set -o rematchpcre
print -rl backup/**/*(D.e:'
  IFS= read -r line < $REPLY &&
   [[ $line =~ "\bStockID\b" ]] &&
   [[ $line =~ "\bSellPrice\b" ]]':)

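Or with bash, on systems whose native extended regular expressions support the \< and \> word-boundary operators (on other systems, you can also try [[:<:]]/[[:>:]] or \b):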

RE1='\<StockID\>' RE2='\<SellPrice\>' find backup -type f -exec bash -c '
  for file do
    IFS= read -r line < "$file" &&
    [[ $line =~ $RE1 ]] &&
    [[ $line =~ $RE2 ]] &&
    printf "%s\n" "$file"
  done' bash {} +
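
The effect of the word-boundary operators can be checked on a few sample strings. This is a minimal sketch, assuming bash is installed and the system's regex library supports \< and \> (as discussed above, that is system-dependent); the sample header lines are made up.

```shell
# Hypothetical header lines; only the first contains StockID as a whole word.
out=$(RE='\<StockID\>' bash -c '
  for line in "StockID Dept SellPrice" "OldStockID Dept" "StockIDs Dept"; do
    if [[ $line =~ $RE ]]; then
      printf "match:    %s\n" "$line"
    else
      printf "no match: %s\n" "$line"
    fi
  done')

printf '%s\n' "$out"
```

Without the boundaries, a plain StockID pattern would also accept OldStockID and StockIDs.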

Votes: 1

Original page content provided by Unix & Linux.

Original link:

https://unix.stackexchange.com/questions/486931
