Get lines by a unique part of the line, showing only the first match for that part

Stack Overflow user
Asked 2013-07-19 04:42:12
Answers: 1 · Views: 775 · Followers: 0 · Votes: 2

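I'm trying to write a script that looks at one part of each line, runs sort -u or something similar to find the unique values, and then displays the output in the lines' original order. In other words, only the first line matching each value of that part should appear.

I managed to do this using cut, but my output shows only the cut portion of the data. How can I get the whole line instead?

Here is what I have so far: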

cut -d, -f6 infile.txt | cut -c4-11 | grep -n . | sort -t: -k2,2 -u | sort -t: -k1n,1 | cut -d: -f2-

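I know the data won't contain any extra : characters that would break this script. But this outputs only the unique data. How can I get the whole line? I'd prefer to stay away from perl, but awk is OK (though I don't know it very well).

Example:

If the input file looks like this (note that the leading ABCDEFGH are not in the real data; I put them there only to illustrate what I mean):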

A....,....,...........,.....,....,...20130718......,.........,...........,......
B....,....,...........,.....,....,...20130714......,.........,...........,......
C....,....,...........,.....,....,...20130718......,.........,...........,......
D....,....,...........,.....,....,...20130719......,.........,...........,......
E....,....,...........,.....,....,...20130713......,.........,...........,......
F....,....,...........,.....,....,...20130714......,.........,...........,......
G....,....,...........,.....,....,...20130630......,.........,...........,......
H....,....,...........,.....,....,...20130718......,.........,...........,......

My program outputs:

20130718
20130714
20130719
20130713
20130630

What I'd like to see:

A....,....,...........,.....,....,...20130718......,.........,...........,......
B....,....,...........,.....,....,...20130714......,.........,...........,......
D....,....,...........,.....,....,...20130719......,.........,...........,......
E....,....,...........,.....,....,...20130713......,.........,...........,......
G....,....,...........,.....,....,...20130630......,.........,...........,......

1 Answer

Stack Overflow user

Accepted answer

Posted 2013-07-19 05:04:06

Yes, awk is your best bet. Here's a cryptic one-liner:

awk -F, '!seen[substr($6,4,8)]++' infile.txt

Explanation:

options:
  -F,              set the field separator to ,

condition:
  substr($6,4,8)   up to 8 characters starting at the fourth character
                   of the sixth field
  seen[...]++      seen is an associative array (dictionary). Increment the
                   value associated with ..., and return the old value
  !seen[...]++     if there was no old value, perform the action


action:
  There is no action, only a condition, so the default action is
  performed if the test succeeds. The default action is to print
  the line. So the line will be printed if the relevant characters of
  the sixth field haven't yet been seen.
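
If the one-liner is too terse, the same logic can be spelled out with an explicit action. This is only a sketch of an equivalent long form, not something from the original answer: the function name first_by_key and the variable key are made up for illustration, and it assumes the same comma-separated layout as the sample data.

```shell
# first_by_key is a hypothetical wrapper; it reads the named files, or stdin if none.
first_by_key() {
    awk -F, '
    {
        # 8 characters of the sixth field, starting at the fourth character
        key = substr($6, 4, 8)

        # Print the whole line only the first time this key shows up
        if (!(key in seen)) {
            seen[key] = 1
            print
        }
    }' "$@"
}

# Usage (assuming infile.txt exists):
#   first_by_key infile.txt
```

Because awk reads the input once and prints as it goes, the surviving lines stay in their original order without the grep -n / sort / cut bookkeeping.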

Test:

$ awk -F, '!seen[substr($6,4,8)]++' <<EOF
> A....,....,...........,.....,....,...20130718......,.........,...........,......
> B....,....,...........,.....,....,...20130714......,.........,...........,......
> C....,....,...........,.....,....,...20130718......,.........,...........,......
> D....,....,...........,.....,....,...20130719......,.........,...........,......
> E....,....,...........,.....,....,...20130713......,.........,...........,......
> F....,....,...........,.....,....,...20130714......,.........,...........,......
> G....,....,...........,.....,....,...20130630......,.........,...........,......
> H....,....,...........,.....,....,...20130718......,.........,...........,......
> EOF
A....,....,...........,.....,....,...20130718......,.........,...........,......
B....,....,...........,.....,....,...20130714......,.........,...........,......
D....,....,...........,.....,....,...20130719......,.........,...........,......
E....,....,...........,.....,....,...20130713......,.........,...........,......
G....,....,...........,.....,....,...20130630......,.........,...........,......
$
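
If the field number or character offsets change, the same condition can take them as awk variables instead of hard-coded numbers. The following is a hedged sketch rather than part of the answer above: the function name dedupe_on_slice and the variable names f, s and n are hypothetical, and the comma separator is assumed to stay the same.

```shell
# dedupe_on_slice FIELD START LENGTH [FILE...]
# Keeps the first line for each distinct substr($FIELD, START, LENGTH).
dedupe_on_slice() {
    field=$1
    start=$2
    len=$3
    shift 3
    # -v passes the shell values into awk; $f selects the field numbered f
    awk -F, -v f="$field" -v s="$start" -v n="$len" \
        '!seen[substr($f, s, n)]++' "$@"
}

# Usage matching the example in this thread (assuming infile.txt exists):
#   dedupe_on_slice 6 4 8 infile.txt
```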
Votes: 5
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/17733498
