Find the most frequent words in a text, sorted by count, with all words that share a count printed on the same line
grep -oE '[[:alpha:]]+' file.txt | sort | uniq -c | sort -nr
This gives us:
3 linux
3 fedora
2 ubuntu
2 mandriva
What I am looking for is:
3 linux fedora
2 ubuntu mandriva
Posted on 2019-04-12 20:48:36
I couldn't do this in a bash one-liner, but here is a short Python script, if that works for you.
import os

# Run the counting pipeline; each output line looks like "      3 linux".
preMergedList = os.popen(r"grep -o -E '\w+' file.txt | sort | uniq -c | sort -nr").readlines()

# Map each count to the list of words that occur that many times.
countDict = {}
for line in preMergedList:
    count, word = line.split(None)
    countDict.setdefault(int(count), []).append(word)

# Print one line per count, highest count first.
for count, words in sorted(countDict.items(), reverse=True):
    print(count, " ".join(words))

https://stackoverflow.com/questions/55642935
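
As a variant of the script above, the same grouping can probably be done without shelling out to grep at all. The following is only a minimal sketch, not the answer author's method: the function name group_words_by_count and the sample text are made up for illustration, and \w+ is assumed to be an acceptable stand-in for the grep pattern.

```python
import re
from collections import Counter, defaultdict


def group_words_by_count(text):
    """Return (count, words) pairs, highest count first.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Count every run of word characters, like grep -o -E '\w+'.
    counts = Counter(re.findall(r"\w+", text))

    # Collect the words that share the same count.
    grouped = defaultdict(list)
    for word, count in counts.items():
        grouped[count].append(word)

    # Counts are unique keys, so sorting the items orders by count only.
    return sorted(grouped.items(), reverse=True)


# Made-up sample text that reproduces the counts from the question.
sample = "linux fedora ubuntu linux mandriva fedora ubuntu linux fedora mandriva"
for count, words in group_words_by_count(sample):
    print(count, " ".join(words))
```

To run it on the file from the question, pass the contents of file.txt (for example open("file.txt").read()) instead of the sample string. Words on each line appear in the order they are first seen in the text, which may differ from the order the sort-based pipeline produces.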