
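How do I filter out the lists that don't contain elements from another list?

Stack Overflow user
Asked 2017-05-17 23:22:25
Answers 2 · Views 86 · Followers 0 · Votes 0

I am trying to exclude, from the list of lists below, the sublists that do not contain certain POS tags, but I have not been able to.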

a = ['VBG', 'RB', 'NNP']

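From the list of lists of tuples below, I only want the sublists that contain all of the tags above (the tagging below may be incorrect, but it serves for illustration):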
data = [[('User', 'NNP'),
      ('is', 'VBG'),
      ('not', 'RB'),
      ('able', 'JJ'),
      ('to', 'TO'),
      ('order', 'NN'),
      ('products', 'NNS'),
      ('from', 'IN'),
      ('iShopCatalog', 'NN'),
      ('Coala', 'NNP'),
      ('excluding', 'VBG'),
      ('articles', 'NNS'),
      ('from', 'IN'),
      ('VWR', 'NNP')],
     [('Arfter', 'NNP'),
      ('transferring', 'VBG'),
      ('the', 'DT'),
      ('articles', 'NNS'),
      ('from', 'IN'),
      ('COALA', 'NNP'),
      ('to', 'TO'),
      ('SRM', 'VB'),
      ('the', 'DT'),
      ('Category', 'NNP'),
      ('S9901', 'NNP'),
      ('Dummy', 'NNP'),
      ('is', 'VBZ'),
      ('maintained', 'VBN')],
     [('Due', 'JJ'),
      ('to', 'TO'),
      ('this', 'DT'),
      ('the', 'DT'),
      ('user', 'NN'),
      ('is', 'VBZ'),
      ('not', 'RB'),
      ('able', 'JJ'),
      ('to', 'TO'),
      ('order', 'NN'),
      ('the', 'DT'),
      ('product', 'NN')],
     [('All', 'DT'),
      ('other', 'JJ'),
      ('users', 'NNS'),
      ('can', 'MD'),
      ('order', 'NN'),
      ('these', 'DT'),
      ('articles', 'NNS')],
     [('She', 'PRP'),
      ('can', 'MD'),
      ('order', 'NN'),
      ('other', 'JJ'),
      ('products', 'NNS'),
      ('from', 'IN'),
      ('a', 'DT'),
      ('POETcatalog', 'NNP'),
      ('without', 'IN'),
      ('any', 'DT'),
      ('problems', 'NNS')],
     [('Furtheremore', 'IN'),
      ('she', 'PRP'),
      ('is', 'VBZ'),
      ('able', 'JJ'),
      ('to', 'TO'),
      ('order', 'NN'),
      ('products', 'NNS'),
      ('from', 'IN'),
      ('the', 'DT'),
      ('Vendor', 'NNP'),
      ('VWR', 'NNP'),
      ('through', 'IN'),
      ('COALA', 'NNP')],
     [('But', 'CC'),
      ('articles', 'NNP'),
      ('from', 'VBG'),
      ('all', 'RB'),
      ('other', 'JJ'),
      ('suppliers', 'NNS'),
      ('are', 'NNP'),
      ('not', 'VBG'),
      ('orderable', 'RB')],
     [('I', 'PRP'),
      ('already', 'RB'),
      ('spoke', 'VBD'),
      ('to', 'TO'),
      ('anic', 'VB'),
      ('who', 'WP'),
      ('maintain', 'VBP'),
      ('the', 'DT'),
      ('catalog', 'NN'),
      ('COALA', 'NNP'),
      ('and', 'CC'),
      ('they', 'PRP'),
      ('said', 'VBD'),
      ('that', 'IN'),
      ('the', 'DT'),
      ('reason', 'NN'),
      ('should', 'MD'),
      ('be', 'VB'),
      ('the', 'DT'),
      ('assignment', 'NN'),
      ('of', 'IN'),
      ('the', 'DT'),
      ('plant', 'NN')],
     [('User', 'NNP'),
      ('is', 'VBZ'),
      ('a', 'DT'),
      ('assinged', 'JJ'),
      ('to', 'TO'),
      ('Universitaet', 'NNP'),
      ('Regensburg', 'NNP'),
      ('in', 'IN'),
      ('Scout', 'NNP'),
      ('but', 'CC'),
      ('in', 'IN'),
      ('P17', 'NNP'),
      ('table', 'NN'),
      ('YESRMCDMUSER01', 'NNP'),
      ('she', 'PRP'),
      ('is', 'VBZ'),
      ('assigned', 'VBN'),
      ('to', 'TO'),
      ('company', 'NN'),
      ('001500', 'CD'),
      ('Merck', 'NNP'),
      ('KGaA', 'NNP')],
     [('Please', 'NNP'),
      ('find', 'VB'),
      ('attached', 'JJ'),
      ('some', 'DT'),
      ('screenshots', 'NNS')]]

My expected output is:
data = [[('User', 'NNP'),
  ('is', 'VBG'),
  ('not', 'RB'),
  ('able', 'JJ'),
  ('to', 'TO'),
  ('order', 'NN'),
  ('products', 'NNS'),
  ('from', 'IN'),
  ('iShopCatalog', 'NN'),
  ('Coala', 'NNP'),
  ('excluding', 'VBG'),
  ('articles', 'NNS'),
  ('from', 'IN'),
  ('VWR', 'NNP')],
  [('But', 'CC'),
  ('articles', 'NNP'),
  ('from', 'VBG'),
  ('all', 'RB'),
  ('other', 'JJ'),
  ('suppliers', 'NNS'),
  ('are', 'NNP'),
  ('not', 'VBG'),
  ('orderable', 'RB')]]

I tried to do this with the following code, but it does not work:
list1=[]
for i in data:
    list2 = []
    a = ['VBG', 'RB', 'NNP']
    for j in i:
        if all(i in j[1] for i in a):
            list2.append(j)
    list1.append(list2)
list1

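This returns only empty lists. Can anyone provide simple, easy-to-understand code that produces my expected output? Thanks.

2 Answers

Stack Overflow user

Accepted answer

Posted 2017-05-17 23:29:03

Your condition here: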
if all(i in j[1] for i in a):

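asks whether all of the tags are in j[1], and only then appends the item. But j[1] is a single tag, so at most one of them can be in it (given your data), which is why nothing is ever appended and you get empty lists. Instead, you want: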
In [32]: from operator import itemgetter
    ...: list1=[]
    ...: a = ['VBG', 'RB', 'NNP']
    ...: for sub in data:
    ...:     tags = set(map(itemgetter(1), sub))
    ...:     if all(s in tags for s in a):
    ...:         list1.append(sub)
    ...:

This checks whether all of the items in a are in the tags set built from each sublist.
In [33]: list1
Out[33]:
[[('User', 'NNP'),
  ('is', 'VBG'),
  ('not', 'RB'),
  ('able', 'JJ'),
  ('to', 'TO'),
  ('order', 'NN'),
  ('products', 'NNS'),
  ('from', 'IN'),
  ('iShopCatalog', 'NN'),
  ('Coala', 'NNP'),
  ('excluding', 'VBG'),
  ('articles', 'NNS'),
  ('from', 'IN'),
  ('VWR', 'NNP')],
 [('But', 'CC'),
  ('articles', 'NNP'),
  ('from', 'VBG'),
  ('all', 'RB'),
  ('other', 'JJ'),
  ('suppliers', 'NNS'),
  ('are', 'NNP'),
  ('not', 'VBG'),
  ('orderable', 'RB')]]
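
To make the difference concrete, here is a minimal sketch on a hypothetical two-sentence sample (the names `sample`, `required`, and `has_all_tags` are illustrative and do not come from the question). It first reproduces the per-tuple condition, which is a substring test against one tag string at a time, and then applies the per-sublist check.

```python
# Hypothetical miniature of the question's data: two tagged sentences.
sample = [
    [('User', 'NNP'), ('is', 'VBG'), ('not', 'RB'), ('able', 'JJ')],
    [('All', 'DT'), ('other', 'JJ'), ('users', 'NNS')],
]
required = ['VBG', 'RB', 'NNP']

# Per-tuple condition from the question: `tag in pair[1]` is a substring
# test against a single tag string, so it cannot hold for all three tags.
per_tuple_hits = [
    pair
    for sentence in sample
    for pair in sentence
    if all(tag in pair[1] for tag in required)
]
print(per_tuple_hits)  # []


def has_all_tags(sentence, required_tags):
    """Return True if every required tag occurs somewhere in the sentence."""
    present = {tag for _word, tag in sentence}
    return set(required_tags).issubset(present)


# Per-sublist condition: keep whole sentences that contain every tag.
kept = [sentence for sentence in sample if has_all_tags(sentence, required)]
print(kept)
```

With this sample only the first sentence is kept, since the second contains none of the required tags.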
Votes 3

Stack Overflow user

Posted 2017-05-17 23:33:02

This solution may look odd, but it works:
a = set(a)  # the required tags, as a set

def match(x):
    # x is one sublist of (word, tag) pairs; split it into words and tags
    words, tags = zip(*x)
    # True only if every required tag occurs among this sublist's tags
    return set(tags) & a == a

list(filter(match, data))
# [[('User', 'NNP'), ('is', 'VBG'), ('not', 'RB'), ('able', 'JJ'),
#   ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'),
#   ('iShopCatalog', 'NN'), ('Coala', 'NNP'), ('excluding', 'VBG'),
#   ('articles', 'NNS'), ('from', 'IN'), ('VWR', 'NNP')],
#  [('But', 'CC'), ('articles', 'NNP'), ('from', 'VBG'), ('all', 'RB'),
#   ('other', 'JJ'), ('suppliers', 'NNS'), ('are', 'NNP'), ('not', 'VBG'),
#   ('orderable', 'RB')]]
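
A possible variant, sketched here on a hypothetical sample rather than the full data, uses the subset operator `<=` inside a list comprehension. One reason to consider it: unpacking `words, tags = zip(*x)` raises a ValueError when a sublist is empty, whereas building the tag set with a comprehension simply yields an empty set. The names `required`, `sample`, and `filtered` are illustrative only.

```python
required = {'VBG', 'RB', 'NNP'}

# Hypothetical sample, including an empty sublist as an edge case.
sample = [
    [('But', 'CC'), ('articles', 'NNP'), ('from', 'VBG'), ('all', 'RB')],
    [],
    [('Please', 'NNP'), ('find', 'VB')],
]

# `required <= tags` is True when every required tag is in `tags`.
filtered = [
    sub for sub in sample
    if required <= {tag for _word, tag in sub}
]
print(filtered)
```

Here only the first sublist passes; the empty sublist and the one that lacks 'VBG' and 'RB' are dropped.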
Votes 2

The original content of this page was provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/44036229
