
How do I search between ranges in a list?

Stack Overflow user
Asked 2017-05-18 03:39:41
2 answers, 76 views, 0 followers, 0 votes

I want to find the POS tags that fall between two bounds, where the bounds are the index values of NNP tags.

data = [[('User', 'NNP'),
  ('is', 'VBG'),
  ('not', 'RB'),
  ('able', 'JJ'),
  ('to', 'TO'),
  ('order', 'NN'),
  ('products', 'NNS'),
  ('from', 'IN'),
  ('iShopCatalog', 'NN'),
  ('Coala', 'NNP'),
  ('excluding', 'VBG'),
  ('articles', 'NNS'),
  ('from', 'IN'),
  ('VWR', 'NNP')],
 [('Arfter', 'NNP'),
  ('transferring', 'VBG'),
  ('the', 'DT'),
  ('articles', 'NNS'),
  ('from', 'IN'),
  ('COALA', 'NNP'),
  ('to', 'TO'),
  ('SRM', 'VB'),
  ('the', 'DT'),
  ('Category', 'NNP'),
  ('S9901', 'NNP'),
  ('Dummy', 'NNP'),
  ('is', 'VBZ'),
  ('maintained', 'VBN')],
 [('Due', 'JJ'),
  ('to', 'TO'),
  ('this', 'DT'),
  ('the', 'DT'),
  ('user', 'NN'),
  ('is', 'VBZ'),
  ('not', 'RB'),
  ('able', 'JJ'),
  ('to', 'TO'),
  ('order', 'NN'),
  ('the', 'DT'),
  ('product', 'NN')],
 [('All', 'DT'),
  ('other', 'JJ'),
  ('users', 'NNS'),
  ('can', 'MD'),
  ('order', 'NN'),
  ('these', 'DT'),
  ('articles', 'NNS')],
 [('She', 'PRP'),
  ('can', 'MD'),
  ('order', 'NN'),
  ('other', 'JJ'),
  ('products', 'NNS'),
  ('from', 'IN'),
  ('a', 'DT'),
  ('POETcatalog', 'NNP'),
  ('without', 'IN'),
  ('any', 'DT'),
  ('problems', 'NNS')],
 [('Furtheremore', 'IN'),
  ('she', 'PRP'),
  ('is', 'VBZ'),
  ('able', 'JJ'),
  ('to', 'TO'),
  ('order', 'NN'),
  ('products', 'NNS'),
  ('from', 'IN'),
  ('the', 'DT'),
  ('Vendor', 'NNP'),
  ('VWR', 'NNP'),
  ('through', 'IN'),
  ('COALA', 'NNP')],
 [('But', 'CC'),
  ('articles', 'NNP'),
  ('from', 'VBG'),
  ('all', 'RB'),
  ('other', 'JJ'),
  ('suppliers', 'NNS'),
  ('are', 'NNP'),
  ('not', 'VBG'),
  ('orderable', 'RB')],
 [('I', 'PRP'),
  ('already', 'RB'),
  ('spoke', 'VBD'),
  ('to', 'TO'),
  ('anic', 'VB'),
  ('who', 'WP'),
  ('maintain', 'VBP'),
  ('the', 'DT'),
  ('catalog', 'NN'),
  ('COALA', 'NNP'),
  ('and', 'CC'),
  ('they', 'PRP'),
  ('said', 'VBD'),
  ('that', 'IN'),
  ('the', 'DT'),
  ('reason', 'NN'),
  ('should', 'MD'),
  ('be', 'VB'),
  ('the', 'DT'),
  ('assignment', 'NN'),
  ('of', 'IN'),
  ('the', 'DT'),
  ('plant', 'NN')],
 [('User', 'NNP'),
  ('is', 'VBZ'),
  ('a', 'DT'),
  ('assinged', 'JJ'),
  ('to', 'TO'),
  ('Universitaet', 'NNP'),
  ('Regensburg', 'NNP'),
  ('in', 'IN'),
  ('Scout', 'NNP'),
  ('but', 'CC'),
  ('in', 'IN'),
  ('P17', 'NNP'),
  ('table', 'NN'),
  ('YESRMCDMUSER01', 'NNP'),
  ('she', 'PRP'),
  ('is', 'VBZ'),
  ('assigned', 'VBN'),
  ('to', 'TO'),
  ('company', 'NN'),
  ('001500', 'CD'),
  ('Merck', 'NNP'),
  ('KGaA', 'NNP')],
 [('Please', 'NNP'),
  ('find', 'VB'),
  ('attached', 'JJ'),
  ('some', 'DT'),
  ('screenshots', 'NNS')]]

Here is my code:

list1 = []
list4 = []
for i in data:
    list2 = []
    list3 = []
    for l,j in enumerate(i):
        if j[1] == 'NNP':
            list2.append(l)
            list3.append(j[0])
    list1.append(list2)
    list4.append(list3)

Output:

list1:

[[0, 9, 13],
 [0, 5, 9, 10, 11],
 [],
 [],
 [7],
 [9, 10, 12],
 [1, 6],
 [9],
 [0, 5, 6, 8, 11, 13, 20, 21],
 [0]]

list4

[['User', 'Coala', 'VWR'],
 ['Arfter', 'COALA', 'Category', 'S9901', 'Dummy'],
 [],
 [],
 ['POETcatalog'],
 ['Vendor', 'VWR', 'COALA'],
 ['articles', 'are'],
 ['COALA'],
 ['User',
  'Universitaet',
  'Regensburg',
  'Scout',
  'P17',
  'YESRMCDMUSER01',
  'Merck',
  'KGaA'],
 ['Please']]

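From list1 and list4 I can get the strings and the indices of the NNP tags. What I want to know is, for each list, whether a VB, RB, or JJ tag occurs between NNP tags, using the NNP index values as the bounds.

For example, for the first list in data, how do I write code that checks whether a VB, RB, or JJ tag exists in the ranges (0-9) and (9-13)?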


2 Answers

Stack Overflow user

Accepted answer

Answered 2017-05-18 05:27:18

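Use a list comprehension, zipping list1[0] with an offset copy of itself to get the start and end index of each range.

The condition keeps only the ranges whose slice data[0][j:k] contains a matching tag.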

[[j, k] for j, k in zip(list1[0][:], list1[0][1:])
        if any(t[1] in ['VB', 'RB', 'JJ'] for t in data[0][j:k])]

Out[107]: [[0, 9]]
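
The comprehension above handles only the first sentence. As a minimal sketch of how it might be applied to every sentence, the same zip-and-slice idea can be wrapped in a function. The function name `nnp_ranges_with_tags` and the `sample` list are illustrative assumptions, not part of the original answer; `sample` holds the first two sentences copied from the question's data.

```python
TARGET_TAGS = {'VB', 'RB', 'JJ'}

def nnp_ranges_with_tags(sentence, targets=TARGET_TAGS):
    """Return [start, end] pairs of consecutive NNP indices whose slice
    sentence[start:end] contains at least one tag from targets."""
    # Positions of the NNP tokens (what list1 holds for this sentence)
    nnp_idx = [i for i, (_, tag) in enumerate(sentence) if tag == 'NNP']
    # Pair each NNP index with the next one, then test the slice between them
    return [[j, k] for j, k in zip(nnp_idx, nnp_idx[1:])
            if any(tag in targets for _, tag in sentence[j:k])]

# Hypothetical sample: the first two sentences from the question's data
sample = [
    [('User', 'NNP'), ('is', 'VBG'), ('not', 'RB'), ('able', 'JJ'),
     ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'),
     ('iShopCatalog', 'NN'), ('Coala', 'NNP'), ('excluding', 'VBG'),
     ('articles', 'NNS'), ('from', 'IN'), ('VWR', 'NNP')],
    [('Arfter', 'NNP'), ('transferring', 'VBG'), ('the', 'DT'),
     ('articles', 'NNS'), ('from', 'IN'), ('COALA', 'NNP'), ('to', 'TO'),
     ('SRM', 'VB'), ('the', 'DT'), ('Category', 'NNP'), ('S9901', 'NNP'),
     ('Dummy', 'NNP'), ('is', 'VBZ'), ('maintained', 'VBN')],
]

result = [nnp_ranges_with_tags(s) for s in sample]
print(result)  # [[[0, 9]], [[5, 9]]]
```

A sentence with fewer than two NNP tags yields an empty list, because zip has no pairs to produce. The tag test is an exact match, so 'VBG' or 'VBZ' does not count as 'VB'.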
Votes: 1

Stack Overflow user

Answered 2017-05-18 04:42:19

Assuming I understood your question correctly, the following should work:

search_list = ['VB', 'RB', 'JJ']
for index, indices in enumerate(list1):  # renamed from `set` to avoid shadowing the builtin
    temp = indices[::-1]  # reversed copy, so pop() returns the indices in their original order
    while len(temp) > 1:
        first = temp.pop()  # removes the last item (the first item of indices); this drives the while loop
        second = temp[-1]  # peeks at the next item (the new last item)
        for i in range(first, second + 1):  # check every index from first through second
            if data[index][i][1] in search_list:  # data[index] is the sentence list1[index] was built from
                print(index, i, data[index][i])  # stand-in for do_stuff(); replace with your own handling

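Basically:

  1. Use enumerate in the outer for loop to keep an index that runs parallel to the original data.
  2. Make a copy of each list in list1 to work on. I made a reversed copy because I personally don't like calling pop() with an index, so when I want to pop the first item of a list over and over, I reverse the list first. You could instead make a regular copy and use list.pop(0) to remove and return the first item.
  3. Pop the last (originally first) item off the list and take a reference to the next one.
  4. Use those two items to build a range of indices into data, and check the items in that range.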
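
The steps above can also be sketched with the regular copy and list.pop(0) that step 2 mentions. This is only an illustration under a few assumptions: the function name `find_hits` and the small `data` and `nnp_indices` samples are made up for the example (the sentences are the first and fifth from the question), matches are collected into a list instead of calling a handler, and the inner range skips the two NNP endpoints, whereas the answer's loop includes them.

```python
SEARCH_TAGS = ['VB', 'RB', 'JJ']

def find_hits(data, nnp_indices, search_tags=SEARCH_TAGS):
    """Collect (sentence_no, position, word, tag) for every token with a tag
    in search_tags that lies strictly between two consecutive NNP indices."""
    hits = []
    # Step 1: enumerate keeps sentence_no parallel to data
    for sentence_no, indices in enumerate(nnp_indices):
        pending = list(indices)        # Step 2: plain forward copy
        while len(pending) > 1:
            first = pending.pop(0)     # Step 3: remove and return the first item
            second = pending[0]        # ... and peek at the next one
            # Step 4: inspect the tokens between the two NNP positions
            for pos in range(first + 1, second):
                word, tag = data[sentence_no][pos]
                if tag in search_tags:
                    hits.append((sentence_no, pos, word, tag))
    return hits

# Hypothetical sample: the first and fifth sentences from the question
data = [
    [('User', 'NNP'), ('is', 'VBG'), ('not', 'RB'), ('able', 'JJ'),
     ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'),
     ('iShopCatalog', 'NN'), ('Coala', 'NNP'), ('excluding', 'VBG'),
     ('articles', 'NNS'), ('from', 'IN'), ('VWR', 'NNP')],
    [('She', 'PRP'), ('can', 'MD'), ('order', 'NN'), ('other', 'JJ'),
     ('products', 'NNS'), ('from', 'IN'), ('a', 'DT'),
     ('POETcatalog', 'NNP'), ('without', 'IN'), ('any', 'DT'),
     ('problems', 'NNS')],
]
nnp_indices = [[0, 9, 13], [7]]   # the matching rows of list1

hits = find_hits(data, nnp_indices)
print(hits)  # [(0, 2, 'not', 'RB'), (0, 3, 'able', 'JJ')]
```

The second sentence contains a JJ tag but only one NNP, so it has no range to search and contributes nothing. list.pop(0) shifts the remaining items on every call, which is negligible for index lists this short.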
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/44038192
