
python diff SequenceMatcher - patching a list

Stack Overflow user
Asked on 2012-06-28 23:17:10
Answers: 2, Views: 638, Followers: 0, Votes: 3

I am patching a list so that it ends up looking like another list:

from difflib import SequenceMatcher

a = [x for x in "qabxcd"]
b = [x for x in "abycdf"]
c = a[:]
s = SequenceMatcher(None, a, b)
for tag, i1, i2, j1, j2 in s.get_opcodes():
    print("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
          (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
    if tag == "delete":
        del c[i1:i2]
    elif tag == "replace":
        c[i1:i2] = b[j1-1:j2-1]
    elif tag == "insert":
        c[i1:i2] = b[j1:j2]
print(c)
print(b)
print(c == b)
a == b

But the lists are not equal:

 delete a[0:1] (['q']) b[0:0] ([])
  equal a[1:3] (['a', 'b']) b[0:2] (['a', 'b'])
replace a[3:4] (['x']) b[2:3] (['y'])
  equal a[4:6] (['c', 'd']) b[3:5] (['c', 'd'])
 insert a[6:6] ([]) b[5:6] (['f'])
['a', 'b', 'x', 'b', 'd', 'f']
['a', 'b', 'y', 'c', 'd', 'f']
False

What is going wrong?


2 Answers

Stack Overflow user

Accepted answer

Posted on 2012-06-29 15:58:56

Every operation shifts the indices. When I did this, I had to keep a running count of the changes:

from difflib import SequenceMatcher

a = [x for x in "abyffgh fg99"]
b = [x for x in "999aby99ff9h9"]
c = a[:]

s = SequenceMatcher(None, a, b)

i = 0  # running offset between an index in a and the same spot in c
for tag, i1, i2, j1, j2 in s.get_opcodes():
    print("%7s a[%d:%d] (%s) b[%d:%d] (%s) c[%d:%d] (%s)" %
          (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2], i1, i2, c[i1 + i:i2 + i]))
    if tag == "delete":
        del c[i1 + i:i2 + i]
        i -= i2 - i1  # c got shorter by the deleted length
    elif tag == "replace":
        c[i1 + i:i2 + i] = b[j1:j2]
        i -= i2 - i1 - j2 + j1  # old length minus new length
    elif tag == "insert":
        c[i1 + i:i2 + i] = b[j1:j2]
        i += j2 - j1  # c got longer by the inserted length
    print(c)
    print(i)
print(c)
print(b)
print(c == b)
a == b

Output:

['9', '9', '9', 'a', 'b', 'y', '9', '9', 'f', 'f', '9', 'h', ' ', 'f', 'g', '9', '9']
5
 delete a[7:10] ([' ', 'f', 'g']) b[12:12] ([]) c[7:10] ([' ', 'f', 'g'])
['9', '9', '9', 'a', 'b', 'y', '9', '9', 'f', 'f', '9', 'h', '9', '9']
1
  equal a[10:11] (['9']) b[12:13] (['9']) c[10:11] (['h'])
['9', '9', '9', 'a', 'b', 'y', '9', '9', 'f', 'f', '9', 'h', '9', '9']
1
 delete a[11:12] (['9']) b[13:13] ([]) c[11:12] (['9'])
['9', '9', '9', 'a', 'b', 'y', '9', '9', 'f', 'f', '9', 'h', '9']
-1
['9', '9', '9', 'a', 'b', 'y', '9', '9', 'f', 'f', '9', 'h', '9']
['9', '9', '9', 'a', 'b', 'y', '9', '9', 'f', 'f', '9', 'h', '9']
True
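
The three branches in the loop above all move the offset by the same amount: the length of the new slice minus the length of the slice it replaces. A minimal sketch built on that observation, written for Python 3; the name patch_with_offset is a hypothetical helper, not part of difflib:

```python
from difflib import SequenceMatcher


def patch_with_offset(a, b):
    """Return a copy of a, edited step by step until it equals b."""
    c = list(a)
    offset = 0  # how far positions in c have drifted from positions in a
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            continue
        # delete, replace and insert all mean "swap a[i1:i2] for b[j1:j2]";
        # for delete the right-hand slice is empty, for insert the left one is.
        c[i1 + offset:i2 + offset] = b[j1:j2]
        offset += (j2 - j1) - (i2 - i1)
    return c


result = patch_with_offset(list("qabxcd"), list("abycdf"))
print(result == list("abycdf"))
```

Slice assignment handles all three non-equal tags, so the separate del branch is not strictly needed.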
Votes: 2

Stack Overflow user

Posted on 2012-06-28 23:57:15

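I think I see why: the 5-tuples returned by s.get_opcodes() are valid against the initial state of the container, so if your object changes, they have to be adjusted. That is what happens with the delete operation, which shifts the indices (and is why 'x' never becomes 'y').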
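
Because the opcodes index into the original a and b, one way to avoid any adjustment is to leave both lists untouched and assemble the result in a fresh list. A small sketch for Python 3, where rebuild is a hypothetical helper name:

```python
from difflib import SequenceMatcher


def rebuild(a, b):
    """Assemble b from the opcodes without mutating a or b."""
    out = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])   # unchanged run, taken from a
        elif tag in ("replace", "insert"):
            out.extend(b[j1:j2])   # new material, taken from b
        # "delete" contributes nothing to the result
    return out


rebuilt = rebuild(list("qabxcd"), list("abycdf"))
print(rebuilt == list("abycdf"))
```

Nothing is modified while iterating, so every index stays valid for the whole loop.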

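As far as I can tell, delete is the only operation in this example that shifts the indices of later items, so I replace the deleted items with a marker (I use '#') and strip the markers out at the end: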

>>> c = a[:]
>>> for tag, i1, i2, j1, j2 in s.get_opcodes():
...     print("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
...           (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
...     if tag == "delete":
...         # Overwrite instead of deleting so the list keeps its length.
...         c[i1:i2] = ['#'] * (i2 - i1)
...     elif tag == "replace":
...         c[i1:i2] = b[j1:j2]
...     elif tag == "insert":
...         c[i1:i1] = b[j1:j2]
...     print(c)
...
 delete a[0:1] (['q']) b[0:0] ([])
['#', 'a', 'b', 'x', 'c', 'd']
  equal a[1:3] (['a', 'b']) b[0:2] (['a', 'b'])
['#', 'a', 'b', 'x', 'c', 'd']
replace a[3:4] (['x']) b[2:3] (['y'])
['#', 'a', 'b', 'y', 'c', 'd']
  equal a[4:6] (['c', 'd']) b[3:5] (['c', 'd'])
['#', 'a', 'b', 'y', 'c', 'd']
 insert a[6:6] ([]) b[5:6] (['f'])
['#', 'a', 'b', 'y', 'c', 'd', 'f']
>>> c = [i for i in c if i != '#']
>>> c
['a', 'b', 'y', 'c', 'd', 'f']
>>> 
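
The placeholder approach relies on '#' never appearing in the data and on each replace swapping slices of equal length. A possible alternative that avoids both assumptions is to apply the opcodes from last to first: an edit at a given position only moves the items after it, and those have already been handled. A sketch for Python 3, with patch_in_reverse as a hypothetical helper name:

```python
from difflib import SequenceMatcher


def patch_in_reverse(a, b):
    """Edit a copy of a in place, walking the opcodes back to front."""
    c = list(a)
    opcodes = SequenceMatcher(None, a, b).get_opcodes()
    for tag, i1, i2, j1, j2 in reversed(opcodes):
        if tag != "equal":
            # Everything before i1 is still in its original position,
            # so the opcode's own indices can be used unchanged.
            c[i1:i2] = b[j1:j2]
    return c


patched = patch_in_reverse(list("qabxcd"), list("abycdf"))
print(patched == list("abycdf"))
```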
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/11247713
