我有相当简单的Python3代码,它可以从字符串中创建DFA。
它很好用,但我相信它可以更有效率。总共有3个循环:
据我所信,只有在找到前缀字符串时,我才能从第三个循环中挣脱出来。
我的评论很好地解释了我在代码中所做的事情。
def create_from_text(self, text):
"""
Create/edit the DFA from the provided text
"""
# We start by editing the first state to accept the first character in
# the text provided. After that we check the next character and create
# a new state that goes to the 2nd character. Here we check if a
# character that does not go to the 2nd character will create a prefix
# of the text in combination with an any number of characters preceeding
# it that . e.g. (oopoops)
# oo -> o We want a `p` but got `o`.
# But we check if, starting with 2nd character, `o`,
# any combination after that, appended with the bad
# character creates a prefix of the text. The first
# found will always be the best match.
# `oo` is a prefix of `oopoops` so we can start back
# at index 2 since `oo` is of length 2. We mark the
# state we are on with that state at that character.
self.states[0].transitions[text[0]] = 1
for i in range(len(text)):
if i == 0:
continue
new_state = State(i)
for j in self.alphabet:
if j == text[i]:
# Easiest part, just set the transition to the next state
new_state.addTransition(i + 1, j)
else:
# Now do that check we spoke of earlier
found = False
for k in range(len(text)):
if k == 0:
continue
try:
if text.index(text[k:i] + j) == 0 and k <= i:
new_state.addTransition(len(text[k:i] + j), j)
found = True
break
except ValueError:
pass
if not found:
new_state.addTransition(0, j)
self.states.append(new_state)
# Make every transition in the last state point to itself
last_state = State(len(self.states))
for j in self.alphabet:
last_state.addTransition(len(self.states), j)
self.states.append(last_state)字符串oopoops的示例输出:
Number of states: 1
Accepting states: 0
Alphabet: abcdefghijklmnopqrstuvwxyz
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 3 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 6 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 7 0 0 0 0 0 0 0
7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 2048个字符的运行时间大约为5分钟。
发布于 2017-11-13 10:58:29
您的方法建议,正在使用给定的文本创建相应类的实例。因此,您应该使它成为一个classmethod:
@classmethod
def create_from_text(cls, text):
…您的多行注释描述了底层算法。您应该在方法的docstring中总结它,并将完整的描述移到相应的文档文档中。
这是无用的:
# Easiest part, just set the transition to the next state# Now do that check we spoke of earlier虽然这可能有帮助:
# Make every transition in the last state point to itself虽然i in for i in range(len(text))是非常不言自明的,但for j in self.alphabet:中的j却不是。
j是什么?另一个指数?一个字?一个角色?
考虑将j分别重命名为index、word或char。
除了self.addTransition之外,您始终遵循它。
字符串上的
而不是
for i in range(len(text)):
…
if j == text[i]:使用枚举:
for index, char in enumerate(text):
…
if j == char:https://codereview.stackexchange.com/questions/180240
复制相似问题