I'm trying to pull the couplets out of a set of lyrics. Let's say the lyrics are:

I saw a little hermit crab
His coloring was oh so drab

It’s hard to see the butterfly
Because he flies across the sky

etc etc...

Once upon a time
She made a little rhyme
Of course, of course

Before we say again
The pain the pain
A horse, a horse

Lightening, thunder, all around
Soon the rain falls on the ground

I tire of writing poems and rhyme

They are stored in the db as a string, separated by u'\r\n', and split with string.splitlines(True); the object stores them like this:
>>> lyrics[6].track_lyrics['lyrics']
[u'I saw a little hermit crab\r\n', u'His coloring was oh so drab\r\n', u'\r\n', u'It\u2019s hard to see the butterfly\r\n', u'Because he flies across the sky\r\n', u'\r\n', u'\r\n', u'Before we say again\r\n', u'The pain the pain\r\n', u'A horse, a horse\r\n', u'\r\n', u'Lightening, thunder, all around\r\n', u'Soon the rain falls on the ground\r\n', u'\r\n', u'I tire of writing poems and rhyme\r\n']
I can get close with:
len([i for i in lyrics if i != "\r\n"]) / 2
But that also counts stanzas of one, three, or more lines as couplets.
This is how I'm doing it now. It basically says that if there is a "\r\n" one line before and another two lines after, we have a couplet:
>>> for k,v in enumerate(lyric_list):
...     if lyric_list[k+2] == "\r\n" and lyric_list[k-1] == "\r\n":
...         print(v)
...
It’s hard to see the butterfly
Hear the honking of the goose
Lightening, thunder, all around
But, of course:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IndexError: list index out of range
I could use try and except IndexError, or something like this:
>>> if len(lyric_string) > 1:
...     for k, v in enumerate(lyric_string):
...         if k == 0 and lyric_string[k+2] == "\r\n":
...             print(v)
...         elif lyric_string[k-1] == "\r\n" and lyric_string[k+2] == "\r\n":
...             print(v)
...
I saw a little hermit crab
It’s hard to see the butterfly
Hear the honking of the goose
His red sports car is just a dream
The children like the ocean shore
I made the cookies one by one
My cat, she likes to chase a mouse,
Lightening, thunder, all around
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
IndexError: list index out of range
I thought about doing something like this, which is even uglier and doesn't work! (It only gets the first and last lines):
>>> if len(lyric_string) > 1:
...     for k, v in enumerate(lyric_string):
...         if k == 0 and lyric_string[k+2] == "\r\n":
...             print(v)
...         elif lyric_string[k-1] == "\r\n" and (k+2 > len(lyric_string) \
...             or lyric_string[k+2] == "\r\b"):
...             print(v)
But I bet there's a more elegant, even Pythonic, way.
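Posted on 2015-01-13 07:02:48
I'm assuming a couplet is a group of lines that contains exactly 2 lines.
You can do this by splitting the text into blocks and then counting the number of lines in each block. In this example I count the number of line breaks inside each block (which should be 1 for a couplet).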
>>> text = """I saw a little hermit crab
... His coloring was oh so drab
...
... It’s hard to see the butterfly
... Because he flies across the sky
...
... etc etc...
...
... Once upon a time
... She made a little rhyme
... Of course, of course
...
... Before we say again
... The pain the pain
... A horse, a horse
...
... Lightening, thunder, all around
... Soon the rain falls on the ground
...
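... I tire of writing poems and rhyme
... """.replace('\n', '\r\n')
>>> len([block for block in text.strip().split('\r\n\r\n') if block.count('\r\n') == 1])
3
This also assumes there are exactly two newlines between each block; the strip() keeps the trailing line break from making the final single line look like a couplet. To handle two or more newlines, you can use: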
import re
...
.. block for block in re.split(r'(?:\r\n){2,}', text) ..
Posted on 2015-01-13 06:52:21
A simpler approach: join the whole array with "" and count how often a double newline (a blank line) occurs.
>>> s = """Once upon a time
... She made a little rhyme
... Of course, of course
...
... Before we say again
... The pain the pain
... A horse, a horse
...
... Lightening, thunder, all around
... Soon the rain falls on the ground
...
... I tire of writing poems and rhyme"""
Then just do:
>>> s.strip().count("\n\n") + 1
4
To get s in the code above, you need one extra join. For example:
s = "".join(lyrics[6].track_lyrics['lyrics'])
I'm using \n on my system; you may have to use \r\n on yours.
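If you would rather work on the list as it is stored (the splitlines(True) output) than on a joined string, here is a minimal sketch using itertools.groupby. It is a different technique from the string counting above, not something taken from either answer. The function name find_couplets and the shortened sample list are made up for illustration, and it assumes stanzas are separated by one or more blank lines.

```python
from itertools import groupby


def find_couplets(lines):
    """Return each stanza of exactly two lines as a (first, second) tuple.

    `lines` is assumed to look like the output of splitlines(True):
    one string per line, line endings kept, blank lines between stanzas.
    """
    couplets = []
    # groupby() batches consecutive lines by whether they hold any text,
    # so a run of several blank lines acts as a single separator.
    for has_text, group in groupby(lines, key=lambda line: bool(line.strip())):
        stanza = [line.rstrip("\r\n") for line in group]
        if has_text and len(stanza) == 2:
            couplets.append(tuple(stanza))
    return couplets


# Hypothetical, shortened sample in the same shape as the stored list.
sample = [
    u"I saw a little hermit crab\r\n", u"His coloring was oh so drab\r\n",
    u"\r\n",
    u"Before we say again\r\n", u"The pain the pain\r\n", u"A horse, a horse\r\n",
    u"\r\n", u"\r\n",
    u"Lightening, thunder, all around\r\n", u"Soon the rain falls on the ground\r\n",
    u"\r\n",
    u"I tire of writing poems and rhyme\r\n",
]

couplets = find_couplets(sample)
print(len(couplets))          # number of two-line stanzas
for first, _second in couplets:
    print(first)              # first line of each couplet
```

Because nothing is indexed with k+2 or k-1, there is no IndexError to guard against at the start or end of the list, and stanzas of one or three lines are simply skipped.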
https://stackoverflow.com/questions/27916228