Counting couplets from a list with \r\n line breaks

Stack Overflow user
Asked 2015-01-13 06:37:08
Answers: 2, Views: 213, Followers: 0, Score: 1

I'm trying to get a count of couplets from a set of lyrics. Let's say the lyrics are:

I saw a little hermit crab
His coloring was oh so drab

It’s hard to see the butterfly
Because he flies across the sky

etc etc...

Once upon a time
She made a little rhyme
Of course, of course

Before we say again
The pain the pain
A horse, a horse

Lightening, thunder, all around
Soon the rain falls on the ground

I tire of writing poems and rhyme

They're stored in the db as a single string, separated by u'\r\n', and split up via string.splitlines(True); the object stores them like this:

>>> lyrics[6].track_lyrics['lyrics']
[u'I saw a little hermit crab\r\n', u'His coloring was oh so drab\r\n', u'\r\n', u'It\u2019s hard to see the butterfly\r\n', u'Because he flies across the sky\r\n', u'\r\n',  u'\r\n', u'Before we say again\r\n', u'The pain the pain\r\n', u'A horse, a horse\r\n', u'\r\n', u'Lightening, thunder, all around\r\n', u'Soon the rain falls on the ground\r\n', u'\r\n', u'I tire of writing poems and rhyme\r\n']

I can get close with:

len([i for i in lyrics if i != "\r\n"]) / 2

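But that also counts stanzas of one, three, or more lines as couplets.

This is what I did; it basically says that if there is a "\r\n" one line before and another two lines after, we're at a couplet: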

>>> for k,v in enumerate(lyric_list):
...     if lyric_list[k+2] == "\r\n" and lyric_list[k-1] == "\r\n":
...             print(v)
... 
It’s hard to see the butterfly

Hear the honking of the goose


Lightening, thunder, all around

But, of course:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IndexError: list index out of range

I could use try and except IndexError: with something like this:

>>> if len(lyric_string) > 1:
...     for k, v in enumerate(lyric_string):
...             if k == 0 and lyric_string[k+2] == "\r\n":
...                     print(v)
...             elif lyric_string[k-1] == "\r\n" and lyric_string[k+2] == "\r\n":
...                     print(v)
... 
I saw a little hermit crab

It’s hard to see the butterfly

Hear the honking of the goose

His red sports car is just a dream

The children like the ocean shore

I made the cookies one by one

My cat, she likes to chase a mouse,

Lightening, thunder, all around

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
IndexError: list index out of range

I thought about doing something like this, which is even uglier and doesn't work! (It only gets the first and last lines):

>>> if len(lyric_string) > 1:
...     for k, v in enumerate(lyric_string):
...             if k == 0 and lyric_string[k+2] == "\r\n":
...                     print(v)
...             elif lyric_string[k-1] == "\r\n" and (k+2 > len(lyric_string) \
...                                                     or lyric_string[k+2] == "\r\b"):
...                     print(v)

But I bet there's a much more elegant, even Pythonic, way to do this.
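One way to keep the neighbour check from the attempts above without hitting the IndexError is to treat every index outside the list as a blank line. The following is only a minimal sketch, not the asker's code: `couplet_first_lines` and `is_blank` are hypothetical helper names, and the list is assumed to have the shape produced by splitlines(True).

```python
def couplet_first_lines(lines):
    """Return the first line of every two-line stanza in `lines`."""

    def is_blank(i):
        # Positions before the start or past the end count as blank,
        # so the first and last stanzas need no special-casing.
        return i < 0 or i >= len(lines) or lines[i].strip() == ""

    return [
        line.rstrip("\r\n")
        for k, line in enumerate(lines)
        if is_blank(k - 1)        # a stanza starts at k
        and not is_blank(k)       # first line of the stanza
        and not is_blank(k + 1)   # second line of the stanza
        and is_blank(k + 2)       # the stanza ends after two lines
    ]


lyric_list = [
    "I saw a little hermit crab\r\n",
    "His coloring was oh so drab\r\n",
    "\r\n",
    "Once upon a time\r\n",
    "She made a little rhyme\r\n",
    "Of course, of course\r\n",
    "\r\n",
    "Lightening, thunder, all around\r\n",
    "Soon the rain falls on the ground\r\n",
    "\r\n",
    "I tire of writing poems and rhyme\r\n",
]
firsts = couplet_first_lines(lyric_list)
print(len(firsts))
```

The length of the returned list is the couplet count; the three-line stanza and the trailing single line are skipped because one of the four conditions fails for each of their lines.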

2 Answers

Stack Overflow user

Accepted answer

Answered 2015-01-13 07:02:48

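I'm assuming a couplet is a group of exactly 2 lines.

You can do this by splitting the text into blocks and then counting the number of lines in each block. In this example I count the number of line breaks inside each block (there should be exactly 1 in a couplet).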

>>> text = """I saw a little hermit crab
... His coloring was oh so drab
... 
... It’s hard to see the butterfly
... Because he flies across the sky
... 
... etc etc...
... 
... Once upon a time
... She made a little rhyme
... Of course, of course
... 
... Before we say again
... The pain the pain
... A horse, a horse
... 
... Lightening, thunder, all around
... Soon the rain falls on the ground
... 
... I tire of writing poems and rhyme
... """.replace('\n', '\r\n')
>>> # strip() drops the trailing line break so the final one-line block is not miscounted
>>> len([block for block in text.strip().split('\r\n\r\n') if block.count('\r\n') == 1])
3

This also assumes there are exactly two line breaks between blocks. To handle two or more line breaks, you can use:

import re
# Split on any run of two or more consecutive \r\n sequences
len([block for block in re.split(r'(?:\r\n){2,}', text.strip()) if block.count('\r\n') == 1])
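Applied to the list stored in the question rather than to a literal string, the same idea could look roughly like the sketch below. `count_couplets` is a hypothetical name and the sample list is made up for illustration; it assumes every element keeps its "\r\n" ending, as splitlines(True) produces.

```python
import re


def count_couplets(lines):
    """Count two-line stanzas in a list of lines that keep their CRLF endings."""
    # Rebuild the original string; strip() removes leading/trailing line
    # breaks that would otherwise end up inside the first or last block.
    text = "".join(lines).strip()
    # Two or more consecutive \r\n sequences mark a stanza boundary.
    blocks = re.split(r"(?:\r\n){2,}", text)
    # A two-line block contains exactly one internal line break.
    return sum(1 for block in blocks if block.count("\r\n") == 1)


lyrics = [
    "I saw a little hermit crab\r\n", "His coloring was oh so drab\r\n",
    "\r\n", "\r\n",
    "Before we say again\r\n", "The pain the pain\r\n", "A horse, a horse\r\n",
    "\r\n",
    "Lightening, thunder, all around\r\n", "Soon the rain falls on the ground\r\n",
    "\r\n",
    "I tire of writing poems and rhyme\r\n",
]
print(count_couplets(lyrics))
```

The doubled blank entry in the sample is deliberate: it is the case that the plain split on '\r\n\r\n' would mishandle and the regex version tolerates.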
Score: 1

Stack Overflow user

Answered 2015-01-13 06:52:21

A somewhat simpler approach: join the whole array with "" and count the occurrences of consecutive newlines.

>>> s = """Once upon a time
... She made a little rhyme
... Of course, of course
...
... Before we say again
... The pain the pain
... A horse, a horse
...
... Lightening, thunder, all around
... Soon the rain falls on the ground
...
... I tire of writing poems and rhyme"""

Then just do:

>>> s.strip().count("\n\n") + 1
4

To get s in the code above, you need to do an extra join. An example:

s = "".join(lyrics[6].track_lyrics['lyrics'])

I'm using \n on my system; you may have to use \r\n on yours.
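If it is unclear which line ending the data uses, one option is to normalise the endings before counting. The sketch below is an illustration under assumptions: `count_stanzas` is a hypothetical name, and `re.split` is swapped in for `count` so that a run of several blank lines is treated as a single separator. Like the expression above, it counts every blank-line-separated block regardless of its length, not only the two-line ones.

```python
import re


def count_stanzas(lines):
    """Count blank-line-separated blocks in a list of lines with their endings."""
    # Join the stored lines, convert Windows line endings to "\n",
    # and drop leading/trailing whitespace.
    s = "".join(lines).replace("\r\n", "\n").strip()
    if not s:
        return 0
    # Any run of two or more newlines separates two stanzas.
    return len(re.split(r"\n{2,}", s))


stored = [
    "Once upon a time\r\n", "She made a little rhyme\r\n",
    "\r\n", "\r\n",
    "Lightening, thunder, all around\r\n", "Soon the rain falls on the ground\r\n",
    "\r\n",
    "I tire of writing poems and rhyme\r\n",
]
print(count_stanzas(stored))
```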

Score: 1

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/27916228
