
Line breaks are not being replaced

Stack Overflow user
Asked 2016-10-16 15:19:48
Answers: 2 | Views: 206 | Followers: 0 | Votes: 0

I have a bot that returns HTML like this:

<div class="std">
  <p>CAR:
    <span>Onix</span>
  </p>
  <p>MODEL: LTZ</p>
  <p>
    <span>COLOR:
    <span>Black</span>
  </p>
  <p>ACESSORIES:
    <span>ABS</span>
  </p>
  <p>
    <span>DESCRIPTION:</span>
    <span>The Chevrolet Onix is a subcompact car launched by American automaker Chevrolet in Brazil at the 2012 São Paulo International Motor Show[1] to succeed some versions of Chevrolet Celta. Offered initially as a five-door hatchback, a four-door sedan was launched in 2013 and called the Chevrolet Prisma.[2] The Onix is currently only sold in some South American countries part of Mercosur, including Brazil, Argentina, Colombia, Paraguay and Uruguay.</span>
  </p>
  <p>TECHNICAL DETAIL:
    <span>The Onix is available in three trim levels (LS, LT and LTZ) with two 4-cylinder engines, the 1.0-litre producing 78 PS (57 kW; 77 bhp) (petrol)/ 80 PS (59 kW; 79 bhp) (ethanol) and 1.4-litre 98 PS (72 kW; 97 bhp) (petrol)/106 PS (78 kW; 105 bhp) (ethanol) offering automatic or five-speed manual transmission..</span>
  </p>
</div>

I applied the following code to strip the HTML tags:

cleanr    = re.compile('<.*?>')
cleantext = re.sub(cleanr,'\n', html_code).strip()

It gave me this:

CAR: Onix


MODEL: LTZ


COLOR:
Black



ACESSORIES:
ABS



DESCRIPTION:


The Chevrolet Onix is a subcompact car launched by American automaker Chevrolet in Brazil at the 2012 São Paulo International Motor Show[1] to succeed some versions of Chevrolet Celta. Offered initially as a five-door hatchback, a four-door sedan was launched in 2013 and called the Chevrolet Prisma.[2] The Onix is currently only sold in some South American countries part of Mercosur, including Brazil, Argentina, Colombia, Paraguay and Uruguay.



TECHNICAL DETAIL:
The Onix is available in three trim levels (LS, LT and LTZ) with two 4-cylinder engines, the 1.0-litre producing 78 PS (57 kW; 77 bhp) (petrol)/ 80 PS (59 kW; 79 bhp) (ethanol) and 1.4-litre 98 PS (72 kW; 97 bhp) (petrol)/106 PS (78 kW; 105 bhp) (ethanol) offering automatic or five-speed manual transmission..

Now I need to remove the line breaks so that the content looks like this:

CAR: Onix
MODEL: LTZ
COLOR: Black
ACESSORIES: ABS
DESCRIPTION: The Chevrolet Onix is a subcompact car launched by American automaker Chevrolet in Brazil at the 2012 São Paulo International Motor Show[1] to succeed some versions of Chevrolet Celta. Offered initially as a five-door hatchback, a four-door sedan was launched in 2013 and called the Chevrolet Prisma.[2] The Onix is currently only sold in some South American countries part of Mercosur, including Brazil, Argentina, Colombia, Paraguay and Uruguay.
TECHNICAL DETAIL: The Onix is available in three trim levels (LS, LT and LTZ) with two 4-cylinder engines, the 1.0-litre producing 78 PS (57 kW; 77 bhp) (petrol)/ 80 PS (59 kW; 79 bhp) (ethanol) and 1.4-litre 98 PS (72 kW; 97 bhp) (petrol)/106 PS (78 kW; 105 bhp) (ethanol) offering automatic or five-speed manual transmission..

I tried the code below, but it does not match the line breaks:

cleantext = re.sub(r':\s*[\r\n]*', ': ', cleantext)

I also tried this:

cleantext = cleantext.replace(': \n', ': ')

That did not work either. How should I handle this?


2 Answers

Stack Overflow user

Accepted answer

Posted 2016-10-16 15:47:43

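I think your problem has two parts. The first is to join a value that is split across two lines, turning

COLOR:
Black

into

COLOR: Black

The second is to remove all the empty lines.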

For the first part, you can replace your re.sub call with the following:

cleantext = re.sub(r'(.*):\s*[\r\n](.*)', r'\g<1>: \g<2>', cleantext)

Removing the empty lines with re.sub would be awkward, so for the second part I suggest filtering the lines instead:

cleantext = "\n".join([line for line in cleantext.split('\n') if line.strip() != ''])

This will give you the output you expect.
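Putting the two steps together, here is a minimal sketch. The function name `join_label_lines` and the shortened sample text are assumptions made for illustration; they are not part of the original answer.

```python
import re


def join_label_lines(text):
    """Join 'LABEL:' lines with the value that follows, then drop blank lines.

    Hypothetical helper name; it only wraps the two steps described above.
    """
    # Step 1: a line ending in ':' (plus optional whitespace and newlines)
    # is merged with the next non-empty line, separated by a single space.
    joined = re.sub(r'(.*):\s*[\r\n](.*)', r'\g<1>: \g<2>', text)
    # Step 2: keep only lines that contain something other than whitespace.
    return "\n".join(line for line in joined.split("\n") if line.strip() != "")


# Shortened sample in the shape of the question's tag-stripped text.
sample = (
    "CAR: Onix\n\n\nMODEL: LTZ\n\n\nCOLOR:\nBlack\n\n\n\n"
    "DESCRIPTION:\n\n\nThe Chevrolet Onix is a subcompact car.\n"
)
print(join_label_lines(sample))
```

The replacement string is written as a raw string so that `\g<1>` reaches `re.sub` unchanged and recent Python versions do not warn about an invalid escape sequence.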

Votes: 1

Stack Overflow user

Posted 2016-10-16 15:42:24

I think this should work for you:

>>> string = """
CAR: Onix


MODEL: LTZ


COLOR:
Black



ACESSORIES:
ABS



DESCRIPTION:


The Chevrolet Onix is a subcompact car launched by American automaker Chevrolet in Brazil at the 2012 São Paulo International Motor Show[1] to succeed some versions of Chevrolet Celta. Offered initially as a five-door hatchback, a four-door sedan was launched in 2013 and called the Chevrolet Prisma.[2] The Onix is currently only sold in some South American countries part of Mercosur, including Brazil, Argentina, Colombia, Paraguay and Uruguay.



TECHNICAL DETAIL:
The Onix is available in three trim levels (LS, LT and LTZ) with two 4-cylinder engines, the 1.0-litre producing 78 PS (57 kW; 77 bhp) (petrol)/ 80 PS (59 kW; 79 bhp) (ethanol) and 1.4-litre 98 PS (72 kW; 97 bhp) (petrol)/106 PS (78 kW; 105 bhp) (ethanol) offering automatic or five-speed manual transmission..
"""
>>> list_string = string.split("\n\n\n")
>>> for each in list_string:
...     print(each.replace("\n", "").strip())


CAR: Onix
MODEL: LTZ
COLOR:Black
ACESSORIES:ABS
DESCRIPTION:
The Chevrolet Onix is a subcompact car launched by American automaker Chevrolet in Brazil at the 2012 São Paulo International Motor Show[1] to succeed some versions of Chevrolet Celta. Offered initially as a five-door hatchback, a four-door sedan was launched in 2013 and called the Chevrolet Prisma.[2] The Onix is currently only sold in some South American countries part of Mercosur, including Brazil, Argentina, Colombia, Paraguay and Uruguay.
TECHNICAL DETAIL:The Onix is available in three trim levels (LS, LT and LTZ) with two 4-cylinder engines, the 1.0-litre producing 78 PS (57 kW; 77 bhp) (petrol)/ 80 PS (59 kW; 79 bhp) (ethanol) and 1.4-litre 98 PS (72 kW; 97 bhp) (petrol)/106 PS (78 kW; 105 bhp) (ethanol) offering automatic or five-speed manual transmission..
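
In the output above, the space after the colon is lost (COLOR:Black) and DESCRIPTION: is still on a separate line from its text, because that label is followed by exactly three newlines and so is split off from its value. As a possible variation, not the answerer's code, the sketch below walks the lines one by one and appends each non-empty line to the previous one when that previous line ends with a colon. The function name `merge_label_lines` is hypothetical, and the approach assumes that no value itself ends with a colon.

```python
def merge_label_lines(text):
    """Merge each line ending in ':' with the next non-empty line.

    Hypothetical helper; assumes values never end with ':' themselves.
    """
    merged = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip empty or whitespace-only lines
        if merged and merged[-1].endswith(":"):
            # The previous line is a bare label, so attach this value to it.
            merged[-1] = merged[-1] + " " + line
        else:
            merged.append(line)
    return "\n".join(merged)


# Shortened sample in the shape of the string used in the session above.
sample = "\nCAR: Onix\n\n\nCOLOR:\nBlack\n\n\n\nDESCRIPTION:\n\n\nThe Onix.\n"
print(merge_label_lines(sample))
```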
Votes: 0
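The original content of this page is provided by Stack Overflow. Translation support is provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: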

https://stackoverflow.com/questions/40071919
