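I have text formatted as paragraphs, and a date always comes right before each paragraph. The problem is that each paragraph is followed by different kinds of Unicode line breaks. I need to remove every run of line breaks between paragraphs and replace it with exactly two newlines (\n\n).
So from this: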
05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
to this:
05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.

11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
I tried using preg_replace(), but it doesn't match every instance:
$text = preg_replace('/\r?\n+(?=\d{2}\/\d{2})/', "\n\n", $text);
Posted on 2013-10-31 00:46:15
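I posted this on a similar question about a month ago.
To match anything that is considered a line break sequence, you can use \R.
\R matches a generic newline; that is, anything considered a line break sequence by Unicode. This includes all characters matched by \v (vertical whitespace), and the multi-character sequence \x0D\x0A.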
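To make that description concrete, here is a small sketch (not part of the original answer) that checks which break sequences a single \R consumes, next to the \r?\n from the question. The array keys and variable names are only illustrative. It assumes PHP 7.0+ for the \u{...} escapes, and a PCRE build whose \R has not been restricted to CR/LF/CRLF.

```php
<?php
// Candidate line-break sequences (labels are just for readability).
$breaks = [
    'LF'   => "\n",
    'CRLF' => "\r\n",
    'CR'   => "\r",
    'VT'   => "\x0B",
    'FF'   => "\x0C",
    'NEL'  => "\u{0085}",
    'LS'   => "\u{2028}",
    'PS'   => "\u{2029}",
];

$generic = []; // does one \R consume the whole sequence?
$narrow  = []; // does \r?\n consume the whole sequence?
foreach ($breaks as $name => $seq) {
    // \A and \z anchor to the absolute start and end of the subject.
    $generic[$name] = preg_match('/\A\R\z/u', $seq) === 1;
    $narrow[$name]  = preg_match('/\A\r?\n\z/', $seq) === 1;
}

// Names of the sequences that \R handles but \r?\n misses.
$missed = array_keys(array_diff_key(array_filter($generic), array_filter($narrow)));
echo implode(', ', $missed), PHP_EOL;
```

If the text contains any of the sequences that end up in $missed, that would explain why the pattern from the question does not match every instance.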
Give this a try:
$text = preg_replace('~\R+(?=\d{2}/\d{2})~u', "\n\n", $text);
See the PCRE documentation for the different ways of implementing this.
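As a quick sanity check, here is a minimal run of that pattern on a made-up sample. The text and the mix of breaks are assumptions for illustration, not the asker's real data, and the \u{...} escape again assumes PHP 7.0+.

```php
<?php
// Hypothetical input: paragraphs separated by mixed breaks
// (CRLF + LF, then a Unicode paragraph separator + CR).
$text = "05/12\nFirst paragraph.\r\n\n"
      . "11/01\nSecond paragraph.\u{2029}\r"
      . "12/24\nThird paragraph.";

// \R+ swallows the whole run of breaks; the lookahead only lets it match
// when a dd/dd date follows, so the break after each date is left alone.
$normalized = preg_replace('~\R+(?=\d{2}/\d{2})~u', "\n\n", $text);

echo $normalized, PHP_EOL;
```

The single \n between each date and its paragraph stays as it is, because what follows it is not a date. A wrapped line inside a paragraph that happens to start with two digits, a slash, and two digits would also get a blank line inserted before it, so the lookahead may need tightening for messier input.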
https://stackoverflow.com/questions/19696328