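I want to split a paragraph string into sentences. Naturally, I use a regular expression with the period character (.) to do the splitting. The problem is the academic title abbreviations inside the sentences, each of which also ends with a period (.). As a result my regex splits the paragraph in completely the wrong places.
Here is an example paragraph:
Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.
Using only the period (.) as the delimiter, I get: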
Array (
[0] => Meanwhile Rector of Bogor Agricultural University, Prof
[1] => Dr
[2] => Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time
[3] => ...
)
This is what I actually want:
Array (
[0] => Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time
[1] => Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin
)
Posted on 2013-03-09 04:51:03
You can use negative lookbehinds:
((?<!Prof)(?<!Dr)(?<!Mr)(?<!Mrs)(?<!Ms))\.
Add more titles if needed.
Explained demo here: http://regex101.com/r/xQ3xF9
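To see what the lookbehinds do before applying them to the full paragraph, here is a minimal sketch on a short made-up string. The sample text, the added `\s*` (which swallows the whitespace after each period) and the `PREG_SPLIT_NO_EMPTY` flag are my own assumptions for illustration, not part of the pattern above.

```php
<?php
// Hypothetical sample text, not taken from the question.
$sample = 'Dr. Smith met Prof. Lee. They talked. Then Mr. Jones left.';

// Split on a period only when it is NOT directly preceded by a listed title.
// Each lookbehind is checked at the same position, right before the period.
$pattern = '/(?<!Prof)(?<!Dr)(?<!Mr)(?<!Mrs)(?<!Ms)\.\s*/';

// PREG_SPLIT_NO_EMPTY drops the empty piece left after the final period.
$parts = preg_split($pattern, $sample, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

This should yield three pieces: "Dr. Smith met Prof. Lee", "They talked" and "Then Mr. Jones left". One caveat: a lookbehind only checks the characters immediately before the period, so a sentence ending in a word such as "ATMs" would not be split either.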
The code could look like this:
$text = "Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks about Mr. John requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were Mrs. Peterson of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.";
// One negative lookbehind per title; add more entries as needed.
$titles = array('(?<!Prof)', '(?<!Dr)', '(?<!Mr)', '(?<!Mrs)', '(?<!Ms)');
// Split on every period that is not directly preceded by one of the titles.
$sentences = preg_split('/(' . implode('', $titles) . ')\./', $text);
print_r($sentences);
Posted on 2013-03-09 05:05:57
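This seems to work, though it is extra PHP code rather than strictly a regex:
$begin = array(0 => 'Meanwhile in geography,',
               1 => 'Dr',
               2 => 'Henry Suhardiyanto, in his remarks, stated that ',
               3 => 'Dr',
               4 => 'Prof',
               5 => 'Jedi Dusadi was another ',
               6 => 'Prof');
$exclusions = array("Dr", "Prof", "Mr", "Mrs");
$merged = array();
$carry = ''; // titles waiting to be prepended to the next fragment
foreach ($begin as $sentence) {
    if (in_array($sentence, $exclusions)) {
        // A bare title: restore its period and hold it for the next fragment.
        $carry .= $sentence . ". ";
    } else {
        $merged[] = $carry . $sentence;
        $carry = '';
    }
}
if ($carry !== '') {
    $merged[] = rtrim($carry); // a trailing title has nothing to attach to
}
print_r($merged);
https://stackoverflow.com/questions/15307109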
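The merge loop above starts from a hand-written array. As a rough sketch of how the same idea might be wired to real input, the hypothetical helper `splitSentences()` below first splits on every period with `explode()` and then glues title fragments back together. With real text a fragment usually ends with a title rather than being equal to one, so this sketch checks the last word of each fragment; that check, the function name and the sample string are my assumptions, not something the answer states.

```php
<?php
// Hypothetical helper: naive split on ".", then re-join pieces cut at a title.
function splitSentences(string $text, array $titles): array
{
    $sentences = array();
    $carry = ''; // text collected so far for the current sentence

    foreach (explode('.', $text) as $fragment) {
        $carry .= $fragment;

        // Last word of this fragment, e.g. "Prof" in "..., Prof".
        $words = preg_split('/[\s,]+/', $fragment, -1, PREG_SPLIT_NO_EMPTY);
        $last = $words ? end($words) : '';

        if (in_array($last, $titles, true)) {
            // The period belonged to a title: restore it and keep collecting.
            $carry .= '.';
            continue;
        }

        if (trim($carry) !== '') {
            $sentences[] = trim($carry);
        }
        $carry = '';
    }

    // Text that ended on a title is kept rather than dropped.
    if (trim($carry) !== '') {
        $sentences[] = trim($carry);
    }
    return $sentences;
}

// Made-up sample in the style of the question's paragraph.
$result = splitSentences(
    'Prof. Dr. Herry spoke. Dr.Dedi and Prof.Dr. Marimin came.',
    array('Dr', 'Prof', 'Mr', 'Mrs')
);
print_r($result);
```

Unlike the lookbehind version, this compares whole words, so a sentence that merely ends in letters matching a title is still split. A sentence that genuinely ends with a title ("She spoke to the Dr.") would be merged with the one after it.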