
RegEx to exclude academic titles

Stack Overflow user
Asked 2013-03-09 04:21:55
Answers: 2, Views: 537, Followers: 0, Votes: 4

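I want to split a paragraph string into sentences. Naturally, I use a regular expression on the dot character (.) to do the split. The problem is that academic title abbreviations inside a sentence also end with a dot (.), so my regex splits the paragraph in the wrong places.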

Here is an example paragraph:

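Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.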

Using only the dot (.) as the delimiter, I get:

Array (
[0] => Meanwhile Rector of Bogor Agricultural University, Prof
[1] => Dr
[2] => Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time
[3] => ...
)
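
The behaviour described above can be reproduced with a minimal sketch. It assumes the split is done with `preg_split` on a literal dot (the question does not show the original code), and it uses only the first sentence of the example paragraph.

```php
<?php
// Naive split: every "." is treated as a sentence boundary (assumed approach).
$text = "Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time.";

// PREG_SPLIT_NO_EMPTY drops the empty piece left after the final dot;
// trim removes the space that followed each dot.
$parts = array_map('trim', preg_split('/\./', $text, -1, PREG_SPLIT_NO_EMPTY));
print_r($parts);
```

The titles "Prof" and "Dr" end up cut off from the name they belong to, which is the problem the question asks about.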

This is what I actually want:

Array (
[0] => Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time
[1] => Present in  that general audience were  the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin
)

2 Answers

Stack Overflow user

Accepted answer

Posted 2013-03-09 04:51:03

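You can use negative lookbehind, so that a dot matches only when it is not immediately preceded by one of the listed titles:

((?<!Prof)(?<!Dr)(?<!Mr)(?<!Mrs)(?<!Ms))\. (add more titles if needed)

Explained demo here: http://regex101.com/r/xQ3xF9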
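
To see what the stacked lookbehinds do, here is a small sketch that tests the pattern against a few short strings. The sample strings are made up for illustration and are not from the answer.

```php
<?php
// Pattern from the answer: a dot that is not preceded by any listed title.
$pattern = '/((?<!Prof)(?<!Dr)(?<!Mr)(?<!Mrs)(?<!Ms))\./';

// Hypothetical samples, chosen only to exercise the lookbehinds.
$samples = array('on time.', 'Prof.', 'Dr.', 'Mrs.', 'end of paragraph.');

$results = array();
foreach ($samples as $sample) {
    // preg_match returns 1 when some dot in the sample counts as a boundary.
    $results[$sample] = preg_match($pattern, $sample) === 1;
    echo str_pad($sample, 20), $results[$sample] ? "boundary\n" : "not a boundary\n";
}
```

Every lookbehind has to succeed for the dot to match, so a single matching title is enough to protect a dot. The check is case-sensitive and does not look at word boundaries, so a word that merely ends in "Dr" or "Mr" would also protect its dot.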

The code might look like this:

$text="Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks about Mr. John requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were Mrs. Peterson of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.";

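// One negative lookbehind per title; the dot matches only if none of them precedes it.
$titles = array('(?<!Prof)', '(?<!Dr)', '(?<!Mr)', '(?<!Mrs)', '(?<!Ms)');
// Split on every dot that is not preceded by a listed title.
$sentences = preg_split('/(' . implode('', $titles) . ')\./', $text);
print_r($sentences);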
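
A possible variation, sketched under some assumptions: the titles are kept as plain strings and turned into lookbehinds in a helper, the whitespace after each dot is consumed, and the empty piece after the final dot is dropped. The function name `splitSentences` and its signature are hypothetical and not part of the answer.

```php
<?php
// Hypothetical helper: name and signature are illustrative only.
function splitSentences($text, array $titles)
{
    // Build one negative lookbehind per title, escaping regex metacharacters.
    $lookbehinds = '';
    foreach ($titles as $title) {
        $lookbehinds .= '(?<!' . preg_quote($title, '/') . ')';
    }
    // Consume whitespace after the dot and skip empty pieces.
    $parts = preg_split('/' . $lookbehinds . '\.\s*/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return array_map('trim', $parts);
}

$text = "Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.";

$sentences = splitSentences($text, array('Prof', 'Dr', 'Mr', 'Mrs', 'Ms'));
print_r($sentences);
```

With the example paragraph from the question this yields two elements, one per sentence, without the trailing dots.
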
Votes: 3

Stack Overflow user

Posted 2013-03-09 05:05:57

This seems to work, although it is extra PHP code rather than strictly a RegEx:

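// Fragments as they might come out of a split on "."
$begin = array( 0=>'Meanwhile in geography,',
            1=>'Dr',
            2=>'Henry Suhardiyanto, in his remarks, stated that ',
            3=>'Dr',
            4=>'Prof',
            5=>'Jedi Dusadi was another ',
            6=>'Prof');

$exclusions = array("Dr", "Prof", "Mr", "Mrs");

$sentences = array();
$prefix = '';   // titles waiting to be attached to the next fragment
foreach ($begin as $fragment) {
    if (in_array($fragment, $exclusions)) {
        // A bare title: keep it, with its dot, for the following fragment.
        $prefix .= $fragment . ". ";
    } else {
        $sentences[] = $prefix . $fragment;
        $prefix = '';
    }
}
if ($prefix !== '') {
    // A title at the very end has nothing to attach to.
    $sentences[] = rtrim($prefix);
}
print_r($sentences);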
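
On the output of a real split on ".", a title is usually the last word of a longer fragment rather than a fragment of its own. The sketch below adapts the merge idea to that case: it glues a fragment to the next one whenever its last word is a title. The function name `mergeTitleFragments` is hypothetical, and treating whitespace, commas and dots as word separators is an assumption.

```php
<?php
// Hypothetical helper: rejoin fragments from a naive split on "."
// whenever the text collected so far ends with a title.
function mergeTitleFragments(array $fragments, array $titles)
{
    $sentences = array();
    $buffer = '';
    foreach ($fragments as $fragment) {
        $buffer .= $fragment;
        // Last word of the buffer, using whitespace, commas and dots as separators.
        $words = preg_split('/[\s,.]+/', trim($buffer));
        $last = end($words);
        if (in_array($last, $titles, true)) {
            // The dot belonged to an abbreviation: restore it and keep collecting.
            $buffer .= '.';
        } else {
            $sentences[] = trim($buffer);
            $buffer = '';
        }
    }
    if ($buffer !== '') {
        // Input ended on a title; keep what was collected.
        $sentences[] = trim($buffer);
    }
    return $sentences;
}

$text = "Meanwhile Rector of Bogor Agricultural University, Prof. Dr. Herry Suhardiyanto, in his remarks requested that the graduate students should keep on studying and will finalize their studies on time. Present in that general audience were the Deputy Dean of the Graduate School of Bogor Agricultural University, Dr.Dedi Jusadi, Secretary of the Graduate School for Doctoral Program of Bogor Agricultural University, Prof.Dr. Marimin.";

$fragments = preg_split('/\./', $text, -1, PREG_SPLIT_NO_EMPTY);
$merged = mergeTitleFragments($fragments, array('Dr', 'Prof', 'Mr', 'Mrs'));
print_r($merged);
```

Because the fragments keep their original leading spaces, "Prof. Dr." and "Prof.Dr." are both restored exactly as they appeared in the input.
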
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/15307109
