Split a string in Java without removing the matches
Stack Overflow user
Asked 2021-02-25 16:34:21
Answers: 1, Views: 41, Followers: 0, Votes: 0

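I am trying to split a string without removing the matched text. I have had some success, since I found that (?<=-)|(?=-) works for a single-character delimiter, but when I apply the same idea to extract links, using the following regex (shown in its Java string-escaped form):

((?<=(http:\\/\\/\\S+))|(?=(http:\\/\\/\\S+))) I get strange behavior. Specifically, splitting the following input: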

A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.

produces this array of strings:

["A wonderful serenity has taken possession of ", "http://w", "w", "w", ".", "g", "o", "o", "g", "g", "l", "e", ".", "c", "o", "m", "my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart."]

...

Edit: the expected output would be:

["A wonderful serenity has taken possession of ", "http://www.google.com", "my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart."]
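
For reference, the single-character lookaround split mentioned in the question can be sketched as follows. This is a minimal illustration, not code from the original post; the class name `LookaroundSplitDemo` and the helper `splitKeepingDelimiter` are hypothetical names introduced here.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Hypothetical helper: splits before and after every '-' so the
    // delimiter itself is kept as its own array element.
    static String[] splitKeepingDelimiter(String input) {
        // (?=-)  matches the empty position just before a '-'
        // (?<=-) matches the empty position just after a '-'
        return input.split("(?<=-)|(?=-)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingDelimiter("a-b-c")));
        // prints [a, -, b, -, c]
    }
}
```

This works cleanly because the delimiter is exactly one character long. With `\S+` inside the lookbehind, every position after `http://` plus at least one non-space character satisfies it, which would be consistent with the per-character pieces reported above.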

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-02-25 17:34:21

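One viable option here is to use a formal regex iterator (Pattern and Matcher) instead of split, and search for the following pattern: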

\\bhttps?://\\S+\\b|.*?(?=https?://|$)

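This pattern first tries to match a URL; failing that, it captures all content up to, but not including, the next URL or the end of the input. Here is some sample code: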

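import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.";
// Alternative 1 matches a URL; alternative 2 lazily matches text up to the next URL or the end of input.
String pattern = "\\bhttps?://\\S+\\b|.*?(?=https?://|$)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
List<String> matches = new ArrayList<>();
// Collect each successive match (URL or plain text) in order.
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);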

This prints:

[A wonderful serenity has taken possession of ,
 http://www.google.com,
 like these sweet mornings of spring which I enjoy with my whole heart., ]
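
Note that the printed list above omits " my entire soul," and ends with an empty string. That is consistent with `.` not matching the line break by default and with `.*?` being able to match nothing at the end of the input. A possible variation, offered as a sketch rather than as part of the original answer, is to compile with `Pattern.DOTALL`, require at least one character with `.+?`, and anchor on `\z`. The class name `UrlSplitSketch` and the method `splitKeepingUrls` are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlSplitSketch {
    // Assumption: URLs start with http:// or https:// and end at whitespace.
    // DOTALL lets '.' cross line breaks; '.+?' never produces an empty match.
    private static final Pattern PARTS = Pattern.compile(
            "\\bhttps?://\\S+\\b|.+?(?=https?://|\\z)", Pattern.DOTALL);

    // Hypothetical helper: returns URLs and the text between them, in order.
    static List<String> splitKeepingUrls(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = PARTS.matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "A wonderful serenity has taken possession of "
                + "http://www.google.com my entire soul,\n like these sweet "
                + "mornings of spring which I enjoy with my whole heart.";
        // Yields three parts: leading text, the URL, and the trailing text.
        for (String part : splitKeepingUrls(input)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

With this variation the trailing text keeps its leading space and embedded line break; call `trim()` on each part if the whitespace is not wanted.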
Votes: 2
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/66364963
