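I am trying to split a string without dropping the matched delimiter. I had some success, because I found that I can use (?<=-)|(?=-), but when I apply the same idea to extract links, using the regex below:
((?<=(http:\\/\\/\\S+))|(?=(http:\\/\\/\\S+)))
I get a strange result. Splitting the following input: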
A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.
produces this array of strings:
["A wonderful serenity has taken possession of ", "http://w", "w", "w", ".", "g", "o", "o", "g", "g", "l", "e", ".", "c", "o", "m", "my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart."]
Edit: the expected output is:
["A wonderful serenity has taken possession of ", "http://www.google.com", "my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart."]
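A note on why the two splits behave so differently, as a minimal sketch rather than a definitive diagnosis. A lookbehind such as (?<=-) is satisfied at exactly one position per dash, but a lookbehind for "http://" followed by one or more non-space characters is satisfied at every position inside the URL after its first character, so split cuts the URL into single characters. The class and method names below are hypothetical, and the lookbehind is given an explicit upper bound ({1,100}, an arbitrary illustrative limit) because not every Java version accepts an unbounded quantifier inside a lookbehind.

```java
import java.util.Arrays;
import java.util.List;

class LookaroundSplitDemo {
    // Splits at every position directly before or directly after a dash,
    // so each dash survives as its own element.
    static List<String> splitKeepingDashes(String s) {
        return Arrays.asList(s.split("(?<=-)|(?=-)"));
    }

    // Same idea applied to a URL. The lookahead matches only at the start of
    // the URL, but the lookbehind matches after "http://" plus ANY number of
    // non-space characters, i.e. at every later position inside the URL.
    // The {1,100} bound is an assumption made for illustration.
    static List<String> splitAroundUrl(String s) {
        return Arrays.asList(s.split("(?<=http://\\S{1,100})|(?=http://\\S+)"));
    }

    public static void main(String[] args) {
        System.out.println(splitKeepingDashes("a-b-c"));
        System.out.println(splitAroundUrl("see http://www.google.com now"));
    }
}
```

The first call prints [a, -, b, -, c]. The second call shows the fragmentation described in the question: the URL is broken up after "http://w" instead of being kept whole.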
Posted on 2021-02-25 17:34:21
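One viable option here is to use a regex matcher to iterate over the input, searching for the following pattern:
\\bhttps?://\\S+\\b|.*?(?=https?://|$)
This pattern first tries to match a URL; failing that, it captures everything up to, but not including, the next URL or the end of the input. Here is sample code: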
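import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.";
// Alternative 1 matches a URL; alternative 2 lazily matches the text before the next URL or the end of input.
String pattern = "\\bhttps?://\\S+\\b|.*?(?=https?://|$)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
List<String> matches = new ArrayList<>();
// Collect every successive match in order.
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);
This prints: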
[A wonderful serenity has taken possession of ,
http://www.google.com,
like these sweet mornings of spring which I enjoy with my whole heart., ]
https://stackoverflow.com/questions/66364963
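Two details of the printed list are worth noting: the text " my entire soul," is missing, and the list ends with an empty string. By default `.` does not match line terminators, so the lazy alternative cannot reach the end of the input from the first line and that fragment is skipped; the empty string is the zero-length match the pattern produces at the end of the input. The following is a minimal sketch of one possible adjustment, assuming the goal is the three-element result from the question: compile the same pattern with Pattern.DOTALL and skip empty matches. The class name UrlTokenizer and the method tokenize are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlTokenizer {
    // Same pattern as in the answer; DOTALL lets '.' cross line breaks so the
    // text between the URL and the '\n' is no longer dropped.
    private static final Pattern TOKEN =
            Pattern.compile("\\bhttps?://\\S+\\b|.*?(?=https?://|$)", Pattern.DOTALL);

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Skip the zero-length match produced at the end of the input.
            if (!m.group().isEmpty()) {
                tokens.add(m.group());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "A wonderful serenity has taken possession of http://www.google.com"
                + " my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.";
        // Brackets make leading and trailing whitespace in each token visible.
        tokenize(input).forEach(t -> System.out.println("[" + t + "]"));
    }
}
```

With this variant the third token keeps its leading space and the embedded line break, so trim the tokens if the exact strings from the question are required.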