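I have a string containing quite a lot of text that I want to search for pattern matches. For every match I find, I want to extract it from the input string and store it in a List or a String[] for further sorting.
To do this, I tried using a Java regular expression to search for the pattern I want and then print the matches to my console. But I am clearly not handling my regex correctly, because not only are my matches returned, but so is everything from the start of the input string up to the final match of my regex.
I am desperately trying to find a way to return only my regex matches and nothing else! Can anyone offer a workable solution? I would be very grateful, because I am stuck right now!
For a quick look at what I am doing, see the following saved regex: Search for 30 Minute Wait Times
Otherwise, below is my code and the entire input string I am trying to sort, as a good example of the data I am working with: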
final String regex = "(name=\"(.*?) 30 Minute Wait,)";
final String input1 = "name=\"Barnstormer, Fantasyland, 05 Minute Wait,name=\"Big Thunder Mountain Railroad, Frontierland, 05 Minute Wait,name=\"Celebrity Spotlight, Echo Lake, 05 Minute Wait,name=\"DINOSAUR, DinoLand U.S.A., 05 Minute Wait,name=\"Expedition Everest - Legend of the Forbidden Mountain, Asia, 05 Minute Wait,name=\"Gran Fiesta Tour StarringThree Caballeros, World Showcase, 05 Minute Wait,name=\"Great Movie Ride, Hollywood Boulevard, 05 Minute Wait,name=\"Mad Tea Party, Fantasyland, 05 Minute Wait,name=\"Meet Chewbacca at Star Wars Launch Bay, Animation Courtyard, 05 Minute Wait,name=\"Seas with Nemo & Friends, Future World, 05 Minute Wait,name=\"Star Wars Launch Bay Theater, Animation Courtyard, 05 Minute Wait,name=\"TriceraTop Spin, DinoLand U.S.A., 05 Minute Wait,name=\"Buzz Lightyear's Space Ranger Spin, Tomorrowland, 10 Minute Wait,name=\"Dumbo the Flying Elephant, Fantasyland, 10 Minute Wait,name=\"Encounter Kylo Ren at Star Wars Launch Bay, Animation Courtyard, 10 Minute Wait,name=\"it's a small world, Fantasyland, 10 Minute Wait,name=\"Kilimanjaro Safaris, Africa, 10 Minute Wait,name=\"Magic Carpets of Aladdin, Adventureland, 10 Minute Wait,name=\"Many Adventures of Winnie the Pooh, Fantasyland, 10 Minute Wait,name=\"Mickey and Minnie Starring in Red Carpet Dreams, Commissary Lane, 10 Minute Wait,name=\"Mickey's PhilharMagic, Fantasyland, 10 Minute Wait,name=\"Muppet*Vision 3D, Muppet Courtyard, 10 Minute Wait,name=\"Pirates of the Caribbean, Adventureland, 10 Minute Wait,name=\"Primeval Whirl, DinoLand U.S.A., 10 Minute Wait,name=\"Soarin', Future World, 10 Minute Wait,name=\"Spaceship Earth, Future World, 10 Minute Wait,name=\"Star Tours –Adventures Continue, Echo Lake, 10 Minute Wait,name=\"Toy Story Mania!, Pixar Place, 10 Minute Wait,name=\"Twilight Zone Tower of Terror™, Sunset Boulevard, 10 Minute Wait,name=\"Under the Sea ~ Journey ofLittle Mermaid, Fantasyland, 10 Minute Wait,name=\"Jungle Cruise, Adventureland, 15 Minute 
Wait,name=\"Mission: SPACE, Future World, 15 Minute Wait,name=\"Rock 'n' Roller Coaster Starring Aerosmith, Sunset Boulevard, 15 Minute Wait,name=\"Splash Mountain, Frontierland, 15 Minute Wait,name=\"Astro Orbiter, Tomorrowland, 20 Minute Wait,name=\"Meet Disney Pals at the Epcot Character Spot, Future World, 20 Minute Wait,name=\"Monsters, Inc. Laugh Floor, Tomorrowland, 20 Minute Wait,name=\"Meet Rapunzel and Tiana at Princess Fairytale Hall, Fantasyland, 25 Minute Wait,name=\"Space Mountain, Tomorrowland, 25 Minute Wait,name=\"Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,name=\"Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,name=\"Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,name=\"Test Track, Future World, 30 Minute Wait,name=\"Meet Anna and Elsa at Royal Sommerhus, World Showcase, 40 Minute Wait,name=\"Tomorrowland Speedway, Tomorrowland, 40 Minute Wait,name=\"Frozen Ever After, World Showcase, 45 Minute Wait,name=\"Meet Mickey Mouse at Town Square Theater, Main Street, U.S.A., 55 Minute Wait,name=\"Meet Ariel at Her Grotto, Fantasyland, 65 Minute Wait,name=\"Seven Dwarfs Mine Train, Fantasyland, 80 Minute Wait,name=\"Haunted Mansion, Liberty Square, Temporarily Closed,name=\"Kali River Rapids, Asia, Temporarily Closed\n";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(input1);
//Create a List String for storing the Wait Time matches that we find
List<String> waitTimesSorted = new ArrayList<String>();
//Create a loop that the matcher uses to search through the input string for our Wait Times
while (matcher.find()) {
//Add the matching wait times we find to a List String
waitTimesSorted.add(matcher.group());
}
//Print our matches to the console
System.out.println(waitTimesSorted);
The output does find my 30 Minute Wait times, but it also returns everything in the input string that comes before my first match!
[name="Barnstormer, Fantasyland, 05 Minute Wait,name="Big Thunder Mountain Railroad, Frontierland, 05 Minute Wait,name="Celebrity Spotlight, Echo Lake, 05 Minute Wait,name="DINOSAUR, DinoLand U.S.A., 05 Minute Wait,name="Expedition Everest - Legend of the Forbidden Mountain, Asia, 05 Minute Wait,name="Gran Fiesta Tour StarringThree Caballeros, World Showcase, 05 Minute Wait,name="Great Movie Ride, Hollywood Boulevard, 05 Minute Wait,name="Mad Tea Party, Fantasyland, 05 Minute Wait,name="Meet Chewbacca at Star Wars Launch Bay, Animation Courtyard, 05 Minute Wait,name="Seas with Nemo & Friends, Future World, 05 Minute Wait,name="Star Wars Launch Bay Theater, Animation Courtyard, 05 Minute Wait,name="TriceraTop Spin, DinoLand U.S.A., 05 Minute Wait,name="Buzz Lightyear's Space Ranger Spin, Tomorrowland, 10 Minute Wait,name="Dumbo the Flying Elephant, Fantasyland, 10 Minute Wait,name="Encounter Kylo Ren at Star Wars Launch Bay, Animation Courtyard, 10 Minute Wait,name="it's a small world, Fantasyland, 10 Minute Wait,name="Kilimanjaro Safaris, Africa, 10 Minute Wait,name="Magic Carpets of Aladdin, Adventureland, 10 Minute Wait,name="Many Adventures of Winnie the Pooh, Fantasyland, 10 Minute Wait,name="Mickey and Minnie Starring in Red Carpet Dreams, Commissary Lane, 10 Minute Wait,name="Mickey's PhilharMagic, Fantasyland, 10 Minute Wait,name="Muppet*Vision 3D, Muppet Courtyard, 10 Minute Wait,name="Pirates of the Caribbean, Adventureland, 10 Minute Wait,name="Primeval Whirl, DinoLand U.S.A., 10 Minute Wait,name="Soarin', Future World, 10 Minute Wait,name="Spaceship Earth, Future World, 10 Minute Wait,name="Star Tours –Adventures Continue, Echo Lake, 10 Minute Wait,name="Toy Story Mania!, Pixar Place, 10 Minute Wait,name="Twilight Zone Tower of Terror™, Sunset Boulevard, 10 Minute Wait,name="Under the Sea ~ Journey ofLittle Mermaid, Fantasyland, 10 Minute Wait,name="Jungle Cruise, Adventureland, 15 Minute Wait,name="Mission: SPACE, Future World, 15 Minute Wait,name="Rock 
'n' Roller Coaster Starring Aerosmith, Sunset Boulevard, 15 Minute Wait,name="Splash Mountain, Frontierland, 15 Minute Wait,name="Astro Orbiter, Tomorrowland, 20 Minute Wait,name="Meet Disney Pals at the Epcot Character Spot, Future World, 20 Minute Wait,name="Monsters, Inc. Laugh Floor, Tomorrowland, 20 Minute Wait,name="Meet Rapunzel and Tiana at Princess Fairytale Hall, Fantasyland, 25 Minute Wait,name="Space Mountain, Tomorrowland, 25 Minute Wait,name="Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,, name="Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,, name="Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,, name="Peter Pan's Flight, Fantasyland, 30 Minute Wait,, name="Test Track, Future World, 30 Minute Wait,]
What I want is something like this:
[name="Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,, name="Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,, name="Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,, name="Peter Pan's Flight, Fantasyland, 30 Minute Wait,, name="Test Track, Future World, 30 Minute Wait,]
Is there a way to get back only what I am looking for?
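I do need an exact match on the wait time (I am using 30 minutes here as an example), because I am trying to split the entries into groups by wait time (5 Minute Wait, 10 Minute Wait, 15 Minute Wait, and so on) and then sort them so that each group is in alphabetical order. So I am not looking for a generic number in the regex. I am being very specific about the wait time, and I actually have a list of expected wait times from which to generate the regex, but that is another matter and not part of this question.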
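Posted on 2017-05-12 09:00:53
Your problem is that .*? will also run across any other name=", so it matches too much.
To prevent this, simply exclude = or " from the characters that part of the pattern may match.
Also, you do not need to capture the whole matched expression. That is done anyway as capture group 0.
So the regex name="([^"]*?) 30 Minute Wait, will do.
As a Java string literal, it would be "name=\"([^\"]*?) 30 Minute Wait,".
See regex101.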
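As a rough illustration of the fix described above, here is a minimal sketch that applies the corrected pattern to a shortened sample. The class name ThirtyMinuteWaits, the method findMatches, and the four-entry sample string are made up for this example and are not from the original post. Only the regex comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption for illustration only.
class ThirtyMinuteWaits {
    // [^"]*? cannot cross a double quote, so a match can never run
    // from one name=" entry into the next one, unlike .*?
    private static final Pattern PATTERN =
            Pattern.compile("name=\"([^\"]*?) 30 Minute Wait,");

    // Returns every full match (capture group 0) found in the input.
    public static List<String> findMatches(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            // group() is group 0, the whole match;
            // group(1) would be the text between name=" and " 30 Minute Wait,"
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the question's input.
        String sample = "name=\"Space Mountain, Tomorrowland, 25 Minute Wait,"
                + "name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,"
                + "name=\"Test Track, Future World, 30 Minute Wait,"
                + "name=\"Frozen Ever After, World Showcase, 45 Minute Wait,";
        for (String match : findMatches(sample)) {
            System.out.println(match);
        }
    }
}
```

With this sample, only the Peter Pan's Flight and Test Track entries should be collected, because the 25 and 45 minute entries cannot be absorbed into a longer match.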
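Posted on 2017-05-12 09:38:18
Posted on 2017-05-12 08:49:30
The problem with your regex is that you put the capturing group around the whole expression, and calling group() without an index returns the full match, including the .*?, which means everything before the wait time.
If you change the regex to "name=\"(.*?) (30 Minute Wait)," and call matcher.group(2), it will return "30 Minute Wait".
See the Javadoc of the group(int) method: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#group-int-.
Oh, and you may want to replace "30" in the regex with "\\d+" so that it finds any number, not just 30.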
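To make the \\d+ suggestion concrete, here is a hedged sketch that also covers the grouping and alphabetical sorting mentioned in the question. It combines two capture groups with the [^"] exclusion from the other answer, which is a choice made for this example rather than something this answer proposed. The class name WaitTimeGroups, the method groupByWait, and the idea of keying a TreeMap by minutes are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption for illustration only.
class WaitTimeGroups {
    // Group 1: attraction name and area. Group 2: the wait time in minutes.
    // Entries such as "Temporarily Closed" have no digits and are skipped.
    private static final Pattern ENTRY =
            Pattern.compile("name=\"([^\"]*?), (\\d+) Minute Wait,");

    // Groups entries by wait time; the TreeMap keeps the keys in ascending order.
    public static Map<Integer, List<String>> groupByWait(String input) {
        Map<Integer, List<String>> groups = new TreeMap<>();
        Matcher matcher = ENTRY.matcher(input);
        while (matcher.find()) {
            int minutes = Integer.parseInt(matcher.group(2)); // "05" parses to 5
            groups.computeIfAbsent(minutes, k -> new ArrayList<>())
                  .add(matcher.group(1));
        }
        // Sort each group alphabetically (natural String order, case-sensitive).
        for (List<String> names : groups.values()) {
            Collections.sort(names);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the question's input.
        String sample = "name=\"Test Track, Future World, 30 Minute Wait,"
                + "name=\"Astro Orbiter, Tomorrowland, 20 Minute Wait,"
                + "name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,"
                + "name=\"Haunted Mansion, Liberty Square, Temporarily Closed,";
        groupByWait(sample).forEach((minutes, names) -> {
            System.out.println(minutes + " Minute Wait:");
            names.forEach(name -> System.out.println("  " + name));
        });
    }
}
```

Natural String ordering is case-sensitive, so an entry like "it's a small world" would sort after all capitalized names. If that matters, String.CASE_INSENSITIVE_ORDER could be passed to the sort instead.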
https://stackoverflow.com/questions/43933139