首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >从HTML标记中的<li>检索超链接值?

从HTML标记中的<li>检索超链接值?
EN

Stack Overflow用户
提问于 2018-10-04 13:53:44
回答 2查看 66关注 0票数 2

我需要通过正则表达式在HTML标记中获取href值。

代码语言:javascript
复制
<html>
    <head>
  </head>
  <body class="directory">
    <input id="search" type="text" placeholder="Search" autocomplete="off" />
    <div id="wrapper">
      <h1><a href="/">~</a> / <a href="/public">public</a> / <a href="/public/img">img</a> / <a href="/public/img/events">events</a> / <a href="/public/img/events/poster">poster</a> / </h1>
      <ul id="files" class="view-tiles"><li><a href="/public/img/events" class="" title=".."><span class="name">..</span><span class="size"></span><span class="date"></span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-1.PNG" class="" title="2018-09-26-1.PNG"><span class="name">2018-09-26-1.PNG</span><span class="size">1406471</span><span class="date">2018-9-16 18:37:23</span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-2.PNG" class="" title="2018-09-26-2.PNG"><span class="name">2018-09-26-2.PNG</span><span class="size">530859</span><span class="date">2018-9-16 18:37:44</span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-3.PNG" class="" title="2018-09-26-3.PNG"><span class="name">2018-09-26-3.PNG</span><span class="size">551409</span><span class="date">2018-9-16 18:38:24</span></a></li>
<li><a href="/public/img/events/poster/test" class="" title="test"><span class="name">test</span><span class="size">0</span><span class="date">2018-10-4 20:16:58</span></a></li></ul>
    </div>
  </body>
<html>

我想要一个包含

代码语言:javascript
复制
/public/img/events/poster/2018-09-26-1.PNG and 
/public/img/events/poster/2018-09-26-2.PNG and
/public/img/events/poster/2018-09-26-3.PNG.

我用的这个词:

代码语言:javascript
复制
/[<body\sclass="directory">].+[<li><a\shref\s*=\s*\"]([^">]+)\"\s+[class].+[<\/body>]/g

但是我得到的结果是:

代码语言:javascript
复制
<ul id="files" class="view-tiles"><li><a href="/public/img/events" class="" title=".."><span class="name">..</span><span class="size"></span><span class="date"></span></a></li>

<li><a href="/public/img/events/poster/2018-09-26-1.PNG" class="" title="2018-09-26-1.PNG"><span class="name">2018-09-26-1.PNG</span><span class="size">1406471</span><span class="date">2018-9-16 18:37:23</span></a></li>

<li><a href="/public/img/events/poster/2018-09-26-2.PNG" class="" title="2018-09-26-2.PNG"><span class="name">2018-09-26-2.PNG</span><span class="size">530859</span><span class="date">2018-9-16 18:37:44</span></a></li>

<li><a href="/public/img/events/poster/2018-09-26-3.PNG" class="" title="2018-09-26-3.PNG"><span class="name">2018-09-26-3.PNG</span><span class="size">551409</span><span class="date">2018-9-16 18:38:24</span></a></li>

<li><a href="/public/img/events/poster/test" class="" title="test"><span class="name">test</span><span class="size">0</span><span class="date">2018-10-4 20:16:58</span></a></li></ul>

有人能指点我吗?

EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2018-10-04 14:12:40

您可以使用这个regex:

/<li[^>]*>[^<]*<a[^>]*href="([^"]+)"/g

然后访问href="([^"]+)捕获组,方法如下所示(假设您使用的是javascript):

代码语言:javascript
复制
    var myString = `<html>
    <head>
  </head>
  <body class="directory">
    <input id="search" type="text" placeholder="Search" autocomplete="off" />
    <div id="wrapper">
      <h1><a href="/">~</a> / <a href="/public">public</a> / <a href="/public/img">img</a> / <a href="/public/img/events">events</a> / <a href="/public/img/events/poster">poster</a> / </h1>
      <ul id="files" class="view-tiles"><li><a href="/public/img/events" class="" title=".."><span class="name">..</span><span class="size"></span><span class="date"></span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-1.PNG" class="" title="2018-09-26-1.PNG"><span class="name">2018-09-26-1.PNG</span><span class="size">1406471</span><span class="date">2018-9-16 18:37:23</span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-2.PNG" class="" title="2018-09-26-2.PNG"><span class="name">2018-09-26-2.PNG</span><span class="size">530859</span><span class="date">2018-9-16 18:37:44</span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-3.PNG" class="" title="2018-09-26-3.PNG"><span class="name">2018-09-26-3.PNG</span><span class="size">551409</span><span class="date">2018-9-16 18:38:24</span></a></li>
<li><a href="/public/img/events/poster/test" class="" title="test"><span class="name">test</span><span class="size">0</span><span class="date">2018-10-4 20:16:58</span></a></li></ul>
    </div>
  </body>
<html>`;

var myRegexp = /<li[^>]*>[^<]*<a[^>]*href="([^"]+)"/g;
match = myRegexp.exec(myString);
while (match != null) {
  // matched text: match[0]
  // match start: match.index
  // capturing group n: match[n]
  console.log(match[1])
  match = myRegexp.exec(myString);
}

在代码示例中,应归功于this answer

更新1

作者要求包括身体标签的匹配。

只是好奇而已。如果要限制标记中的映射范围,如何更新表达式?我更新快车如下,但没有结果。]>^<]href=“(^”+)“*>

使用regex可以做的事情不多,通常不建议使用regexes进行高级html解析。您的方法给您提供了分行符的问题,以及您希望在单个主体中匹配多个li的事实。此外,根据HTML约定,<li>只允许在正文中使用。

如果要这样做,请将其分解为两个步骤,并匹配

代码语言:javascript
复制
    var myString = `<html>
    <head>
    <!-- Not valid HTML, just for testing -->
    <ul id="files" class="view-tiles"><li><a href="/public/img/events" class="" title=".."><span class="name">..</span><span class="size"></span><span class="date"></span></a></li>
    <li><a href="/public/img/events/poster/2018-09-26-1.PNG" class="" title="2018-09-26-1.PNG"><span class="name">2018-09-26-1.PNG</span><span class="size">1406471</span><span class="date">2018-9-16 18:37:23</span></a></li>
    <li><a href="/public/img/events/poster/2018-09-26-2.PNG" class="" title="2018-09-26-2.PNG"><span class="name">2018-09-26-2.PNG</span><span class="size">530859</span><span class="date">2018-9-16 18:37:44</span></a></li>
    <li><a href="/public/img/events/poster/2018-09-26-3.PNG" class="" title="2018-09-26-3.PNG"><span class="name">2018-09-26-3.PNG</span><span class="size">551409</span><span class="date">2018-9-16 18:38:24</span></a></li>
    <li><a href="/public/img/events/poster/test" class="" title="test"><span class="name">test</span><span class="size">0</span><span class="date">2018-10-4 20:16:58</span></a></li></ul>
  </head>
  <body class="directory">
    <input id="search" type="text" placeholder="Search" autocomplete="off" />
    <div id="wrapper">
      <h1><a href="/">~</a> / <a href="/public">public</a> / <a href="/public/img">img</a> / <a href="/public/img/events">events</a> / <a href="/public/img/events/poster">poster</a> / </h1>
      <ul id="files" class="view-tiles"><li><a href="/public/img/events" class="" title=".."><span class="name">..</span><span class="size"></span><span class="date"></span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-1.PNG" class="" title="2018-09-26-1.PNG"><span class="name">2018-09-26-1.PNG</span><span class="size">1406471</span><span class="date">2018-9-16 18:37:23</span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-2.PNG" class="" title="2018-09-26-2.PNG"><span class="name">2018-09-26-2.PNG</span><span class="size">530859</span><span class="date">2018-9-16 18:37:44</span></a></li>
<li><a href="/public/img/events/poster/2018-09-26-3.PNG" class="" title="2018-09-26-3.PNG"><span class="name">2018-09-26-3.PNG</span><span class="size">551409</span><span class="date">2018-9-16 18:38:24</span></a></li>
<li><a href="/public/img/events/poster/test" class="" title="test"><span class="name">test</span><span class="size">0</span><span class="date">2018-10-4 20:16:58</span></a></li></ul>
    </div>
  </body>
<html>`;

var bodyRegex = /<\s*body.*>([\s\S]*)<\s*\/body>/g;
var bodyString = bodyRegex.exec(myString)[0];

var myRegexp = /<li[^>]*>[^<]*<a[^>]*href="([^"]+)"/g;
match = myRegexp.exec(bodyString);
while (match != null) {
  // matched text: match[0]
  // match start: match.index
  // capturing group n: match[n]
  console.log(match[1])
  match = myRegexp.exec(bodyString);
}

票数 0
EN

Stack Overflow用户

发布于 2018-10-04 14:01:10

它必须是正则表达式吗?这个解决方案似乎奏效了。

代码语言:javascript
复制
const links = document.querySelectorAll('#files a');
  links.forEach(link => {
    console.log(link.getAttribute('href'));
  })
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/52648529

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档