Parsing HTML with JavaScript to extract sentences

Stack Overflow user
Asked 2022-11-08 14:39:29

My HTML looks like this, or longer, around 100 lines:

    <div id="translate">
           <div> <p>Web <b>dictaphone </b> is built using Thanks to Sole for the Oscilloscope code! English texts for 
                    beginners to practice reading and comprehension online and for free.</p> 
                    <p>Practicing your comprehension of <b>written English will</b> both improve your vocabulary and 
                    understanding of <span class="term-highlight">grammar</span> and word order. The texts below are designed to help you develop while 
                    giving you an instant evaluation of your progress.</p>
                  <p>All test went wrong</p>
                  <p>Web application work CH<sub class="pippo">2</sub> M5 only with localhost</p>
            </div>
<div class="code">
              <span>
                beginners to practice reading and comprehension online and for free <b>dictaphone </b>.
              </span>
              <span class="term-highlight">grammar</span>
              <p>All test went wrong</p>
            </div>
</div>

I parse this HTML with the code below:

let infoElementMT = document.getElementById('translate');

recurseDomChildren(infoElementMT, 'en');

export function recurseDomChildren(start, langFrom) {
    if (start.childNodes.length !== 0) {
        loopNodeChildren(start.childNodes, langFrom);
    }
}

function loopNodeChildren(nodes, langFrom) {
    for (let i = 0; i < nodes.length; i++) {
        const node = nodes[i];

        // Descend into the node's children first
        if (node.childNodes) {
            recurseDomChildren(node, langFrom);
        }
        // nodeType 3 is a text node
        if (node.nodeType === 3) {
            console.log("NODE text", node);
            //outputNode(node, langFrom);
        }
    }
}


The result I get is:

NODE text "Web "
NODE text "dictaphone "
NODE text " is built using Thanks to Sole for the Oscilloscope code! English texts for beginners to practice reading and comprehension online and for free. Practicing your comprehension of "
NODE text "written English will"
NODE text " both improve your vocabulary and understanding of "

How can I get the result as sentences, with the bold tags kept inside them?

     NODE text: "Web <b>dictaphone</b> is built using Thanks to Sole for the Oscilloscope code! English texts for beginners to practice reading and comprehension online and for free.
    
     NODE text: Practicing your comprehension of <b>written English will</b> both improve your vocabulary and understanding of "

NODE text: beginners to practice reading and comprehension online and for free <b>dictaphone </b>.

NODE text: grammar

Keep in mind that the real HTML is much longer, so the code has to be recursive.


1 Answer

Stack Overflow user

Answered 2022-11-09 14:05:26

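(This is not a real answer, just a comment that is too long and too heavily formatted to fit in the comment box. I plan to delete it once the requirements are established.)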

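I understand what you want to do with the bold tags. My problem is with the rest of the requirements. Here is a function that does exactly what you asked for: when you give it your input, it returns the exact output you requested.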

const convert = (input) => 
  `     NODE text: "Web <b>dictaphone</b> is built using Thanks to Sole for the Oscilloscope code! English texts for beginners to practice reading and comprehension online and for free.\n    \n     NODE text: Practicing your comprehension of <b>written English will</b> both improve your vocabulary and understanding of "\n\nNODE text: beginners to practice reading and comprehension online and for free <b>dictaphone </b>.\n\nNODE text: grammar    `


const input = `    <div id="translate">\n           <div> <p>Web <b>dictaphone </b> is built using Thanks to Sole for the Oscilloscope code! English texts for \n                    beginners to practice reading and comprehension online and for free.</p> \n                    <p>Practicing your comprehension of <b>written English will</b> both improve your vocabulary and \n                    understanding of <span class="term-highlight">grammar</span> and word order. The texts below are designed to help you develop while \n                    giving you an instant evaluation of your progress.</p>\n                  <p>All test went wrong</p>\n                  <p>Web application work CH<sub class="pippo">2</sub> M5 only with localhost</p>\n            </div>\n<div class="code">\n              <span>\n                beginners to practice reading and comprehension online and for free <b>dictaphone </b>.\n              </span>\n              <span class="term-highlight">grammar</span>\n              <p>All test went wrong</p>\n            </div>\n</div>`

console.log(convert(input))

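I know this is not what you want. It returns the same output no matter what input you supply, and that is the point: I am trying to pin down your requirements. I suspect a language barrier may be at play, but it is not clear to me how your input turns into the output you requested. Why does this line from your input: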

                  <p>All test went wrong</p>

not appear in the output? And the same for this one?

                  <p>Web application work CH<sub class="pippo">2</sub> M5 only with localhost</p>

And a little further down, this one?

              <p>All test went wrong</p>

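So it is clear that you want the bold tags kept intact in the output strings, but it is not clear what else you want to do. As far as I can tell, your requested output is only loosely related to your input. If both are correct, then you need to explain what should happen to the missing nodes.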
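
Until those requirements are settled, here is one possible reading, offered only as a sketch. It rests on assumptions that are not in the question: every element that directly contains non-whitespace text yields one output string, `<b>` is the only tag kept, every other tag is unwrapped, and runs of whitespace are collapsed. The names `serialize`, `collectBlocks`, `hasDirectText` and `KEEP_TAGS` are made up for illustration.

```javascript
// Node type constants as defined by the DOM standard.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Assumption: only these tags survive in the output; all others are unwrapped.
const KEEP_TAGS = new Set(['B']);

// True when the element has at least one direct, non-whitespace text child.
function hasDirectText(node) {
  return Array.from(node.childNodes).some(
    (child) => child.nodeType === TEXT_NODE && child.nodeValue.trim() !== ''
  );
}

// Recursively turn a node into a string, re-emitting KEEP_TAGS (without
// attributes) and replacing every other element by its contents.
function serialize(node) {
  if (node.nodeType === TEXT_NODE) return node.nodeValue;
  if (node.nodeType !== ELEMENT_NODE) return '';
  const inner = Array.from(node.childNodes).map(serialize).join('');
  if (KEEP_TAGS.has(node.nodeName)) {
    const tag = node.nodeName.toLowerCase();
    return `<${tag}>${inner}</${tag}>`;
  }
  return inner;
}

// Walk the tree; the outermost element that directly holds text becomes
// one whitespace-normalized output string (assumed rule, see above).
function collectBlocks(node, out = []) {
  if (node.nodeType !== ELEMENT_NODE) return out;
  if (hasDirectText(node)) {
    out.push(serialize(node).replace(/\s+/g, ' ').trim());
    return out;
  }
  for (const child of Array.from(node.childNodes)) collectBlocks(child, out);
  return out;
}
```

In a browser this could be called as `collectBlocks(document.getElementById('translate'))`. Note that under these assumptions the `<p>All test went wrong</p>` lines would be included in the result, which is exactly where it diverges from the requested output, so the rule for dropping nodes still has to come from you.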

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/74362537
