
XQuery FLWOR: how to count the frequency of words that appear

Stack Overflow user

Asked on 2020-12-10 05:14:24

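I am going through an XML file looking for the words that come after the word "has", and I am trying to count how often each of them occurs. So far I have found all the words that follow "has", but the result contains duplicates. How can I group the successor words and count each one?

I am using XQuery 1.0.

Snippet of the XML file: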

<s n="2">

<w pos="CONJ" hw="that" c5="CJT-DT0">That </w>

<w pos="PRON" hw="you" c5="PNP">you</w>

<w pos="VERB" hw="be" c5="VBB">'re </w>

<w pos="VERB" hw="greet" c5="VVN">greeted </w>

<w pos="PREP" hw="in" c5="PRP">in </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="ADJ" hw="first" c5="ORD">first </w>

<w pos="SUBST" hw="place" c5="NN1-VVB">place </w>

<w pos="PREP" hw="with" c5="PRP">with </w>

<w pos="UNC" hw="erm" c5="UNC">erm </w>

<w pos="ADV" hw="either" c5="AV0">either </w>

<w pos="SUBST" hw="silence" c5="NN1-VVB">silence </w>

<w pos="CONJ" hw="or" c5="CJC">or </w>

<w pos="ADJ" hw="some" c5="DT0">some </w>

<w pos="ADJ" hw="vague" c5="AJ0">vague </w>

<w pos="CONJ" hw="and" c5="CJC">and </w>

<w pos="ADV" hw="not" c5="XX0">not </w>

<w pos="ADV" hw="singularly" c5="AV0">singularly </w>

<w pos="ADJ" hw="hopeful" c5="AJ0">hopeful </w>

<w pos="SUBST" hw="mutter" c5="NN1-VVB">mutter</w>

<c c5="PUN">, </c>

<w pos="CONJ" hw="but" c5="CJC">but </w>

<w pos="ADV" hw="more" c5="AV0">more </w>

<w pos="ADV" hw="importantly" c5="AV0">importantly </w>

<w pos="PREP" hw="with" c5="PRP">with </w>

<w pos="ART" hw="a" c5="AT0">a </w>

<w pos="ADJ" hw="curious" c5="AJ0">curious </w>

<w pos="SUBST" hw="facial" c5="NN1-AJ0">facial </w>

<w pos="SUBST" hw="expression" c5="NN1">expression </w>

<w pos="VERB" hw="mingle" c5="VVD-VVN">mingled </w>

<w pos="PREP" hw="between" c5="PRP">between </w>

<w pos="UNC" hw="erm" c5="UNC">erm </w>

<w pos="SUBST" hw="dread" c5="NN1">dread </w>

<w pos="CONJ" hw="and" c5="CJC">and </w>

<w pos="SUBST" hw="contempt" c5="NN1">contempt</w>

<c c5="PUN">, </c>

<w pos="SUBST" hw="sort" c5="NN1">sort </w>

<w pos="PREP" hw="of" c5="PRF">of </w>

<w pos="SUBST" hw="thing" c5="NN1">thing </w>

<w pos="PRON" hw="you" c5="PNP">you</w>

<w pos="VERB" hw="would" c5="VM0">'d </w>

<w pos="VERB" hw="expect" c5="VVI">expect </w>


<mw c5="CJS">

<w pos="PREP" hw="as" c5="PRP">as </w>

<w pos="CONJ" hw="if" c5="CJS">if </w>

</mw>

<w pos="PRON" hw="you" c5="PNP">you</w>

<w pos="VERB" hw="have" c5="VHD">'d </w>

<w pos="VERB" hw="say" c5="VVN">said </w>

<w pos="PRON" hw="you" c5="PNP">you </w>

<w pos="VERB" hw="be" c5="VBD">were </w>

<w pos="ART" hw="a" c5="AT0">a </w>

<w pos="SUBST" hw="sorcerer" c5="NN1">sorcerer</w>

<c c5="PUN">.</c>

</s>


<s n="3">

<vocal desc="laugh"/>

<w pos="PRON" hw="i" c5="PNP">I </w>

<w pos="VERB" hw="find" c5="VVB">find </w>

<w pos="PRON" hw="myself" c5="PNX">myself </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="ADJ" hw="only" c5="AJ0">only </w>

<w pos="SUBST" hw="thing" c5="NN1">thing </w>

<w pos="VERB" hw="be" c5="VBZ">is </w>

<w pos="PREP" hw="to" c5="TO0">to </w>

<w pos="VERB" hw="change" c5="VVI">change </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="SUBST" hw="subject" c5="NN1">subject</w>

<c c5="PUN">.</c>

</s>


<s n="4">

<w pos="ADJ" hw="this" c5="DT0">This </w>

<w pos="UNC" hw="erm" c5="UNC">erm </w>

<w pos="SUBST" hw="reaction" c5="NN1">reaction </w>

<w pos="PREP" hw="to" c5="PRP">to </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="SUBST" hw="disclosure" c5="NN1">disclosure </w>

<w pos="PRON" hw="i" c5="PNP">I </w>

<w pos="VERB" hw="think" c5="VVB">think</w>

<w pos="VERB" hw="be" c5="VBZ">'s </w>

<w pos="ADJ" hw="exaggerated" c5="AJ0-VVN">exaggerated </w>

<w pos="CONJ" hw="but" c5="CJC">but </w>

<w pos="PREP" hw="on" c5="PRP">on </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="ADJ" hw="other" c5="AJ0">other </w>

<w pos="SUBST" hw="hand" c5="NN1">hand </w>

<w pos="PRON" hw="there" c5="EX0">there</w>

<w pos="VERB" hw="be" c5="VBZ">'s </w>

<w pos="PRON" hw="something" c5="PNI">something </w>

<w pos="PREP" hw="in" c5="PRP">in </w>

<w pos="PRON" hw="it" c5="PNP">it</w>

<c c5="PUN">.</c>

</s>

My current code gets every word that follows the target word "has":

<html>
<body>
<table border='1'>
<tr><td>Target</td><td>Successor</td></tr>

{
for $targetword in (collection("./?select=*xml"))//s//w
where lower-case(normalize-space($targetword))="has"
let $successor := lower-case(normalize-space($targetword/following-sibling::w[1]))
return <tr><td>{data($targetword)}</td><td>{$successor}</td></tr>
}
</table>
</body>
</html>

Any help would be greatly appreciated.

Tags: xml, xslt, xquery, frequency, flwor

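1 Answer

Stack Overflow user

Posted on 2020-12-10 05:56:33

I am using BaseX.

You need to add grouping to the FLWOR expression. Note that the group by clause was introduced in XQuery 3.0, so this requires a 3.0-capable processor such as BaseX.

XQuery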

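xquery version "3.0";
(: The group by clause used below is XQuery 3.0 syntax, so the version is declared as 3.0 rather than 1.0. :)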

<html>
<body>
<table border='1'>
<thead>
  <tr><th>Target</th><th>Successor</th><th>Rank</th></tr>
</thead>
<tbody>
{
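  (: Visit every word element and keep only occurrences of the target word. :)
  for $targetword in doc("e:\Temp\Hassan_Grouping.xml")//w
  where lower-case(normalize-space($targetword))="you"
  (: The successor is the next sibling <w>, normalised and lower-cased. :)
  let $successor := lower-case(normalize-space($targetword/following-sibling::w[1]))
  (: After grouping, $targetword holds all target words that share this successor. :)
  group by $successor
  return <tr>
      <td>{data($targetword[1])}</td>
      <td>{$successor}</td>
      <td>{count($targetword)}</td>
    </tr>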
}
</tbody></table>
</body>
</html>

Output

<html>
  <body>
    <table border="1">
      <thead>
        <tr>
          <th>Target</th>
          <th>Successor</th>
          <th>Rank</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td>you</td>
          <td>'re</td>
          <td>1</td>
        </tr>
        <tr>
          <td>you</td>
          <td>'d</td>
          <td>2</td>
        </tr>
        <tr>
          <td>you</td>
          <td>were</td>
          <td>1</td>
        </tr>
      </tbody>
    </table>
  </body>
</html>
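
Since the question mentions XQuery 1.0, where the group by clause is not available, the same grouping can be approximated with distinct-values() and count(). The following is only a sketch under stated assumptions, not the answerer's method: the file name corpus.xml and the variable names are hypothetical, and the Count header is a renaming of the Rank column for illustration. The target test is written as a predicate, which also keeps the FLWOR within the 1.0 grammar (that grammar does not allow a let clause after where).

```xquery
xquery version "1.0";

(: Assumption: the corpus is a single well-formed file; "corpus.xml" is a hypothetical path. :)
let $target := "has"

(: Collect the successor of every occurrence of the target word.
   The [following-sibling::w] predicate skips occurrences that have
   no following <w> sibling, so no empty-string successor is counted. :)
let $successors :=
  for $w in doc("corpus.xml")//s//w[lower-case(normalize-space(.)) = $target]
                                   [following-sibling::w]
  return lower-case(normalize-space($w/following-sibling::w[1]))

return
  <table border="1">
    <tr><th>Target</th><th>Successor</th><th>Count</th></tr>
    {
      (: One row per distinct successor; the count is the number of
         entries in $successors equal to that successor. :)
      for $s in distinct-values($successors)
      let $n := count($successors[. = $s])
      (: Most frequent successors first, ties broken by the successor string. :)
      order by $n descending, $s
      return <tr><td>{$target}</td><td>{$s}</td><td>{$n}</td></tr>
    }
  </table>
```

This version rescans $successors once per distinct value, which is usually acceptable for small corpora; on a processor that supports XQuery 3.0, the group by form above is the more direct choice.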

Page content originally provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/65225006
