XQuery FLWOR: how to count the frequency of each word

Stack Overflow user
Asked 2020-12-10 05:14:24

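I am searching an XML file for the words that follow the word "has", and I want to count how often each of those words occurs. So far I can find every word that follows "has", but the result contains duplicates. How can I group the successor words and count each one?

I am using XQuery 1.0.

A snippet of the XML file:
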
<s n="2">

<w pos="CONJ" hw="that" c5="CJT-DT0">That </w>

<w pos="PRON" hw="you" c5="PNP">you</w>

<w pos="VERB" hw="be" c5="VBB">'re </w>

<w pos="VERB" hw="greet" c5="VVN">greeted </w>

<w pos="PREP" hw="in" c5="PRP">in </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="ADJ" hw="first" c5="ORD">first </w>

<w pos="SUBST" hw="place" c5="NN1-VVB">place </w>

<w pos="PREP" hw="with" c5="PRP">with </w>

<w pos="UNC" hw="erm" c5="UNC">erm </w>

<w pos="ADV" hw="either" c5="AV0">either </w>

<w pos="SUBST" hw="silence" c5="NN1-VVB">silence </w>

<w pos="CONJ" hw="or" c5="CJC">or </w>

<w pos="ADJ" hw="some" c5="DT0">some </w>

<w pos="ADJ" hw="vague" c5="AJ0">vague </w>

<w pos="CONJ" hw="and" c5="CJC">and </w>

<w pos="ADV" hw="not" c5="XX0">not </w>

<w pos="ADV" hw="singularly" c5="AV0">singularly </w>

<w pos="ADJ" hw="hopeful" c5="AJ0">hopeful </w>

<w pos="SUBST" hw="mutter" c5="NN1-VVB">mutter</w>

<c c5="PUN">, </c>

<w pos="CONJ" hw="but" c5="CJC">but </w>

<w pos="ADV" hw="more" c5="AV0">more </w>

<w pos="ADV" hw="importantly" c5="AV0">importantly </w>

<w pos="PREP" hw="with" c5="PRP">with </w>

<w pos="ART" hw="a" c5="AT0">a </w>

<w pos="ADJ" hw="curious" c5="AJ0">curious </w>

<w pos="SUBST" hw="facial" c5="NN1-AJ0">facial </w>

<w pos="SUBST" hw="expression" c5="NN1">expression </w>

<w pos="VERB" hw="mingle" c5="VVD-VVN">mingled </w>

<w pos="PREP" hw="between" c5="PRP">between </w>

<w pos="UNC" hw="erm" c5="UNC">erm </w>

<w pos="SUBST" hw="dread" c5="NN1">dread </w>

<w pos="CONJ" hw="and" c5="CJC">and </w>

<w pos="SUBST" hw="contempt" c5="NN1">contempt</w>

<c c5="PUN">, </c>

<w pos="SUBST" hw="sort" c5="NN1">sort </w>

<w pos="PREP" hw="of" c5="PRF">of </w>

<w pos="SUBST" hw="thing" c5="NN1">thing </w>

<w pos="PRON" hw="you" c5="PNP">you</w>

<w pos="VERB" hw="would" c5="VM0">'d </w>

<w pos="VERB" hw="expect" c5="VVI">expect </w>


<mw c5="CJS">

<w pos="PREP" hw="as" c5="PRP">as </w>

<w pos="CONJ" hw="if" c5="CJS">if </w>

</mw>

<w pos="PRON" hw="you" c5="PNP">you</w>

<w pos="VERB" hw="have" c5="VHD">'d </w>

<w pos="VERB" hw="say" c5="VVN">said </w>

<w pos="PRON" hw="you" c5="PNP">you </w>

<w pos="VERB" hw="be" c5="VBD">were </w>

<w pos="ART" hw="a" c5="AT0">a </w>

<w pos="SUBST" hw="sorcerer" c5="NN1">sorcerer</w>

<c c5="PUN">.</c>

</s>


<s n="3">

<vocal desc="laugh"/>

<w pos="PRON" hw="i" c5="PNP">I </w>

<w pos="VERB" hw="find" c5="VVB">find </w>

<w pos="PRON" hw="myself" c5="PNX">myself </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="ADJ" hw="only" c5="AJ0">only </w>

<w pos="SUBST" hw="thing" c5="NN1">thing </w>

<w pos="VERB" hw="be" c5="VBZ">is </w>

<w pos="PREP" hw="to" c5="TO0">to </w>

<w pos="VERB" hw="change" c5="VVI">change </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="SUBST" hw="subject" c5="NN1">subject</w>

<c c5="PUN">.</c>

</s>


<s n="4">

<w pos="ADJ" hw="this" c5="DT0">This </w>

<w pos="UNC" hw="erm" c5="UNC">erm </w>

<w pos="SUBST" hw="reaction" c5="NN1">reaction </w>

<w pos="PREP" hw="to" c5="PRP">to </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="SUBST" hw="disclosure" c5="NN1">disclosure </w>

<w pos="PRON" hw="i" c5="PNP">I </w>

<w pos="VERB" hw="think" c5="VVB">think</w>

<w pos="VERB" hw="be" c5="VBZ">'s </w>

<w pos="ADJ" hw="exaggerated" c5="AJ0-VVN">exaggerated </w>

<w pos="CONJ" hw="but" c5="CJC">but </w>

<w pos="PREP" hw="on" c5="PRP">on </w>

<w pos="ART" hw="the" c5="AT0">the </w>

<w pos="ADJ" hw="other" c5="AJ0">other </w>

<w pos="SUBST" hw="hand" c5="NN1">hand </w>

<w pos="PRON" hw="there" c5="EX0">there</w>

<w pos="VERB" hw="be" c5="VBZ">'s </w>

<w pos="PRON" hw="something" c5="PNI">something </w>

<w pos="PREP" hw="in" c5="PRP">in </w>

<w pos="PRON" hw="it" c5="PNP">it</w>

<c c5="PUN">.</c>

</s>

My current code, which gets every word that follows the target word "has":

<html>
<body>
<table border='1'>
<tr><td>Target</td><td>Successor</td></tr>

{
for $targetword in (collection("./?select=*xml"))//s//w
where lower-case(normalize-space($targetword))="has"
let $successor := lower-case(normalize-space($targetword/following-sibling::w[1]))
return <tr><td>{data($targetword)}</td><td>{$successor}</td></tr>
}
</table>
</body>
</html>

Any help would be appreciated.

1 Answer

Stack Overflow user

Answered 2020-12-10 05:56:33

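I am using BaseX.

You need to add grouping to the FLWOR expression. Note that the group by clause was introduced in XQuery 3.0; it is not part of XQuery 1.0. The sample snippet contains no "has", so the query below searches for "you" instead; change the string to "has" for the real data.

XQuery

xquery version "3.0";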

<html>
<body>
<table border='1'>
<thead>
  <tr><th>Target</th><th>Successor</th><th>Rank</th></tr>
</thead>
<tbody>
{
  for $targetword in doc("e:\Temp\Hassan_Grouping.xml")//w
  where lower-case(normalize-space($targetword))="you"
  let $successor := lower-case(normalize-space($targetword/following-sibling::w[1]))
  group by $successor
  return <tr>
      <td>{data($targetword[1])}</td>
      <td>{$successor}</td>
      <td>{count($targetword)}</td>
    </tr>
}
</tbody></table>
</body>
</html>

Output

<html>
  <body>
    <table border="1">
      <thead>
        <tr>
          <th>Target</th>
          <th>Successor</th>
          <th>Rank</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td>you</td>
          <td>'re</td>
          <td>1</td>
        </tr>
        <tr>
          <td>you</td>
          <td>'d</td>
          <td>2</td>
        </tr>
        <tr>
          <td>you</td>
          <td>were</td>
          <td>1</td>
        </tr>
      </tbody>
    </table>
  </body>
</html>
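
Since the question mentions XQuery 1.0, where group by is not available, here is a minimal sketch of the same count using distinct-values() instead. It assumes the same file path and the same target word ("you") as the query above, and it emits only the table rows; the sort order is an addition for readability, not part of the original answer.

```xquery
xquery version "1.0";

(: Assumption: same sample file and target word as in the grouped query above. :)
let $targets :=
  doc("e:\Temp\Hassan_Grouping.xml")//w[lower-case(normalize-space(.)) = "you"]

(: One normalized successor string per target word.
   A target with no following sibling w contributes an empty string. :)
let $successors :=
  for $t in $targets
  return lower-case(normalize-space($t/following-sibling::w[1]))

(: Emulate grouping: iterate over the distinct successors
   and count how many times each one occurs in the full list. :)
for $s in distinct-values($successors)
let $n := count($successors[. = $s])
order by $n descending, $s
return
  <tr>
    <td>you</td>
    <td>{$s}</td>
    <td>{$n}</td>
  </tr>
```

For the sample snippet this should give the same counts as the grouped version. The filter inside count() rescans the successor list once per distinct value, so on a large corpus the group by form is likely the better choice if your processor supports XQuery 3.0.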
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/65225006
