有关CYK算法XSLT的任何信息,请查看下面的链接: 有两个输入xml,如下所示,我必须在xslt中传递sentance.xml,然后根据每个Rule.xml中的单词,在运行时从Rule.xml文件中读取值,然后生成下面给出的新XML。 仅使用XSLT、XPath和XML,不使用任何其他语言或关键字。
algorithm
1) sentance.xml
<?xml version="1.0" encoding="UTF-8"?>
<sentances>
<s>dog bark</s>
<s>cat drink milk</s>
</sentances>1) sentance.xml
<?xml version="1.0" encoding="UTF-8"?>
<allrules>
<rules>
<rule cat="s">
<rulechild cat="np"/>
<rulechild cat="vp"/>
</rule>
<rule cat="vp">
<rulechild cat="vt"/>
<rulechild cat="np"/>
</rule>
<rule cat="vp">
<rulechild cat="vi"/>
</rule>
</rules>
<words>
<word cat="vi">bark</word>
<word cat="vt">drink</word>
<word cat="pn">dog</word>
<word cat="pn">cat</word>
<word cat="pn">milk</word>
</words>
</allrules>OutPut XML应该如下所示:
<trees>
<tree>
<sentace>dog bark</sentace>
<node cat="s">
<node cat="np">
<word cat="pn">dog</word>
</node>
<node cat="vp">
<word cat="vi">bark</word>
</node>
</node>
</tree>
<tree>
<sentace>cat drink milk</sentace>
<node cat="s">
<node cat="np">
<word cat="pn">cat</word>
</node>
<node cat="vp">
<word cat="vt">drink</word>
<node cat="np">
<word cat="pn">milk</word>
</node>
</node>
</node>
</tree>
是否有可能实现CYK算法并使用XSLT生成上述输出
发布于 2015-10-29 05:34:27
这里有一个解决方案应该接近你所要求的。您没有指定XSLT版本。我已经在样式表中嵌入了规则和符号,但是您可以很容易地调整它们,使它们成为外部文档。
如果XSLT3.0对您不可用,则可以用尾尾递归替换折叠左()。
这个输入文档.
<sentences>
<sentence>dog bark</sentence>
<sentence>cat drink milk</sentence>
</sentences>...当输入到XSLT3.0样式表时.
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:so="http://stackoverflow.com/questions/33340967"
version="3.0"
exclude-result-prefixes="xsl xs fn so">
<xsl:output encoding="utf-8" omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:variable name="rules" as="element(rule)*">
<rule cat="s">
<!-- All rules have precisely 2 children. -->
<rulechild cat="np"/>
<rulechild cat="vp"/>
</rule>
<rule cat="vp">
<rulechild cat="vt"/>
<rulechild cat="np"/>
</rule>
</xsl:variable>
<xsl:variable name="words" as="element(word)+">
<word cat="vi">bark</word>
<word cat="vp">bark</word>
<word cat="vt">drink</word>
<word cat="np">dog</word>
<word cat="np">cat</word>
<word cat="np">milk</word>
</xsl:variable>
<!--
The n'th analysis contains the CYK analysis for symbol sequences of length n.
Let their be s symbols in the sentence.
analysis[1] has s children.
analysis[s] has one child.
analysis[n] has s - n + 1 children
The children of analysis are node and only node.
node element represents a node in CYK analysis. This can either be a word or a string of symbols.
The index of the node within its parent analysis corresponds to the start symbol.
This index is equal to the index of the word within $words, of the starting word.
node has any number of children, but the only type this can be is permutation.
permutation represents a possible value for the node content, the competing alternatives
being all the sibling permutations. Thus if a node has no permutations, there is no
possiblity of a sequence of the given length being a correct grammar at that position
in the sentence.
Each permuation either has as children: 1 word; or 2 nodes.
The permuations in the first row (analysis[1] are all of the word type.
Subsequent rows have permutations of any type.
words and permutations all have an attribute cat, which is the symbol.
-->
<xsl:function name="so:analysis-1" as="element(analysis)">
<!-- Do the first row of CYK. -->
<xsl:param name="sentence" as="xs:string" />
<analysis>
<xsl:analyze-string select="$sentence" regex="\w+">
<xsl:matching-substring>
<xsl:variable name="word" select="." />
<node>
<xsl:for-each select="$words[. eq $word]">
<permutation cat="{@cat}">
<word cat="{@cat}"><xsl:value-of select="$word" /></word>
</permutation>
</xsl:for-each>
</node>
</xsl:matching-substring>
</xsl:analyze-string>
</analysis>
</xsl:function>
<xsl:function name="so:next-analysis" as="element(analysis)">
<!-- Given the first n rows of CYK, compute the n+1'th row. -->
<xsl:param name="rows" as="element(analysis)+" />
<xsl:variable name="word-count" select="count( $rows[1]/node)" as="xs:integer" />
<xsl:variable name="node-count" select="count( $rows[last()]/node) - 1" as="xs:integer" />
<xsl:variable name="seq-len" select="$word-count - $node-count + 1" as="xs:integer" />
<analysis>
<xsl:for-each select="1 to $node-count">
<xsl:variable name="index" select="." as="xs:integer" />
<node>
<xsl:for-each select="$rules">
<xsl:variable name="rule" as="element(rule)" select="." />
<xsl:for-each select="
for $sub-a in 1 to $seq-len - 1 return $sub-a
[$rows[$sub-a ]/node[$index ][permutation/@cat = $rule/rulechild[1]/@cat]]
[$rows[$seq-len - $sub-a]/node[$index + $sub-a][permutation/@cat = $rule/rulechild[2]/@cat]]">
<xsl:variable name="sub-a" select="." as="xs:integer" />
<permutation cat="{$rule/@cat}">
<node>
<xsl:copy-of select="$rows[$sub-a]/node[$index]/permutation[@cat eq $rule/rulechild[1]/@cat]" />
</node>
<node>
<xsl:copy-of select="$rows[$seq-len - $sub-a]/node[$index + $sub-a]/permutation[@cat eq $rule/rulechild[2]/@cat]" />
</node>
</permutation>
</xsl:for-each>
</xsl:for-each>
</node>
</xsl:for-each>
</analysis>
</xsl:function>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="sentences">
<trees>
<xsl:apply-templates />
</trees>
</xsl:template>
<xsl:template match="sentence">
<tree>
<xsl:variable name="first-row" select="so:analysis-1(.)" />
<xsl:variable name="word-count" select="count( $first-row/node)" as="xs:integer" />
<xsl:sequence select="fold-left( 2 to $word-count, $first-row, function($a, $b) { $a, so:next-analysis(a) })
[last()]" />
</tree>
</xsl:template>
</xsl:stylesheet>...会产生这个输出.
<trees>
<tree>
<analysis>
<node>
<permutation cat="s">
<node>
<permutation cat="np">
<word cat="np">dog</word>
</permutation>
</node>
<node>
<permutation cat="vp">
<word cat="vp">bark</word>
</permutation>
</node>
</permutation>
</node>
</analysis>
</tree>
<tree>
<analysis>
<node>
<permutation cat="s">
<node>
<permutation cat="np">
<word cat="np">cat</word>
</permutation>
</node>
<node>
<permutation cat="vp">
<node>
<permutation cat="vt">
<word cat="vt">drink</word>
</permutation>
</node>
<node>
<permutation cat="np">
<word cat="np">milk</word>
</permutation>
</node>
</permutation>
</node>
</permutation>
</node>
</analysis>
</tree>
</trees>请注意,撤离者
我还没测试过这个。
替代方案
如果您对所有的排列都不感兴趣,并且只想要任何(第一个)置换,那么我们可以添加几个模板,除去所有的排列,只有一个。
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:so="http://stackoverflow.com/questions/33340967"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
version="3.0"
exclude-result-prefixes="xsl xs fn so">
<xsl:output encoding="utf-8" omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:variable name="rules" as="element(rule)*">
<rule cat="s">
<!-- All rules have precisely 2 children. -->
<rulechild cat="np"/>
<rulechild cat="vp"/>
</rule>
<rule cat="vp">
<rulechild cat="vt"/>
<rulechild cat="np"/>
</rule>
</xsl:variable>
<xsl:variable name="words" as="element(word)+">
<word cat="vi">bark</word>
<word cat="vp">bark</word>
<word cat="vt">drink</word>
<word cat="np">dog</word>
<word cat="np">cat</word>
<word cat="np">milk</word>
</xsl:variable>
<xsl:function name="so:analysis-1" as="element(analysis)">
<!-- Do the first row of CYK. -->
<xsl:param name="sentence" as="xs:string" />
<analysis>
<xsl:analyze-string select="$sentence" regex="\w+">
<xsl:matching-substring>
<xsl:variable name="word" select="." />
<node>
<xsl:for-each select="$words[. eq $word]">
<permutation cat="{@cat}">
<word cat="{@cat}"><xsl:value-of select="$word" /></word>
</permutation>
</xsl:for-each>
</node>
</xsl:matching-substring>
</xsl:analyze-string>
</analysis>
</xsl:function>
<xsl:function name="so:next-analysis" as="element(analysis)">
<!-- Given the first n rows of CYK, compute the n+1'th row. -->
<xsl:param name="rows" as="element(analysis)+" />
<xsl:variable name="word-count" select="count( $rows[1]/node)" as="xs:integer" />
<xsl:variable name="node-count" select="count( $rows[last()]/node) - 1" as="xs:integer" />
<xsl:variable name="seq-len" select="$word-count - $node-count + 1" as="xs:integer" />
<analysis>
<xsl:for-each select="1 to $node-count">
<xsl:variable name="index" select="." as="xs:integer" />
<node>
<xsl:for-each select="$rules">
<xsl:variable name="rule" as="element(rule)" select="." />
<xsl:for-each select="
for $sub-a in 1 to $seq-len - 1 return $sub-a
[$rows[$sub-a ]/node[$index ][permutation/@cat = $rule/rulechild[1]/@cat]]
[$rows[$seq-len - $sub-a]/node[$index + $sub-a][permutation/@cat = $rule/rulechild[2]/@cat]]">
<xsl:variable name="sub-a" select="." as="xs:integer" />
<permutation cat="{$rule/@cat}">
<node>
<xsl:copy-of select="$rows[$sub-a]/node[$index]/permutation[@cat eq $rule/rulechild[1]/@cat]" />
</node>
<node>
<xsl:copy-of select="$rows[$seq-len - $sub-a]/node[$index + $sub-a]/permutation[@cat eq $rule/rulechild[2]/@cat]" />
</node>
</permutation>
</xsl:for-each>
</xsl:for-each>
</node>
</xsl:for-each>
</analysis>
</xsl:function>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="sentences">
<trees>
<xsl:apply-templates />
</trees>
</xsl:template>
<xsl:template match="sentence">
<tree>
<xsl:variable name="first-row" select="so:analysis-1(.)" />
<xsl:apply-templates select="
fold-left(
2 to count( $first-row/node),
$first-row,
function($a, $b) { $a, so:next-analysis(a) })
[last()]" />
</tree>
</xsl:template>
<xsl:template match="analysis">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="node[not( fn:empty(*))]">
<node cat="{permutation[1]/@cat}">
<xsl:apply-templates select="permutation[1]/*"/>
</node>
</xsl:template>
<xsl:template match="node[fn:empty(*)]">
<node xsi:nil="true" />
</xsl:template>
</xsl:stylesheet>替代输出
输出应该更整洁,看起来像这样.
<trees xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<tree>
<node cat="s">
<node cat="np">
<word cat="np">dog</word>
</node>
<node cat="vp">
<word cat="vp">bark</word>
</node>
</node>
</tree>
<tree>
<node cat="s">
<node cat="np">
<word cat="np">cat</word>
</node>
<node cat="vp">
<node cat="vt">
<word cat="vt">drink</word>
</node>
<node cat="np">
<word cat="np">milk</word>
</node>
</node>
</node>
</tree>
</trees>https://stackoverflow.com/questions/33340967
复制相似问题