HXT: Select a node by position with HXT in Haskell?

Stack Overflow user

Asked 2013-07-23 06:18:03

2 answers, 562 views, 0 followers, 5 votes

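I'm trying to parse some XML files with Haskell. For this job I'm using HXT, partly to learn how arrows work in a real-world application, so I'm quite new to the arrow topic.

In XPath (and HaXml) it's possible to select a node by position, for example: /root/a[2]/b

I can't figure out how to do something like that with HXT, even after reading the documentation again and again.

Here is some sample code I'm working with: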

module Main where

import Text.XML.HXT.Core

testXml :: String
testXml = unlines
    [ "<?xml version=\"1.0\"?>"
    , "<root>"
    , "    <a>"
    , "        <b>first element</b>"
    , "        <b>second element</b>"
    , "    </a>"
    , "    <a>"
    , "        <b>third element</b>"
    , "    </a>"
    , "    <a>"
    , "        <b>fourth element</b>"
    , "        <b>enough...</b>"
    , "    </a>"
    , "</root>"
    ]

selector :: ArrowXml a => a XmlTree String
selector = getChildren /> isElem >>> hasName "a" -- how to select second <a>?
                       /> isElem >>> hasName "b"
                       /> getText

main :: IO ()
main = do
    let doc = readString [] testXml
    nodes <- runX $ doc >>> selector
    mapM_ putStrLn nodes

The desired output would be:

third element

Thanks in advance!

arrows

hxt

haskell

2 Answers

Stack Overflow user

Accepted answer

Answered 2013-07-23 18:55:04

Here is a solution which, I believe, selects "/root/a[2]/b" (all "b" tags inside the second "a" tag):

selector :: ArrowXml a => Int -> a XmlTree String
selector nth =
    (getChildren /> isElem >>> hasName "a")   -- the parentheses required!
    >. (!! nth) 
    /> isElem >>> hasName "b" /> getText

(The result is ["third element"] when nth is 1, since (!!) is zero-based.)

Explanation: as I see it, class (..., ArrowList a, ...) => ArrowXml a, so ArrowXml is a subclass of ArrowList. Looking through the ArrowList interface:

(>>.) :: a b c -> ([c] -> [d]) -> a b d
(>.) :: a b c -> ([c] -> d) -> a b d

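So >>. can select a subset of the result list using a lifted function of type [c] -> [d], and >. can select a single item from the result list using a lifted function of type [c] -> d. So, after selecting the children and filtering for "a" tags, let's use (!! nth) :: [a] -> a.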
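To make the difference between the two operators concrete, here is a minimal sketch that models a list arrow as a plain function b -> [c]. This is only a toy model and not HXT's code: the names ListFn, mapResults, collapseResults and letters are made up for illustration, and it uses nothing beyond base.

```haskell
module Main where

-- Toy model (assumption): a list arrow is a function from one input to many results.
type ListFn b c = b -> [c]

-- Plays the role of (>>.): post-process the whole result list.
mapResults :: ListFn b c -> ([c] -> [d]) -> ListFn b d
mapResults f g = g . f

-- Plays the role of (>.): collapse the whole result list into exactly one result.
collapseResults :: ListFn b c -> ([c] -> d) -> ListFn b d
collapseResults f g = \x -> [g (f x)]

-- Hypothetical sample arrow: every character of the input string is one result.
letters :: ListFn String Char
letters = id

main :: IO ()
main = do
    print (mapResults letters (take 2) "abc")     -- prints "ab"
    print (collapseResults letters (!! 1) "abc")  -- prints "b"
    print (collapseResults letters length "abc")  -- prints [3]
```

Note that the collapsing variant always yields exactly one result, so a partial function such as (!! 1) will raise an error whenever the result list is too short.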

There is one important thing to note:

infix 1 >>>
infix 5 />
infix 8 >.

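(so I had a hard time figuring out why >. without parentheses did not work as expected). Because >. binds more tightly than /> and >>>, an unparenthesized >. applies only to its nearest operand, hasName "a", rather than to the whole chain. Thus getChildren /> isElem >>> hasName "a" must be wrapped in parentheses.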
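The effect of those precedence levels can be checked without HXT. The sketch below declares three stand-in operators on strings with the same names and the same precedence levels as listed above, and lets the compiler show how an unparenthesized chain is grouped. The operators here are toys that only build a parenthesized string; the associativity of the real operators may differ from the plain infix used here, and only the precedence levels matter for this illustration.

```haskell
module Main where

-- Stand-in operators (not HXT's): same names and precedence levels as in the
-- answer, but they just render how the expression was grouped.
infix 1 >>>
infix 5 />
infix 8 >.

-- Hypothetical helper: render a binary application with explicit parentheses.
paren :: String -> String -> String -> String
paren op l r = "(" ++ l ++ " " ++ op ++ " " ++ r ++ ")"

(>>>), (/>), (>.) :: String -> String -> String
l >>> r = paren ">>>" l r
l />  r = paren "/>" l r
l >.  r = paren ">." l r

-- The selector prefix written without the extra parentheses.
grouping :: String
grouping = "getChildren" /> "isElem" >>> "hasName a" >. "(!! nth)"

main :: IO ()
main = putStrLn grouping
-- prints ((getChildren /> isElem) >>> (hasName a >. (!! nth)))
```

The printed grouping shows (!! nth) attached to hasName "a" alone, which is why the accepted selector parenthesizes the whole chain before applying >. to it.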

5 votes

Stack Overflow user

Answered 2013-07-24 17:41:34

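This is just an extension of EarlGray's answer. See there for the explanation of >>. and >.! After asking the question I realized that I needed to walk the tree in a specific, deterministic way. This is the solution I'm using for my particular problem, and I want to share the sample code in case someone else tries to accomplish the same thing.

Let's say we want to extract the text of the second <b> in the first <a>. Not all <a> elements have at least two <b> children, so EarlGray's code would fail there, because you can't use the (!!) function on a list that is too short (or empty!).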

Have a look at the function single in Control.Arrow.ArrowList, which keeps only the first result of a list arrow:

single :: ArrowList a => a b c -> a b c
single f = f >>. take 1

We want to extract the n-th element instead:

junction :: ArrowList a => a b c -> Int -> a b c
junction a nth = a >>. (take 1 . drop (nth - 1))
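The list function passed to >>. inside junction can be tried on ordinary lists. The sketch below isolates it under the made-up name pickNth (not part of HXT or of the answer) to show that, unlike (!!), it is total: it returns an empty list instead of raising an error when there are too few elements.

```haskell
module Main where

-- Hypothetical name for the function junction hands to (>>.):
-- 1-based selection that yields zero or one element.
pickNth :: Int -> [a] -> [a]
pickNth nth = take 1 . drop (nth - 1)

main :: IO ()
main = do
    print (pickNth 2 "ab")           -- prints "b"
    print (pickNth 2 "a")            -- prints "" (too short, no error)
    print (pickNth 1 ([] :: [Int]))  -- prints []
    print (pickNth 0 "ab")           -- prints "a" (drop ignores non-positive counts)
```

One caveat that follows from the last line: an index of 0 or below silently behaves like index 1, so a caller who wants stricter behaviour would need to validate nth first.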

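Now we can use this new arrow to build the selector. Because junction modifies an existing arrow, the arrow we want to filter with junction has to be wrapped in parentheses.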

selector :: ArrowXml a => a XmlTree String
selector = getChildren -- There is only one root element.
         -- For each selected element: Get a list of all children and filter them out.
         -- The junction function now selects at most one element.
         >>> (getChildren >>> isElem >>> hasName "a") `junction` 1 -- selects first <a>
         -- The same thing to select the second <b> for all the <a>s
         -- (But we had selected only one <a> in this case!
         -- Imagine commenting out the `junction` 1 above.)
         >>> (getChildren >>> isElem >>> hasName "b") `junction` 2 -- selects second <b>
         -- Now get the text of the element.
         >>> getChildren >>> getText

To extract the value and return it as a Maybe:

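-- Needs: import Data.Maybe (listToMaybe)
-- (and, on GHC older than 7.10, import Control.Applicative ((<$>)))
main :: IO ()
main = do
    let doc = readString [] testXml
    text <- listToMaybe <$> (runX $ doc >>> selector)
    print text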

With the sample XML document this prints Just "second element".

1 vote

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/17798417
