Parsing XML with HXT

Asked by a Stack Overflow user on 2013-09-01 19:21:58
1 answer · 542 views · 0 followers · 2 votes

My goal is to extract two lists from this XML file:

<famous_people>
  <famous_person>
    <first_name>Wolfgang</first_name>
    <last_name>Goethe</last_name>
    <year_of_birth>1749</year_of_birth>
    <country_of_origin>Germany</country_of_origin>
  </famous_person>
  <famous_person>
    <first_name>Miguel</first_name>
    <last_name>Cervantes</last_name>
    <widely_known_for>Don Quixote</widely_known_for>
  </famous_person>
</famous_people>

The list I want to extract is:

[[("first_name","Wolfgang"),("last_name","Goethe"),("year_of_birth","1749"),("country_of_origin","Germany")],[("first_name","Miguel"),("last_name","Cervantes"),("widely_known_for","Don Quixote")]]

I have only got as far as collecting all the tuples I am interested in into one big flat list, as this GHCi output shows:

Prelude> import Text.XML.HXT.Core
Prelude Text.XML.HXT.Core> import Text.HandsomeSoup
Prelude Text.XML.HXT.Core Text.HandsomeSoup> 
Prelude Text.XML.HXT.Core Text.HandsomeSoup> html <- readFile "test.html"
Prelude Text.XML.HXT.Core Text.HandsomeSoup> 
Prelude Text.XML.HXT.Core Text.HandsomeSoup> let doc = readString [] html
Prelude Text.XML.HXT.Core Text.HandsomeSoup> 
Prelude Text.XML.HXT.Core Text.HandsomeSoup> runX $ doc >>> getChildren >>> getChildren >>> getChildren >>> multi (getName &&& deep getText)
[("first_name","Wolfgang"),("last_name","Goethe"),("year_of_birth","1749"),("country_of_origin","Germany"),("first_name","Miguel"),("last_name","Cervantes"),("widely_known_for","Don Quixote")]

How do I get the desired list of two lists?

1 Answer

Accepted answer from a Stack Overflow user, posted 2013-09-01 21:52:52

I used the listA function to collect the results into a list. Here is my code:

module Famous where
import Text.XML.HXT.Core (isElem, hasName, getChildren, getText, listA, runX, readDocument, getName)
import Control.Arrow.ArrowTree (deep)
import Control.Arrow ((>>>), (&&&))
import Text.XML.HXT.Arrow.XmlArrow (ArrowXml)
import Text.XML.HXT.DOM.TypeDefs (XmlTree)

atTag :: ArrowXml a => String -> a XmlTree XmlTree
atTag tag = deep (isElem >>> hasName tag)

parseFamous :: ArrowXml a => a XmlTree [(String, String)]
parseFamous = atTag "famous_person" >>> listA (getChildren >>>
                                               (getName &&& (getChildren >>> getText)))



main :: IO ()
main = do
  let path = "famous.xml"
  result <- runX (readDocument [] path >>> parseFamous)
  print result
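
To see why listA gives the grouping, it may help to run the same idea against the XML from the question held in a string. The following is only a sketch under a few assumptions: the names sampleXml, fields, people, and extractPeople are made up for illustration, validation is switched off with withValidate no because the sample has no DTD, and an explicit isElem filter is added so that whitespace between elements is skipped. HXT arrows produce zero or more results per input, and composing them with >>> flattens those results; listA instead gathers everything its inner arrow produces for one input node into a single list, so each famous_person contributes exactly one inner list.

```haskell
module Main where

import Text.XML.HXT.Core

-- Sample input copied from the question.
sampleXml :: String
sampleXml = unlines
  [ "<famous_people>"
  , "  <famous_person>"
  , "    <first_name>Wolfgang</first_name>"
  , "    <last_name>Goethe</last_name>"
  , "    <year_of_birth>1749</year_of_birth>"
  , "    <country_of_origin>Germany</country_of_origin>"
  , "  </famous_person>"
  , "  <famous_person>"
  , "    <first_name>Miguel</first_name>"
  , "    <last_name>Cervantes</last_name>"
  , "    <widely_known_for>Don Quixote</widely_known_for>"
  , "  </famous_person>"
  , "</famous_people>"
  ]

-- One (tag, text) pair per child element of the current node.
-- Whitespace-only text children are skipped by the isElem filter.
fields :: ArrowXml a => a XmlTree (String, String)
fields = getChildren >>> isElem >>> (getName &&& (getChildren >>> getText))

-- listA wraps all pairs found under one famous_person into one list,
-- so the overall arrow yields one list per person instead of a flat stream.
people :: ArrowXml a => a XmlTree [(String, String)]
people = deep (isElem >>> hasName "famous_person") >>> listA fields

-- Hypothetical helper: parse a string and run the arrow.
extractPeople :: String -> IO [[(String, String)]]
extractPeople xml = runX (readString [withValidate no] xml >>> people)

main :: IO ()
main = extractPeople sampleXml >>= print
```

Dropping listA from people would change its result type to (String, String) and bring back the flat list from the question. Note also that an element with no text content produces no pair at all in this sketch, because the &&& combination is empty when getText yields nothing.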
2 votes

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/18562029
