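I've been having trouble getting set up correctly with Nokogiri and its documentation; it has been a bit awkward to get started.
I'm trying to parse this XML file: http://www.kongregate.com/games_for_your_site.xml
It returns multiple games inside a gameset, and each game has a title, a description, and so on.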
<gameset>
<game>
<id>160342</id>
<title>Tricky Rick</title>
<thumbnail>
http://cdn3.kongregate.com/game_icons/0042/7180/KONG_icon250x200_site.png?21656-op
</thumbnail>
<launch_date>2012-12-12</launch_date>
<category>Puzzle</category>
<flash_file>
http://external.kongregate-games.com/gamez/0016/0342/live/embeddable_160342.swf
</flash_file>
<width>640</width>
<height>480</height>
<url>
http://www.kongregate.com/games/tAMAS_Games/tricky-rick
</url>
<description>
Help Rick to collect all the stolen fuel to refuel his spaceship and fly away from the planet. Use hammer, bombs, jetpack and other useful stuff to solve puzzles!
</description>
<instructions>
WASD \ Arrow Keys – move; S \ Down Arrow – take\release an object; CNTRL – interaction with objects: throw, hammer strike, invisibility mode; SPACE – interaction with elevators and fuel stations; Esc \ P – pause;
</instructions>
<developer_name>tAMAS_Games</developer_name>
<gameplays>24999</gameplays>
<rating>3.43</rating>
</game>
<game>
<id>160758</id>
<title>Flying Cookie Quest</title>
<thumbnail>
http://cdn2.kongregate.com/game_icons/0042/8428/icon_cookiequest_kong_250x200_site.png?16578-op
</thumbnail>
<launch_date>2012-12-07</launch_date>
<category>Action</category>
<flash_file>
http://external.kongregate-games.com/gamez/0016/0758/live/embeddable_160758.swf
</flash_file>
<width>640</width>
<height>480</height>
<url>
http://www.kongregate.com/games/LongAnimals/flying-cookie-quest
</url>
<description>
Launch Rocket Panda into the land of Cookies. With the help of low-flying sharks, hang-gliding sheep and Rocket Badger, can you defeat the all powerful Biscuit Head? Defeat All enemies of cookies in this launcher game.
</description>
<instructions>Use the mouse button!</instructions>
<developer_name>LongAnimals</developer_name>
<gameplays>168672</gameplays>
<rating>3.67</rating>
</game>
Following the documentation, I'm using something like this:
require 'nokogiri'
require 'open-uri'
url = "http://www.kongregate.com/games_for_your_site.xml"
xml = Nokogiri::XML(open(url))
xml.xpath("//game").each do |node|
  puts node.xpath("//id")
  puts node.xpath("//title")
  puts node.xpath("//thumbnail")
  puts node.xpath("//category")
  puts node.xpath("//flash_file")
  puts node.xpath("//width")
  puts node.xpath("//height")
  puts node.xpath("//description")
  puts node.xpath("//instructions")
end
However, it just returns an endless stream of data rather than one set of values per game. Any help would be appreciated.
Answered 2013-01-01 15:29:06
Here's how I would rewrite your code:
xml = Nokogiri::XML(open("http://www.kongregate.com/games_for_your_site.xml"))
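xml.xpath("//game").each do |game|
  %w[id title thumbnail category flash_file width height description instructions].each do |n|
    puts game.at(n)
  end
end
The problem with your code is that every child tag is prefixed with //, which in XPath-ese means "start at the root node and search downward for all tags with that name". So instead of searching only inside each //game node, it searches the whole document for each listed tag, once for every //game node.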
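I recommend using CSS accessors over XPath, because they're (usually) simpler and easier to read. So I'd use search('game') rather than xpath('//game'). (search accepts either CSS or XPath accessors, and so does at.)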
If you want the text contained in the tag, change puts game.at(n) to:
puts game.at(n).text
To make the output more useful, I'd do this:
require 'nokogiri'
require 'open-uri'
xml = Nokogiri::XML(open('http://www.kongregate.com/games_for_your_site.xml'))
# Build one hash per game, keyed by tag name.
games = xml.search('game').map do |game|
  %w[
    id title thumbnail category flash_file width height description instructions
  ].each_with_object({}) do |n, o|
    o[n] = game.at(n).text
  end
end
require 'awesome_print'
puts games.size
ap games.first
ap games.last
Which results in:
395
{
"id" => "160342",
"title" => "Tricky Rick",
"thumbnail" => "http://cdn3.kongregate.com/game_icons/0042/7180/KONG_icon250x200_site.png?21656-op",
"category" => "Puzzle",
"flash_file" => "http://external.kongregate-games.com/gamez/0016/0342/live/embeddable_160342.swf",
"width" => "640",
"height" => "480",
"description" => "Help Rick to collect all the stolen fuel to refuel his spaceship and fly away from the planet. Use hammer, bombs, jetpack and other useful stuff to solve puzzles!\n",
"instructions" => "WASD \\ Arrow Keys – move;\nS \\ Down Arrow – take\\release an object;\nCNTRL – interaction with objects: throw, hammer strike, invisibility mode;\nSPACE – interaction with elevators and fuel stations;\nEsc \\ P – pause;\n"
}
{
"id" => "78",
"title" => "rotaZion",
"thumbnail" => "http://cdn2.kongregate.com/game_icons/0000/0115/pixtiz.rotazion_icon.jpg?8217-op",
"category" => "Action",
"flash_file" => "http://external.kongregate-games.com/gamez/0000/0078/live/embeddable_78.swf",
"width" => "350",
"height" => "350",
"description" => "In rotaZion, you play with a bubble bar that you can’t stop rotating !\nCollect the bubbles and try to avoid the mines !\nCollect the different bonus to protect your bubble bar, makes the mines go slower or destroy all the mines !\nTry to beat 100.000 points ;)\n",
"instructions" => "Move the bubble bar with the arrow keys !\nBubble = 500 Points !\nPixtiz sign = 5000 Points !\n"
}
Answered 2013-01-01 10:50:35
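You could try something like this. I'd suggest creating an array of the elements you want from each game and then iterating over them. I'm sure there's a way in Nokogiri to get all the elements inside a given element, but this works: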
# `result` is assumed to hold the XML document fetched earlier.
xml = Nokogiri::XML(result)
xml.css("game").each do |inv|
  inv.css("title").each do |f| # title or whatever else you want
    puts f.inner_html
  end
end
https://stackoverflow.com/questions/14107178