I have a string like this:

lorep ipsum <a href="#" class="link-1">dolor sit</a>amet, consectetur <a href="#" class="link-2">adipiscing</a> elit.

I need to split it into fragments, but keep the link class for the fragments that sit inside anchors. The ideal result would be:

['lorep ipsum ', {'link-1' => 'dolor sit'}, 'amet, consectetur', {'link-2' => 'adipiscing'}, ' elit.']

or:

['lorep ipsum ', ['link-1', 'dolor sit'], 'amet, consectetur', ['link-2', 'adipiscing'], ' elit.']

I tried the following code:

string.split(/<[^>]>/)

but it only returns an array of fragments.
Posted on 2014-01-22 13:32:15

I would use Nokogiri:
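require 'nokogiri'

doc = Nokogiri::HTML.parse <<-eot
lorep ipsum <a href="#" class="link-1">dolor sit</a>amet, consectetur <a href="#" class="link-2">adipiscing</a> elit.
eot

# For each link, collect the text before it, a {class => text} hash, and the text after it.
ary = doc.search("//a").flat_map do |n|
  [n.previous_sibling.text.strip, {n['class'] => n.text.strip}, n.next_sibling.text.strip]
end.uniq # the text between two links is collected twice; uniq drops the repeat

p ary

Output: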
["lorep ipsum", {"link-1"=>"dolor sit"}, "amet, consectetur", {"link-2"=>"adipiscing"}, "elit."]

https://stackoverflow.com/questions/21284054
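
If adding a gem is not an option, the second format from the question can also be approximated with plain `String#split`, since a capture group in the pattern makes `split` keep the matched delimiters. This is only a sketch, not part of the answer above. `split_with_links` is a hypothetical helper name, and it assumes simple, well-formed, non-nested `<a>` tags with a double-quoted `class` attribute. For arbitrary HTML a real parser such as Nokogiri is the safer choice.

```ruby
# Hypothetical helper (not from the original answer): splits a string on <a> elements
# and turns each link into a [class, text] pair. Assumes simple, non-nested anchors.
def split_with_links(str)
  # The capture group makes String#split keep each matched <a>...</a> element.
  pieces = str.split(/(<a\b[^>]*>.*?<\/a>)/m).reject(&:empty?)

  pieces.map do |piece|
    m = piece.match(/\A<a\b[^>]*\bclass="([^"]*)"[^>]*>(.*?)<\/a>\z/m)
    # Plain text, or an anchor without a class attribute, is returned unchanged.
    m ? [m[1], m[2]] : piece
  end
end

str = 'lorep ipsum <a href="#" class="link-1">dolor sit</a>amet, consectetur ' \
      '<a href="#" class="link-2">adipiscing</a> elit.'
p split_with_links(str)
```

Unlike the Nokogiri version, this keeps the surrounding whitespace and does not need `uniq`, so a fragment that legitimately appears twice is not dropped.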