Modifying the following methods so I call Nokogiri::HTML only once?

Stack Overflow user
Asked 2013-08-21 12:15:09
1 answer, 70 views, 0 followers, score 1

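The following code scrapes different sections of a site. Basically, I have one method per section, and then a method (parse_details) that merges all the resulting hashes into a single hash: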

class Parser
  def parse_details(html)
    merged_hashes = {}
    array_of_hashes = [
      self.parse_department(html),
      self.parse_super_saver(html),
    ]
    array_of_hashes.inject(merged_hashes,:update)

    return merged_hashes
  end

  def parse_department(file)  
    html       = file
    data       = Nokogiri::HTML(open(html))
    department = data.css('#ref_2619534011')

    @department_hash = {}
    department.css('li').drop(1).each do | department |
      department_title = department.css('.refinementLink').text
      department_count = department.css('.narrowValue').text[/[\d,]+/].delete(",").to_i
      @department_hash[:department] ||= {}
      @department_hash[:department]["Pet Supplies"] ||= {}
      @department_hash[:department]["Pet Supplies"][department_title] = department_count
    end 

    return @department_hash
  end 

  def parse_super_saver(file)
    html        = file
    data        = Nokogiri::HTML(open(html))
    super_saver = data.css('#ref_2661623011')

    @super_saver_hash = {}
    super_saver.css('li').each do | super_saver |
      super_saver_title = super_saver.css('.refinementLink').text
      super_saver_count = super_saver.css('.narrowValue').text[/[\d,]+/].delete(",").to_i
      @super_saver_hash[:super_saver] ||= {}
      @super_saver_hash[:super_saver][super_saver_title] = super_saver_count
    end 

    return @super_saver_hash
  end
end

As you can see, I call Nokogiri::HTML(open(html)) more than once.

Someone suggested that I do this instead:

  def self.parse(html)
    doc = Nokogiri::HTML html 
    self.parse_details(doc) unless doc.nil?
  end

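That way I would call Nokogiri::HTML only once.

But I'm stuck. For example, I don't know what to do with parts like department = data.css('#ref_2619534011'): should they go into the new parse method? I also don't know what to do with the html and file parameters. Once I have the new parse method, should I keep them or remove them?

Any suggestions on how to accomplish this?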

1 Answer

Stack Overflow user

Answered 2013-08-21 13:14:28
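require 'nokogiri'
require 'open-uri'

class Parser
  def initialize(url)
    # Parse the document once; every parse_* method below reuses @data.
    # On Ruby 3.0 and later, Kernel#open no longer opens URLs,
    # so use URI.open(url) there when url points at a remote page.
    @data = Nokogiri.HTML(open(url))
  end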
  def parse_details()
    {}.tap do |merged_hashes|
      array_of_hashes = [
        parse_department(),
        parse_super_saver(),
      ]
      array_of_hashes.inject(merged_hashes,:update)
    end
  end

  def parse_department()  
    department = @data.css('#ref_2619534011')

    @department_hash = {}
    department.css('li').drop(1).each do | department |
      department_title = department.css('.refinementLink').text
      department_count = department.css('.narrowValue').text[/[\d,]+/].delete(",").to_i
      @department_hash[:department] ||= {}
      @department_hash[:department]["Pet Supplies"] ||= {}
      @department_hash[:department]["Pet Supplies"][department_title] = department_count
    end 
    @department_hash
  end 

  def parse_super_saver()
    super_saver = @data.css('#ref_2661623011')

    @super_saver_hash = {}
    super_saver.css('li').each do | super_saver |
      super_saver_title = super_saver.css('.refinementLink').text
      super_saver_count = super_saver.css('.narrowValue').text[/[\d,]+/].delete(",").to_i
      @super_saver_hash[:super_saver] ||= {}
      @super_saver_hash[:super_saver][super_saver_title] = super_saver_count
    end 
    @super_saver_hash
  end
end
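
A possible variation on the class above, sketched under assumptions: instead of parsing in the constructor, the document can be parsed lazily on first use and cached. The names `LazyParser` and `loader` are hypothetical and not part of the original answer; the loader here returns a plain hash so the example runs without Nokogiri or network access, whereas a real loader would be something like `-> { Nokogiri::HTML(URI.open(url)) }`.

```ruby
# Hypothetical variation: parse on first access, then reuse the cached result.
class LazyParser
  # loader is any callable that returns the parsed document.
  def initialize(loader)
    @loader = loader
  end

  # The loader runs only the first time data is requested.
  def data
    @data ||= @loader.call
  end

  def parse_department
    data.fetch(:department)
  end

  def parse_super_saver
    data.fetch(:super_saver)
  end
end

# Stand-in loader that counts how often it is invoked (invented sample data).
calls = 0
loader = lambda do
  calls += 1
  { department: { "Dogs" => 12 }, super_saver: { "Eligible" => 3 } }
end

parser = LazyParser.new(loader)
parser.parse_department
parser.parse_super_saver
puts calls
```

Both helpers go through `data`, so the expensive parse happens once no matter how many sections are extracted.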

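If you don't actually need @department_hash and @super_saver_hash as instance variables, you can convert them to the tap style I used in parse_details.

If you don't actually need this to be a class at all, just a collection of methods, then consider this: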
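
To make the tap style concrete, here is a minimal sketch of what the `{}.tap` plus `inject(..., :update)` combination in parse_details does. The two sample hashes are invented for illustration and only mimic the shape the parse methods return.

```ruby
# Invented sample data shaped like the results of the two parse methods.
department  = { department: { "Pet Supplies" => { "Dogs" => 1200 } } }
super_saver = { super_saver: { "Free Shipping" => 345 } }

# Style used in parse_details: tap yields the empty hash to the block and
# returns that same hash; inject(hash, :update) merges each partial hash
# into it in place.
merged = {}.tap do |merged_hashes|
  [department, super_saver].inject(merged_hashes, :update)
end

# Same result without tap or in-place mutation.
merged_alt = [department, super_saver].reduce({}, :merge)

puts merged == merged_alt
```

The `reduce({}, :merge)` form is arguably easier to read when nothing else needs to happen inside the block; which one to use is a matter of taste.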
module Parser
  def self.parse_details(url)
    data = Nokogiri.HTML(open(url))
    {}.tap do |merged_hashes|
      array_of_hashes = [
        parse_department(data),
        parse_super_saver(data),
      ]
      array_of_hashes.inject(merged_hashes,:update)
    end
  end

  def self.parse_department(data)
    {}.tap do |department_hash|
      data.css('#ref_2619534011 li').drop(1).each do | department |
        department_title = department.css('.refinementLink').text
        department_count = department.css('.narrowValue').text[/[\d,]+/].delete(",").to_i
        department_hash[:department] ||= {}
        department_hash[:department]["Pet Supplies"] ||= {}
        department_hash[:department]["Pet Supplies"][department_title] = department_count
      end
    end
  end 

  def self.parse_super_saver(data)    
    {}.tap do |super_saver_hash|
      data.css('#ref_2661623011 li').each do |super_saver|
        super_saver_title = super_saver.css('.refinementLink').text
        super_saver_count = super_saver.css('.narrowValue').text[/[\d,]+/].delete(",").to_i
        super_saver_hash[:super_saver] ||= {}
        super_saver_hash[:super_saver][super_saver_title] = super_saver_count
      end 
    end
  end
end
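
One detail worth noting in the helpers above: the count is extracted with `.text[/[\d,]+/].delete(",").to_i`. If a list item has no matching text, the regex slice returns nil and the `delete` call raises NoMethodError. A small guard could look like the sketch below; `extract_count` is a hypothetical helper name, not something from the original code.

```ruby
# Hypothetical helper: returns the first run of digits and commas as an
# Integer, or nil when the text contains no such run.
def extract_count(text)
  digits = text[/[\d,]+/]
  digits && digits.delete(",").to_i
end

p extract_count("(1,234)")
p extract_count("no count")
```

Callers can then decide whether a nil count should be skipped or stored as 0.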
Score: 1

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/18348760
