Q: URL crawler skips URLs

Stack Overflow user

Asked on 2016-04-02 04:53:02

1 answer, 71 views, 0 followers, 0 votes

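I am writing a program that scans for vulnerable websites. I happen to know of several sites that have a vulnerability and return a SQL syntax error. However, when I run the program it skips those sites: it neither reports that they were found nor saves them to the file. The program is being used for security testing, and all site owners have been made aware of the vulnerability.

Source: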

def get_urls
  info("Searching for possible SQL vulnerable sites.")
  @agent = Mechanize.new
  page = @agent.get('http://www.google.com/')
  google_form = page.form('f')
  google_form.q = "#{SEARCH}"
  url = @agent.submit(google_form, google_form.buttons.first)
  url.links.each do |link|
    if link.href.to_s =~ /url.q/
      str = link.href.to_s
      str_list = str.split(%r{=|&})
      urls = str_list[1]
      next if str_list[1].split('/')[2] == "webcache.googleusercontent.com"
      urls_to_log = urls.gsub("%3F", '?').gsub("%3D", '=')
      success("Site found: #{urls_to_log}")
      File.open("#{PATH}/temp/SQL_sites_to_check.txt", "a+") {|s| s.puts("#{urls_to_log}'")}
    end
  end
  info("Possible vulnerable sites dumped into #{PATH}/temp/SQL_sites_to_check.txt")
end

def check_if_vulnerable
  info("Checking if sites are vulnerable.")
  IO.read("#{PATH}/temp/SQL_sites_to_check.txt").each_line do |parse|
    begin
      Timeout::timeout(5) do
        parsing = Nokogiri::HTML(RestClient.get("#{parse.chomp}")) 
      end
    rescue Timeout::Error, RestClient::ResourceNotFound, RestClient::SSLCertificateNotVerified, Errno::ECONNABORTED, Mechanize::ResponseCodeError, RestClient::InternalServerError => e
      if e
        warn("URL: #{parse.chomp} failed with error: [#{e}] dumped to non_exploitable.txt")
        File.open("#{PATH}/lib/non_exploitable.txt", "a+"){|s| s.puts(parse)}
      else 
        success("SQL syntax error discovered in URL: #{parse.chomp} dumped to SQL_VULN.txt")
        File.open("#{PATH}/lib/SQL_VULN.txt", "a+"){|vuln| vuln.puts(parse)}
      end
    end
  end
end

Example run:

[22:49:29 INFO]Checking if sites are vulnerable.
[22:49:53 WARNING]URL: http://www.police.bd/content.php?id=275' failed with error: [execution expired] dumped to non_exploitable.txt

File containing the URLs:

http://www.bible.com/subcat.php?id=2'
http://www.cidko.com/pro_con.php?id=3'
http://www.slavsandtat.com/about.php?id=25'
http://www.police.bd/content.php?id=275'
http://www.icdcprage.org/index.php?id=10'
http://huawei.com/en/plugin.php?id=hwdownload'
https://huawei.com/en/plugin.php?id=unlock'
https://facebook.com/profile.php?id'
http://www.footballclub.com.au/index.php?id=43'
http://www.mesrs.qc.ca/index.php?id=1525'

As you can see, the program skips the first three URLs and goes straight to the fourth. Why?

What am I doing wrong that causes this?

ruby

nokogiri

rest-client

1 Answer

Stack Overflow user

Accepted answer

Posted on 2016-04-02 08:03:15

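I am not sure the rescue block should be where it is. You do nothing with what is fetched in parsing = Nokogiri::HTML(RestClient.get("#{parse.chomp}")), and for the first three URLs the request probably just works, so no exception is raised and nothing is printed. Add some output after that line to see whether they are being fetched.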
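To illustrate the point, here is a minimal sketch, not the asker's actual fix, that moves the success decision out of the rescue clause. In the question's code the success branch sits in the else of `if e` inside rescue, where `e` is always set, so it can never run. The method name `classify_url`, the `fetcher` callable, the stubbed pages, and the assumption that a vulnerable page contains MySQL's "You have an error in your SQL syntax" text are all hypothetical and chosen only so the example runs without network access.

```ruby
require "timeout"

# Assumed marker: the text MySQL typically includes in a syntax error message.
SQL_ERROR_PATTERN = /You have an error in your SQL syntax/i

# Classify one URL as :vulnerable, :clean, or :failed.
# `fetcher` is a hypothetical callable returning the response body as a String;
# in the question's program it could be ->(u) { RestClient.get(u).body }.
def classify_url(url, fetcher, timeout_seconds: 5)
  body = Timeout.timeout(timeout_seconds) { fetcher.call(url) }
  # Output right after the fetch, as the answer suggests, to confirm it ran.
  puts "Fetched #{url} (#{body.to_s.length} characters)"
  # No exception was raised, so the decision is made here, not in `rescue`.
  body.to_s.match?(SQL_ERROR_PATTERN) ? :vulnerable : :clean
rescue StandardError => e
  # Timeouts and fetch errors land here; `e` is always set in this branch.
  puts "URL: #{url} failed with error: [#{e.message}]"
  :failed
end

# Stubbed responses stand in for real HTTP requests.
pages = {
  "http://example.com/a.php?id=1'" => "You have an error in your SQL syntax; check the manual",
  "http://example.com/b.php?id=2'" => "<html>All good</html>"
}
fetcher = ->(url) { pages.fetch(url) }

pages.each_key { |url| puts "#{url} => #{classify_url(url, fetcher)}" }
puts classify_url("http://example.com/missing'", fetcher)
```

With this shape, a URL that is fetched without an exception always produces a line of output, so none can be silently skipped. Whether a given site returns the error text in a normal response or alongside an HTTP error status is something to verify per site.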

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/36369336
