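I am stripping HTML from content with the well-known ActionView::Base.full_sanitizer.sanitize(value) method. It works well, except that when the value passed to the method is wrapped in <![CDATA[ and ]]>, the returned value is empty. How can I prevent this method from reacting to CDATA?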
What I tried was putting this in application.rb:
    config.action_view.sanitized_allowed_tags = ["![CDATA[", "]]"]
but it doesn't work.
Posted on 2015-09-08 21:16:42
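That doesn't work because CDATA is not a tag; it is a distinct markup construct (a CDATA section), and it normally belongs in XML documents rather than HTML ones. If you dig deep enough, you will find that Rails::Html::FullSanitizer uses Loofah under the hood. Specifically, it delegates to Loofah's fragment method, which parses the passed string as an HTML document fragment and ignores any CDATA sections along the way.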
    # === Rails::Html::FullSanitizer
    # Removes all tags but strips out scripts, forms and comments.
    #
    # full_sanitizer = Rails::Html::FullSanitizer.new
    # full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
    # # => Bold no more! See more here...
    class FullSanitizer < Sanitizer
      def sanitize(html, options = {})
        return unless html
        return html if html.empty?

        Loofah.fragment(html).tap do |fragment|
          remove_xpaths(fragment, XPATHS_TO_REMOVE)
        end.text(options)
      end
    end

So the solution is to use Loofah directly, like this:
    text = "<div>in div</div> just text <![CDATA[ in cdata ]]> <script>alert(1);</script> <form>some form</form> <!-- some comments also -->"
    # => "<div>in div</div> just text <![CDATA[ in cdata ]]> <script>alert(1);</script> <form>some form</form> <!-- some comments also -->"
    Loofah.scrub_xml_fragment(text, :prune).text
    # => "in div just text in cdata some form "

The result of this code differs slightly from what FullSanitizer produces, because the latter also removes all <form> elements, whereas my code does not. If that matters to you, you can combine this code with the remove_xpaths call shown above (see link).
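If you would rather keep calling full_sanitizer, one other possibility (a different technique from the Loofah approach in this answer) is to unwrap the CDATA markers before sanitizing, so the HTML parser sees ordinary content. Below is a minimal sketch using only plain Ruby. The helper name unwrap_cdata and the constant CDATA_SECTION are hypothetical, and the regex assumes the sections are not nested, which XML does not allow anyway.

```ruby
# Hypothetical helper, not part of Rails or Loofah.
# Matches one CDATA section; /m lets the content span multiple lines,
# and the lazy .*? stops at the first closing "]]>".
CDATA_SECTION = /<!\[CDATA\[(.*?)\]\]>/m

# Replaces every <![CDATA[ ... ]]> section with its inner text.
def unwrap_cdata(value)
  return value if value.nil? || value.empty?

  value.gsub(CDATA_SECTION) { Regexp.last_match(1) }
end

text = "just text <![CDATA[ in cdata ]]> <b>bold</b>"
puts unwrap_cdata(text)

# In a Rails app the unwrapped string would then go through the usual call:
#   ActionView::Base.full_sanitizer.sanitize(unwrap_cdata(value))
```

Note that any markup that was inside the CDATA section is treated as regular HTML after unwrapping, so the sanitizer will strip tags found there as well.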
https://stackoverflow.com/questions/29364552