I need a Java regex to extract the image src from the script tag in the code below. Please help me. Thanks.
<script language="javascript"><!--
document.write('<a href="javascript:popupWindow(\'https://www.kitchenniche.ca/prepara-adjustable-oil-pourer-pi-5597.html?invis=0\')">
<img src="images/imagecache/prepara-adjustable-oil-pourer-1.jpg" border="0" alt="Prepara Adjustable Oil Pourer" title=" Prepara Adjustable Oil Pourer " width="170" height="175" hspace="5" vspace="5">
<br>
</a>');
--></script>

Posted on 2017-05-31 04:44:23
Try this:
String mydata = "<script language='javascript'><!--document.write('<a href='javascript:popupWindow"
+ "(\'https://www.kitchenniche.ca/prepara-adjustable-oil-pourer-pi-5597.html?invis=0\')'><img "
+ "src='images/imagecache/prepara-adjustable-oil-pourer-1.jpg' border='0' alt='Prepara Adjustable Oil Pourer' "
+ "title=' Prepara Adjustable Oil Pourer ' width='170' height='175' hspace='5' vspace='5'><br></a>');</script>";
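// Needs java.util.regex.Pattern and java.util.regex.Matcher.
// Note: mydata uses single quotes around attribute values, unlike the original markup.
// Lazily capture everything between src=' and the next single quote.
Pattern pattern = Pattern.compile("src='(.*?)'");
Matcher matcher = pattern.matcher(mydata);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}

Posted on 2017-05-31 05:03:34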
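This regex finds the value of the src attribute only when src comes immediately after <img. If src is not the first attribute of the img tag, you need a more complex regex.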
public static void main(String[] args) {
String s = "<script language=\"javascript\"><!--\r\n"
+ " document.write('<a href=\"javascript:popupWindow(\\'https://www.kitchenniche.ca/prepara-adjustable-oil-pourer-pi-5597.html?invis=0\\')\">\r\n"
+ "<img src=\"images/imagecache/prepara-adjustable-oil-pourer-1.jpg\" border=\"0\" alt=\"Prepara Adjustable Oil Pourer\" title=\" Prepara Adjustable Oil Pourer \" width=\"170\" height=\"175\" hspace=\"5\" vspace=\"5\">\r\n"
+ "<br>\r\n" + "</a>');\r\n" + "--></script>";
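// Capture everything after <img src=" up to, but not including, the next double quote.
Pattern pattern = Pattern.compile("<img src=\"([^\"]+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    String group = matcher.group(1);
    System.out.println(group);
}
}

([^\"]+) matches one or more characters other than " and captures them in group 1. In a Java string literal, the " has to be escaped as \".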
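
For the case mentioned in this answer, where src is not the first attribute of the img tag, here is a rough sketch of what a "more complex regex" could look like. It is not from either answer: the class name ImgSrcExtractor, the method extractImgSrc, and the shortened sample markup are made up for illustration. It assumes attribute values are quoted with either " or ' and that no attribute value contains a > character; for arbitrary HTML, a real HTML parser is usually the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImgSrcExtractor {
    // <img, then lazily skip other attributes (anything except '>'),
    // then whitespace + src = opening quote, the value, and the same quote again (\1).
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the src value of every img tag found in the input (hypothetical helper).
    static List<String> extractImgSrc(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            result.add(m.group(2)); // group 1 is the quote character, group 2 the value
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened sample: src is the third attribute in the first tag,
        // and the second tag uses single quotes.
        String html = "<a href=\"x.html\"><img border=\"0\" alt=\"Oil Pourer\" "
                + "src=\"images/imagecache/prepara-adjustable-oil-pourer-1.jpg\" width=\"170\">"
                + "<img src='images/other.jpg'></a>";
        System.out.println(extractImgSrc(html));
    }
}
```

Requiring whitespace before src (rather than a word boundary) keeps attributes such as data-src from matching, and the backreference \1 makes the closing quote match whichever quote opened the value.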
https://stackoverflow.com/questions/44275731