When I load this page: https://www.fangraphs.com/leaders/splits-leaderboards?splitArr=5&strgroup=season&statgroup=1&startDate=2018-03-01&endDate=2018-11-01&filter=IP%7Cgt%7C0&position=P&statType=player&autoPt=true&players=&pg=0&pageItems=30&sort=22,1&splitArrPitch=&splitTeams=false the dynamic content is not returned in the resulting HtmlPage object.
The "react-drop-test" div is empty. I am trying to find the anchor with the "Export Data" text so that I can click it and get the content as a stream.
Any ideas on how to get the HtmlPage to contain the dynamic content?
Here is a sample of what I currently have. The anchor lookup never returns any elements.
webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getCookieManager().setCookiesEnabled(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.setJavaScriptTimeout(jsTimeout);
updateJSErrorListener(webClient);
int thisYear = year;
if (isEarlySeason()) {
    thisYear = year - 1;
}
String leftyURL = "https://www.fangraphs.com/leaderssplits.aspx?splitArr=5&strgroup=season&statgroup=1&startDate=" + thisYear + "-03-01&endDate=" + year + "-11-01&filter=IP%7Cgt%7C0&position=P&statType=player&autoPt=true&players=&pg=0&pageItems=30&sort=22,1";
HtmlPage page = webClient.getPage(leftyURL);
HtmlAnchor leftyAnchor = null;
HtmlDivision div = (HtmlDivision) page.getElementById("react-drop-test");
List<HtmlElement> anchors = div.getElementsByTagName("a");
for (DomElement anchor : anchors) {
    if (anchor.getAttribute("class").contains("data-export")) {
        leftyAnchor = (HtmlAnchor) anchor;
        break;
    }
}
Page p = leftyAnchor.click();
InputStream is = p.getWebResponse().getContentAsStream();
List<List<String>> leftyCSV = readCSVFile(is);
Posted on 2018-10-29 20:15:37
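Yet another web page full of strange JavaScript. Let me start with some general hints:
Finally: you need a newer version of HtmlUnit to get this working, because the JavaScript engine was missing a feature that the JavaScript code on this page uses.
To get a new (snapshot) version, you can use the following options:
With the latest code base, this will do the job for you: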
String url = "https://www.fangraphs.com/leaders/splits-leaderboards?splitArr=5&strgroup=season&statgroup=1&startDate=2018-03-01&endDate=2018-11-01&filter=IP%7Cgt%7C0&position=P&statType=player&autoPt=true&players=&pg=0&pageItems=30&sort=22,1&splitArrPitch=&splitTeams=false";
try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_60)) {
    // The page's scripts raise errors; do not abort on them
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    HtmlPage page = webClient.getPage(url);
    // Give background JavaScript up to 50 seconds to finish building the page
    webClient.waitForBackgroundJavaScript(50000);
    System.out.println("----------------");
    System.out.println(page.asText());
    // Look for the export link inside the React-rendered container
    HtmlDivision div = (HtmlDivision) page.getElementById("react-drop-test");
    List<HtmlElement> anchors = div.getElementsByTagName("a");
    for (DomElement anchor : anchors) {
        if (anchor.getAttribute("class").contains("data-export")) {
            HtmlAnchor leftyAnchor = (HtmlAnchor) anchor;
            // Clicking the link returns the exported data as a new Page
            Page p = leftyAnchor.click();
            System.out.println();
            System.out.println("----------------");
            System.out.println(p.getWebResponse().getContentAsString());
            break;
        }
    }
}
https://stackoverflow.com/questions/53037782
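A side note on the waitForBackgroundJavaScript(50000) call in the code above: it waits for background jobs to finish, not for the export anchor itself. One possible alternative is to poll until the element shows up. The following is only a minimal sketch of such a polling helper. PollUntil and pollUntilPresent are hypothetical names, not part of HtmlUnit, and the probe is a plain Supplier so the sketch runs without HtmlUnit on the classpath.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper (not part of HtmlUnit): re-runs a probe until it yields a value. */
public class PollUntil {

    /**
     * Calls the probe until it returns a non-empty Optional or the timeout elapses.
     * Returns Optional.empty() on timeout or if the waiting thread is interrupted.
     */
    public static <T> Optional<T> pollUntilPresent(Supplier<Optional<T>> probe,
                                                   long timeoutMillis,
                                                   long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Optional<T> result = probe.get();
            if (result.isPresent()) {
                return result;
            }
            if (System.nanoTime() >= deadline) {
                return Optional.empty();
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in probe for illustration: "finds" its value on the third attempt
        int[] attempts = {0};
        Supplier<Optional<String>> probe =
                () -> ++attempts[0] >= 3 ? Optional.of("data-export") : Optional.<String>empty();
        Optional<String> found = pollUntilPresent(probe, 2_000, 10);
        System.out.println(found.orElse("not found") + " after " + attempts[0] + " attempts");
    }
}
```

With HtmlUnit, the probe would presumably be a lambda that searches the "react-drop-test" div for an anchor whose class contains "data-export" and returns it wrapped in an Optional. Whether this beats a single fixed wait depends on how the page loads, so treat it as an option to try, not a guaranteed fix.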