Closed
Description
Hi all, I am trying to scrape a page that contains tracking information.
When I access the HTML content after scraping, part of the row data is not in the expected string format.
### Error: As you can see below, one of the table cells is not in text/string format when scraped, but the same request works fine when the scraped URL is opened in a browser.
2021-07-07 15:32:00 **[object ProgressEvent]** LHR TG0910A

Sample code snippet:
WebClient webClient = new WebClient(BrowserVersion.CHROME);
// Note: these two waits run before any page has been loaded.
webClient.waitForBackgroundJavaScript(Integer.MAX_VALUE);
webClient.waitForBackgroundJavaScriptStartingBefore(10000);
// Client options: JavaScript on, CSS off, no exceptions on HTTP or script errors.
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setPrintContentOnFailingStatusCode(false);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
HtmlPage page = webClient.getPage(urlToScrap);
// Give the page's background JavaScript time to fill in the table.
TimeUnit.SECONDS.sleep(10);
String fullpath = String.format("//*[@id=\"pp_collapse_%s_control\"]/div/div[3]/div[2]/table", trackNumber);
HtmlTable table = (HtmlTable) page.getByXPath(fullpath).get(0);
for (final HtmlTableRow row : table.getRows()) {
    System.out.println("Found row");
    for (final HtmlTableCell cell : row.getCells()) {
        System.out.println(" Found cell: " + cell.asNormalizedText());
    }
}
Result:
Found row
Found cell:
Found cell: 2021-07-07 15:32:00
Found cell: [object ProgressEvent]
Found cell: LHR
Found cell: TG0910
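
For anyone trying to narrow this down: text of the form `[object Something]` is the default string form JavaScript gives an object, so the cell probably received an event object from the page's own script rather than the expected text. Below is a minimal, plain-Java sketch for flagging such cells in the scraped output. The class and method names (`StringifiedObjectCheck`, `looksLikeStringifiedJsObject`, `suspiciousCellIndexes`) are made up for illustration and are not part of HtmlUnit. The input is assumed to be the list of strings returned by `cell.asNormalizedText()` for one row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class StringifiedObjectCheck {

    // Matches the default JavaScript string form of an object,
    // e.g. "[object ProgressEvent]" or "[object Object]".
    private static final Pattern JS_OBJECT_TEXT =
            Pattern.compile("^\\[object [A-Za-z_$][\\w$]*\\]$");

    // Returns true when the whole cell text is a stringified JavaScript object.
    static boolean looksLikeStringifiedJsObject(String cellText) {
        return cellText != null && JS_OBJECT_TEXT.matcher(cellText.trim()).matches();
    }

    // Returns the zero-based indexes of the cells that look suspicious.
    static List<Integer> suspiciousCellIndexes(List<String> cellTexts) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < cellTexts.size(); i++) {
            if (looksLikeStringifiedJsObject(cellTexts.get(i))) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Cell texts copied from the result shown in this issue.
        List<String> row = Arrays.asList(
                "", "2021-07-07 15:32:00", "[object ProgressEvent]", "LHR", "TG0910");
        System.out.println("Suspicious cell indexes: " + suspiciousCellIndexes(row));
    }
}
```

This only detects the symptom. It does not explain why the page script ends up with an event object in that cell when run under HtmlUnit.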