Description
Note: This information is duplicated on https://stackoverflow.com/questions/69849910/navigating-to-javascript-form-using-htmlunit. Feel free to request any additional information there.
Overview
I have successfully been using HtmlUnit to navigate BoardGameGeek and execute tasks (e.g. send GeekMail). Recently they changed their login from a normal webpage to a javascript-generated form, and now I just can't seem to access the login form using HtmlUnit no matter what I try, including:
- Adding a WebWindowListener.
- Waiting for javascript to complete with webClient.waitForBackgroundJavaScript(JAVASCRIPT_PAUSE) or webClient.waitForBackgroundJavaScriptStartingBefore(JAVASCRIPT_PAUSE).
- Listing all windows (0) and forms (1).
- Printing XML for the current HtmlPage and all HtmlForms.
Of course I want a general understanding beyond this specific webpage / javascript code, but I give this specific concrete example since the other StackOverflow questions I looked at didn't help in this particular situation (so maybe there is something unique here?).
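To make the "waiting for javascript" attempt above more concrete, here is a minimal sketch of an explicit poll-with-timeout, as an alternative to a single fixed pause. It uses only the JDK. The class name Poller and the method waitUntil are hypothetical helpers, not part of HtmlUnit, and whether polling would actually make the login form appear on this page is untested.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of HtmlUnit): polls a condition until it holds or a timeout expires. */
final class Poller {
    private Poller() {}

    /**
     * Checks the condition immediately, then every intervalMillis,
     * until it returns true or timeoutMillis has elapsed.
     * Returns true if the condition held, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(Math.max(1L, intervalMillis));
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        final long start = System.currentTimeMillis();
        // Stand-in condition; a real check would inspect the loaded page instead.
        boolean met = waitUntil(() -> System.currentTimeMillis() - start >= 50, 1000, 10);
        System.out.println("condition met: " + met);
    }
}
```

With HtmlUnit, the condition could be something like `() -> !currentPage.getByXPath("//input[@type='password']").isEmpty()`, where the XPath is only a guess at how the sign-in form is marked up.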
Steps to Reproduce
- Use your browser to navigate to https://boardgamegeek.com/geekmail/compose?touser=FakeUserName (any name will suffice).
- As long as you are not logged in to BoardGameGeek, you will then see a pop-up window titled "Sign in" with text inputs "Username" and "Password".
- If you View Page Source, you will see that the form is generated by javascript (https://cf.geekdo-static.com/frontend/main-es2015.aeff0e4f13bcecc7eb55.js or https://cf.geekdo-static.com/frontend/main-es5.aeff0e4f13bcecc7eb55.js). As far as I can tell, the created form has no name / id that I can use to access it. Even if it did, I do not seem to be able to view it within HtmlUnit (i.e. "Sign up" never appears when I print XML, whether for HtmlPage or HtmlForm).
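Since I am searching the printed XML by eye, a small diagnostic like the following could make the check repeatable. It is only a sketch using the JDK: MarkerScan and scan are hypothetical names, the markup in main is a stand-in for the output of currentPage.asXml(), and the marker strings are guesses at text the sign-in form might contain.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical diagnostic: reports which marker strings occur in a page's serialized markup. */
final class MarkerScan {
    private MarkerScan() {}

    /** Returns, for each marker in order, whether it occurs in the markup (case-insensitive). */
    static Map<String, Boolean> scan(String markup, String... markers) {
        final String haystack = markup.toLowerCase(Locale.ROOT);
        final Map<String, Boolean> found = new LinkedHashMap<>();
        for (String marker : markers) {
            found.put(marker, haystack.contains(marker.toLowerCase(Locale.ROOT)));
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in markup; in the real program this would be currentPage.asXml().
        final String markup = "<form><input type=\"text\" name=\"q\"/></form>";
        scan(markup, "Sign in", "Username", "type=\"password\"")
            .forEach((marker, present) -> System.out.println(marker + " -> " + present));
    }
}
```

If every marker comes back false even after waiting, that would suggest the form was never added to the DOM that HtmlUnit sees, rather than being present but hard to address.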
Existing code
Here is my current version of the code with multiple attempts made to diagnose the problem / extract some useful information:
import com.gargoylesoftware.htmlunit.WebClient;
...
import java.util.LinkedList;

public class GeekMailSender {
    ...
    // Static variable to track all website windows.
    private static final LinkedList<WebWindow> websiteWindows = new LinkedList<>();

    // Inner-class to listen for new (i.e. pop-up) windows.
    static class GeekMailWindowListener implements WebWindowListener {
        @Override
        public void webWindowClosed(WebWindowEvent event) {}

        @Override
        public void webWindowContentChanged(WebWindowEvent event) {}

        @Override
        public void webWindowOpened(WebWindowEvent event) {
            GeekMailSender.websiteWindows.add(event.getWebWindow());
        }
    }

    // Method to actually send GeekMail by navigating the BGG website.
    public static void sendGeekMail(...) {
        ...
        try (final WebClient webClient = new WebClient()) {
            // Track creation of new (i.e. pop-up) windows.
            websiteWindows.clear();
            webClient.addWebWindowListener(new GeekMailWindowListener());

            // Try to access the GeekMail page.
            HtmlPage currentPage = webClient.getPage("https://boardgamegeek.com/geekmail/compose?touser=FakeUserName");
            String pageTitle = currentPage.getTitleText();
            System.out.println(pageTitle); // BoardGameGeek

            // We may need to login first.
            if (!pageTitle.contains("GeekMail")) {
                // Need to wait for javascript to complete, otherwise no forms are available.
                webClient.waitForBackgroundJavaScriptStartingBefore(JAVASCRIPT_PAUSE);
                // No difference if use webClient.waitForBackgroundJavaScript(JAVASCRIPT_PAUSE);

                // Unfortunately the only form found is the top-right Search form on BoardGameGeek.
                if (currentPage.getForms().isEmpty()) {
                    // This does NOT happen.
                    System.out.println("WARNING! No form found, even after waiting for javascript!");
                    return;
                }

                // We don't find any windows at all... this confuses me.
                if (websiteWindows.isEmpty()) {
                    // This does happen :(
                    System.out.println("WARNING! No windows found even after waiting for javascript!");
                }

                // Additional printing does not reveal where the form is.
                // For instance, searching the XML for "Sign up" yields no results.
                System.out.println(currentPage.asXml());

                // And printing the one form we can access reveals it is just the Search form.
                System.out.println(currentPage.getForms().size()); // 1
                final HtmlForm loginForm = currentPage.getForms().get(0);
                System.out.println(loginForm.asXml());
                ...
            }
            ...
        }
        ...
    }
}
References
In trying to solve this, I have checked the following references (among many others):
- https://htmlunit.sourceforge.io/gettingStarted.html
- https://stackoverflow.com/questions/41117026/htmlunit-cant-find-forms-on-website
- https://stackoverflow.com/questions/54528410/locating-a-pop-up-window-with-htmlunit
- https://sourceforge.net/p/htmlunit/mailman/message/20356348/
- https://htmlunit.sourceforge.io/apidocs/com/gargoylesoftware/htmlunit/WebWindowListener.html
Nevertheless, I seem unable to locate the desired form. Any help would be much appreciated!