[🐛 Bug]: Log File Process Is Not Closed in Firefox Driver Binary #11730

@acbilson

What happened?

I am running a Selenium instance with a Firefox driver to scrape and download many files from a website. This runs as a Python Flask web service inside a Docker container.

I discovered that my container would scrape a few pages before it began to hit its memory limit and needed a restart. I used Python's default profiler to investigate where memory allocation was growing and found that the allocation tied to the log file handler kept growing with each execution. This was especially surprising given that I was pointing the logging service to /dev/null.
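For anyone who wants to check for this kind of growth themselves, here is a minimal sketch using the standard-library `tracemalloc` module. It is not the profiler session from this report (that output is gone). The function `open_without_closing` is a hypothetical stand-in for one scrape run that opens a handle to `os.devnull` and never closes it, and `top_growth` is likewise a name invented for this illustration.

```python
import os
import tracemalloc


def open_without_closing(store):
    # Hypothetical stand-in for one scrape run: opens a handle to
    # os.devnull and keeps it alive without ever closing it.
    store.append(open(os.devnull, "wb"))


def top_growth(runs=200, limit=3):
    """Return the largest allocation changes seen across `runs` iterations."""
    leaked = []
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(runs):
        open_without_closing(leaked)
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()

    # Differences are grouped by source line and sorted largest first.
    stats = after.compare_to(before, "lineno")[:limit]

    # Clean up the handles that were leaked on purpose for the measurement.
    for handle in leaked:
        handle.close()
    return stats


if __name__ == "__main__":
    for stat in top_growth():
        print(stat)
```

Each printed entry names a file and line number together with the size change, which is how a line that opens a file on every run would stand out.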

I was able to resolve this in my code by manually closing the file handler prior to calling driver.quit(). I think it might be best if driver.quit() handled closing this handler internally.
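As a rough illustration of the behavior I am asking for, here is a sketch of a helper that quits the driver and then closes the log file in one call. `quit_and_close_log` is a hypothetical name, not part of Selenium, and it relies on the private `binary._log_file` attribute that my workaround uses, so it may not match other Selenium versions.

```python
def quit_and_close_log(driver):
    """Quit `driver`, then close the log file held by its binary, if any.

    Assumes the driver exposes `binary._log_file`, a private attribute
    that may be absent or renamed in other Selenium versions.
    """
    binary = getattr(driver, "binary", None)
    log_file = getattr(binary, "_log_file", None)
    try:
        driver.quit()
    finally:
        # Close the handle even if quit() raised, and skip the step
        # when the attribute is missing or the file is already closed.
        if log_file is not None and not log_file.closed:
            log_file.close()
```

Looking the attribute up with `getattr` keeps the helper from failing when no binary was configured on the driver.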

How can we reproduce the issue?

# This was a wrapper I created to ensure that the log file handler closes when
# I am finished with the driver. If you remove the line that closes the handler
# and run this driver instance against a site multiple times, you'll observe
# that the handler eats up more and more space. If you don't have a lot of
# memory, you may also observe that each execution gets slower.

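import os

from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service


class DriverManager: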
    """wraps a selenium driver instance
    attrs:
        download_path (str): the folder location that driver downloads will be placed inside
        firefox_exe_path (str): the path to the Firefox executable
        gecko_driver_exe_path (str): the path to the Gecko Driver executable
    """

    def __init__(
        self,
        download_path: str,
        firefox_exe_path: str,
        gecko_driver_exe_path: str,
    ):
        self.download_path = download_path

        # Setup the firefox webdriver
        service = Service(executable_path=gecko_driver_exe_path, log_path=os.devnull)

        options = Options()
        options.headless = True
        options.binary = firefox_exe_path
        options.set_preference("browser.download.folderList", 2)
        options.set_preference("browser.download.manager.showWhenStarting", False)
        options.set_preference("browser.download.dir", download_path)
        options.set_preference("download.prompt_for_download", False)
        options.set_preference(
            "browser.helperApps.neverAsk.saveToDisk", "application/pdf"
        )
        options.set_preference("pdfjs.disabled", True)
        options.set_capability("marionette", True)

        self.driver = Firefox(options=options, service=service)

    def __enter__(self) -> Firefox:
        return self.driver

    def __exit__(self, exception_type, exception_val, trace):
        # Close the log file handler manually to work around the memory leak,
        # and quit the driver even if closing the handler raises.
        try:
            self.driver.binary._log_file.close()
        finally:
            self.driver.quit()

Relevant log output

I wish I'd kept the profiler output, but I don't have it anymore.

Operating System

Debian Buster

Selenium version

Python 4.1.3

What are the browser(s) and version(s) where you see this issue?

Firefox 102.0.1

What are the browser driver(s) and version(s) where you see this issue?

GeckoDriver v0.31.0

Are you using Selenium Grid?

No response

Labels

A-needs-triaging (A Selenium member will evaluate this soon!), C-py (Python Bindings), I-defect (Something is not working as intended)
