WebDriver BiDi

1. Introduction

This section is non-normative.

WebDriver defines a protocol for introspection and remote control of user agents. This specification extends WebDriver by introducing bidirectional communication. In place of the strict command/response format of WebDriver, this permits events to stream from the user agent to the controlling software, better matching the evented nature of the browser DOM.

2. Infrastructure

This specification depends on the Infra Standard. [INFRA]

Network protocol messages are defined using CDDL. [RFC8610]

This specification defines a wait queue which is a map.

Surely there’s a better mechanism for doing this "wait for an event" thing.

When an algorithm algorithm running in parallel awaits a set of events events, and resume id:

Pause the execution of algorithm.
Assert: wait queue does not contain resume id.
Set wait queue[resume id] to (events, algorithm).

To resume given name, id and parameters:

If wait queue does not contain id, return.
Let (events, algorithm) be wait queue[id].
For each event in events:
1. If event equals name:
  1. Remove id from wait queue.
  2. Resume running the steps in algorithm from the point at which they were paused, passing name and parameters as the result of the await.
    
    Should we have something like microtasks to ensure this runs before any other tasks on the event loop?
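
The await and resume steps above can be sketched as follows. This is a minimal illustration, not part of the specification: it models the paused algorithm as a plain callback, and the names `wait_queue`, `await_events` and `resume` are assumptions made for this example.

```python
from typing import Any, Callable

# Hypothetical in-memory model of the wait queue:
# resume id -> (set of awaited event names, continuation of the paused algorithm).
wait_queue: dict[str, tuple[set[str], Callable[[str, Any], None]]] = {}


def await_events(resume_id: str, events: set[str],
                 continuation: Callable[[str, Any], None]) -> None:
    """Record a paused algorithm; `continuation` stands in for the rest of its steps."""
    assert resume_id not in wait_queue
    wait_queue[resume_id] = (set(events), continuation)


def resume(name: str, resume_id: str, parameters: Any) -> None:
    """Resume the algorithm waiting under `resume_id` if it awaits event `name`."""
    if resume_id not in wait_queue:
        return
    events, continuation = wait_queue[resume_id]
    if name in events:
        # Remove the entry first so the algorithm is resumed at most once.
        del wait_queue[resume_id]
        continuation(name, parameters)
```

An event that is not in the awaited set leaves the entry in place, so the algorithm stays paused until a matching event arrives.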

A WebDriver configuration is a struct with:

item global which is a value, initially unset;
item user contexts which is a weak map between user contexts and value, initially empty;
item navigables which is a weak map between navigables and value, initially empty.

A WebDriver configuration has an associated type which is a type.

The value for a WebDriver configuration is either a value whose type is the associated type for that configuration or unset.

Unset is a value indicating that a specific configuration value has not been set.

Note: this algorithm allows accessing the WebDriver configuration for a given navigable by checking values in navigables, then in user contexts and finally in global. Returns unset if configuration is not set.

To get WebDriver configuration value of WebDriver configuration configuration for navigable navigable:

Let top-level traversable be navigable’s top-level traversable.
If configuration’s navigables contains top-level traversable:
1. Let navigable configuration value be configuration’s navigables[top-level traversable].
2. If navigable configuration value is not unset, return navigable configuration value.
Let user context be navigable’s associated user context.
If configuration’s user contexts contains user context:
1. Let user context configuration value be configuration’s user contexts[user context].
2. If user context configuration value is not unset, return user context configuration value.
Return configuration’s global.
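
The lookup order (navigables, then user contexts, then global) can be sketched as below. This is an illustrative model only: the classes `UserContext`, `Navigable` and `WebDriverConfiguration` and the `UNSET` sentinel are assumptions for the example, and plain dictionaries stand in for the weak maps.

```python
from dataclasses import dataclass, field
from typing import Any

UNSET = object()  # sentinel standing in for the "unset" value


@dataclass(eq=False)  # eq=False keeps identity-based hashing so instances can be map keys
class UserContext:
    id: str


@dataclass(eq=False)
class Navigable:
    id: str
    user_context: UserContext
    parent: "Navigable | None" = None

    @property
    def top_level_traversable(self) -> "Navigable":
        # Walk up the parent chain until the top-level traversable is reached.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


@dataclass
class WebDriverConfiguration:
    global_value: Any = UNSET
    user_contexts: dict = field(default_factory=dict)  # plain dict, not a weak map
    navigables: dict = field(default_factory=dict)     # plain dict, not a weak map


def get_configuration_value(configuration: WebDriverConfiguration,
                            navigable: Navigable) -> Any:
    top = navigable.top_level_traversable
    value = configuration.navigables.get(top, UNSET)
    if value is not UNSET:
        return value
    value = configuration.user_contexts.get(navigable.user_context, UNSET)
    if value is not UNSET:
        return value
    return configuration.global_value
```

Note that an entry explicitly set to unset falls through to the next level, exactly as a missing entry does.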

Note: this is a generic algorithm for storing WebDriver configuration per target, which can be either navigable, user context, or store it globally if the target is null or omitted.

To store WebDriver configuration configuration’s value value in optional target which is a navigable, a user context or null if not provided:

If target is null, set configuration’s global to value.
If target is a user context, set configuration’s user contexts[target] to value.
If target is a navigable, set configuration’s navigables[target] to value.

Note: This generic algorithm stores WebDriver configuration’s value in global, user contexts, or navigables, depending on the presence of "userContexts" and "contexts" in command parameters. These parameters are mutually exclusive. If neither is provided, the configuration is stored globally.

To store WebDriver configuration WebDriver configuration configuration’s value value for given command parameters:

If command parameters contains "userContexts" and command parameters contains "contexts", return error with error code invalid argument.
Let affected navigables be an empty set.
If command parameters contains "contexts":
1. Let navigables be the result of trying to get valid top-level traversables by ids with command parameters["contexts"].
2. For each navigable of navigables:
  1. Append navigable to affected navigables.
  2. Store configuration’s value in navigable.
Otherwise, if command parameters contains "userContexts":
1. Let user contexts be the result of trying to get valid user contexts with command parameters["userContexts"].
2. For each user context of user contexts:
  1. For each top-level traversable in the list of all top-level traversables whose associated user context is user context:
    1. Append top-level traversable to affected navigables.
  2. Store configuration’s value in user context.
Otherwise:
1. For each top-level traversable of all top-level traversables, append top-level traversable to affected navigables.
2. Store configuration’s value.
Return affected navigables.
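
As a rough sketch of the steps above, the following models top-level traversables and user contexts by their ids. The function name, the `traversables` and `known_user_contexts` arguments, and the error codes raised for unknown ids are assumptions for this example; the real lookups are defined by the "get valid ..." algorithms.

```python
class BiDiError(Exception):
    """Carries a WebDriver error code (illustrative helper)."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def store_for_command(config, value, params, traversables, known_user_contexts):
    """Store `value` according to `params` and return the affected traversable ids.

    config: dict with keys "global", "user_contexts" and "navigables".
    traversables: hypothetical map of top-level traversable id -> user context id.
    """
    if "userContexts" in params and "contexts" in params:
        raise BiDiError("invalid argument")
    affected = []
    if "contexts" in params:
        ids = params["contexts"]
        # Validate every id before storing anything, mirroring the "trying to" step.
        for context_id in ids:
            if context_id not in traversables:
                raise BiDiError("no such frame")  # assumed code for an unknown context
        for context_id in ids:
            if context_id not in affected:
                affected.append(context_id)
            config["navigables"][context_id] = value
    elif "userContexts" in params:
        ids = params["userContexts"]
        for user_context_id in ids:
            if user_context_id not in known_user_contexts:
                raise BiDiError("no such user context")
        for user_context_id in ids:
            for traversable_id, owner in traversables.items():
                if owner == user_context_id and traversable_id not in affected:
                    affected.append(traversable_id)
            config["user_contexts"][user_context_id] = value
    else:
        affected.extend(traversables)
        config["global"] = value
    return affected
```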

3. Protocol

This section defines the basic concepts of the WebDriver BiDi protocol. These terms are distinct from their representation at the transport layer.

The protocol is defined using a CDDL definition. For the convenience of implementers two separate CDDL definitions are defined: the remote end definition, which defines the format of messages produced on the local end and consumed on the remote end, and the local end definition, which defines the format of messages produced on the remote end and consumed on the local end.

3.1. Definition

Should this be an appendix?

This section gives the initial contents of the remote end definition and local end definition. These are augmented by the definition fragments defined in the remainder of the specification.

Remote end definition

Command = {
  id: js-uint,
  CommandData,
  Extensible,
}

CommandData = (
  BrowserCommand //
  BrowsingContextCommand //
  EmulationCommand //
  InputCommand //
  NetworkCommand //
  ScriptCommand //
  SessionCommand //
  StorageCommand //
  WebExtensionCommand
)

EmptyParams = {
   Extensible
}

Local end definition

Message = (
  CommandResponse /
  ErrorResponse /
  Event
)

CommandResponse = {
  type: "success",
  id: js-uint,
  result: ResultData,
  Extensible
}

ErrorResponse = {
  type: "error",
  id: js-uint / null,
  error: ErrorCode,
  message: text,
  ? stacktrace: text,
  Extensible
}

ResultData = (
  BrowserResult /
  BrowsingContextResult /
  EmulationResult /
  InputResult /
  NetworkResult /
  ScriptResult /
  SessionResult /
  StorageResult /
  WebExtensionResult
)

EmptyResult = {
  Extensible
}

Event = {
  type: "event",
  EventData,
  Extensible
}

EventData = (
  BrowsingContextEvent //
  InputEvent //
  LogEvent //
  NetworkEvent //
  ScriptEvent
)

An EmptyResult is a result type with no required fields, used as the return type for commands that don’t produce result data.

Remote end definition and Local end definition

Extensible = (*text => any)

js-int = -9007199254740991..9007199254740991
js-uint = 0..9007199254740991
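
To make the message envelopes concrete, the sketch below builds a command and classifies messages received on the local end by their top-level fields. It is a loose, hand-written check and not a CDDL validator; the helper names are assumptions, and the "method" and "params" fields come from the command and event definitions later in this section.

```python
import json

JS_UINT_MAX = 9007199254740991  # upper bound of js-uint, equal to 2**53 - 1


def is_js_uint(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= JS_UINT_MAX


def make_command(command_id: int, method: str, params=None) -> dict:
    """Build a map shaped like the Command production."""
    if not is_js_uint(command_id):
        raise ValueError("id must be a js-uint")
    return {"id": command_id, "method": method, "params": params or {}}


def classify_message(message: dict):
    """Return the name of the matching Message alternative, or None."""
    kind = message.get("type")
    if kind == "success":
        ok = is_js_uint(message.get("id")) and "result" in message
        return "CommandResponse" if ok else None
    if kind == "error":
        ok = ("id" in message
              and (message["id"] is None or is_js_uint(message["id"]))
              and isinstance(message.get("error"), str)
              and isinstance(message.get("message"), str))
        return "ErrorResponse" if ok else None
    if kind == "event":
        ok = isinstance(message.get("method"), str) and "params" in message
        return "Event" if ok else None
    return None
```

Because every production includes Extensible, unknown extra fields are deliberately ignored by `classify_message`.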

3.2. Session

WebDriver BiDi extends the session concept from WebDriver.

A session has a BiDi flag, which is false unless otherwise stated.

A BiDi session is a session which has the BiDi flag set to true.

The list of active BiDi sessions is given by:

Let BiDi sessions be a new list.
For each session in active sessions:
1. If session is a BiDi session, append session to BiDi sessions.
Return BiDi sessions.

3.3. Modules

The WebDriver BiDi protocol is organized into modules.

Each module represents a collection of related commands and events pertaining to a certain aspect of the user agent. For example, a module might contain functionality for inspecting and manipulating the DOM, or for script execution.

Each module has a module name which is a string. The command name and event name for commands and events defined in the module start with the module name followed by a period ".".

Modules which contain commands define remote end definition fragments. These provide choices in the CommandData group for the module’s commands, and can also define additional definition properties. They can also define local end definition fragments that provide additional choices in the ResultData group for the results of commands in the module.

Modules which contain events define local end definition fragments that are choices in the Event group for the module’s events.

An implementation may define extension modules. These must have a module name that contains a single colon ":" character. The part before the colon is the prefix; this is typically the same for all extension modules specific to a given implementation and should be unique for a given implementation.

Other specifications may define their own WebDriver-BiDi modules that extend the protocol. Such modules must not have a name which contains a colon (:) character, nor must they define command names, event names, or property names that contain that character.

Authors of external specifications are encouraged to add new modules rather than extending existing ones. Where it is desired to extend an existing module, it is preferred to integrate the extension directly into the specification containing the original module definition.

3.4. Commands

A command is an asynchronous operation, requested by the local end and run on the remote end, resulting in either a result or an error being returned to the local end. Multiple commands can run at the same time, and commands can potentially be long-running. As a consequence, commands can finish out-of-order.

Each command is defined by:

A command type which is defined by a remote end definition fragment containing a group. Each such group has two fields:
- method which is a string literal of the form [module name].[method name]. This is the command name.
- params which defines a mapping containing data to be passed into the command. The populated value of this map is the command parameters.
A result type, which is defined by a local end definition fragment.
A set of remote end steps which define the actions to take for a command given a BiDi session and command parameters and return an instance of the command result type.

A command that can run without an active session is a static command. Commands are not static commands unless stated in their definition.

When commands are sent from the local end they have a command id. This is an identifier used by the local end to identify the response from a particular command. From the point of view of the remote end this identifier is opaque and cannot be used internally to identify the command.

Note: This is because the command id is entirely controlled by the local end and isn’t necessarily unique over the course of a session. For example a local end which ignores all responses could use the same command id for each command.

The set of all command names is a set containing all the defined command names, including any belonging to extension modules.
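
Since commands can finish out of order, a local end typically correlates responses with requests through the command id. The following is one possible local-end sketch, not a requirement of this specification; the `PendingCommands` class is hypothetical and the command names are used only as examples.

```python
import itertools


class PendingCommands:
    """Tracks in-flight commands on the local end, keyed by command id."""

    def __init__(self):
        self._ids = itertools.count(1)  # this local end chooses unique, increasing ids
        self._pending = {}

    def send(self, method: str, params: dict) -> dict:
        """Return the command message to serialize and remember it as pending."""
        command_id = next(self._ids)
        self._pending[command_id] = method
        return {"id": command_id, "method": method, "params": params}

    def on_message(self, message: dict):
        """Match a success or error response to its command; ignore everything else."""
        if message.get("type") not in ("success", "error"):
            return None
        method = self._pending.pop(message.get("id"), None)
        if method is None:
            return None
        return method, message
```

Unique ids are a choice made by this sketch; as the note above explains, the remote end treats the id as opaque and does not rely on uniqueness.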

3.5. Errors

WebDriver BiDi extends the set of error codes from WebDriver with the following additional codes:

invalid web extension: Tried to install an invalid web extension.
no such client window: Tried to interact with an unknown client window.
no such handle: Tried to deserialize an unknown RemoteObjectReference.
no such history entry: Tried to navigate to an unknown session history entry.
no such network collector: Tried to remove an unknown collector.
no such intercept: Tried to remove an unknown network intercept.
no such network data: Tried to reference an unknown network data.
no such node: Tried to deserialize an unknown SharedReference.
no such request: Tried to continue an unknown request.
no such screencast: Tried to stop an unknown screencast recording.
no such script: Tried to remove an unknown preload script.
no such storage partition: Tried to access data in a non-existent storage partition.
no such user context: Tried to reference an unknown user context.
no such web extension: Tried to reference an unknown web extension.
unable to close browser: Tried to close the browser, but failed to do so.
unable to set cookie: Tried to create a cookie, but the user agent rejected it.
underspecified storage partition: Tried to interact with data in a storage partition which was not adequately specified.
unable to set file input: Tried to set a file input, but failed to do so.
unavailable network data: Tried to get network data which was not collected or already evicted.

ErrorCode = "invalid argument" /
            "invalid selector" /
            "invalid session id" /
            "invalid web extension" /
            "move target out of bounds" /
            "no such alert" /
            "no such network collector" /
            "no such element" /
            "no such frame" /
            "no such handle" /
            "no such history entry" /
            "no such intercept" /
            "no such network data" /
            "no such node" /
            "no such request" /
            "no such screencast" /
            "no such script" /
            "no such storage partition" /
            "no such user context" /
            "no such web extension" /
            "session not created" /
            "unable to capture screen" /
            "unable to close browser" /
            "unable to set cookie" /
            "unable to set file input" /
            "unavailable network data" /
            "underspecified storage partition" /
            "unknown command" /
            "unknown error" /
            "unsupported operation"

3.6. Events

An event is a notification, sent by the remote end to the local end, signaling that something of interest has occurred on the remote end.

An event type is defined by a local end definition fragment containing a group. Each such group has two fields:
- method which is a string literal of the form [module name].[event name]. This is the event name.
- params which defines a mapping containing event data. The populated value of this map is the event parameters.
A remote end event trigger which defines when the event is triggered and steps to construct the event type data.
Optionally, a set of remote end subscribe steps, which define steps to take when a local end subscribes to an event. Where defined these steps have an associated subscribe priority which is an integer controlling the order in which the steps are run when multiple events are enabled at once, with lower integers indicating steps that run earlier.

A BiDi session has subscriptions which is a list of subscriptions.

A BiDi session has a known subscription ids which is a set of all subscription ids that have been issued to the local end but which have not yet been unsubscribed.

A subscription is a struct consisting of a subscription id (a string), event names (a set of event names), top-level traversable ids (a set of IDs of top-level traversables) and user context ids (a set of IDs of user contexts).

A subscription subscription is global if subscription’s top-level traversable ids is an empty set and subscription’s user context ids is an empty set.

The set of sessions for which an event is enabled given event name and navigables is:

Let sessions be a new set.
For each session in active BiDi sessions:
1. If event is enabled with session, event name and navigables, append session to sessions.
Return sessions.

To determine if an event is enabled given session, event name and navigables:

Note: navigables is a set because a shared worker can be associated with multiple contexts.

Let top-level traversables be get top-level traversables with navigables.
For each subscription in session’s subscriptions:
1. If subscription’s event names does not contain event name, continue.
2. If subscription is global, return true.
3. If subscription’s user context ids is not empty:
  1. For each navigable in top-level traversables:
    1. If subscription’s user context ids contains navigable’s associated user context’s user context id, return true.
4. Otherwise:
  1. Let subscription top-level traversables be get navigables by ids with subscription’s top-level traversable ids.
  2. If the intersection of top-level traversables and subscription top-level traversables is not empty, return true.
Return false.
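
The subscription check above can be sketched as follows. This is a simplified model: `Traversable` and `Subscription` are hypothetical classes, the caller is assumed to pass already-resolved top-level traversables, and the event names are examples only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Traversable:
    """Hypothetical top-level traversable, reduced to the ids the check needs."""
    id: str
    user_context_id: str


@dataclass
class Subscription:
    subscription_id: str
    event_names: set
    top_level_traversable_ids: set = field(default_factory=set)
    user_context_ids: set = field(default_factory=set)

    @property
    def is_global(self) -> bool:
        return not self.top_level_traversable_ids and not self.user_context_ids


def event_is_enabled(subscriptions, event_name, top_level_traversables) -> bool:
    for subscription in subscriptions:
        if event_name not in subscription.event_names:
            continue
        if subscription.is_global:
            return True
        if subscription.user_context_ids:
            # Scoped to user contexts: any related traversable in one of them is enough.
            if any(t.user_context_id in subscription.user_context_ids
                   for t in top_level_traversables):
                return True
        elif any(t.id in subscription.top_level_traversable_ids
                 for t in top_level_traversables):
            # Scoped to traversables: a non-empty intersection is enough.
            return True
    return False
```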

The set of top-level traversables for which an event is enabled given event name and session is:

Let result be a new set.
For each subscription in session’s subscriptions:
1. If subscription’s event names does not contain event name, continue.
2. If subscription is global:
  1. For each traversable in remote end’s top-level traversables:
    1. Append traversable to result.
  2. Break.
3. Otherwise, if subscription’s user context ids is not empty:
  1. For each traversable in remote end’s top-level traversables:
    1. Append traversable to result if subscription’s user context ids contains traversable’s associated user context’s user context id.
4. Otherwise:
  1. Let top-level traversables be get navigables by ids with subscription’s top-level traversable ids.
  2. Append each item of top-level traversables to result.
Return result.

To obtain a set of event names given a name:

Let events be an empty set.
If name contains a U+002E (period):
1. If name is the event name for an event, append name to events and return success with data events.
2. Return an error with error code invalid argument.
Otherwise name is interpreted as representing all the events in a module. If name is not a module name return an error with error code invalid argument.
Append the event name for each event in the module with name name to events.
Return success with data events.
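
A minimal sketch of this name expansion is shown below. The `EVENTS_BY_MODULE` registry and its contents are illustrative assumptions; a real remote end would derive them from the modules it implements.

```python
class BiDiError(Exception):
    """Carries a WebDriver error code (illustrative helper)."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


# Hypothetical registry: module name -> event names defined by that module.
EVENTS_BY_MODULE = {
    "log": {"log.entryAdded"},
    "script": {"script.message", "script.realmCreated", "script.realmDestroyed"},
}


def obtain_event_names(name: str) -> set:
    if "." in name:
        # A name containing a period must be exactly one known event name.
        if any(name in events for events in EVENTS_BY_MODULE.values()):
            return {name}
        raise BiDiError("invalid argument")
    # Otherwise the name stands for every event in the module.
    if name not in EVENTS_BY_MODULE:
        raise BiDiError("invalid argument")
    return set(EVENTS_BY_MODULE[name])
```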

4. Transport

Message transport is provided using the WebSocket protocol. [RFC6455]

Note: In the terms of the WebSocket protocol, the local end is the client and the remote end is the server / remote host.

Note: The encoding of commands and events as messages is similar to JSON-RPC, but this specification does not normatively reference it. [JSON-RPC] The normative requirements on remote ends are instead given as a precise processing model, while no normative requirements are given for local ends.

A WebSocket listener is a network endpoint that is able to accept incoming WebSocket connections.

A WebSocket listener has a host, a port, a secure flag, and a list of WebSocket resources.

When a WebSocket listener listener is created, a remote end must start to listen for WebSocket connections on the host and port given by listener’s host and port. If listener’s secure flag is set, then connections established from listener must be TLS encrypted.

A remote end has a set of WebSocket listeners active listeners, which is initially empty.

A remote end has a set of WebSocket connections not associated with a session, which is initially empty.

A WebSocket connection is a network connection that follows the requirements of the WebSocket protocol.

A BiDi session has a set of session WebSocket connections whose elements are WebSocket connections. This is initially empty.

A BiDi session session is associated with connection connection if session’s session WebSocket connections contains connection.

Note: Each WebSocket connection is associated with at most one BiDi session.

When a client establishes a WebSocket connection connection by connecting to one of the set of active listeners listener, the implementation must proceed according to the WebSocket server-side requirements, with the following steps run when deciding whether to accept the incoming connection:

Let resource name be the resource name from reading the client’s opening handshake. If resource name is not in listener’s list of WebSocket resources, then stop running these steps and act as if the requested service is not available.
If resource name is the byte string "/session", and the implementation supports BiDi-only sessions:
1. Run any other implementation-defined steps to decide if the connection should be accepted, and if it is not stop running these steps and act as if the requested service is not available.
2. Add the connection to WebSocket connections not associated with a session.
3. Return.
Get a session ID for a WebSocket resource with resource name and let session id be that value. If session id is null then stop running these steps and act as if the requested service is not available.
If there is a session in the list of active sessions with session id as its session ID then let session be that session. Otherwise stop running these steps and act as if the requested service is not available.
Run any other implementation-defined steps to decide if the connection should be accepted, and if it is not stop running these steps and act as if the requested service is not available.
Otherwise append connection to session’s session WebSocket connections, and proceed with the WebSocket server-side requirements when a server chooses to accept an incoming connection.

Do we support > 1 connection for a single session?

When a WebSocket message has been received for a WebSocket connection connection with type type and data data, a remote end must handle an incoming message given connection, type and data.

When the WebSocket closing handshake is started or when the WebSocket connection is closed for a WebSocket connection connection, a remote end must handle a connection closing given connection.

Note: Both conditions are needed because it is possible for a WebSocket connection to be closed without a closing handshake.

To construct a WebSocket resource name given a session session:

If session is null, return "/session".
Return the result of concatenating the string "/session/" with session’s session ID.

To construct a WebSocket URL given a WebSocket listener listener and session session:

Let resource name be the result of construct a WebSocket resource name with session.
Return a WebSocket URI constructed with host set to listener’s host, port set to listener’s port, path set to resource name, following the wss-URI construct if listener’s secure flag is set and the ws-URI construct otherwise.

To get a session ID for a WebSocket resource given resource name:

If resource name doesn’t begin with the byte string "/session/", return null.
Let session id be the bytes in resource name following the "/session/" prefix.
If session id is not the string representation of a UUID, return null.
Return session id.
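
The three algorithms above can be sketched together as follows. This example works on Python strings rather than byte strings, takes the session ID directly instead of a session, and assumes that "the string representation of a UUID" means the canonical hyphenated form; the host and port in the usage are arbitrary.

```python
import uuid


def construct_resource_name(session_id):
    """Return "/session" for a null session, else "/session/<session id>"."""
    if session_id is None:
        return "/session"
    return "/session/" + session_id


def construct_websocket_url(host: str, port: int, secure: bool, session_id) -> str:
    # Note: an IPv6 literal host would need to be wrapped in brackets by the caller.
    scheme = "wss" if secure else "ws"
    return f"{scheme}://{host}:{port}{construct_resource_name(session_id)}"


def get_session_id(resource_name: str):
    prefix = "/session/"
    if not resource_name.startswith(prefix):
        return None
    candidate = resource_name[len(prefix):]
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return None
    # uuid.UUID also accepts other spellings (for example without hyphens),
    # so accept only the canonical 8-4-4-4-12 form.
    if str(parsed) != candidate.lower():
        return None
    return candidate
```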

To start listening for a WebSocket connection given a session session:

If there is an existing WebSocket listener in active listeners which the remote end would like to reuse, let listener be that listener. Otherwise let listener be a new WebSocket listener with implementation-defined host, port, secure flag, and an empty list of WebSocket resources.
Let resource name be the result of construct a WebSocket resource name with session.
Append resource name to the list of WebSocket resources for listener.
Append listener to the remote end’s active listeners.
Return listener.

Note: An intermediary node handling multiple sessions can use one or many WebSocket listeners. WebDriver defines that an endpoint node supports at most one session at a time, so it’s expected to only have a single listener.

Note: For an endpoint node the host in the above steps will typically be "localhost".

To handle an incoming message given a WebSocket connection connection, type type and data data:

If type is not text, send an error response given connection, null, and invalid argument, and finally return.
Assert: data is a scalar value string, because the WebSocket handling errors in UTF-8-encoded data would already have failed the WebSocket connection otherwise.

Nothing seems to define what status code is used for UTF-8 errors.
If there is a BiDi Session associated with connection connection, let session be that session. Otherwise if connection is in WebSocket connections not associated with a session, let session be null. Otherwise, return.
Let parsed be the result of parsing JSON into Infra values given data. If this throws an exception, then send an error response given connection, null, and invalid argument, and finally return.
If session is not null and not in active sessions then return.
Match parsed against the remote end definition. If this results in a match:
1. Let matched be the map representing the matched data.
2. Assert: matched contains "id", "method", and "params".
3. Let command id be matched["id"].
4. Let method be matched["method"].
5. Let command be the command with command name method.
6. If session is null and command is not a static command, then send an error response given connection, command id, and invalid session id, and return.
7. Run the following steps in parallel:
  1. Let result be the result of running the remote end steps for command given session and command parameters matched["params"].
  2. If result is an error, then send an error response given connection, command id, and result’s error code, and finally return.
  3. Let value be result’s data.
  4. Assert: value matches the definition for the result type corresponding to the command with command name method.
  5. If method is "session.new", let session be the entry in the list of active sessions whose session ID is equal to the "sessionId" property of value, append connection to session’s session WebSocket connections, and remove connection from the WebSocket connections not associated with a session.
  6. Let response be a new map matching the CommandResponse production in the local end definition with the id field set to command id and the result field set to value.
  7. Let serialized be the result of serialize an infra value to JSON bytes given response.
  8. Send a WebSocket message comprised of serialized over connection.
Otherwise:
1. Let command id be null.
2. If parsed is a map and parsed["id"] exists and is an integer greater than or equal to zero, set command id to that integer.
3. Let error code be invalid argument.
4. If parsed is a map and parsed["method"] exists and is a string, but parsed["method"] is not in the set of all command names, set error code to unknown command.
5. Send an error response given connection, command id, and error code.
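
The parsing and error-selection part of these steps can be sketched as below. This is a simplification under stated assumptions: matching against the remote end definition is reduced to checking "id", "method" and "params", session handling and command execution are left out, and `triage_message` is a hypothetical helper name.

```python
import json

JS_UINT_MAX = 9007199254740991


def is_non_negative_int(value) -> bool:
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def error_response(command_id, error_code: str, message: str) -> dict:
    # "id" is always present; it is null (None) when no command id could be recovered.
    return {"type": "error", "id": command_id, "error": error_code, "message": message}


def triage_message(data: str, command_names: set) -> dict:
    """Return {"dispatch": command} for a well-formed command, else an error response."""
    try:
        parsed = json.loads(data)
    except json.JSONDecodeError:
        return error_response(None, "invalid argument", "message is not valid JSON")
    is_map = isinstance(parsed, dict)
    method = parsed.get("method") if is_map else None
    if (is_map
            and is_non_negative_int(parsed.get("id")) and parsed["id"] <= JS_UINT_MAX
            and isinstance(method, str) and method in command_names
            and isinstance(parsed.get("params"), dict)):
        return {"dispatch": parsed}
    command_id = None
    if is_map and is_non_negative_int(parsed.get("id")):
        command_id = parsed["id"]
    error_code = "invalid argument"
    if isinstance(method, str) and method not in command_names:
        error_code = "unknown command"
    return error_response(command_id, error_code, "message did not match any command")
```

Recovering the command id even from a malformed message lets the local end associate the error with the request that caused it.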

To get related navigables given a settings object settings:

Let related navigables be an empty set.
If settings’ relevant global object is a Window:
1. Let navigable be settings’ relevant global object’s associated Document’s node navigable.
2. If navigable is not null, append navigable to related navigables.
Otherwise if the global object specified by settings is a WorkerGlobalScope, for each owner in the global object’s owner set:
1. Let navigable be null.
2. If owner is a Document, set navigable to owner’s node navigable.
3. If navigable is not null, append navigable to related navigables.
Return related navigables.

To get navigables by ids given a list of context ids navigable ids:

Let result be an empty set.
For each navigable id in navigable ids:
1. Let navigable be the navigable with id navigable id if such navigable exists, and null otherwise.
2. Append navigable to result if navigable is not null.
Return result.

To get top-level traversables given a list of navigables navigables:

Let result be an empty set.
For each navigable in navigables:
1. Append navigable’s top-level traversable to result.
Return result.

To get valid navigables by ids given a list of context ids navigable ids:

Let result be an empty set.
For each navigable id in navigable ids:
1. Let navigable be the result of trying to get a navigable with navigable id.
2. Append navigable to result.
Return success with data result.

To get valid top-level traversables by ids given a list of context ids navigable ids:

Let result be an empty set.
For each navigable id in navigable ids:
1. Let navigable be the result of trying to get a navigable with navigable id.
2. If navigable is not a top-level traversable, return error with error code invalid argument.
3. Append navigable to result.
Return success with data result.

To emit an event given session, and body:

Assert: body matches the Event production.
Let serialized be the result of serialize an infra value to JSON bytes given body.
For each connection in session’s session WebSocket connections:
1. Send a WebSocket message comprised of serialized over connection.

To send an error response given a WebSocket connection connection, command id, and error code:

Let error data be a new map matching the ErrorResponse production in the local end definition, with the id field set to command id, the error field set to error code, the message field set to an implementation-defined string containing a human-readable description of the error that occurred and the stacktrace field optionally set to an implementation-defined string containing a stack trace report of the active stack frames at the time when the error occurred.
Let response be the result of serialize an infra value to JSON bytes given error data.

Note: command id can be null, in which case the id field will also be set to null, not omitted from response.
Send a WebSocket message comprised of response over connection.

To handle a connection closing given a WebSocket connection connection:

If there is a BiDi session associated with connection connection:
1. Let session be the BiDi session associated with connection connection.
2. Remove connection from session’s session WebSocket connections.
Otherwise, if WebSocket connections not associated with a session contains connection, remove connection from that set.

Note: This does not end any session.

Need to hook in to the session ending to allow the UA to close the listener if it wants.

To close the WebSocket connections given session:

For each connection in session’s session WebSocket connections:
1. Start the WebSocket closing handshake with connection.
  
  Note: this will result in the steps in handle a connection closing being run for connection, which will clean up resources associated with connection.

4.1. Establishing a Connection

WebDriver clients opt in to a bidirectional connection by requesting the WebSocket URL capability with value true.

The WebDriver new session algorithm defined by this specification, with parameters session, capabilities, and flags is:

If flags contains "bidi", return.
Let webSocketUrl be the result of getting a property named "webSocketUrl" from capabilities.
If webSocketUrl is undefined, return.
Assert: webSocketUrl is true.
Let listener be the result of start listening for a WebSocket connection given session.
Set webSocketUrl to the result of construct a WebSocket URL with listener and session.
Set a property on capabilities named "webSocketUrl" to webSocketUrl.
Set session’s BiDi flag to true.
Append "bidi" to flags.

Implementations should also allow clients to establish a BiDi Session which is not an HTTP Session. In this case the URL to the WebSocket server is communicated out-of-band. An implementation that allows this supports BiDi-only sessions. At the time such an implementation is ready to accept requests to start a WebDriver session, it must:

Start listening for a WebSocket connection given null.

5. Sandboxed Script Execution

A common requirement for automation tools is to execute scripts which have access to the DOM of a document, but don’t have information about any changes to the DOM APIs made by scripts running in the navigable containing the document.

A BiDi session has a sandbox map which is a weak map in which the keys are Window objects, and the values are maps between strings and SandboxWindowProxy objects.

Note: The definition of sandboxes here is an attempt to codify the behaviour of existing implementations. It exposes parts of the implementations that have previously been considered internal by specifications, in particular the distinction between the internal state of platform objects (which is typically implemented as native objects in the main implementation language of the browser engine) and the ECMAScript-visible state. Because existing sandbox implementations happen at a low level in the engine, implementations converging toward the specification in all details might be a slow process. In the meantime, implementers are encouraged to provide detailed documentation on any differences with the specification, and users of this feature are encouraged to explicitly test that scripts running in sandboxes work in all implementations.

5.1. Sandbox Realms

Each sandbox is a unique ECMAScript Realm. However, the sandbox realm provides access to platform objects in an existing Window realm via SandboxProxy objects.

To get or create a sandbox realm given name and navigable:

If name is an empty string, then return error with error code invalid argument.
Let window be navigable’s active window.
If sandbox map does not contain window, set sandbox map[window] to a new map.
Let sandboxes be sandbox map[window].
If sandboxes does not contain name, set sandboxes[name] to create a sandbox realm with window.
Return success with data sandboxes[name].

To create a sandbox realm with window:

Define creation of sandbox realm. This is going to return a SandboxWindowProxy wrapping window.

To get a sandbox name given target realm:

Let realms maps be get the values of sandbox map.
For each realms map in realms maps:
1. For each name → realm in realms map:
  1. If realm is target realm, return name.
Return null.
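
The sandbox map bookkeeping above can be sketched in Python as follows. This is a minimal illustration, not a definition of sandbox realms: the Window and SandboxWindowProxy classes are hypothetical stand-ins, the function takes the window directly rather than a navigable, and a ValueError stands in for the invalid argument error.

```python
import weakref


class Window:
    """Hypothetical stand-in for a Window object."""


class SandboxWindowProxy:
    """Hypothetical placeholder; the real object is not yet defined."""

    def __init__(self, window):
        # Hold the window weakly so the weak map entry can still be collected.
        self.wrapped = weakref.ref(window)


# sandbox map: weak map from Window to {sandbox name: SandboxWindowProxy}
sandbox_map = weakref.WeakKeyDictionary()


def get_or_create_sandbox_realm(name, window):
    if name == "":
        raise ValueError("invalid argument")
    sandboxes = sandbox_map.setdefault(window, {})
    if name not in sandboxes:
        sandboxes[name] = SandboxWindowProxy(window)
    return sandboxes[name]


def get_sandbox_name(target_realm):
    for realms_map in sandbox_map.values():
        for name, realm in realms_map.items():
            if realm is target_realm:
                return name
    return None
```

Requesting the same name twice for one window yields the same proxy, which is the property automation clients usually rely on when reusing a named sandbox.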

5.2. Sandbox Proxy Objects

A SandboxProxy object is an exotic object that mediates sandboxed access to objects from another realm. Sandbox proxy objects are designed to enforce the following restrictions:

Platform objects are accessible, but property access returns only Web IDL-defined properties and not ECMAScript-defined properties (either "expando" properties that are not present in the underlying interface, or ECMAScript-defined properties that shadow a property in the underlying interface).
Setting a property either runs Web IDL-defined setter steps, or sets a property on the proxy object. This means that properties written outside the sandbox are not accessible, but interface members can be used as normal.

There is no SandboxProxy interface object.

Define in detail how SandboxProxy works

To get unwrapped object:

While object is SandboxProxy or SandboxWindowProxy, set object to its wrapped object.
Return object.

5.3. SandboxWindowProxy

A SandboxWindowProxy is an exotic object that represents a Window object wrapped by a SandboxProxy object. This provides sandboxed access to that data in a Window global.

Define how this works.

6. User Contexts

A user context represents a collection of zero or more top-level traversables within a remote end. Each user context has an associated storage partition, so that remote end data is not shared between different user contexts.

Unclear that this is the best way to formally define the concept of a user context or the interaction with storage.

Note: The infra spec uses the term "user agent" to refer to the same concept as user contexts. However, this is not compatible with usage of the term "user agent" to mean the entire web client with multiple user contexts. Although this difference is not visible to web content, it is observed via WebDriver, so we avoid using this terminology.

A user context has a user context id, which is a unique string set upon the user context creation.

A navigable has an associated user context, which is a user context.

When a new top-level traversable is created its associated user context is set to a user context in the set of user contexts.

Note: In some cases the user context is set by specification when the top-level traversable is created; however, in cases where no such requirements are present, the associated user context for a top-level traversable is implementation-defined.

Should we specify that top-level traversables with a non-null opener have the same associated user context as their opener? Need to check if this is something existing implementations enforce.

A child navigable’s associated user context is its parent’s associated user context.

A user context which isn’t the associated user context for any top-level traversable is an empty user context.

The default user context is a user context with user context id "default".

An implementation has a set of user contexts, which is a set of user contexts. Initially this contains the default user context.

Implementations may append new user contexts to the set of user contexts at any time, for example in response to user actions.

Note: "At any time" here includes during implementation startup, so a given implementation might always have multiple entries in the set of user contexts.

Implementations may remove any empty user context, with the exception of the default user context, from the set of user contexts at any time. However, they are not required to remove such user contexts. User contexts that are not empty user contexts must not be removed from the set of user contexts.

A BiDi session has a user context to accept insecure certificates override map, which is a map between user contexts and boolean.

A BiDi session has a user context to proxy configuration map, which is a map between user contexts and proxy configuration.

An emulated network conditions struct is a struct with:

item named offline which is a boolean or null.

A BiDi session has an emulated network conditions which is a struct with an item named default network conditions, which is an emulated network conditions struct or null, an item named user context network conditions, which is a weak map between user contexts and emulated network conditions struct, and an item named navigable network conditions, which is a weak map between navigables and emulated network conditions struct.

When a user context is removed from the set of user contexts, remove user context subscriptions given that user context.

To remove user context subscriptions given user context:

For each session in active sessions:
1. Let subscriptions to remove be a set.
2. For each subscription in session’s subscriptions:
  1. If subscription’s user context ids contains user context’s user context id:
    1. Remove user context’s user context id from subscription’s user context ids.
    2. If subscription’s user context ids is empty:
      1. Append subscription to subscriptions to remove.
3. Remove subscriptions to remove from session’s subscriptions.
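
The following Python sketch illustrates these steps under the assumption that the id being matched is that of the user context being removed. The Subscription and Session classes are hypothetical and carry only the fields the steps touch.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    subscription_id: str
    event_names: set
    user_context_ids: set = field(default_factory=set)


@dataclass
class Session:
    subscriptions: list = field(default_factory=list)


def remove_user_context_subscriptions(active_sessions, removed_id):
    for session in active_sessions:
        ids_to_remove = set()
        for subscription in session.subscriptions:
            if removed_id in subscription.user_context_ids:
                subscription.user_context_ids.discard(removed_id)
                # A subscription left with no user contexts is dropped.
                if not subscription.user_context_ids:
                    ids_to_remove.add(subscription.subscription_id)
        session.subscriptions = [
            s for s in session.subscriptions
            if s.subscription_id not in ids_to_remove
        ]
```

Subscriptions that never referenced a user context keep their empty set and are left untouched, since the emptiness check only runs after a removal.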

To get user context given user context id:

For each user context in the set of user contexts:
1. If user context’s user context id equals user context id:
  1. Return user context.
Return null.

To get valid user contexts given user context ids:

Let result be an empty set.
For each user context id of user context ids:
1. Let user context be get user context with user context id.
2. If user context is null, return error with error code no such user context.
3. Append user context to result.
Return result.
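
A minimal Python sketch of the two lookup algorithms above follows. The UserContext class, the module-level set, and the NoSuchUserContext exception (standing in for the no such user context error code) are assumptions made for the example.

```python
from dataclasses import dataclass


class NoSuchUserContext(Exception):
    """Stands in for the "no such user context" error code."""


@dataclass(frozen=True)
class UserContext:
    user_context_id: str


# The set of user contexts initially contains the default user context.
user_contexts = {UserContext("default")}


def get_user_context(user_context_id):
    for user_context in user_contexts:
        if user_context.user_context_id == user_context_id:
            return user_context
    return None


def get_valid_user_contexts(user_context_ids):
    result = set()
    for user_context_id in user_context_ids:
        user_context = get_user_context(user_context_id)
        if user_context is None:
            raise NoSuchUserContext(user_context_id)
        result.add(user_context)
    return result
```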

7. Modules

7.1. The session Module

The session module contains commands and events for monitoring the status of the remote end.

7.1.1. Definition

remote end definition

SessionCommand = (
  session.End //
  session.New //
  session.Status //
  session.Subscribe //
  session.Unsubscribe
)

local end definition

SessionResult = (
  session.EndResult /
  session.NewResult /
  session.StatusResult /
  session.SubscribeResult /
  session.UnsubscribeResult
)

To end the session given session:

Remove session from active sessions.
If active sessions is empty, set the webdriver-active flag to false.

To cleanup the session given session:

Close the WebSocket connections with session.
For each user context in the set of user contexts:
1. Remove session’s user context to accept insecure certificates override map[user context].
2. Remove session’s user context to proxy configuration map[user context].
For each request id → (request, phase, response) in session’s blocked request map:
1. Resume with "continue request", request id and (response, "incomplete").
For each collector in session’s network collectors:
1. Let collector id be collector’s collector.
2. For each collected data in collected network data, remove collector from data with collected data and collector id.
For each screencast recording in session’s screencast recordings map:
1. Stop a screencast recording given screencast recording.
2. Remove screencast recording from screencast recordings map.
If active sessions is empty, cleanup remote end state.
Perform any implementation-specific cleanup steps.

To cleanup remote end state:

Clear the before request sent map.
Set the default cache behavior to "default".
Clear the navigable cache behavior map.
Perform implementation-defined steps to enable any implementation-specific resource caches that are usually enabled in the current remote end configuration.

7.1.2. Types

7.1.2.1. The session.CapabilitiesRequest Type

session.CapabilitiesRequest = {
  ? alwaysMatch: session.CapabilityRequest,
  ? firstMatch: [*session.CapabilityRequest]
}

The session.CapabilitiesRequest type represents the capabilities requested for a session.

7.1.2.2. The session.CapabilityRequest Type

remote end definition and local end definition

session.CapabilityRequest = {
  ? acceptInsecureCerts: bool,
  ? browserName: text,
  ? browserVersion: text,
  ? platformName: text,
  ? proxy: session.ProxyConfiguration,
  ? unhandledPromptBehavior: session.UserPromptHandler,
  Extensible
}

The session.CapabilityRequest type represents a specific set of requested capabilities.

WebDriver BiDi defines additional WebDriver capabilities. The following table enumerates the capabilities each implementation must support for WebDriver BiDi.

Capability:	WebSocket URL
Key:	"webSocketUrl"
Value type:	boolean
Description:	Defines the current session’s support for bidirectional connection.

The additional capability deserialization algorithm for the "webSocketUrl" capability, with parameter value is:

If value is not a boolean, return error with code invalid argument.
Return success with data value.

The matched capability serialization algorithm for the "webSocketUrl" capability, with parameter value is:

If value is false, return success with data null.
Return success with data true.
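
The two "webSocketUrl" algorithms above can be sketched in Python as shown below. The function names and the InvalidArgument exception (standing in for the invalid argument error code) are hypothetical; None represents null.

```python
class InvalidArgument(Exception):
    """Stands in for the "invalid argument" error code."""


def deserialize_web_socket_url(value):
    # Only a real boolean is accepted; strings and numbers are rejected.
    if not isinstance(value, bool):
        raise InvalidArgument("webSocketUrl must be a boolean")
    return value


def serialize_matched_web_socket_url(value):
    # A false value serializes to null (None); anything else to true.
    if value is False:
        return None
    return True
```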

7.1.2.3. The session.ProxyConfiguration Type

remote end definition and local end definition

session.ProxyConfiguration = {
   session.AutodetectProxyConfiguration //
   session.DirectProxyConfiguration //
   session.ManualProxyConfiguration //
   session.PacProxyConfiguration //
   session.SystemProxyConfiguration
}

session.AutodetectProxyConfiguration = (
   proxyType: "autodetect",
   Extensible
)

session.DirectProxyConfiguration = (
   proxyType: "direct",
   Extensible
)

session.ManualProxyConfiguration = (
   proxyType: "manual",
   ? httpProxy: text,
   ? sslProxy: text,
   ? session.SocksProxyConfiguration,
   ? noProxy: [*text],
   Extensible
)

session.SocksProxyConfiguration = (
   socksProxy: text,
   socksVersion: 0..255,
)

session.PacProxyConfiguration = (
   proxyType: "pac",
   proxyAutoconfigUrl: text,
   Extensible
)

session.SystemProxyConfiguration = (
   proxyType: "system",
   Extensible
)

7.1.2.4. The session.UserPromptHandler Type

Remote end definition and local end definition

session.UserPromptHandler = {
  ? alert: session.UserPromptHandlerType,
  ? beforeUnload: session.UserPromptHandlerType,
  ? confirm: session.UserPromptHandlerType,
  ? default: session.UserPromptHandlerType,
  ? file: session.UserPromptHandlerType,
  ? prompt: session.UserPromptHandlerType,
}

The session.UserPromptHandler type represents the configuration of the user prompt handler.

Note: file handles the file picker. Both "accept" and "dismiss" dismiss the picker; "ignore" keeps the picker open.

7.1.2.5. The session.UserPromptHandlerType Type

Remote end definition and local end definition

session.UserPromptHandlerType = "accept" / "dismiss" / "ignore";

The session.UserPromptHandlerType type represents the behavior of the user prompt handler.

7.1.2.6. The session.Subscription Type

session.Subscription = text

The session.Subscription type represents a unique subscription identifier.

7.1.2.7. The session.SubscribeParameters Type

session.SubscribeParameters = {
  events: [+text],
  ? contexts: [+browsingContext.BrowsingContext],
  ? userContexts: [+browser.UserContext],
}

The session.SubscribeParameters type represents a request to subscribe to a specific set of events.

7.1.2.8. The session.UnsubscribeByIDRequest Type

session.UnsubscribeByIDRequest = {
  subscriptions: [+session.Subscription],
}

The session.UnsubscribeByIDRequest type represents a request to remove event subscriptions identified by subscription IDs.

7.1.2.9. The session.UnsubscribeByAttributesRequest Type

session.UnsubscribeByAttributesRequest = {
  events: [+text],
}

The session.UnsubscribeByAttributesRequest type represents a request to unsubscribe using subscription attributes.

7.1.3. Commands

7.1.3.1. The session.status Command

The session.status command returns information about whether a remote end is in a state in which it can create new sessions, but may additionally include arbitrary meta information that is specific to the implementation.

This is a static command.

Command Type

session.Status = (
  method: "session.status",
  params: EmptyParams,
)

Return Type

session.StatusResult = {
  ready: bool,
  message: text,
}

The remote end steps given session and command parameters are:

Let body be a new map with the following properties:

"ready"
The remote end’s readiness state.
"message"
An implementation-defined string explaining the remote end’s readiness state.
Return success with data body
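
As an illustration, a local end might build the session.status command and check the shape of the returned data along the following lines. This is a sketch only: the "id" field of the message envelope and both helper names are assumptions made for the example.

```python
import json


def make_status_command(command_id):
    """Serialize a session.status command; the "id" field is assumed."""
    return json.dumps(
        {"id": command_id, "method": "session.status", "params": {}}
    )


def check_status_result(result):
    """Return result if it matches session.StatusResult, else raise."""
    if not isinstance(result.get("ready"), bool):
        raise ValueError("ready must be a boolean")
    if not isinstance(result.get("message"), str):
        raise ValueError("message must be a string")
    return result
```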

7.1.3.2. The session.new Command

The session.new command allows creating a new BiDi session.

Note: A session created this way will not be accessible via HTTP.

This is a static command.

Command Type

session.New = (
  method: "session.new",
  params: session.NewParameters
)

session.NewParameters = {
  capabilities: session.CapabilitiesRequest
}

Return Type

session.NewResult = {
  sessionId: text,
  capabilities: {
    acceptInsecureCerts: bool,
    browserName: text,
    browserVersion: text,
    platformName: text,
    setWindowRect: bool,
    userAgent: text,
    ? proxy: session.ProxyConfiguration,
    ? unhandledPromptBehavior: session.UserPromptHandler,
    ? webSocketUrl: text,
    Extensible
  }
}

The remote end steps given session and command parameters are:

If session is not null, return an error with error code session not created.
If the implementation is unable to start a new session for any reason, return an error with error code session not created.
Let flags be a set containing "bidi".
Let capabilities json be the result of trying to process capabilities with command parameters and flags.
Let capabilities be convert a JSON-derived JavaScript value to an Infra value with capabilities json.
Let session be the result of trying to create a session with capabilities and flags.
Set session’s BiDi flag to true.

Note: the connection for this session will be set to the current connection by the caller.
Let body be a new map matching the session.NewResult production, with the sessionId field set to session’s session ID, and the capabilities field set to capabilities.
Return success with data body.
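
A local end could assemble the session.new command as in the Python sketch below. The helper name and the "id" envelope field are assumptions for illustration; the params follow the session.NewParameters and session.CapabilitiesRequest productions above.

```python
import json


def make_new_session_command(command_id, always_match=None, first_match=None):
    """Build a session.new command; both capability fields are optional."""
    capabilities = {}
    if always_match is not None:
        capabilities["alwaysMatch"] = always_match
    if first_match is not None:
        capabilities["firstMatch"] = list(first_match)
    return json.dumps({
        "id": command_id,
        "method": "session.new",
        "params": {"capabilities": capabilities},
    })
```

For example, make_new_session_command(1, always_match={"acceptInsecureCerts": True}) requests a session that accepts insecure certificates regardless of which firstMatch entry is selected.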

7.1.3.3. The session.end Command

The session.end command ends the current session.

Command Type

session.End = (
  method: "session.end",
  params: EmptyParams
)

Return Type

session.EndResult = EmptyResult

The remote end steps given session and command parameters are:

End the session with session.
Return success with data null, and in parallel run the following steps:
1. Wait until the Send a WebSocket message steps have been called with the response to this command.
  
  This is rather imprecise language, but hopefully it’s clear that the intent is that we send the response to the command before starting shutdown of the connections.
2. Cleanup the session with session.

7.1.3.4. The session.subscribe Command

The session.subscribe command enables certain events either globally or for a set of navigables.

This needs to be generalized to work with realms too.

Command Type

session.Subscribe = (
  method: "session.subscribe",
  params: session.SubscribeParameters
)

Return Type

session.SubscribeResult = {
  subscription: session.Subscription,
}

The remote end steps with session and command parameters are:

Let event names be an empty set.
For each entry name in command parameters["events"], let event names be the union of event names and the result of trying to obtain a set of event names with name.
Let input user context ids be create a set with command parameters["userContexts"].
Let input context ids be create a set with command parameters["contexts"].
If input user context ids is not empty and input context ids is not empty, return error with error code invalid argument.
Let subscription navigables be a set.
Let top-level traversable context ids be a set.
If input context ids is not empty:
1. Let navigables be the result of trying to get valid navigables by ids with input context ids.
2. Set subscription navigables to get top-level traversables with navigables.
3. For each navigable in subscription navigables:
  1. Append navigable’s navigable id to top-level traversable context ids.
Otherwise, if input user context ids is not empty:
1. For each user context id of input user context ids:
  1. Let user context be get user context with user context id.
  2. If user context is null, return error with error code no such user context.
  3. For each top-level traversable in the list of all top-level traversables whose associated user context is user context:
    1. Append top-level traversable to subscription navigables.
Otherwise, set subscription navigables to a set of all top-level traversables in the remote end.
Let subscription be a subscription with subscription id set to the string representation of a UUID, event names set to event names, top-level traversable ids set to top-level traversable context ids and user context ids set to input user context ids.
Let subscribe step events be a new map.
For each event name in the event names:
1. If the event with event name event name does not define remote end subscribe steps, continue.
2. Let existing navigables be a set of top-level traversables for which an event is enabled with session and event name.
3. Set subscribe step events[event name] to difference of subscription navigables and existing navigables.
Append subscription to session’s subscriptions.
Append subscription’s subscription id to session’s known subscription ids.
Sort in ascending order subscribe step events using the following less than algorithm given two entries with keys event name one and event name two:
1. Let event one be the event with name event name one
2. Let event two be the event with name event name two
3. Return true if event one’s subscribe priority is less than event two’s subscribe priority, or false otherwise.
If subscription is global, let include global be true, otherwise let include global be false.
For each event name → navigables in subscribe step events:
1. Run the remote end subscribe steps for the event with event name event name given session, navigables and include global.
Let body be a new map matching the session.SubscribeResult production, with the subscription field set to subscription’s subscription id.
Return success with data body.
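
The parameter handling at the start of these steps can be sketched in Python as follows. Several simplifications are assumed: event names are taken literally rather than expanded, top_level_of is a hypothetical map from a navigable id to the id of its top-level traversable, an unknown context id simply raises KeyError, and a subscription is treated as global when it names neither navigables nor user contexts.

```python
import uuid
from dataclasses import dataclass, field


class InvalidArgument(Exception):
    """Stands in for the "invalid argument" error code."""


@dataclass
class Subscription:
    subscription_id: str
    event_names: set
    top_level_traversable_ids: set = field(default_factory=set)
    user_context_ids: set = field(default_factory=set)

    def is_global(self):
        # Assumption: global means no navigable and no user context scope.
        return not self.top_level_traversable_ids and not self.user_context_ids


def build_subscription(params, top_level_of):
    event_names = set(params["events"])
    user_context_ids = set(params.get("userContexts", []))
    context_ids = set(params.get("contexts", []))
    # contexts and userContexts cannot be combined in one request.
    if user_context_ids and context_ids:
        raise InvalidArgument("contexts and userContexts are mutually exclusive")
    top_level_ids = {top_level_of[context_id] for context_id in context_ids}
    return Subscription(
        str(uuid.uuid4()), event_names, top_level_ids, user_context_ids
    )
```

Subscribing with the id of a child navigable therefore records the id of its top-level traversable, matching the behaviour of the steps that append to top-level traversable context ids.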

7.1.3.5. The session.unsubscribe Command

The session.unsubscribe command disables events either globally or for a set of navigables.

This needs to be generalised to work with realms too.

Command Type

session.Unsubscribe = (
  method: "session.unsubscribe",
  params: session.UnsubscribeParameters,
)

session.UnsubscribeParameters = session.UnsubscribeByAttributesRequest / session.UnsubscribeByIDRequest

Return Type

session.UnsubscribeResult = EmptyResult

The remote end steps with session and command parameters are:

If command parameters does not contain "subscriptions":

Note: The condition implies that command parameters is matching the session.UnsubscribeByAttributesRequest production.
1. Let event names be an empty set.
2. For each entry name in command parameters["events"], let event names be the union of event names and the result of trying to obtain a set of event names with name.
3. Let new subscriptions be a list.
4. Let matched events be a set.
5. For each subscription of session’s subscriptions:
  1. If intersection of subscription’s event names and event names is an empty set:
    1. Append subscription to new subscriptions.
    2. Continue.
  2. If subscription is not global:
    1. Append subscription to new subscriptions.
    2. Continue.
  3. Let subscription event names be clone of subscription’s event names.
  4. For each event name of event names:
    1. If subscription event names contains event name:
      1. Append event name to matched events.
      2. Remove event name from subscription event names.
  5. If subscription event names is not empty:
    1. Let cloned subscription be a subscription with subscription id set to subscription’s subscription id, event names set to a new set containing subscription event names.
    2. Append cloned subscription to new subscriptions.
6. If matched events is not equal to event names, return error with error code invalid argument.
7. Set session’s subscriptions to new subscriptions.
Otherwise:
1. Let subscriptions be create a set with command parameters["subscriptions"].
2. Let unknown subscription ids be the difference between subscriptions and session’s known subscription ids.
3. If unknown subscription ids is not empty:
  1. Return error with error code invalid argument.
4. Let subscriptions to remove be an empty set.
5. For each subscription in session’s subscriptions:
  1. If subscriptions contains subscription’s subscription id:
    1. Append subscription to subscriptions to remove.
6. Set session’s known subscription ids to difference between session’s known subscription ids and subscriptions.
7. Remove each item in subscriptions to remove from session’s subscriptions.
Return success with data null.
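
The attribute-based branch of these steps can be sketched in Python as below. The Subscription class is a hypothetical reduction with a boolean is_global field, event names are taken literally rather than expanded, and InvalidArgument stands in for the invalid argument error code. The function returns the new list instead of mutating the session, so nothing changes when an error is raised.

```python
from dataclasses import dataclass


class InvalidArgument(Exception):
    """Stands in for the "invalid argument" error code."""


@dataclass
class Subscription:
    subscription_id: str
    event_names: frozenset
    is_global: bool = True


def unsubscribe_by_attributes(subscriptions, event_names):
    event_names = set(event_names)
    new_subscriptions = []
    matched_events = set()
    for subscription in subscriptions:
        overlap = subscription.event_names & event_names
        # Only global subscriptions that share an event are affected.
        if not overlap or not subscription.is_global:
            new_subscriptions.append(subscription)
            continue
        matched_events |= overlap
        remaining = subscription.event_names - event_names
        if remaining:
            # Keep the same id, narrowed to the events still subscribed.
            new_subscriptions.append(
                Subscription(subscription.subscription_id, remaining)
            )
    if matched_events != event_names:
        raise InvalidArgument("an event had no matching global subscription")
    return new_subscriptions
```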

7.2. The browser Module

The browser module contains commands for managing the remote end browser process.

7.2.1. Definition

remote end definition

BrowserCommand = (
  browser.Close //
  browser.CreateUserContext //
  browser.GetClientWindows //
  browser.GetUserContexts //
  browser.RemoveUserContext //
  browser.SetClientWindowState //
  browser.SetDownloadBehavior
)

local end definition

BrowserResult = (
  browser.CloseResult /
  browser.CreateUserContextResult /
  browser.GetClientWindowsResult /
  browser.GetUserContextsResult /
  browser.RemoveUserContextResult /
  browser.SetClientWindowStateResult /
  browser.SetDownloadBehaviorResult
)

7.2.2. Windows

Each top-level traversable is associated with a single client window which represents a rectangular area containing the viewport that will be used to render that top-level traversable’s active document when its visibility state is "visible", as well as any browser-specific user interface elements associated with displaying the traversable (e.g. any URL bar, toolbars, or OS window decorations).

A client window has a client window id which is a string uniquely identifying that window.

A client window has an x-coordinate, which is the number of CSS pixels between the left edge of the web-exposed screen area and the left edge of the window, or zero if that doesn’t make sense for a particular window.

A client window has a y-coordinate, which is the number of CSS pixels between the top edge of the web-exposed screen area and the top edge of the window, or zero if that doesn’t make sense for a particular window.

A client window has a width, which is the width of the window’s rectangle in CSS pixels.

A client window has a height, which is the height of the window’s rectangle in CSS pixels.

To maximize the client window window an implementation should either perform steps corresponding to the platform notion of maximizing window, or position window such that its x-coordinate is as close as possible to 0, its y-coordinate is as close as possible to 0, its width is as close as possible to the width of the web-exposed screen area and its height is as close as possible to the height of the web-exposed screen area. If either of these options is supported, then maximize client window is supported.

To minimize the client window window an implementation should either perform steps corresponding to the platform notion of minimizing window, or otherwise hide window such that all the active documents in top-level traversables associated with window have visibility state "hidden" and window’s width and height are both as close as possible to 0. If either of these options is supported, then minimize client window is supported.

To restore the client window window an implementation should ensure that it’s neither in a platform-defined maximized state, nor in a platform-defined minimized state, and that if there is one or more top-level traversable associated with window, at least one of those has an active document in the "visible" state. If this is supported then restore client window is supported.

To get the client window state given window:

Let documents be an empty list.
Let visible documents be an empty list.
For each top-level traversable traversable:
1. If traversable’s client window is not window then continue.
2. Let document be traversable’s active document.
3. Append document to documents.
4. If document’s visibility state is "visible", append document to visible documents.
For each document in visible documents:
1. If document’s fullscreen element is not null, return "fullscreen".
If visible documents is empty but documents is not empty, or if window is otherwise in an OS-specific minimized state, return "minimized".

Note: This will usually, but not necessarily, mean that window’s width and height are equal to 0.
If window is in an OS-specific maximized state return "maximized".

Note: This will usually, but not necessarily, mean that window’s width is equal to the width of the web-exposed screen area and window’s height is equal to the height of the web-exposed screen area.
Return "normal".
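
The order of checks in these steps can be sketched in Python as follows. The Document class and the two OS-state flags are hypothetical inputs; documents is assumed to hold the active documents of the top-level traversables whose client window is the window in question.

```python
from dataclasses import dataclass


@dataclass
class Document:
    visible: bool
    fullscreen_element: object = None


def get_client_window_state(documents, os_minimized=False, os_maximized=False):
    visible_documents = [d for d in documents if d.visible]
    # Fullscreen takes precedence over every other state.
    if any(d.fullscreen_element is not None for d in visible_documents):
        return "fullscreen"
    # A window with documents, none of them visible, counts as minimized.
    if (documents and not visible_documents) or os_minimized:
        return "minimized"
    if os_maximized:
        return "maximized"
    return "normal"
```

Because the checks run in this order, a window that is both OS-maximized and showing a fullscreen element reports "fullscreen".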

To set the client window state given window and state:

Let current state be get the client window state with window.
If current state is "fullscreen", "maximized", or "minimized" and is equal to state, return success with data null.
In the following list of conditions and associated steps, run the first set of steps for which the associated condition is true:

"fullscreen"
If not fullscreen is supported return error with error code unsupported operation.
"normal"
If not restore client window is supported for window return error with error code unsupported operation.
"maximized"
If not maximize client window is supported for window return error with error code unsupported operation.
"minimized"
If not minimize client window is supported for window return error with error code unsupported operation.
Let documents be an empty list.
For each top-level traversable traversable:
1. If traversable’s associated client window is not window then continue.
2. Let document be traversable’s active document.
3. Append document to documents.
If documents is empty return error with error code no such client window.
If current state is "fullscreen":
1. For each document in documents:
  1. Fully exit fullscreen with document.
    
    Note: This is a no-op for documents in window that are not fullscreen.
If current state is "maximized" or "minimized":
1. Restore the client window window.
Switch on the value of state:
"fullscreen"
1. For each document in documents:
  1. If document’s visibility state is "visible", fullscreen an element with document’s document element.
  2. Break.
"maximized"
1. Maximize the client window window.
"minimized"
1. Minimize the client window window.
Return success with data null.

7.2.3. Types

7.2.3.1. The browser.ClientWindow Type

browser.ClientWindow = text;

The browser.ClientWindow type uniquely identifies a client window.

7.2.3.2. The browser.ClientWindowInfo Type

browser.ClientWindowInfo = {
  active: bool,
  clientWindow: browser.ClientWindow,
  height: js-uint,
  state: "fullscreen" / "maximized" / "minimized" / "normal",
  width: js-uint,
  x: js-int,
  y: js-int,
}

The browser.ClientWindowInfo type represents properties of a client window.

To get the client window info given client window:

Let client window id be the client window id for client window.
Let state be get the client window state with client window.
If client window can receive keyboard input channeled from the operating system, let active be true, otherwise let active be false.

Note: This could mean that a top-level traversable whose client window is client window has system focus, or it could mean that the user interface of the browser itself currently has focus.
Let client window info be a map matching the browser.ClientWindowInfo production with the clientWindow field set to client window id, state field set to state, the x field set to client window’s x-coordinate, the y field set to client window’s y-coordinate, the width field set to client window’s width, the height field set to client window’s height, and the active field set to active.
Return client window info.

7.2.3.3. The browser.UserContext Type

browser.UserContext = text;

The browser.UserContext type uniquely identifies a user context.

7.2.3.4. The browser.UserContextInfo Type

browser.UserContextInfo = {
  userContext: browser.UserContext
}

The browser.UserContextInfo type represents properties of a user context.

7.2.4. Commands

7.2.4.1. The browser.close Command

The browser.close command terminates all WebDriver sessions and cleans up automation state in the remote browser instance.

Command Type

browser.Close = (
  method: "browser.close",
  params: EmptyParams,
)

Return Type

browser.CloseResult = EmptyResult

The remote end steps with session and command parameters are:

End the session with session.
If active sessions is not empty an implementation may return error with error code unable to close browser, and then run the following steps in parallel:
1. Wait until the Send a WebSocket message steps have been called with the response to this command.
2. Cleanup the session with session.
Note: The behaviour in cases where the browser has multiple automation sessions is currently unspecified. It might be that any session can close the browser, or that only the final open session can actually close the browser, or only the first session started can. This behaviour might be fully specified in a future version of this specification.
For each active session in active sessions:
1. End the session active session.
2. Cleanup the session with active session.
Return success with data null, and run the following steps in parallel:
1. Wait until the Send a WebSocket message steps have been called with the response to this command.
2. Cleanup the session with session.
3. Close any top-level traversables without prompting to unload.
4. Perform implementation defined steps to clean up resources associated with the remote end under automation.
  
  Note: For example this might include cleanly shutting down any OS-level processes associated with the browser under automation, removing temporary state, such as user profile data, created by the remote end while under automation, or shutting down the WebSocket Listener. Because of differences between browsers and operating systems it is not possible to specify in detail precise invariants local ends can depend on here.

7.2.4.2. The browser.createUserContext Command

The browser.createUserContext command creates a user context.

Command Type

browser.CreateUserContext = (
  method: "browser.createUserContext",
  params: browser.CreateUserContextParameters,
)

browser.CreateUserContextParameters = {
  ? acceptInsecureCerts: bool,
  ? proxy: session.ProxyConfiguration,
  ? unhandledPromptBehavior: session.UserPromptHandler
}

Return Type

browser.CreateUserContextResult = browser.UserContextInfo

The remote end steps with session and command parameters are:

Let user context be a new user context.
If command parameters contains "acceptInsecureCerts":

Note: If "acceptInsecureCerts" is set, it overrides the accept insecure TLS flag’s behavior.
1. Let acceptInsecureCerts be command parameters["acceptInsecureCerts"].
2. If acceptInsecureCerts is true and endpoint node doesn’t support accepting insecure TLS connections, return error with error code unsupported operation.
3. Set session’s user context to accept insecure certificates override map[user context] to acceptInsecureCerts.
If command parameters contains "unhandledPromptBehavior", set unhandled prompt behavior overrides map[user context] to command parameters["unhandledPromptBehavior"].
If command parameters contains "proxy":
1. Let proxy configuration be command parameters["proxy"].
2. If the remote end is unable to configure proxy settings per user context, or is unable to configure the proxy with proxy configuration, return error with error code unsupported operation.
3. Set session’s user context to proxy configuration map[user context] to proxy configuration.
Append user context to the set of user contexts.
Let user context info be a map matching the browser.UserContextInfo production with the userContext field set to user context’s user context id.
Return success with data user context info.
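
The steps above can be sketched as follows. This is an illustrative model only, not part of the protocol: the RemoteEnd class, its capability flags, the exception type, and the use of a UUID as the user context id are all assumptions made for the example, and the override maps are keyed by id rather than being weak maps.

```python
import uuid
from dataclasses import dataclass, field


class UnsupportedOperation(Exception):
    """Stands in for the "unsupported operation" error code."""


@dataclass
class RemoteEnd:
    # Hypothetical capability flags; a real remote end would query the browser.
    supports_insecure_certs: bool = True
    supports_per_context_proxy: bool = True
    # Ids of known user contexts; "default" is assumed to always exist.
    user_contexts: list = field(default_factory=lambda: ["default"])
    # Per-user-context override maps, keyed by user context id for simplicity.
    accept_insecure_certs: dict = field(default_factory=dict)
    unhandled_prompt_behavior: dict = field(default_factory=dict)
    proxy: dict = field(default_factory=dict)


def create_user_context(remote: RemoteEnd, params: dict) -> dict:
    user_context = str(uuid.uuid4())  # assumed id format
    if "acceptInsecureCerts" in params:
        accept = params["acceptInsecureCerts"]
        if accept and not remote.supports_insecure_certs:
            raise UnsupportedOperation("acceptInsecureCerts")
        remote.accept_insecure_certs[user_context] = accept
    if "unhandledPromptBehavior" in params:
        remote.unhandled_prompt_behavior[user_context] = params["unhandledPromptBehavior"]
    if "proxy" in params:
        if not remote.supports_per_context_proxy:
            raise UnsupportedOperation("proxy")
        remote.proxy[user_context] = params["proxy"]
    # Only a fully configured user context is added to the set.
    remote.user_contexts.append(user_context)
    return {"userContext": user_context}
```

Note that the user context is appended to the set only after every override has been accepted, so a failed command leaves the set of user contexts unchanged.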

7.2.4.3. The browser.getClientWindows Command

The browser.getClientWindows command returns a list of client windows.

Command Type

browser.GetClientWindows = (
  method: "browser.getClientWindows",
  params: EmptyParams,
)

Return Type

browser.GetClientWindowsResult = {
  clientWindows: [ * browser.ClientWindowInfo]
}

The remote end steps are:

Let client window ids be an empty set.
Let client windows be an empty list.
For each top-level traversable traversable:
1. Let client window be traversable’s associated client window.
2. Let client window id be the client window id for client window.
3. If client window ids contains client window id, continue.
4. Append client window id to client window ids.
5. Let client window info be get the client window info with client window.
6. Append client window info to client windows.
Let result be a map matching the browser.GetClientWindowsResult production with the clientWindows field set to client windows.
Return success with data result.
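
As a rough sketch of the de-duplication in these steps, several top-level traversables (for example tabs) can share one client window, and each window is reported once. The dictionary shapes below are hypothetical stand-ins for traversables and client window info, not the full browser.ClientWindowInfo production.

```python
def get_client_windows(top_level_traversables: list) -> dict:
    """Collect one info entry per distinct client window, in first-seen order."""
    client_window_ids = set()
    client_windows = []
    for traversable in top_level_traversables:
        window_info = traversable["clientWindow"]
        window_id = window_info["clientWindow"]
        if window_id in client_window_ids:
            continue  # this window was already reported by an earlier traversable
        client_window_ids.add(window_id)
        client_windows.append(window_info)
    return {"clientWindows": client_windows}
```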

7.2.4.4. The browser.getUserContexts Command

The browser.getUserContexts command returns a list of user contexts.

Command Type

browser.GetUserContexts = (
  method: "browser.getUserContexts",
  params: EmptyParams,
)

Return Type

browser.GetUserContextsResult = {
  userContexts: [ + browser.UserContextInfo]
}

The remote end steps are:

Let user contexts be an empty list.
For each user context in the set of user contexts:
1. Let user context info be a map matching the browser.UserContextInfo production with the userContext field set to user context’s user context id.
2. Append user context info to user contexts.
Let result be a map matching the browser.GetUserContextsResult production with the userContexts field set to user contexts.
Return success with data result.

7.2.4.5. The browser.removeUserContext Command

The browser.removeUserContext command closes a user context and all navigables in it without running beforeunload handlers.

Command Type

browser.RemoveUserContext = (
  method: "browser.removeUserContext",
  params: browser.RemoveUserContextParameters
)

browser.RemoveUserContextParameters = {
  userContext: browser.UserContext
}

Return Type

browser.RemoveUserContextResult = EmptyResult

The remote end steps with command parameters are:

Let user context id be command parameters["userContext"].
If user context id is "default", return error with error code invalid argument.
Let user context be get user context with user context id.
If user context is null, return error with error code no such user context.
For each top-level traversable navigable:
1. If navigable’s associated user context is user context:
  1. Close navigable without prompting to unload.
Remove user context from the set of user contexts.
Return success with data null.

7.2.4.6. The browser.setClientWindowState Command

The browser.setClientWindowState command sets the state of a client window and, for the "normal" state, optionally its dimensions and position.

Command Type

browser.SetClientWindowState = (
  method: "browser.setClientWindowState",
  params: browser.SetClientWindowStateParameters
)

browser.SetClientWindowStateParameters = {
  clientWindow: browser.ClientWindow,
  (browser.ClientWindowNamedState // browser.ClientWindowRectState)
}

browser.ClientWindowNamedState = (
  state: "fullscreen" / "maximized" / "minimized"
)

browser.ClientWindowRectState = (
  state: "normal",
  ? width: js-uint,
  ? height: js-uint,
  ? x: js-int,
  ? y: js-int,
)

Return Type

browser.SetClientWindowStateResult = browser.ClientWindowInfo

The remote end steps with session and command parameters are:

If the implementation does not support setting the client window state at all, then return error with error code unsupported operation.
If there is a client window with client window id command parameters["clientWindow"], let client window be that client window. Otherwise return error with error code no such client window.
Try to set the client window state with client window and command parameters["state"].
If command parameters["state"] is "normal":
1. If command parameters contains "x" and the implementation supports positioning client windows, set the x-coordinate of client window to a value that is as close as possible to command parameters["x"].
2. If command parameters contains "y" and the implementation supports positioning client windows, set the y-coordinate of client window to a value that is as close as possible to command parameters["y"].
3. If command parameters contains "width" and the implementation supports resizing client windows, set the width of client window to a value that is as close as possible to command parameters["width"].
4. If command parameters contains "height" and the implementation supports resizing client windows, set the height of client window to a value that is as close as possible to command parameters["height"].
Let client window info be get the client window info with client window.
Return success with data client window info.

Note: For simplicity this models all client window operations as synchronous. Therefore the returned client window dimensions are expected to be those after the window has reached its new state.
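
The parameter type is a choice between two groups: a named state with no geometry, or the "normal" state with optional geometry. A local end might check this before sending the command. The sketch below is one hypothetical helper for doing so; the function name and the decision to raise ValueError are assumptions, and the integer checks only approximate js-uint and js-int.

```python
NAMED_STATES = {"fullscreen", "maximized", "minimized"}
RECT_KEYS = {"width", "height", "x", "y"}


def build_set_client_window_state_params(client_window: str, state: str, **rect) -> dict:
    """Build params matching either the named-state or the rect-state group."""
    if state in NAMED_STATES:
        if rect:
            raise ValueError("geometry is only valid with state 'normal'")
        return {"clientWindow": client_window, "state": state}
    if state != "normal":
        raise ValueError(f"unknown state: {state!r}")
    unknown = set(rect) - RECT_KEYS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    for key, value in rect.items():
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{key} must be an integer")
        if key in ("width", "height") and value < 0:
            raise ValueError(f"{key} must be non-negative")  # approximates js-uint
    return {"clientWindow": client_window, "state": "normal", **rect}
```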

7.2.4.7. The browser.setDownloadBehavior Command

A download behavior struct is a struct with:

item named allowed which is a boolean;
item named destinationFolder which is a string or null.

A remote end has a download behavior which is a struct with an item named default download behavior, which is a download behavior struct or null, and an item named user context download behavior, which is a weak map between user contexts and download behavior struct.

Command Type

browser.SetDownloadBehavior = (
  method: "browser.setDownloadBehavior",
  params: browser.SetDownloadBehaviorParameters
)

browser.SetDownloadBehaviorParameters = {
  downloadBehavior: browser.DownloadBehavior / null,
  ? userContexts: [+browser.UserContext]
}

browser.DownloadBehavior = {
  (
    browser.DownloadBehaviorAllowed //
    browser.DownloadBehaviorDenied
  )
}

browser.DownloadBehaviorAllowed = (
  type: "allowed",
  destinationFolder: text
)

browser.DownloadBehaviorDenied = (
  type: "denied"
)

Return Type

browser.SetDownloadBehaviorResult = EmptyResult

To get download behavior given navigable:

Let user context be navigable’s associated user context.
If download behavior’s user context download behavior contains user context, return download behavior’s user context download behavior[user context].
Return download behavior’s default download behavior.

The remote end steps with session and command parameters are:

If command parameters["downloadBehavior"] is null, let download behavior be null.
Otherwise:
1. If command parameters["downloadBehavior"]["type"] is "allowed", let allowed be true, otherwise let allowed be false.
2. If command parameters["downloadBehavior"] contains "destinationFolder", let destinationFolder be command parameters["downloadBehavior"]["destinationFolder"], otherwise let destinationFolder be null.
3. Let download behavior be a download behavior struct with allowed set to allowed and destinationFolder set to destinationFolder.
If the implementation does not support the requested download behavior, then return error with error code unsupported operation.
If the userContexts field of command parameters is present:
1. Let user contexts be the result of trying to get valid user contexts with command parameters["userContexts"].
2. For each user context of user contexts:
  1. If download behavior is null, remove user context from download behavior’s user context download behavior.
  2. Otherwise, set download behavior’s user context download behavior[user context] to download behavior.
Otherwise, set download behavior’s default download behavior to download behavior.
Return success with data null.
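
The interaction between the default download behavior and the per-user-context overrides can be sketched as below. The class and function names are illustrative assumptions, user contexts are represented by their ids, and validation of the user context ids (get valid user contexts) is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DownloadBehaviorStruct:
    allowed: bool
    destination_folder: Optional[str]


@dataclass
class DownloadBehaviorState:
    default: Optional[DownloadBehaviorStruct] = None
    per_user_context: dict = field(default_factory=dict)


def set_download_behavior(state: DownloadBehaviorState, params: dict) -> None:
    raw = params["downloadBehavior"]
    if raw is None:
        behavior = None
    else:
        behavior = DownloadBehaviorStruct(
            allowed=raw["type"] == "allowed",
            destination_folder=raw.get("destinationFolder"),
        )
    if "userContexts" in params:
        for user_context in params["userContexts"]:
            if behavior is None:
                # A null behavior removes the override for that user context.
                state.per_user_context.pop(user_context, None)
            else:
                state.per_user_context[user_context] = behavior
    else:
        state.default = behavior


def get_download_behavior(state: DownloadBehaviorState, user_context: str):
    # A per-user-context override wins over the default.
    if user_context in state.per_user_context:
        return state.per_user_context[user_context]
    return state.default
```

In this model, sending a null downloadBehavior with userContexts restores the default for those user contexts, while sending it without userContexts clears the default itself.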

7.3. The browsingContext Module

The browsingContext module contains commands and events relating to navigables.

Note: For historic reasons this module is called browsingContext rather than navigable, and the protocol uses the term context to refer to navigables, particularly as a field in command and response parameters.

The progress of navigation is communicated using an immutable struct WebDriver BiDi navigation status, which has the following items:

id: The navigation id for the navigation, or null when the navigation is canceled before making progress.
status: A status code that is either "canceled", "pending", or "complete".
url: The URL which is being loaded in the navigation.
suggestedFilename: If the navigation is a download, suggested filename, otherwise null.
downloadedFilepath: If the navigation is a download which is finished and the downloaded file is available, absolute filepath of the downloaded file, otherwise null.
downloadResponse: If the navigation is a download, response, otherwise null.

7.3.1. Definition

remote end definition

BrowsingContextCommand = (
  browsingContext.Activate //
  browsingContext.CaptureScreenshot //
  browsingContext.Close //
  browsingContext.Create //
  browsingContext.GetTree //
  browsingContext.HandleUserPrompt //
  browsingContext.LocateNodes //
  browsingContext.Navigate //
  browsingContext.Print //
  browsingContext.Reload //
  browsingContext.SetBypassCSP //
  browsingContext.SetViewport //
  browsingContext.StartScreencast //
  browsingContext.StopScreencast //
  browsingContext.TraverseHistory
)

local end definition

BrowsingContextResult = (
  browsingContext.ActivateResult /
  browsingContext.CaptureScreenshotResult /
  browsingContext.CloseResult /
  browsingContext.CreateResult /
  browsingContext.GetTreeResult /
  browsingContext.HandleUserPromptResult /
  browsingContext.LocateNodesResult /
  browsingContext.NavigateResult /
  browsingContext.PrintResult /
  browsingContext.ReloadResult /
  browsingContext.SetBypassCSPResult /
  browsingContext.SetViewportResult /
  browsingContext.StartScreencastResult /
  browsingContext.StopScreencastResult /
  browsingContext.TraverseHistoryResult
)

BrowsingContextEvent = (
  browsingContext.ContextCreated //
  browsingContext.ContextDestroyed //
  browsingContext.DomContentLoaded //
  browsingContext.DownloadEnd //
  browsingContext.DownloadWillBegin //
  browsingContext.FragmentNavigated //
  browsingContext.HistoryUpdated //
  browsingContext.Load //
  browsingContext.NavigationAborted //
  browsingContext.NavigationCommitted //
  browsingContext.NavigationFailed //
  browsingContext.NavigationStarted //
  browsingContext.UserPromptClosed //
  browsingContext.UserPromptOpened
)

A remote end has a device pixel ratio overrides which is a weak map between navigables and device pixel ratio overrides. It is initially empty.

Note: this map is not cleared when the final session ends i.e. device pixel ratio overrides outlive any WebDriver session.

A viewport dimensions is a struct with:

Item named height which is an integer;
Item named width which is an integer.

A viewport configuration is a struct with:

Item named viewport which is a viewport dimensions or null;
Item named devicePixelRatio which is a float or null.

An unhandled prompt behavior struct is a struct with:

Item named alert which is a string or null;
Item named beforeUnload which is a string or null;
Item named confirm which is a string or null;
Item named default which is a string or null;
Item named file which is a string or null;
Item named prompt which is a string or null.

A remote end has a viewport overrides map which is a weak map between user contexts and viewport configuration.

A remote end has a locale overrides map which is a weak map between navigables or user contexts and string.

A screen settings is a struct with an item named height which is an integer, an item named width which is an integer, an item named x which is an integer, an item named y which is an integer.

A remote end has a screen settings overrides which is a struct with an item named user context screen settings, which is a weak map between user contexts and screen settings, and an item named navigable screen settings, which is a weak map between navigables and screen settings.

A remote end has a timezone overrides map which is a weak map between navigables or user contexts and string.

A remote end has an unhandled prompt behavior overrides map which is a weak map between user contexts and unhandled prompt behavior struct.

A remote end has a scripting enabled overrides map which is a weak map between navigables or user contexts and boolean.

A remote end has a download id map which is a weak map between response and download ids. It is initially empty.

A screencast stream is an abstract stream of the viewport of a top-level traversable, consisting of a video track containing the rendered visual output of the top-level traversable’s document’s viewport, and optionally an audio track containing the audio output of the top-level traversable’s document.

A BiDi session has a screencast recordings map which is a map in which the keys are UUIDs, and the values are screencast recording, which is a struct with an item named stream, which is a screencast stream, an item named path, which is a string, an item named state, which is one of "recording", "stopping", "stopped", an item named writeError, which is a string or null.

To start a screencast recording given a screencast recording recording and mime type:

Run the following steps in parallel:
1. Begin encoding recording’s stream using mime type, producing successive chunks of encoded data as byte sequences. Produce a new chunk at an implementation-defined interval while recording’s state is "recording".
2. For each chunk bytes produced for recording, run the following steps:
  1. Append bytes to the file at recording’s path. If this fails:
    1. Set recording’s writeError to an implementation-defined string describing the write failure.
    2. Stop a screencast recording given recording.

To stop a screencast recording given a screencast recording recording:

If recording’s state is not "recording" then return.
Set recording’s state to "stopping".
Stop producing new chunks for recording, flush any remaining encoded data as a final chunk (processed as in start a screencast recording), stop capturing from recording’s stream and release its video track and, if present, its audio track, and then set recording’s state to "stopped".
Wait until recording’s state is "stopped".

7.3.2. Types

7.3.2.1. The browsingContext.BrowsingContext Type

remote end definition and local end definition

browsingContext.BrowsingContext = text;

Each navigable has an associated navigable id, which is a string uniquely identifying that navigable. This is implicitly set when the navigable is created. For navigables with an associated WebDriver window handle the navigable id must be the same as the window handle.

Each navigable also has an associated storage partition, which is the storage partition it uses to persist data.

Each navigable also has an associated original opener, which is a navigable that caused the navigable to open or null, initially set to null.

To get a navigable given navigable id:

If navigable id is null, return success with data null.
If there is no navigable with navigable id navigable id, return error with error code no such frame.
Let navigable be the navigable with id navigable id.
Return success with data navigable.

7.3.2.2. The browsingContext.Info Type

local end definition

browsingContext.InfoList = [*browsingContext.Info]

browsingContext.Info = {
  children: browsingContext.InfoList / null,
  clientWindow: browser.ClientWindow,
  context: browsingContext.BrowsingContext,
  originalOpener: browsingContext.BrowsingContext / null,
  url: text,
  userContext: browser.UserContext,
  ? parent: browsingContext.BrowsingContext / null,
}

The browsingContext.Info type represents the properties of a navigable.

To get the child navigables given navigable:

TODO: make this return a list in document order

Let child navigables be a set containing all navigables that are a child navigable of navigable.
Return child navigables.

To get the navigable info given navigable, max depth and include parent id:

Let navigable id be the navigable id for navigable.
Let parent navigable be navigable’s parent.
If parent navigable is not null, let parent id be the navigable id of parent navigable. Otherwise let parent id be null.
Let document be navigable’s active document.
Let url be the result of running the URL serializer, given document’s URL.

Note: This includes the fragment component of the URL.
Let child infos be null.
If max depth is null, or max depth is greater than 0:
1. Let child navigables be get the child navigables given navigable.
2. Let child depth be max depth - 1 if max depth is not null, or null otherwise.
3. Set child infos to an empty list.
4. For each child navigable of child navigables:
  1. Let info be the result of get the navigable info given child navigable, child depth, and false.
  2. Append info to child infos.
Let user context be navigable’s associated user context.
Let opener id be the navigable id for navigable’s original opener, if navigable’s original opener is not null, and null otherwise.
Let top-level traversable be navigable’s top-level traversable.
Let client window id be the client window id for top-level traversable’s associated client window.
Let navigable info be a map matching the browsingContext.Info production with the context field set to navigable id, the parent field set to parent id if include parent id is true, or unset otherwise, the url field set to url, the userContext field set to user context’s user context id, originalOpener field set to opener id, the children field set to child infos, and the clientWindow field set to client window id.
Return navigable info.
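
The depth handling in get the navigable info can be illustrated with a small recursive sketch. The Navigable class here is a hypothetical in-memory stand-in; in particular it stores the client window id directly on each navigable instead of looking it up through the top-level traversable.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Navigable:
    id: str
    url: str
    user_context: str = "default"
    client_window: str = "window-1"  # simplification: normally found via the top-level traversable
    parent: Optional["Navigable"] = None
    original_opener: Optional["Navigable"] = None
    children: list = field(default_factory=list)


def get_navigable_info(navigable: Navigable, max_depth: Optional[int], include_parent_id: bool) -> dict:
    child_infos = None
    if max_depth is None or max_depth > 0:
        # A null max depth means "no limit", so it is passed down unchanged.
        child_depth = None if max_depth is None else max_depth - 1
        child_infos = [get_navigable_info(child, child_depth, False) for child in navigable.children]
    opener = navigable.original_opener
    info = {
        "context": navigable.id,
        "url": navigable.url,
        "userContext": navigable.user_context,
        "originalOpener": opener.id if opener is not None else None,
        "children": child_infos,
        "clientWindow": navigable.client_window,
    }
    if include_parent_id:
        # The parent field is present only on the entries where it was requested.
        info["parent"] = navigable.parent.id if navigable.parent is not None else None
    return info
```

With a max depth of 0 the children field is null rather than an empty list, which lets a local end distinguish "not expanded" from "no children".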

To await a navigation given navigable, request, wait condition, and optionally history handling (default: "default") and ignore cache (default: false):

Let navigation id be the string representation of a UUID based on truly random, or pseudo-random numbers.
Navigate navigable with resource request, and using navigable’s active document as the source Document, with navigation id navigation id, and history handling behavior history handling. If ignore cache is true, the navigation must not load resources from the HTTP cache.

Properly specify how the ignore cache flag works. This needs to consider whether only the first load of a resource bypasses the cache (i.e. whether this is like initially clearing the cache and proceeding like normal), or whether resources not directly loaded by the HTML parser (e.g. loads initiated by scripts or stylesheets) also bypass the cache.
Let (event received, navigation status) be await given «"navigation started", "navigation failed", "fragment navigated"», and navigation id.
Assert: navigation status’s id is navigation id.
If navigation status’s status is "complete":
1. Let body be a map matching the browsingContext.NavigateResult production, with the navigation field set to navigation id, and the url field set to the result of the URL serializer given navigation status’s url.
2. Return success with data body.
Note: this is the case if the navigation only caused the fragment to change.
If navigation status’s status is "canceled", return error with error code unknown error.

TODO: is this the right way to handle errors here?
Assert: navigation status’s status is "pending" and navigation id is not null.
If wait condition is "committed", let event name be "committed".
Otherwise, if wait condition is "interactive", let event name be "domContentLoaded".
Otherwise, let event name be "load".
Let (event received, status) be await given «event name, "download started", "navigation aborted", "navigation failed"» and navigation id.
If event received is "navigation failed" return error with error code unknown error.

Are we surfacing enough information about what failed and why with an error here? What error code do we want? Is there going to be a problem where local ends parse the implementation-defined strings to figure out what actually went wrong?
Let body be a map matching the browsingContext.NavigateResult production, with the navigation field set to status’s id, and the url field set to the result of the URL serializer given status’s url.
Return success with data body.

7.3.2.3. The browsingContext.Locator Type

remote end definition and local end definition

browsingContext.Locator = (
   browsingContext.AccessibilityLocator /
   browsingContext.CssLocator /
   browsingContext.ContextLocator /
   browsingContext.InnerTextLocator /
   browsingContext.XPathLocator
)

browsingContext.AccessibilityLocator = {
   type: "accessibility",
   value: {
    ? name: text,
    ? role: text,
   }
}

browsingContext.CssLocator = {
   type: "css",
   value: text
}

browsingContext.ContextLocator = {
  type: "context",
  value: {
    context: browsingContext.BrowsingContext,
  }
}

browsingContext.InnerTextLocator = {
   type: "innerText",
   value: text,
   ? ignoreCase: bool,
   ? matchType: "full" / "partial",
   ? maxDepth: js-uint,
}

browsingContext.XPathLocator = {
   type: "xpath",
   value: text
}

The browsingContext.Locator type provides details on the strategy for locating a node in a document.

7.3.2.4. The browsingContext.Navigation Type

remote end definition and local end definition

browsingContext.Navigation = text;

The browsingContext.Navigation type is a unique string identifying an ongoing navigation.

TODO: Link to the definition in the HTML spec.

7.3.2.5. The browsingContext.Download Type

remote end definition and local end definition

browsingContext.Download = text;

The browsingContext.Download type is a unique string identifying a download.

7.3.2.6. The browsingContext.NavigationInfo Type

local end definition

browsingContext.BaseNavigationInfo = (
  context: browsingContext.BrowsingContext,
  navigation: browsingContext.Navigation / null,
  timestamp: js-uint,
  url: text,
  ? userContext: browser.UserContext,
)

browsingContext.NavigationInfo = {
  browsingContext.BaseNavigationInfo
}

The browsingContext.NavigationInfo type provides details of an ongoing navigation.

To get the navigation info, given navigable navigable and navigation status:

Let navigable id be the navigable id for navigable.
Let navigation id be navigation status’s id.
Let timestamp be a time value representing the current date and time in UTC.
Let url be navigation status’s url.
Let user context id be the user context id of navigable’s associated user context.
Return a map matching the browsingContext.NavigationInfo production, with the context field set to navigable id, the navigation field set to navigation id, the timestamp field set to timestamp, the url field set to the result of the URL serializer given url, and the userContext field set to user context id.

7.3.2.7. The browsingContext.ReadinessState Type

browsingContext.ReadinessState = "none" / "interactive" / "complete"

The browsingContext.ReadinessState type represents the stage of document loading at which a navigation command will return.

7.3.2.8. The browsingContext.UserPromptType Type

remote end definition and local end definition

browsingContext.UserPromptType = "alert" / "beforeunload" / "confirm" / "prompt";

The browsingContext.UserPromptType type represents the possible user prompt types.

7.3.3. Commands

7.3.3.1. The browsingContext.activate Command

The browsingContext.activate command activates and focuses the given top-level traversable.

Command Type

browsingContext.Activate = (
  method: "browsingContext.activate",
  params: browsingContext.ActivateParameters
)

browsingContext.ActivateParameters = {
  context: browsingContext.BrowsingContext
}

Return Type

browsingContext.ActivateResult = EmptyResult

The remote end steps with command parameters are:

Let navigable id be the value of the command parameters["context"] field.
Let navigable be the result of trying to get a navigable with navigable id.
If navigable is not a top-level traversable, return error with error code invalid argument.
Return activate a navigable with navigable.

To activate a navigable given navigable:

Run implementation-specific steps so that navigable’s system visibility state becomes visible. If this is not possible return error with error code unsupported operation.

Note: This can have the side effect of making currently visible navigables hidden.

Note: This can change the underlying OS state by causing the window to become unminimized or by other side effects related to changing the system visibility state.
Run implementation-specific steps to set the system focus on the navigable if it is not focused.

Note: This does not change the focused area of the document except as mandated by other specifications.
Return success with data null.

7.3.3.2. The browsingContext.captureScreenshot Command

The browsingContext.captureScreenshot command captures an image of the given navigable, and returns it as a Base64-encoded string.

Command Type

browsingContext.CaptureScreenshot = (
  method: "browsingContext.captureScreenshot",
  params: browsingContext.CaptureScreenshotParameters
)

browsingContext.CaptureScreenshotParameters = {
  context: browsingContext.BrowsingContext,
  ? origin: ("viewport" / "document") .default "viewport",
  ? format: browsingContext.ImageFormat,
  ? clip: browsingContext.ClipRectangle,
}

browsingContext.ImageFormat = {
   type: text,
   ? quality: 0.0..1.0,
}

browsingContext.ClipRectangle = (
  browsingContext.BoxClipRectangle /
  browsingContext.ElementClipRectangle
)

browsingContext.ElementClipRectangle = {
  type: "element",
  element: script.SharedReference
}

browsingContext.BoxClipRectangle = {
   type: "box",
   x: float,
   y: float,
   width: float,
   height: float
}

Return Type

browsingContext.CaptureScreenshotResult = {
  data: text
}

To normalize rect given rect:

Note: This ensures that the resulting rect has positive width dimension and height dimension.

Let x be rect’s x coordinate.
Let y be rect’s y coordinate.
Let width be rect’s width dimension.
Let height be rect’s height dimension.
If width is less than 0, set x to x + width and then set width to -width.
If height is less than 0, set y to y + height and then set height to -height.
Return a new DOMRectReadOnly with x coordinate x, y coordinate y, width dimension width and height dimension height.
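
A minimal sketch of normalize rect, using a hypothetical Rect dataclass in place of DOMRectReadOnly:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def normalize_rect(rect: Rect) -> Rect:
    """Return an equivalent rect whose width and height are non-negative."""
    x, y, width, height = rect.x, rect.y, rect.width, rect.height
    if width < 0:
        # Move the origin to the opposite edge, then flip the sign.
        x, width = x + width, -width
    if height < 0:
        y, height = y + height, -height
    return Rect(x, y, width, height)
```

For example, a rect at (10, 20) with width -4 and height -5 covers the same area as a rect at (6, 15) with width 4 and height 5.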

To rectangle intersection given rect1 and rect2:

Let rect1 be normalize rect with rect1.
Let rect2 be normalize rect with rect2.
Let x1_0 be rect1’s x coordinate.
Let x2_0 be rect2’s x coordinate.
Let x1_1 be rect1’s x coordinate plus rect1’s width dimension.
Let x2_1 be rect2’s x coordinate plus rect2’s width dimension.
Let x_0 be the maximum element of «x1_0, x2_0».
Let x_1 be the minimum element of «x1_1, x2_1».
Let y1_0 be rect1’s y coordinate.
Let y2_0 be rect2’s y coordinate.
Let y1_1 be rect1’s y coordinate plus rect1’s height dimension.
Let y2_1 be rect2’s y coordinate plus rect2’s height dimension.
Let y_0 be the maximum element of «y1_0, y2_0».
Let y_1 be the minimum element of «y1_1, y2_1».
If x_1 is less than x_0, let width be 0. Otherwise let width be x_1 - x_0.
If y_1 is less than y_0, let height be 0. Otherwise let height be y_1 - y_0.
Return a new DOMRectReadOnly with x coordinate x_0, y coordinate y_0, width dimension width and height dimension height.
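
The intersection steps can be sketched as follows. Rect is a hypothetical stand-in for DOMRectReadOnly, and the normalization step is inlined so the example is self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def _normalize(rect: Rect) -> Rect:
    x, y, width, height = rect.x, rect.y, rect.width, rect.height
    if width < 0:
        x, width = x + width, -width
    if height < 0:
        y, height = y + height, -height
    return Rect(x, y, width, height)


def rectangle_intersection(rect1: Rect, rect2: Rect) -> Rect:
    rect1, rect2 = _normalize(rect1), _normalize(rect2)
    # The overlap starts at the larger of the two near edges
    # and ends at the smaller of the two far edges.
    x_0 = max(rect1.x, rect2.x)
    x_1 = min(rect1.x + rect1.width, rect2.x + rect2.width)
    y_0 = max(rect1.y, rect2.y)
    y_1 = min(rect1.y + rect1.height, rect2.y + rect2.height)
    width = 0 if x_1 < x_0 else x_1 - x_0
    height = 0 if y_1 < y_0 else y_1 - y_0
    return Rect(x_0, y_0, width, height)
```

Disjoint rects produce a result with zero width or zero height rather than an error; callers are expected to check for that case.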

To render document to a canvas given document and rect:

Let ratio be determine the device pixel ratio given document’s default view.
Let paint width be rect’s width dimension multiplied by ratio, rounded to the nearest integer, so it matches the width of rect in device pixels.
Let paint height be rect’s height dimension multiplied by ratio, rounded to the nearest integer, so it matches the height of rect in device pixels.
Let canvas be a new HTMLCanvasElement with width paint width and height paint height.
Let canvas context be the result of running the 2D context creation algorithm with canvas and null.
Set canvas’s context mode to 2D.
Complete implementation specific steps equivalent to drawing the region of the framebuffer representing the region of document covered by rect to canvas context, such that each pixel in the framebuffer corresponds to a pixel in canvas context with (rect’s x coordinate, rect’s y coordinate) in viewport coordinates corresponding to (0,0) in canvas context and (rect’s x coordinate + rect’s width dimension, rect’s y coordinate + rect’s height dimension) corresponding to (paint width, paint height).
Return canvas.

To encode a canvas as Base64 given canvas and format:

If format is not null, let type be the type field of format, and let quality be the quality field of format.
Otherwise, let type be "image/png" and let quality be undefined.
Let file be a serialization of the bitmap as a file for canvas with type and quality.
Let encoded string be the forgiving-base64 encode of file.
Return success with data encoded string.
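
The format defaulting and the final encoding step can be sketched as below. Serializing a canvas bitmap is left out: the sketch assumes the serialized file is already available as bytes, and the function names are illustrative only.

```python
import base64
from typing import Optional, Tuple


def resolve_format(image_format: Optional[dict]) -> Tuple[str, Optional[float]]:
    """Return (type, quality), defaulting to PNG with no quality setting."""
    if image_format is not None:
        return image_format["type"], image_format.get("quality")
    return "image/png", None


def encode_file_as_base64(file_bytes: bytes) -> str:
    # Standard Base64 with padding, returned as text for the data field.
    return base64.b64encode(file_bytes).decode("ascii")


def capture_screenshot_result(file_bytes: bytes) -> dict:
    return {"data": encode_file_as_base64(file_bytes)}
```

A local end can recover the image file by decoding the data field, for example with base64.b64decode in Python.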

To get the origin rectangle given document and origin:

If origin is "viewport":
1. Let viewport be document’s visual viewport.
2. Let viewport rect be a DOMRectReadOnly with x coordinate viewport page left, y coordinate viewport page top, width dimension viewport width, and height dimension viewport height.
3. Return success with data viewport rect.
Assert: origin is "document".
Let document element be the document element for document.
Let document rect be a DOMRectReadOnly with x coordinate 0, y coordinate 0, width dimension document element scroll width, and height dimension document element scroll height.
Return success with data document rect.

The remote end steps with session and command parameters are:

Let navigable id be the value of the context field of command parameters if present, or null otherwise.
Let navigable be the result of trying to get a navigable with navigable id.
If the implementation is unable to capture a screenshot of navigable for any reason then return error with error code unsupported operation.
Let document be navigable’s active document.
Immediately after the next invocation of the run the animation frame callbacks algorithm for document:

This ought to be integrated into the update rendering algorithm in some more explicit way.
Let origin be the value of the origin field of command parameters if present, or "viewport" otherwise.
Let origin rect be the result of trying to get the origin rectangle given document and origin.
Let clip rect be origin rect.
If command parameters contains "clip":
1. Let clip be command parameters["clip"].
2. Run the steps under the first matching condition:
  clip matches the browsingContext.ElementClipRectangle production:
  1. Let environment settings be the environment settings object whose relevant global object’s associated Document is document.
  2. Let realm be environment settings’ realm execution context’s Realm component.
  3. Let element be the result of trying to deserialize remote reference with clip["element"], realm, and session.
  4. If element doesn’t implement Element, return error with error code no such element.
  5. If element’s node document is not document, return error with error code no such element.
  6. Let viewport rect be the result of get the origin rectangle given "viewport" and document.
  7. Let element rect be the result of get the bounding box for element.
  8. Let clip rect be a DOMRectReadOnly with x coordinate element rect["x"] + viewport rect["x"], y coordinate element rect["y"] + viewport rect["y"], width element rect["width"], and height element rect["height"].
  clip matches the browsingContext.BoxClipRectangle production:
  1. Let clip x be clip["x"] plus origin rect’s x coordinate.
  2. Let clip y be clip["y"] plus origin rect’s y coordinate.
  3. Let clip rect be a DOMRectReadOnly with x coordinate clip x, y coordinate clip y, width clip["width"], and height clip["height"].
Note: All coordinates are now measured from the origin of the document.
Let rect be the rectangle intersection of origin rect and clip rect.
If rect’s width dimension is 0 or rect’s height dimension is 0, return error with error code unable to capture screen.
Let canvas be render document to a canvas with document and rect.
Let format be the format field of command parameters.
Let encoding result be the result of trying to encode a canvas as Base64 with canvas and format.
Let body be a map matching the browsingContext.CaptureScreenshotResult production, with the data field set to encoding result.
Return success with data body.
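As a rough illustration of the clipping steps above, the Python sketch below handles the browsingContext.BoxClipRectangle case and the final intersection. The names Rect, box_clip_to_document, intersect and capture_rect are hypothetical, and the sketch assumes all widths and heights are non-negative; it does not cover element clips or rendering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Hypothetical stand-in for DOMRectReadOnly."""
    x: float
    y: float
    width: float
    height: float


def box_clip_to_document(clip: dict, origin_rect: Rect) -> Rect:
    """Offset a box clip by the origin rectangle so it is document-relative."""
    return Rect(clip["x"] + origin_rect.x, clip["y"] + origin_rect.y,
                clip["width"], clip["height"])


def intersect(a: Rect, b: Rect) -> Rect:
    """Rectangle intersection; no overlap yields a zero width or height."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.width, b.x + b.width)
    bottom = min(a.y + a.height, b.y + b.height)
    return Rect(left, top, max(0, right - left), max(0, bottom - top))


def capture_rect(origin_rect: Rect, clip: Optional[dict]) -> Rect:
    """Compute the rectangle to render, or fail if it is empty."""
    clip_rect = origin_rect if clip is None else box_clip_to_document(clip, origin_rect)
    rect = intersect(origin_rect, clip_rect)
    if rect.width == 0 or rect.height == 0:
        raise ValueError("unable to capture screen")
    return rect
```

A clip that extends past the origin rectangle is trimmed rather than rejected; only a clip with no overlap at all produces the error.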

7.3.3.3. The browsingContext.close Command

The browsingContext.close command closes a top-level traversable.

Command Type

browsingContext.Close = (
  method: "browsingContext.close",
  params: browsingContext.CloseParameters
)

browsingContext.CloseParameters = {
  context: browsingContext.BrowsingContext,
  ? promptUnload: bool .default false
}

Return Type

browsingContext.CloseResult = EmptyResult

The remote end steps with command parameters are:

Let navigable id be the value of the context field of command parameters.
Let prompt unload be the value of the promptUnload field of command parameters.
Let navigable be the result of trying to get a navigable with navigable id.
Assert: navigable is not null.
If navigable is not a top-level traversable, return error with error code invalid argument.
If prompt unload is true:
1. Close navigable.
Otherwise:
1. Close navigable without prompting to unload.
Return success with data null.

There is an open discussion about the behavior when closing the last top-level traversable. We could expect to close the browser, close the session or leave this up to the implementation. [w3c/webdriver-bidi Issue #170]

7.3.3.4. The browsingContext.create Command

The browsingContext.create command creates a new navigable, either in a new tab or in a new window, and returns its navigable id.

Command Type

browsingContext.Create = (
  method: "browsingContext.create",
  params: browsingContext.CreateParameters
)

browsingContext.CreateType = "tab" / "window"

browsingContext.CreateParameters = {
  type: browsingContext.CreateType,
  ? referenceContext: browsingContext.BrowsingContext,
  ? background: bool .default false,
  ? userContext: browser.UserContext
}

Return Type

browsingContext.CreateResult = {
  context: browsingContext.BrowsingContext,
  ? userContext: browser.UserContext
}

The remote end steps with command parameters are:

Let type be the value of the type field of command parameters.
Let reference navigable id be the value of the referenceContext field of command parameters, if present, or null otherwise.
If reference navigable id is not null, let reference navigable be the result of trying to get a navigable with reference navigable id. Otherwise let reference navigable be null.
If reference navigable is not null and is not a top-level traversable, return error with error code invalid argument.
If the implementation is unable to create a new top-level traversable for any reason then return error with error code unsupported operation.
Let user context be the default user context if reference navigable is null, and reference navigable’s associated user context otherwise.
Let user context id be the value of the userContext field of command parameters if present, or null otherwise.
If user context id is not null, set user context to the result of trying to get user context with user context id.
If user context is null, return error with error code no such user context.
If the implementation is unable to create a new top-level traversable with associated user context user context for any reason, return error with error code unsupported operation.
Let traversable be the result of trying to run the create a new top-level traversable steps with null and the empty string, setting the associated user context for the newly created top-level traversable to user context. Which OS window the new top-level traversable is created in depends on type and reference navigable:
- If type is "tab" and the implementation supports multiple top-level traversables in the same OS window:
  - The new top-level traversable should reuse an existing OS window, if any.
  - If reference navigable is not null, the new top-level traversable should reuse the window containing reference navigable, if any. If the top-level traversables inside an OS window have a definite ordering, the new top-level traversable should be immediately after reference navigable’s top-level traversable in that ordering.
- If type is "window", and the implementation supports multiple top-level traversables in separate OS windows, the created top-level traversable should be in a new OS window.
- Otherwise, the details of how the top-level traversable is presented to the user are implementation defined.
If the value of the command parameters’ background field is false:
1. Let activate result be the result of activate a navigable with traversable.
2. If activate result is an error, return activate result.
Note: Do not invoke the focusing steps for the created navigable if background is true.
Let body be a map matching the browsingContext.CreateResult production, with the context field set to traversable’s navigable id and the userContext field set to the user context id of traversable’s associated user context.
Return success with data body.
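The way the user context is chosen in the steps above can be sketched in Python as follows. This is an illustration only: BiDiError and resolve_user_context are hypothetical names, user contexts are modelled as plain string ids, and the id "default" for the default user context is an assumption of this sketch.

```python
from typing import Optional

DEFAULT_USER_CONTEXT = "default"  # assumed id of the default user context


class BiDiError(Exception):
    """Hypothetical exception carrying a WebDriver BiDi error code."""


def resolve_user_context(known_user_contexts: set[str],
                         reference_user_context: Optional[str],
                         requested_user_context: Optional[str]) -> str:
    """Pick the user context for a newly created top-level traversable."""
    # Start from the default, or from the reference navigable's user context.
    if reference_user_context is None:
        user_context = DEFAULT_USER_CONTEXT
    else:
        user_context = reference_user_context
    # An explicit userContext parameter overrides either starting point.
    if requested_user_context is not None:
        if requested_user_context not in known_user_contexts:
            raise BiDiError("no such user context")
        user_context = requested_user_context
    return user_context
```

The explicit userContext parameter always wins, so a client can open a tab next to a reference navigable while placing it in a different user context.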

7.3.3.5. The browsingContext.getTree Command

The browsingContext.getTree command returns a tree of all descendant navigables including the given parent itself, or all top-level contexts when no parent is provided.

Command Type

browsingContext.GetTree = (
  method: "browsingContext.getTree",
  params: browsingContext.GetTreeParameters
)

browsingContext.GetTreeParameters = {
  ? maxDepth: js-uint,
  ? root: browsingContext.BrowsingContext,
}

Return Type

browsingContext.GetTreeResult = {
  contexts: browsingContext.InfoList
}

The remote end steps with session and command parameters are:

Let root id be the value of the root field of command parameters if present, or null otherwise.
Let max depth be the value of the maxDepth field of command parameters if present, or null otherwise.
Let navigables be an empty list.
If root id is not null, append the result of trying to get a navigable given root id to navigables. Otherwise append all top-level traversables to navigables.
Let navigables infos be an empty list.
For each navigable of navigables:
1. Let info be the result of get the navigable info given navigable, max depth, and true.
2. Append info to navigables infos.
Let body be a map matching the browsingContext.GetTreeResult production, with the contexts field set to navigables infos.
Return success with data body.
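The effect of maxDepth on the returned tree can be illustrated with a short Python sketch. The Navigable class and both functions are hypothetical, the info map is reduced to two fields, and the sketch assumes that get the navigable info reports children as null once the depth budget reaches zero.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Navigable:
    """Hypothetical minimal model of a navigable and its child navigables."""
    id: str
    children: list["Navigable"] = field(default_factory=list)


def navigable_info(navigable: Navigable, max_depth: Optional[int]) -> dict:
    """Serialize a navigable; children is None once the depth budget is spent."""
    if max_depth is not None and max_depth == 0:
        children = None
    else:
        child_depth = None if max_depth is None else max_depth - 1
        children = [navigable_info(c, child_depth) for c in navigable.children]
    return {"context": navigable.id, "children": children}


def get_tree(top_level: list[Navigable], root: Optional[Navigable],
             max_depth: Optional[int]) -> dict:
    """Build the result for either a given root or all top-level traversables."""
    navigables = [root] if root is not None else top_level
    return {"contexts": [navigable_info(n, max_depth) for n in navigables]}
```

Under this assumption, a null children value means "not explored", while an empty list means the navigable has no child navigables.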

7.3.3.6. The browsingContext.handleUserPrompt Command

The browsingContext.handleUserPrompt command allows closing an open prompt.

Command Type

browsingContext.HandleUserPrompt = (
  method: "browsingContext.handleUserPrompt",
  params: browsingContext.HandleUserPromptParameters
)

browsingContext.HandleUserPromptParameters = {
  context: browsingContext.BrowsingContext,
  ? accept: bool,
  ? userText: text,
}

Return Type

browsingContext.HandleUserPromptResult = EmptyResult

The remote end steps with session and command parameters are:

Let navigable id be the value of the context field of command parameters.
Let navigable be the result of trying to get a navigable with navigable id.
Let accept be the value of the accept field of command parameters if present, or true otherwise.
Let userText be the value of the userText field of command parameters if present, or the empty string otherwise.
If navigable is currently showing a simple dialog from a call to alert then acknowledge the prompt.

Otherwise if navigable is currently showing a simple dialog from a call to confirm, then respond positively if accept is true, or respond negatively if accept is false.

Otherwise if navigable is currently showing a simple dialog from a call to prompt, then respond with the string value userText if accept is true, or abort if accept is false.

Otherwise, if navigable is currently showing a prompt as part of the prompt to unload steps, then confirm the navigation if accept is true, otherwise refuse the navigation.

Otherwise return error with error code no such alert.
Return success with data null.
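The dispatch on prompt type described above can be summarized with the following Python sketch. The function name and the string labels for prompt types and actions are hypothetical; a real implementation would act on the dialog rather than return a description.

```python
from typing import Optional


def handle_user_prompt(prompt_type: Optional[str], accept: bool = True,
                       user_text: str = "") -> str:
    """Return a description of the action taken for the currently open prompt."""
    if prompt_type == "alert":
        # alert dialogs can only be acknowledged; accept is ignored.
        return "acknowledge"
    if prompt_type == "confirm":
        return "respond positively" if accept else "respond negatively"
    if prompt_type == "prompt":
        return f"respond with {user_text!r}" if accept else "abort"
    if prompt_type == "beforeunload":
        return "confirm navigation" if accept else "refuse navigation"
    # No dialog is showing.
    raise LookupError("no such alert")
```

Note that accept defaults to true and userText to the empty string, so a command carrying only the context field accepts whichever prompt is open.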

7.3.3.7. The browsingContext.locateNodes Command

The browsingContext.locateNodes command returns a list of all nodes matching the specified locator.

Command Type

browsingContext.LocateNodes = (
  method: "browsingContext.locateNodes",
  params: browsingContext.LocateNodesParameters
)

browsingContext.LocateNodesParameters = {
   context: browsingContext.BrowsingContext,
   locator: browsingContext.Locator,
   ? maxNodeCount: (js-uint .ge 1),
   ? serializationOptions: script.SerializationOptions,
   ? startNodes: [ + script.SharedReference ]
}

Return Type

browsingContext.LocateNodesResult = {
    nodes: [ * script.NodeRemoteValue ]
}

To locate nodes using CSS with given navigable, context nodes, selector, maximum returned node count, and session:

Let returned nodes be an empty list.
Let parse result be the result of parse a selector given selector.
If parse result is failure, return error with error code invalid selector.
For each context node of context nodes:
1. Let elements be the result of match a selector against a tree with parse result and navigable’s active document root using scoping root context node.
2. For each element in elements:
  1. Append element to returned nodes.
  2. If maximum returned node count is not null and size of returned nodes is equal to maximum returned node count, return success with data returned nodes.
Return success with data returned nodes.

To locate the container element given navigable:

Let returned nodes be an empty list.
If navigable’s container is not null, append navigable’s container to returned nodes.
Return returned nodes.

To locate nodes using XPath with given navigable, context nodes, selector, and maximum returned node count:

Note: Owing to the unmaintained state of the XPath specification, this algorithm is phrased as if making calls to the XPath DOM APIs. However this is to be understood as equivalent to spec-internal calls directly accessing the underlying algorithms, without going via the ECMAScript runtime.

Let returned nodes be an empty list.
For each context node of context nodes:
1. Let evaluate result be the result of calling evaluate on navigable’s active document, with arguments selector, context node, null, ORDERED_NODE_SNAPSHOT_TYPE, and null. If this throws a "SyntaxError" DOMException, return error with error code invalid selector; otherwise, if this throws any other exception return error with error code unknown error.
2. Let index be 0.
3. Let length be the result of getting the snapshotLength property from evaluate result.
4. Repeat, while index is less than length:
  1. Let node be the result of calling snapshotItem with evaluate result as this and index as the argument.
  2. Append node to returned nodes.
  3. If maximum returned node count is not null and size of returned nodes is equal to maximum returned node count, return success with data returned nodes.
  4. Set index to index + 1.
Return success with data returned nodes.

To locate nodes using inner text with given context nodes, selector, max depth, match type, ignore case, and maximum returned node count:

If selector is the empty string, return error with error code invalid selector.
Let returned nodes be an empty list.
If ignore case is false, let search text be selector. Otherwise, let search text be the result of toUppercase with selector according to the Unicode Default Case Conversion algorithm.
For each context node in context nodes:
1. If context node implements Document or DocumentFragment:
  
  Note: when traversing the document or document fragment, max depth is intentionally not decreased, so that searching from document and from document.documentElement gives equivalent results.
  1. Let child nodes be an empty list.
  2. For each node child in the children of context node:
    1. Append child to child nodes.
  3. Extend returned nodes with the result of trying to locate nodes using inner text with child nodes, selector, max depth, match type, ignore case, and maximum returned node count.
2. If context node does not implement HTMLElement then continue.
3. Let node inner text be the result of calling the innerText getter steps with context node as the this value.
4. If ignore case is false, let node text be node inner text. Otherwise, let node text be the result of toUppercase with node inner text according to the Unicode Default Case Conversion algorithm.
5. If search text is a code point substring of node text, perform the following steps:
  1. Let child nodes be an empty list and, for each node child in the children of context node:
    1. Append child to child nodes.
  2. If size of child nodes is equal to 0 or max depth is equal to 0, perform the following steps:
    1. If match type is "full" and node text is search text, append context node to returned nodes.
    2. Otherwise, if match type is "partial", append context node to returned nodes.
  3. Otherwise, perform the following steps:
    1. Let child max depth be null if max depth is null, or max depth - 1 otherwise.
    2. Let child node matches be the result of locate nodes using inner text with child nodes, selector, child max depth, match type, ignore case, and maximum returned node count.
    3. If size of child node matches is equal to 0 and match type is "partial", append context node to returned nodes. Otherwise, extend returned nodes with child node matches.
If maximum returned node count is not null, remove all entries in returned nodes with an index greater than or equal to maximum returned node count.
Return success with data returned nodes.
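The recursion above prefers the deepest matching nodes. The Python sketch below illustrates this on a simplified tree. The Node class and locate_by_inner_text are hypothetical; every node is treated as an HTMLElement, the Document and DocumentFragment branch is omitted, innerText is approximated by concatenating text in tree order, and case folding is approximated with str.upper().

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical element model: a name, its own text, and child elements."""
    name: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)

    @property
    def inner_text(self) -> str:
        # Stand-in for the innerText getter: own text, then descendants' text.
        return self.text + "".join(c.inner_text for c in self.children)


def locate_by_inner_text(nodes, selector, max_depth=None, match_type="full",
                         ignore_case=False, max_count=None):
    if selector == "":
        raise ValueError("invalid selector")
    search = selector.upper() if ignore_case else selector
    found = []
    for node in nodes:
        node_text = node.inner_text.upper() if ignore_case else node.inner_text
        if search not in node_text:
            continue
        if not node.children or max_depth == 0:
            # Leaf, or depth limit reached: the node itself is the candidate.
            if match_type == "partial" or node_text == search:
                found.append(node)
        else:
            child_depth = None if max_depth is None else max_depth - 1
            matches = locate_by_inner_text(node.children, selector, child_depth,
                                           match_type, ignore_case, max_count)
            if not matches and match_type == "partial":
                # Text spans several children, so only the parent matches.
                found.append(node)
            else:
                found.extend(matches)
    return found if max_count is None else found[:max_count]
```

With a max depth of 0 the search stops at the context nodes themselves, which is why a partial match then returns the container rather than its children.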

To collect nodes using accessibility attributes with given context nodes, selector, maximum returned node count, and returned nodes:

If returned nodes is null:
1. Set returned nodes to an empty list.
For each context node in context nodes:
1. Let match be true.
2. If context node implements Element:
  1. If selector contains "role":
    1. Let role be the computed role of context node.
    2. If selector["role"] is not role:
      1. Set match to false.
  2. If selector contains "name":
    1. Let name be the accessible name of context node.
    2. If selector["name"] is not name:
      1. Set match to false.
3. Otherwise, set match to false.
4. If match is true:
  1. If maximum returned node count is not null and size of returned nodes is equal to maximum returned node count, break.
  2. Append context node to returned nodes.
5. Let child nodes be an empty list and, for each node child in the children of context node:
  1. If child implements Element, append child to child nodes.
6. Try to collect nodes using accessibility attributes with child nodes, selector, maximum returned node count, and returned nodes.
Return returned nodes.

To locate nodes using accessibility attributes with given context nodes, selector, and maximum returned node count:

If selector does not contain "role" and selector does not contain "name", return error with error code invalid selector.
Return the result of collect nodes using accessibility attributes with context nodes, selector, maximum returned node count, and null.
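A simplified Python sketch of the two accessibility algorithms follows. The Element class, collect and locate_by_accessibility are hypothetical names; the computed role and accessible name are modelled as plain attributes rather than being derived from the accessibility tree, and every node is assumed to be an element.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical element with precomputed role and accessible name."""
    role: str
    name: str
    children: list["Element"] = field(default_factory=list)


def collect(nodes, selector, max_count, found):
    """Depth-first collection sharing one accumulator across recursive calls."""
    for node in nodes:
        # Only the keys present in the selector take part in the comparison.
        match = all(getattr(node, key) == selector[key]
                    for key in ("role", "name") if key in selector)
        if match:
            if max_count is not None and len(found) == max_count:
                break
            found.append(node)
        collect(node.children, selector, max_count, found)
    return found


def locate_by_accessibility(nodes, selector, max_count=None):
    if "role" not in selector and "name" not in selector:
        raise ValueError("invalid selector")
    return collect(nodes, selector, max_count, [])
```

Because a parent is tested before its children, results come back in tree order, and the count limit is enforced at the moment a further match would be appended.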

The remote end steps with session and command parameters are:

Let navigable id be command parameters["context"].
Let navigable be the result of trying to get a navigable with navigable id.
Assert: navigable is not null.
Let realm be the result of trying to get a realm from a navigable with navigable id of navigable and null.
Let locator be command parameters["locator"].
If command parameters contains "startNodes", let start nodes parameter be command parameters["startNodes"]. Otherwise let start nodes parameter be null.
If command parameters contains "maxNodeCount", let maximum returned node count be command parameters["maxNodeCount"]. Otherwise, let maximum returned node count be null.
Let context nodes be an empty list.
If start nodes parameter is null, append the navigable’s active document to context nodes. Otherwise, for each serialized start node in start nodes parameter:
1. Let start node be the result of trying to deserialize shared reference given serialized start node, realm and session.
2. Append start node to context nodes.
Assert: size of context nodes is greater than 0.
Let type be locator["type"].
In the following list of conditions and associated steps, run the first set of steps for which the associated condition is true:
type is the string "css"
1. Let selector be locator["value"].
2. Let result nodes be the result of trying to locate nodes using CSS given navigable, context nodes, selector, maximum returned node count, and session.
type is the string "xpath"
1. Let selector be locator["value"].
2. Let result nodes be the result of trying to locate nodes using XPath given navigable, context nodes, selector, and maximum returned node count.
type is the string "innerText"
1. Let selector be locator["value"].
2. If locator contains maxDepth, let max depth be locator["maxDepth"]. Otherwise, let max depth be null.
3. If locator contains ignoreCase, let ignore case be locator["ignoreCase"]. Otherwise, let ignore case be false.
4. If locator contains matchType, let match type be locator["matchType"]. Otherwise, let match type be "full".
5. Let result nodes be the result of trying to locate nodes using inner text given context nodes, selector, max depth, match type, ignore case, and maximum returned node count.
type is the string "accessibility"
1. Let selector be locator["value"].
2. Let result nodes be the result of trying to locate nodes using accessibility attributes given context nodes, selector, and maximum returned node count.
type is the string "context"
1. If start nodes parameter is not null, return error with error code invalid argument.
2. Let selector be locator["value"].
3. Let context id be selector["context"].
4. Let child navigable be the result of trying to get a navigable with context id.
5. If child navigable’s parent is not navigable, return error with error code invalid argument.
6. Let result nodes be the result of locate the container element given child navigable.
7. Assert: For each node in result nodes, node’s node navigable is navigable.
Assert: maximum returned node count is null or size of result nodes is less than or equal to maximum returned node count.
If command parameters contains "serializationOptions", let serialization options be command parameters["serializationOptions"]. Otherwise, let serialization options be a map matching the script.SerializationOptions production with the fields set to their default values.
Let result ownership be "none".
Let serialized nodes be an empty list.
For each result node in result nodes:
1. Let serialized node be the result of serialize as a remote value with result node, serialization options, result ownership, a new map as serialization internal map, realm and session.
2. Append serialized node to serialized nodes.
Let result be a map matching the browsingContext.LocateNodesResult production, with the nodes field set to serialized nodes.
Return success with data result.

7.3.3.8. The browsingContext.navigate Command

The browsingContext.navigate command navigates a navigable to the given URL.

Command Type

browsingContext.Navigate = (
  method: "browsingContext.navigate",
  params: browsingContext.NavigateParameters
)

browsingContext.NavigateParameters = {
  context: browsingContext.BrowsingContext,
  url: text,
  ? wait: browsingContext.ReadinessState,
}

Return Type

browsingContext.NavigateResult = {
  navigation: browsingContext.Navigation / null,
  url: text,
}

The remote end steps with session and command parameters are:

Let navigable id be the value of the context field of command parameters.
Let navigable be the result of trying to get a navigable with navigable id.
Assert: navigable is not null.
Let wait condition be "committed".
If command parameters contains "wait" and command parameters["wait"] is not "none", set wait condition to command parameters["wait"].
Let url be the value of the url field of command parameters.
Let document be navigable’s active document.
Let base be document’s base URL.
Let url record be the result of applying the URL parser to url, with base URL base.
If url record is failure, return error with error code invalid argument.
Let request be a new request whose URL is url record.
Return the result of await a navigation with navigable, request and wait condition.
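The parameter handling in these steps can be approximated in Python as shown below. The function prepare_navigation is a hypothetical name, and urllib.parse.urljoin is used as a rough stand-in for the URL parser: it resolves relative references against a base but does not implement the URL Standard, so its notion of failure is narrower.

```python
from urllib.parse import urljoin, urlsplit


def prepare_navigation(params: dict, base_url: str) -> tuple[str, str]:
    """Return (absolute URL, wait condition) for a navigate command."""
    wait_condition = "committed"
    # A missing wait field and an explicit "none" are treated the same way.
    if params.get("wait", "none") != "none":
        wait_condition = params["wait"]
    # Relative URLs are resolved against the active document's base URL.
    url = urljoin(base_url, params["url"])
    if not urlsplit(url).scheme:
        raise ValueError("invalid argument")
    return url, wait_condition
```

Resolving against the document's base URL is what allows a client to send a relative url value such as "/next?page=2".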

7.3.3.9. The browsingContext.print Command

The browsingContext.print command creates a paginated representation of a document, and returns it as a PDF document represented as a Base64-encoded string.

Command Type

browsingContext.Print = (
  method: "browsingContext.print",
  params: browsingContext.PrintParameters
)

browsingContext.PrintParameters = {
  context: browsingContext.BrowsingContext,
  ? background: bool .default false,
  ? margin: browsingContext.PrintMarginParameters,
  ? orientation: ("portrait" / "landscape") .default "portrait",
  ? page: browsingContext.PrintPageParameters,
  ? pageRanges: [*(js-uint / text)],
  ? scale: (0.1..2.0) .default 1.0,
  ? shrinkToFit: bool .default true,
}

browsingContext.PrintMarginParameters = {
  ? bottom: (float .ge 0.0) .default 1.0,
  ? left: (float .ge 0.0) .default 1.0,
  ? right: (float .ge 0.0) .default 1.0,
  ? top: (float .ge 0.0) .default 1.0,
}

; Minimum size is 1pt x 1pt. Conversion follows from
; https://www.w3.org/TR/css3-values/#absolute-lengths
browsingContext.PrintPageParameters = {
  ? height: (float .ge 0.0352) .default 27.94,
  ? width: (float .ge 0.0352) .default 21.59,
}

Return Type

browsingContext.PrintResult = {
  data: text
}

The remote end steps with session and command parameters are:

Let navigable id be the value of the context field of command parameters.
Let navigable be the result of trying to get a navigable with navigable id.
If the implementation is unable to provide a paginated representation of navigable for any reason then return error with error code unsupported operation.
Let margin be the value of the margin field of command parameters if present, or otherwise a map matching the browsingContext.PrintMarginParameters production with the fields set to their default values.
Let page size be the value of the page field of command parameters if present, or otherwise a map matching the browsingContext.PrintPageParameters production with the fields set to their default values.

Note: The minimum page size is 1 point, which is (2.54 / 72) cm as per absolute lengths.

Let page ranges be the value of the pageRanges field of command parameters if present or an empty list otherwise.
Let document be navigable’s active document.

Immediately after the next invocation of the run the animation frame callbacks algorithm for document:

This ought to be integrated into the update rendering algorithm in some more explicit way.

Let pdf data be the result of taking UA-specific steps to generate a paginated representation of document, with the CSS media type set to print, encoded as a PDF, with the following paper settings:

Property	Value
Width in cm	page size["width"] if command parameters["orientation"] is "portrait", otherwise page size["height"]
Height in cm	page size["height"] if command parameters["orientation"] is "portrait", otherwise page size["width"]
Top margin, in cm	margin["top"]
Bottom margin, in cm	margin["bottom"]
Left margin, in cm	margin["left"]
Right margin, in cm	margin["right"]

In addition, the following formatting hints should be applied by the UA:

If command parameters["scale"] is not equal to 1: Zoom the size of the content by a factor of command parameters["scale"].
If command parameters["background"] is false: Suppress output of background images.
If command parameters["shrinkToFit"] is true: Resize the content to match the page width, overriding any page width specified in the content.

If page ranges is not empty, let pages be the result of trying to parse a page range with page ranges and the number of pages contained in pdf data, then remove any pages from pdf data whose one-based index is not contained in pages.
Let encoding result be the result of calling Base64 Encode on pdf data.
Let encoded data be encoding result’s data.
Let body be a map matching the browsingContext.PrintResult production, with the data field set to encoded data.
Return success with data body.
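The paper settings used for printing can be derived from the command parameters roughly as in the Python sketch below. The function name paper_settings and the output keys are hypothetical; the defaults are taken from the CDDL definitions above, and PDF generation itself is out of scope.

```python
def paper_settings(params: dict) -> dict:
    """Apply defaults and orientation to obtain the paper size and margins in cm."""
    # Fields supplied by the client override the CDDL defaults one by one.
    page = {"width": 21.59, "height": 27.94, **params.get("page", {})}
    margin = {"top": 1.0, "bottom": 1.0, "left": 1.0, "right": 1.0,
              **params.get("margin", {})}
    portrait = params.get("orientation", "portrait") == "portrait"
    return {
        # Landscape swaps the page's width and height.
        "width_cm": page["width"] if portrait else page["height"],
        "height_cm": page["height"] if portrait else page["width"],
        "margins_cm": margin,
    }
```

The default page size corresponds to US Letter (8.5 in by 11 in) expressed in centimetres; the margins are not swapped when the orientation changes.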

7.3.3.10. The browsingContext.reload Command

The browsingContext.reload command reloads a navigable.

Command Type

browsingContext.Reload = (
  method: "browsingContext.reload",
  params: browsingContext.ReloadParameters
)

browsingContext.ReloadParameters = {
  context: browsingContext.BrowsingContext,
  ? ignoreCache: bool,
  ? wait: browsingContext.ReadinessState,
}

Return Type

browsingContext.ReloadResult = browsingContext.NavigateResult

The remote end steps with command parameters are:

Let navigable id be the value of the context field of command parameters.
Let navigable be the result of trying to get a navigable with navigable id.
Assert: navigable is not null.
Let ignore cache be the value of the ignoreCache field of command parameters if present, or false otherwise.
Let wait condition be "committed".
If command parameters contains "wait" and command parameters["wait"] is not "none", set wait condition to command parameters["wait"].
Let document be navigable’s active document.
Let url be document’s URL.
Let request be a new request whose URL is url.
Return the result of await a navigation with navigable, request, wait condition, history handling "reload", and ignore cache ignore cache.

7.3.3.11. The browsingContext.setBypassCSP Command

The browsingContext.setBypassCSP command allows bypassing Content Security Policy enforcement.

Note: When CSP bypass is enabled, all CSP directives are bypassed, including those that would normally block eval(), new Function(), inline scripts, and resource loading.

Command Type

browsingContext.SetBypassCSP = (
  method: "browsingContext.setBypassCSP",
  params: browsingContext.SetBypassCSPParameters
)

browsingContext.SetBypassCSPParameters = {
  bypass: true / null,
  ? contexts: [+browsingContext.BrowsingContext],
  ? userContexts: [+browser.UserContext],
}

Return Type

browsingContext.SetBypassCSPResult = EmptyResult

A remote end has a bypass CSP configuration, which is a WebDriver configuration with associated type boolean.

The WebDriver BiDi CSP is bypassed steps given navigable navigable are:

Let top-level traversable be navigable’s top-level traversable.
Let bypass CSP enabled be the result of get WebDriver configuration value of bypass CSP configuration for top-level traversable.
Assert: bypass CSP enabled is true or unset.
If bypass CSP enabled is unset, return false.
Return true.

The remote end steps given command parameters are:

Let bypass be command parameters["bypass"].
If bypass is null, set bypass to unset.
Try to store WebDriver configuration bypass CSP configuration bypass for command parameters.
Return success with data null.
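The lookup behind the CSP bypass check can be sketched in Python as follows. It is illustrative only: the Configuration class, the UNSET sentinel and csp_is_bypassed are hypothetical names, navigables and user contexts are modelled as string ids in ordinary dictionaries rather than weak maps, and the lookup order (navigable, then user context, then global) follows the note in the Infrastructure section.

```python
from dataclasses import dataclass, field

UNSET = object()  # sentinel standing in for the spec's "unset" value


@dataclass
class Configuration:
    """Hypothetical model of a WebDriver configuration."""
    global_value: object = UNSET
    user_contexts: dict = field(default_factory=dict)
    navigables: dict = field(default_factory=dict)

    def value_for(self, navigable: str, user_context: str) -> object:
        # Most specific setting wins: navigable, then user context, then global.
        for value in (self.navigables.get(navigable, UNSET),
                      self.user_contexts.get(user_context, UNSET),
                      self.global_value):
            if value is not UNSET:
                return value
        return UNSET


def csp_is_bypassed(config: Configuration, navigable: str, user_context: str) -> bool:
    """Mirror the 'CSP is bypassed' steps for a top-level traversable."""
    value = config.value_for(navigable, user_context)
    # The command only accepts true or null, so false is never stored.
    assert value is True or value is UNSET
    return value is True
```

Sending bypass: null corresponds to storing unset, which removes the override at that level and lets a less specific setting, if any, apply again.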

7.3.3.12. The browsingContext.setViewport Command

The browsingContext.setViewport command modifies specific viewport characteristics (e.g. viewport width and viewport height) on the given top-level traversable.

Command Type

browsingContext.SetViewport = (
  method: "browsingContext.setViewport",
  params: browsingContext.SetViewportParameters
)

browsingContext.SetViewportParameters = {
  ? context: browsingContext.BrowsingContext,
  ? viewport: browsingContext.Viewport / null,
  ? devicePixelRatio: (float .gt 0.0) / null,
  ? userContexts: [+browser.UserContext],
}

browsingContext.Viewport = {
  width: js-uint,
  height: js-uint,
}

Return Type

browsingContext.SetViewportResult = EmptyResult

To set device pixel ratio override given navigable and device pixel ratio:

If device pixel ratio is not null:
1. For the document currently loaded in navigable:
  1. When the select an image source from a source set steps are run, act as if the implementation’s pixel density was set to device pixel ratio when selecting an image.
  2. For the purposes of the resolution media feature, act as if the implementation’s resolution is device pixel ratio dppx scaled by the page zoom.
2. Set device pixel ratio overrides[navigable] to device pixel ratio.
  
  Note: This takes effect because of the patch in § 8.3.1 Determine the device pixel ratio.
Otherwise:
1. For the document currently loaded in navigable:
  1. When the select an image source from a source set steps are run, use the implementation’s default behavior, without any changes made by previous invocations of these steps.
  2. For the purposes of the resolution media feature, use the implementation’s default behavior, without any changes made by previous invocations of these steps.
2. Remove navigable from device pixel ratio overrides.
Run evaluate media queries and report changes for the document currently loaded in navigable.

To set viewport given navigable navigable and viewport viewport:

If viewport is not null, set the width of navigable’s layout viewport to be the viewport’s width in CSS pixels and set the height of the navigable’s layout viewport to be the viewport’s height in CSS pixels.
Otherwise, set the navigable’s layout viewport to the implementation-defined default.

After creating a document in a new navigable navigable and before the run WebDriver BiDi preload scripts algorithm is invoked:

TODO: Move it as a hook in the html spec instead.

Let user context be navigable’s associated user context.
If navigable is a top-level traversable:
1. Update geolocation override for navigable.
2. Update emulated forced colors theme for navigable.
3. If screen orientation overrides map contains user context, set emulated screen orientation with navigable and screen orientation overrides map[user context].
If viewport overrides map contains user context:
1. If navigable is a top-level traversable and viewport overrides map[user context]'s viewport is not null:
  1. Set viewport with navigable and viewport overrides map[user context]'s viewport.
2. If viewport overrides map[user context]'s devicePixelRatio is not null:
  1. Set device pixel ratio override with navigable and viewport overrides map[user context]'s devicePixelRatio.
Update scrollbar type override for navigable.

The remote end steps with command parameters are:

If the implementation is unable to adjust the layout viewport parameters with the given command parameters for any reason, return error with error code unsupported operation.
If command parameters contains "userContexts" and command parameters contains "context", return error with error code invalid argument.
Let navigables be a set.
If the context field of command parameters is present:
1. Let navigable id be the value of the context field of command parameters.
2. Let navigable be the result of trying to get a navigable with navigable id.
3. If navigable is not a top-level traversable, return error with error code invalid argument.
4. Append navigable to navigables.
Otherwise, if the userContexts field of command parameters is present:
1. Let user contexts be the result of trying to get valid user contexts with command parameters["userContexts"].
2. For each user context of user contexts:
  1. Set viewport overrides map[user context] to a struct.
  2. If command parameters contains "viewport":
    1. Set viewport overrides map[user context]'s viewport to command parameters["viewport"].
  3. If command parameters contains "devicePixelRatio":
    1. Set viewport overrides map[user context]'s devicePixelRatio to command parameters["devicePixelRatio"].
  4. For each top-level traversable of the list of all top-level traversables whose associated user context is user context:
    1. Append top-level traversable to navigables.
Otherwise, return error with error code invalid argument.
If command parameters contains the viewport field:
1. Let viewport be command parameters["viewport"].
2. For each navigable of navigables:
  1. Set viewport with navigable and viewport.
  2. Run the CSSOM View § 13.1 Resizing viewports steps with navigable’s active document.
If command parameters
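
The remote end steps above, up to the point where the text breaks off, could be modeled roughly as follows. This is a sketch under stated assumptions: the RemoteEndState, SetViewportParameters, and BidiError names are hypothetical, the "no such frame" and "no such user context" error codes are assumed because the steps above do not state them, and both the initial unsupported-operation check and the CSSOM View resize steps are left out.

```typescript
// Hypothetical model types; not defined by the specification.
interface Viewport { width: number; height: number; }
interface ViewportOverride { viewport?: Viewport | null; devicePixelRatio?: number | null; }
interface Navigable {
  id: string;
  userContext: string;
  isTopLevelTraversable: boolean;
  layoutViewport: Viewport | null; // null stands for the implementation-defined default
}
interface SetViewportParameters {
  context?: string;
  userContexts?: string[];
  viewport?: Viewport | null;
  devicePixelRatio?: number | null;
}
interface RemoteEndState {
  navigables: Navigable[];
  userContexts: string[];
  viewportOverridesMap: Map<string, ViewportOverride>;
}

class BidiError extends Error {
  constructor(public readonly code: string) { super(code); }
}

function runSetViewportSteps(state: RemoteEndState, params: SetViewportParameters): Set<Navigable> {
  // "context" and "userContexts" are mutually exclusive.
  if (params.userContexts !== undefined && params.context !== undefined) {
    throw new BidiError("invalid argument");
  }
  const navigables = new Set<Navigable>();

  if (params.context !== undefined) {
    const navigable = state.navigables.find((n) => n.id === params.context);
    if (navigable === undefined) throw new BidiError("no such frame"); // assumed error code
    if (!navigable.isTopLevelTraversable) throw new BidiError("invalid argument");
    navigables.add(navigable);
  } else if (params.userContexts !== undefined) {
    // Validate every user context before recording any override.
    for (const userContext of params.userContexts) {
      if (state.userContexts.indexOf(userContext) === -1) {
        throw new BidiError("no such user context"); // assumed error code
      }
    }
    for (const userContext of params.userContexts) {
      const override: ViewportOverride = {};
      if (params.viewport !== undefined) override.viewport = params.viewport;
      if (params.devicePixelRatio !== undefined) override.devicePixelRatio = params.devicePixelRatio;
      state.viewportOverridesMap.set(userContext, override);
      for (const n of state.navigables) {
        if (n.isTopLevelTraversable && n.userContext === userContext) navigables.add(n);
      }
    }
  } else {
    throw new BidiError("invalid argument");
  }

  if (params.viewport !== undefined) {
    const viewport = params.viewport;
    // Set viewport on each navigable; the resize steps are omitted from this sketch.
    navigables.forEach((n) => { n.layoutViewport = viewport; });
  }
  return navigables;
}
```

Note that an override is stored only when the command targets user contexts. A command that names a single context changes that navigable directly and leaves the viewport overrides map untouched.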
