Minimal implementation of resource pools (RFC) #8449

@alexey-milovidov

Description

Let's split all resources into two classes:

  1. Non-elastic resources.
    When a non-elastic resource is exhausted, we cannot work.
    Examples: memory usage (OOM), disk space usage (no space left on device).

  2. Elastic resources.
    When we run out of an elastic resource, the OS or another system will do the resource sharing for us.
    Examples: CPU utilization, disk IOPS, disk throughput, network bandwidth.
    But our goal is to implement resource sharing with our own configurable proportions.

Proper sharing of non-elastic resources requires implementing resource overcommit and preemption. Query preemption will be implemented only after the "Processors" branch, and it is currently not scheduled.

We can first implement resource pools only for elastic resources, without waiting for the "Processors" implementation.

We can share elastic resources in the following way:

  1. Calculate per-query and total resource usage by some metric.

This is already implemented. For example, for CPU usage we can use OSCPUVirtualTimeMicroseconds or the sum of UserTimeMicroseconds and SystemTimeMicroseconds. We also already have metrics of real IO usage, excluding the page cache.
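As a minimal sketch of this step: aggregating a per-query CPU metric from the counters named above. The ProfileEvents names come from the text; the `QueryStats` structure and functions are hypothetical, just to illustrate the aggregation.

```python
# Sketch: aggregate a per-query CPU-time metric. The counter names
# UserTimeMicroseconds / SystemTimeMicroseconds are from the text above;
# the QueryStats structure itself is hypothetical.
from dataclasses import dataclass

@dataclass
class QueryStats:
    user_time_us: int    # UserTimeMicroseconds
    system_time_us: int  # SystemTimeMicroseconds

def cpu_time_us(q: QueryStats) -> int:
    # CPU usage of one query: sum of user and system time.
    return q.user_time_us + q.system_time_us

def total_cpu_time_us(queries: dict) -> int:
    # Total CPU usage across all running queries.
    return sum(cpu_time_us(q) for q in queries.values())
```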

  2. Determine, by this metric or by another one, that we are above some threshold of resource capacity.

This should signal that the resource is nearly overloaded, or at least exceed some lower estimate of that threshold.

It can be a threshold on the same metric. Example: OSCPUVirtualTimeMicroseconds is larger than the total number of CPU cores without hyper-threading, multiplied by 0.9. Or it can be a threshold on another metric. Example: OSCPUWaitMicroseconds is larger than 10% of the total time across all threads; or, if we have a high IO wait ratio, then IO is near overload.

Sometimes it is difficult to calculate the total resource capacity we have. For example, how many IOPS can an SSD deliver? It depends on the load scenario, but some threshold on the IO wait ratio will suffice to decide that it is overloaded.
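The two threshold checks described above can be sketched as follows. The 0.9 factor and the 10% wait ratio are the example values from the text; the function names and signatures are hypothetical.

```python
def cpu_overloaded_by_usage(cpu_virtual_cores_used: float,
                            physical_cores: int,
                            factor: float = 0.9) -> bool:
    # Overloaded if measured CPU time exceeds 90% of the capacity of
    # physical (non-hyperthreaded) cores, per the example above.
    return cpu_virtual_cores_used > physical_cores * factor

def cpu_overloaded_by_wait(cpu_wait_us: float,
                           total_thread_time_us: float,
                           ratio: float = 0.10) -> bool:
    # Alternative check: CPU wait time (OSCPUWaitMicroseconds) above
    # 10% of the total time across all threads signals contention.
    return cpu_wait_us > total_thread_time_us * ratio
```

The second form is useful precisely when total capacity is hard to estimate, as with SSD IOPS above: a wait ratio needs no capacity number at all.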

  3. If we are above the threshold, calculate the desired ratios of this resource's usage for all queries.

Simply by dividing the per-query metric by the metric aggregated across resource pools.
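That division might look like this (a hypothetical helper over the per-query metric from step 1):

```python
def usage_ratios(per_query_metric: dict) -> dict:
    # Actual share of the resource each query consumes: its own metric
    # divided by the metric aggregated across the pool.
    total = sum(per_query_metric.values())
    if total == 0:
        return {q: 0.0 for q in per_query_metric}
    return {q: v / total for q, v in per_query_metric.items()}
```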

  4. The queries that have the maximum allowed ratio will continue to work without limits, and the other queries will be throttled to bring their resource usage below their configured ratio relative to total resource usage. Until "Processors" is implemented, throttling is done simply by sleeping.

That is, we leave the highest-priority queries unlimited so they can fully utilize the resource capacity, and at the same time we throttle all other queries proportionally. Alternatively, we could throttle even the highest-priority queries, but only if their share is significantly higher than desired.

This is important because we can be above the estimate of resource capacity while the resource is, in fact, underutilized.

Example: we have 16 logical CPU cores, and the threshold at which the CPU is considered overloaded is 8 cores.
Three queries are running: query A uses 6 cores and is configured with a 50% share of CPU time; queries B and C use 4 cores each and are configured with 25% shares. Then we let query A run without limits, but throttle queries B and C so that each uses only 25% * (6 + 4 + 4) = 3.5 CPU cores instead of 4.
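Under these assumptions, the whole decision from step 4 can be sketched for this example. The sleep-based throttling itself is not shown; `throttle_targets` is a hypothetical helper, not an existing ClickHouse function.

```python
def throttle_targets(usage: dict, share: dict) -> dict:
    # For each query: None if it may run unthrottled (its actual usage
    # is within its configured share of the total), otherwise its target
    # usage in cores: configured share * total usage.
    total = sum(usage.values())
    targets = {}
    for q, used in usage.items():
        allowed = share[q] * total
        targets[q] = None if used <= allowed else allowed
    return targets

# The example from the text: A uses 6 cores with a 50% share,
# B and C use 4 cores each with 25% shares; total usage is 14 cores.
print(throttle_targets({"A": 6, "B": 4, "C": 4},
                       {"A": 0.5, "B": 0.25, "C": 0.25}))
# → {'A': None, 'B': 3.5, 'C': 3.5}
```

Query A stays unthrottled because its actual share, 6/14 ≈ 43%, is below its configured 50%; B and C exceed their 25% shares (4/14 ≈ 29%) and are capped at 0.25 * 14 = 3.5 cores each.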

Note: there are some unresolved issues with this method...
