Backup to S3 endpoint out of memory #43083

@DamienVicet

Description

Describe what's wrong

When we back up a large amount of data to a MinIO S3 storage, ClickHouse consumes all the available memory and crashes.

Does it reproduce on recent release?

Tested on ClickHouse 22.10.2.11 and on this build: https://s3.amazonaws.com/clickhouse-builds/0/12c6a1116c16ddefce8632138b7871753e1394f3/clickhouse_build_check/report.html

How to reproduce

  • ClickHouse server version: 22.10.2.11
  • Docker Compose with:
    • clickhouse-server with 10 GB of memory (to reproduce faster)
    • minio
    • docker-tc: to limit minio bandwidth

docker-compose.yml:

services:
  tc:
    image: lukaszlach/docker-tc
    container_name: docker-tc
    cap_add:
      - NET_ADMIN
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/docker-tc:/var/docker-tc
    deploy:
      mode: global
      restart_policy:
        condition: any
    environment:
      HTTP_BIND: "127.0.0.1"
      HTTP_PORT: "4080"
    network_mode: host

  clickhouse:
    depends_on:
      - tc
    image: clickhouse/clickhouse-server:latest-alpine
    networks:
      - limited-network
    deploy:
      resources:
        limits:
          memory: 10G

  minio-backup:
    depends_on:
      - tc
    image: minio/minio:latest
    environment:
      - MINIO_ACCESS_KEY=access
      - MINIO_SECRET_KEY=secretkey
    labels:
      - "com.docker-tc.enabled=1"
      - "com.docker-tc.limit=8mbps"
    command: server /var/lib/minio
    networks:
      - limited-network

  createbuckets:
    image: registry.dametis.dev/proxy/minio/mc
    depends_on:
      - minio-backup
      - tc
    entrypoint: >
      /bin/sh -c "
      /usr/bin/mc config host add myminio http://minio-backup:9000 access secretkey;
      /usr/bin/mc rm -r --force myminio/backups;
      /usr/bin/mc mb myminio/backups;
      /usr/bin/mc policy download myminio/backups;
      exit 0;
      "
    networks:
      - limited-network

networks:
  default:
    external:
      name: host
  limited-network:

Launch a clickhouse-client:
docker run -it --rm --name client --network docker-tc_limited-network clickhouse/clickhouse-server:latest-alpine clickhouse-client --host clickhouse
Create a test database with many rows:

CREATE DATABASE IF NOT EXISTS test;

CREATE TABLE IF NOT EXISTS test.test
(
  `a` Array(Int8),
  `d` Decimal32(4),
  `c` Tuple(DateTime64(3), UUID)
)
ENGINE = MergeTree
ORDER BY a;

INSERT INTO test.test SELECT * FROM generateRandom('a Array(Int8), d Decimal32(4), c Tuple(DateTime64(3), UUID)', 1, 10, 2) LIMIT 500000000;

Back up the database to MinIO S3:

BACKUP DATABASE test TO S3('http://minio-backup:9000/backups/backup', 'access', 'secretkey')
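As an experiment (not part of the original report), smaller S3 upload parts might lower the peak memory held in upload buffers. The sketch below assumes the `s3_min_upload_part_size` and `s3_max_single_part_upload_size` settings can be passed on the `BACKUP` query and that its S3 writes honor them; this may not hold on every version:

```sql
-- Hypothetical workaround sketch: cap the multipart upload buffer sizes.
-- Whether BACKUP respects these per-query S3 settings is an assumption.
BACKUP DATABASE test TO S3('http://minio-backup:9000/backups/backup', 'access', 'secretkey')
SETTINGS s3_min_upload_part_size = 16777216,        -- 16 MiB parts
         s3_max_single_part_upload_size = 16777216; -- switch to multipart early
```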

Expected behavior

The backup query should succeed without consuming all available memory while sending data to MinIO.
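To quantify "takes all the available memory", memory growth can be watched from a second client while the backup is running; `MemoryTracking` is a standard entry in `system.metrics`:

```sql
-- Run from a second clickhouse-client while the BACKUP is in flight.
SELECT metric, formatReadableSize(value) AS mem
FROM system.metrics
WHERE metric = 'MemoryTracking';
```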

Error message and/or stacktrace

ClickHouse sometimes crashes because memory is exhausted:

Exception on client:
Code: 32. DB::Exception: Attempt to read after eof: while receiving packet from clickhouse:9000. (ATTEMPT_TO_READ_AFTER_EOF)
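When the server survives the OOM, the failed backup should also be visible in `system.backups` (available in 22.10; the column names below are an assumption and may differ between versions):

```sql
-- Inspect the status and error of recent backups after the failure.
SELECT id, name, status, error
FROM system.backups
ORDER BY start_time DESC
LIMIT 5;
```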

Thank you in advance! Don't hesitate to ask if you need more information.

Labels

potential bug: To be reviewed by developers and confirmed/rejected.
