This repository was archived by the owner on Feb 7, 2026. It is now read-only.

createQueryStream loads a big set of data into memory before streaming #1073

@koenvanzuijlen

bigQuery.createQueryStream seems to load an entire set of data into memory before the stream actually starts piping data into the downstream streams.

Environment details

  • OS: macOS 12.1
  • Node.js version: 14.18.1
  • npm version: 6.14.15
  • @google-cloud/bigquery version: 5.10.0

Steps to reproduce

Using the test script below, I can see that over 300 MB of data is loaded into memory before the stream starts piping to the next streams. I am only selecting one column, so that amounts to a lot of records.

If I log each entry in the transform stream, the entries also seem to arrive in batches: the stream pauses for a while and then suddenly starts piping again. This makes me think that internally a whole page is loaded into memory and then pushed to the readable stream, but that might not be the actual cause.

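const fs = require("fs");
const { Transform } = require("stream");
const { BigQuery } = require("@google-cloud/bigquery");

const bigQuery = new BigQuery();

const stream = bigQuery
  .dataset("dataset")
  .createQueryStream("SELECT email FROM table");

// Report heap growth once, when the first row reaches the transform.
let checked = false;
const tr = new Transform({
  objectMode: true,
  transform: (chunk, enc, next) => {
    if (!checked) {
      console.dir("START PIPING");
      console.dir(process.memoryUsage());
      console.dir(
        "DIFFERENCE = " +
          (process.memoryUsage().heapUsed - heapUsed) / (1024 * 1024) +
          " MB"
      );
      checked = true;
    }
    // Serialize each row as one line of newline-delimited JSON.
    next(null, JSON.stringify(chunk) + "\n");
  },
});

const write = fs.createWriteStream("/dev/null");

// Baseline heap usage, taken before any data flows.
console.dir("BEFORE");
console.dir(process.memoryUsage());
const { heapUsed } = process.memoryUsage();

tr.pipe(write);
stream.pipe(tr);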
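
The batching described above could be checked more directly by recording when each row arrives and grouping the arrivals into bursts. The following is only a minimal sketch: `groupIntoBursts` and `createArrivalRecorder` are hypothetical helpers written for illustration and are not part of @google-cloud/bigquery, and the 100 ms gap threshold in the usage note is an arbitrary assumption.

```javascript
const { Transform } = require("stream");

// Hypothetical helper: group arrival timestamps (in ms) into bursts.
// A new burst starts whenever the gap to the previous row exceeds gapMs.
// Returns the number of rows in each burst, in arrival order.
function groupIntoBursts(timestamps, gapMs) {
  const bursts = [];
  let previous = null;
  for (const t of timestamps) {
    if (previous === null || t - previous > gapMs) {
      bursts.push(0); // start a new burst
    }
    bursts[bursts.length - 1] += 1;
    previous = t;
  }
  return bursts;
}

// Hypothetical pass-through stream that records the arrival time of each row
// and forwards the row unchanged. The clock is injectable to ease testing.
function createArrivalRecorder(now = Date.now) {
  const timestamps = [];
  const recorder = new Transform({
    objectMode: true,
    transform(chunk, _enc, next) {
      timestamps.push(now());
      next(null, chunk);
    },
  });
  return { recorder, timestamps };
}
```

With the script above, the recorder could be placed between the query stream and the transform, for example `stream.pipe(recorder).pipe(tr)`, and `groupIntoBursts(timestamps, 100)` logged once the write stream emits `finish`. A small number of very large bursts would be consistent with whole pages being buffered before they are pushed, although it would not prove that this is the cause.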

    Labels

    api: bigquery (Issues related to the googleapis/nodejs-bigquery API.)
    priority: p2 (Moderately-important priority. Fix may not be included in next release.)
    type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)