Benchmark complex_layout_scroll_perf__memory needs better metrics to avoid reporting phantom regressions #40406

The main issue (after some discussion) is that this benchmark does not report metrics that exactly match what it is meant to guard against. As a result, situations such as the one described below can cause it to indicate a regression when a change has simply altered its underlying assumptions. We need to develop better metrics so that it flags only actual memory regressions.

Until then, the benchmark acts mostly as a "canary": it will often catch a change that might negatively affect app performance, but it will sometimes also fire in cases that don't affect real-world performance. Changes in the benchmark should be noted, but further investigation is needed to determine whether a change in the reported metrics really represents a regression.

**Originally reported issue:**

I'm investigating the historic slowdowns in the complex_layout_scroll_perf__memory benchmark. It looks like we were at around 5000 in late July, and since then we've been stuck at 9000+.

It appears that commit 9b150f134b9b6ba103ec305f489645b5fee905d2 is responsible for that change. Here are 3 runs of the benchmark on a Moto G4, first on commit 0d0af31598f5ee0552529bfd3fdc35be1d8fe176 (which is just before the indicated commit) and then on commit 9b150f134b9b6ba103ec305f489645b5fee905d2:

```
Stats for hash 0d0af3:

On G4:

  "data": {
    "start-min": 34210,
    "start-max": 40838,
    "start-median": 39220,
    "end-min": 44052,
    "end-max": 46190,
    "end-median": 45034,
    "diff-min": 4464,
    "diff-max": 11980,
    "diff-median": 5033
  },
  "data": {
    "start-min": 35164,
    "start-max": 41233,
    "start-median": 40473,
    "end-min": 44973,
    "end-max": 45653,
    "end-median": 45321,
    "diff-min": 4125,
    "diff-max": 10322,
    "diff-median": 4487
  },
  "data": {
    "start-min": 32907,
    "start-max": 40356,
    "start-median": 39693,
    "end-min": 44393,
    "end-max": 48066,
    "end-median": 44879,
    "diff-min": 4099,
    "diff-max": 15159,
    "diff-median": 4907
  },

Stats for hash 9b150f:

On G4:

  "data": {
    "start-min": 31106,
    "start-max": 33471,
    "start-median": 31184,
    "end-min": 40494,
    "end-max": 45019,
    "end-median": 40980,
    "diff-min": 7793,
    "diff-max": 11548,
    "diff-median": 9319
  },
  "data": {
    "start-min": 31144,
    "start-max": 32886,
    "start-median": 31240,
    "end-min": 40506,
    "end-max": 41104,
    "end-median": 40906,
    "diff-min": 7932,
    "diff-max": 9798,
    "diff-median": 9690
  },
  "data": {
    "start-min": 30958,
    "start-max": 32838,
    "start-median": 31365,
    "end-min": 40332,
    "end-max": 41224,
    "end-median": 40892,
    "diff-min": 8054,
    "diff-max": 9874,
    "diff-median": 9501
  },
```
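
As a rough cross-check of the runs above, the per-run medians can be compared directly. This is only a minimal sketch: the `BEFORE`/`AFTER` dictionaries and the `summarize` helper are names made up for illustration, the values are copied by hand from the listings above, and it takes the median across the three runs of each reported median.

```python
from statistics import median

# Per-run medians copied by hand from the two "Stats for hash" listings above.
BEFORE = {  # commit 0d0af3
    "start-median": [39220, 40473, 39693],
    "end-median": [45034, 45321, 44879],
    "diff-median": [5033, 4487, 4907],
}
AFTER = {  # commit 9b150f
    "start-median": [31184, 31240, 31365],
    "end-median": [40980, 40906, 40892],
    "diff-median": [9319, 9690, 9501],
}


def summarize(before, after):
    """Return {metric: (median_before, median_after, change)} across runs."""
    summary = {}
    for metric in before:
        b = median(before[metric])
        a = median(after[metric])
        summary[metric] = (b, a, a - b)
    return summary


summary = summarize(BEFORE, AFTER)
for metric, (b, a, change) in summary.items():
    print(f"{metric:>12}: {b} -> {a} ({change:+d})")
```

On these six runs, the start median falls by 8453 and the end median by 4128, while the diff median rises by 4594. In other words, the larger diff coincides with lower absolute readings at both the start and the end of the run. That may be consistent with the commit shifting the starting baseline rather than increasing memory use, though three runs per commit are too few to say so with confidence.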
