Fix vega chart unrecognized dataset error by lukasmasuch · Pull Request #11911 · streamlit/streamlit

lukasmasuch · 2025-07-10T21:16:00Z

Describe your changes

Attempts to fix an issue where Vega charts render as a blank chart and log an "unrecognized data set" error in the dev console. This is caused by two race conditions in the current implementation.

GitHub Issue Link (if applicable)

Testing Plan

Added unit tests.
Since the issue cannot be reproduced deterministically, it's a bit hard to write proper e2e tests.

Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

snyk-io · 2025-07-10T21:16:09Z

🎉 Snyk checks have passed. No issues have been found so far.

✅ security/snyk check is complete. No issues have been found. (View Details)

✅ license/snyk check is complete. No issues have been found. (View Details)

github-actions · 2025-07-10T21:16:19Z

✅ PR preview is ready!

Name	Link
📦 Wheel file	https://core-previews.s3-us-west-2.amazonaws.com/pr-11911/streamlit-1.49.1-py3-none-any.whl
🕹️ Preview app	pr-11911.streamlit.app (☁️ Deploy here if not accessible)

Copilot

Pull Request Overview

This PR addresses issues with unknown data in the Vega chart by improving how and when the chart view is created and how datasets are updated.

Added an isCreatingView flag to prevent data updates during view instantiation.
Switched from view.data() to explicit view.remove() + view.insert() when dataset shapes change.
Updated guards in createView and updateView to respect the new creation state.
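
The remove-then-insert approach from the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FakeView`, `replaceDataset`, and the error text are hypothetical stand-ins that only mimic the `remove(name, predicate)` / `insert(name, rows)` shape of a Vega view so the example runs without Vega.

```typescript
// Hypothetical stand-in for the small part of a Vega view used here.
// It only mimics the remove/insert call shape; it is not the Vega API itself.
type Row = Record<string, unknown>

class FakeView {
  private store = new Map<string, Row[]>()

  constructor(names: string[]) {
    // Only datasets declared up front (as in a spec) are known to the view.
    for (const name of names) this.store.set(name, [])
  }

  insert(name: string, rows: Row[]): this {
    const existing = this.store.get(name)
    if (!existing) throw new Error(`Unrecognized data set: ${name}`)
    existing.push(...rows)
    return this
  }

  remove(name: string, predicate: (row: Row) => boolean): this {
    const existing = this.store.get(name)
    if (!existing) throw new Error(`Unrecognized data set: ${name}`)
    this.store.set(name, existing.filter(row => !predicate(row)))
    return this
  }

  rows(name: string): Row[] {
    return this.store.get(name) ?? []
  }
}

// Replace the full contents of a dataset: drop every row, then insert the new ones.
function replaceDataset(view: FakeView, name: string, rows: Row[]): void {
  view.remove(name, () => true)
  view.insert(name, rows)
}

const view = new FakeView(["source"])
replaceDataset(view, "source", [{ x: 1 }, { x: 2 }])
replaceDataset(view, "source", [{ x: 3 }])
```

After the second call, the dataset holds only the rows from that call, regardless of how many rows or columns the previous data had.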

Comments suppressed due to low confidence (1)

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts:204

New removal and insertion logic for datasets merits unit tests to verify behavior when data shapes change, ensuring regressions are caught early.

        view.remove(name, truthy)

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

itsToggle · 2025-08-11T08:22:07Z

This pull request fixes the empty altair plot bugs 👍

itsToggle · 2025-08-21T06:55:49Z

Hey @lukasmasuch any chance this fix could make it into an upcoming release? It works perfectly and fixes the missing dataset and empty altair plot issue we've been having. We'd love to use the new data-editor list columns which were just added, as they are essential for our use case :)

lukasmasuch · 2025-08-21T07:57:33Z

@itsToggle sorry for not finishing this up sooner. The 1.49 release already went out yesterday :( But I will try to get this in for 1.50.

itsToggle · 2025-08-21T08:42:22Z

Oh thank you so much! We are really excited that the data-interaction side of streamlit has been getting a lot of attention recently :)
Happy to test or help in any way!

itsToggle · 2025-09-17T11:58:11Z

Bump? :)

lukasmasuch · 2025-09-17T17:11:00Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts


      // Load the initial set of data into the chart.
-      const dataArrays = getDataArrays(datasetsRef.current)
+      const dataArrays = getDataArrays(datasets)


Under heavy reruns (e.g., many sidebar widgets), these refs can lag and hold the old chart’s datasets, which don’t match the new spec’s dataset names, causing Vega to throw on insert.

I don't think there is a downside to directly using the initial datasets here.
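
A rough sketch of the mismatch described above, under simplifying assumptions: `makeView` and `loadInitialData` are hypothetical helpers, and the ref is a plain object rather than a React ref. It only shows why inserting datasets read from a lagging ref can fail while the datasets passed with the new spec succeed.

```typescript
type Datasets = Record<string, number[]>

// Hypothetical stand-in for a view built from a spec that declares `names`.
function makeView(names: string[]) {
  const known = new Set(names)
  return {
    insert(name: string, _rows: number[]): void {
      if (!known.has(name)) throw new Error(`Unrecognized data set: ${name}`)
    },
  }
}

// Returns true if every dataset could be inserted, false if any insert threw.
function loadInitialData(
  view: { insert(name: string, rows: number[]): void },
  datasets: Datasets
): boolean {
  try {
    for (const [name, rows] of Object.entries(datasets)) {
      view.insert(name, rows)
    }
    return true
  } catch {
    return false
  }
}

// A ref that was last synced for the previous chart.
const datasetsRef = { current: { old_chart_data: [1, 2] } as Datasets }
// The datasets passed alongside the new spec.
const datasets: Datasets = { new_chart_data: [3, 4] }

const newView = makeView(Object.keys(datasets))
// The lagging ref still names the old chart's dataset, so the insert fails.
const loadedFromRef = loadInitialData(newView, datasetsRef.current)
// The datasets that arrived with the new spec match its dataset names.
const loadedFromArgs = loadInitialData(newView, datasets)
```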

Copilot

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

lukasmasuch · 2025-09-17T17:51:11Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

+      try {
+        // Finalize the previous view so it can be garbage collected.
+        finalizeView()
+
+        const options = {
+          // Adds interpreter support for Vega expressions that is compliant with CSP
+          ast: true,
+          expr: expressionInterpreter,
+
+          // Disable default styles so that vega doesn't inject <style> tags in the
+          // DOM. We set these styles manually for finer control over them and to
+          // avoid inlining styles.
+          tooltip: { disableDefaultStyle: true },
+          defaultStyle: false,
+          forceActionsMenu: true,
+        }

-      // Finalize the previous view so it can be garbage collected.
-      finalizeView()
-
-      const options = {
-        // Adds interpreter support for Vega expressions that is compliant with CSP
-        ast: true,
-        expr: expressionInterpreter,
-
-        // Disable default styles so that vega doesn't inject <style> tags in the
-        // DOM. We set these styles manually for finer control over them and to
-        // avoid inlining styles.
-        tooltip: { disableDefaultStyle: true },
-        defaultStyle: false,
-        forceActionsMenu: true,
-      }
-
-      const { vgSpec, view, finalize } = await embed(
-        containerRef.current,
-        spec,
-        options
-      )
+        const { vgSpec, view, finalize } = await embed(
+          containerRef.current,
+          spec,
+          options
+        )

-      vegaView.current = maybeConfigureSelections(view)
+        vegaView.current = maybeConfigureSelections(view)

-      vegaFinalizer.current = finalize
+        vegaFinalizer.current = finalize

-      // Load the initial set of data into the chart.
-      const dataArrays = getDataArrays(datasetsRef.current)
+        // Load the initial set of data into the chart.
+        const dataArrays = getDataArrays(latestDatasetsRef.current)

-      // Heuristic to determine the default dataset name.
-      const datasetNames = dataArrays ? Object.keys(dataArrays) : []
-      if (datasetNames.length === 1) {
-        const [datasetName] = datasetNames
-        defaultDataName.current = datasetName
-      } else if (datasetNames.length === 0 && vgSpec.data) {
-        defaultDataName.current = DEFAULT_DATA_NAME
-      }
+        // Heuristic to determine the default dataset name.
+        const datasetNames = dataArrays ? Object.keys(dataArrays) : []
+        if (datasetNames.length === 1) {
+          const [datasetName] = datasetNames
+          defaultDataName.current = datasetName
+        } else if (datasetNames.length === 0 && vgSpec.data) {
+          defaultDataName.current = DEFAULT_DATA_NAME
+        }

-      const dataObj = getInlineData(dataRef.current)
-      if (dataObj) {
-        vegaView.current.insert(defaultDataName.current, dataObj)
-      }
-      if (dataArrays) {
-        for (const [name, dataArg] of Object.entries(dataArrays)) {
-          vegaView.current.insert(name, dataArg)
+        const dataObj = getInlineData(latestDataRef.current)
+        if (dataObj) {
+          vegaView.current.insert(defaultDataName.current, dataObj)
+        }
+        if (dataArrays) {
+          for (const [name, dataArg] of Object.entries(dataArrays)) {
+            vegaView.current.insert(name, dataArg)
+          }
        }
-      }

-      await vegaView.current.runAsync()
+        await vegaView.current.runAsync()

-      // Fix bug where the "..." menu button overlaps with charts where width is
-      // set to -1 on first load.
-      await vegaView.current.resize().runAsync()
+        // Fix bug where the "..." menu button overlaps with charts where width is
+        // set to -1 on first load.
+        await vegaView.current.resize().runAsync()

-      return vegaView.current
+        // Record the data used to initialize this view so subsequent updates
+        // have an accurate previous state to diff against.
+        prevDataRef.current = latestDataRef.current
+        prevDatasetsRef.current = latestDatasetsRef.current
+
+        return vegaView.current
+      } finally {


Nothing logical has changed in these lines; the code is just wrapped in a try/finally.

github-actions · 2025-09-17T18:02:55Z

📈 Frontend coverage change detected

The frontend unit test (vitest) coverage has increased by 0.0800%

Current PR: 84.9100% (47567 lines, 7176 missed)
Latest develop: 84.8300% (47546 lines, 7209 missed)

🎉 Great job on improving test coverage!

📊 View detailed coverage comparison

lukasmasuch · 2025-09-17T18:20:02Z

e2e_playwright/__snapshots__/linux/st_area_chart_test/st_area_chart-basic_df[chromium].png

The only change with all these charts is the side "padding". It seems like the new rendering approach fills out the full available space. I'm not exactly sure why, but it doesn't seem to be broken.

sfc-gh-nbellante

Any idea what caused all the snapshot diffs?

lukasmasuch · 2025-09-17T18:27:58Z

Any idea what caused all the snapshot diffs?

With the change, it uses the full available space with less padding which seems fine. However, the investigation into the root cause is still ongoing.

sfc-gh-nbellante · 2025-09-17T18:28:46Z

Any idea what caused all the snapshot diffs?

With the change, it uses the full available space with less padding which seems fine. However, the investigation into the root cause is still ongoing.

It doesn't look bad per se, it just makes me nervous when things unintentionally change.

sfc-gh-nbellante · 2025-09-17T18:33:50Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

+          // The finally block ensures execution flow continues even if view.remove() fails
+          // This allows us to safely exit the function while still propagating any errors
+        }
+        view.insert(name, getDataArray(dataArg))


question: I've never seen this pattern before. Can you help me understand how the code gets executed in the instance that view.remove(name, truthy) throws an error?

I just copied this pattern from another place in this file, but decided to revert it since it causes some issues with add_rows. But the pattern actually doesn't make a lot of sense :) I changed the other usage to a try/catch.
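
To make the control flow in question concrete, here is a small standalone comparison. `failingRemove` is a hypothetical stand-in for a call such as `view.remove(...)` that throws; nothing here is taken from the actual hook.

```typescript
// Hypothetical operation that fails, standing in for a call like view.remove().
function failingRemove(): void {
  throw new Error("dataset not found")
}

const steps: string[] = []

function withFinally(): void {
  try {
    failingRemove()
  } finally {
    // Runs whether or not the try block throws.
    steps.push("finally ran")
  }
  // Not reached when failingRemove throws: the error keeps propagating.
  steps.push("insert after try/finally")
}

function withCatch(): void {
  try {
    failingRemove()
  } catch {
    // The error is swallowed here, so execution continues below.
  }
  steps.push("insert after try/catch")
}

let propagated = false
try {
  withFinally()
} catch {
  propagated = true
}
withCatch()
```

With try/finally and no catch, the statement after the block is skipped and the caller sees the error. With try/catch, the error is swallowed and the following statement still runs.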

lukasmasuch · 2025-09-18T07:18:05Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/ArrowVegaLiteChart.tsx

+    void updateView(data, datasets)
+
+    // We only want to update the view if the data or datasets change.
+    // updateView isn't stable because its updated via the isCreatingView flag.
+    // With updateView as dependency, the chart seems to
+    // expand within the parent container (less left/right padding).
+
+    // eslint-disable-next-line react-hooks/react-compiler
+    // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [data, datasets])


Having updateView as a dependency is what causes the difference in snapshots. I think what happens is that updateView gets called an additional time, which recalculates the chart so that it better fills the available space.
But I'm ignoring updateView here for this PR to keep the status quo on how charts are rendered.

sfc-gh-lwilby · 2025-09-18T17:16:06Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

-          // The finally block ensures execution flow continues even if view.remove() fails
-          // This allows us to safely exit the function while still propagating any errors
+        } catch {
+          // The dataset was already removed, so we do nothing


Does this change mean that the errors will not be propagated now? Is that an intended effect of the change?

Yeah, but I think logging errors here doesn't really matter -> if the dataset doesn't exist, it's fine to just not do anything here. In the current version, if the dataset is missing, it would likely result in some kind of frontend error.

sfc-gh-lwilby · 2025-09-18T17:27:22Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

+
+    // Initialize the data and datasets refs with the current data and datasets
+    // This is predominantly used to handle the case where we want to reference
+    // these in createView before the first render.


Should this comment say updateView? It seems the prevDataRef is used in updateView.

I removed the comment since it's not fully correct and more confusing than helpful. And correct, prevDataRef is only used in updateView. Regarding the race condition, see the comment below.

sfc-gh-lwilby · 2025-09-18T17:28:50Z

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

-  // these in createView before the first render.
+  // Keep latest refs in sync and initialize previous refs before first view
  useEffect(() => {
+    latestDataRef.current = data


Is the race condition that the ref was not updated when createView was called because vegaView.current was sometimes null when this block was called?

I think the issue was that if you had too many quick reruns, it could happen that it was trying to access a dataset that was already removed and/or update a dataset that wasn't fully added yet. The change enforces that no update can happen during creation and ensures that we are always using the actual latest data for creating the view. However, the code is a bit complex mainly because of our add_rows support which I'm hoping that we can deprecate at some point (it has an extremely low adoption, but adds a lot of complexity).
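
The two guarantees described above (no update during creation, and creation always reading the latest data) might look roughly like this when reduced to plain TypeScript. This is a sketch under assumptions: the module-level variables stand in for the hook's refs, `fakeEmbed` replaces the real asynchronous embed call, and `updateView` returning a boolean is purely for illustration.

```typescript
type Datasets = Record<string, number[]>

// Hypothetical module-level state standing in for the hook's refs.
const latestDatasetsRef: { current: Datasets } = { current: {} }
let isCreatingView = false
let viewData: Datasets | null = null

// Stand-in for the asynchronous embed call; it yields once before resolving.
async function fakeEmbed(): Promise<void> {
  await Promise.resolve()
}

async function createView(): Promise<Datasets> {
  isCreatingView = true
  try {
    await fakeEmbed()
    // Read the ref only after the await so the newest datasets are used.
    const initial = { ...latestDatasetsRef.current }
    viewData = initial
    return initial
  } finally {
    // Always clear the flag, even if embedding throws.
    isCreatingView = false
  }
}

// Returns false when the update is skipped because no view is ready yet.
function updateView(datasets: Datasets): boolean {
  if (isCreatingView || viewData === null) return false
  viewData = { ...datasets }
  return true
}

latestDatasetsRef.current = { source: [1] }
const pending = createView()

// A rerun arrives while the view is still being created.
latestDatasetsRef.current = { source: [1, 2] }
const appliedDuringCreate = updateView(latestDatasetsRef.current)
```

In this sketch the update that arrives mid-creation is skipped, but nothing is lost: once `pending` resolves, the view has been initialized from the latest datasets rather than from the ones that were current when creation started.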

That added a lot of complexity for adding width/height as well.

OK, so using the two refs is more just for clarity it seems?

There are two parts to this. We are using refs to keep the callbacks stable; otherwise we could just use the passed-in dataset parameter, but that creates some other issues since the callback identity would change. And the added pair of refs is needed to fully fix #11911, since there was a second, less frequent race condition that was caused by this line: https://github.com/streamlit/streamlit/pull/11911/files#diff-827a3eedfdd50708b3577301ac8e982adc7aaedbfc5f735e6967de2dcc6b2657R149

I haven't fully analyzed how this race condition actually happens in detail. I hope this resolves it without any side effects, but let's see

Fix vega chart unknown data issue

340500b

lukasmasuch changed the title ~~Fix vega chart unknown data issue~~ [WIP] Fix vega chart unknown data issue Jul 10, 2025

lukasmasuch requested a review from Copilot July 10, 2025 21:16

Copilot AI reviewed Jul 10, 2025

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

Add example

f3e479a

itsToggle mentioned this pull request Jul 29, 2025

Empty Altair Plots (Unrecognized Dataset) #11342

Closed

4 tasks

lukasmasuch changed the title ~~[WIP] Fix vega chart unknown data issue~~ [WIP] Fix vega chart unrecognized data set error Sep 17, 2025

lukasmasuch and others added 2 commits September 17, 2025 19:16

Merge branch 'develop' into fix/altair-unknown-data-issue

acac544

Update vega embed hook

072838b

lukasmasuch commented Sep 17, 2025

lukasmasuch added 3 commits September 17, 2025 20:23

Update

701f94b

Remove script

ca9127d

Update more tests

3985582

lukasmasuch changed the title ~~[WIP] Fix vega chart unrecognized data set error~~ Fix vega chart unrecognized data set error Sep 17, 2025

lukasmasuch added 3 commits September 17, 2025 20:44

Fix

eedbccd

Update

10cad34

Improve typing

2360de1

lukasmasuch requested a review from Copilot September 17, 2025 17:49

lukasmasuch added the type:bug, security-assessment-completed, and impact:users labels Sep 17, 2025

lukasmasuch marked this pull request as ready for review September 17, 2025 17:50

Copilot AI reviewed Sep 17, 2025

frontend/lib/src/components/elements/ArrowVegaLiteChart/useVegaEmbed.ts

lukasmasuch commented Sep 17, 2025

lukasmasuch added the change:bugfix label and removed the type:bug label Sep 17, 2025

lukasmasuch added 2 commits September 17, 2025 21:05

Add try finally

3e700f7

Improve comment

4befded

lukasmasuch changed the title ~~Fix vega chart unrecognized data set error~~ Fix vega chart unrecognized dataset error Sep 17, 2025

Update snapshots

e275693

lukasmasuch commented Sep 17, 2025

sfc-gh-nbellante reviewed Sep 17, 2025

lukasmasuch added 6 commits September 18, 2025 01:29

Revert

269d198

Fix test

1de61a1

Ignore updateView for dependency update

973112b

Update snapshots

c42da13

Update pattern

e38ce6a

Update comment

e694857

lukasmasuch commented Sep 18, 2025

Merge branch 'develop' into fix/altair-unknown-data-issue

b79fd4d

sfc-gh-lwilby reviewed Sep 18, 2025

sfc-gh-lwilby approved these changes Sep 18, 2025

lukasmasuch merged commit c4812ba into develop Sep 18, 2025
37 checks passed

lukasmasuch deleted the fix/altair-unknown-data-issue branch September 18, 2025 19:29
