G-27: Read The Chart Data Semantic Comprehension Data Generator

Generates synthetic datasets for training and evaluating vision models on chart reading and data comprehension tasks. Each sample contains a data table where specific values must be identified and highlighted based on semantic queries.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

Property     Value
Task ID      G-27
Task         Read The Chart Data Semantic Comprehension
Category     Knowledge
Resolution   1024×1024 px
FPS          16 fps
Duration     ~3 seconds
Output       PNG images + MP4 video

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-27_read_the_chart_data_semantic_comprehension_data-generator.git
cd G-27_read_the_chart_data_semantic_comprehension_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

Argument        Description
--num-samples   Number of tasks to generate (required)
--output        Output directory (default: data/questions)
--seed          Random seed for reproducibility
--no-videos     Skip video generation (images only)

📖 Task Example

Prompt

The scene shows a sales table with Class as rows and Goods as columns. Find the minimum value within the row corresponding to the Class 'Class 4' and draw a red rectangular border around the corresponding cell to highlight it.

Visual

Initial Frame: Sales table with Class rows and Goods columns
Animation: Red border appears around the target cell
Final Frame: Minimum value in the 'Class 4' row highlighted

📖 Task Description

Objective

Read and comprehend data tables to find specific values based on semantic queries (e.g., maximum in a specific row), then highlight the target cell.

Task Setup

  • Data structure: Tables with rows and columns (e.g., Area × Product sales data)
  • Query types: Find maximum/minimum values within specific rows or columns
  • Row/column labels: Semantic names (e.g., "West", "Product A")
  • Highlight method: Red rectangular border around target cell
  • Data values: Numerical data in table cells
  • Background: White with clear table structure
  • Goal: Identify correct cell based on query and highlight it
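
The cell selection described in the setup above can be sketched in a few lines of Python. This is only an illustration of the query logic, not the generator's actual code: the function name `find_target_cell`, its parameters, the sample table, and the tie-breaking rule (first match wins) are all assumptions made for this example.

```python
from typing import List, Tuple


def find_target_cell(
    values: List[List[float]],
    row_labels: List[str],
    col_labels: List[str],
    axis: str,   # "row" or "column" (hypothetical encoding of the query)
    label: str,  # semantic label named in the prompt, e.g. "Class 4"
    mode: str,   # "min" or "max"
) -> Tuple[int, int]:
    """Return (row_index, col_index) of the cell the query points at."""
    if mode not in ("min", "max"):
        raise ValueError(f"unsupported mode: {mode!r}")
    if axis not in ("row", "column"):
        raise ValueError(f"unsupported axis: {axis!r}")
    pick = min if mode == "min" else max

    if axis == "row":
        r = row_labels.index(label)
        # Search across the columns of the selected row.
        c = pick(range(len(col_labels)), key=lambda j: values[r][j])
    else:
        c = col_labels.index(label)
        # Search across the rows of the selected column.
        r = pick(range(len(row_labels)), key=lambda i: values[i][c])
    return r, c


# Hypothetical table modelled on the example prompt (values are made up).
row_labels = ["Class 1", "Class 2", "Class 3", "Class 4"]
col_labels = ["Goods A", "Goods B", "Goods C"]
values = [
    [120, 85, 97],
    [64, 142, 110],
    [133, 76, 58],
    [91, 47, 105],
]

target = find_target_cell(values, row_labels, col_labels, "row", "Class 4", "min")
print(target)  # (3, 1): the 47 in row 'Class 4', column 'Goods B'
```

With ties, `min` and `max` return the first matching index; whether the real generator avoids ties altogether is not stated here.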

Key Features

  • Semantic data comprehension (understanding row/column meanings)
  • Maximum/minimum value identification within constraints
  • Table structure reading and navigation
  • Visual highlighting with red border annotation
  • Tests data literacy and reading comprehension
  • Various table configurations (different dimensions and labels)

📦 Data Format

data/questions/read_the_chart_data_semantic_comprehension_task/read_the_chart_data_semantic_comprehension_00000000/
├── first_frame.png      # Data table without highlighting
├── final_frame.png      # Table with target cell highlighted
├── prompt.txt           # Query instruction specifying search criteria
├── ground_truth.mp4     # Animation of highlight appearing
└── question_metadata.json # Task metadata
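
A quick way to sanity-check a generated dataset is to confirm that every sample folder contains the files listed above. The sketch below uses only the standard library; the helper names `missing_files` and `read_prompt` are hypothetical, and it assumes that `--no-videos` simply omits `ground_truth.mp4`. The schema of `question_metadata.json` is not documented here, so the sketch does not parse it.

```python
from pathlib import Path
from typing import List

# File names taken from the layout shown above.
EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)


def missing_files(sample_dir: Path, expect_video: bool = True) -> List[str]:
    """List the expected files that are absent from one sample folder."""
    names = [
        name for name in EXPECTED_FILES
        if expect_video or name != "ground_truth.mp4"
    ]
    return [name for name in names if not (sample_dir / name).is_file()]


def read_prompt(sample_dir: Path) -> str:
    """Return the query text stored in prompt.txt."""
    return (sample_dir / "prompt.txt").read_text(encoding="utf-8").strip()


# Default output location from the command-line options; adjust if --output was used.
root = Path("data/questions/read_the_chart_data_semantic_comprehension_task")
if root.is_dir():
    for sample in sorted(p for p in root.iterdir() if p.is_dir()):
        gaps = missing_files(sample)
        if gaps:
            print(f"{sample.name}: missing {', '.join(gaps)}")
```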

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps
  • Duration: ~3 seconds
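
As a rough check on the video specification, 16 fps over about 3 seconds works out to roughly 48 frames. The helper below is a hypothetical sketch; since the duration is only approximate, the half-second tolerance is an assumption rather than a documented bound.

```python
FPS = 16
DURATION_S = 3.0  # approximate, per the specification above

expected_frames = round(FPS * DURATION_S)
print(expected_frames)  # 48


def frame_count_plausible(n_frames: int, fps: int = FPS,
                          duration_s: float = DURATION_S,
                          tol_s: float = 0.5) -> bool:
    """True if n_frames is within tol_s seconds of the nominal duration."""
    return abs(n_frames / fps - duration_s) <= tol_s
```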

🏷️ Tags

logic-symbols data-comprehension table-reading semantic-reasoning data-analysis visual-highlighting
