DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation (ECCV 2024)

Abstract

Enabling machines to understand structured visuals like slides and user interfaces is essential for making them accessible to people with disabilities. However, achieving such understanding computationally has required manual data collection and annotation, which is time-consuming and labor-intensive. To overcome this challenge, we present a method to generate synthetic, structured visuals with target labels using code generation. Our method allows people to create datasets with built-in labels and train models with a small number of human-annotated examples. We demonstrate performance improvements in three tasks for understanding slides and UIs: recognizing visual elements, describing visual content, and classifying visual content types.

Resources

Datasets and Models: [Google Drive Original Dataset] [Hugging Face Collection]
Examples of Model's I/O: [Project Website] [DreamStruct Prompts and Examples] [ArXiv]

Citation

@inproceedings{peng2024dreamstruct,
  title={DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation},
  author={Peng, Yi-Hao and Huq, Faria and Jiang, Yue and Wu, Jason and Li, Amanda Xin Yue and Bigham, Jeffrey and Pavel, Amy},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2024}
}
