[Caution: This repository is still under development and not cleanly documented yet. We recommend using it only as a reference.]
In this repository, we build our Global-Local-Transformer model on top of a selection of base scene graph generator models, including KERN, Neural Motifs, and Stanford (iterative message passing), to improve scene graph generation by leveraging visual commonsense.
The corresponding paper was accepted at ECCV 2020 (arXiv preprint arXiv:2006.09623): Alireza Zareian*, Zhecan Wang*, Haoxuan You*, Shih-Fu Chang, "Learning Visual Commonsense for Robust Scene Graph Generation", ECCV, 2020. (* co-first authors) [manuscript]
For pretraining and independent finetuning of GLAT, please refer to another repository: https://github.com/ZhecanJamesWang/GLAT_Visual_Commonsense
Tianshui Chen*, Weihao Yu*, Riquan Chen, and Liang Lin, “Knowledge-Embedded Routing Network for Scene Graph Generation”, CVPR, 2019. (* co-first authors) [manuscript]
Rowan Zellers, Mark Yatskar, Sam Thomson, and Yejin Choi, "Neural Motifs: Scene Graph Parsing with Global Context", CVPR, 2018.
Danfei Xu, Yuke Zhu, Christopher B. Choy, and Li Fei-Fei, "Scene Graph Generation by Iterative Message Passing", CVPR, 2017.
We evaluate with Recall@X (R@X) and mean Recall@X (mR@X). In the validation/test dataset, assume there are $Y$ images. For each image, a model generates the top $X$ predicted relationship triplets. For image $I_y$, there are $G_y$ ground truth relationship triplets, of which $T_y$ are predicted successfully by the model. We can calculate:

$$R@X = \frac{1}{Y} \sum_{y=1}^{Y} \frac{T_y}{G_y}$$

For image $I_y$, among its $G_y$ ground truth relationship triplets, there are $G_{y,k}$ ground truth triplets with relationship $k$ (except $k=1$, meaning no relationship; the number of relationship classes is $K$, including no relationship), of which $T_{y,k}$ are predicted successfully by the model. Among the $Y$ images of the validation/test dataset, for relationship $k$, there are $Y_k$ images which contain at least one ground truth triplet with this relationship. The R@X of relationship $k$ can be calculated:

$$R@X_k = \frac{1}{Y_k} \sum_{y:\, G_{y,k} > 0} \frac{T_{y,k}}{G_{y,k}}$$

Then we can calculate:

$$mR@X = \frac{1}{K-1} \sum_{k=2}^{K} R@X_k$$
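
Below is a minimal sketch of how these metrics can be computed, assuming each image's ground truth and predictions are available as (subject, relationship, object) triplets. The function name `recall_at_x` and the data layout are hypothetical and do not correspond to the actual evaluation code in this repository:

```python
from collections import defaultdict
import numpy as np


def recall_at_x(gt_triplets_per_image, pred_triplets_per_image, x=50):
    """Compute R@X and mR@X (hypothetical sketch, not this repo's evaluator).

    gt_triplets_per_image:   list (one entry per image) of sets of
                             (subject, relationship, object) ground truth triplets
    pred_triplets_per_image: list (one entry per image) of ranked lists of
                             predicted (subject, relationship, object) triplets
    """
    per_image_recall = []               # T_y / G_y for every image
    per_rel_recall = defaultdict(list)  # k -> [T_{y,k} / G_{y,k} for images containing k]

    for gt, pred in zip(gt_triplets_per_image, pred_triplets_per_image):
        top_x = set(pred[:x])           # keep only the top-X predictions
        if gt:
            hits = gt & top_x           # T_y: ground truth triplets recovered
            per_image_recall.append(len(hits) / len(gt))

        # Group ground truth triplets by their relationship class k
        gt_by_rel = defaultdict(set)
        for triplet in gt:
            gt_by_rel[triplet[1]].add(triplet)
        for k, gt_k in gt_by_rel.items():
            hits_k = gt_k & top_x       # T_{y,k}
            per_rel_recall[k].append(len(hits_k) / len(gt_k))

    r_at_x = float(np.mean(per_image_recall))
    # R@X_k is averaged over the Y_k images containing relationship k;
    # mR@X then averages R@X_k over the relationship classes that occur in the
    # ground truth (the "no relationship" class never appears there).
    r_at_x_per_rel = {k: float(np.mean(v)) for k, v in per_rel_recall.items()}
    m_r_at_x = float(np.mean(list(r_at_x_per_rel.values())))
    return r_at_x, m_r_at_x
```

In practice, a predicted triplet is usually counted as a hit only if its subject and object boxes also match the ground truth boxes above an IoU threshold; that box-matching step is omitted here for brevity.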