Visual Navigation with Spatial Attention

By Bar Mayo, Tamir Hazan and Ayellet Tal (CVPR 2021).


Figure: (a) Paths; (b) Our agent's view; (c) Our attention; (d) SAVN agent's view.

We present a novel attention probability model for visual navigation tasks. This attention encodes semantic information about observed objects as well as spatial information about where they are located. Combining the "what" and the "where" allows the agent to navigate effectively toward the sought-after object. In the figure above, (a) shows a top view of a living room in which the agent aims to find a TV (red rectangle), starting from a given location (black circle). Our agent's path is marked in orange and SAVN's path in magenta. At each step, the agent receives a specific view that depends on its position. In this example, our agent starts by turning around at its starting location to gather information, a strategy it has learned. (b) shows our agent's view before its first move forward, whereas (d) shows SAVN's view before its first move forward.

Citing

If you find this project useful in your research, please consider citing:

@misc{mayo2021visual,
      title={Visual Navigation with Spatial Attention}, 
      author={Bar Mayo and Tamir Hazan and Ayellet Tal},
      year={2021},
      eprint={2104.09807},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Model	SPL ≥ 1	Success ≥ 1	SPL ≥ 5	Success ≥ 5
Our (A2C)	17.88	46.20	15.94	32.63
Our (A3C)	16.99	43.20	15.51	31.71
SAVN	16.15	40.86	13.91	28.70
Scene Priors	15.47	35.13	11.37	22.25
Non-Adaptive A3C	14.68	33.04	11.69	21.44
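
For readers unfamiliar with the metrics, here is a minimal sketch of how success rate and SPL (Success weighted by Path Length) are commonly computed. It assumes that the "≥ 1" and "≥ 5" columns restrict evaluation to episodes whose shortest path length is at least 1 or 5, following the convention used by SAVN; the Episode class and its field names are hypothetical and are not taken from this repository.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical per-episode record; not the repository's own data structure."""
    success: bool        # did the agent reach the target?
    shortest_len: float  # length L of the optimal path for this episode
    agent_len: float     # length P of the path the agent actually took


def success_and_spl(episodes, min_len=1.0):
    """Return (success rate, SPL) in percent over episodes with L >= min_len.

    Assumes min_len > 0, so the ratio L / max(P, L) is always well defined.
    """
    kept = [e for e in episodes if e.shortest_len >= min_len]
    if not kept:
        return 0.0, 0.0
    n = len(kept)
    success = sum(e.success for e in kept) / n
    # Failed episodes contribute 0; successful ones contribute L / max(P, L).
    spl = sum(
        e.shortest_len / max(e.agent_len, e.shortest_len)
        for e in kept
        if e.success
    ) / n
    return 100.0 * success, 100.0 * spl


episodes = [
    Episode(success=True, shortest_len=4.0, agent_len=8.0),    # twice the optimal length
    Episode(success=False, shortest_len=6.0, agent_len=10.0),  # failure
    Episode(success=True, shortest_len=2.0, agent_len=2.0),    # optimal path
]
print(success_and_spl(episodes, min_len=1))
print(success_and_spl(episodes, min_len=5))
```

Because SPL discounts each success by how much longer the agent's path was than the optimal one, it can never exceed the success rate, which matches the pattern in the table above.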

Setup

Clone the repository with git clone https://github.com/barmayo/spatial_attention.git && cd spatial_attention.
Install the necessary packages. If you are using conda, simply run conda create --name <env_name> --file requirements.txt, replacing <env_name> with a name of your choice.
Download the pretrained models and data to the repository root directory (referred to as eotp in this README). Untar with

tar -xzf pretrained_models.tar.gz
tar -xzf data.tar.gz

The data folder contains:

thor_offline_data, which is organized into sub-folders, each of which corresponds to a scene in AI2-THOR. For each room we provide the ResNet-18 features of all possible locations, along with metadata and a NetworkX graph of the possible navigations in the scene.
thor_glove, which contains the GloVe embeddings for the navigation targets.
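
After extracting the archives, a quick sanity check of the layout described above can catch a misplaced download early. The following is a minimal sketch, assuming the archive extracts into a folder named data in the current directory; the helper names missing_data_dirs and list_scenes are hypothetical and are not part of this repository.

```python
from pathlib import Path

# Sub-folder names come from this README; the top-level folder name "data"
# is an assumption about where data.tar.gz extracts.
EXPECTED_SUBDIRS = ("thor_offline_data", "thor_glove")


def missing_data_dirs(data_root="data"):
    """Return the expected sub-folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


def list_scenes(data_root="data"):
    """Return the sorted names of the scene sub-folders in thor_offline_data."""
    offline = Path(data_root) / "thor_offline_data"
    if not offline.is_dir():
        return []
    return sorted(p.name for p in offline.iterdir() if p.is_dir())


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print(f"Found {len(list_scenes())} scene folders")
```

The check only inspects folder names and does not open any feature, metadata, or graph files, since their exact file names are not stated here.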

Note that the starting positions and scenes for the test and validation set may be found in test_val_split.

If you wish to access the RGB images in addition to the ResNet features, replace thor_offline_data with thor_offlline_data_with_images. If you wish to run your model on the image files, add the command line argument --images_file_name images.hdf5.

Evaluation using Pretrained Models

Use the following commands to run the pretrained models on the test set. Add the argument --gpu-ids 0 1 to speed up the evaluation by using GPUs.

Our (A2C)

python main.py --eval \
    --test_or_val test \
    --episode_type TestValEpisode \
    --load_model pretrained_models/EOTP_A2C_alpha_SE_66502810_4500000_2020-09-28_09:06:40.dat \
    --model EOTP \
    --results_json eotp_a2c_test.json 

cat eotp_a2c_test.json

Our (A3C)

python main.py --eval \
    --test_or_val test \
    --episode_type TestValEpisode \
    --load_model pretrained_models/EOTP_final_75614446_5000000_2020-10-09_16:54:35.dat \
    --model EOTP \
    --results_json eotp_a3c_test.json 

cat eotp_a3c_test.json
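
Instead of reading the two result files with cat, they can be compared side by side. The sketch below assumes that each results file is a JSON object whose top-level entries include numeric metrics; it makes no assumption about the metric names, and the helper names load_results and compare_results are hypothetical.

```python
import json
from pathlib import Path


def load_results(path):
    """Load a results file written via --results_json (assumed to be a JSON object)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compare_results(a, b):
    """Return {key: (a_value, b_value)} for numeric top-level keys present in both."""
    shared = {}
    for key in sorted(a.keys() & b.keys()):
        if is_number(a[key]) and is_number(b[key]):
            shared[key] = (a[key], b[key])
    return shared


if __name__ == "__main__":
    a2c_path = Path("eotp_a2c_test.json")
    a3c_path = Path("eotp_a3c_test.json")
    if a2c_path.exists() and a3c_path.exists():
        rows = compare_results(load_results(a2c_path), load_results(a3c_path))
        for key, (a2c_value, a3c_value) in rows.items():
            print(f"{key:>24}  A2C={a2c_value:.4f}  A3C={a3c_value:.4f}")
    else:
        print("Run both evaluation commands first to produce the JSON files.")
```

Non-numeric or nested entries are skipped, so the script should be adapted if the files store metrics in a nested structure.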

How to Train your models

You may train your own models by using the commands below.

Training

python main.py \
    --title eotp_train \
    --model EOTP \
    --gpu-ids 0 1 \
    --workers 12

How to Evaluate your Trained Model

You may use the following commands for evaluating models you have trained.

Evaluate

python full_eval.py \
    --title EOTP \
    --model EOTP \
    --results_json eotp_results.json \
    --gpu-ids 0 1
    
cat eotp_results.json

Acknowledgement

Our code is based on the SAVN implementation. Please cite the original SAVN work if you use their part of the code.
