# [SenSys’25] SensorQA: A Question Answering Benchmark for Daily-Life Monitoring [Paper]
SensorQA is a question answering dataset designed for training models to understand wearable sensor readings and answer natural-language questions about them.
To download the annotation files, clone this repository:

```bash
git clone https://github.com/benjamin-reichman/SensorQA/
```
The annotation files can be found in:

```
SensorQA/overall_sensorqa_dataset_train.json
SensorQA/overall_sensorqa_dataset_train_em.json
SensorQA/overall_sensorqa_dataset_val.json
SensorQA/overall_sensorqa_dataset_val_em.json
```
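A minimal sketch of loading one of the annotation splits in Python; inspect a printed record to learn the actual question/answer schema before building a pipeline around it:

```python
import json

# Load the SensorQA training annotations.
with open("SensorQA/overall_sensorqa_dataset_train.json") as f:
    train_data = json.load(f)

# Print one record to inspect the actual question/answer schema.
print(type(train_data), len(train_data))
first = train_data[0] if isinstance(train_data, list) else next(iter(train_data.values()))
print(first)
```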
The graphical visualizations of the sensor readings can be found in:

```
SensorQA/non_oracle_graphs
SensorQA/oracle_graphs
```
Our dataset uses the sensor readings and features from the ExtraSensory dataset, available at http://extrasensory.ucsd.edu/. ExtraSensory contains sensor measurements from 60 users, each identified by a unique uuid. The mapping between these uuids and the user ids used in the SensorQA graphs (0 to 59) can be found in `es_user_id.py`.
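As a hedged illustration, the snippet below shows how such a mapping might be used; the actual name and structure of the mapping object inside `es_user_id.py` may differ, so check that file:

```python
# Hypothetical usage of the uuid mapping; the real variable name and structure
# in es_user_id.py may differ, so open the file to confirm.
from es_user_id import USER_ID_TO_UUID  # assumed: {sensorqa_user_id: extrasensory_uuid}

sensorqa_user = 7
print(f"SensorQA user {sensorqa_user} -> ExtraSensory uuid {USER_ID_TO_UUID[sensorqa_user]}")
```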
Baseline results on SensorQA (text generation metrics):

| Modalities | Backbone Model | ZS/FT | Oracle | Rouge-1 | Rouge-2 | Rouge-L | Meteor | Bleu | Exact Match |
|---|---|---|---|---|---|---|---|---|---|
| L | T5-Base | FT | N/A | 0.71 | 0.55 | 0.69 | 0.70 | 0.43 | 0.26 |
| L | Llama-7B-LORA | FT | N/A | 0.72 | 0.62 | 0.72 | 0.72 | 0.38 | 0.04 |
| V+L | Llama-7B-Adapter | ZS | ✔ | 0.33 | 0.20 | 0.30 | 0.44 | 0.09 | 0 |
| V+L | Llama-7B-Adapter | FT | ✔ | 0.73 | 0.57 | 0.71 | 0.72 | 0.43 | 0.14 |
| V+L | Llava-1.5-LORA | FT | ✔ | 0.62 | 0.46 | 0.60 | 0.58 | 0.35 | 0.13 |
| V+L | Llama-7B-Adapter | ZS | ✘ | 0.09 | 0.42 | 0.31 | 0.19 | 0.28 | 0 |
| V+L | Llama-7B-Adapter | FT | ✘ | 0.43 | 0.72 | 0.73 | 0.57 | 0.70 | 0.14 |
| V+L | Llava-1.5-LORA | FT | ✘ | 0.64 | 0.47 | 0.61 | 0.60 | 0.35 | 0.11 |
| S+L | Llama-7B-Adapter-HC | FT | N/A | 0.72 | 0.55 | 0.70 | 0.71 | 0.42 | 0.14 |
| S+L | Llama-7B-Adapter-CLIP | FT | N/A | 0.71 | 0.53 | 0.69 | 0.69 | 0.40 | 0.12 |

Baseline results on SensorQA (answer accuracy):

| Modalities | Backbone Model | ZS/FT | Oracle | Accuracy |
|---|---|---|---|---|
| L | T5-Base | FT | N/A | 25.4% |
| L | Llama-7B-LORA | FT | N/A | 26.5% |
| V+L | Llama-7B-Adapter | ZS | ✔ | 0% |
| V+L | Llama-7B-Adapter | FT | ✔ | 28% |
| V+L | Llava-1.5-LORA | FT | ✔ | 21.5% |
| V+L | Llama-7B-Adapter | ZS | ✘ | 0% |
| V+L | Llama-7B-Adapter | FT | ✘ | 26.2% |
| V+L | Llava-1.5-LORA | FT | ✘ | 11% |
| S+L | Llama-7B-Adapter-CLIP | FT | N/A | 23.5% |
| S+L | Llama-7B-Adapter | FT | N/A | 24.8% |
| S+L | DeepSQA | FT | N/A | 27.46% |
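The generation metrics in the first table (ROUGE, METEOR, BLEU, exact match) can be computed with the Hugging Face `evaluate` package. The snippet below is a generic sketch of that computation with placeholder strings, not the repository's own evaluation script:

```python
import evaluate

# Placeholder predictions/references; in practice these come from the model
# outputs and the SensorQA ground-truth answers.
predictions = ["the user slept about seven hours"]
references = ["the user slept for about seven hours"]

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bleu = evaluate.load("bleu")

scores = {
    **rouge.compute(predictions=predictions, references=references),
    **meteor.compute(predictions=predictions, references=references),
    "bleu": bleu.compute(predictions=predictions, references=references)["bleu"],
    "exact_match": sum(p.strip().lower() == r.strip().lower()
                       for p, r in zip(predictions, references)) / len(predictions),
}
print(scores)
```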
To reproduce the T5 baseline:

```bash
python question_only.py
python t5_text_evaluation.py
```
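For orientation, here is a minimal sketch of what a question-only T5 baseline looks like with Hugging Face `transformers`; it is not `question_only.py` itself, and the checkpoint and prompt are illustrative:

```python
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# Question-only baseline: the model sees only the question text, with no
# sensor or graph input.
tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

question = "How many hours did the user sleep on Tuesday?"  # placeholder question
inputs = tokenizer("question: " + question, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```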
To reproduce the Llama-7B-LORA experiment:

```bash
python llama_lora_training.py --train
python llama_lora_training.py --eval
python llama_text_evaluation.py
```
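For reference, a hedged sketch of a typical LoRA setup with the `peft` library; the checkpoint id and LoRA hyperparameters may not match what `llama_lora_training.py` uses:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Illustrative LoRA wrapping of a Llama-7B checkpoint.
model_name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=16,                                  # LoRA rank
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```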
To reproduce the Llama-7B-Adapter vision+language fine-tuning, adjust the config in `LLaMA-Adapter/llama_adapter_v2_multimodal7b/exps/finetune-data-config.yaml` to point to the file you want to train on.
You will need both the LLaMA weights and the adapter weights.
For further instructions on using the LLaMA-Adapter library, see: https://github.com/OpenGVLab/LLaMA-Adapter/blob/main/llama_adapter_v2_multimodal7b/docs/train.md
Then do the following:

```bash
cd LLaMA-Adapter/llama_adapter_v2_multimodal7b
./exps/finetune.sh models/llama LLaMA-Adapter/ckpts/7fa55208379faf2dd862565284101b0e4a2a72114d6490a95e432cf9d9b6c813_BIAS-7B.pth exps/finetune-data-config.yaml outputs
./exps/finetune.sh models/llama LLaMA-Adapter/ckpts/d26d107eec32127ac86ef1997cf7169de1c56a59c539fc1258c6798b969e289c_LORA-BIAS-7B-v21.pth exps/finetune-data-config.yaml outputs
python llama_adapter_val_loop.py
python llama_text_evaluation.py
```
To reproduce the Llava-1.5-LORA results:

```bash
accelerate launch --mixed_precision fp16 llama_lora_train.py \
  --dataset_name="HuggingFaceH4/llava-instruct-mix-vsft" \
  --model_name_or_path="llava-hf/llava-1.5-7b-hf" \
  --report_to="none" \
  --learning_rate=2e-5 \
  --per_device_train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --output_dir="data/vsft-llava-1.5-7b-hf" \
  --num_train_epochs=4 \
  --gradient_checkpointing \
  --remove_unused_columns=False \
  --torch_dtype=float16 \
  --fp16=True \
  --use_peft=True \
  --lora_r=64 \
  --lora_alpha=16 \
  --lora_target_modules=all-linear \
  --log_level="info" \
  --logging_strategy="steps" \
  --logging_steps=1
python sensorqa_llava_eval.py
```

Most of these parameters are hard-coded in the script itself, so to change them, edit them there.
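If you point the script at SensorQA instead of the default dataset, each example generally needs to be expressed in the conversational vision-SFT format that command expects. The sketch below shows that general shape; the SensorQA field names (`question`, `answer`, `graph_file`) are assumptions about the annotation schema:

```python
from PIL import Image

def to_llava_example(record, graph_dir="SensorQA/oracle_graphs"):
    """Convert one SensorQA record into a chat-style vision-SFT example.

    The field names ("question", "answer", "graph_file") are assumptions about
    the annotation schema; adjust them to the actual JSON keys.
    """
    image = Image.open(f"{graph_dir}/{record['graph_file']}")
    messages = [
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": record["question"]},
        ]},
        {"role": "assistant", "content": [
            {"type": "text", "text": record["answer"]},
        ]},
    ]
    return {"images": [image], "messages": messages}
```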
To reproduce the Llama-7B-Adapter S+L results with handcrafted features, first download the handcrafted features from: http://extrasensory.ucsd.edu/data/primary_data_files/ExtraSensory.per_uuid_features_labels.zip
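The zip contains one gzipped CSV per user, named by the ExtraSensory uuid. Below is a generic pandas sketch for inspecting one file; the extraction path and uuid are placeholders:

```python
import pandas as pd

# Each user's file is a gzipped CSV named by their ExtraSensory uuid; the
# extraction path and uuid below are placeholders.
uuid = "REPLACE_WITH_UUID"
df = pd.read_csv(f"ExtraSensory.per_uuid_features_labels/{uuid}.features_labels.csv.gz")

# By the ExtraSensory convention, columns split into a timestamp, handcrafted
# sensor features, and "label*" context/activity columns.
label_cols = [c for c in df.columns if c.startswith("label")]
feature_cols = [c for c in df.columns if c not in label_cols and c != "timestamp"]
print(df.shape, len(feature_cols), len(label_cols))
```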
Then, to train the Llama-7B-Adapter S+L model, first install the custom version of timm included in this repository. This custom version adds attention-mask support to vision models, which is used to mask the padding introduced when processing sensor features.
To install it:

```bash
cd pytorch-image-models
pip install -e .
```
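Conceptually, the attention-mask support lets the vision backbone ignore padded positions in the sensor-feature token sequence. The toy PyTorch sketch below illustrates that idea; it is not the actual timm modification:

```python
import torch
import torch.nn.functional as F

# Toy example: two sensor-feature sequences padded to the same length, where
# attention is prevented from looking at the padded positions.
batch, seq_len, dim = 2, 6, 16
tokens = torch.randn(batch, seq_len, dim)
valid_lengths = torch.tensor([6, 4])  # second sample has 2 padded positions

pad_mask = torch.arange(seq_len)[None, :] >= valid_lengths[:, None]  # True = padding
allowed = ~pad_mask[:, None, None, :]  # broadcast over heads and query positions

q = k = v = tokens[:, None]  # add a single head dimension for the toy example
out = F.scaled_dot_product_attention(q, k, v, attn_mask=allowed)
print(out.shape)  # (batch, 1, seq_len, dim)
```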
Then, to train:

```bash
cd LLaMA-Adapter/llama_adapter_v2_multimodal7b_sensors
./exps/finetune.sh models/llama LLaMA-Adapter/ckpts/7fa55208379faf2dd862565284101b0e4a2a72114d6490a95e432cf9d9b6c813_BIAS-7B.pth exps/finetune-data-config.yaml outputs
./exps/finetune.sh models/llama LLaMA-Adapter/ckpts/d26d107eec32127ac86ef1997cf7169de1c56a59c539fc1258c6798b969e289c_LORA-BIAS-7B-v21.pth exps/finetune-data-config-sensors.yaml outputs
python llama_adapter_sensors_val_loop.py
python llama_text_evaluation.py
```
For the Llama-7B-Adapter S+L with CLIP features, we start from the raw ExtraSensory time series (accelerometer, gyroscope, magnetometer, watch accelerometer, and audio MFCC features), which can be downloaded from the official website, e.g. http://extrasensory.ucsd.edu/data/raw_measurements/ExtraSensory.raw_measurements.raw_acc.zip. To train:
```bash
cd clip_training
./run.sh
```

Then, once you have the CLIP features:

```bash
cd LLaMA-Adapter/llama_adapter_v2_multimodal7b_sensors_clip
./exps/finetune.sh models/llama LLaMA-Adapter/ckpts/7fa55208379faf2dd862565284101b0e4a2a72114d6490a95e432cf9d9b6c813_BIAS-7B.pth exps/finetune-data-config.yaml outputs
./exps/finetune.sh models/llama LLaMA-Adapter/ckpts/d26d107eec32127ac86ef1997cf7169de1c56a59c539fc1258c6798b969e289c_LORA-BIAS-7B-v21.pth exps/finetune-data-config-sensors.yaml outputs
python llama_adapter_sensors_val_loop.py
python llama_text_evaluation.py
```
For the DeepSQA model, we adapt our code from the DeepSQA repo. To train:

```bash
cd DeepSQA
python3 deepsqa_ca.py --gpt_shortened
```