This code implements individualized visual scanpath prediction for three different tasks (four datasets) with two different architectures:
- Free-viewing: predicting scanpaths as observers look at salient or important objects in a given image. (OSIE, OSIE-ASD)
- Visual question answering: predicting scanpaths while humans perform general tasks, e.g., visual question answering, reflecting their attention and reasoning processes. (AiR-D)
- Visual search: predicting scanpaths while observers search for a given target object, reflecting goal-directed behavior. (COCO-Search18)
Understanding how attention varies across individuals has significant scientific and societal impacts. However, existing visual scanpath models treat attention uniformly, neglecting individual differences. To bridge this gap, this paper focuses on individualized scanpath prediction (ISP), a new attention modeling task that aims to accurately predict how different individuals shift their attention in diverse visual tasks. It proposes an ISP method featuring three novel technical components: (1) an observer encoder to characterize and integrate an observer's unique attention traits, (2) an observer-centric feature integration approach that holistically combines visual features, task guidance, and observer-specific characteristics, and (3) an adaptive fixation prioritization mechanism that refines scanpath predictions by dynamically prioritizing semantic feature maps based on individual observers' attention traits. These novel components allow scanpath models to effectively address the attention variations across different observers. Our method is generally applicable to different datasets, model architectures, and visual tasks, offering a comprehensive tool for transforming general scanpath models into individualized ones. Comprehensive evaluations using value-based and ranking-based metrics verify the method's effectiveness and generalizability.
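The paragraph above describes the three components only at a high level; the minimal PyTorch sketch below illustrates one way such observer conditioning could be wired up. The module name, tensor shapes, and FiLM-style fusion are assumptions made for illustration, not the released architecture; please refer to the dataset folders for the actual models.

```python
import torch
import torch.nn as nn

class ObserverConditionedHead(nn.Module):
    """Illustrative sketch of observer-conditioned feature processing.

    Layer sizes and the scale/shift fusion are assumptions, not the paper's exact design.
    """
    def __init__(self, num_observers, feat_dim=256, obs_dim=64):
        super().__init__()
        # (1) observer encoder: one learnable embedding per observer ID
        self.observer_encoder = nn.Embedding(num_observers, obs_dim)
        # (2) observer-centric feature integration: per-channel scale and shift
        self.to_scale = nn.Linear(obs_dim, feat_dim)
        self.to_shift = nn.Linear(obs_dim, feat_dim)
        # (3) adaptive fixation prioritization: reweight semantic feature maps
        self.priority = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.Sigmoid())

    def forward(self, visual_feats, observer_ids):
        # visual_feats: (B, feat_dim, H, W) semantic feature maps from a backbone
        obs = self.observer_encoder(observer_ids)        # (B, obs_dim)
        scale = self.to_scale(obs)[:, :, None, None]     # (B, feat_dim, 1, 1)
        shift = self.to_shift(obs)[:, :, None, None]
        fused = visual_feats * (1 + scale) + shift       # observer-centric integration
        weights = self.priority(obs)[:, :, None, None]   # per-observer channel priorities
        return fused * weights                           # prioritized features for decoding
```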
For the ScanMatch evaluation metric, we adopt part of the GazeParser package.
We adopt the implementations of SED and STDE from VAME as two of our evaluation metrics, as described in Visual Attention Models.
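As a reference for what SED measures, the following sketch computes a string edit distance between two scanpaths after quantizing fixations onto a coarse grid. It is a simplified, self-contained illustration with arbitrary grid size and helper names; the reported results use the adopted VAME implementation.

```python
import numpy as np

def to_string(fixations, width, height, nx=8, ny=6):
    """Map (x, y) fixations to grid-cell labels, producing a scanpath 'string'."""
    fix = np.asarray(fixations, dtype=float)
    xs = np.clip((fix[:, 0] / width * nx).astype(int), 0, nx - 1)
    ys = np.clip((fix[:, 1] / height * ny).astype(int), 0, ny - 1)
    return list(ys * nx + xs)

def string_edit_distance(s1, s2):
    """Levenshtein distance between two label sequences."""
    d = np.zeros((len(s1) + 1, len(s2) + 1), dtype=int)
    d[:, 0] = np.arange(len(s1) + 1)
    d[0, :] = np.arange(len(s2) + 1)
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return d[len(s1), len(s2)]

# Example: compare a predicted and a ground-truth scanpath on an 800x600 image
pred = [(100, 120), (400, 300), (700, 500)]
gt = [(110, 130), (420, 280), (200, 450)]
sed = string_edit_distance(to_string(pred, 800, 600), to_string(gt, 800, 600))
```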
More specifically, we adopt the evaluation metrics provided in Scanpath.
For ChenLSTM and Gazeformer, we adopt the code released in Scanpath and Gazeformer, respectively.
Based on the checkpoint implementation from updown-baseline, we slightly modify it to accommodate our pipeline.
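For context, adapting an external checkpoint to a modified pipeline typically looks like the hedged sketch below; the checkpoint path and the key-renaming rule are hypothetical placeholders, not the exact modifications applied in this repository.

```python
import torch

def load_adapted_checkpoint(model, ckpt_path="checkpoints/updown_baseline.pth"):
    """Illustrative sketch of loading an external checkpoint into a modified model."""
    ckpt = torch.load(ckpt_path, map_location="cpu")
    state_dict = ckpt.get("state_dict", ckpt)
    # Example adaptation: strip a DataParallel-style "module." prefix if present.
    state_dict = {
        (k[len("module."):] if k.startswith("module.") else k): v
        for k, v in state_dict.items()
    }
    # strict=False lets layers added for the new pipeline keep their fresh initialization.
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    return missing, unexpected
```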
- Python 3.9
- PyTorch 1.12.1 (along with torchvision)
- We also provide the conda environment file `user_scanpath.yml`; you can directly run

  `$ conda env create -f user_scanpath.yml`

  to create the same environment in which we successfully ran our code.
We provide the corresponding code and pretrained models for the four aforementioned datasets:
- OSIE
- OSIE-ASD
- COCOSearch
- AiR-D
More details of these tasks are provided in their corresponding folders.
If you use our code or data, please cite our paper:
@InProceedings{xianyu:2024:individualscanpath,
  author    = {Xianyu Chen and Ming Jiang and Qi Zhao},
  title     = {Beyond Average: Individualized Visual Scanpath Prediction},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2024}
}