
Beyond Average: Individualized Visual Scanpath Prediction

This code implements individualized visual scanpath prediction for three different tasks (four datasets) with two different architectures:

  • Free-viewing: predicting scanpaths of observers freely viewing salient or important objects in a given image. (OSIE, OSIE-ASD)
  • Visual question answering: predicting scanpaths while humans perform general tasks, e.g., visual question answering, reflecting their attending and reasoning processes. (AiR-D)
  • Visual search: predicting scanpaths during the search for a given target object, reflecting goal-directed behavior. (COCO-Search18)

Paper | Video

📣 Overview

(Figure: overall model structure.)

Understanding how attention varies across individuals has significant scientific and societal impacts. However, existing visual scanpath models treat attention uniformly, neglecting individual differences. To bridge this gap, this paper focuses on individualized scanpath prediction (ISP), a new attention modeling task that aims to accurately predict how different individuals shift their attention in diverse visual tasks. It proposes an ISP method featuring three novel technical components: (1) an observer encoder to characterize and integrate an observer's unique attention traits, (2) an observer-centric feature integration approach that holistically combines visual features, task guidance, and observer-specific characteristics, and (3) an adaptive fixation prioritization mechanism that refines scanpath predictions by dynamically prioritizing semantic feature maps based on individual observers' attention traits. These components allow scanpath models to effectively address attention variations across observers. Our method is generally applicable to different datasets, model architectures, and visual tasks, offering a comprehensive tool for transforming general scanpath models into individualized ones. Comprehensive evaluations using value-based and ranking-based metrics verify the method's effectiveness and generalizability.
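
To make the observer-centric idea concrete, below is a minimal PyTorch sketch of how an observer embedding could be fused with visual and task features. All module and variable names (ObserverEncoder, ObserverCentricFusion, the 256-dimensional sizes) are hypothetical illustrations, not the released implementation; see the task folders for the actual code.

# Minimal sketch of observer-centric feature integration; names are hypothetical.
import torch
import torch.nn as nn

class ObserverEncoder(nn.Module):
    """Maps an observer ID to an embedding of that observer's attention traits."""
    def __init__(self, num_observers: int, dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(num_observers, dim)

    def forward(self, observer_id: torch.Tensor) -> torch.Tensor:
        return self.embed(observer_id)  # (B, dim)

class ObserverCentricFusion(nn.Module):
    """Combines visual features, task guidance, and observer traits."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.proj = nn.Linear(3 * dim, dim)

    def forward(self, visual, task, observer):
        # visual: (B, N, dim) patch features; task, observer: (B, dim)
        B, N, D = visual.shape
        task = task.unsqueeze(1).expand(B, N, D)
        observer = observer.unsqueeze(1).expand(B, N, D)
        fused = torch.cat([visual, task, observer], dim=-1)  # (B, N, 3*dim)
        return self.proj(fused)  # (B, N, dim), fed to a scanpath decoder

# Toy usage
enc = ObserverEncoder(num_observers=15)
fusion = ObserverCentricFusion()
vis = torch.randn(2, 49, 256)          # 7x7 grid of image features
task = torch.randn(2, 256)             # task/question embedding
obs = enc(torch.tensor([3, 7]))        # two different observers
print(fusion(vis, task, obs).shape)    # torch.Size([2, 49, 256])

Conditioning every spatial location on the observer embedding is what lets the same backbone produce different scanpaths for different observers.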

🙇‍♂️ Disclaimer

For the ScanMatch evaluation metric, we adopt part of the GazeParser package. We adopt the implementations of SED and STDE from VAME, two of the evaluation metrics described in Visual Attention Models. More specifically, we adopt the evaluation metrics provided in Scanpath. For ChenLSTM and Gazeformer, we use the code released in Scanpath and Gazeformer, respectively. Based on the checkpoint implementation from updown-baseline, we slightly modify it to accommodate our pipeline.
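
For intuition only: SED can be viewed as a string edit (Levenshtein) distance between scanpaths discretized into sequences of fixated regions. The helper below is a hypothetical sketch of that idea, not the VAME/Scanpath implementation we actually adopt.

# Illustrative sketch of SED as edit distance over region strings;
# the repository uses the adopted implementations, not this helper.
def string_edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

# Two scanpaths encoded as sequences of fixated grid cells:
print(string_edit_distance("ABCD", "ABDE"))  # 2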

✅ Requirements

  • Python 3.9

  • PyTorch 1.12.1 (along with torchvision)

  • We also provide the conda environment file user_scanpath.yml; you can directly run

$ conda env create -f user_scanpath.yml

to recreate the environment in which we successfully ran our code.
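
After creating it, activate the environment before running the code. We assume here that the name: field in user_scanpath.yml is user_scanpath; check the file if activation fails:

$ conda activate user_scanpath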

📑 Tasks

We provide the code and pretrained models for the four datasets introduced above:

  • OSIE
  • OSIE-ASD
  • COCOSearch
  • AiR-D

More details of these tasks are provided in their corresponding folders.

✒️ Citation

If you use our code or data, please cite our paper:

@InProceedings{xianyu:2024:individualscanpath,
    author = {Xianyu Chen and Ming Jiang and Qi Zhao},
    title = {Beyond Average: Individualized Visual Scanpath Prediction},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2024}
}
