
integrate text stimulus into GazeDataFrame #526

Closed
dkrako opened this issue Sep 12, 2023 · 3 comments · Fixed by #676
dkrako (Contributor) commented Sep 12, 2023

No description provided.

@dkrako dkrako added the enhancement New feature or request label Sep 12, 2023
@dkrako dkrako added this to the Sprint 18 milestone Sep 12, 2023
SiQube (Member) commented Sep 12, 2023

What was your idea here? OCR of the provided images?

dkrako (Contributor, Author) commented Sep 12, 2023

This refers to the slides of my pitch presentation last week.

@dkrako dkrako self-assigned this Sep 12, 2023
@dkrako dkrako modified the milestones: Sprint 18, Sprint 20 Sep 22, 2023
@dkrako dkrako modified the milestones: Sprint 20, Sprint 21, Sprint 22 Oct 6, 2023
@dkrako dkrako removed their assignment Nov 17, 2023
@dkrako dkrako modified the milestones: Sprint 22, Sprint 24 Nov 17, 2023
dkrako (Contributor, Author) commented Nov 23, 2023

Regarding the integration of text stimuli, I would propose the following design:

Types of stimuli

We can expect to encounter different kinds of stimuli during further development. Apart from TextStimulus, this could also be ImageStimulus, ShapeStimulus, VideoStimulus, GameStimulus and so on. Although all these types may have completely different data structures, it makes sense to keep them in mind during the design.

We will focus on TextStimulus first.

Use Case

When designing the integration of text stimuli into pymovements, we need to focus on which types of results require taking stimulus data into account.

For all types of stimuli we will probably have some kind of AOI representation.
The most prominent use case is then to calculate measures (e.g. first fixation duration) with respect to each AOI or groups of AOIs.

We will focus on computing measures for rectangular AOIs first.

Name space

Create a new stimulus namespace with a subspace for each type of stimulus. This would look like this:

pymovements.stimulus.text  # has TextStimulus
pymovements.stimulus.image  # has ImageStimulus
pymovements.stimulus.video  # has VideoStimulus

Class composition

The TextStimulus class is composed of three attributes:

  • text: the presented text represented as a single string
  • aois: a mapping from one or multiple characters in the text string to XYWH
  • image: optional image of the rendered and presented text
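
A minimal sketch of this composition, using a plain dataclass with a list of dicts standing in for the proposed polars AOI dataframe (all names and data are illustrative, not the final API):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TextStimulus:
    """Sketch of the proposed class; a list of dicts stands in for
    the polars AOI dataframe used in the actual proposal."""
    text: str                                  # presented text as a single string
    aois: list = field(default_factory=list)   # character-span to XYWH mapping
    image: Optional[str] = None                # optional path to the rendered text


stimulus = TextStimulus(
    text='Hello world',
    aois=[
        {'index': 0, 'string': 'Hello', 'pixel_x': 10, 'pixel_y': 20,
         'width': 60, 'height': 15},
        {'index': 6, 'string': 'world', 'pixel_x': 80, 'pixel_y': 20,
         'width': 60, 'height': 15},
    ],
)
```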

Loading stimulus data

Stimulus data could be loaded using an interface like this:

stimulus = pymovements.stimulus.text.from_files(
    text='path/to/text/file.txt',
    aoi='path/to/aoi.csv',
    image='optional/path/to/image.jpg',
)
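
Under the hood, such a loader could read the raw text and the AOI table from disk. A hypothetical sketch (the csv module stands in for the proposed polars-based loader, and the file layout is illustrative):

```python
import csv
import tempfile
from pathlib import Path


def text_stimulus_from_files(text, aoi, image=None):
    """Hypothetical sketch of a from_files loader: reads the raw text
    and the AOI table from disk and bundles them into one object."""
    text_content = Path(text).read_text(encoding='utf-8')
    with open(aoi, newline='', encoding='utf-8') as aoi_file:
        aois = list(csv.DictReader(aoi_file))
    return {'text': text_content, 'aois': aois, 'image': image}


# Demo with throwaway files standing in for real stimulus data.
tmpdir = Path(tempfile.mkdtemp())
(tmpdir / 'stimulus.txt').write_text('Hello world', encoding='utf-8')
(tmpdir / 'aoi.csv').write_text(
    'index,string,pixel_x,pixel_y,width,height\n'
    '0,Hello,10,20,60,15\n',
    encoding='utf-8',
)
stimulus = text_stimulus_from_files(tmpdir / 'stimulus.txt', tmpdir / 'aoi.csv')
```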

AOI mapping

The mapping can be a polars dataframe with the columns:

  • index: the index of the character in the text string
  • string: the substring captured in the aoi
  • pixel_x: pixel x-position of the top left corner
  • pixel_y: pixel y-position of the top left corner
  • width: pixel width of aoi box
  • height: pixel height of aoi box

The index must always be ordered.

It can potentially have additional columns:

  • page: the page index
  • line: the line on the page
  • column: the character index within the line (the naming can be improved here)
  • word: the word this character/substring belongs to
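
Given rows with these columns, mapping a gaze sample to its AOI reduces to a point-in-box test over the XYWH fields. A hypothetical helper (not part of pymovements; plain dicts stand in for the polars dataframe):

```python
def aoi_at(aois, x, y):
    """Return the first AOI row whose XYWH box contains pixel (x, y),
    or None if the sample falls outside all AOIs."""
    for row in aois:
        if (row['pixel_x'] <= x < row['pixel_x'] + row['width']
                and row['pixel_y'] <= y < row['pixel_y'] + row['height']):
            return row
    return None


# Illustrative AOI rows with the required and additional columns.
aois = [
    {'index': 0, 'string': 'Hello', 'pixel_x': 10, 'pixel_y': 20,
     'width': 60, 'height': 15, 'page': 0, 'line': 0, 'word': 0},
    {'index': 6, 'string': 'world', 'pixel_x': 80, 'pixel_y': 20,
     'width': 60, 'height': 15, 'page': 0, 'line': 0, 'word': 1},
]
```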

Integration into GazeDataFrame

Each GazeDataFrame can be assigned a stimulus during initialization like this:

gaze = pymovements.GazeDataFrame(
    ...
    stimulus=stimulus,
)

gaze = pymovements.from_file(
    path='/path/to/gaze.csv',
    stimulus=stimulus,
)

Computing measures for AOIs

gaze.compute_aoi_measure('first fixation duration', level='word')
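
The measure computation behind such a call could be sketched as follows, assuming detected fixations with onset/offset times and positions. This is a hypothetical illustration of "first fixation duration" per word-level AOI, not the actual compute_aoi_measure implementation:

```python
def first_fixation_duration(fixations, aois):
    """For each word-level AOI, return the duration of the first
    fixation landing inside its XYWH box. Hypothetical sketch."""
    result = {}
    for fix in fixations:  # assumed sorted by onset time
        for aoi in aois:
            inside = (
                aoi['pixel_x'] <= fix['x'] < aoi['pixel_x'] + aoi['width']
                and aoi['pixel_y'] <= fix['y'] < aoi['pixel_y'] + aoi['height']
            )
            if inside and aoi['word'] not in result:
                result[aoi['word']] = fix['offset'] - fix['onset']
    return result


# Illustrative data: two word AOIs and three fixations (times in ms).
aois = [
    {'word': 'Hello', 'pixel_x': 10, 'pixel_y': 20, 'width': 60, 'height': 15},
    {'word': 'world', 'pixel_x': 80, 'pixel_y': 20, 'width': 60, 'height': 15},
]
fixations = [
    {'onset': 0, 'offset': 180, 'x': 15, 'y': 25},    # lands on 'Hello'
    {'onset': 200, 'offset': 420, 'x': 85, 'y': 25},  # lands on 'world'
    {'onset': 450, 'offset': 600, 'x': 15, 'y': 25},  # refixation, ignored
]
durations = first_fixation_duration(fixations, aois)
```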

Caveats

multi-character AOIs

I'm not sure, but maybe the index in the AOI mapping could support both an integer and a tuple/slice?

multi-page text stimuli

I'm unsure how well multi-page text stimuli can be integrated into this design.
We wouldn't want to split GazeDataFrames by page, as this creates a lot of overhead, so I think multi-page text stimulus support is a must.

One idea is to have a simple dataframe with the columns time and page.

The attributes text, aois, and image could have a new top level with the keys being the pages. The indexing would go like this:

stimulus.text[page_id], stimulus.image[page_id]

An alternative would be the other way around:

stimulus.pages[0].text, stimulus.pages[0].image

I prefer the former.
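
Both ideas together could be sketched as follows, assuming each page has a known onset time (all names and data illustrative): a time/page table answers which page was shown at a given gaze timestamp, and page-keyed mappings implement the preferred stimulus.text[page_id] style of access.

```python
import bisect

# Hypothetical time/page table: onset time (ms) at which each page appeared.
page_onsets = [0, 5000, 12000]   # pages 0, 1, 2
texts = {0: 'first page text', 1: 'second page text', 2: 'third page text'}
images = {0: 'page_0.png', 1: 'page_1.png', 2: 'page_2.png'}


def page_at(time_ms):
    """Return the page id shown at the given gaze timestamp."""
    return bisect.bisect_right(page_onsets, time_ms) - 1


# Page-keyed indexing in the preferred style: attribute first, page second.
page_id = page_at(6000)
current_text = texts[page_id]
```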

not much experience with gaze text data

This is a first proposal, which should open up the discussion on developing this feature. I don't have much experience working with text AOIs, so I have probably overlooked some issues. Each proposed point is open for discussion.

@dkrako dkrako linked a pull request Feb 9, 2024 that will close this issue