Marco Filetti edited this page Jun 27, 2018 · 24 revisions

ReadingEvent

  • isSummary: Boolean, true if this ReadingEvent was sent at the end of the reading session (document close). Summary ReadingEvents can contain united (non-floating) manually marked rectangles and paragraph rectangles (see also Rectangles). Non-summary ReadingEvents contain rectangles sent "immediately", that is, as soon as the current reading event has "exited" (see DiMe push procedure for a description of this).
  • sessionId: Unique string identifying the reading session this event relates to (a reading session is from document open until document close).
  • proportionRead: Fraction of the document marked as "Read"
  • proportionInteresting: Fraction of the document marked as "Interesting"
  • proportionCritical: Fraction of the document marked as "Critical"
  • pageNumbers: Vector of pages specifying the pages currently being displayed, starting from 0. Unset for summary events.
  • pageLabels: Vector of page labels (strings) specifying the pages currently being displayed. Unset for summary events.
  • pageRects: An array (vector) of Rectangles. All the rects should fit within the page. Rect dimensions refer to points in a 72 dpi space where the bottom left is the origin, as in Apple's PDFKit. A page in US Letter format (often used for papers) translates to 612 x 792 points.
  • plainTextContent: Plain text content of text currently displayed on screen.
  • pageEyeData: Array (vector) of PageEyeData, see below. Each item in the array represents a "chunk" of fixations (one chunk per page).

ReadingEvents also contain targettedResource, which describes the PDF document related to this event (filename, title, etc.).
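As an illustration of the fields listed above, here is a minimal sketch of a ReadingEvent container that serializes to JSON. The field names come from the list above; the types, defaults, and the choice to omit unset fields from the payload are assumptions, and the sketch does not model targettedResource or the exact envelope that DiMe expects.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ReadingEvent:
    """Field names follow the list above; types and defaults are assumptions."""
    sessionId: str
    isSummary: bool = False
    proportionRead: float = 0.0
    proportionInteresting: float = 0.0
    proportionCritical: float = 0.0
    pageNumbers: Optional[List[int]] = None   # unset for summary events
    pageLabels: Optional[List[str]] = None    # unset for summary events
    pageRects: List[dict] = field(default_factory=list)
    plainTextContent: str = ""
    pageEyeData: List[dict] = field(default_factory=list)

    def to_json(self) -> str:
        # Drop unset (None) fields so summary events omit pageNumbers/pageLabels.
        payload = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(payload)


# Hypothetical values, for illustration only.
live = ReadingEvent(sessionId="session-1", pageNumbers=[0, 1], pageLabels=["1", "2"])
summary = ReadingEvent(sessionId="session-1", isSummary=True, proportionRead=0.4)

live_payload = json.loads(live.to_json())
summary_payload = json.loads(summary.to_json())
```

Here the non-summary event carries pageNumbers and pageLabels, while the summary event leaves them out entirely.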

Rectangles

Each item in the pageRects array of a reading event is a single Rect instance, which contains the following data:

  • origin: Starting point in a 72dpi space starting from bottom left (of page). Contains x and y.
  • size: Size of the rectangle on a 72dpi page space. Contains width and height.
  • pageIndex: Index (from 0) of PDF page on which this rect resides.
  • readingClass: Integer representing the "reading class" of this rectangle (what kind of rectangle is this?). See Constants below.
  • classSource: Integer representing "source" of this rectangle (what created this rectangle?).
  • plainTextContent: Text contained within this rectangle
  • floating: Floating rectangles are created immediately and are sent in non-summary ReadingEvents. Summary ReadingEvents, on the other hand, can contain non-floating rectangles. Non-floating rectangles are a union of all intersecting floating rectangles.
  • unixt: Unix time(s) representing when this rectangle was generated. Floating rectangles contain only one timestamp, while non-floating ones contain all times of the rectangles that were united.
  • scaleFactor: Zoom level when rect was created (1 = 100% = 72 points per inch displayed on screen).
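The relation between floating and non-floating rectangles can be sketched as follows. This is a minimal illustration, not PeyeDF's actual code: it assumes that the "union" is the bounding box of two intersecting rectangles on the same page, and it flattens origin and size into x, y, width, and height for brevity. The helper names intersects and unite are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rect:
    x: float            # origin.x, points from the left of the page
    y: float            # origin.y, points from the bottom of the page
    width: float
    height: float
    pageIndex: int
    readingClass: int
    classSource: int
    unixt: List[int]    # one timestamp for floating rects, several once united
    floating: bool = True


def intersects(a: Rect, b: Rect) -> bool:
    """True if both rects are on the same page and their areas overlap."""
    return (a.pageIndex == b.pageIndex
            and a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def unite(a: Rect, b: Rect) -> Rect:
    """Bounding-box union (an assumption); keeps the timestamps of both rects."""
    left, bottom = min(a.x, b.x), min(a.y, b.y)
    right = max(a.x + a.width, b.x + b.width)
    top = max(a.y + a.height, b.y + b.height)
    return Rect(left, bottom, right - left, top - bottom,
                a.pageIndex, a.readingClass, a.classSource,
                sorted(a.unixt + b.unixt), floating=False)


# Two overlapping "interesting" (30) rects marked by click (2); values are made up.
first = Rect(100, 500, 200, 20, 0, 30, 2, [1000])
second = Rect(150, 510, 200, 20, 0, 30, 2, [2000])
merged = unite(first, second) if intersects(first, second) else None
```

The merged rectangle is non-floating and carries both timestamps, matching the description of unixt above.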

PageEyeData

Each item in the array pageEyeData of a ReadingEvent contains:

  • pageIndex: index of the page this data refers to.
  • Xs: array of x positions for each fixation.
  • Ys: array of y positions for each fixation.
  • Ps: pupil size for each fixation (not yet supported).
  • startTimes: timestamps in microseconds representing the start of each fixation, one per fixation.
  • endTimes: timestamps in microseconds representing the end of each fixation, one per fixation.
  • durations: duration of each fixation in μs.
  • unixt: Unix time (note: in milliseconds) corresponding to when the first chunk of fixations was received.
  • scaleFactor: Zoom level when data were collected (1 = 100% = 72 points per inch displayed on screen).
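Because PageEyeData stores fixations as parallel arrays, every array should have one entry per fixation. The sketch below builds such a chunk from a list of fixations. It assumes that each duration equals the end time minus the start time, which the text does not state explicitly, and it leaves out Ps since pupil size is not yet supported. The function name build_page_eye_data and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

# One fixation: (x, y, start time in microseconds, end time in microseconds)
Fixation = Tuple[float, float, int, int]


def build_page_eye_data(page_index: int, fixations: List[Fixation],
                        unixt_ms: int, scale_factor: float = 1.0) -> Dict:
    """Pack fixations for one page into parallel arrays."""
    return {
        "pageIndex": page_index,
        "Xs": [f[0] for f in fixations],
        "Ys": [f[1] for f in fixations],
        "startTimes": [f[2] for f in fixations],
        "endTimes": [f[3] for f in fixations],
        # Assumption: duration is simply end minus start, in microseconds.
        "durations": [f[3] - f[2] for f in fixations],
        "unixt": unixt_ms,          # milliseconds, unlike the fixation times
        "scaleFactor": scale_factor,
    }


chunk = build_page_eye_data(
    page_index=0,
    fixations=[(120.0, 640.5, 1_000_000, 1_250_000),
               (180.0, 641.0, 1_300_000, 1_480_000)],
    unixt_ms=1_530_000_000_000,
)
```

Note the mixed units: fixation times are in microseconds, while unixt is in milliseconds.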

Constants

Integer constants for reading classes

  • 0: undefined
  • 10: viewport. This rectangle identifies a viewport position in page space. These are sent in non-summary reading events.
  • 15: paragraph. A paragraph is probably present here. Each fixation can generate this kind of rectangle if there is some text below the fixation. This rectangle should cover at least 3 degrees of visual angle in the vertical axis.
  • 20: read. (debugging - unused) This text was marked as read. These are sent in summary reading events.
  • 30: interesting. This rectangle contains interesting text. These are sent in summary reading events.
  • 40: critical. This rectangle contains critical text. These are sent in summary reading events.

Integer constants for reading sources

  • 0: undefined.
  • 1: viewport. This rectangle was created by a viewport.
  • 2: click. Manually marked by user.
  • 3: SMI. Generated by a fixation.
  • 4: machine learning. (not yet used).
  • 5: search. Generated by a textual search query.
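When consuming this data, the integer constants above can be mirrored as enumerations so that rectangles are easier to inspect. The values below are taken from the two lists above; the enum and member names (ReadingClass, ClassSource) and the describe helper are hypothetical and need not match the names used in PeyeDF.

```python
from enum import IntEnum


class ReadingClass(IntEnum):
    UNDEFINED = 0
    VIEWPORT = 10
    PARAGRAPH = 15
    READ = 20
    INTERESTING = 30
    CRITICAL = 40


class ClassSource(IntEnum):
    UNDEFINED = 0
    VIEWPORT = 1
    CLICK = 2
    SMI = 3
    MACHINE_LEARNING = 4
    SEARCH = 5


def describe(rect: dict) -> str:
    """Turn the two integer codes of a rect into a readable label."""
    reading_class = ReadingClass(rect["readingClass"])
    source = ClassSource(rect["classSource"])
    return f"{reading_class.name} via {source.name}"


label = describe({"readingClass": 30, "classSource": 2})
```

An unknown integer raises ValueError in this sketch, which makes unexpected codes visible early.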

DiMe push procedure

PeyeDF passes a ReadingEvent to DiMe each time the user stays on a page for a fixed amount of time (2 seconds). The start time refers to the first time the user scrolled to a given position, while the end time is when the user navigates away from the page, switches windows, or 10 minutes have passed since the start, whichever comes first (the actual constants are defined in PeyeConstants).

The event generation loop can then be defined as follows:

  1. When first opening a file, we send an information element to DiMe within a desktop event. This contains the details of the file (title, path, plaintext, etc.).
  2. We submit a reading event every time a "status" expires

The "status expiration" loop can then be defined as follows:

  1. There is an entry event, such as we landed on a specific page, maybe immediately after opening a file, maybe not. HistoryManager keeps a reference of the window in which we landed. An entry event could even be that we started scrolling a window which is not in focus. An entry event is always preceded by an exit event (step 3).
  2. After a minimum amount of time (e.g. 2 seconds) we assume the user started reading and save this status in the HistoryManager. This starts another timer (e.g. 10 minutes), after which we assume the user went away from the keyboard. The expiry of this second timer counts as an exit event.
  3. An exit event occurs when the user switches focus or scrolls / moves a window. An exit event also occurs when the timer started in step 2 above terminates. After receiving an exit event, the HistoryManager sends the status (which was saved in step 2 above) to DiMe. If no status was saved (e.g. because we switched focus before the minimum time of step 2 had elapsed), nothing happens.
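The three steps above can be sketched as a small state tracker. This is a simplified model, not PeyeDF's HistoryManager: it replaces real timers with explicit timestamps in seconds, uses the example values of 2 seconds and 10 minutes (the real values live in PeyeConstants), and assumes the 10-minute limit is measured from the entry time. The class name StatusTracker and the sent list are hypothetical.

```python
from typing import List, Optional, Tuple

MIN_READ_SECONDS = 2.0     # example value from the text
MAX_READ_SECONDS = 600.0   # example value from the text (10 minutes)


class StatusTracker:
    def __init__(self) -> None:
        self.entry_time: Optional[float] = None
        # Each item is (start, end) of a status that would be sent to DiMe.
        self.sent: List[Tuple[float, float]] = []

    def entry(self, now: float) -> None:
        """Step 1: the user landed on a page or started scrolling a window."""
        if self.entry_time is not None:
            self.exit(now)  # an entry is always preceded by an exit
        self.entry_time = now

    def exit(self, now: float) -> None:
        """Step 3: focus change, scroll / move, or expiry of the long timer."""
        if self.entry_time is None:
            return
        start, self.entry_time = self.entry_time, None
        if now - start < MIN_READ_SECONDS:
            return  # step 2 never happened: no status was saved, nothing is sent
        # Cap the end time, modelling the "away from keyboard" timer.
        end = min(now, start + MAX_READ_SECONDS)
        self.sent.append((start, end))


tracker = StatusTracker()
tracker.entry(0.0)
tracker.exit(1.0)       # too short, nothing sent
tracker.entry(10.0)
tracker.exit(15.0)      # normal reading status
tracker.entry(20.0)
tracker.exit(2000.0)    # user went away, end time is capped
```

Only the second and third visits produce a status, and the third is truncated at 10 minutes after its start.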

The main actors in the DiMe event submission loop are:

  • DocumentWindowController: keeps a reference to the MyPDFReader instance and tracks events such as scrolling, focus change, etc. Responds to requests from the HistoryManager (it is passed inside the HistoryManager's timers as the userInfo object). Many DocumentWindowControllers can be active at once (when the user has many documents open).
  • HistoryManager: there is only one HistoryManager. It keeps a reference to the currently running timer(s), which contain information about the "active window" (the last one that generated an "entry event"). The timers are invalidated at "exit events". HistoryManager is also responsible for DiMe API calls and requests ReadingEvent data from the DocumentWindowController.
  • ReadingEvent: data representation (see above)