Annotated Virtual Gate data for 22 gates. Please note that this is an extension of the earlier SAIVT-BuidlingMonitoring database, and uses all the virtual gates from this data, alongside a collection of new gates.
The SAIVT-VirtualGate database contains footage from 14 cameras, each capturing two hours of footage during a single work day at a busy university campus building. In total, 22 gates have been annotated (with up to 3 in a single camera). Pedestrian crossings within the gate are annotated with the frame number (approximate) at which the centre of mass of the pedestrian crosses the centre of the gate. Contact Dr Simon Denman (s dot denman at qut dot edu dot au) for more information
The SAIVT-VirtualGate database is © 2018 QUT, and is licensed under the [Creative Commons Attribution-ShareAlike 3.0 Australia License] (http://creativecommons.org/licenses/by-sa/3.0/).
To attribute this database, use the citation provided on our publication at [eprints] (https://eprints.qut.edu.au/122188/):
S. Denman, C. Fookes, P. Yarlagadda, & S. Sridharan (2018) Scene invariant virtual gates using DNNs. IEEE Transactions on Circuits and Systems for Video Technology. (In Press)
In addition to citing our paper, we kindly request that the following text be included in an acknowledgements section at the end of your publications:
'We would like to thank the SAIVT Research Labs at Queensland University of Technology (QUT) for freely supplying us with the SAIVT-VirtualGate database for our research'.
Download and unzip the following archive:
At this point, you should have the following data structure and the SAIVT-BuildingMonitoring database is installed:
SAIVT-VirtualGate
+-- 3-4ExternalStairs
+-- GP_P_Block_Level_3_4_external_stairs_ip_35_20160928_110000.avi
+-- GroundTruth_2.xml
+-- GroundTruth_3.xml
+-- ROI_2.xml
+-- ROI_3.xml
+-- Level3-2PoolStairs
+-- GP_P_Block_Level_3_2_Pool_Stairs_ip_31_20160928_110000.avi
+-- VG2-GroundTruth.xml
+-- VG2-ROI.xml
+-- VG-GroundTruth.xml
+-- VG-ROI.xml
+-- Level3ExtLifts
+-- GP_P_Block_Level_3_External_Lifts_ip_40_20160928_110000.avi
+-- GroundTruth.xml
+-- ROI.xml
+-- MainDriveXBlock
+-- ...
+-- MainDriveXBlock-ECP
+-- ...
+-- MainDriveXBlock-PTZ
+-- ...
+-- P_Lev_4_Entry_ip_53
+-- ...
+-- P_Lev_4_Entry_Way_ip_107
+-- ...
+-- P_Lev_4_External_Lift_foyer_ip_70
+-- ...
+-- P_Lev_4_internal_stairs_ip_93
+-- ...
+-- P_Lev_4_Lift_6_ip_51
+-- ...
+-- P_Lev_4_lift_foyer_ip_55
+-- ...
+-- P_Lev_4_Main_Entry_ip_54
+-- ...
+-- P_Lev_4_stairs_ip_95
+-- ...
+-- Denman 2018 - Scene invariant virtual gates using DNNs.pdf
+-- LICENSE.txt
+-- README.txt
Each directory contains:
- a video file, which a 2 hour video catpured during work hours
- one or more region of interests (ROIs) to define a virtual gate, and a ground truth file for that gate
Note that there is a small amount of inconsistency in the naming of the ROI and GT files within the different directories. Also note that in a couple of instances with elevators where multiple elevators are captured by a single gate, a corresponding set of ground truth where the annotations have been broken into individual doors is also available, however this has not been used in our paper.
The ROI and ground truth files have the following structure:
- The ROI files, i.e. "ROI.xml", define the region of interest as follows:
<ROI num-points="4" image-width="768" image-height="576" doi="90">
<point x="400" y="100" />
<point x="400" y="150" />
<point x="580" y="150" />
<point x="580" y="100" />
</ROI>
This defines a polygon within the image that defines the gate. People are annotated as they cross (approximatley) the centre of the gate. The direction of interest is also specified in the doi attribute.
- The Ground Truth, i.e. "GroundTruth.xml", contains ground truth in the following format:
<qutcrowd-flow-gt both-directions="0">
<ROI num-points="4" image-width="856" image-height="480">
<point x="622" y="91" />
<point x="837" y="242" />
<point x="837" y="282" />
<point x="622" y="131" />
</ROI>
<doi>0<doi>
<ped frame="1889" x="662" y="144" direction="1" />
<ped frame="3615" x="667" y="137" direction="0" />
<ped frame="4851" x="770" y="212" direction="1" />
<ped frame="5153" x="659" y="129" direction="1" />
<ped frame="6317" x="655" y="147" direction="1" />
...
</qutcrowd-flow-gt>
The ROI is repeated within the ground truth, and a direction of interest (the <doi>
tag) is also included, which indicates the primary direction for the gait (i.e. the direction that denotes a positive count. Each pedestrian crossing is represented by a <ped>
tag, which contains the approximate frame the crossing occurred in (when the centre of mass was at the centre of the gait region), the x and y location of the centre of mass of the person during the crossing, and the direction (0 being the primary direction, 1 being the secondary).
Further information on the SAIVT-VirtualGate database in our paper:
S. Denman, C. Fookes, P. Yarlagadda, & S. Sridharan (2018) Scene invariant virtual gates using DNNs. IEEE Transactions on Circuits and Systems for Video Technology. (In Press)
This paper is also available alongside this document in the file: 'Denman 2018 - Scene invariant virtual gates using DNNs.pdf'.