Copying field data
When collecting data from the field, you'll want to grab the qbank assessment data as well as any student-uploaded files.
On Windows unplatform installations, unplatform provides a script called DataExtractionScript.bat. Just run that script and copy the resulting zip file. The script exports the assessment data as well as the student-uploaded files, and bundles both sets of data into a single zip file.
For a gstudio / MongoDB installation, just run mongodump and you'll have all the qbank data! Depending on your configuration, it will most likely be stored in the following databases:
- assessment
- assessment_authoring
- hierarchy
- id
- logging
- relationship
- repository
- resource
In the field, student responses are only saved in certain database collections; the remainder is data used to serve the assessments, and does not change in the field. Student-related data is in:
- assessment/AssessmentSection
- assessment/AssessmentTaken
- logging/Log
- logging/LogEntry
- repository/Asset
- repository/Repository
Also, any files uploaded as part of an assessment are put into the webapps/CLIx/studentResponseFiles directory (see below for more details).
Make sure that all of the above data is saved by the field service providers and transported back to the central data store for analytics and research purposes.
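As a minimal sketch of collecting only the student-specific collections listed above, the function below loops over them and calls mongodump once per collection. The output directory name and the MONGODUMP override (useful for a dry run) are illustrative assumptions, and the sketch presumes a mongod instance is reachable on the default host and port.

```shell
#!/bin/sh
# Sketch only: dump just the student-related collections named above.
# MONGODUMP may be overridden (e.g. MONGODUMP=echo) to preview the commands.
MONGODUMP="${MONGODUMP:-mongodump}"

# database/collection pairs taken from the list above
STUDENT_COLLECTIONS="assessment/AssessmentSection assessment/AssessmentTaken logging/Log logging/LogEntry repository/Asset repository/Repository"

dump_student_collections() {
    out_dir="$1"   # hypothetical output directory, e.g. school1-dump
    for entry in $STUDENT_COLLECTIONS; do
        db="${entry%%/*}"    # text before the slash
        coll="${entry#*/}"   # text after the slash
        # --db and --collection restrict the dump; -o sets the output directory
        $MONGODUMP --db "$db" --collection "$coll" -o "$out_dir" || return 1
    done
}

# Usage:
#   dump_student_collections school1-dump
```

Each call writes into the same output directory, so the result has the usual one-subdirectory-per-database layout.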
NOTE: If you expect that teachers in the field will be authoring new content, then you MUST save / export the entire MongoDB datastore (not just the student-specific collections listed above), so that you capture the new questions and assessments.
Once you have collected the data from multiple schools, you can merge all the data into a single MongoDB instance with mongorestore. If you collected the raw MongoDB data (i.e. *.wt files), then you'll first need to run mongodump on each of the data sets to get *.json and *.bson files.
To convert from the raw WiredTiger *.wt files to a proper MongoDB database dump, you have to run mongodump against a running MongoDB instance. For example, if your *.wt files are located in a directory called rj1:
$ ls rj1
-rw-r--r-- 1 user staff 49 Mar 21 2017 WiredTiger
-rw-r--r-- 1 user staff 21 Mar 21 2017 WiredTiger.lock
-rw-r--r-- 1 user staff 933 Feb 13 13:11 WiredTiger.turtle
-rw-r--r-- 1 user staff 188416 Feb 13 13:11 WiredTiger.wt
-rw-r--r-- 1 user staff 4096 Feb 13 13:11 WiredTigerLAS.wt
-rw-r--r-- 1 user staff 104857728 Dec 4 23:05 WiredTigerLog.0000000067
-rw-r--r-- 1 user staff 104857728 Dec 4 22:44 WiredTigerPreplog.0000000001
-rw-r--r-- 1 user staff 104857728 Dec 4 22:44 WiredTigerPreplog.0000000002
-rw-r--r-- 1 user staff 36864 Feb 13 13:11 _mdb_catalog.wt
-rw-r--r-- 1 user staff 77824 Feb 13 13:11 collection-0--1013810609953599019.wt
-rw-r--r-- 1 user staff 36864 Feb 13 13:11 collection-0--1650450412842504590.wt
-rw-r--r-- 1 user staff 61440 Feb 13 13:11 collection-0-1527613604414953189.wt
etc...
$ mongod --dbpath rj1 &
$ mongodump -o rj1-dump
$ ls rj1-dump
drwxr-xr-x 16 user staff 544 Feb 13 12:54 assessment
drwxr-xr-x 6 user staff 204 Feb 13 12:54 assessment_authoring
drwxr-xr-x 20 user staff 680 Feb 13 12:54 gstudio-mongodb
drwxr-xr-x 4 user staff 136 Feb 13 12:54 hierarchy
drwxr-xr-x 10 user staff 340 Feb 13 12:54 id
drwxr-xr-x 6 user staff 204 Feb 13 12:54 logging
drwxr-xr-x 6 user staff 204 Feb 13 12:54 relationship
drwxr-xr-x 6 user staff 204 Feb 13 12:54 repository
drwxr-xr-x 6 user staff 204 Feb 13 12:54 resource
$ ls rj1-dump/assessment
-rw-r--r-- 1 user staff 342301 Feb 13 12:54 Assessment.bson
-rw-r--r-- 1 user staff 93 Feb 13 12:54 Assessment.metadata.json
-rw-r--r-- 1 user staff 1712600 Feb 13 12:54 AssessmentOffered.bson
-rw-r--r-- 1 user staff 100 Feb 13 12:54 AssessmentOffered.metadata.json
-rw-r--r-- 1 user staff 917299 Feb 13 12:54 AssessmentSection.bson
-rw-r--r-- 1 user staff 100 Feb 13 12:54 AssessmentSection.metadata.json
-rw-r--r-- 1 user staff 109002 Feb 13 12:54 AssessmentTaken.bson
-rw-r--r-- 1 user staff 98 Feb 13 12:54 AssessmentTaken.metadata.json
-rw-r--r-- 1 user staff 353323 Feb 13 12:54 Bank.bson
-rw-r--r-- 1 user staff 87 Feb 13 12:54 Bank.metadata.json
-rw-r--r-- 1 user staff 13972948 Feb 13 12:54 Item.bson
-rw-r--r-- 1 user staff 87 Feb 13 12:54 Item.metadata.json
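Since mongodump writes one subdirectory per database, a quick sanity check on each dump directory can catch an incomplete export before it leaves the school. The sketch below assumes the database names listed earlier on this page; your configuration may use a different set, so treat EXPECTED_DBS as something to adjust.

```shell
#!/bin/sh
# Sketch only: report which expected qbank databases are absent from a dump.
# The list mirrors the databases named earlier; adjust it to your setup.
EXPECTED_DBS="assessment assessment_authoring hierarchy id logging relationship repository resource"

# Prints "missing: <db>" for each expected database with no subdirectory
# in the given dump directory; returns non-zero if anything is missing.
check_dump_dir() {
    dump_dir="$1"
    missing=0
    for db in $EXPECTED_DBS; do
        if [ ! -d "$dump_dir/$db" ]; then
            echo "missing: $db"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_dump_dir rj1-dump
```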
Once you have a set of dump directories with *.json and *.bson files, you can restore them into a single database. For the first set, use the --drop flag to clean out any existing data:
$ mongorestore --drop rj1-dump
For all subsequent directories, do not use the --drop flag, so that the previously merged data sets are preserved. Note that the question and assessment data is always duplicated across data sets, which will cause MongoDB to throw some warnings about duplicate IDs. You can ignore those warnings, assuming that teachers in the field did not edit existing assessments or questions.
$ mongorestore rj2-dump
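With many schools, the two steps above can be wrapped in a small loop. This is a sketch under stated assumptions: merge_dumps and the MONGORESTORE override are hypothetical names introduced here, and the dump directory names are just examples. It deliberately keeps going after each restore so that the expected duplicate-ID warnings do not cut the merge short.

```shell
#!/bin/sh
# Sketch only: restore the first dump with --drop, all later ones without it.
# MONGORESTORE may be overridden (e.g. MONGORESTORE=echo) for a dry run.
MONGORESTORE="${MONGORESTORE:-mongorestore}"

merge_dumps() {
    first=1
    for dump_dir in "$@"; do
        if [ "$first" -eq 1 ]; then
            # --drop clears existing collections before the first restore
            $MONGORESTORE --drop "$dump_dir"
            first=0
        else
            # later dumps are added on top of the already merged data
            $MONGORESTORE "$dump_dir"
        fi
    done
}

# Usage:
#   merge_dumps rj1-dump rj2-dump rj3-dump
```

Review the mongorestore output for each directory afterwards, since errors other than duplicate IDs would also scroll past.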
For an unplatform installation, the JSON data files will be located in the webapps/ directory under the qbank installation (the path can be found by visiting https://localhost:8080/datastore_path). You'll want to copy the entire webapps/CLIx/ directory (this will include the student-uploaded files).
In either of the two above situations (gstudio or unplatform), you still need to copy the student-uploaded files, which are not stored in GridFS but on the filesystem. These files will be located in the webapps/ directory under the qbank installation (the full path can be found by visiting https://localhost:8080/datastore_path). You'll want to copy the entire webapps/CLIx/studentResponseFiles/ directory.
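On a system with a POSIX shell and tar, copying that directory could be sketched as follows. The function name, the archive name, and the assumption that the first argument is the directory containing webapps/ are all illustrative; check the path reported by datastore_path on your installation before relying on it.

```shell
#!/bin/sh
# Sketch only: bundle the student-uploaded files into a single archive.
# $1: directory that contains webapps/ (assumed layout, verify locally)
# $2: archive to create (hypothetical name, e.g. school1-files.tar.gz)
archive_student_files() {
    base_dir="$1"
    archive="$2"
    src="webapps/CLIx/studentResponseFiles"
    if [ ! -d "$base_dir/$src" ]; then
        echo "not found: $base_dir/$src" >&2
        return 1
    fi
    # -C keeps the paths inside the archive relative to base_dir
    tar -czf "$archive" -C "$base_dir" "$src"
}

# Usage:
#   archive_student_files /path/to/qbank school1-files.tar.gz
```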