Monitoring
Part of the simulation design process is deciding which results to capture with a monitoring statement. A simulation produces far more data than is needed to interpret its results: mosquito-to-human transmission intensity, parasite density per infection per human, human infectiousness to mosquitoes, bouts of sickness and treatments affecting each human, and so on. A summary of those events over the whole simulated population is often more useful than the detailed experience of each simulated individual.
The monitoring method is described by an XML element of the form below. The continuous element is optional and describes continuous reporting; similarly, the cohorts element is required only if output is needed by sub-population. The other three elements are required and describe surveys.
<monitoring name="(some description or name)">
<continuous ... />
<SurveyOptions ... />
<surveys ... />
<ageGroup ... />
<cohorts ... />
</monitoring>
Surveys are periodic summaries at predefined time-steps. At those time-steps, information is saved about the quantity of interest. This could be a measurable quantity such as the EIR, or the number of events since the last survey time point. Where this data concerns humans, it can be segregated into age groups, and can be sampled either across all simulated individuals or from selected subpopulations.
Standard outputs from OpenMalaria are intended to simulate the collection of data at a cross-sectional survey of the whole population. The measures output may include cross-sectional data from the time of the survey and/or values summed over the population since the previous survey. Exactly which measures will be output by a simulation depends on the SurveyOptions
element. This contains a list of options, for example:
<SurveyOptions>
<option name="nHost" value="true"/>
<option name="nPatent" value="true"/>
<option name="nMassVaccinations"/>
</SurveyOptions>
(The value
attribute is optional and can be omitted as in the last example. It can also be explicitly set to "false" which is the same as omitting the option entirely.)
A list of all outputs currently implemented can be found here.
Survey measures can be modified to disable some categorisation and to change the output code. For example, if multiple cohorts are used but per-cohort outputs are not needed for all outputs, one can do:
<option name="nHost"/> <!-- implicitly, byCohort="true" where available -->
<option name="nInfect" byCohort="false"/>
Several other modifications can be made. The full list of options is:
<option name="nInfect"
outputNumber="1"
byAge="true"
byCohort="true"
bySpecies="false"
byGenotype="false"
byDrugType="false" />
Restrictions: the same output number cannot be used by multiple active measures, and categorisation options cannot be enabled where they default to disabled for the measure. The same measure may be used multiple times so long as the outputNumber
is always different.
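The output-number restriction can be checked before a scenario is run. The following is a minimal sketch, not part of OpenMalaria: it parses a SurveyOptions fragment with Python's standard xml.etree.ElementTree and lists any explicitly assigned outputNumber that is shared by more than one active option. The function name and the output numbers 101 and 102 are made up for illustration. Options that rely on their default output code are not checked, because those defaults are defined in the table of survey measures.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: nInfect is requested twice with different output numbers.
FRAGMENT = """
<SurveyOptions>
  <option name="nHost"/>
  <option name="nInfect" outputNumber="101" byCohort="false"/>
  <option name="nInfect" outputNumber="102" byAge="false"/>
  <option name="nPatent" outputNumber="102" value="false"/>
</SurveyOptions>
"""

def duplicate_output_numbers(xml_text):
    """Return the explicit output numbers used by more than one active option."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for option in root.iter("option"):
        # value="false" is the same as omitting the option, so skip it.
        if option.get("value", "true").strip().lower() in ("false", "0"):
            continue
        number = option.get("outputNumber")
        if number is not None:
            counts[int(number)] += 1
    return sorted(n for n, count in counts.items() if count > 1)

print(duplicate_output_numbers(FRAGMENT))  # [] because nPatent is disabled
```

An empty list means no explicit clash was found; it does not prove the scenario is valid, since clashes with default output codes are outside the scope of this check.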
From Version 36 the option <SurveyOptions onlyNewEpisode="true">
is available. If set, some statistics exclude humans who have been treated in the recent past (precisely, when the time of last treatment was before the current step and no more than health-system-memory days/steps ago). This is a rough replacement for the REPORT_ONLY_AT_RISK
option, with one difference: the maximum age of treatment for REPORT_ONLY_AT_RISK
was fixed at 20 days. The affected measures are those with the following ID values in the table of survey measures: 0-6, 8, 10, 62, 68-73.
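To make the exclusion rule and the list of affected measures concrete, here is a small sketch in Python. It is an illustration of the wording above, not OpenMalaria's own code: the function names are hypothetical, and times are assumed to be expressed in time-steps.

```python
def parse_id_ranges(spec):
    """Expand a list such as '0-6,8' into the individual measure IDs."""
    ids = set()
    for part in spec.split(","):
        bounds = [int(x) for x in part.strip().split("-")]
        ids.update(range(bounds[0], bounds[-1] + 1))
    return ids

# Measure IDs affected by onlyNewEpisode, as listed above.
AFFECTED_IDS = parse_id_ranges("0-6,8,10,62,68-73")

def recently_treated(last_treatment_step, current_step, memory_steps):
    """True when the last treatment was before the current step and
    no more than memory_steps ago (the exclusion rule quoted above)."""
    if last_treatment_step is None:  # never treated
        return False
    elapsed = current_step - last_treatment_step
    return 0 < elapsed <= memory_steps

print(len(AFFECTED_IDS))            # 16
print(recently_treated(10, 12, 6))  # True: treated 2 steps ago
print(recently_treated(1, 12, 6))   # False: treated 11 steps ago
```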
Surveys can take place at any time point, starting from the beginning of the intervention period. Surveys only report events which happened from the beginning of the time-step of the last survey until the end of the time-step before the current survey time-step, and measures of the current state (such as the number of patent hosts) from the beginning of the survey time-step.
Therefore, to report events at t0, you must do a survey at t1. With a 5-day time-step, that survey reports days 0-4, i.e. January 1st to January 5th.
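The reporting window described above can be worked out with a few lines of arithmetic. The sketch below is only an illustration of that rule; the function name is hypothetical, a 5-day time-step is assumed by default, and the first survey is assumed to count events from time-step 0.

```python
def event_window_days(prev_survey_step, survey_step, step_days=5):
    """Return (first_day, last_day), both inclusive, of the events a survey reports.

    prev_survey_step is the time-step of the previous survey (use 0 for the
    first survey). Returns None when no complete time-step lies in between.
    """
    if survey_step <= prev_survey_step:
        return None
    first_day = prev_survey_step * step_days
    last_day = survey_step * step_days - 1  # end of the step before the survey
    return first_day, last_day

print(event_window_days(0, 1))    # (0, 4): a survey at t1 reports events at t0
print(event_window_days(18, 37))  # (90, 184): surveys at 90d and 185d
```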
The timing of these is described as in the XML fragment below. The first valid time-point for a survey is time-step 0; however, any events happening before time-step 0 are not reported, so measures of events (such as infectious bites received) will be zero. Times may be specified in various units.
<surveys diagnostic="RDT">
<surveyTime>90d</surveyTime>
<surveyTime>185d</surveyTime>
<surveyTime>275d</surveyTime>
<surveyTime>365d</surveyTime>
</surveys>
This example describes four quarterly surveys; these are reported in the output.txt file as surveys 1, 2, 3 and 4. (The diagnostic
attribute references a diagnostic that needs to be separately specified within the <diagnostics>
element. More on diagnostics.)
By convention, time-step 0 corresponds to the first of January (up to 5th Jan with a 5-day time-step). Years are always modelled as 365 days long. [Caveat: prior to schema 22, when the maximumAgeYrs attribute of demography was not a whole number of years, time-step values may well have been offset relative to transmission seasonality.]
A report is made for those measures enabled under SurveyOptions. Reported data is either from the moment the survey is done (immediate data), or collected over the time since the previous survey, or in some cases collected over a fixed time span (usually one year).
The simulation ends immediately after the last survey is taken.
Where a large number of surveys are carried out at a regular interval, the following syntax is available.
<surveys diagnostic="RDT">
<surveyTime repeatStep="30d" repeatEnd="40y">
8y
</surveyTime>
</surveys>
The value inside the element (in this case 8y) is the time of the first survey; surveys then repeat every repeatStep until repeatEnd.
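To see how many surveys such a repeated specification produces, the times can be expanded by hand. The following Python sketch does this for the example above. It is not OpenMalaria code: the function names are hypothetical, only the d and y units are handled, and it ASSUMES that repeatEnd itself is excluded, which should be checked against the schema documentation.

```python
DAYS_PER_YEAR = 365  # years are always modelled as 365 days long

def to_days(text):
    """Convert a time such as '90d' or '8y' to whole days (only d and y handled)."""
    text = text.strip()
    value, unit = int(text[:-1]), text[-1]
    if unit == "d":
        return value
    if unit == "y":
        return value * DAYS_PER_YEAR
    raise ValueError(f"unsupported time unit in {text!r}")

def expand_survey_times(first, repeat_step, repeat_end):
    """List the survey days, ASSUMING repeat_end itself is excluded."""
    start, step, end = to_days(first), to_days(repeat_step), to_days(repeat_end)
    return list(range(start, end, step))

times = expand_survey_times("8y", "30d", "40y")
print(len(times), times[0], times[-1])  # 390 2920 14590
```

Under that assumption the example yields 390 surveys, each of which is further multiplied by age groups, cohorts and enabled measures in the output.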
Surveys may be "non-reporting". This means that "condition" variables are updated but no outputs are written. Adding unreported surveys should not have any effect on output (besides the updating of condition variables) except that, when diagnostics are stochastic, random number generation will be affected.
<surveyTime reported="false">20d</surveyTime>
The ageGroup
element describes which age-groups human-specific data is segregated into. Examples:
<ageGroup lowerbound="0.0">
<group upperbound="0.25"/>
<group upperbound="0.5"/>
<group upperbound="0.75"/>
<group upperbound="1"/>
<group upperbound="1.5"/>
<group upperbound="2"/>
<group upperbound="3"/>
<group upperbound="4"/>
<group upperbound="5"/>
<group upperbound="6"/>
<group upperbound="7"/>
<group upperbound="8"/>
<group upperbound="9"/>
<group upperbound="10"/>
<group upperbound="12"/>
<group upperbound="14"/>
<group upperbound="16"/>
<group upperbound="18"/>
<group upperbound="20"/>
<group upperbound="25"/>
<group upperbound="30"/>
<group upperbound="35"/>
<group upperbound="40"/>
<group upperbound="45"/>
<group upperbound="50"/>
<group upperbound="55"/>
<group upperbound="60"/>
<group upperbound="65"/>
<group upperbound="70"/>
<group upperbound="99"/>
</ageGroup>
<ageGroup lowerbound="0.0">
<group upperbound="99"/>
</ageGroup>
Here the upper bound is inclusive, so exact age 1 is included under age group 4 (with upper bound 1), but exact age 0 is always included in the first age group. Note also that there is no implicit catch-all last age group, so if the upper bound on the last age group is less than the oldest age allowed, then humans who are older than the last upper bound will simply not be included in surveys.
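The assignment rule just described can be sketched in Python as follows. This only mirrors the rule as stated in this section (inclusive upper bounds, no catch-all group); the function name is hypothetical and the code is not taken from OpenMalaria.

```python
from bisect import bisect_left

def age_group(age_years, upper_bounds, lower_bound=0.0):
    """Return the 1-based age-group index, or None if the age falls outside all groups."""
    if age_years < lower_bound:
        return None
    index = bisect_left(upper_bounds, age_years)  # first bound >= age
    if index == len(upper_bounds):
        return None  # older than the last upper bound: not surveyed
    return index + 1

UPPER_BOUNDS = [0.25, 0.5, 0.75, 1, 1.5, 2]  # first six bounds of the example above

print(age_group(1, UPPER_BOUNDS))    # 4
print(age_group(0, UPPER_BOUNDS))    # 1
print(age_group(2.5, UPPER_BOUNDS))  # None
```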
By default, survey data comes from the whole simulated population. It is also possible to specify monitoring only for defined subpopulations.
Whereas survey reporting is designed to aggregate data into configurable-size lumps, the continuous reporting mechanism is designed to report some data at high frequencies (but generally without segregation by age group and with less configuration potential).
To enable continuous reporting, add a continuous
sub-element to the monitoring
element, of the following form.
<continuous duringInit="false" period="1">
<option name="input EIR" value="true"/>
<option name="simulated EIR" value="true"/>
<option name="human infectiousness" value="true"/>
</continuous>
The period
attribute specifies the number of time-steps between reports. duringInit
is mostly used for debugging and can be omitted entirely; if set to true it enables reporting during the warm-up period and an extra column in the output (simulation time, the time-step counting from the beginning of the simulation rather than the beginning of the intervention period).
A list of available continuous outputs is here.
As of schema 32, multiple subpopulations may be monitored in surveys (but not in continuous output).
Each of the monitored subpopulations must first be defined, either implicitly, by an intervention within the element, or explicitly by the recruitment only pseudo-intervention.
Each of the subpopulations needs to be specified in the "cohorts" element, which links a subpopulation number
to the id
assigned when the subpopulation was declared in the "component" element:
<cohorts>
<subPop id="StudycohortA" number="1"/>
<subPop id="StudycohortB" number="2"/>
<subPop id="StudycohortC" number="4"/>
....
</cohorts>
The number
must be a power of 2 (i.e. 1, 2, 4, 8, ...). These numbers are used to label the output in the output.txt file.
If only a single subpopulation is listed, the corresponding output set in the output.txt file is identified by multiplying the number by 1000. This increment is then added to the values in the column usually reserved for the age group, for the outputs corresponding to this subpopulation. The values for the complementary sub-population (those not included) retain the standard values in the age-group column.
Where multiple sub-populations are listed, output is segregated according to all combinations of membership: e.g. if sub-populations A (number=1) and B (number=2) are listed, there will be outputs for "member of A and B" (increment=3000); "member of A but not B" (increment=1000); "B but not A" (increment=2000); and "not a member of A or B" (increment=0). A corresponding rule is used to assign increment values to all 8 combinations if three sub-populations are listed. Each combination is further segregated by age groups, survey times and enabled output measures, which could lead to excessive program memory usage and output file size if many subpopulations are listed.
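The increment rule above amounts to summing the numbers of the sub-populations a human belongs to and multiplying by 1000. The Python sketch below enumerates the increments and decodes a value from the age-group column of output.txt. It is an illustration under stated assumptions: the function names are hypothetical, and decoding ASSUMES that age-group indices stay below 1000.

```python
from itertools import combinations

def cohort_increments(sub_pops):
    """Map each membership combination (a frozenset of ids) to its increment."""
    for number in sub_pops.values():
        if number <= 0 or number & (number - 1):
            raise ValueError(f"{number} is not a power of 2")
    ids = sorted(sub_pops)
    result = {}
    for size in range(len(ids) + 1):
        for members in combinations(ids, size):
            result[frozenset(members)] = 1000 * sum(sub_pops[m] for m in members)
    return result

def decode_column(value, sub_pops):
    """Split an age-group column value into (member ids, age group).

    ASSUMES age-group indices stay below 1000.
    """
    code, group = divmod(value, 1000)
    members = {name for name, number in sub_pops.items() if code & number}
    return members, group

SUB_POPS = {"A": 1, "B": 2}
print(cohort_increments(SUB_POPS)[frozenset({"A", "B"})])  # 3000
print(decode_column(2004, SUB_POPS))  # ({'B'}, 4)
```

Because each number is a distinct power of 2, every combination of memberships produces a different sum, which is why the increments can be decoded unambiguously.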
To monitor a defined cohort only, in Versions before 32, cohortOnly
is set to "true" as follows:
<monitoring name="(some description or name)" cohortOnly="true">
<continuous ... />
<SurveyOptions ... />
<surveys ... />
<ageGroup ... />
</monitoring>
cohortOnly works only for versions up to 31; it was removed in schema version 32. It may be assigned the values true or false, or omitted entirely if cohorts are not used or when working with version 32 or above.
If cohortOnly is set to true, then the output of most measures is restricted to the cohort (although not quite all; see the list of measures for details); if it is set to false or omitted, then these measures act on the entire simulated population. To obtain outputs for both the cohort and the entire population prior to version 32, the simulation must be run twice.