Observer Manipulation

John Jenkins edited this page Feb 13, 2014 · 33 revisions

Observer

An observer is a set of streams that define sets of data collected. This is analogous to our campaigns, where a campaign defines an overall set of surveys for a user.

Stream

A stream is a definition of a set of data to be collected. This is analogous to our surveys, where a survey defines a specific set of questions to ask a user.

Presently, all of our observers are defined in XML, and the schema of each stream is a Concordia definition. The XML must have a root tag named "observer", which is defined as follows:

Key Description
id A unique ID for the observer. This should follow the Java naming conventions with periods as dividers between parts of the name, e.g. "org.ohmage.my.probe". It must begin with an alphanumeric character and may contain only alphanumeric characters and periods, which act as separators; however, two periods may not appear together. The maximum length is 255 characters.
version A number describing this observer's version. This value must be increased any time any part of the observer's definition is changed.
name A user-friendly name for this probe.
description A user-friendly description describing the overall purpose of this probe. Each stream will have its own description, which is the more applicable place to describe exactly what type of data is being collected.
versionString A user-friendly string representing the version of the probe.
stream Any number of these tags may exist as long as there is at least one. These are the individual streams of data being collected. They are defined as follows:
Key Description
id An identifier for this stream that is unique among the stream identifiers in this observer. It may contain only alphanumeric characters and underscores, with a maximum length of 255 characters.
version A number describing the stream's version. This value must be increased any time any part of the stream's definition is changed.
name A user-friendly name for this stream.
description A user-friendly description describing the type of data collected by this stream.
metadata The types of meta-data to save. This tag, including all of its sub-components, is optional. The available tags are id, timestamp, and location.

The sub-components may have either no value, e.g. <timestamp />, which will evaluate to 'true', or a boolean value. If the tag's value is 'true', all uploaded points must contain the value; if the tag's value is 'false', no uploaded point may contain the value. If the tag is missing, uploaded points may or may not have the value; it will be saved if present.
schema A Concordia schema defining the data that is being collected. Every data point uploaded to this stream must conform to this schema; however, the schema is not strict. It is acceptable to add additional fields.

For example, a schema may require an object that defines one key, whose value is an integer value. It is perfectly acceptable to upload records with that key-value pair as well as any other key-value pairs; all of the data will be persisted.
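
The non-strict matching described above can be illustrated with a small checker. This is a minimal sketch, not the Concordia library: the function name `conforms` is hypothetical, and it only understands the constructs used on this page ("object" with "fields", "array" with "constType", "string", "number", and "boolean").

```python
def conforms(value, schema):
    """Return True if value satisfies schema; undeclared object keys are allowed."""
    kind = schema["type"]
    if kind == "string":
        return isinstance(value, str)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "boolean":
        return isinstance(value, bool)
    if kind == "array":
        if not isinstance(value, list):
            return False
        item_schema = schema.get("constType")
        return item_schema is None or all(conforms(v, item_schema) for v in value)
    if kind == "object":
        if not isinstance(value, dict):
            return False
        # Every declared field must be present and valid; extra keys are ignored.
        return all(
            field["name"] in value and conforms(value[field["name"]], field)
            for field in schema.get("fields", [])
        )
    raise ValueError("unsupported type: " + kind)


schema = {"type": "object", "fields": [{"name": "count", "type": "number"}]}
print(conforms({"count": 3}, schema))                   # True
print(conforms({"count": 3, "unit": "steps"}, schema))  # True: extra key is fine
print(conforms({"unit": "steps"}, schema))              # False: "count" is missing
```

The second call shows the point of the paragraph above: the extra "unit" key does not cause a rejection.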

Example Definition

<?xml version="1.0" encoding="UTF-8"?>
<observer>
    <id>edu.ucla.cens.Mobility</id>
    <version>2012050700</version>
    
    <name>Mobility</name>
    <description>The Mobility probe collects the user's current movement type (still, walking, running, etc.), which we call "mode", and may also collect the phone's movements.</description>
    <versionString>3.0</versionString>
    
    <stream>
        <id>regular</id>
        <version>2012050700</version>
        
        <name>Mobility - Regular</name>
        <description>This records only the user's mode.</description>
        
        <metadata>
            <timestamp />
            <location />
        </metadata>
        
        <schema>
            {
                "type":"object",
                "doc":"Only contains the user's mode.",
                "fields":[
                    {
                        "name":"mode",
                        "doc":"The user's mode.",
                        "type":"string"
                    }
                ]
            }
        </schema>
    </stream>
    
    <stream>
        <id>extended</id>
        <version>2012050700</version>
        
        <name>Mobility - Extended</name>
        <description>This records the user's mode as well as accelerometer, WiFi, and GPS data.</description>
        
        <metadata>
            <timestamp />
            <location />
        </metadata>
        
        <schema>
            {
                "type":"object",
                "doc":"Contains the user's mode, plus all of the additional sensor data.",
                "fields":[
                    {
                        "name":"mode",
                        "doc":"The user's mode.",
                        "type":"string"
                    },
                    {
                        "name":"speed",
                        "doc":"The user's speed over the last minute.",
                        "type":"number"
                    },
                    {
                        "name":"accel_data",
                        "doc":"An array of the accelerometer readings over the last minute.",
                        "type":"array",
                        "constType":{
                            "type":"object",
                            "fields":[
                                {
                                    "name":"x",
                                    "doc":"The x-component of the accelerometer reading.",
                                    "type":"number"
                                },
                                {
                                    "name":"y",
                                    "doc":"The y-component of the accelerometer reading.",
                                    "type":"number"
                                },
                                {
                                    "name":"z",
                                    "doc":"The z-component of the accelerometer reading.",
                                    "type":"number"
                                }
                            ]
                        }
                    },
                    {
                        "name":"wifi_data",
                        "doc":"A WiFi reading from the last minute.",
                        "type":"object",
                        "fields":[
                            {
                                "name":"time",
                                "doc":"The time this data was recorded.",
                                "type":"number"
                            },
                            {
                                "name":"timezone",
                                "doc":"The time zone of the device when this data was recorded.",
                                "type":"string"
                            },
                            {
                                "name":"scan",
                                "doc":"The scan of WiFi information.",
                                "type":"array",
                                "constType":{
                                    "type":"object",
                                    "doc":"A single access point's information.",
                                    "fields":[
                                        {
                                            "name":"ssid",
                                            "doc":"The access point's SSID.",
                                            "type":"string"
                                        },
                                        {
                                            "name":"strength",
                                            "doc":"The strength of the signal from the access point.",
                                            "type":"number"
                                        }
                                    ]
                                }
                            }
                        ]
                    }
                ]
            }
        </schema>
    </stream>
</observer>
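
A definition like the one above can be sanity-checked on the client before it is sent. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the helper names `valid_observer_id` and `metadata_rules` are hypothetical, the embedded definition is a shortened stand-in, and the ID check assumes "alphanumeric" means ASCII letters and digits. The server's own validation remains authoritative.

```python
import string
import xml.etree.ElementTree as ET

DEFINITION = """<observer>
    <id>org.ohmage.my.probe</id>
    <version>1</version>
    <name>Example</name>
    <description>An example probe.</description>
    <versionString>1.0</versionString>
    <stream>
        <id>regular</id>
        <version>1</version>
        <name>Regular</name>
        <description>An example stream.</description>
        <metadata>
            <timestamp />
            <location>false</location>
        </metadata>
        <schema>{"type":"object","fields":[]}</schema>
    </stream>
</observer>"""

ALNUM = set(string.ascii_letters + string.digits)


def valid_observer_id(value):
    """Apply the observer ID rules from the table above (ASCII assumed)."""
    return (
        0 < len(value) <= 255
        and value[0] in ALNUM
        and all(c in ALNUM or c == "." for c in value)
        and ".." not in value
    )


def metadata_rules(stream):
    """Map each metadata tag to 'required', 'forbidden', or 'optional'."""
    metadata = stream.find("metadata")
    rules = {}
    for tag in ("id", "timestamp", "location"):
        node = None if metadata is None else metadata.find(tag)
        if node is None:
            rules[tag] = "optional"       # tag missing: value may or may not be sent
            continue
        text = (node.text or "").strip().lower()
        if text in ("", "true"):
            rules[tag] = "required"       # empty tag evaluates to 'true'
        elif text == "false":
            rules[tag] = "forbidden"
        else:
            raise ValueError("metadata value must be boolean: " + tag)
    return rules


root = ET.fromstring(DEFINITION)
print(valid_observer_id(root.findtext("id")))
for stream in root.findall("stream"):
    print(stream.findtext("id"), metadata_rules(stream))
```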

What does it do?

Allows a user to create a new observer.

URI

observer/create

Access Rules

Anyone is allowed to create observers.

Input Parameters

Authentication

  • (r) user = The user's username.
  • (r) password = The user's password.

OR

  • (r) auth_token = The user's authentication token.

Additional Parameters

  • (r) client = A description of the client making the call.
  • (r) observer_definition = The observer's definition as defined above.

Example POST

POST /app/observer/create HTTP/1.1
 Host: dev.ohmage.org
 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
 Content-Length: byte-length-of-content
 Content-Type: application/x-www-form-urlencoded
 
  auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e
  &observer_definition=<XML Content>

cURL Examples

curl -v -F "auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e" -F "observer_definition=@/myObserver.xml;type=text/xml" http://localhost:8080/app/observer/create 
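
The same call can be prepared from Python with the standard library. This is only a sketch: the function name `build_create_request`, the base URL, the client string, and the shortened XML are placeholders, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_create_request(base_url, auth_token, client, observer_xml):
    """Build (but do not send) a form-encoded POST to observer/create."""
    body = urlencode({
        "auth_token": auth_token,
        "client": client,  # required by the API
        "observer_definition": observer_xml,
    }).encode("ascii")  # urlencode output is percent-encoded ASCII
    return Request(
        base_url + "/app/observer/create",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_create_request(
    "http://localhost:8080",                      # placeholder host
    "abcfcd36-ab25-4494-8434-7798cb1d718e",
    "example-client",                             # hypothetical client name
    "<observer><id>org.ohmage.my.probe</id></observer>",  # not a complete definition
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```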

Output Format

Success

{
   "result" : "success"
}

Failure

See the error page for a description of error codes and their associated descriptions.

What does it do?

Allows a user to update an existing observer.

The observer's ID cannot change, and its version number must increase. Any stream that existed before may be removed, left unchanged, or updated, provided that an updated stream's version number increases. There is no concept of "renaming" a stream: a stream with the same definition as a previous stream but with a different name is considered a completely different stream. Readers who know about such a change may correlate the data as needed. All previous versions of observers, streams, and their data will remain.

For example, an observer with three streams, S1v1, S2v1, and S3v1, may be upgraded by removing S1, increasing S2 to version 2, leaving S3 alone, and creating a new stream S4. The original observer will still exist, as will the new observer; likewise, five stream versions will exist: S1v1, S2v1, S2v2, S3v1, and S4v1. S3v1 will belong to both versions of the observer.

Note: If a stream's version does not change but its definition does, the stream will not be updated and success will be returned (assuming no other errors).
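
The bookkeeping in the S1 to S4 example above can be modelled in a few lines. This is an illustrative sketch of the described outcome, not the server's implementation; `apply_update` and the (stream ID, version) pair representation are assumptions made for the example.

```python
def apply_update(stored, new_observer_streams):
    """Return every stream version that exists after an update.

    stored: set of (stream_id, version) pairs that already exist.
    new_observer_streams: dict of stream_id -> version in the updated definition.
    Nothing is ever deleted, so the result is the union of both.
    """
    return stored | set(new_observer_streams.items())


# Observer v1 has S1v1, S2v1, and S3v1.
v1_streams = {("S1", 1), ("S2", 1), ("S3", 1)}

# Observer v2 drops S1, bumps S2 to version 2, keeps S3, and adds S4.
v2_definition = {"S2": 2, "S3": 1, "S4": 1}

all_streams = apply_update(v1_streams, v2_definition)
print(sorted(all_streams))
# [('S1', 1), ('S2', 1), ('S2', 2), ('S3', 1), ('S4', 1)]

# S3v1 belongs to both observer versions.
print(v1_streams & set(v2_definition.items()))
# {('S3', 1)}
```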

URI

observer/update

Access Rules

The creator of the observer is the only one that can update it.

Input Parameters

Authentication

  • (r) user = The user's username.
  • (r) password = The user's password.

OR

  • (r) auth_token = The user's authentication token.

Additional Parameters

  • (r) client = A description of the client making the call.
  • (r) observer_definition = The observer's definition as defined above.

Example POST

POST /app/observer/update HTTP/1.1
 Host: dev.ohmage.org
 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
 Content-Length: byte-length-of-content
 Content-Type: application/x-www-form-urlencoded
 
  auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e
  &observer_definition=<XML Content>

cURL Examples

curl -v -F "auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e" -F "observer_definition=@/myObserverV2.xml;type=text/xml" http://localhost:8080/app/observer/update 

Output Format

Success

{
   "result" : "success"
}

Failure

See the error page for a description of error codes and their associated descriptions.

What does it do?

Uploads a set of data points for a user. All of the points must belong to the same observer but may belong to any of the streams and any of those streams' versions associated with that observer. The idea is that a data point is first associated with an observer, to which all of the data points in a single upload must belong. Then, it is associated with a stream for any version of that observer. Then, it is associated with a specific version of that stream, which defines a specific schema.

URI

stream/upload

Access Rules

Anyone is allowed to upload their own data to an existing observer.

Input Parameters

Authentication

  • (r) user = The user's username.
  • (r) password = The user's password.

OR

  • (r) auth_token = The user's authentication token.

Additional Parameters

  • (r) client = A description of the client making the call.
  • (r) observer_id = The unique identifier for this observer.
  • (r) observer_version = The version of the observer that is being used to generate this data. The stream/read API depends on it: if a maintainer were to update an observer without modifying one of its existing streams, then, without this parameter, it would be impossible to support stream/read's "observer_version" parameter, which allows a caller to limit the data to only those points after the change.
  • (r) data = The data to be uploaded. This must have the format described below. Take care with character encoding: the server should be able to handle UTF-16, but may only support UTF-8. When in doubt, translate any character with a code greater than 127 into its \uXXXX counterpart.
  • (o) opt_in = If, and only if, given and true, any invalid points in the data will be pseudo-anonymized and stored such that the creator of the observer can retrieve them for debugging purposes.

The data must be a JSON array of JSON objects. Each object must follow this definition:

Key Description
stream_id The unique identifier for the stream to which this data applies.
stream_version The version of this stream to which this data applies.
metadata A JSON object containing the metadata for this point. This object may be omitted if the stream's definition marks every metadata field as either forbidden or optional. The format of this object is:
Key Description
id A string that uniquely identifies this point for this user for this observer-stream.
timestamp The ISO8601-formatted date-time-timezone string. If this is not present, then the time and timezone fields will be used.
location This is the location definition as an object. Its definition is:
Key Description
timestamp The ISO8601-formatted date-time-timezone string. If this is not present, then the time and timezone fields will be used.
latitude The latitude component.
longitude The longitude component.
accuracy The accuracy of the reading.
provider A string representing who provided this information.
data The data, which must match this stream's definition. Additional fields may be provided, but the defined fields must be present. The additional fields will be saved.
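
As an illustration of the format above, the "data" parameter could be assembled like this. This is a sketch with made-up values; whether stream_version should be sent as a JSON number or a string is not stated on this page, so the number used here is an assumption. Python's `json.dumps` escapes every non-ASCII character as \uXXXX by default, which matches the encoding advice above.

```python
import json

points = [
    {
        "stream_id": "regular",
        "stream_version": 2012050700,  # assumption: sent as a JSON number
        "metadata": {
            "id": "point-0001",  # made-up ID, unique per user and observer-stream
            "timestamp": "2012-05-07T12:00:00.000-07:00",
            "location": {
                "timestamp": "2012-05-07T12:00:00.000-07:00",
                "latitude": 34.0689,
                "longitude": -118.4452,
                "accuracy": 15,
                "provider": "GPS",
            },
        },
        # "mode" is required by the schema; "note" is an extra field that is still saved.
        "data": {"mode": "walking", "note": "caf\u00e9"},
    }
]

# ensure_ascii=True is the default, so characters above 127 become \uXXXX escapes.
payload = json.dumps(points)
print(payload.isascii())        # True
print("\\u00e9" in payload)     # True: the accented character was escaped
```

The resulting string is what would be sent as the value of the "data" parameter.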

Example POST

POST /app/stream/upload HTTP/1.1
 Host: dev.ohmage.org
 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
 Content-Length: byte-length-of-content
 Content-Type: application/x-www-form-urlencoded
 
  auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e
  &observer_id=org.ohmage.myObserver
  &observer_version=123456789
  &data=<The data>

cURL Examples

curl -v -F "auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e" -F "observer_id=org.ohmage.myObserver" -F "observer_version=123456789" -F "data=@/my.data;type=application/json" http://localhost:8080/app/stream/upload 

Output Format

Success

Success indicates that the server successfully read and understood the parameters, but it does not mean that all of the points were persisted in the database. If a point is invalid for whatever reason, an entry will be added to the "invalid_points" JSON array in the response. The elements in the array are JSON objects with the following keys: "index" (the index of the invalid point in the uploaded array), "comment" (a user-friendly explanation of what was wrong with the point), and "persisted" (a boolean indicating whether the point was persisted in the database). The array will always be present but will be empty if all points were valid.

{
    "result" : "success",
    "invalid_points" : [
        {
           "index":0,
           "comment":"The required ID field in the metadata was missing.",
           "persisted":false
        },
        ...
    ]
}
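
Because a "success" result can still carry rejected points, a client will usually want to map "invalid_points" back onto what it uploaded. The following sketch assumes the response shape shown above; `split_results` is a hypothetical helper and the sample points are abbreviated.

```python
import json


def split_results(uploaded, response_text):
    """Separate accepted points from rejected ones using 'invalid_points'."""
    response = json.loads(response_text)
    if response.get("result") != "success":
        raise RuntimeError("upload failed; see the error codes")
    rejected = {entry["index"]: entry for entry in response.get("invalid_points", [])}
    accepted = [point for i, point in enumerate(uploaded) if i not in rejected]
    problems = [
        (uploaded[i], entry["comment"], entry["persisted"])
        for i, entry in sorted(rejected.items())
    ]
    return accepted, problems


uploaded = [
    {"stream_id": "regular", "data": {"mode": "still"}},
    {"stream_id": "regular", "data": {"mode": "walking"}},
]
response_text = """{
    "result": "success",
    "invalid_points": [
        {"index": 0,
         "comment": "The required ID field in the metadata was missing.",
         "persisted": false}
    ]
}"""

accepted, problems = split_results(uploaded, response_text)
print(len(accepted))      # 1
print(problems[0][1])     # The required ID field in the metadata was missing.
```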

Failure

See the error page for a description of error codes and their associated descriptions.

What does it do?

Reads the contents of a stream. For now, only a single stream's data can be read at a time. All of the parameters are specific to the stream being read, so reading multiple streams simultaneously would require duplicating the stream ID, stream version, and column list for each stream.

URI

stream/read

Access Rules

Anyone is allowed to read their own data. For now, we don't have a way to allow users to read data about other users.

Input Parameters

Authentication

  • (r) user = The user's username.
  • (r) password = The user's password.

OR

  • (r) auth_token = The user's authentication token.

Additional Parameters

  • (r) client = A description of the client making the call.
  • (r) observer_id = The unique identifier for this observer.
  • (o) observer_version = The version of the observer. A stream's version may stay the same across observer versions, so querying a stream with a specific stream version and omitting this parameter reads across every observer version in which that stream version did not change.
  • (r) stream_id = The unique identifier for the stream.
  • (r) stream_version = The version of the stream.
  • (o) username = The username of the user for which data is requested. The requester must be a privileged user in a class to which the requestee belongs.
  • (o) start_date = Limits the results to only those after or on this date. This must be an ISO8601 date-time of the form "yyyy-MM-dd'T'HH:mm:ss.SSSZZ".
  • (o) end_date = Limits the results to only those before or on this date. This must be an ISO8601 date-time of the form "yyyy-MM-dd'T'HH:mm:ss.SSSZZ".
  • (o) column_list = Limits the results to only those whose columns are in this list. The format of the list is a comma-separated set of strings. Each string denotes the column to return based on the following schema, <column>:<sub-column>:<sub-sub-column>:.... For a record of the form {"a":"b","one":{"sub1":"first", "sub2":"second", "sub3":[{"sub-sub1":1, "sub-sub2":2}, {"sub-sub1":10, "sub-sub3":30}]}}, the column_list a, one:sub1, one:sub3:sub-sub2 would return {"a":"b", "one":{"sub1":"first", "sub3":[{"sub-sub2":2}, {}]}}. The columns do not need to be defined in the schema, so additional data that was added may be queried, but records without that data will be empty but present, i.e. {}.
  • (o) num_to_skip = The number of records to skip. If this is negative, it will be set to 0. It is valid to skip more records than exist, but the number of records returned will be 0.
  • (o) num_to_return = The number of records to return. If this is negative or greater than the maximum, it will be reset to the maximum. The current maximum is 2000.
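
The column_list selection described above can be reproduced client-side to predict what a query will return. This is a sketch of the documented behaviour, not the server's code; `project` and `_select` are hypothetical names, and what happens when a path descends into a scalar value is an assumption (the scalar is returned unchanged).

```python
def project(record, column_list):
    """Keep only the columns named in a comma-separated column_list."""
    paths = [c.strip().split(":") for c in column_list.split(",") if c.strip()]
    return _select(record, paths)


def _select(value, paths):
    if isinstance(value, list):
        # Apply the same paths to each element; elements without a match become {}.
        return [_select(item, paths) for item in value]
    if not isinstance(value, dict):
        return value  # assumption: a scalar is returned as-is
    # Group the paths by their first component so sibling selections merge.
    grouped = {}
    for path in paths:
        grouped.setdefault(path[0], []).append(path[1:])
    result = {}
    for key, rests in grouped.items():
        if key not in value:
            continue
        if any(len(rest) == 0 for rest in rests):
            result[key] = value[key]  # the whole column was requested
        else:
            result[key] = _select(value[key], rests)
    return result


record = {
    "a": "b",
    "one": {
        "sub1": "first",
        "sub2": "second",
        "sub3": [{"sub-sub1": 1, "sub-sub2": 2}, {"sub-sub1": 10, "sub-sub3": 30}],
    },
}
print(project(record, "a, one:sub1, one:sub3:sub-sub2"))
# {'a': 'b', 'one': {'sub1': 'first', 'sub3': [{'sub-sub2': 2}, {}]}}
```

The printed value matches the example in the column_list description, including the empty object for the array element that lacks "sub-sub2".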

Example POST

POST /app/stream/read HTTP/1.1
 Host: dev.ohmage.org
 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
 Content-Length: byte-length-of-content
 Content-Type: application/x-www-form-urlencoded
 
  auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e
  &observer_id=org.ohmage.myObserver
  &stream_id=myStream
  &stream_version=123456789
  &column_list=some_int,some_record:sub_item,some_array:sub_item:sub_sub_item

cURL Examples

curl -v -F "auth_token=abcfcd36-ab25-4494-8434-7798cb1d718e" -F "observer_id=org.ohmage.myObserver" -F "stream_id=myStream" -F "stream_version=123456789" -F "column_list=some_int,some_record:sub_item,some_array:sub_item:sub_sub_item" http://localhost:8080/app/stream/read 

Output Format

Success

{
  "result" : "success",
  "metadata":{
    "count":<A number representing the number of results.>,
    "prev":"<The URL for the previous set of results.>",
    "next":"<The URL for the next set of results.>"
  },
  "data":[
    {
      "metadata":{ // If no metadata exists, this key will not exist.
        "timestamp":"<The ISO8601 timestamp.>", // If no timestamp exists, this key will not exist.
        "location":{ // If no location exists, this key will not exist.
          "latitude":12.345,
          "longitude":67.890,
          "accuracy":15,
          "provider":"Magic"
        }
      },
      "data":{} // The data based on the given columns and the data collected for this point.
    },
    ...
  ]
}
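
Since at most 2000 records come back per call, reading a whole stream means paging with num_to_skip and num_to_return. The sketch below mirrors the clamping rules from the parameter list and pages until a short page arrives. The names `normalize_paging`, `read_all`, and `fake_fetch` are hypothetical, the defaults are assumptions, and `fetch_page` stands in for whatever function performs the real stream/read call.

```python
MAX_TO_RETURN = 2000  # the current maximum stated for num_to_return


def normalize_paging(num_to_skip=0, num_to_return=MAX_TO_RETURN):
    """Mirror the documented clamping of the two paging parameters."""
    skip = max(num_to_skip, 0)
    if num_to_return < 0 or num_to_return > MAX_TO_RETURN:
        num_to_return = MAX_TO_RETURN
    return skip, num_to_return


def read_all(fetch_page, page_size=MAX_TO_RETURN):
    """Collect every record by calling fetch_page(skip, count) repeatedly."""
    skip, page_size = normalize_paging(0, page_size)
    records = []
    while True:
        page = fetch_page(skip, page_size)
        records.extend(page["data"])
        # An empty or short page means there is nothing left to read.
        if not page["data"] or len(page["data"]) < page_size:
            return records
        skip += page_size


# A stand-in for the server, used only to exercise the loop.
stored = [{"data": {"mode": m}} for m in ("still", "walk", "run", "walk", "still")]


def fake_fetch(skip, count):
    chunk = stored[skip:skip + count]
    return {"result": "success", "metadata": {"count": len(chunk)}, "data": chunk}


print(normalize_paging(-5, 5000))               # (0, 2000)
print(len(read_all(fake_fetch, page_size=2)))   # 5
```

Following the "next" URL from the response metadata would be an alternative, but its exact form when no further results exist is not described here, so this sketch relies on the explicit parameters instead.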

Failure

See the error page for a description of error codes and their associated descriptions.
