Getting Started with Citygram
Citygram is a geographic notification platform designed to work with open government data. It allows residents to designate area(s) of a city they are interested in and subscribe to one or more topics. When an event for a desired topic occurs in the subscriber's area of interest, a notification (email, SMS, or webhook) is delivered.
Citygram is actually made up of several different pieces, each of which has a role in making the application work. We will walk through them one-by-one.
On the data side: City departments maintain data internally for their own uses. Using internal processes, this data is published to the city's open data portal.
An intermediary ETL "connector" layer pulls data from the open data portal's API and converts it from its export format to Citygram-compliant GeoJSON. (This layer can also cache the data to relieve strain on the city's servers.)
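The caching idea mentioned above could be sketched in Ruby roughly as follows. This is only an illustration: the TtlCache class, its fetch method, and the injected clock are hypothetical names made up for this sketch, not part of Citygram or of any existing connector.

```ruby
# Hypothetical in-memory cache with a time-to-live (TTL).
# Not part of Citygram; a minimal sketch of how a connector might avoid
# hitting the city's servers on every request.
class TtlCache
  # ttl_seconds: how long a stored value stays fresh.
  # clock: callable returning the current Time (injectable for testing).
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock
    @store = {}
  end

  # Returns the cached value for key if it is still fresh; otherwise runs
  # the block, stores its result, and returns it.
  def fetch(key)
    entry = @store[key]
    now = @clock.call
    return entry[:value] if entry && now - entry[:stored_at] < @ttl

    value = yield
    @store[key] = { value: value, stored_at: now }
    value
  end
end
```

A connector route could then wrap its call to the data portal in something like cache.fetch('food-trucks') { ... }, so that repeated requests within the TTL reuse the stored response.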
The Citygram application takes user input and stores, for each user, a) which datasets they want and b) which area they want. Every ten minutes, Citygram calls the Connector layer to pull down any new events (data points) in all connected datasets, which are received as GeoJSON. Citygram then stores that data in an events table in a PostGIS database.
On the user side: a user signs up for a) datasets they care about and b) areas they care about. These areas are stored in a subscriptions table with information about the user's notification preferences and the dataset they care about. (Each dataset gets its own entry in the table, even if it's the same subscription area.)
Every ten minutes, when Citygram calls the Connector layer and pulls down new events (as described above), it also runs a spatial query: for each subscription geometry (area), it identifies the events that fall inside that area. For each event that's in the subscription area, Citygram uses the notification preferences stored in the subscription and notifies the user in the medium they selected (SMS, email, or webhook).
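To make the spatial matching step more concrete, here is a small pure-Ruby sketch of the idea. Citygram itself relies on PostGIS for this query; the ray-casting check below, and the names point_in_ring? and events_in_area, are assumptions used only for illustration. It handles a single polygon ring without holes.

```ruby
# Ray-casting test: is the point [lng, lat] inside the polygon ring?
# ring is an array of [lng, lat] pairs whose first and last pairs are equal
# (a GeoJSON linear ring).
def point_in_ring?(point, ring)
  x, y = point
  inside = false
  ring.each_cons(2) do |(x1, y1), (x2, y2)|
    # Only edges that straddle the horizontal line through the point count.
    next unless (y1 > y) != (y2 > y)

    # x coordinate where the edge crosses that horizontal line.
    x_at_y = x1 + (y - y1) * (x2 - x1) / (y2 - y1).to_f
    inside = !inside if x < x_at_y
  end
  inside
end

# Selects the GeoJSON Point features that fall inside the subscription ring.
def events_in_area(features, ring)
  features.select do |feature|
    point_in_ring?(feature['geometry']['coordinates'], ring)
  end
end
```

Each feature that comes back from events_in_area would correspond to one notification for that subscription.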
The connector is what pulls data from the open data portal and converts it to Citygram-compliant GeoJSON. What follows is an example of a connector written as a Sinatra (Ruby) application that connects to San Francisco's open data portal (a Socrata database) and will be deployed on Heroku. In this example, we'll be making a connector for SF mobile food facility permits.
First, identify which dataset you'd like to build the connector for. The key is to make sure the dataset has a geographic component, a timestamp, and a decent amount of contextual data. The geographic component can be anything that can be represented as a GeoJSON geometry. A real bonus is a URL where residents can find out more about the specific event. Identify all information that may be useful for a resident to know. In our case, the data contains a link to a PDF of the schedule for that food truck, which includes where they're going to be and when.
- Install Ruby
- Install bundler: In the command line, type:
$ gem install bundler
- Set up the bones of the application. In the command line, type:
$ mkdir food-trucks
$ cd food-trucks/
$ touch Gemfile app.rb config.ru .gitignore Procfile
(For your application, you may want to name it something else that is more relevant to your dataset. :-P)
- Set up the app for deployment on Heroku. Open config.ru and add the following:
require './app'
run Sinatra::Application
- Open Procfile and add the following:
web: bundle exec rackup -p $PORT
- Open Gemfile and type the following to declare dependencies:
source 'https://rubygems.org'
gem 'faraday'
gem 'sinatra'
In our case, Sinatra is being used as our web framework and Faraday is our HTTP client.
- On the command line, type the following to install these gems:
$ bundle install
- Open app.rb, require the dependencies, and define a route. The route should describe the underlying dataset, e.g. /mobile-food-facility-permits.
require 'faraday'
require 'sinatra'
get '/mobile-food-facility-permits' do
# The body of the route is filled in over the following steps.
end
- In order to make this route successful, we need to get the URL for the Socrata endpoint for the dataset. On the data portal itself, go to your dataset. Click Export. The URL can be found at API Access Endpoint.
- In our router in app.rb, let's add the following:
require 'faraday'
require 'sinatra'
get '/mobile-food-facility-permits' do
url = URI('https://data.sfgov.org/resource/rqzj-sfat.json')
end
- The endpoint needs three things: an order, a limit, and a query.
- The order tells Socrata to give us the most recent events first. One of the fields in the dataset should be a Timestamp. Sometimes there is more than one timestamp -- you have to decide for your dataset which one you should use.
- The limit tells Socrata how many records to give us at a time.
- On the data portal itself, click the info button next to the header for the timestamp column. Sometimes the actual dataset header is different from the human-readable name displayed.
- In our router in app.rb, let's add the following to tell Socrata to give us the data by Date Approved (the actual dataset header for this column is 'approved') in descending order. Let's also give it a limit of 100 records returned:
require 'faraday'
require 'sinatra'
get '/mobile-food-facility-permits' do
url = URI('https://data.sfgov.org/resource/rqzj-sfat.json')
url.query = Faraday::Utils.build_query(
'$order' => 'approved DESC',
'$limit' => 100
)
end
- We still need to give the endpoint a query. One important piece of information in the Mobile Food Facility Permits dataset is its status: we only want permits that have status APPROVED. We also want to make sure we keep out any data that doesn't have a unique ID, a latitude, or a longitude, and we want to only pull in data from the last seven days.
- In our router in app.rb, let's add the following:
require 'faraday'
require 'sinatra'
get '/mobile-food-facility-permits' do
url = URI('https://data.sfgov.org/resource/rqzj-sfat.json')
url.query = Faraday::Utils.build_query(
'$order' => 'approved DESC',
'$limit' => 100,
'$where' => "status = 'APPROVED'"+
" AND objectid IS NOT NULL"+
" AND latitude IS NOT NULL"+
" AND longitude IS NOT NULL"+
" AND approved > '#{(DateTime.now - 7).iso8601}'"
)
end
- Note the syntax of the dates: the ISO 8601 format is what the SODA API expects, and it is how Socrata parses the dates in your query conditions.
- Also keep in mind that your field names may be different.
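As a way to inspect the query before wiring it into the route, the same parameters can be assembled with Ruby's standard library alone. The sketch below uses URI.encode_www_form in place of Faraday::Utils.build_query, and the helper names build_where and build_url are hypothetical. The field names match the SF food permit dataset and will likely differ for yours.

```ruby
require 'uri'
require 'date'

# Builds the $where clause used in this walkthrough.
# since is a DateTime marking the oldest approval date to include.
def build_where(since)
  [
    "status = 'APPROVED'",
    'objectid IS NOT NULL',
    'latitude IS NOT NULL',
    'longitude IS NOT NULL',
    "approved > '#{since.iso8601}'"
  ].join(' AND ')
end

# Returns a URI for the Socrata endpoint with order, limit, and query set.
def build_url(base, since, limit: 100)
  url = URI(base)
  url.query = URI.encode_www_form(
    '$order' => 'approved DESC',
    '$limit' => limit,
    '$where' => build_where(since)
  )
  url
end

url = build_url('https://data.sfgov.org/resource/rqzj-sfat.json', DateTime.now - 7)
puts url
```

Printing the URL this way makes it easy to see exactly what will be sent to Socrata.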
- Using Faraday, we can have our router connect to Socrata. In app.rb:
require 'faraday'
require 'sinatra'
get '/mobile-food-facility-permits' do
url = URI('https://data.sfgov.org/resource/rqzj-sfat.json')
url.query = Faraday::Utils.build_query(
'$order' => 'approved DESC',
'$limit' => 100,
'$where' => "status = 'APPROVED'"+
" AND objectid IS NOT NULL"+
" AND latitude IS NOT NULL"+
" AND longitude IS NOT NULL"+
" AND approved > '#{(DateTime.now - 7).iso8601}'"
)
connection = Faraday.new(url: url.to_s)
response = connection.get
end
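- You can check whether the query works by pasting the full URL, including the query string, into your browser's address bar. Socrata should return a JSON array of records.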
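The same check can be made in code before the response is used. The helper below is a hypothetical sketch (parse_records is not part of Faraday or Citygram): it takes the status code and body of a response, fails loudly if the request did not succeed, and otherwise returns the parsed records.

```ruby
require 'json'

# Hypothetical helper: turns an HTTP status and body into an array of records.
# Raises if the request failed or the body is not a JSON array.
def parse_records(status, body)
  raise "Socrata request failed with status #{status}" unless status == 200

  records = JSON.parse(body)
  raise 'Expected a JSON array of records' unless records.is_a?(Array)

  records
end
```

With a Faraday response object, this could be called as parse_records(response.status, response.body).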
- connection.get makes an HTTP request to Socrata, and response is the response object. Socrata returns a JSON array, which we can parse into a Ruby array. Then we can loop over it and create our Citygram-compliant GeoJSON (first by building an array of Ruby hashes, then converting it to JSON). In app.rb:
require 'faraday'
require 'sinatra'
require 'json'
get '/mobile-food-facility-permits' do
url = URI('https://data.sfgov.org/resource/rqzj-sfat.json')
url.query = Faraday::Utils.build_query(
'$order' => 'approved DESC',
'$limit' => 100,
'$where' => "status = 'APPROVED'"+
" AND objectid IS NOT NULL"+
" AND latitude IS NOT NULL"+
" AND longitude IS NOT NULL"+
" AND approved > '#{(DateTime.now - 7).iso8601}'"
)
connection = Faraday.new(url: url.to_s)
response = connection.get
collection = JSON.parse(response.body)
features = collection.map do |record|
{
'id' => record['objectid'],
'type' => 'Feature',
'properties' => record.merge('title' => record['fooditems']),
'geometry' => {
'type' => 'Point',
'coordinates' => [
record['longitude'].to_f,
record['latitude'].to_f
]
}
}
end
content_type :json
JSON.pretty_generate('type' => 'FeatureCollection', 'features' => features)
end
- Boom! We just created a Connector.
- The geojson we created is Citygram-compliant with the following fields:
- An 'id' field at the top-level of the feature that is unique to the event. This exists to prevent duplication.
- A 'title' property that merges fields to create a message to send out as the notification.
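A quick way to confirm that a feature carries these fields is a small check like the one below. This is a sketch rather than Citygram's own validation: the function name feature_problems is hypothetical, and the geometry check is an added assumption based on the requirement that every event has a geographic component.

```ruby
# Returns a list of problems found in a GeoJSON feature hash (empty if none).
def feature_problems(feature)
  problems = []
  # The top-level id must be present so events are not duplicated.
  problems << 'missing id' if feature['id'].to_s.empty?
  # The title is the message sent out as the notification.
  title = feature.dig('properties', 'title')
  problems << 'missing title' if title.to_s.strip.empty?
  # Assumption: every event also needs coordinates to be matched to an area.
  coords = feature.dig('geometry', 'coordinates')
  problems << 'missing geometry' if coords.nil?
  problems
end
```

Running this over each feature before returning the FeatureCollection would surface records that the connector's query failed to filter out.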
- So let's make our 'title' field more verbose. Some examples of good titles from other datasets:
- There's been a [CASE_GROUP] code violation near you at [ADDRESS]. Its status is [STATUS], and you can find out more at [URL].
- [APPLICANT_NAME] has applied for a commercial electrical permit at [ADDRESS]. Find out more at [URL].
- A code complaint has been opened or updated near you at [ADDRESS]. Its status is [STATUS] & its case number is [CASE_NUMBER].
- The JSON object for the food facility permit from data.sfgov.org looks like this:
{
"location" : {
"needs_recoding" : false,
"longitude" : "-122.398658184594",
"latitude" : "37.7901490874965"
},
"status" : "APPROVED",
"expirationdate" : "2015-03-15T00:00:00",
"permit" : "14MFF-0102",
"block" : "3708",
"received" : "Jun 2 2014 12:23PM",
"facilitytype" : "Truck",
"blocklot" : "3708055",
"locationdescription" : "01ST ST: STEVENSON ST to JESSIE ST (21 - 56)",
"cnn" : "101000",
"priorpermit" : "0",
"approved" : "2014-06-02T15:32:00",
"schedule" : "http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=14MFF-0102&ExportPDF=1&Filename=14MFF-0102_schedule.pdf",
"address" : "50 01ST ST",
"applicant" : "Cupkates Bakery, LLC",
"lot" : "055",
"fooditems" : "Cupcakes",
"longitude" : "-122.398658184604",
"latitude" : "37.7901490737255",
"objectid" : "546631",
"y" : "2115738.283",
"x" : "6013063.33"
}
- Based on the info available to us, ours will say, "A mobile food facility permit (# [permit]) has been approved for a [facilitytype] serving [fooditems] at [address]. The applicant is [applicant]. Find more schedule information here: [schedule]."
- Let's add this to our router in app.rb:
require 'faraday'
require 'sinatra'
require 'json'
get '/mobile-food-facility-permits' do
url = URI('https://data.sfgov.org/resource/rqzj-sfat.json')
url.query = Faraday::Utils.build_query(
'$order' => 'approved DESC',
'$limit' => 100,
'$where' => "status = 'APPROVED'"+
" AND objectid IS NOT NULL"+
" AND latitude IS NOT NULL"+
" AND longitude IS NOT NULL"+
" AND approved IS NOT NULL"
)
connection = Faraday.new(url: url.to_s)
response = connection.get
collection = JSON.parse(response.body)
features = collection.map do |record|
title = "A mobile food facility permit (number #{record['permit']}) has been approved for a #{record['facilitytype']} serving #{record['fooditems']} at #{record['address']}. The applicant is #{record['applicant']}. Find more schedule information here: #{record['schedule']}."
{
'id' => record['objectid'],
'type' => 'Feature',
'properties' => record.merge('title' => title),
'geometry' => {
'type' => 'Point',
'coordinates' => [
record['longitude'].to_f,
record['latitude'].to_f
]
}
}
end
content_type :json
JSON.pretty_generate('type' => 'FeatureCollection', 'features' => features)
end
- Now our connector has a human-readable message that will be useful for subscribers.
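Some records may be missing optional fields such as the schedule link, which would leave a dangling sentence in the message. One possible refinement, sketched here with the hypothetical helpers present? and permit_title, is to build the title from sentences and include only those whose field is available.

```ruby
# True when the value is neither nil nor blank.
def present?(value)
  !value.to_s.strip.empty?
end

# Builds the notification title for one permit record, skipping the
# applicant and schedule sentences when those fields are missing.
def permit_title(record)
  parts = ["A mobile food facility permit (number #{record['permit']}) has been approved " \
           "for a #{record['facilitytype']} serving #{record['fooditems']} at #{record['address']}."]
  parts << "The applicant is #{record['applicant']}." if present?(record['applicant'])
  parts << "Find more schedule information here: #{record['schedule']}" if present?(record['schedule'])
  parts.join(' ')
end
```

In the route, the title line would then become title = permit_title(record).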
- In the command line, we can run a web server to test our connector:
$ ruby app.rb
- Visit http://localhost:4567/mobile-food-facility-permits to see the results!
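- In the command line, we can deploy to Heroku with the following (the first command commits your work to a Git repository, which Heroku needs; skip it if you have already done so):
$ git init && git add . && git commit -m "Add food truck connector"
$ heroku create sf-mobile-food
$ git push heroku master && heroku open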