Home
Dumped from mediawiki
The script uses [http://www.ruby-lang.org/en/downloads/ Ruby 1.8.7]
If you install ruby with a one-click installer with rails rolled in, the only extra libraries you should need are:
- highline 1.6.1 (sudo gem install highline)
- garb 0.7.6 (sudo gem install garb)
On Linux you shouldn't need sudo for these; I think it is only needed on OS X.
highline is used for formatting some terminal output.
[http://github.com/vigetlabs/garb garb] is the analytics library.
You need at least garb 0.7.6 to use analytics segments (the more complex reports depend on custom profile segments).
Reports boot up the session and call the data classes in lib/analytics/api. Every report class is a child of ‘Report’.
Data returned from the api objects is usually put into the array object @collector. This is then either printed to screen (report.to_screen) or written to a file (report.to_file). The file is defined as a string in @file_path when the report object is initialized.
For extremely long, slow reports, use the new Array method “@collector.capture string” instead of “@collector << string”. This prints each result to screen as it is collected, so you can follow the output as it happens.
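The implementation of capture is not shown on this page. A minimal sketch of what such an Array method might look like, assuming it simply echoes the string and then appends it (the real method in the project's lib/ code may differ):

```ruby
# Sketch only: an assumed implementation of the Array#capture method
# described above. It prints each item, then stores it.
class Array
  def capture(string)
    puts string      # show progress immediately
    self << string   # still collect it for to_screen / to_file
  end
end

collector = []
collector.capture("Visits: 1200")
collector.capture("Uniques: 800")
```

Because capture still appends to the array, a report can swap it in for << without changing how to_screen or to_file behave afterwards.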
Example of a report method:
def page_one
  @collector << self.page_heading(1)
  visits = Visits.new
  visits.main_graph.each {|thing| @collector << thing}
  visits.three_with_changes.each {|thing| @collector << thing}
  uniques = Uniques.new
  uniques.three_monthly_averages.each {|thing| @collector << thing}
end
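For orientation, here is a rough sketch of how a report class might tie @collector, to_screen and to_file together. SketchReport, its constructor argument and the sample strings are hypothetical; the project's own Report class is not reproduced here.

```ruby
require "tmpdir"

# Hypothetical stand-in for a child of Report.
class SketchReport
  def initialize(file_path)
    @file_path = file_path   # where to_file writes
    @collector = []          # strings gathered by the page methods
  end

  def page_one
    @collector << "Page 1"
    @collector << "Visits: 1200"
  end

  # Print everything collected so far.
  def to_screen
    @collector.each { |line| puts line }
  end

  # Write everything collected so far, one item per line.
  def to_file
    File.open(@file_path, "w") do |file|
      @collector.each { |line| file.puts line }
    end
  end
end

path = File.join(Dir.tmpdir, "sketch_report.txt")
report = SketchReport.new(path)
report.page_one
report.to_file
```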
==$periods, $display, $profile, Startup, and /bin/==
=$periods=
Every report should have a $periods object ($periods = Periods.new). It contains the dates for the report, with getter and setter methods such as $periods.start_date_reporting and $periods.end_date_baseline.
This will usually be populated by methods in the Startup class, like Startup.new.select_reporting_period.
Data classes then import date values from $periods on initialization.
$periods can also be changed on the fly with the setter methods, e.g. $periods.start_date_previous = Date.new(y=2010,m=3,d=9)
Reports are generally structured around three periods: a reporting period (the main period); a previous period of the same length, running up to the start of the reporting period; and a baseline period, which ends at the same point as the previous period but is much longer.
For example:
- Reporting period: 2010-04-01 to 2010-04-30
- Previous period: 2010-03-01 to 2010-03-31
- Baseline period: 2009-03-31 to 2010-03-31
The previous period is usually calculated automatically with the method Startup.new.generate_previous_period.
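The relationship between the three periods can be sketched with plain Date arithmetic. derive_periods is a hypothetical helper, not the project's generate_previous_period, and it assumes the reporting period starts on the first of a month and that both lengths are whole months.

```ruby
require 'date'

# Hypothetical helper: derive the previous and baseline periods from the
# start of the reporting period.
def derive_periods(reporting_start, reporting_months, baseline_months)
  previous_end   = reporting_start - 1                  # day before reporting starts
  previous_start = reporting_start << reporting_months  # Date#<< steps back n months
  baseline_start = previous_end << baseline_months      # baseline ends with previous
  {
    :previous => [previous_start, previous_end],
    :baseline => [baseline_start, previous_end]
  }
end

periods = derive_periods(Date.new(2010, 4, 1), 1, 12)
```

With a one month reporting period starting 2010-04-01 and a 12 month baseline, this reproduces the dates in the example above.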
'''Note'''
Life is a lot simpler when all periods are whole numbers of months (i.e. not 4.5 months). This is because some methods use Ruby’s .month methods to adjust and subdivide ranges.
Another month-related limitation is that you must set the $periods values @reporting_number_of_months and @baseline_number_of_months.
This is usually taken care of by the relevant Startup methods, which ask the user for the length of the baseline and reporting periods in months.
These values are used by some Crunch class methods, which subdivide the baseline period to get an average over a period equivalent to the reporting period, so that comparing a 24 month baseline visits count with a 2 month reporting period visits count makes sense.
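The arithmetic behind that comparison, as a small sketch. Both helpers and the visit counts are hypothetical; the actual Crunch methods may calculate this differently.

```ruby
# Hypothetical helper: scale a baseline total down to the length of the
# reporting period, so the two figures are comparable.
def baseline_equivalent(baseline_total, baseline_months, reporting_months)
  baseline_total.to_f * reporting_months / baseline_months
end

# Hypothetical helper: percentage change of current relative to reference.
def sketch_percentage_change(current, reference)
  (current - reference) / reference.to_f * 100
end

# 24,000 visits over a 24 month baseline is 2,000 per 2 month slice.
equivalent = baseline_equivalent(24_000, 24, 2)
# 2,500 visits in the 2 month reporting period, compared with that slice.
change = sketch_percentage_change(2_500, equivalent)
```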
This allows (horrific) helper methods like:
@dates << date # first date is returned unchanged
intervals_in_range = months_in_range/month_interval # setting correct amount of dates to collect
intervals_in_range -= 1 # offset because first value is already entered
while intervals_in_range > 0
  @dates << date.months_ago(month_interval)
  date = date.months_ago(month_interval)
  intervals_in_range -= 1
end
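A self-contained take on the same stepping logic, for trying out on its own. months_ago presumably comes from ActiveSupport (the rails install mentioned earlier), so this sketch swaps in the standard library's Date#<<. The method name dates_stepping_back is made up for the example.

```ruby
require 'date'

# Collect the given date plus earlier dates at month_interval steps,
# covering months_in_range months in total.
def dates_stepping_back(date, months_in_range, month_interval)
  dates = [date]                                   # first date is returned unchanged
  steps = months_in_range / month_interval - 1     # first value is already entered
  steps.times do
    date = date << month_interval                  # step back by the interval
    dates << date
  end
  dates
end

quarters = dates_stepping_back(Date.new(2010, 4, 1), 12, 3)
```

Starting from the first of a month keeps the steps regular. With month-end dates, Date#<< clamps to the last valid day of the shorter month.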
=$display=
Every report also requires a $display object. These are instances of classes that are children of ‘Display’, such as ‘Windows’ and ‘Unix’, e.g. $display = Unix.new
Different Display classes handle methods like ask_user, tell_user and alert_user differently.
They also handle the ‘arrows’ variables differently. For percentage changes, some increases are good and some are bad. The script tells you which: “up red” for a bad increase, “down green” for a good decrease, and so on. What these actually amount to in the script output is decided by which Display subclass is in use.
Every data class has the methods up_is_nothing? and up_is_good?, which the class Crunch uses to decide how to interpret a percentage change.
Percentage change result strings will include something like #{self.arrow(@baseline_percentage_change)}; Crunch’s method “arrow” then runs through logic like:
if change > 0
  if self.up_is_nothing?
    @arrow = $display.grey_up
    return @arrow
… etc etc…
It gets the actual string from $display, depending on the booleans returned by the calling class’s up_is_nothing? and up_is_good? methods.
This should just work, but it is something to be aware of when writing new classes or making major changes to an existing class.
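A sketch of how the full decision might look. Only grey_up appears on this page; the other Display method names, the empty string for a zero change, and the assumption that up_is_nothing? also greys out decreases are all guesses made for illustration.

```ruby
# Stand-in for a Display subclass. Only grey_up is named in the source;
# the remaining method names are assumed by analogy.
class SketchDisplay
  def grey_up;    "up grey";    end
  def grey_down;  "down grey";  end
  def green_up;   "up green";   end
  def green_down; "down green"; end
  def red_up;     "up red";     end
  def red_down;   "down red";   end
end

# Pick an arrow string for a percentage change.
def arrow_for(change, display, up_is_nothing, up_is_good)
  return "" if change == 0                    # assumption: no arrow when unchanged
  direction = change > 0 ? "up" : "down"
  colour = if up_is_nothing
             "grey"                           # direction carries no judgement
           elsif (change > 0) == up_is_good
             "green"                          # moved in the good direction
           else
             "red"                            # moved in the bad direction
           end
  display.send("#{colour}_#{direction}")
end

display = SketchDisplay.new
bad_rise  = arrow_for(12.5, display, false, false)   # e.g. a metric where up is bad
good_fall = arrow_for(-8.0, display, false, false)
```

A new data class only has to answer the two boolean questions; the colour and direction strings stay the Display subclass's business.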
The Startup class contains methods for prompting the user to authenticate, enter date ranges etc. See startup.rb for what is available.
The Startup class usually instantiates and populates $profile, which is basically a holder for the account, profile and segment details. Every report needs this. It has accessor methods so these can be changed throughout a report, e.g. $profile.segment.
Example of a Startup method that populates $profile:
def select_profile
  $display.ask_user('Enter the profile you want stats for (ie 20425901)')
  chosen_profile = gets.chomp
  $profile = Profile.new
  $profile.string = chosen_profile
  $profile.garb = Garb::Profile.first(chosen_profile)
end
Set segments like: $profile.segment = "18378974"
Be sure to set segments back to nil when not in use ($profile.segment = nil). Many data methods check for segments with $profile.segment?, which is true if the segment variable isn’t nil.
The value segment_string is a sort of ad hoc segment, used by some content-related data classes to limit results by path, e.g.:
def content_summaries
  content = Content.new
  $profile.segment_string = "/Contexts"
  content.info
  $profile.segment_string = "/Contexts/Earthquakes"
  content.info
end
Used by class Content like:
if $profile.segment_string?
  report.filters :pageviews.gt => @limit, :pagePath.contains => $profile.segment_string
else
  report.filters :pageviews.gt => @limit
end
Again, you need to set these back to nil afterwards to avoid problems.
Segments and segment_string can be combined (e.g. return values containing /Some_Section/ within segment 128744).
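Since forgetting to reset a segment is easy, one option is a block helper that always sets it back to nil. This is only a sketch: SketchProfile mirrors the accessors described above, and with_segment is a hypothetical addition that does not exist in the project.

```ruby
# Stand-in for the Profile holder, with the accessors described above.
class SketchProfile
  attr_accessor :segment, :segment_string

  def segment?
    !@segment.nil?
  end

  def segment_string?
    !@segment_string.nil?
  end

  # Hypothetical helper: run the block with a segment set, then always
  # reset it to nil, even if the block raises.
  def with_segment(id)
    self.segment = id
    yield
  ensure
    self.segment = nil
  end
end

profile = SketchProfile.new
seen_inside = nil
profile.with_segment("18378974") { seen_inside = profile.segment? }
```

The same shape would work for segment_string.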
These objects are generally instantiated in the /bin/ files:
require File.expand_path(File.dirname(__FILE__) + "/../lib/analytics.rb")
$display = Unix.new
$periods = Periods.new
interface = Startup.new
interface.get_dates_with_options # populates $periods
interface.authenticate_session # creates and populates $profile
report = TKI_Check.new
report.all
The data crunching classes inherit helper methods from class Crunch (methods like percentage_change, get_array_of_months, make_bounces_rates).
Many methods in Crunch depend upon there being a suitable method called ‘arbitrary’ in the calling class. As a test for this, classes in /api contain a method ‘arbitrary?’ which returns true if it has an appropriate ‘arbitrary’ method.
Most classes in /api have a basic method called arbitrary, which takes arguments for date range and possibly other parameters, that gets the data. Then 3 or more methods which call arbitrary, one for each reporting period (‘reporting’, ‘previous’, ‘baseline’), which have the arguments for arbitrary baked in – pulling dates from the $periods object.
So go:
visits = Visits.new
p visits.reporting # doesn’t need arguments; it uses the dates pulled out of $periods as arguments for arbitrary
p visits.previous
p visits.baseline
p visits.arbitrary(some_start_date, some_end_date) # needs date arguments (and possibly others…)
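The arbitrary pattern can be sketched without touching Google Analytics. Everything here is a stand-in: SketchPeriods and SketchVisits are hypothetical, the end_date_reporting and end_date_previous names are assumed by analogy with the getters mentioned earlier, the "data" is just a day count, and the periods object is passed in rather than read from the $periods global.

```ruby
require 'date'

# Stand-in for the $periods object.
SketchPeriods = Struct.new(:start_date_reporting, :end_date_reporting,
                           :start_date_previous, :end_date_previous)

# Stand-in for a data class in /api.
class SketchVisits
  def initialize(periods)
    @periods = periods
  end

  # The one method that actually fetches data for any date range.
  # Here it fakes a result: the number of days in the range, inclusive.
  def arbitrary(start_date, end_date)
    (end_date - start_date).to_i + 1
  end

  # Lets Crunch-style helpers check that arbitrary is available.
  def arbitrary?
    respond_to?(:arbitrary)
  end

  # Period methods have the arguments for arbitrary baked in.
  def reporting
    arbitrary(@periods.start_date_reporting, @periods.end_date_reporting)
  end

  def previous
    arbitrary(@periods.start_date_previous, @periods.end_date_previous)
  end
end

periods = SketchPeriods.new(Date.new(2010, 4, 1), Date.new(2010, 4, 30),
                            Date.new(2010, 3, 1), Date.new(2010, 3, 31))
visits = SketchVisits.new(periods)
```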
===How to use it===
After your initial installation, navigate to the root folder and run “ruby test/all” to make sure your dependencies are in place.
These tests aren’t as complete as they should be, but the most dodgy and frequently called data classes have a fairly complete range of unit tests. If these pass you can assume you have everything you need to run reports.
To run the tests you may need to enter authentication info in the mockup session classes in /test/session/.
Currently these require a [email protected] login and passwords to work, and test against the sciencelearn and biotechlearn hubs data.
A report should have a file in lib/analytics/report/ that grabs all the numbers and such, and a file in /bin/ which boots up the $display, $profile and $periods objects (probably via calls to the Startup class) and calls the main method of the report class.
So running the report should be a case of typing “ruby bin/the_report”, then entering your date ranges, authentication details etc.
====Community usage statistics / CMIS====
~/analytics $ ruby bin/community_usage
~/analytics $ ruby bin/cmis
These reports run through a long list of communities, by profile id. Each community is one line in /analytics/reports/community_usage.rb:
self.report("14745867", "Ako Panuku") #pass in profile id as string, and name
To get the profile id string, navigate to the profile’s dashboard and take the first integer from the URL after id=.
Example:
https://www.google.com/analytics/reporting/?reset=1&id=16845949&pdr=20100707-20100806
The ID is “16845949”.
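If you have a lot of profiles to add, the id can be pulled out of a dashboard URL with a regular expression. profile_id_from_url is a hypothetical helper, not something in the project.

```ruby
# Hypothetical helper: return the digits that follow "id=" in the query
# string, or nil if the URL has no id parameter.
def profile_id_from_url(url)
  url[/[?&]id=(\d+)/, 1]
end

id = profile_id_from_url(
  "https://www.google.com/analytics/reporting/?reset=1&id=16845949&pdr=20100707-20100806"
)
```

The result is a string, which is the form self.report expects for the profile id.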
The community_usage report outputs a csv file, for dumping in to the monthly community usage report spreadsheet.
~/analytics $ ruby bin/prototype
This is a generic report. It asks for date ranges and a profile ID, and outputs a basic set of information. It doesn’t really have any advantages over simply using the Analytics web interface to export a PDF of the dashboard, except that it provides previous and baseline percentage changes for metrics, and crunches monthly uniques for these periods too.
The report lives in lib/reports/prototype.rb, and should be reasonably self-explanatory.
Example of the report: image:Science hubs analytics.pdf
Example of hubs report:
image:Biotech-March1-June30-v1.pdf
The biotech and sciencelearn hubs have custom quarterly reports produced with:
~/analytics $ ruby bin/hubs
You then authenticate. You need to use the [email protected] account at this point, as it contains the appropriate custom segments.
Then enter the date range for the reporting period. Baseline and previous are generated for you based on some options.
The profile IDs for both hubs reports are hardcoded (they are site-specific because of segments and path filtering anyway).
This report makes extensive use of segments and ad hoc segment strings. See Custom_analytics_reporting#Using_segments_with_.24profile
Segment codes are laid out in the comments of each report.
Reports are at:
lib/reports/sciencelearn.rb
lib/reports/biotech.rb
The Sciencelearn hubs had an upgrade at some point which changed URLs from http://some_url/some_url to http://Some-Url/Some-Url.
As a result, some methods that work with pagePath etc. [http://wiki.github.com/vigetlabs/garb/filtering-with-andor use OR filters in blocks like]:
report.filters do
  contains(:pagePath, path)
end
report.filters do
  contains(:pagePath, path.downcase) # for domains that have capitalized and non-capitalized instances in their history
end
report.filters do
  contains(:pagePath, path.downcase.gsub("-", "_")) # for changes in how spaces are handled...
end
Note that you can’t mix in filters for metrics with filters for dimensions using OR, in the same request. So where you see filtering OR blocks like above, you can’t then add in another filter on a ‘metric’, ie pageviews, visits etc.
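The three path forms used in those filter blocks can be generated in one place, which makes it easier to see what will be matched. path_variants is a hypothetical helper and the sample path is made up.

```ruby
# Hypothetical helper: the historical forms a path may have taken.
def path_variants(path)
  [
    path,                              # current form, e.g. /Some-Url/Some-Page
    path.downcase,                     # lower-cased form
    path.downcase.gsub("-", "_")       # lower-cased, with underscores for hyphens
  ].uniq                               # drop repeats for already lower-case paths
end

variants = path_variants("/Some-Url/Some-Page")
```

Each variant would then go into its own report.filters block, as in the blocks above.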
The sciencelearn and biotech learn analytics accounts also make use of a trailing slash filter, taken from here:
[http://insightr.com/blog/2009/9/3/two-google-analytics-filters-that-will-fix-problems-with-dou.html http://insightr.com/blog/2009/9/3/two-google-analytics-filters-that-will-fix-problems-with-dou.html]
There is also a filter to exclude traffic from bugs.cwa.co.nz, which escapes the normal local-network filter, and makes its way into the top 10 or so sources, due to [http://en.wikipedia.org/wiki/Work-life_balance the extra-ordinary out-of-office commitment] of the CWA Hubs team :D
====Check====
~/analytics $ ruby bin/check
~/analytics $ ruby bin/cwa_check
~/analytics $ ruby bin/tki_check
The Check class (with children tki_check and cwa_check) collects visit and pageview numbers for today and yesterday across multiple communities, and flags any large decrease (a drop of 70% or more) or a value of 0 for either day.
This is used as a regular last-ditch test to make sure a release hasn’t killed GA tracking.
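The rule itself is simple enough to sketch in a few lines. suspicious? and DROP_THRESHOLD are hypothetical names; the real logic lives in lib/reports/check.rb and may differ in detail.

```ruby
# Assumed threshold, taken from the 70% figure above.
DROP_THRESHOLD = 0.7

# Hypothetical helper: true if the numbers suggest tracking may be broken.
def suspicious?(yesterday, today)
  return true if yesterday == 0 || today == 0     # a zero on either day is flagged
  drop = (yesterday - today) / yesterday.to_f     # fraction lost since yesterday
  drop >= DROP_THRESHOLD
end

big_drop    = suspicious?(1000, 250)   # 75% down
normal_day  = suspicious?(1000, 900)   # 10% down
zero_visits = suspicious?(1000, 0)
```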
Reports need to be kept up to date with the list of profiles you want checked.
Reports are at:
lib/reports/cwa_check.rb
lib/reports/tki_check.rb
Logic for the checks is in:
lib/reports/check.rb