Create time binning for statistics #28

Open
kpwebb opened this issue Mar 31, 2015 · 1 comment

kpwebb commented Mar 31, 2015

Can we develop a stats bucket that grows the time window to capture data in a reasonable way?

@kpwebb kpwebb added this to the 0.5 ship fully anonymized data milestone Mar 31, 2015

kpwebb commented Apr 2, 2015

@bmander and I just agreed on the following approach:

  1. Traffic Engine emits a stream of stats observations for a segment, each with the following fields:
    segment identifier,utc time,speed

  2. Every N minutes we rotate the traffic-engine stats stream store.

  3. We build a stats aggregator that collects the rotated stores every N minutes and aggregates them into defined time bins.
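
A rough sketch of how the aggregator might read one rotated store and group observations into fixed time bins. This is only an illustration, not the Traffic Engine code: the `Observation` class, the function names, the epoch-seconds timestamp and the `bin_seconds` parameter are all assumptions, since the thread only specifies the three comma-separated fields.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Observation:
    segment_id: str
    utc_time: int   # assumed: epoch seconds (the thread only says "utc time")
    speed: float    # units are not specified in the thread


def parse_observation(line: str) -> Observation:
    """Parse one 'segment identifier,utc time,speed' record."""
    segment_id, utc_time, speed = line.strip().split(",")
    return Observation(segment_id, int(utc_time), float(speed))


def aggregate_rotated_store(
    lines: Iterable[str], bin_seconds: int
) -> Dict[Tuple[str, int], List[float]]:
    """Group speeds by (segment, start of the time bin containing the sample)."""
    bins: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the store
        obs = parse_observation(line)
        # Floor the timestamp to the start of its bin.
        bin_start = obs.utc_time - obs.utc_time % bin_seconds
        bins[(obs.segment_id, bin_start)].append(obs.speed)
    return dict(bins)


store = ["seg-1,600,10.0", "seg-1,899,20.0", "seg-1,900,30.0", "seg-2,0,5.0"]
bins = aggregate_rotated_store(store, bin_seconds=300)
```

With 300-second bins, the first two `seg-1` samples share the bin starting at 600 and the third falls into the bin starting at 900.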

Once a bin meets the criteria for being shareable (enough stats to anonymize), we ship the data and reset the bin. Bins without enough data are kept, and their time window is expanded by N minutes when the next round of rotated stats is added.
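
The ship-or-expand rule could be sketched roughly like this. Everything named here is hypothetical (`GrowingBin`, `ExpandingAggregator`, `min_samples`), and "enough stats to anonymize" is reduced to a simple sample-count threshold purely for illustration; the real criterion is not defined in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GrowingBin:
    window_start: int      # assumed: epoch seconds of the first round in the bin
    window_minutes: int    # current width of the time window
    speeds: List[float] = field(default_factory=list)


class ExpandingAggregator:
    def __init__(self, n_minutes: int, min_samples: int) -> None:
        self.n_minutes = n_minutes
        self.min_samples = min_samples  # stand-in for the anonymization criterion
        self.bins: Dict[str, GrowingBin] = {}

    def add_round(
        self, round_start: int, speeds_by_segment: Dict[str, List[float]]
    ) -> Dict[str, dict]:
        """Add one round of rotated stats; return the bins that were shipped."""
        shipped: Dict[str, dict] = {}
        # Pending bins grow every round, even if no new samples arrived.
        for segment_id in sorted(set(self.bins) | set(speeds_by_segment)):
            current = self.bins.get(segment_id)
            if current is None:
                current = GrowingBin(round_start, self.n_minutes)
                self.bins[segment_id] = current
            else:
                current.window_minutes += self.n_minutes
            current.speeds.extend(speeds_by_segment.get(segment_id, []))
            if len(current.speeds) >= self.min_samples:
                shipped[segment_id] = {
                    "window_start": current.window_start,
                    "window_minutes": current.window_minutes,
                    "count": len(current.speeds),
                    "mean_speed": sum(current.speeds) / len(current.speeds),
                }
                del self.bins[segment_id]  # reset: the next round starts fresh
        return shipped


agg = ExpandingAggregator(n_minutes=5, min_samples=3)
first = agg.add_round(0, {"a": [10.0, 20.0, 30.0], "b": [5.0]})
second = agg.add_round(300, {"b": [7.0]})
third = agg.add_round(600, {"b": [9.0]})
```

In this run segment `a` ships after the first round, while segment `b` only ships on the third round, once its window has grown to 15 minutes.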

Some bins will be "timeless" in that there aren't frequent enough samples to keep stats on time windows. In this case all samples just live in a grouped collection.
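
One possible way to handle the "timeless" case, sketched under heavy assumptions: the thread does not say when a bin counts as timeless, so the `max_window_minutes` cutoff here is invented for illustration, as are the `TimelessStore` name and its methods.

```python
from collections import defaultdict
from typing import Dict, List


class TimelessStore:
    """Grouped collection of samples for segments too sparse for time windows."""

    def __init__(self, max_window_minutes: int) -> None:
        # Assumed cutoff: a bin whose window has grown past this is "timeless".
        self.max_window_minutes = max_window_minutes
        self.samples: Dict[str, List[float]] = defaultdict(list)

    def is_timeless(self, window_minutes: int) -> bool:
        return window_minutes > self.max_window_minutes

    def absorb(self, segment_id: str, speeds: List[float]) -> None:
        # Timestamps are dropped; samples are grouped per segment only.
        self.samples[segment_id].extend(speeds)


timeless = TimelessStore(max_window_minutes=60)
pending = {"seg-9": (65, [12.0, 14.0]), "seg-3": (10, [30.0])}
for segment_id, (window_minutes, speeds) in pending.items():
    if timeless.is_timeless(window_minutes):
        timeless.absorb(segment_id, speeds)
```

Here only `seg-9`, whose window has grown past the assumed 60-minute cutoff, is moved into the grouped collection.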
