Current data structure for stats #109

Open
lee212 opened this issue Mar 22, 2013 · 2 comments

Comments

@lee212
Contributor

lee212 commented Mar 22, 2013

The Cloud Metrics shell returns a dict data structure once a calculation is finished.

"stats": {
    $group: {
        $period: {
            $metric: value
        }
    }
}

Example 1. Total usage of wall-clock hours for entire period

"stats": {
    "All": { "All": { "runtime": 1234567.0 } }
}

Example 2. Monthly usage of wall-clock hours for project groups

"stats": {
    "fg-111": { "monthly": { "runtime": 12345678.0 } },
    "fg-222": { "monthly": { "runtime": 12345678.0 } },
    ...
}
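
As a rough illustration of how a consumer might walk the current structure, here is a minimal sketch. The helper name `flatten_stats` is hypothetical (it is not part of Cloud Metrics), and the numbers are the placeholder values from Example 2.

```python
def flatten_stats(stats):
    """Yield (group, period, metric, value) tuples from the nested dict.

    Assumes the current layout: stats[group][period][metric] = value.
    """
    for group, periods in stats.items():
        for period, metrics in periods.items():
            for metric, value in metrics.items():
                yield group, period, metric, value


# Placeholder data copied from Example 2 above.
stats = {
    "fg-111": {"monthly": {"runtime": 12345678.0}},
    "fg-222": {"monthly": {"runtime": 12345678.0}},
}

rows = list(flatten_stats(stats))
for row in rows:
    print(row)
```

Every lookup has to go through three levels of nesting, and the query options that produced the numbers are not recorded anywhere in the dict.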

The current data structure should be replaced with a new one.
In a nutshell, the current structure keeps everything in a single dictionary.
In the new structure, each analysis query keeps its own dictionary, so multiple analysis queries produce separate dictionaries.

The proposed new data structure is as follows:

"result"+number: {
    "options": {
        "metric": value(list),
        "start_date": value(datetime),
        "end_date": value(datetime),
        "cloud": value(list),
        "nodename": value(list),
        "groupby": value(list),
        "period": value(str),
        "timetype": value(str)
    },
    "stats": {
        $group: { $metric: value },
        ...
    }
}
...
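
The option types in the schema can be checked mechanically. The following is only a sketch: `OPTION_TYPES` and `check_options` are hypothetical names, and the sample values are illustrative.

```python
from datetime import datetime

# Expected type for each option, taken from the proposed schema.
OPTION_TYPES = {
    "metric": list,
    "start_date": datetime,
    "end_date": datetime,
    "cloud": list,
    "nodename": list,
    "groupby": list,
    "period": str,
    "timetype": str,
}


def check_options(options):
    """Return a list of problems found in an options dict (empty if none)."""
    problems = []
    for name, expected in OPTION_TYPES.items():
        if name not in options:
            problems.append("missing option: %s" % name)
        elif not isinstance(options[name], expected):
            problems.append("%s should be %s" % (name, expected.__name__))
    return problems


options = {
    "metric": ["runtime"],
    "start_date": datetime(1981, 1, 1),
    "end_date": datetime(3000, 1, 1),
    "cloud": ["All"],
    "nodename": ["All"],
    "groupby": "project",  # wrong on purpose: the schema asks for a list
    "period": "monthly",
    "timetype": "hour",
}
problems = check_options(options)
print(problems)
```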

Updated example 1. Total usage of wall-clock hours for entire period

"result1": {
    "options": {
        "metric": ['runtime'],
        "start_date": datetime(1981,1,1),
        "end_date": datetime(3000,1,1),
        "cloud": ['All'],
        "nodename": ['All'],
        "groupby": ['All'],
        "period": 'All',
        "timetype": 'hour'
    },
    "stats": {
        "All": { "runtime": 1234567.0 }
    }
}

Updated example 2. Monthly usage of wall-clock hours for project groups

"result1": {
    "options": {
        "metric": ['runtime'],
        "start_date": datetime(1981,1,1),
        "end_date": datetime(3000,1,1),
        "cloud": ['All'],
        "nodename": ['All'],
        "groupby": ['project'],
        "period": 'monthly',
        "timetype": 'hour'
    },
    "stats": {
        "fg-111": { "runtime": 12345678.0 },
        "fg-222": { "runtime": 12345678.0 },
        ...
    }
}
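
One way to picture the migration is a converter that splits the old nested dict into one result dict per period. This is a sketch under assumptions: `split_by_period` is a hypothetical helper, numbering is assumed to start at 1, and `base_options` is deliberately partial (dates, cloud and nodename are left out for brevity).

```python
def split_by_period(old_stats, base_options):
    """Turn stats[group][period][metric] into one result dict per period."""
    # Regroup the old layout as per_period[period][group] = metrics.
    per_period = {}
    for group, periods in old_stats.items():
        for period, metrics in periods.items():
            per_period.setdefault(period, {})[group] = dict(metrics)

    # Each period becomes its own "result"+number entry.
    results = {}
    for number, (period, stats) in enumerate(per_period.items(), start=1):
        options = dict(base_options, period=period)
        results["result%d" % number] = {"options": options, "stats": stats}
    return results


old_stats = {
    "fg-111": {"monthly": {"runtime": 12345678.0}},
    "fg-222": {"monthly": {"runtime": 12345678.0}},
}
base_options = {"metric": ["runtime"], "groupby": ["project"], "timetype": "hour"}

results = split_by_period(old_stats, base_options)
print(results["result1"]["options"]["period"])
print(results["result1"]["stats"])
```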
  • result+number will be cached for a while unless the clear command is called.
    This makes it possible to look up previously analyzed data without recalculating it.
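
The caching behaviour described in this note could look roughly like the sketch below. `ResultCache` and its methods are hypothetical names, and the sketch does not implement any expiry ("for a while" is not specified above); entries simply live until `clear()` is called.

```python
class ResultCache:
    """Keep numbered results until clear() is called (hypothetical class)."""

    def __init__(self):
        self._results = {}
        self._next = 1

    def add(self, options, stats):
        """Store one query result and return its "result"+number key."""
        key = "result%d" % self._next
        self._next += 1
        self._results[key] = {"options": options, "stats": stats}
        return key

    def find(self, options):
        """Return the stats of a prior query with identical options, or None."""
        for entry in self._results.values():
            if entry["options"] == options:
                return entry["stats"]
        return None

    def clear(self):
        """Drop all cached results, mirroring the shell's clear command."""
        self._results.clear()
        self._next = 1


cache = ResultCache()
opts = {"metric": ["runtime"], "groupby": ["All"], "period": "All"}
first_key = cache.add(opts, {"All": {"runtime": 1234567.0}})

# A later query with the same options is answered without recalculation.
hit = cache.find({"metric": ["runtime"], "groupby": ["All"], "period": "All"})

cache.clear()
miss = cache.find(opts)
```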
@laszewsk
Member

But is the result cached?
In a single dict you can cache things based on a proper naming convention.

I am not opposed to this, but you need to think about caching results.
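
To make the naming-convention idea concrete, here is a minimal sketch of a single dict keyed by a string derived from the query options. The function `cache_key` and the key format are assumptions for illustration only, and the options dict is shortened.

```python
from datetime import datetime


def cache_key(options):
    """Build a deterministic string key from an options dict."""
    parts = []
    for name in sorted(options):  # sorted so that key order does not matter
        value = options[name]
        if isinstance(value, datetime):
            value = value.strftime("%Y-%m-%d")
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        parts.append("%s=%s" % (name, value))
    return "|".join(parts)


stats_cache = {}  # the single dict, keyed by the naming convention

options = {
    "metric": ["runtime"],
    "groupby": ["project"],
    "period": "monthly",
    "start_date": datetime(1981, 1, 1),
}
key = cache_key(options)
stats_cache[key] = {"fg-111": {"runtime": 12345678.0}}
print(key)
```

Because the key is built from sorted option names, two queries with the same options map to the same cache entry regardless of the order in which the options were given.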

@lee212
Contributor Author

lee212 commented Mar 22, 2013

Cloud Metrics keeps the result dict unless the 'clear' command is executed.

I am trying to improve the current data structure for results, since it is not well organized as a hierarchy.
It also prevents generating historical reports for projects.

I will think about that carefully.
Note that this is different from the instance dict, which holds resource data. The instance dict has been kept from the original development.
