“Libin diagram” Contribution Flows #837

ryscheng-mobile · 2024-02-09T18:18:32Z

What is it?

The ability for a project’s ecosystem to understand, in detail, how new users are entering and exiting their open source community / user dependency graph.

Over-time graph visualizations for the high and low value user flows into a project (+ growth rates); see Appendix for examples of visualizations.
The number of new users (counted for each of devs and dependent repos) + growth rate
A “user contribution / value” metric to understand the overall value of an individual’s contribution to the project
The number of bounced users (devs or dependent repos - i.e. those who contribute for 1 week, then stop)
An exportable list of new users and bounced users, sorted and scored by overall contribution
Information about activity around major events (e.g., hackathons), both during the event and in the weeks following
The number of new issues or PRs + growth rate
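
To make the "new users" and "bounced users" items above concrete, here is a minimal sketch in Python. It assumes events arrive as hypothetical `(user_id, date)` pairs, that weeks start on Monday, and that "bounced" means a user who was active in exactly one week that has already ended. None of these names or rules come from the OSO schema; they are illustrative only.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def new_and_bounced(events, as_of: date):
    """Group users by the week they first appeared and find bounced users.

    events: iterable of (user_id, date) pairs (hypothetical input shape).
    Returns (new_by_week, bounced), where new_by_week maps a week start to
    the set of users first seen that week, and bounced is the set of users
    active in exactly one week that ended before the week of as_of.
    """
    weeks_by_user = defaultdict(set)
    for user, day in events:
        weeks_by_user[user].add(week_start(day))

    current_week = week_start(as_of)
    new_by_week = defaultdict(set)
    bounced = set()
    for user, weeks in weeks_by_user.items():
        first_week = min(weeks)
        new_by_week[first_week].add(user)
        # Assumption: one active week, already over, counts as a bounce.
        if len(weeks) == 1 and first_week < current_week:
            bounced.add(user)
    return dict(new_by_week), bounced
```

Sorting and scoring the exported lists would sit on top of this, once a contribution/value metric is defined.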

ryscheng-mobile · 2024-02-09T18:21:05Z

ryscheng-mobile · 2024-02-09T18:22:50Z

Suggested steps:

Start by accessing our data on BigQuery. Create a Jupyter notebook that does this as a one-off for a particular repo.
Create a dbt pipeline in OSO, leveraging the OSO schema to gather data for all repos in a project.
Plumb this up into the front-end, so that you can see a Libin diagram for any OSO project.

ryscheng · 2024-09-08T17:55:33Z

Question for @ravenac95

@ccerv1 and I were just talking about this one, and I think we need some help with the metrics rolling window factory to support it.
I think there are actually 3 rolling windows at play here:

The classification rolling window (e.g. a developer needs to have events in 10 of 30 days to be considered fulltime)
The counting rolling window (e.g. we want to know how many active developers there were in the last 6 months)
The comparison rolling window (e.g. across the last 2x 6-month periods --- how many users went from part-time to full-time, or part-time to churned, etc)

I think right now we only assume a single rolling window, is that correct?
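
A rough sketch of how the first two windows nest, in plain Python rather than the metrics factory. The 10-of-30-days rule is the example from the list above; the function names, the `part_time`/`inactive` labels, and the reading of the counting window as "classified full-time on at least one day in the last 180 days" are assumptions made for illustration.

```python
from datetime import date, timedelta


def classify(active_dates, as_of: date, window_days: int = 30,
             full_time_days: int = 10) -> str:
    """Classification window: label a developer by active days in the window."""
    start = as_of - timedelta(days=window_days - 1)
    n = sum(1 for d in set(active_dates) if start <= d <= as_of)
    if n >= full_time_days:
        return "full_time"
    return "part_time" if n > 0 else "inactive"


def count_full_time(activity, as_of: date, counting_days: int = 180) -> int:
    """Counting window: developers who were full_time on any day in the window.

    activity: dict mapping developer id to a list of active dates
    (hypothetical input shape).
    """
    count = 0
    for dates in activity.values():
        # Re-run the classification window for every day of the counting window.
        if any(
            classify(dates, as_of - timedelta(days=i)) == "full_time"
            for i in range(counting_days)
        ):
            count += 1
    return count
```

The comparison window would then run `count_full_time` (or a per-developer label) for two adjacent periods and diff the results, which is why this ends up as rolling windows on rolling windows.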

ravenac95 · 2024-09-08T19:53:31Z

ohhhh ya interesting, we do currently assume 1, but ya I'll need to think how we can combine things so we can depend on some of these other rolling windows. This seems to be rolling window queries on rolling windows.

ravenac95 · 2024-09-08T19:54:22Z

This changes how I was thinking of things because I was trying to constrain the collection/project automatic creation a bit. Let me think on this!

ravenac95 · 2024-09-08T23:47:20Z

Actually so what i was thinking in terms of changes was to do something like this:

timeseries_metrics(
    model_prefix="timeseries",
    metric_queries={
        # This will automatically generate star counts for the given roll up periods. 
        # A rollup is just a simple addition of the aggregation. So basically we 
        # calculate the daily rollup every day by getting the count of the day. 
        # Then the weekly every week by getting the count of the week and
        # monthly by getting the count of the month. 
        # Additionally this will also create this along the dimensions (entity_types) of 
        # project/collection so the resulting models will be named as follows
        # `metrics.timeseries_stars_to_{entity_type}_{rollup}`
        "stars": MetricQueryDef(
            ref="stars.sql",
            rollups=["daily", "weekly", "monthly"],
            entity_types=["artifact", "project", "collection"], # This is the default value
        ),
        # This defines something with a rolling option that allows you to look back 
        # to some arbitrary window. So you specify the window and specify the unit. 
        # The unit and the window are used to pass in variables to the query. So it's 
        # up to the query to actually query the correct window. 
        # The resultant models are named as such
        # `metrics.timeseries_active_days_to_{entity_type}_over_{window}_{unit}`
        "active_days": MetricQueryDef(
            ref="active_days.sql",
            rolling={
                "windows": [30, 60, 90],
                "unit": "day",
                "cron": "0 0 1 */6 *", # This determines how often this is calculated
            }
        ), 
    },
    default_dialect="clickhouse",
)

I think this setup should give us the flexibility to be able to do the window of windows without having to build much additional craziness i think?

ryscheng · 2024-11-20T19:24:19Z

I will do some updating of docs for this but this is an example of a "third order" rolling window: https://github.com/opensource-observer/oso/blob/main/warehouse/metrics_mesh/oso_metrics/change_in_developers.sql

It is derived originally from the active days rolling window, which then uses the developer classifications (active, part-time, full-time) and then compares the last two intervals associated with that metric... it's ergonomically clunky but it works at the moment
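
For readers who want the shape of that last comparison step without opening the SQL, here is a small Python sketch. It is not a translation of `change_in_developers.sql`; it just assumes each interval has already been reduced to a hypothetical dict of developer id to label, and counts how developers moved between the two.

```python
from collections import Counter


def transitions(previous: dict, current: dict, missing: str = "inactive") -> Counter:
    """Count (previous_label, current_label) pairs across two intervals.

    previous / current: dicts mapping developer id to a label such as
    "part_time" or "full_time" (hypothetical input shape). A developer
    absent from one interval is given the `missing` label for it.
    """
    developers = set(previous) | set(current)
    return Counter(
        (previous.get(dev, missing), current.get(dev, missing))
        for dev in developers
    )
```

Each key of the returned counter corresponds to one arrow in the diagram, for example `("part_time", "full_time")` for developers who moved up between the two intervals.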

ryscheng · 2024-11-20T19:24:45Z

I imagine 4 states to start:

first time
part time
full time
churned
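
One possible way to assign those four states, sketched in Python. The rules here are assumptions, not the implemented definitions: "first time" means the first-ever activity falls inside the current window, "churned" means earlier activity but none in the window, and full time reuses the 10-of-30-days example from earlier in the thread.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Iterable, Optional


class State(Enum):
    FIRST_TIME = "first time"
    PART_TIME = "part time"
    FULL_TIME = "full time"
    CHURNED = "churned"


def developer_state(active_dates: Iterable[date], as_of: date,
                    window_days: int = 30,
                    full_time_days: int = 10) -> Optional[State]:
    """Return the state of one developer as of a date, or None if never active."""
    start = as_of - timedelta(days=window_days - 1)
    past = {d for d in active_dates if d <= as_of}
    if not past:
        return None  # never active up to as_of
    recent = {d for d in past if d >= start}
    if not recent:
        return State.CHURNED
    if min(past) >= start:
        return State.FIRST_TIME
    if len(recent) >= full_time_days:
        return State.FULL_TIME
    return State.PART_TIME
```

With these rules a brand-new developer is always "first time" for their first window, even if they are already very active; whether that is the desired precedence is an open design choice.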

ryscheng · 2024-12-02T17:45:56Z

This is implemented as of #2541
Closing this out, there are some QA issues that we need to follow up on

ccerv1 · 2024-12-10T21:52:17Z

From Observable

Github Developers Building on our Stack
To make these measurements precise - we can define each type of user and compute the change in each arrow on a weekly basis. There are a lot of ways to define these users for our protocols (by developer, by project, by node, etc), but we can use Github metrics (which are readily available) on developer activity and repo dependence to give us a strong proxy:

Never used: anyone who has never contributed to a github project/repo built on our stack (While this is a large population, we are clearly targeting sub-areas of this market first - aka web3 developers who are building apps, tools, etc on our stack)

First time users: a first-time developer contributing to a github project/repo built on our stack (Note, this means that first time users need to do more than install a binary - they need to build something on it (websites should count!))

High-value users: a frequent (~>5x/week?) contributor to projects/repos built on our stack (ipfs+filecoin) (Examples of high-value users should include: Textile devs, Audius devs, Fleek devs, ENS devs, Infinite Scroll devs, Anytype devs, Valist devs, Infura devs, etc)

Low-value users: a developer occasionally contributing to projects/repos built tangentially on our stack (Maybe they only depend on a small lib, or the dependence is very tangential to their core offering, etc)

Inactive users: a lapsed developer who is no longer actively contributing to a project built on our stack (either because the project is defunct, they stopped contributing, etc)

Top KPIs
(todo - each project team help populate these numbers)

High-Value Users:
New High-Value Users:
Lapsed High-Value Users:
First Time Users:
Bounced Users:

Weekly User Model (by arrow)
(todo - each project team help populate these numbers)

First Time Users: (this week), (last week), (growth rate)
Bounced Users: (this week), (last week), (growth rate)
New Low-Value Users: (this week), (last week), (growth rate)
New High-Value Users: (this week), (last week), (growth rate)
Reactivated Low-Value Users: (this week), (last week), (growth rate)
Upleveled High-Value Users: (this week), (last week), (growth rate)
Downleveled Low-Value Users: (this week), (last week), (growth rate)
Lapsed Low-Value Users: (this week), (last week), (growth rate)
Reactivated High-Value Users: (this week), (last week), (growth rate)
Lapsed High-Value Users: (this week), (last week), (growth rate)
Never Used: (this week), (last week), (growth rate)
High-Value Users: (this week), (last week), (growth rate)
Low-Value Users: (this week), (last week), (growth rate)
Inactive Users: (this week), (last week), (growth rate)
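
Each row above is a pair of weekly counts plus a growth rate. A minimal Python sketch of that calculation, assuming growth rate means week-over-week relative change and is left undefined when last week's count is zero (the function names are made up for illustration):

```python
from typing import Optional


def growth_rate(this_week: int, last_week: int) -> Optional[float]:
    """Week-over-week relative change, or None when last week was zero."""
    if last_week == 0:
        return None
    return (this_week - last_week) / last_week


def format_row(label: str, this_week: int, last_week: int) -> str:
    """Render one row in the same layout as the list above."""
    rate = growth_rate(this_week, last_week)
    rate_text = "n/a" if rate is None else f"{rate:+.1%}"
    return (
        f"{label}: {this_week} (this week), "
        f"{last_week} (last week), {rate_text} (growth rate)"
    )
```

For example, `format_row("First Time Users", 12, 10)` gives `First Time Users: 12 (this week), 10 (last week), +20.0% (growth rate)`.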

ryscheng · 2024-12-10T21:53:13Z

https://observablehq.com/d/536c37787c312370
https://observablehq.com/@protocol/pmf-dashboard-new-users
https://observablehq.com/@protocol/w3dm-pmf-dashboard-v1

github-project-automation bot added this to OSO Feb 9, 2024

github-project-automation bot moved this to Backlog in OSO Feb 9, 2024

ryscheng assigned innoobijr Feb 19, 2024

ryscheng added this to the UX: Collection/Project/Artifact Pages milestone Mar 26, 2024

ccerv1 modified the milestones: (f) Collection/Project/Artifact Pages, (f) PLN Milestone 2 Apr 9, 2024

ryscheng modified the milestones: (f) PLN Milestone 2, (f) PLN Milestone 3 Jun 12, 2024

ryscheng modified the milestones: (f) PLN Milestone 3, (c) PLN Milestone 2/3, (c) PLN Milestone 1 Jul 26, 2024

ryscheng assigned ryscheng and unassigned innoobijr Aug 27, 2024

ryscheng assigned ravenac95 and unassigned ryscheng Sep 8, 2024

ccerv1 mentioned this issue Sep 30, 2024

Catalog current state of timeseries metrics #2275

Closed

ryscheng assigned Jabolol Nov 20, 2024

Jabolol moved this from Up Next to In Progress in OSO Nov 27, 2024

Jabolol mentioned this issue Nov 27, 2024

add: libin diagram first iteration #2541

Merged

ryscheng closed this as completed Dec 2, 2024

github-project-automation bot moved this from In Progress to Done in OSO Dec 2, 2024
