feat: adding section on unique contributor count (#44)
* docs: adding section on unique contributor count

* docs: adjusting document format

* chore: add example for contrib count

* docs: small tweak

* docs: line lengths

* docs: normalizing styles

* docs: adding more metrics

* feat: add recommendations for story cycle time

* feat: rewrote the advice for the "Work Cycle Time" metric

Co-authored-by: Adriel Perkins <[email protected]>

* feat: add engineering-defaults.md and rewrite advice on "Branch Count and Age" metrics

Co-authored-by: Adriel Perkins <[email protected]>

* feat: Formatting the advice on "Number of Unique Contributors" metrics

Co-authored-by: Adriel Perkins <[email protected]>

* feat: rewrote guidance on lines changed metrics

Co-authored-by: Adriel Perkins <[email protected]>
Co-authored-by: Casey Wilson <[email protected]>

* feat:  rewrote guidance on pull request metrics

* fix: fixing a link to #branch-metrics

* feat: finished rewriting guidance on leading delivery indicators

Co-authored-by: Adriel Perkins <[email protected]>
Co-authored-by: Devin W. Leaman <[email protected]>
Co-authored-by: Casey Wilson <[email protected]>

* fix: updates for consistent markdown style

* fix: final review of leading delivery indicators

---------

Co-authored-by: Adriel Perkins <[email protected]>
Co-authored-by: Alex Ramsay <[email protected]>
Co-authored-by: Casey Wilson <[email protected]>
Co-authored-by: Devin W. Leaman <[email protected]>
4 people authored Feb 21, 2024
1 parent a7d9ccc commit fd43820
Showing 5 changed files with 318 additions and 47 deletions.
17 changes: 10 additions & 7 deletions docs/business-systems.md
@@ -1,10 +1,13 @@
# Business OKRs defined & measurable

[Objectives & Key Results (OKRs)][okr] are defined, with clear and inspiring
Objectives that align with the company's overall mission and vision. Key Results
are specific, measurable, and quantifiable, providing a clear path towards
achieving the Objectives. OKRs are regularly reviewed and updated as needed,
with a strong commitment to achieving them.

***How to Measure:*** All team members understand the OKRs and how their work
contributes to their achievement. The OKRs are logged in the company's OKR
tracker.

[okr]: https://www.productboard.com/blog/defining-objectives-and-key-results-for-your-product-team/
7 changes: 7 additions & 0 deletions docs/engineering-defaults.md
@@ -0,0 +1,7 @@
# Engineering Defaults

## Pair Programming

## Trunk Based Development

## Small Batch Delivery
55 changes: 42 additions & 13 deletions docs/human-systems/delivery-metrics/lagging-delivery-indicators.md
@@ -1,31 +1,60 @@
# DORA Metrics

The four key delivery metrics from [DORA](https://dora.dev/) are the industry
standard for measuring software delivery. We have found that these metrics are
essential in modern software delivery. However, these metrics are not absolute
and are lagging indicators of how teams are delivering software.

## Lead Time for Changes

Measures the time between merging a code change into trunk and deploying the
code change to production. Provides insights into workflow efficiency and
bottlenecks. Shorter lead times indicate smoother processes and quicker value
delivery.

***How to Measure:*** Conduct a team-level Value Stream Map (VSM) to capture
the time a code change takes to go from commit to production.

***Example:*** A team's lead time is 509.15h (~64 days), of which working time
is 163.85h (~20 days).
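
As a minimal sketch, the lead time for a single change can be computed from two
timestamps: when the change merged into trunk and when it reached production.
The values below are hypothetical.

```python
from datetime import datetime

# Hypothetical timestamps for one change: merged into trunk, then deployed.
merged_at = datetime.fromisoformat("2024-02-01T14:00:00+00:00")
deployed_at = datetime.fromisoformat("2024-02-03T10:00:00+00:00")

lead_time_hours = (deployed_at - merged_at).total_seconds() / 3600
print(f"lead time for this change: {lead_time_hours:.1f} hours")  # 44.0 hours
```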

## Deploy Frequency

Measures how often code is deployed to Production. Enables rapid iteration and
faster time-to-market. Encourages small, incremental changes, reducing the risk
of failures.

***How to Measure:*** Divide the total number of deployments made in a given
time period (e.g., a month) by the total number of days in that period

***Example:*** If a team deployed code 10 times in a month with 31 days, the
deployment frequency would be 10/31 = an average of *0.32 deployments per day*
over the month
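
The same arithmetic as a small sketch, using the numbers from the example
above:

```python
# Numbers from the example above.
deployments_in_period = 10
days_in_period = 31

deploy_frequency = deployments_in_period / days_in_period
print(f"deploy frequency: {deploy_frequency:.2f} deployments per day")  # 0.32
```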

## Change Failure Rate

Measures the percentage of deployments that result in failures once they are
in production or released to end users. Offers visibility into code quality and
stability. Low failure rates signify robust testing and higher software
reliability.

***How to Measure:*** Percentage of code changes that resulted in an incident,
rollback, or any type of prod failure. Calculated by counting the number of
deployment failures and then dividing by the number of total deployments in a
given time period.

***Example:*** If your team deployed five times this week and one of them
resulted in a failure, your change failure rate is 20%
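
A small sketch of the same calculation, using the numbers from the example
above:

```python
# Numbers from the example above.
total_deployments = 5
failed_deployments = 1  # deployments that caused an incident or rollback

change_failure_rate = failed_deployments / total_deployments * 100
print(f"change failure rate: {change_failure_rate:.0f}%")  # 20%
```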

## Mean Time to Restore (MTTR)

Calculates the time needed to recover from a service disruption and highlights
the team's ability to detect and resolve issues swiftly. Shorter MTTR reflects
strong incident response and system resilience.

***How to Measure:*** Measures the time it takes for a service to recover from
failure. Calculated by tracking the average time between the start of a service
disruption and the moment a fix is deployed.

***Example:*** A team's average time from problem detection to full recovery is
90 minutes over the course of 6 months.
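
A minimal sketch of the calculation, using hypothetical incident durations that
average out to the 90 minutes from the example above:

```python
from datetime import timedelta

# Hypothetical incidents: time from detecting the disruption to deploying a fix.
restore_times = [timedelta(minutes=45), timedelta(hours=2), timedelta(minutes=105)]

mttr = sum(restore_times, timedelta()) / len(restore_times)
print(f"mean time to restore: {mttr.total_seconds() / 60:.0f} minutes")  # 90
```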
267 changes: 248 additions & 19 deletions docs/human-systems/delivery-metrics/leading-delivery-indicators.md
@@ -1,38 +1,267 @@
# Engineering Metrics

These metrics provide signals as to how well teams are adopting the engineering
practices and principles. Complementary to the [DORA metrics][dora], these
metrics are designed to be leading indicators of how teams are delivering
software.

Many of these metrics can be gathered using an OpenTelemetry collector
configured to run a [GitProvider receiver][gitprovider].

## Branch Metrics

Engineering Defaults: [Pair Programming][pp], [Small Batch Delivery][sbd], and
[Trunk Based Development][tbd]

***Branch Count*** measures the number of branches that exist within a
repository at a given point in time, less the default branch.

***Branch Age*** measures the time a branch has existed within a repository at a
given point in time, less the default branch.

High branch counts and branch ages are forms of technical debt, introducing
unnecessary risk through increased maintenance and cognitive overhead. High
counts and ages may also signify:

* The team is using GitFlow
* The team is not pair programming
* The team is not delivering in small batches
* A high number of merge conflicts that must be resolved regularly

Branch count and branch age should be reduced to a minimum based on team context
and goals. These metrics have to be evaluated in context. For example, a large
open source project may accept a much higher norm than a product team of eight
engineers.

The below chart shows targets, trending toward the engineering defaults, for
branch count and branch age in the context of an ideal product team:

| | Risky | Mediocre | Better | Engineering Defaults |
|:--------------------:|-------|----------|--------|----------------------|
| Branch Count | 20+ | 10 - 20 | 5 - 10 | < 5 |
| Branch Age (in days) | 10+ | 7 - 10 | 3 - 7 | < 3 |
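
Branch count and age can also be approximated directly from a local clone. The
sketch below is an illustration only: it assumes the trunk is named
`origin/main` and uses each branch tip's last commit date as a rough proxy for
branch age.

```python
import subprocess
from datetime import datetime, timezone

# List remote branches with the committer date of each branch tip.
refs = subprocess.run(
    ["git", "for-each-ref",
     "--format=%(refname:short) %(committerdate:iso-strict)",
     "refs/remotes/origin"],
    capture_output=True, text=True, check=True,
).stdout

default_branch = "origin/main"  # assumption: the trunk is named main
now = datetime.now(timezone.utc)
ages = {}
for line in refs.strip().splitlines():
    ref, tip_date = line.split(" ", 1)
    if ref in (default_branch, "origin") or ref.endswith("/HEAD"):
        continue  # exclude the default branch and the origin/HEAD pointer
    # Last commit date is a rough proxy for how long the branch has existed.
    ages[ref] = (now - datetime.fromisoformat(tip_date)).days

print(f"branch count: {len(ages)}")
for ref, age_days in sorted(ages.items(), key=lambda kv: -kv[1]):
    print(f"{ref}: {age_days} days old")
```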

***Branch Ahead By Commit Count*** measures the number of commits a downstream
branch is ahead of its upstream branch, typically the trunk. A high number of
"commits ahead" may indicate a need for smaller batch delivery.

***Branch Behind By Commit Count*** measures the number of commits a downstream
branch is behind its upstream branch, typically the trunk. A high number of
"commits behind" may indicate the branch has lived too long, adding extra
maintenance and cognitive overload.

***Branch Lines Added*** measures the number of lines added to a downstream
branch when compared to its upstream branch, typically the trunk.

***Branch Lines Deleted*** measures the number of lines deleted from a
downstream branch when compared to its upstream branch, typically the trunk.

> Junior developers add code. Senior developers delete code.[^seniority]

The purpose of these metrics is simply to provide observable data points
regarding the addition and deletion of code when comparing a branch to the
default trunk. These are purely contextual metrics that a team can leverage to
provide additional information during self-evaluation. They can be correlated
with other metrics like Pull Request Age to provide additional insight into
cognitive overhead.
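
For a single branch, all four of these values can be read straight out of git.
A minimal sketch, assuming the trunk is `origin/main` and using a hypothetical
branch name:

```python
import subprocess

trunk = "origin/main"                # assumption: the trunk is named main
branch = "origin/feature/my-change"  # hypothetical branch, for illustration

# Commits the branch is behind/ahead of the trunk.
behind, ahead = subprocess.run(
    ["git", "rev-list", "--left-right", "--count", f"{trunk}...{branch}"],
    capture_output=True, text=True, check=True,
).stdout.split()

# Lines added/deleted relative to the merge base with the trunk.
added = deleted = 0
numstat = subprocess.run(
    ["git", "diff", "--numstat", f"{trunk}...{branch}"],
    capture_output=True, text=True, check=True,
).stdout
for line in numstat.splitlines():
    adds, dels, _path = line.split("\t", 2)
    if adds != "-":  # binary files report "-" instead of line counts
        added += int(adds)
        deleted += int(dels)

print(f"ahead by {ahead} commits, behind by {behind} commits")
print(f"lines added: {added}, lines deleted: {deleted}")
```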

> These metrics can be gathered automatically from GitHub and GitLab through the
> [Liatrio OTEL Collector][lcol]. Check out the [Liatrio OTEL Demo Fork][demo]
> to see this metric collection in action.

## Number of Unique Contributors

***Unique Contributors*** measures the total count of unique contributors to a
repository over the course of its lifetime. This count will monotonically
increase over time.

Interpreting this metric is very contextual. An open source library used within
production code may warrant a different number of contributors than a one-off
proof of concept (POC) in an internal repository.

The below chart takes a view based on several common scenarios.

| Impact | Risky | Hesitant | Desirable |
|:--------:|--------|----------|-----------|
| Critical | 1 - 20 | 21 - 50 | 51+ |
| High | 1 - 10 | 11 - 25 | 26+ |
| Moderate | 1 - 5 | 6 - 20 | 21+ |
| Low | 1 - 3 | 4 - 10 | 11+ |
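
A quick local approximation is to deduplicate author e-mail addresses across
the repository history. This sketch may over- or under-count people who commit
under multiple addresses:

```python
import subprocess

# Author e-mail of every commit across the repository's full history.
emails = subprocess.run(
    ["git", "log", "--all", "--format=%ae"],
    capture_output=True, text=True, check=True,
).stdout.split()

print(f"unique contributors: {len(set(emails))}")
```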

> These metrics can be gathered automatically from GitHub and GitLab through the
> [Liatrio OTEL Collector][lcol]. Check out the [Liatrio OTEL Demo Fork][demo]
> to see this metric collection in action.

## Code Coverage

***Code Coverage*** measures the percentage of code statements exercised during
unit test runs, assessing the amount of code logic invoked during unit testing.
Third-party tooling such as [CodeCov][codecov], running in your automated CI/CD
pipelines, can collect this metric. If the code in question has 100 lines of
code and 50 of those are executed by unit tests, then the code coverage of this
software is 50%.

[codecov]: https://app.codecov.io/gh/open-telemetry/opentelemetry-collector-contrib

Open O11y recommends having code coverage for any product which is going to
production.

## Code Quality

***Code Quality*** measures the quality of code across three tenets:

Security (Vulnerabilities): Security in the context of code quality refers to
the identification and mitigation of vulnerabilities that could be exploited by
attackers to compromise the system.

Reliability (Bugs): Reliability focuses on the software's ability to perform its
intended functions under specified conditions for a designated period. Bugs, or
errors in the code, can significantly impact the reliability of a software
application, leading to system crashes, incorrect outputs, or performance
issues.

Maintainability (Code Smells): Maintainability is about how easily software can
be understood, corrected, adapted, and enhanced. "Code smells" are indicators of
potential problems in the code that may hinder maintainability. These can
include issues like duplicated code, overly complex methods, or classes with too
many responsibilities. Addressing code smells through refactoring and adhering
to coding standards and best practices helps improve the maintainability of the
codebase, making it easier for developers to work with and evolve the software
over time.

Use third-party tooling, such as Coverity, that provides these metrics.

## Work Cycle Time

Engineering Defaults: [Small Batch Delivery][sbd]

Work cycle time calculates the time between a work item being started and
finished. For each work item, calculate the cycle time as:

$$
t = t_{finish} - t_{start}
$$

A team can then calculate the average cycle time for work items in a given
period of time. For example, a team may calculate the following cycle times for
four work items:

* $t_0 = 48$ hours
* $t_1 = 72$ hours
* $t_2 = 16$ hours
* $t_3 = 144$ hours

Then, the team can calculate the average as:

$$
\frac{1}{n}
\left(
\sum_{i=0}^{n-1}
t_i
\right)
= \frac{48 + 72 + 16 + 144}{4}
= 70
\text{ hours}
$$

In this example, the team may conclude that their average cycle time is very
large. As a result, the team agrees to write smaller work items to deliver in
smaller batches.

Team Retrospective Questions:

* Are work items blocked regularly?
* Do work items need to be split into smaller scope portions?

The below chart reflects general targets between large and small batch
delivery:

| | Large Batch | Mediocre | Decent | Small Batch |
|:------------------:|-------------|----------|--------|-------------|
| Average Cycle Time | Months | Weeks | Days | Hours |

> Important: We recommend first taking a look at the cycle time for branches ->
> pull requests -> deployment into production through the DORA metrics instead
> of relying on work cycle time.

## Repositories Count

The quantity of repositories managed by an organization or team is a critical
indicator of the extent of code they oversee. This metric is the base for all
other Source Control Management (SCM) metrics. High numbers of repositories for
a small team may signify a high cognitive overhead. You can correlate the number
of repositories a team owns to the number of repositories within the
organization.

However, it's crucial to recognize that this metric does not offer a
one-size-fits-all solution. Although it forms the basis for further analysis,
its significance can vary greatly. Like all metrics, it should be interpreted
within the broader context and aligned with the specific values and objectives
of the team.

## Pull Request Count and Age

Engineering Defaults: [Pair Programming][pp], [Small Batch Delivery][sbd], and
[Trunk Based Development][tbd]

***Pull Request Count*** measures the number of pull requests against the
default branch in a repository at a given point in time.

***Pull Request Age*** measures the time from when a pull request is opened to
when it is approved and merged.

These metrics help teams discover bottlenecks in the lifecycle of pull
requests. There are three main states of a pull request that can be measured:
`open`, `approved`, and `merged`. Ideally, a team processes a steady flow of
pull requests, which suggests a healthy, productive development process.

Below are the age states as defined in the
[Git Provider Receiver][gitprovider]:

* ***open age***: the amount of time a pull request has been open
* ***approved age***: the amount of time it took for a pull request to go from
open to approved
* ***merged age***: the amount of time it took for a pull request to go from
open to merged

The below chart outlines target times for each of these states, centered on the
engineering defaults. Remember to evaluate these in the context of things like
team size, contributor count, and inner source vs. open source.

| | Risky | Mediocre | Better | Engineering Defaults |
|:---------------------------:|-------|----------|--------|----------------------|
| Pull Request Count | 20+ | 10 - 20 | 5 - 10 | < 5 |
| Pull Request Age - Open | Weeks | Days | Hours | Minutes |
| Pull Request Age - Approved | Weeks | Days | Hours | Minutes |
| Pull Request Age - Merged | Weeks | Days | Hours | Minutes |
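
Pull request count and open age can also be sampled by hand from the GitHub
REST API. The sketch below is illustrative only: it assumes the third-party
`requests` package, an unauthenticated call against a public repository, and at
most 100 open pull requests.

```python
from datetime import datetime, timezone

import requests  # assumption: third-party HTTP client, installed separately

# Open pull requests for the repository (first page only, up to 100).
resp = requests.get(
    "https://api.github.com/repos/liatrio/liatrio-otel-collector/pulls",
    params={"state": "open", "per_page": 100},
    timeout=30,
)
resp.raise_for_status()

now = datetime.now(timezone.utc)
open_ages_hours = []
for pr in resp.json():
    opened_at = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
    open_ages_hours.append((now - opened_at).total_seconds() / 3600)

print(f"open pull request count: {len(open_ages_hours)}")
if open_ages_hours:
    average = sum(open_ages_hours) / len(open_ages_hours)
    print(f"average open age: {average:.1f} hours")
```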

Team Retrospective Questions:

* Are pull requests simply being ignored?
* Is the team overwhelmed with external contributions?
* Are the merge requirements excessively difficult? Can automation help?
* Are team members pair programming enough?
* Is the team delivering in large batches?

Pair programming can reduce the time needed to review pull requests. When
pairing, a code review effectively occurs in real time during development.
Thus, the pairing partner is already familiar with the changes and is able to
quickly review and approve a pull request.

Large batch deliveries increase the time needed to review a pull request.
This problem is discussed in detail [above](#branch-metrics).

Teams should also be concerned when these metrics are very low. This likely
indicates that teams aren't reviewing pull requests effectively. Additionally,
merging pull requests too quickly prevents other team members from reviewing the
code changes.

[^seniority]: Unknown source

[pp]: ../../engineering-defaults.md#pair-programming
[tbd]: ../../engineering-defaults.md#trunk-based-development
[sbd]: ../../engineering-defaults.md#small-batch-delivery
[demo]: https://github.com/liatrio/opentelemetry-demo/blob/main/docs/delivery.md
[lcol]: https://github.com/liatrio/liatrio-otel-collector/
[dora]: https://dora.dev/
[gitprovider]: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/gitproviderreceiver
