From 0bf9c1bc89cb6c7578b125e12d5694db46eb245a Mon Sep 17 00:00:00 2001 From: Chris Holdgraf Date: Thu, 17 Nov 2022 01:59:10 +0100 Subject: [PATCH 1/3] Simplifying documentation and shared responsibility documentation --- README.md | 21 +++ about/distributions/index.md | 13 +- about/infrastructure/index.md | 8 +- about/service/2i2c.md | 55 ------- .../{sustainability => service}/comparison.md | 59 +++++++- about/service/index.md | 3 +- .../costs/cloud.md => service/options.md} | 43 +++++- .../service-objectives.md | 6 +- about/service/shared-responsibility.md | 68 ++++----- about/service/team.md | 2 +- about/strategy/index.md | 134 ------------------ about/strategy/roadmap.md | 51 ------- about/support/index.md | 13 -- about/sustainability/costs/people.md | 24 ---- about/sustainability/index.md | 27 ---- about/sustainability/strategy.md | 87 ------------ admin/howto/encrypted-support.md | 34 ----- admin/howto/manage-users.md | 7 +- conf.py | 2 +- index.md | 8 +- noxfile.py | 27 ++-- policy/index.md | 2 +- support.md | 47 ++++-- 23 files changed, 217 insertions(+), 524 deletions(-) delete mode 100644 about/service/2i2c.md rename about/{sustainability => service}/comparison.md (81%) rename about/{sustainability/costs/cloud.md => service/options.md} (82%) rename about/{strategy => service}/service-objectives.md (98%) delete mode 100644 about/strategy/index.md delete mode 100644 about/strategy/roadmap.md delete mode 100644 about/support/index.md delete mode 100644 about/sustainability/costs/people.md delete mode 100644 about/sustainability/index.md delete mode 100644 about/sustainability/strategy.md delete mode 100644 admin/howto/encrypted-support.md diff --git a/README.md b/README.md index 6a7f178..c3615dd 100644 --- a/README.md +++ b/README.md @@ -5,3 +5,24 @@ This repository serves as the user-facing documentation and communication space Most of the infrastructure that we discuss in the documentation is deployed [in the `infrastructure/` 
repository](https://github.com/2i2c-org/infrastructure). See [the service documentation](https://docs.2i2c.org) for more information. + +## How to preview this documentation + +To preview this documentation, use the `Nox` tool. +First install it: + +``` +pip install nox +``` + +To build the documentation and place the HTML files in `_build/html`: + +``` +nox -s docs +``` + +To build the documentation with a server that **watches for changes and auto-builds the documentation with a preview**, run the following: + +``` +nox -s docs -- live +``` diff --git a/about/distributions/index.md b/about/distributions/index.md index 0af3717..5d2a957 100644 --- a/about/distributions/index.md +++ b/about/distributions/index.md @@ -4,11 +4,18 @@ These services share many of the same infrastructure components, but have customizations and optimizations that are more domain- or community-specific. :::{note} -Our services are in an "alpha" state - we are still learning a lot about the best way that these hubs can serve communities in research and education. -The infrastructure and service may change over the coming months! -See [our strategy page](../strategy/index.md) for an overview of what we're hoping to do and where we're headed next. +Our services are in an "alpha" state, and the service may change over the coming months! +See {external:tc:doc}`2i2c's strategy page in the Team Compass ` for an overview of what we're hoping to do and where we're headed next. ::: + +```{figure} https://drive.google.com/uc?export=download&id=1vL8ekAtUQ4TEik4-oWIn36VAOITdlmpR +:width: 80% + +A high-level technical overview of an Interactive Computing Service collaboratively run by 2i2c and a community of practice. Each hub is a JupyterHub Distribution with a collection of community-led open source projects that are customized for a particular use-case. +``` + + For more information about specific hub distributions, see the links below. 
Otherwise, read onward for high-level information about all of our Managed JupyterHubs. diff --git a/about/infrastructure/index.md b/about/infrastructure/index.md index 61a20c6..cb9eca1 100644 --- a/about/infrastructure/index.md +++ b/about/infrastructure/index.md @@ -1,4 +1,4 @@ -# Infrastructure and features +# Infrastructure features These sections contain information about the technical and cloud infrastructure behind the {term}`Managed JupyterHub Service`. They describe the major technologies that are used, what kinds of use-cases and workflows are possible, as well as some important considerations that may be relevant to your community. @@ -10,9 +10,3 @@ They describe the major technologies that are used, what kinds of use-cases and ../distributions/research security.md ``` - -```{figure} https://drive.google.com/uc?export=download&id=1vL8ekAtUQ4TEik4-oWIn36VAOITdlmpR -:width: 80% - -A high-level technical overview of an Interactive Computing Service collaboratively run by 2i2c and a community of practice. Each hub is a JupyterHub Distribution with a collection of community-led open source projects that are customized for a particular use-case. -``` diff --git a/about/service/2i2c.md b/about/service/2i2c.md deleted file mode 100644 index 304d032..0000000 --- a/about/service/2i2c.md +++ /dev/null @@ -1,55 +0,0 @@ -# 2i2c's qualifications - -```{epigraph} -2i2c is a mission-driven non-profit with expertise in cloud infrastructure, Jupyter, open science and scholarship, and open development practices. -``` - -2i2c provides a **managed, customized JupyterHub service** that is tailored for research and education communities. -We manage entirely non-proprietary, open-source tools that ensure user communities have the [Right to Replicate](http://2i2c.org/right-to-replicate) this infrastructure with or without 2i2c. 
-As a part of this service, 2i2c also makes **upstream contributions to open-source communities** as a part of continuously operating and improving this infrastructure. - -This page describes why we believe that 2i2c and its service model is uniquely suited for the research and education communities. - -:::{tip} -The content on this page can be re-used as a part of "uniqueness and sole source justification" forms when completing contracting for communities. -::: - -## 2i2c has expertise in managed cloud infrastructure in research and education - -Our team has developed and managed cloud infrastructure for over 5 years - first at our previous institutions and now as a part of 2i2c. -We follow modern practices for Site Reliability Engineering with cloud infrastructure like Kubernetes and JupyterHub. -This makes 2i2c uniquely capable of managing scalable and reliable cloud infrastructure for interactive computing. - -Here are a few of the major projects our team memebers have been involved in over the past few years. - -- [The Pangeo project](https://pangeo.io/) - A community platform for Big Data geoscience connecting researchers across the world to large-scale computing and data infrastructure. -- [The UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/) - A collection of university-wide JupyterHubs for education serving many thousands of students. -- [The Binder Project](https://docs.mybinder.org/) - a large public cloud service for reproducible computing environments using JupyterHub, serving nearly 150,000 sessions each week. -- [The Syzygy Project](https://syzygy.ca/) - A network of federated JupyterHubs for more than 15 Canadian Universities running on national infrastructure. -- [The Jupyter Book](https://jupyterbook.org) and [MyST Markdown](https://myst.jupyterbook.org/) projects - A collection of tools and standards for improving scientific and technical communication and authoring with interactive computing. 
- -## 2i2c has expertise in open source workflows and Jupyter - -2i2c's team is comprised of several "[Distinguished Contributors](https://jupyter.org/about)" in the Jupyter ecosystem, which is a crucial technical component of this service. -We are [core team members of JupyterHub and Binder](https://jupyterhub-team-compass.readthedocs.io/en/latest/team/index.html), and make regular contributions across the Jupyter ecosystem. -Moreover, our team has many years of experience with all aspects of the Jupyter stack and we are comfortable interacting with open source communities everywhere. -This makes 2i2c uniquely capable of both utilizing and improving this technology through upstream contributions. - -## 2i2c has expertise with research and education workflows - -2i2c has years of experience managing cloud resources specifically for research and education communities. -We have led and contributed to projects like [the Binder Project](https://docs.mybinder.org/), [the Pangeo Project](https://pangeo.io/), [the Syzygy Project](https://syzygy.ca/), [the UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/), and [the Jupyter Book project](https://jupyterbook.org) to serve thousands of users in the research and education community. -As a non-profit, we have defined our mission in order to serve research and education sector, and our team and governing body is made up of individuals from this community. -We strive to build an understanding of their needs, to represent their interests in the Jupyter and open source ecosystem, and to collaborate with them in our operations and development. -2i2c is uniquely positioned to serve as a collaborator for research and education via these efforts. - -## 2i2c is a transparent, collaborative non-profit - -2i2c is a mission-driven non-profit organization that has a commitment to doing its work openly, transparently, and inclusively. 
-Our mission is to provide researchers and educators with the infrastructure they need to do their work, and to support open source communities that underlie this infrastructure. -2i2c is governed by a [Steering Council](tc:structure:steerco) made of members from the research and education community. -2i2c manages all of our work in public spaces, including [all of our infrastructure](http://github.com/2i2c-org/infrastructure) as well as [all of our organizational strategy and practices](http://team-compass.2i2c.org/). - -## The bottom line - -In short, there is no other organization in existence with a focus on open source workflows with Jupyter, extensive expertise in cloud infrastructure and JupyterHub, a commitment to managing non-proprietary and vendor-agnostic tools, a core practice of making upstream contributions to community-run infrastructure, and a non-profit and mission-driven structure. diff --git a/about/sustainability/comparison.md b/about/service/comparison.md similarity index 81% rename from about/sustainability/comparison.md rename to about/service/comparison.md index 2cb66cb..6e22afa 100644 --- a/about/sustainability/comparison.md +++ b/about/service/comparison.md @@ -17,6 +17,63 @@ In each section below, we'll list a few similar companies and services that can Their presence and ordering do not constitute an "endorsement" and are not exhaustive - we are merely trying to be transparent and helpful about the other organizations in this space. ::: +## 2i2c's qualifications + +```{epigraph} +2i2c is a mission-driven non-profit with expertise in cloud infrastructure, Jupyter, open science and scholarship, and open development practices. +``` + +2i2c provides a **managed, customized JupyterHub service** that is tailored for research and education communities. +We manage entirely non-proprietary, open-source tools that ensure user communities have the [Right to Replicate](http://2i2c.org/right-to-replicate) this infrastructure with or without 2i2c. 
+As a part of this service, 2i2c also makes **upstream contributions to open-source communities** as a part of continuously operating and improving this infrastructure. + +This page describes why we believe that 2i2c and its service model are uniquely suited for the research and education communities. + +:::{tip} +The content on this page can be re-used as a part of "uniqueness and sole source justification" forms when completing contracting for communities. +::: + +### 2i2c has expertise in managed cloud infrastructure in research and education + +Our team has developed and managed cloud infrastructure for over 5 years - first at our previous institutions and now as a part of 2i2c. +We follow modern practices for Site Reliability Engineering with cloud infrastructure like Kubernetes and JupyterHub. +This makes 2i2c uniquely capable of managing scalable and reliable cloud infrastructure for interactive computing. + +Here are a few of the major projects our team members have been involved in over the past few years. + +- [The Pangeo project](https://pangeo.io/) - A community platform for Big Data geoscience connecting researchers across the world to large-scale computing and data infrastructure. +- [The UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/) - A collection of university-wide JupyterHubs for education serving many thousands of students. +- [The Binder Project](https://docs.mybinder.org/) - A large public cloud service for reproducible computing environments using JupyterHub, serving nearly 150,000 sessions each week. +- [The Syzygy Project](https://syzygy.ca/) - A network of federated JupyterHubs for more than 15 Canadian Universities running on national infrastructure. +- [The Jupyter Book](https://jupyterbook.org) and [MyST Markdown](https://myst.jupyterbook.org/) projects - A collection of tools and standards for improving scientific and technical communication and authoring with interactive computing. 
+ +### 2i2c has expertise in open source workflows and Jupyter + +2i2c's team is comprised of several "[Distinguished Contributors](https://jupyter.org/about)" in the Jupyter ecosystem, which is a crucial technical component of this service. +We are [core team members of JupyterHub and Binder](https://jupyterhub-team-compass.readthedocs.io/en/latest/team/index.html), and make regular contributions across the Jupyter ecosystem. +Moreover, our team has many years of experience with all aspects of the Jupyter stack and we are comfortable interacting with open source communities everywhere. +This makes 2i2c uniquely capable of both utilizing and improving this technology through upstream contributions. + +### 2i2c has expertise with research and education workflows + +2i2c has years of experience managing cloud resources specifically for research and education communities. +We have led and contributed to projects like [the Binder Project](https://docs.mybinder.org/), [the Pangeo Project](https://pangeo.io/), [the Syzygy Project](https://syzygy.ca/), [the UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/), and [the Jupyter Book project](https://jupyterbook.org) to serve thousands of users in the research and education community. +As a non-profit, we have defined our mission in order to serve research and education sector, and our team and governing body is made up of individuals from this community. +We strive to build an understanding of their needs, to represent their interests in the Jupyter and open source ecosystem, and to collaborate with them in our operations and development. +2i2c is uniquely positioned to serve as a collaborator for research and education via these efforts. + +### 2i2c is a transparent, collaborative non-profit + +2i2c is a mission-driven non-profit organization that has a commitment to doing its work openly, transparently, and inclusively. 
+Our mission is to provide researchers and educators with the infrastructure they need to do their work, and to support open source communities that underlie this infrastructure. +2i2c is governed by a [Steering Council](tc:structure:steerco) made of members from the research and education community. +2i2c manages all of our work in public spaces, including [all of our infrastructure](http://github.com/2i2c-org/infrastructure) as well as [all of our organizational strategy and practices](http://team-compass.2i2c.org/). + +### The bottom line + +In short, there is no other organization in existence with a focus on open source workflows with Jupyter, extensive expertise in cloud infrastructure and JupyterHub, a commitment to managing non-proprietary and vendor-agnostic tools, a core practice of making upstream contributions to community-run infrastructure, and a non-profit and mission-driven structure. + + ## Major factors to consider There are a few major categories to consider, and we'll provide a brief description of each below. @@ -238,7 +295,7 @@ It makes some simplifications and assumptions, and is meant to be a quick and "g (compare:2i2c)= ## 2i2c's managed cloud service -As a non-profit, we choose our prices to move forward on a sustainable path to achieve our mission according to [our cost model](costs:human) as well as [our growth model](strategy:growth). +As a non-profit, we choose our prices to move forward on a sustainable path to achieve our mission according to {external:tc:ref}`our cost model ` as well as {external:tc:ref}`our growth model `. Our service entails developing and managing entirely open-source, vendor-agnostic, and community-driven infrastructure that is customized for research and education. We curate and integrate this infrastructure, customize it for use-cases in research and education, and contribute back to the open source communities that underlie the tools we use. 
diff --git a/about/service/index.md b/about/service/index.md index 24b5d5f..f20b25a 100644 --- a/about/service/index.md +++ b/about/service/index.md @@ -8,7 +8,8 @@ This page provides some high-level information to help you get started, and the :maxdepth: 1 team shared-responsibility -2i2c +service-objectives.md +comparison ``` If you're interested in setting up a service for your community, click the button below to send us an email. diff --git a/about/sustainability/costs/cloud.md b/about/service/options.md similarity index 82% rename from about/sustainability/costs/cloud.md rename to about/service/options.md index da06312..0ca264e 100644 --- a/about/sustainability/costs/cloud.md +++ b/about/service/options.md @@ -1,10 +1,39 @@ +# Service options and costs + +2i2c pools resources from communities in order to sustain and grow our team. +We do this by charging fees for our services, and supplementing these fees with grants and donations. +These sections are living documents, and we update them regularly as we learn more. + +(service-offerings)= +## Our service offerings and pricing + +A matrix of our services and their prices is at the link below. +It is a living document, and we will continue to update it as we learn more. + +```{button-link} https://docs.google.com/document/d/1FNiDyKNDoe_TgU2WxuNZ5CayYD56tlNJpImQsAIGOmg/edit?usp=sharing +:color: primary + +Our service offerings and prices +``` + +## Types of costs + +There are two types of costs associated with our service: **staff costs** and **cloud costs**. +We treat each of these separately in order to be transparent about where community costs are coming from. + +- **Staff costs** cover all of the human time that goes into managing, supporting, developing, and improving our hub service. + See [](service-offerings) for details, and {external:tc:ref}`the Cost Model section in our Team Compass ` for our staffing cost model. 
+- **Cloud costs** cover the cost we pay a cloud provider for the infrastructure that powers your service. + This is either on a dedicated cloud cluster, or on a cluster that you share with other communities. + See [](costs:cloud) for more information. + (costs:cloud)= -# Cloud costs +## Estimating cloud costs We pass through cloud costs directly to our communities in a transparent manner. This encourages us to continually reduce the cloud costs for our communities, and helps them understand how their decisions affect their cloud bill. -## What components make up my cloud bill +### What components make up my cloud bill There are a few kinds of infrastructure that make up your cloud bill. Here is a short summary: @@ -16,7 +45,7 @@ Here is a short summary: There are some other components that go into your cloud bill (e.g., "networking costs") but these are the major pieces. -## User actions that impact cloud costs +### User actions that impact cloud costs Cloud costs depend on a few key factors that you and your community has control over. Here we list some major considerations (in decreasing order of importance): @@ -27,7 +56,7 @@ Here we list some major considerations (in decreasing order of importance): - **Dedicated vs. shared infrastructure**: If your community requires their own dedicated cloud infrastructure (for example, a dedicated Kubernetes cluster) then this will boost your cloud costs because you will not be sharing this cost with other communities. - **Cloud optimizations**: There are many ways to make cloud infrastructure more efficient and scalable, and the 2i2c engineering team is constantly experimenting with ways to lower costs for communities. For many non-2i2c hubs, inefficiency is a large source of cloud cost, though the 2i2c hubs are already well-optimized. -## Estimate my cloud costs +### Estimate my cloud costs The following is a very rough guideline to follow in order to understand and estimate what your cloud costs might be. 
These are similar whether you're using 2i2c to manage your hub, or running it yourself. @@ -65,14 +94,14 @@ None of these are guarantees about costs, but should give you a general idea. - For data- and compute-intensive hubs, see the Pangeo two-part series on their Kubernetes costs. ([part 1 link](https://medium.com/pangeo/pangeo-cloud-costs-part1-f89842da411d), [part 2 link](https://medium.com/pangeo/pangeo-cloud-cluster-design-9d58a1bf1ad3)) ::: -## How we estimate cloud costs for communities +### How we estimate cloud costs for communities The previous sections give a high-level overview of how to think about cloud costs and how they'll reflect your community's usage. This section describes how the 2i2c team calculates cloud costs and passes this on to communities. Over time, we will refine this process to make it more precise and (as much as possible) directly tied to the usage a community incurs. -### Shared kubernetes clusters +#### Shared kubernetes clusters For hubs that run on **shared Kubernetes clusters**, we estimate their cloud costs via the following process: @@ -80,7 +109,7 @@ For hubs that run on **shared Kubernetes clusters**, we estimate their cloud cos 2. Calculate the % usage for a specific community, based on the % of RAM requested throughout the month. 3. Estimate a community's cloud costs for that month by calculating `(monthly_cloud_bill_for_cluster * %_usage_for_this_community)`. -### Dedicated kubernetes clusters +#### Dedicated kubernetes clusters For hubs that run on a **dedicated Kubernetes cluster**, a cloud bill will be generated by the cloud provider, 2i2c will pay it in advance, and we will include this cost in the next month's invoice. This will exactly reflect the cloud charges incurred by the hub in that time. 
diff --git a/about/strategy/service-objectives.md b/about/service/service-objectives.md similarity index 98% rename from about/strategy/service-objectives.md rename to about/service/service-objectives.md index 6e6e55c..f541f5b 100644 --- a/about/strategy/service-objectives.md +++ b/about/service/service-objectives.md @@ -104,6 +104,7 @@ Our ability to meet these objectives will depend on the times they are reported - We will triage Change and Guidance requests and respond to them within one working day. - We will prioritize resolving Change and Guidance Requests by balancing them against our other development priorities as described in {doc}`our Support Team Process documentation `. + (objectives:cost)= ## Costs and cloud flexibility @@ -119,11 +120,6 @@ This is particularly relevant during sharp increases in hub usage. - If infrastructure will have known spikes of activity, we may temporarily favor speed over cost by asking for extra resources from the cloud provider. - If spikes in activity will come just after a holiday or weekend, we may make these changes a few days early to avoid working off-hours. -:::{seealso} -See [](../sustainability/index.md) for more information about costs. -::: - - (objectives:updates)= ## Upgrades and maintenance diff --git a/about/service/shared-responsibility.md b/about/service/shared-responsibility.md index 8489a80..74aa5b8 100644 --- a/about/service/shared-responsibility.md +++ b/about/service/shared-responsibility.md @@ -1,53 +1,29 @@ # Shared Responsibility Model -Many things must be done to successfully run a hub for a community. -Some of them are content-focused, some are community-focused, others are infrastructure-focused. +2i2c **shares responsibility for each hub** with the communities we serve.[^similar-models] +This aligns with our mission of promoting collaborative and open workflows in research and education. +It also leads to a more effective, more sustainable, and more transparent service[^ironies-automation]. 
It also helps ensure that the community has the [Right to Replicate](https://2i2c.org/right-to-replicate) their infrastructure. -2i2c **shares responsibility for each hub** with the communities we serve. We do this by defining the responsibilities that are a good fit for the skills and goals of each organization. -This "Shared Responsibility Model" is a useful way to understand what actions communities are still expected to perform under a service agreement with 2i2c.[^1] +We define and divide responsibilities via the following process: -[^1]: This is inspired by the **Shared Responsibility Model** that is often used to describe cloud services. For example, see [the AWS Shared Responsibility model for compliance](https://aws.amazon.com/compliance/shared-responsibility-model/) and for [Best Practices](https://aws.amazon.com/blogs/industries/applying-the-aws-shared-responsibility-model-to-your-gxp-solution/), the [GxP whitepaper from Google Cloud](https://cloud.google.com/security/compliance/cloud-gxp-whitepaper), and the [Azure Shared Responsibility Model](https://docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility). +- Define the major responsibilities needed to run a hub service with a community, and categorize them broadly by skillset. +- Assign responsibilities that are well-suited to the skills and the interests of each group. +- Choose a cost recovery model according to the responsibilities that 2i2c is taking on. +- Choose an operational model for the group so that each actor is empowered to carry out their responsibilities. -```{figure} https://drive.google.com/uc?export=download&id=1SIhHrzPXSFBZ0yyVpxHm0WYs63k0SBRQ -:width: 80% +This section describes the default model that we use with most communities. -An overview of some categories of shared responsibility between the {term}`Cloud Engineering Team` and the {term}`Community Leadership Team`. 
-``` +## Engineering responsibilities -```{figure} https://drive.google.com/uc?export=download&id=1S6Y9TQcXXLkrGrhgXQc7kLzq7dxcuw9a +Engineering responsibilities are technical changes needed to configure the hub for a community and to keep it running over time. +Below are a range of Engineering responsibilities. + +```{figure} https://drive.google.com/uc?export=download&id=1SIhHrzPXSFBZ0yyVpxHm0WYs63k0SBRQ :width: 80% -An overview of some categories of shared responsibility between the {term}`Community Support Team` and the {term}`Community Leadership Team`. +An overview of some categories of shared responsibility between the {term}`Cloud Engineering Team` and the {term}`Community Leadership Team`. ``` -:::{seealso} -[](../strategy/service-objectives.md) has information about our goals for uptime, reliability, and responsiveness in running this service. -::: - -## Why follow this model - -We use a Shared Responsibility model because we believe that it leads to the best service for the communities that we serve. -Our main goal is to ensure that each hub service is maximally impactful for the community it serves, and that it achieves this impact in the most efficient way possible.[^ironies-automation] - -[^ironies-automation]: Even when collaborating with engineering expertise in other organizations, we describe our service model in terms of areas of responsibility, rather than "tiers" of service that provide "burst capacity" or support only on an as-needed basis. This is because service "tiers" often leads to anti-patterns where support is needed from a person that is not empowered to be efforted in their duties (e.g., if they have been away from infrastructure for many months, and only after a series of escalations are needed to debug something). 
For more information on this, see [the Ironies of Automation](https://ckrybus.com/static/papers/Bainbridge_1983_Automatica.pdf) as well as [this post](https://blog.acolyer.org/2020/01/08/ironies-of-automation/) and [this post](https://www.thinkautomation.com/automation-advice/the-ironies-of-automation-explored/) explaining its relevance to technology and service delivery. - -To do that, we want to do two things: - -- Assign responsibilities that are well-suited to the skills and the interests of each group. This will boost the impact of each party. -- Choose the operational model that is most well-matched to whomever is carrying out a responsibility. This will boost the efficiency of each party. - -Our challenge is to figure out which responsibiltiies should lie with the community, and which should lie with the 2i2c team, to strike this balance of impact and efficiency. - -We also follow this model because it helps us ensure that the community has the [Right to Replicate](https://2i2c.org/right-to-replicate) their infrastructure. -We think of our community hubs as being a collaboration between 2i2c and the communities that use it, and this framing helps us be explicit about where we fit into the picture. -By defining the roles and responsibilities that we take on via our services, this provides a natural place for other communities to take on some responsibility if they have the capacity to do so. - -## Example responsibilities - -Below are a few responsibilities that are involved in running a community hub, they are roughly ordered from "least to most technical". -In [2i2c's service model](index.md), our responsibility generally begins around number 7. -However, for some communities we may take on more or less responsibility and adjust our costs accordingly depending on their needs. 
- ::::{grid} :::{grid-item} @@ -88,3 +64,17 @@ However, for some communities we may take on more or less responsibility and adj ::: :::: + + +## Community guidance + +```{figure} https://drive.google.com/uc?export=download&id=1S6Y9TQcXXLkrGrhgXQc7kLzq7dxcuw9a +:width: 80% + +An overview of some categories of shared responsibility between the {term}`Community Support Team` and the {term}`Community Leadership Team`. +``` + + +[^ironies-automation]: Even when collaborating with engineering expertise in other organizations, we describe our service model in terms of areas of responsibility, rather than "tiers" of service that provide "burst capacity" or support only on an as-needed basis. This is because service "tiers" often leads to anti-patterns where support is needed from a person that is not empowered to be efforted in their duties (e.g., if they have been away from infrastructure for many months, and only after a series of escalations are needed to debug something). For more information on this, see [the Ironies of Automation](https://ckrybus.com/static/papers/Bainbridge_1983_Automatica.pdf) as well as [this post](https://blog.acolyer.org/2020/01/08/ironies-of-automation/) and [this post](https://www.thinkautomation.com/automation-advice/the-ironies-of-automation-explored/) explaining its relevance to technology and service delivery. + +[^similar-models]: This is inspired by the **Shared Responsibility Model** that is often used to describe cloud services. For example, see [the AWS Shared Responsibility model for compliance](https://aws.amazon.com/compliance/shared-responsibility-model/) and for [Best Practices](https://aws.amazon.com/blogs/industries/applying-the-aws-shared-responsibility-model-to-your-gxp-solution/), the [GxP whitepaper from Google Cloud](https://cloud.google.com/security/compliance/cloud-gxp-whitepaper), and the [Azure Shared Responsibility Model](https://docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility). 
\ No newline at end of file diff --git a/about/service/team.md b/about/service/team.md index f63c4d7..7fdab74 100644 --- a/about/service/team.md +++ b/about/service/team.md @@ -83,7 +83,7 @@ Community Representatives Hub Administrator Hub Administrators - Trusted community members that perform common administrative operations on a hub that do not require intervention from a {term}`tc:Hub Engineer`. + Trusted community members that perform common administrative operations on a hub that do not require intervention from a Hub Engineer. {term}`Community Representatives` are the first Hub Administrators, and they may add new Hub Administrators via the JupyterHub interface. They are able to add users, start/stop servers, and generally have more control over operations on the hub. diff --git a/about/strategy/index.md b/about/strategy/index.md deleted file mode 100644 index bb102ab..0000000 --- a/about/strategy/index.md +++ /dev/null @@ -1,134 +0,0 @@ -# Objectives and strategy - -2i2c aims to make interactive computing infrastructure more accessible through a sustainable and scalable service model. -The 2i2c Managed JupyterHubs Pilot is an attempt at building infrastructure, process, and a sustainability model for this service. -We aim to run this pilot for several months, gaining experience and sharpening our understanding of how the service could best-meet the needs of the communities we wish to serve. - -2i2c values transparency and inclusion, and aims to run this pilot in an open manner. -This page describes the major strategy of the 2i2c Managed JupyterHubs pilot. - -```{toctree} -service-objectives.md -roadmap.md -``` - -## Goals of the pilot - -The primary aim for this pilot is **understanding how the managed JupyterHub service can be most impactful**. -Below are some major goals that we have: - -- Gain experience in running infrastructure for many different research and education organizations. 
-- Understand the diversity of organizations we may wish to serve, and the best way to reach each of them. - - For example, large vs. small organizations, research vs. education. -- Build deployment infrastructure that allows us to serve a small number of institutions, with a pathway to scaling to more institutions more quickly. - -## Communities we'll focus on - -2i2c aims to serve a diverse group of institutions during the pilot in order to understand the unique needs that each of them has. -We hope to serve around **20-25** communities in the pilot. -Here are a few major types of communities we hope to serve and learn from: - -- Community Colleges -- Research teams within universities -- University-wide education -- Event-based commmunities of practice - -## Use-cases we'll focus on - -2i2c must understand the major use-cases that it can focus its infrastructure efforts around, in order to build a repeatable and scalable service. -In the pilot, we will focus on a subset of use-cases that we believe are impactful across many communities of practice and organizations. - -- **Collaborative learning environments** - Communities of Practice that are focused around teaching and learning, and benefit from shared infrastructure to facilitate communicating and sharing with one another. Similar to our experience with Data 8, Syzygy, and Callysto. -- **Scalable research environments** - Communities of Practice that use cloud infrastructure to scale their workflows - either by accessing large datasets or leveraging scalable computing infrastructure from an interactive session. Similar to our experience with the Pangeo project. -- **Community event hubs** - Communities of Practice that have a time-bound event (e.g., a workshop or hackathon) that would benefit from a shared space to do their work and collaborate with one another. Similar to our experience with the NeuroHackademy and Pangeo workshops. 
- -## Our pricing strategy - -See [](../sustainability/index.md) for information about our pricing and cost strategy. - -(strategy:growth)= -## Our growth model - -Growing this service will require balancing two aspects of our team: - -- Our **capacity** to serve a given number of communities at a certain complexity of use-case. -- Our **commitments** to serve a specific set of communities. - -Because we are in a growth phase, we want our commitments to be near (or slightly above) our capacity. -We can increase our capacity by making infrastructure and process improvements, or by growing our team. -In the early phases of this pilot, we will focus on the former, and as our infrastructure and process is refined, we will consider the latter. -In either case, we should choose a pricing model that gives us enough buffer to be able to hire new team members when the right time comes. - -To carry this out, we'll take on new communities in "batches" and define pricing models for each that at least cover [our estimated costs](../sustainability/costs/people.md). -When we take on a new batch of communities, we should feel some tension as it challenges our process, support, and infrastructure in new ways. -As we make process and infrastructure improvements, will self-assess whether our capacity has grown. -If it has grown enough, we'll decide to bring on more communities. - -## Infrastructure strategy for the pilot - -We make a few base assumptions about the kind of infrastructure we will focus on in this pilot. -Here are several major components. - -### Where to deploy the infrastructure - -We'll focus our deployments on **major commercial cloud providers**. -These are the most likely places for organizations to want to run cloud infrastructure, and are the most scalable and sustainable. -In addition, getting Jupyter infrastructure to run well on all of the major cloud providers will have a large impact on a community's right to replicate. 
- -We'll focus on the following cloud providers: - -- Google Cloud -- Amazon Web Services -- Microsoft Azure - -In the short term, we favor deploying hubs on Google Cloud Platform. -This is because GCP has the most stable Kubernetes offering of all of the cloud providers. -We follow [team guidelines for when to deploy new Kubernetes clusters](infra:cluster:when-to-deploy). -For new hubs that don't require their own Kubernetes cluster, we plan to run them on Google Cloud until our team has capacity to run more infrastructure across Azure and AWS. - -### Why Jupyter and JupyterHub? - -- The Jupyter ecosystem is a collection of building blocks that are highly customizable and composable. They are popular and useful for many use-cases, but still require expertise to customize for a particular need. This is well-suited for 2i2c's skillset and the kind of service it wishes to provide. -- Jupyter is a community-led and multi-stakeholder ecosystem that aligns well with 2i2c's commitment to vendor-agnosticity and the [Right to Replicate](https://2i2c.org/right-to-replicate/). -- JupyterHub allows you to access centralized infrastructure for a community, but in a way that gives that community a lot of control over the details. It is a good balance between "SaaS" and "Fully bespoke community infrastructure". JupyterHub can be deployed via a single repository, but is also deployable by individual people or communities, providing them a clear off-ramp. - - -## Major questions to answer - -Below are the major questions we'd like to answer with this pilot. - -- How much work does it take to manage a community of JupyterHubs? What scaling efficiencies can we achieve? -- What are the major opportunities to improve technology or process to scale more efficiently? -- What is the balance of work between development, operations, administration, and sales? -- What are the major use-cases that can be met with repeatable JupyterHub distributions? 
-- What kind of support model is sustainable for our team? -- What are the major roles that should exist for a given hub? (both on the 2i2c side and the Community side) -- What other services do communities want other than just a JupyterHub? How would the JupyterHub connect with other services? -- What new development is needed to facilitate collaboration, communication, and exploration on a JupyterHub? - -## Deployment and operations strategy - -Our goal is to provide a service that minimizes infrastructural complexity while providing JupyterHubs that can be used my Communities of Practice independently of one another. -We wish to minimize the amount of engineering labor needed to develop and operate these deployments. -Below are a few major aspects of the service that we believe provide a good chance of accomplishing these goals: - -- Deploy independent JupyterHubs from a centralized deployment system -- Use Terraform to build Kubernetes clusters on cloud providers, and Kubernetes as a base to run the actual JupyterHub infrastructure. -- JupyterHubs should be pre-configured for a use-case, but customizable by the community -- JupyterHubs will respect the Community's [Right to Replicate](https://2i2c.org/right-to-replicate/) -- JupyterHubs may be more bespoke than is sustainable provided we can learn from them. -- JupyterHubs should be able to connect with external datasets and services, as the community needs. -- JupyterHubs should be customizable by the communities they serve, ideally without intervention from a 2i2c Engineer. - -## Our Timeframe -- Begin serving hub infrastructure immediately, as long as we do not over-extend our team -- Finish one iteration by Q1 2022. -- major questions should have research and answers by then. -- model for scaling the hub service should be developed by then - -## Where will we work? - -All of the work done in this pilot should be open and public by default, leveraging workflows that are common to open source communnities. 
-We will need to create some private channels for communication for conversations with sensitive or private information, but will strive to do everything that we can in public. - -Currently, all of our deployment infrastructure and development can be found at [the `infrastructure/` repository](https://infrastructure.2i2c.org). diff --git a/about/strategy/roadmap.md b/about/strategy/roadmap.md deleted file mode 100644 index e9bed1c..0000000 --- a/about/strategy/roadmap.md +++ /dev/null @@ -1,51 +0,0 @@ -# Roadmap - -{octicon}`clock` Updated **2021-09-16** - -This roadmap describes our major development priorities for the Managed JupyterHub Service. -It is meant to give an idea of where we hope to focus our efforts in the next several months. -Planning for this roadmap roughly follows a quarterly process, and items may be updated or changed as we learn more about the most important things to work on. -Treat this roadmap as a reflection of interests, not as a concrete promise. - -Below we describe major initiatives that are currently active in the Managed Hub Service. - -## Hub infrastructure launch - -The Managed Hubs Service relies heavily on infrastructure that centralizes the configuration and deployment of many JupyterHub instances. -Our first major project is to use [our alpha JupyterHubs service](https://infrastructure.2i2c.org/en/latest/reference/hubs.html) to drive development on this infrastructure stack. - -:::{note} -We are also [using an issue](https://github.com/2i2c-org/infrastructure/issues/610) to track long-term infrastructure needs for this service across all cloud providers. -That is more comprehensive and bigger in scope than any one initiative. -::: - -Our goal for this initiative is to have **basic infrastructure that automates the deployment of Kubernetes clusters and JupyterHubs**. 
- -:::{admonition} Deliverables for this initiative -You can find deliverables for this initiative at [this project board](https://github.com/orgs/2i2c-org/projects/10) -::: - -This initiative has the following major areas of work: - -- **Automation** - Automation is a critical part of scaling a service and minimizing manual steps with human intervention. We need to automate deployment and configuration of critical tools to deploy JupyterHub on Kubernetes. -- **Reporting and Quality Assurance** - Hub Administrators and Engineers should be confident that hubs are battle-tested and should know quickly if things are not working as expected. We should build basic reporting mechanisms and testing infrastructure that reports back what is going on with our infrastructure, as well as basic processes to ensure the quality of our hub deployments. -- **Basic hub setups** - Hub Administrators will want a basic environment that is useful to them. We need to make reasonable choices in Hub Infrastructure and the use-cases it enables. -- **Basic customizability** - Hub Administrators will want to customize their infrastructure to a degree. We should build in basic customization and configuration that does not require intervention from a 2i2c Engineer. - -## Hub service model - -In addition to basic infrastructure, 2i2c also requires a service model that makes it possible for communities to use the infrastructure, and that ensures the reliability of the infrastructure. -These largely require organizational, administrative, and team practices to operate and improve the Hub Service. - -Our goal for this initiative is to have **a working support and operations plan, a sales plan, team coordination practices, and administrative infrastructure to support this service. 
- -:::{admonition} Deliverables for this initiative -You can find deliverables for this initiative at [this project board](https://github.com/orgs/2i2c-org/projects/15) -::: - -This initiative has the following main areas of work: - -- **Administration** - In order to run a service that charges customers, we'll need an administrative process and infrastructure to handle the financial, legal, etc aspects. -- **Support and operations model** - The Hub Engineering team will need to coordinate Development and Operations of the hub service in partnership with those administering the service. This will require practices for coordination and prioritization. -- **Sales model** - In order to receive funding for running hubs for communities, we'll need a sales and pricing model that lets us sign contracts. -- **Documentation** - As this will be a public-facing service, it will be crucial that we build high-quality public-facing documentation that describes the service and infrastructure. diff --git a/about/support/index.md b/about/support/index.md deleted file mode 100644 index 0ea69fd..0000000 --- a/about/support/index.md +++ /dev/null @@ -1,13 +0,0 @@ -# User and community support - -As a part of the {term}`Managed JupyterHub Service`, we define two different kinds of user-support. -Documentation about each can be found at the links below. - -- **Change requests and incidents** are discussions around making changes to infrastructure in order to improve the hub service for users or to resolve outages. - See {ref}`2i2c's Support and Incident documentation ` for more information. -- **Usecase guidance** involves assisting users to help them be more affective in using the service infrastructure to accomplish their goals. - The **Hub Administration Topics** and **User Guide** in this documentation serve this purpose. 
- - :::{note} - We are actively exploring how to provide more guidance and support to the communities that use our infrastructure, see [this blog post](https://2i2c.org/blog/2022/job-product-community-lead/) for more information. - ::: diff --git a/about/sustainability/costs/people.md b/about/sustainability/costs/people.md deleted file mode 100644 index 0950134..0000000 --- a/about/sustainability/costs/people.md +++ /dev/null @@ -1,24 +0,0 @@ -(costs:human)= -# Personnel costs - -This page is a short description of the costs that we cover with service fees in order to sustain our service. -If you're interested in cloud costs (which we pass through directly to communities), see [](cloud.md). - -Our biggest cost is paying salaries for team members that carry out the services we provide. -This includes cloud operations and development, open source support, guidance and support for our communities, etc. - -:::{seealso} -You can find more about our compensation philosophy in our [compensation and benefits page](https://team-compass.2i2c.org/en/latest/hr/compensation.html). -::: - -At present, we choose monthly hub fees based on assumptions about _how many hubs an engineer can operate and support_. -We assume this is the primary bottleneck that limits our capacity. -This gives us an "engineering cost per hub" and we use this as a base to estimate the extra fees we need to charge to cover the non-engineering roles that are needed for the service. - -- **Cost of a 2i2c engineer**. If we assume that a 2i2c engineer is paid `$140,000/year`, with a `30%` benefits markup. This covers the design, development, and ongoing operation of cloud infrastructure for 2i2c's hubs. -- **Community support fees**. We add a `10%` markup to cover 2i2c's extra costs in providing ongoing support and community guidance for our hubs. This includes communications and guidance for community representatives as well as support for hub issues. -- **Open source support fees**. 
We add a `10%` markup to cover 2i2c's extra costs in ongoing open source engagement and support. This includes upstreaming contributions to open source projects, community engagement and leadership, and collaboration and planning. -- **Fiscal sponsor fees**. We add a `15%` markup to cover the fee of our fiscal sponsor, [Code for Science and Society](https://codeforscience.org/) (for see [](tc:structure:fiscal-sponsor) for information about the services that CS&S provides). - -The result is roughly `$250,000` annually for each engineering position. -The fees for each hub are thus determined by dividing this annual cost by the estimated number of hubs of a given type that we can realistically support. diff --git a/about/sustainability/index.md b/about/sustainability/index.md deleted file mode 100644 index 67e1239..0000000 --- a/about/sustainability/index.md +++ /dev/null @@ -1,27 +0,0 @@ -# Sustainability and pricing - -2i2c pools resources from communities in order to sustain and grow our team. -We do this by charging fees for our services, and supplementing these fees with grants and donations. -These sections are living documents, and we update them regularly as we learn more. - -You may find the prices of our services and their prices at the link below. -It is a living document, and we will continue to update it as we learn more. - -```{button-link} https://docs.google.com/document/d/1FNiDyKNDoe_TgU2WxuNZ5CayYD56tlNJpImQsAIGOmg/edit?usp=sharing -:color: primary - -Our service offerings and prices -``` - -## About our pricing - -As a mission-driven non-profit, we work with stakeholders in the communities we serve to develop a sustainability model that is transparent and that enables us to accomplish our mission of empowering the research and education community. -The sections below describe the strategy around our sustainability and pricing model, as well as details of our cost breakdown. 
- -```{toctree} -:maxdepth: 2 -costs/cloud -costs/people -comparison -strategy -``` diff --git a/about/sustainability/strategy.md b/about/sustainability/strategy.md deleted file mode 100644 index a1329fa..0000000 --- a/about/sustainability/strategy.md +++ /dev/null @@ -1,87 +0,0 @@ -# Pricing strategy and rationale - -This page describes the rationale and strategy behind our pricing. - -We invite comments and feedback about the attractiveness and sustainability of these services, and expect to update this model with feedback from the community as we learn more. - -```{list-table} -:widths: auto -:stub-columns: 1 - -- - Last Updated - - 2022-05-06 -- - Next checkpoint - - 2022-08 -``` - -## Guiding principles of our prices - -The following principles guide our decision-making around pricing our services. -We believe that following these principles allows us to deliver the best services for our communities in a way that aligns with our mission and values. -Our prices should: - -- Sustain and grow 2i2c's services and allow it to thrive as an organization. -- Support the extra cost associated with open source contributions. -- Reflect a holistic understanding of open source support, not just code. -- Be competitive with other "Data Science environment as a service" offerings (see below for how we consider ourselves relative to similar offerings). -- Be sustainable for the communities we serve, with mechanisms to accommodate institutions with fewer resources. - - -## Pricing structure - -Currently, we base our pricing on two major items: - -### Flat monthly fees - -We charge a flat monthly fee to cover [our personnel costs](costs/people.md). -We estimate the number of hubs an engineer can run, and use this to estimate our costs per hub after adding in project management and administration costs. - -Most hubs take extra effort during the _set up_ phase, and relatively less effort to maintain over time (depending on how many change requests a community makes). 
-As such we suspect that this pricing structure does not cover our costs in the first month or two, but regains those costs in subsequent months. - -In the future we may try to perform a more nuanced mapping of costs onto effort from our team, but for now we wish to keep things simple and predictable while we learn more. - -### Pass-through cloud costs - -In addition to our monthly hub fees, we pass cloud costs [directly to the communities we serve](costs/cloud.md), without taking any percentage markup. -We do this for two reasons: - -1. In our eyes, we are running infrastructure _on behalf of each community_, and wish to act as if a member of that community were running the infrastructure themselves. We are simply being compensated for our time and expertise. -2. Adding a percentage markup on cloud costs may create perverse incentives for us to avoid optimizing down a community's cloud costs. - -For these reasons, we are currently [passing through cloud costs directly to communities](costs/cloud.md). - -## Base fees for three service types - -These are based on major use-cases in the communities we have served. - -- **Educational communities** need basic hubs that are reliable, secure, and that serve a community of users. They tend to charge course fees and can recoup some costs this way. Assume 22 educational hubs per engineer. -- **Research communities** need a bit more power and complexity, and potentially access to more cloud infrastructure and data. They tend to have grant funding for fixed periods of time that is enough to cover an internal FTE. Assume 13 research hubs per engineer. -- **Partnerships** happen with more advanced / well-resourced communities that want more bespoke infrastructure and new development, and we want a service option that allows for this flexibility. It takes significant extra effort to define and carry out these more complex relationships, and we should only engage in them if they cover a larger percentage of our costs. 
We'll use a lower-bound of 20% FTE per engineer for these engagements (ie, at-most 5 partnership-level engagements per engineer), though we suspect most will require more FTE time than this. - -Below is a table which summarizes these major points - -% Generated with https://www.tablesgenerator.com/markdown_tables - -| Type of service | Annual cost / engineer | % Eng. FTE | N Hubs / Eng. | Monthly fee | Annual fee total | -|:---------------:|:----------------------:|:----------:|:-------------:|:-----------:|:----------------:| -| Education | $250,000.00 | 4-5% | 22 | $1,000.00 | $12,000.00 | -| Research | $250,000.00 | 7-8% | 14 | $1,500.00 | $18,000.00 | -| Partnership | $250,000.00 | >=20% | <=5 | >=$4,166.67 | >=$50,000.00 | - -### Markup for running on dedicated clusters - -In general we want our fees to scale with the amount of complexity that we have to manage. If we run JupyterHubs on community-specific cloud infrastructure, we have more responsibility and moving parts to keep track of. For example, we’ll need to manage credentials for cloud accounts, set up infrastructure that connects with those accounts, and manage a dedicated kubernetes cluster for the community. We’ll also need to provide billing reports per-cluster for cloud costs. All of this is extra complexity we must deal with, and so a 50% markup is a conservative estimate to cover this labor. - -## What we are missing - -We know that there are some communities that will not be well-served by the options described above. Our goal during the alpha is to build sustainability in order to meet these communities in the future. If the model described above doesn’t fit your needs well (e.g., wrong kind of service, too expensive, etc) please provide feedback, as this can help us evolve the service over time. - -Below are a few things that we know we are missing: - -- **Under-resourced communities**. For many communities, $12,000 a year is too much cost to justify. 
Our mission requires that we serve these communities as well, not just the ones with larger budgets. This proposal is a step towards sustainability, with the goal of developing new models that can serve under-resourced organizations as well. A few ideas to explore are sponsorship models, tiered pricing models that adjust price based on a community’s budget size, and reducing our internal costs (and thus the fees for our services). -- **Lightweight hubs**. For many individuals or communities, these offerings might involve more complexity than what they need. We’d like to offer a much more scalable and lightweight hub service that people could quickly spin up. We hope to prototype and experiment with ideas after the initial roll-out of the alpha service. -- **Communities that need multiple hubs**. For organizations with many sub-communities and complexity, a single hub is likely not enough to meet their needs. These communities may be better-served by their own dedicated federation of hubs, with a different pricing / growth model than the offerings described here. For now we’ll treat these as partnership opportunities, but may wish to standardize this in a service offering in the future. -- **Significant differences in community size**. Our pricing model assumes that most communities have a relatively similar degree of complexity and size between them. However, when communities grow to a certain size (say, two orders of magnitude), it generates additional work in supporting users and hub operations. We hope to better-understand the costs associated with serving these larger communities, and identify ways to recover them via our pricing. -- **Cloud payments as a service**. In some cases we manage the billing infrastructure and payments for communities (as opposed to them paying cloud providers directly). We do not currently charge explicitly for performing this service. In this case we take on extra work and complexity in tracking and paying cloud bills. 
We should estimate how much work and risk/liability is entailed in this, and work with our fiscal sponsor to understand how to cover these costs. -- **Liability for cloud payments**. For communities where we manage their cloud billing, we hold some liability because we’ll need to pay the cloud provider for their usage before they pay us. We should work with our fiscal sponsor to understand our risk here (for example, what if a community charges $50,000 in cloud costs and then refuses to pay their invoice). We should also explore potential ways to mitigate this risk (for example, pre-billing communities for *estimated* cloud usage to create a buffer, or asking for a deposit. diff --git a/admin/howto/encrypted-support.md b/admin/howto/encrypted-support.md deleted file mode 100644 index e3babca..0000000 --- a/admin/howto/encrypted-support.md +++ /dev/null @@ -1,34 +0,0 @@ -# Send `support@2i2c.org` encrypted content - -Sometimes community representatives need to send us *encrypted* information - -usually credentials for cloud access or an authentication system. We use -[age](https://age-encryption.org/) (pronounced *aghe*) to allow such information to -be encrypted and then sent to us in a way that *anyone* on the team can decrypt, -rather than the information be tied to a single engineer. You'll be directed to this -page by 2i2c support if we require something encrypted from you. - -This page describes how you can encrypt information and send it to us! - -1. [Install age](https://github.com/FiloSottile/age#installation) on your computer. - On a Mac, if you are using `homebrew`, you can simply `brew install age`. On Linux, - your package emanager should have `age`, and on Windows you can find binaries to download - [from the releases page](https://github.com/FiloSottile/age/releases). See - [all installation options](https://github.com/FiloSottile/age#installation) -2. 
Run `age -e -r age1mmx8hfzalmn3tmpryrfvcud5vyfakxdfqe683r40qkr6pjd2ag6s72cat5 -a` on - your terminal, and paste the contents of the message you want to encrypt. Press enter, - then `Ctrl-D`. Make sure to copy this exactly! -3. `age` will print the encrypted version of your message on your terminal, and it'll look - something like this: - - ``` - -----BEGIN AGE ENCRYPTED FILE----- - YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSAzOW1zTEwrM0FOZ2dWQUo0 - bks2WlZ0eU5LclRVNW4wcEZzRngyT0NSUkRjCm5oR0hGbzV0ZXJ4ZE0xQkdqSXFY - WkoxaWI4VWQvd3pNbnpiR1BjTnNwREkKLS0tIHpLRXorOWlsS2pFWHFiK1JqUW8v - U1pyYW40QSswcFNRZnBDcDcwN29EeVUKC5temNTLqJPd5oT0kfOOK2UHGgb2IVzK - zZS5QmYxmbRNa7qRGqbL - -----END AGE ENCRYPTED FILE----- - ``` - -4. Copy the encrypted version of the message and include it in the message to `support@2i2c.org`. - All 2i2c engineers will be able to decrypt the message! \ No newline at end of file diff --git a/admin/howto/manage-users.md b/admin/howto/manage-users.md index fa9304e..3d36f81 100644 --- a/admin/howto/manage-users.md +++ b/admin/howto/manage-users.md @@ -1,10 +1,9 @@ # User authentication and access - ## Authentication vs. Authorization -**Authentication** allows your users to prove who their are. -**Authorization** gives users certain permissions depending on their identity (such as "access to your hub", or "administrative privileges"). +- **Authentication** allows your users to prove who they are. +- **Authorization** gives users certain permissions depending on their identity (such as "access to your hub", or "administrative privileges"). (admin/configuration/authentication)= ### Authentication @@ -13,7 +12,7 @@ Users can prove who they are by logging in via an *authentication provider*. Cur 1. *Google*. This includes public `@gmail.com` accounts, as well as [Google Workspace](https://workspace.google.com/) accounts set up for your workspace or university. If you use the GMail interface to access your work / university email, it can be used here. -2.
[*GitHub*](https://github.com/). Extremely popular community of people creating, publishing and collaborating on code. Accounts are free, and many people already have them especially since the target community for most hubs are people who also write some kind of code. We can setup GitHub authentication so you can either manage a list of specific GitHub handles in the [JupyterHub ddmin panel]((admin/management/admin-panel)), or so that members of a specific GitHub organisation or team are automatically authorised to use the hub. +2. [*GitHub*](https://github.com/). Extremely popular community of people creating, publishing and collaborating on code. Accounts are free, and many people already have them, especially since the target community for most hubs is people who also write some kind of code. We can set up GitHub authentication so you can either manage a list of specific GitHub handles in the [JupyterHub admin panel](admin/management/admin-panel), or so that members of a specific GitHub organisation or team are automatically authorised to use the hub. 3. Username / Password via [auth0](https://auth0.com/docs/connections/database). A traditional username / password interface where users can sign up.
There are currently [limited diff --git a/conf.py b/conf.py index 6db47ed..61031b9 100644 --- a/conf.py +++ b/conf.py @@ -63,7 +63,7 @@ rediraffe_redirects = { "about/overview.md": "about/service/index.md", "about/service/roles.md": "about/service/team.md", - "about/pricing/index.md": "about/sustainability/index.md", + "about/pricing/index.md": "about/service/options.md", } # Disable linkcheck for anchors because it throws false errors for any JS anchors diff --git a/index.md b/index.md index d1d4f35..367feb4 100644 --- a/index.md +++ b/index.md @@ -21,11 +21,9 @@ They are meant for individuals who wish to learn about the service for their own ```{toctree} :maxdepth: 2 :caption: About the service +about/service/options about/service/index about/infrastructure/index -policy/index -about/sustainability/index -about/strategy/index ``` ## Use the hub @@ -36,6 +34,7 @@ Covers end-user workflows that are common for cloud-native workflows with intera :maxdepth: 2 :caption: Use the hub +policy/index data/index.md ``` @@ -48,11 +47,11 @@ These cover many things that you can do to manage and configure your hub and its :maxdepth: 2 :caption: Administer the hub +support admin/howto/configurator admin/howto/environment/index admin/howto/manage-users admin/howto/control-user-server -admin/howto/encrypted-support admin/topics/network ``` @@ -79,7 +78,6 @@ These tend to cover technical, administrative, and collaborative processes for i :maxdepth: 2 admin/howto/new-hub -about/support/index admin/howto/replicate admin/howto/create-billing-account ``` diff --git a/noxfile.py b/noxfile.py index e4ae42a..31a61f3 100644 --- a/noxfile.py +++ b/noxfile.py @@ -7,18 +7,15 @@ @nox.session(python="3.9") def docs(session): session.install("-r", "requirements.txt") - session.run("sphinx-build", *build_command) - -@nox.session(name="docs-live", python="3.9") -def docs_live(session): - session.install("-r", "requirements.txt") - - AUTOBUILD_IGNORE = [ - "_build", - "build_assets", - ] - cmd = 
["sphinx-autobuild"] - for folder in AUTOBUILD_IGNORE: - cmd.extend(["--ignore", f"*/{folder}/*"]) - cmd.extend(build_command) - session.run(*cmd) + if "live" in session.posargs: + AUTOBUILD_IGNORE = [ + "_build", + "build_assets", + ] + cmd = ["sphinx-autobuild"] + for folder in AUTOBUILD_IGNORE: + cmd.extend(["--ignore", f"*/{folder}/*"]) + cmd.extend(build_command) + session.run(*cmd) + else: + session.run("sphinx-build", *build_command) diff --git a/policy/index.md b/policy/index.md index 9b6a1fd..fdfb044 100644 --- a/policy/index.md +++ b/policy/index.md @@ -1,4 +1,4 @@ -# Policies +# User policies We have a few policies for both 2i2c and the communities that we work with. These describe the expectations and rules around the service. diff --git a/support.md b/support.md index 7ba328c..9257074 100644 --- a/support.md +++ b/support.md @@ -1,22 +1,16 @@ ---- -orphan: true ---- (support:email)= # Get support -Hub engineers can provide support and debugging for issues that are related to the 2i2c JupyterHub infrastructure. - Send all support requests as an email to [**`support@2i2c.org`**](mailto:support@2i2c.org). -This is email will be routed to the 2i2c support team, and we will get back to you shortly! +This will be routed to the 2i2c support team, and we will get back to you shortly! When you make a support request, please include as much information as possible in order to provide context needed to resolve your issue! % hard-coding this because sphinx-design buttons don't work with mailto Send a support email -## What kind of support does 2i2c provide? +## Types of support -2i2c Engineers provide support for major infrastructure issues or enhancements. -They are not needed for doing things like regular administrative tasks on a JupyterHub (see the other sections in this guide for how a Hub Administrator can accomplish these tasks instead). 
+For information about the types of support we may offer, and how it relates to our Shared Responsibility Model, see [](shared-responsibility:support). ## Who can ask for support? @@ -35,3 +29,38 @@ When you send us a support email, we'll try and resolve your issue via the follo - If needed, we'll open an issue in [our `infrastructure` repository](https://github.com/2i2c-org/infrastructure) in order to track the steps needed to resolve this issue. - Throughout this process, we'll communicate with you via the `support@2i2c.org` address. You are also welcome to follow along and discuss in any issues that we may create if you prefer. - When the issue is resolved, we'll send you a confirmation via `support@2i2c.org`, and we'll close the support issue. + +## Send encrypted content + +Sometimes community representatives need to send us *encrypted* information - +usually credentials for cloud access or an authentication system. We use +[age](https://age-encryption.org/) (pronounced *aghe*) to allow such information to +be encrypted and then sent to us in a way that *anyone* on the team can decrypt, +rather than the information being tied to a single engineer. You'll be directed to this +page by 2i2c support if we require something encrypted from you. + +This page describes how you can encrypt information and send it to us! + +1. [Install age](https://github.com/FiloSottile/age#installation) on your computer. + On a Mac, if you are using `homebrew`, you can simply `brew install age`. On Linux, + your package manager should have `age`, and on Windows you can find binaries to download + [from the releases page](https://github.com/FiloSottile/age/releases). See + [all installation options](https://github.com/FiloSottile/age#installation). +2. Run `age -e -r age1mmx8hfzalmn3tmpryrfvcud5vyfakxdfqe683r40qkr6pjd2ag6s72cat5 -a` on + your terminal, and paste the contents of the message you want to encrypt. Press enter, + then `Ctrl-D`. Make sure to copy this exactly! +3.
`age` will print the encrypted version of your message on your terminal, and it'll look + something like this: + + ``` + -----BEGIN AGE ENCRYPTED FILE----- + YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSAzOW1zTEwrM0FOZ2dWQUo0 + bks2WlZ0eU5LclRVNW4wcEZzRngyT0NSUkRjCm5oR0hGbzV0ZXJ4ZE0xQkdqSXFY + WkoxaWI4VWQvd3pNbnpiR1BjTnNwREkKLS0tIHpLRXorOWlsS2pFWHFiK1JqUW8v + U1pyYW40QSswcFNRZnBDcDcwN29EeVUKC5temNTLqJPd5oT0kfOOK2UHGgb2IVzK + zZS5QmYxmbRNa7qRGqbL + -----END AGE ENCRYPTED FILE----- + ``` + +4. Copy the encrypted version of the message and include it in the message to `support@2i2c.org`. + All 2i2c engineers will be able to decrypt the message! From 42e3a65d15a8941aea44ba9f38c212e7c0d94ac1 Mon Sep 17 00:00:00 2001 From: Chris Holdgraf Date: Wed, 23 Nov 2022 18:28:24 +0100 Subject: [PATCH 2/3] More updates to SRM --- .gitignore | 5 +- about/distributions/education.md | 5 +- about/distributions/index.md | 85 +++++++------ about/distributions/research.md | 4 +- about/infrastructure/index.md | 12 -- about/infrastructure/security.md | 35 ------ about/service/comparison.md | 45 +++---- about/service/index.md | 81 ++++++------ about/service/options.md | 122 ++++-------------- about/service/shared-responsibility.md | 168 +++++++++++++++++-------- about/service/team.md | 97 -------------- conf.py | 21 +++- index.md | 10 +- noxfile.py | 1 + topic/cloud-costs.md | 87 +++++++++++++ 15 files changed, 371 insertions(+), 407 deletions(-) delete mode 100644 about/infrastructure/index.md delete mode 100644 about/infrastructure/security.md delete mode 100644 about/service/team.md create mode 100644 topic/cloud-costs.md diff --git a/.gitignore b/.gitignore index 9a49481..7319250 100644 --- a/.gitignore +++ b/.gitignore @@ -133,4 +133,7 @@ _build/ # Docs data environments.txt -build_assets \ No newline at end of file +build_assets +images/shared_responsibility_diagram.png +images/collaborative_learning_hub.png +images/scalable_research_hub.png \ No newline at end of file diff --git 
a/about/distributions/education.md b/about/distributions/education.md index a49cdf3..6329ec3 100644 --- a/about/distributions/education.md +++ b/about/distributions/education.md @@ -1,5 +1,5 @@ (hub-types:education)= -# Collaborative learning hub +# Education and teaching The 2i2c Educational Hubs provide learning environments and infrastructure that is meant for teaching data science. These hubs are inspired by 2i2c's experience with the [DataHubs at UC Berkeley](https://docs.datahub.berkeley.edu/en/latest/) and the [Syzygy service](https://syzygy.ca/) for Canada. @@ -11,7 +11,8 @@ This hub deployment is designed for distributed learning for students with a var Below is a diagram that showcases some of the major components of this hub: -```{figure} https://drive.google.com/uc?export=download&id=1Mr51-s3D_KHPsAuTXbczaQ7mlPZUs9gm +% automatically downloaded in conf.py +```{figure} /images/collaborative_learning_hub.png A high level overview of major components in a collaborative learning hub. ``` diff --git a/about/distributions/index.md b/about/distributions/index.md index 5d2a957..ea655b0 100644 --- a/about/distributions/index.md +++ b/about/distributions/index.md @@ -3,12 +3,6 @@ 2i2c builds and operates **distributions of JupyterHubs** that are tailored for particular use-cases. These services share many of the same infrastructure components, but have customizations and optimizations that are more domain- or community-specific. -:::{note} -Our services are in an "alpha" state, and the service may change over the coming months! -See {external:tc:doc}`2i2c's strategy page in the Team Compass ` for an overview of what we're hoping to do and where we're headed next. -::: - - ```{figure} https://drive.google.com/uc?export=download&id=1vL8ekAtUQ4TEik4-oWIn36VAOITdlmpR :width: 80% @@ -16,33 +10,11 @@ A high-level technical overview of an Interactive Computing Service collaborativ ``` -For more information about specific hub distributions, see the links below. 
-Otherwise, read onward for high-level information about all of our Managed JupyterHubs. - -## What technology makes up each hub? - -🚀 core infrastructure -: Underneath each 2i2c JupyterHub is a [JupyterHub](https://jupyter.org/hub). These provide interactive computing sessions for each of your users, and connect to the other infrastructure in the cloud. We use [`auth0`](https://auth0.com/) and [CILogon](https://www.cilogon.org/) for authenticating users, which can connect to a number of other authentication protocols (such as OAuth2). - -💻 interfaces -: Each 2i2c JupyterHub has two main interactive interfaces: Jupyter interfaces (Notebook and Lab), and RStudio. Each of them is accessible from your session via `/tree`, `/lab`, and `/rstudio` endpoints in your URL. - -🌄 environment -: Your 2i2c JupyterHub has an environment that has been created for your particular use-case. It exists as a Docker image that your JupyterHub loads when a user starts a new session. These images can either be built with the tool [repo2docker](https://repo2docker.readthedocs.io/), or pulled directly from a Docker registry. The environment also comes pre-loaded with some tools that are helpful for working with JupyterHub, such as [nbgitpuller](https://jupyterhub.github.io/nbgitpuller). See [](environment/custom) for more information. - -🤖 hardware -: 2i2c JupyterHubs can run on most major cloud providers - the primary thing that is needed is a working Kubernetes deployment. By default, 2i2c runs its hubs on Google Cloud, but if communities wish to use a different provider, this can be accomplished as well. This also means that the hardware underlying the Kubernetes deployment is configurable. - -📦 data -: The data that is used by your 2i2c JupyterHub is provided by you! 2i2c JupyterHubs can connect with a variety of public data sources. We recommend using standard data structures or specifications via libraries like [Intake](https://intake.readthedocs.io/en/latest/). 
Note that 2i2c does not host this data itself, but can build connections between 2i2c JupyterHubs and these data sources. - -## Features of each hub - Here is a brief overview of the major features that are present in each. ```{csv-table} :header-rows: 1 -:widths: 20, 70, 5, 5 +:widths: auto :file: ../../build_assets/feature-matrix.csv ``` @@ -65,27 +37,66 @@ Here is a brief overview of the major features that are present in each. } -(note-on-urls)= -## Where are hubs accessed? -By default all 2i2c JupyterHub get their own URL with the following form: +## JupyterHub in the cloud + +At the core of a community service is one or more [JupyterHubs](https://jupyter.org/hub) that provide an access point for interactive computing and cloud infrastructure for your community members. + +You may access your community JupyterHub at a URL with the following form (though you may choose a custom URL if you wish): ``` ..2i2c.cloud ``` -Each 2i2c JupyterHub has a **hub name** (denoted by ``) and a **community name** (denoted by ``). Communities are collections of hubs around a particular community or collaboration. Each community infrastructure may be run by different teams. For more information, see [](../service/team.md). +JupyterHub provides interactive computing sessions for each of your users, and connect to the other infrastructure in the cloud. +Our JupyterHubs can run on Google Cloud, Amazon AWS, or Microsoft Azure. + +## Authentication + +We use [`auth0`](https://auth0.com/) and [CILogon](https://www.cilogon.org/) for authenticating users, which can connect to a number of other authentication protocols (such as OAuth2). -It is also possible to provide your own URL that points to a 2i2c JupyterHub. +## User interfaces -## Data outside of the hub +Each 2i2c JupyterHub has two main interactive interfaces: Jupyter interfaces (Notebook and Lab), and RStudio. Each of them is accessible from your session via `/tree`, `/lab`, and `/rstudio` endpoints in your URL. 
-If you wish to access data that exists outside of your 2i2c Hub, it is your responsibility to put this data in the cloud and manage the infrastructure around it. 2i2c does not control this data, it merely provides access to it via your hub infrastructure. +## Custom user environments -## Where are hubs configured and deployed? +Your 2i2c JupyterHub has an environment that has been created for your particular use-case. It exists as a Docker image that your JupyterHub loads when a user starts a new session. These images can either be built with the tool [repo2docker](https://repo2docker.readthedocs.io/), or pulled directly from a Docker registry. The environment also comes pre-loaded with some tools that are helpful for working with JupyterHub, such as [nbgitpuller](https://jupyterhub.github.io/nbgitpuller). See [](environment/custom) for more information. + +## Transparent infrastructure and operations All of the configuration and deployment scripts for the 2i2c JupyterHub can be found at [the `infrastructure/` repository](https://github.com/2i2c-org/infrastructure). This repository contains both the deployment code as well as documentation that explains how it works. It should be treated as "for advanced users only", and is provided for transparency and as a guide for the community to follow if they wish to manage their own infrastructure similar to 2i2c JupyterHub. To learn about how the `infrastructure/` repository works, we recommend checking out the [`infrastructure` documentation](infra:index). See the next sections for more information about each hub distribution. + +## Secure out of the box + +The cloud infrastructure that we manage follows best-practices in deploying cloud applications in a secure manner. +The [Zero to JupyterHub Helm Chart](https://zero-to-jupyterhub.readthedocs.io/en/latest/) is the community standard in deploying JupyterHub in the cloud, and is what 2i2c uses in all of its cloud hubs. 
+This project follows the principle of "secure by default", and has a number of configuration and design decisions that properly isolate user environments from one another, and prevent them from being able to access resources or data that is forbidden to them. + +As members of the JupyterHub team, we are constantly looking for ways to improve [the security of Zero to JupyterHub](https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html), and use our experience running these hubs to further improve JupyterHub's security. + +### Data privacy + +2i2c will not collect user data for any purpose beyond what is required in order to run a JupyterHub. +Depending on the choices of your community the hub might contain identifiable information (e.g., e-mail addresses used as usernames for authentication), but this will remain within your hub's configuration and is not shared publicly. + +Our {term}`Cloud Engineering Team` will have access to all of the information that is inside a hub (which it requires in order to debug problems and assist with upgrades), however we will not retain any of this data or move it *outside* of the hub, and will not retain it once the hub is shut down (except in order to transfer data to you at your request). + +## Monitored for abuse and unexpected costs + +We deploy [Grafana Dashboards](https://grafana.com/grafana/dashboards/) along with a [Prometheus Server](https://prometheus.io/) to continuously monitor the usage across all of our hubs. +This provides visual dashboards that allow us to identify abnormal behavior on a hub (such as a single user using unusual amounts of RAM, using a lot of CPU, or making unusual networking requests). + +### Cryptocurrency mining + +Cryptocurrency mining abuse occurs when users take advantage of cloud CPU in order to make money by mining cryptocurrency. +It is a common problem with cloud-based services and platforms.
+ +There are many different cryptocurrencies out there, but the most common by-far for abuse is [the Monero cryptocurrency](https://www.getmonero.org/) due to its anonymous nature. + +We deploy an open-source tool called [`cryptnono`](https://github.com/yuvipanda/cryptnono) to each of the clusters we manage. +This tool monitors any process that runs on the 2i2c hubs, and automatically kills any that are associated with Monero. diff --git a/about/distributions/research.md b/about/distributions/research.md index 84d0ff8..28f0aad 100644 --- a/about/distributions/research.md +++ b/about/distributions/research.md @@ -1,5 +1,5 @@ (hub-types:scalable-research)= -# Scalable computing hub +# Research and collaboration Scalable computing hubs are designed to let researchers and data scientists leverage cloud infrastructure to facilitate collaboration and interactive computation. They are heavily inspired by [the Pangeo Community infrastructure](https://pangeo.io). @@ -10,7 +10,7 @@ This hub deployment is designed for researchers and teams that wish to do their Below is a diagram that showcases some of the major components of this hub: -```{figure} https://drive.google.com/uc?export=download&id=1gWAIQVKcB-uxuJsBHqlDlRTq88oki1zn +```{figure} /images/scalable_research_hub.png A high level overview of major components in a scalable computing hub. ``` diff --git a/about/infrastructure/index.md b/about/infrastructure/index.md deleted file mode 100644 index cb9eca1..0000000 --- a/about/infrastructure/index.md +++ /dev/null @@ -1,12 +0,0 @@ -# Infrastructure features - -These sections contain information about the technical and cloud infrastructure behind the {term}`Managed JupyterHub Service`. -They describe the major technologies that are used, what kinds of use-cases and workflows are possible, as well as some important considerations that may be relevant to your community. 
- -```{toctree} -:maxdepth: 2 -../distributions/index.md -../distributions/education -../distributions/research -security.md -``` diff --git a/about/infrastructure/security.md b/about/infrastructure/security.md deleted file mode 100644 index 416dcd4..0000000 --- a/about/infrastructure/security.md +++ /dev/null @@ -1,35 +0,0 @@ -# Security and Abuse - -The cloud infrastructure that we manage follows best-practices in deploying cloud applications in a secure manner. -This page describes a few ways in which we make our hubs more secure and prevent them from abuse. - -## Secure out of the box - -The [Zero to JupyterHub Helm Chart](https://zero-to-jupyterhub.readthedocs.io/en/latest/) is the community standard in deploying JupyterHub in the cloud, and is what 2i2c uses in all of its cloud hubs. -This project follows the principle of "secure by default", and has a number of configuration and design decisions that properly isolate user environments from one another, and prevent them from being able to access resources or data that is forbidden to them. - -As members of the JupyterHub team, we are constantly looking for ways to improve [the security of Zero to JupyterHub](https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html), and use our experience running these hubs to further improve JupyterHub's security. - - -## Data privacy - -2i2c will not collect user data for any purpose beyond what is required in order to run a JupyterHub. -Depending on the choices of your community the hub might contain identifiable information (e.g., e-mail addresses used as usernames for authentication), but this will remain within your hub's configuration and is not shared publicly. 
- -Our {term}`Cloud Engineering Team` will have access to all of the information that is inside a hub (which it requires in order to debug problems and and assist with upgrades), however we will not retain any of this data or move it *outside* of the hub, and will not retain it once the hub is shut down (except in order to transfer data to you at your request). - - -## Cryptocurrency mining - -Cryptocurrency mining abuse occurs when users take advantage of cloud CPU in order to make money by mining cryptocurrency. -It is a common problem with cloud-based services and platforms. - -There are many different cryptocurrencies out there, but the most common by-far for abuse is [the Monero cryptocurrency](https://www.getmonero.org/) due to its anonymous nature. - -We deploy an open-source tool called [`cryptnono`](https://github.com/yuvipanda/cryptnono) to each of the clusters we manage. -This tool monitors any process that runs on the 2i2c hubs, and automatically kills any that are associated with Monero. - -## Usage monitoring - -We deploy [Grafana Dashboards](https://grafana.com/grafana/dashboards/) along with a [Prometheus Server](https://prometheus.io/) to continuously monitor the usage across all of our hubs. -This provides visual dashboards that allow us to identify abnormal behavior on a hub (such as a single user using unusual amounts of RAM, using a lot of CPU, or making unusual networking requests). diff --git a/about/service/comparison.md b/about/service/comparison.md index 6e22afa..4f0bf4e 100644 --- a/about/service/comparison.md +++ b/about/service/comparison.md @@ -10,13 +10,6 @@ For some excellent comprehensive guides, we also recommend reading these two res - [The Principles of Open Scholarly Infrastructure](https://openscholarlyinfrastructure.org/) describes how infrastructure and services can align themselves with the mission and values of the scholarly community. We recommend that you use services that align closely with these principles. 
- [The Values and Principles Framework and Assessment Checklist](https://commonplace.knowledgefutures.org/pub/5se1i1qy/release/4) is an assessment checklist to help those in the scholarly community choose services that are aligned with the mission and values of the scholarly community. -:::{tip} -The content on this page can be re-used as a part of "price reasonableness and comparisons" forms when completing contracting for communities. - -In each section below, we'll list a few similar companies and services that can be compared with 2i2c. -Their presence and ordering do not constitute an "endorsement" and are not exhaustive - we are merely trying to be transparent and helpful about the other organizations in this space. -::: - ## 2i2c's qualifications ```{epigraph} @@ -33,7 +26,7 @@ This page describes why we believe that 2i2c and its service model is uniquely s The content on this page can be re-used as a part of "uniqueness and sole source justification" forms when completing contracting for communities. ::: -### 2i2c has expertise in managed cloud infrastructure in research and education +### Expertise in managed cloud infrastructure in research and education Our team has developed and managed cloud infrastructure for over 5 years - first at our previous institutions and now as a part of 2i2c. We follow modern practices for Site Reliability Engineering with cloud infrastructure like Kubernetes and JupyterHub. @@ -47,14 +40,14 @@ Here are a few of the major projects our team memebers have been involved in ove - [The Syzygy Project](https://syzygy.ca/) - A network of federated JupyterHubs for more than 15 Canadian Universities running on national infrastructure. - [The Jupyter Book](https://jupyterbook.org) and [MyST Markdown](https://myst.jupyterbook.org/) projects - A collection of tools and standards for improving scientific and technical communication and authoring with interactive computing. 
-### 2i2c has expertise in open source workflows and Jupyter +### Expertise in open source workflows and Jupyter 2i2c's team is comprised of several "[Distinguished Contributors](https://jupyter.org/about)" in the Jupyter ecosystem, which is a crucial technical component of this service. We are [core team members of JupyterHub and Binder](https://jupyterhub-team-compass.readthedocs.io/en/latest/team/index.html), and make regular contributions across the Jupyter ecosystem. Moreover, our team has many years of experience with all aspects of the Jupyter stack and we are comfortable interacting with open source communities everywhere. This makes 2i2c uniquely capable of both utilizing and improving this technology through upstream contributions. -### 2i2c has expertise with research and education workflows +### Expertise with research and education workflows 2i2c has years of experience managing cloud resources specifically for research and education communities. We have led and contributed to projects like [the Binder Project](https://docs.mybinder.org/), [the Pangeo Project](https://pangeo.io/), [the Syzygy Project](https://syzygy.ca/), [the UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/), and [the Jupyter Book project](https://jupyterbook.org) to serve thousands of users in the research and education community. @@ -62,17 +55,13 @@ As a non-profit, we have defined our mission in order to serve research and educ We strive to build an understanding of their needs, to represent their interests in the Jupyter and open source ecosystem, and to collaborate with them in our operations and development. 2i2c is uniquely positioned to serve as a collaborator for research and education via these efforts. -### 2i2c is a transparent, collaborative non-profit +### A transparent, collaborative non-profit 2i2c is a mission-driven non-profit organization that has a commitment to doing its work openly, transparently, and inclusively. 
Our mission is to provide researchers and educators with the infrastructure they need to do their work, and to support open source communities that underlie this infrastructure. 2i2c is governed by a [Steering Council](tc:structure:steerco) made of members from the research and education community. 2i2c manages all of our work in public spaces, including [all of our infrastructure](http://github.com/2i2c-org/infrastructure) as well as [all of our organizational strategy and practices](http://team-compass.2i2c.org/). -### The bottom line - -In short, there is no other organization in existence with a focus on open source workflows with Jupyter, extensive expertise in cloud infrastructure and JupyterHub, a commitment to managing non-proprietary and vendor-agnostic tools, a core practice of making upstream contributions to community-run infrastructure, and a non-profit and mission-driven structure. - ## Major factors to consider @@ -202,7 +191,7 @@ How closely does this infrastructure track the latest developments in data scien ## Overview of services Below is a short table summarizing the kinds of services discussed below, and how they (roughly) perform for each of the factors discussed above. -It makes some simplifications and assumptions, and is meant to be a quick and "glanceable" way to compare options: +It makes some simplifications and assumptions, and is meant to be a quick and "glanceable" way to compare options. 
::::{grid} 3 :margin: 1 @@ -292,8 +281,15 @@ It makes some simplifications and assumptions, and is meant to be a quick and "g - 🟧 ::: +In addition, jump to a quick explanation of each type of service below: + +:::{contents} Jump to service description +:depth: 1 +:local: +::: + (compare:2i2c)= -## 2i2c's managed cloud service +### 2i2c's managed cloud service As a non-profit, we choose our prices to move forward on a sustainable path to achieve our mission according to {external:tc:ref}`our cost model ` as well as {external:tc:ref}`our growth model `. Our service entails developing and managing entirely open-source, vendor-agnostic, and community-driven infrastructure that is customized for research and education. @@ -336,7 +332,7 @@ Updates : 2i2c's team follows the latest developments in Jupyter and cloud infrastructure, and continuously incorporates them into our managed hubs. (compare:internal)= -## Internal staffing +### Internal staffing The most common way for organizations to achieve similar services is to staff their own internal teams. 2i2c encourages this, as it is aligned with our commitment to open source, vendor-agnostic tools, and the [Right to Replicate your infrastructure](https://2i2c.org/right-to-replicate). @@ -386,7 +382,7 @@ We constantly adjust our own prices and team compensation to be responsive to th ::: (compare:public-infra)= -## Large-scale public infrastructure +### Large-scale public infrastructure Depending on the state or country that you live in, you may be able to access large-scale shared infrastructure that is run by government agencies. For example, the [XSEDE](https://www.xsede.org/) program in the United States provides shared infrastructure that you can access with an application. @@ -431,7 +427,7 @@ Updates : There are more complex processes, bureaucracy, and constraints that manage the maintenance of large-scale infrastructure, and this means it tends to evolve and improve more slowly. 
(compare:consulting)= -## Consulting companies +### Consulting companies Many companies specialize in technical consulting that is flexible and tailored to an organization's needs. They can build bespoke infrastructure using an open source stack that is similar to the one that 2i2c offers. @@ -475,7 +471,7 @@ Updates : Depends on the consultancy, and their expertise in cloud infrastructure. (compare:saas)= -## Software as a Service Products +### Software as a Service Products There are many companies offering services and platforms via a subscription fee. The experience from a user's perspective may be similar and they may offer some open source tools as part of their services. @@ -514,10 +510,3 @@ Accessible Updates : Dependent on the platform. Most SaaS providers do a reasonable job of staying up to date with modern data and cloud workflows, though they tend to include new features in the form of custom or proprietary workflows. - - -## Bottom line - -There is a large ecosystem of vendors and services available for interactive data science. -2i2c believes that interactive computing is emerging as the vital medium for communications in research and education communities. As a result, we suggest that universities and research communities should build atop non-proprietary tools and commit to services that are vendor-agnostic and respect your [Right to Replicate your infrastructure](https://2i2c.org/right-to-replicate). -You should think about the constraints and principles that you'd like your infrastructure to follow, and choose the right approach for your organization. 
diff --git a/about/service/index.md b/about/service/index.md index f20b25a..2eff874 100644 --- a/about/service/index.md +++ b/about/service/index.md @@ -1,46 +1,28 @@ (about-the-project)= -# Service model +# Our collaborative service model -This section describes a high-level overview of our {term}`Managed JupyterHub Service` and the major teams, processes, and expectations around this service for both 2i2c and the partner community we work with. -This page provides some high-level information to help you get started, and the sections below go into more detail on our service model and structure. - -```{toctree} -:maxdepth: 1 -team -shared-responsibility -service-objectives.md -comparison -``` +Here we provide a high-level overview of our Managed JupyterHub Service and the major teams, processes, and expectations for both 2i2c and the partner communities we work with. +:::{admonition} Want to partner with us? If you're interested in setting up a service for your community, click the button below to send us an email. ```{button-link} mailto:hello@2i2c.org :color: primary Send us an email. ``` +::: -## What is the hub service? - -```{glossary} -Managed JupyterHub Service - An open, scalable, sustainable cloud service for interactive computing environments in research and education. - It follows a "DevOps as a Service" model where communities in research and education can pay for managed cloud infrastructure that runs on an entirely open source stack, and give you [the right to replicate your infrastructure](https://2i2c.org/right-to-replicate). - - It is run by [2i2c](https://2i2c.org), a non-profit organization that develops and operates interactive computing infrastructure for research and education. - 2i2c values transparency and community driven infrastructure. - The sections below describe the Managed JupyterHub Service, its strategy and goals, as well as information about its major features and pricing. -``` -## Who is this service for? 
+## Our shared responsibility model -2i2c's Managed JupyterHub Service is designed for communities in research and education who want the following things: +Our hub service is a collaboration between 2i2c and one or more communities. +We break down the responsibilities that must be carried out in order to successfully run a service. +We can then assign or share these responsibilities with partner communities according to their needs and interests. -1. Access to the latest technology in Jupyter and interactive computing for collaborative and scalable data science running in the cloud. -2. To utilize open source, community-driven tools and standards. -3. To partner with a mission-aligned organization that transparently and collaboratively runs infrastructure as a team. -4. To use infrastructure that they could take control of themselves, and that gives the user the [Right to Replicate](overview/right-to-replicate). -5. To use infrastructure that is designed by and for individuals in research and education. -6. To support infrastructure from a non-profit organization that is committed to communities in research, education, and open source. +```{toctree} +:maxdepth: 2 +shared-responsibility +``` (overview/right-to-replicate)= ## Your Right to Replicate your infrastructure @@ -52,15 +34,36 @@ One way in which we adhere to this principle is by respecting the [Community Rig The Right to Replicate gives communities the right to replicate their infrastructure in its entirety elsewhere, with or without 2i2c. ``` -We believe that following this principle will lead to a more equitable and more productive ecosystem for research and education in the cloud. It also helps avoid many of the potential downsides of relying on a cloud vendor for infrastructure. Read the [Right to Replicate](https://2i2c.org/right-to-replicate/) documentation for more information about what this means. 
+Following this principle leads to a more equitable and more productive ecosystem for research and education in the cloud, and helps avoid many of the potential downsides of relying on a cloud vendor for infrastructure. +Read the [Right to Replicate](https://2i2c.org/right-to-replicate/) documentation for more information about what this means. -:::{seealso} -Check out [](../../admin/howto/replicate.md) for information about replicating a 2i2c JupyterHub. -::: +## Service Level Objectives -## Sustaining open source +Our Service Level Objectives define our **goals in running the service** for each partner community. +This includes goals like service uptime and support responsiveness. -Everything that 2i2c deploys is open source and community-driven projects. -We prioritize using multi-stakeholder projects that are well-supported by a diverse community of contributors. -The resources that we receive to run 2i2c JupyterHubs thus **also go towards making open-source improvements** in these communities so that others may benefit from them. -We see this as an opportunity to solve two problems with one stream of funding: support research and education, and [support open source communities](https://2i2c.org/about#values) in the Jupyter ecosystem and beyond. +```{toctree} +:maxdepth: 2 +service-objectives.md +``` + +## Cost model + +There are two types of costs associated with our service. +We treat each of these separately in order to be transparent about where community costs are coming from. +They will be covered as either two separate invoices, or two different line items on the same invoice. + +**Staff costs** cover all of the human time that goes into managing, supporting, developing, and improving our hub service. +See [](service-offerings) for details, and {external:tc:ref}`the Cost Model section in our Team Compass ` for our staffing cost model. + +**Cloud costs** cover the cost we pay a cloud provider for the infrastructure that powers your service. 
+This is either on a dedicated cloud cluster, or on a cluster that you share with other communities. + See [](costs:cloud) for a guide on how to estimate your community's cloud costs. + +## Comparison to similar services + +A comparison with similar kinds of services, to help you understand your options and the considerations you may want to take. + +```{toctree} +comparison +``` diff --git a/about/service/options.md b/about/service/options.md index 0ca264e..20abc85 100644 --- a/about/service/options.md +++ b/about/service/options.md @@ -1,14 +1,15 @@ -# Services options and cost +(service-offerings)= +# Use-cases and prices -2i2c pools resources from communities in order to sustain and grow our team. -We do this by charging fees for our services, and supplementing these fees with grants and donations. -These sections are living documents, and we update them regularly as we learn more. +Our Hub Service is an open, scalable, sustainable cloud service for interactive computing environments. +We offer cloud infrastructure hubs that are designed for use-cases in research and education, and flexible enough to be tailored to the needs of each community. -(service-offerings)= -## Our service offerings and pricing +They run entirely on community-driven and open-source infrastructure, +follow a [community-centric collaborative service model](./index.md), and give you [the right to replicate your infrastructure](https://2i2c.org/right-to-replicate). -A matrix of our services and their prices are at the link below. -It is a living document, and we will continue to update it as we learn more. +A table summarizing our services and their prices is at the link below. +The rest of the pages in this section describe the cloud services that we offer and the use-cases they are designed for. +See [](./index.md) for more about our collaborative service model.
```{button-link} https://docs.google.com/document/d/1FNiDyKNDoe_TgU2WxuNZ5CayYD56tlNJpImQsAIGOmg/edit?usp=sharing :color: primary @@ -16,100 +17,27 @@ It is a living document, and we will continue to update it as we learn more. Our service offerings and prices ``` -## Types of costs - -There are two types of costs associated with our service: **human costs** and **cloud costs**. -We treat each of these separately in order to be transparent about where community costs are coming from. - -- **Staff costs** cover all of the human time that goes into managing, supporting, developing, and improving our hub service. - See [](service-offerings) for details, and {external:tc:ref}`the Cost Model section in our Team Compass ` for our staffing cost model. -- **Cloud costs** cover the cost we pay a cloud provider for the infrastructure that powers your service. - This is either on a dedicated cloud cluster, or on cluster that you share with other communities. - See [](costs:cloud) for more information. - -(costs:cloud)= -## Estimating cloud costs - -We pass through cloud costs directly to our communities in a transparent manner. -This encourages us to continually reduce the cloud costs for our communities, and helps them understand how their decisions affect their cloud bill. - -### What components make up my cloud bill - -There are a few kinds of infrastructure that make up your cloud bill. -Here is a short summary: - -- **Nodes for user sessions**: A "node" is kind-of like a virtual machine or a dedicated computer. It is reserved cloud infrastructure that you can use as you wish. Nodes have resources allocated to them (e.g., `100GB` of RAM). JupyterHub uses dedicated nodes for user sessions, so more users == more nodes. You generally pay cloud providers by the minute for each node used. -- **Storage costs**: In order for users to persist their work over time, we must pay for filesystem storage. This is used to store user notebooks and content, data, etc. 
You generally pay cloud providers by the `GB` over time. -- **Nodes for hub infrastructure**: In addition to the cloud nodes for user sessions, there are also nodes to run the JupyterHub and supporting infrastructure to manage user log-ins, do monitoring and reporting of activity, etc. -- **Nodes for specialized computing**: For hubs that have scalable computing resources like a Dask Gateway, these generally request special nodes _on the fly_. When a scalable computation is executed, the cloud quickly requests many new nodes to complete the computation, and then removes them when it is done. You pay for the time used for each node during this computation. - -There are some other components that go into your cloud bill (e.g., "networking costs") but these are the major pieces. - -### User actions that impact cloud costs - -Cloud costs depend on a few key factors that you and your community has control over. -Here we list some major considerations (in decreasing order of importance): - -- **Base user resources needed**: The power and complexity of the user environment is the biggest driver of "base cost per user". This is largely driven by the amount of memory (RAM) each user needs. See below for a more in-depth explanation. -- **Community usage over time**: Resources are requested from the cloud "on-demand", meaning that your cloud costs will scale up and down with number of active users at any given moment. -- **User storage over time**: User storage is different from on-demand resources, because it's "always being used" even when you're not logged-in. We recommend storing large datasets and such in cloud object storage, which is much cheaper. -- **Dedicated vs. shared infrastructure**: If your community requires their own dedicated cloud infrastructure (for example, a dedicated Kubernetes cluster) then this will boost your cloud costs because you will not be sharing this cost with other communities. 
-- **Cloud optimizations**: There are many ways to make cloud infrastructure more efficient and scalable, and the 2i2c engineering team is constantly experimenting with ways to lower costs for communities. For many non-2i2c hubs, inefficiency is a large source of cloud cost, though the 2i2c hubs are already well-optimized. +## An overview of our infrastructure -### Estimate my cloud costs +The section below provides an overview of our infrastructure and the technical features that are available in any of our hubs. -The following is a very rough guideline to follow in order to understand and estimate what your cloud costs might be. -These are similar whether you're using 2i2c to manage your hub, or running it yourself. - -Generally speaking, **the biggest technical driver of cloud costs is user memory (RAM)**. -This is because RAM must be "reserved" on a node, and each node has a finite amount of memory available to it. - -Let's say a user node costs `$100.00` an hour, and comes with `100GB` total RAM. -If each user is guaranteed `1GB` of RAM, then the node can theoretically fit `100` users at a time. -`100` simultaneous users will cost `$100.00` an hour, or roughly `$1 / user / hour`. - -If we double the guaranteed RAM available to users, then the node can now fit `50` users at once (`100 GB / 2 GB per user = 50 users total`). -We now need twice the number of nodes to handle the same number of users. -`100` simultaneous users will now cost `$200.00` an hour, or roughly `$2 / user / hour`. - -In practice, the cost per node depends heavily on the cloud provider, and is constantly in-flux. -**To estimate your own cloud costs**, follow these steps: - -1. **Estimate memory available to each user**. The amount of RAM needed for each user is often the biggest driver of cloud cost. Decide the "maximum" amount of RAM that a user % will generally need, and multiply that by 1.5x. -2. **Determine how many average simultaneous users you'd like a hub to support**. 
This isn't necessarily the total size of your community, but how many people you think will be % using the hub *at the same time*. -3. **Look up the monthly price for an `n1-highmem-4` node**. This is a basic node type that serves most use-cases and can be used as a benchmark for comparison. - 1. [Go to the Google Cloud pricing page](https://cloud.google.com/compute/vm-instance-pricing). This lists prices for many kinds of nodes with Google Cloud Platform. - 2. Go to the `N1 high-memory machine types` section. This contains prices for all `N1` node types with high memory. - 3. Look at the hourly price for `n1-highmem-4`. - 4. Divide this amount by `n_simultaneous_users_per_hour * GB_per_user`. - 5. This is your estimated extra cost per hour per user. -4. **Estimate storage costs**. Estimate your storage costs based on the expected storage each user will take up. 2i2c's hubs use a standard NFS File Storage for most hubs, which has very fast latency for interactive computing. [Here are Google's file storage prices](https://cloud.google.com/storage/pricing#price-tables), for example. You can estimate these costs based on the expected storage used across all of your users. - -:::{seealso} -We recommend checking out the following resources to learn more about cloud costs. -None of these are guarantees about costs, but should give you a general idea. - -- For general information and explanation, see [the Zero to JupyterHub cost projection documentation](z2jh:cost). -- For educational or "lightweight resources" hubs, see [this rough cost analysis notebook from the UC Berkeley DataHub](https://nbviewer.jupyter.org/github/berkeley-dsep-infra/datahub-usage-analysis/blob/master/notebooks/03-visualize-cost-and-usage.ipynb). -- For data- and compute-intensive hubs, see the Pangeo two-part series on their Kubernetes costs. 
([part 1 link](https://medium.com/pangeo/pangeo-cloud-costs-part1-f89842da411d), [part 2 link](https://medium.com/pangeo/pangeo-cloud-cluster-design-9d58a1bf1ad3)) -::: - -### How we estimate cloud costs for communities - -The previous sections give a high-level overview of how to think about cloud costs and how they'll reflect your community's usage. -This section describes how the 2i2c team calculates cloud costs and passes this on to communities. +```{toctree} +../distributions/index.md +``` -Over time, we will refine this process to make it more precise and (as much as possible) directly tied to the usage a community incurs. +## Education use-cases -#### Shared kubernetes clusters +JupyterHub is excellent for educational use-cases, such as providing a cloud-based learning environment for large-scale data science teaching or domain-specific cloud-enabled science. -For hubs that run on **shared Kubernetes clusters**, we estimate their cloud costs via the following process: +```{toctree} +../distributions/education +``` -1. Calculate the monthly cloud bill for this cluster. -2. Calculate the % usage for a specific community, based on the % of RAM requested throughout the month. -3. Estimate a community's cloud costs for that month by calculating `(monthly_cloud_bill_for_cluster * %_usage_for_this_community)`. +## Research use-cases -#### Dedicated kubernetes clusters +JupyterHub is an excellent gateway to cloud-based resources and data analytics environments. +It can be used as a part of distributed scientific collaborations, scientific communities with cloud-based workflows, and scalable analytics environments for research teams. -For hubs that run on a **dedicated Kubernetes cluster**, a cloud bill will be generated by the cloud provider, 2i2c will pay it in advance, and we will include this cost in the next month's invoice. -This will exactly reflect the cloud charges incurred by the hub in that time.
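The cost arithmetic in the "Estimate my cloud costs" and "Shared kubernetes clusters" text above can be sketched in a few lines of Python. This is a rough illustration only, not 2i2c's billing code: the function names and parameters are hypothetical, the prices are the made-up figures from the worked example (`$100.00` per node-hour, `100GB` of RAM per node), and it assumes RAM is the only constraint on how many users fit on a node.

```python
import math


def cost_per_user_hour(node_price_per_hour: float,
                       node_ram_gb: float,
                       ram_per_user_gb: float) -> float:
    """Hourly cost per user when a node is filled to its RAM capacity.

    Hypothetical helper: assumes guaranteed RAM is the only limit on
    how many users fit on one node.
    """
    users_per_node = math.floor(node_ram_gb / ram_per_user_gb)
    if users_per_node < 1:
        raise ValueError("a single user does not fit on one node")
    return node_price_per_hour / users_per_node


def shared_cluster_cost(monthly_cluster_bill: float,
                        community_ram_requested: float,
                        total_ram_requested: float) -> float:
    """A community's share of a shared cluster's monthly bill.

    Mirrors `monthly_cloud_bill_for_cluster * %_usage_for_this_community`,
    with usage measured as the fraction of RAM requested over the month.
    """
    usage_fraction = community_ram_requested / total_ram_requested
    return monthly_cluster_bill * usage_fraction


if __name__ == "__main__":
    # 1GB guaranteed per user: 100 users fit on the node.
    print(cost_per_user_hour(100.0, 100, 1))
    # Doubling guaranteed RAM halves the users per node and doubles the cost.
    print(cost_per_user_hour(100.0, 100, 2))
    # A community that requested a quarter of the cluster's RAM.
    print(shared_cluster_cost(1000.0, 25, 100))
```

Real node prices vary by cloud provider and change over time, so the inputs should come from the provider's current pricing page rather than from these placeholder values.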
+```{toctree} +../distributions/research +``` diff --git a/about/service/shared-responsibility.md b/about/service/shared-responsibility.md index 74aa5b8..b0c71c5 100644 --- a/about/service/shared-responsibility.md +++ b/about/service/shared-responsibility.md @@ -1,80 +1,146 @@ -# Shared Responsibility Model +```{team} Service Team +``` +# Shared responsibility model 2i2c **shares responsibility for each hub** with the communities we serve.[^similar-models] This aligns with our mission of promoting collaborative and open workflows in research and education. -It also leads to a more effective, more sustainable, and more transparent service[^ironies-automation]. It also helps ensure that the community has the [Right to Replicate](https://2i2c.org/right-to-replicate) their infrastructure. +It also leads to a more effective, more sustainable, and more transparent service[^ironies-automation], and ensures that the community has the [Right to Replicate](https://2i2c.org/right-to-replicate) their infrastructure. Here's how it works: + +1. Define the major responsibilities needed to run a hub service with a community, and categorize them broadly by skillset. +2. Assign responsibilities that are well-suited to the skills and the interests of each group. +3. Choose an operational and cost model for the group so that each actor is empowered to carry out their responsibilities. + +Below is a high-level summary of the major areas of responsibility in this service and how they work together. + +% This figure is not stored with the repository, it is downloaded at build time +% Diagram here: https://drive.google.com/uc?export=download&id=16r5xE7SguunLfMh5LhSynSUfjb7IXs_n +```{figure} /images/shared_responsibility_diagram.png +:figwidth: 80% +An overview of the major teams that collaborate around the cloud service in order to serve a community of users. 
+``` + +Below we describe these areas in more detail, and define the roles that 2i2c and our partner communities take on in the service. + +:::{contents} +:local: +:depth: 1 +::: + +## Site Reliability Engineering -We define and divide responsibilities via the following process: +**Key goal**: Ensure that the cloud infrastructure is reliable, robust, and scalable. -- Define the major responsibilities needed to run a hub service with a community, and categorize them broadly by skillset. -- Assign responsibilities that are well-suited to the skills and the interests of each group. -- Choose a cost recovery model according to the responsibilities that 2i2c is taking on. -- Choose an operational model for the group so that each actor is empowered to carry out their responsibilities. +### Responsibilities -This section describes the default model that we use with most communities. +% NOTE: Goal is to have max 5 responsibilities per category, to avoid overwhelming people. +1. **Monitor infrastructure for errors**. Continuously monitor cloud infrastructure to identify usability problems before they affect users. +2. **Respond to incidents**. When incidents are identified or reported, carry out an incident response process to diagnose and resolve the incident. +3. **Deploy and configure the cloud environment**. Make the necessary service connections and technical changes to set up the community's environment (e.g., authentication, connecting with a database, or defining RAM per user). +4. **Enhance and develop cloud infrastructure**. Continuously develop and deploy software improvements with the goal of boosting service reliability and scalability. +5. **Operate a Kubernetes cluster**. This is the cloud platform that manages all of a community's infrastructure, and may be shared between many communities.
-## Engineering responsibilities +```{role} Site Reliability Engineer +``` -Engineering responsibilities are technical changes needed to configure the hub for a community and to keep it running over time. -Below are a range of Engineering responsibilities. +```{admonition} Role: Site Reliability Engineer +A team of engineers with expertise in cloud infrastructure and open source tools that we use as part of our services. This group of people oversees the cloud infrastructure that a community uses. They perform new development and upgrades, make changes per the request of {team}`Community Representatives`, and coordinate with the {team}`Community Support Team` during incidents and outages. +This is roughly equivalent to a [Site Reliability Engineering Team](https://en.wikipedia.org/wiki/Site_reliability_engineering). -```{figure} https://drive.google.com/uc?export=download&id=1SIhHrzPXSFBZ0yyVpxHm0WYs63k0SBRQ -:width: 80% +See [our Infrastructure documentation](https://infrastructure.2i2c.org/en/latest/) for more information. -An overview of some categories of shared responsibility between the {term}`Cloud Engineering Team` and the {term}`Community Leadership Team`. +**Usually filled by 2i2c team members.** Though we are experimenting with ways to involve community members in our cloud operations. ``` -::::{grid} +### Responsibility breakdown + +Usually, 2i2c assumes responsibility for all of the above, though we are experimenting with ways to involve community members in our cloud operations. + +## Service applications support -:::{grid-item} -:columns: 2 +**Key goal**: Ensure that communities have the skills and understanding needed to use the cloud infrastructure to have an impact. -**Less technical** +### Responsibilities -```{div} mt-auto -**More technical** +1. **Create documentation and training material**. Write and improve content that helps users learn cloud-native workflows and use the infrastructure effectively. +2. 
**Provide support to community leaders**. Follow our service {external:tc:ref}`support:guide` for community leaders. +3. **Assist with user environment creation**. Provide domain expertise about how to configure and set up the proper environment using [Binder-style repositories](../../admin/howto/environment/index.md). +4. **Create and manage data in the cloud**. If your community requires access to a cloud-native dataset, format it properly and put it in a place that the hub can connect to. +5. **Run workshops and training**. Training workshops are geared towards community leaders, with the goal of helping them share knowledge with others in their community. + +```{role} Community Guide ``` +```{admonition} Role: Community Guide -::: +An expert practitioner with familiarity in user workflows as well as the technical use-cases that 2i2c's cloud services enable. +Acts as a bridge between the communities we work with and our {role}`Site Reliability Engineer`s. Facilitates information transfer, signal-boosts community needs and requests, and guides communities in utilizing the infrastructure more effectively. -:::{grid-item} -:columns: 10 - -1. **Use** the interactive computing sessions to accomplish the goals of the community. -2. **Advocate and onboard** new users to the hub to grow its user community. -3. **General user support** for generic questions about interactive computing. -4. **Provide user access** via the JupterHub admin panel to create new usernames and administrative users. -5. **Develop domain-specific software** that is relevant to your community members for their specific use-cases. -6. **Define the basic environment** via a Binder-style repository. -7. **Manage data** that is accessed by users. -8. **Guide 2i2c** with requests and feedback for changes to infrastructure -9. **Escalate to 2i2c** when something is wrong. -10. **Complex environment changes** that require more expertise in packaging and environment design. -11.
**Develop software for interactive computing** to improve the underlying infrastructure that provides user sessions (e.g., JupyterHub, JupyterLab, etc). -12. **Support open source communities** so that the service infrastructure has a solid and reliable foundation of tools on which it runs, and so that the communities that produce those tools are healthy. -13. **Communicate with cloud provider** for issues related to infrastructure (e.g., requesting resource limit increases). -14. **Manage Kubernetes configuration** to perform updates to a hub or cluster (e.g., changing RAM available). -15. **Deploy and configure hubs** including configuration, guidance for setting up environment, some connections to cloud resources, etc. -16. **Monitor for incidents** to identify usability problems before they affect users. -17. **Develop software for cloud infrastructure** to improve the performance, features, and robustness of Kubernetes and other cloud infrastructure. -18. **Configure Kubernetes** upgrades and improvements for cloud infrastructure. -19. **Respond to incidents** when a more complex or cloud-related change is needed. -20. **Operate a Kubernetes cluster** that is configured to manage one or more JupyterHubs. +See the {ref}`Support Team Documentation ` for more information. -::: +**Usually filled by 2i2c team members.** Though communities with "Power Users" or those with exceptional engineering and computational skills may serve in this role as well. +``` + +### Responsibility breakdown + +Generally shared between 2i2c and the community partners it works with. +We focus our efforts on general use-case training for community leaders, as well as documentation. +However, our base service model does not allow us to spend extensive time managing complex environments or cloud-native datasets on behalf of communities. 
-:::: +## Community leadership and management +**Key goal**: Ensure that a hub's community has the structure, support, and leadership to make the most of the hub. -## Community guidance +### Responsibilities -```{figure} https://drive.google.com/uc?export=download&id=1S6Y9TQcXXLkrGrhgXQc7kLzq7dxcuw9a -:width: 80% +A team of leaders *within the community that we work with* who act as {team}`Community Representatives` on behalf of their community. This team coordinates more closely with our {team}`Community Support Team`, facilitates the transfer of knowledge between 2i2c teams and communities of users, and helps manage the structure and dynamics of these communities. They also define the strategic mission and goals of each user community, and help us define "success" for the hub service. -An overview of some categories of shared responsibility between the {term}`Community Support Team` and the {term}`Community Leadership Team`. +1. **Define success for the hub's community**. Community leaders understand the goals of a community's users, and define whether the hub is meeting their needs. +2. **Oversee user access policy**. Decide who has access to the hub, and what permissions they have. Generally done via the JupyterHub admin panel. +3. **Manage and cultivate a community around the hub.** Define the community events, processes, structure, and communication channels that are best for a hub's community. +4. **Represent community in decisions and feedback**. Serve as a point of contact for {role}`Site Reliability Engineer`s, make requests for changes to the hub, and surface incidents or problems if they arise. +5. **Make financial decisions about the hub**. Have decision authority for changes that have a financial impact on the infrastructure, and serve as a point of contact for billing matters.
+ +```{role} Hub Administrator +``` +```{admonition} Role: Hub Administrator + +Trusted community members that perform common administrative operations on a hub that do not require intervention from a Hub Engineer. +{team}`Community Representatives` are the first Hub Administrators, and they may add new Hub Administrators via the JupyterHub interface. +They are able to add users, start/stop servers, and generally have more control over operations on the hub. + +**Filled by a community member**. +``` + +```{role} Community Representative +``` +```{admonition} Role: Community Representative + +Acts as the primary point of contact for a community, and ensures that the interests of the {team}`Hub Community` are represented in the infrastructure, and that the hub serves their needs. + +They have the authority to speak on behalf of the community, and make decisions about the infrastructure that the community uses. + +**Filled by one or two community leaders**. ``` +### Responsibility breakdown + +Community management and leadership is generally the responsibility of the community. + +## Software engineering + +**Key goal**: Improve and maintain open source tools to support community workflows. + +### Responsibilities + +1. **Develop domain-specific software** that is relevant to your community members for their specific use-cases. +2. **Develop software for interactive computing** to improve the underlying infrastructure that provides user sessions (e.g., JupyterHub, JupyterLab, etc). +3. **Support open source communities** so that the service infrastructure has a solid and reliable foundation of tools on which it runs, and so that the communities that produce those tools are healthy. + +**There are no formal roles for this area**. 2i2c does not currently have the capacity for dedicated software development, though it hopes to grow this capacity in the future. + +### Responsibility breakdown + +Software development is performed by community members or their partners. 
[^ironies-automation]: Even when collaborating with engineering expertise in other organizations, we describe our service model in terms of areas of responsibility, rather than "tiers" of service that provide "burst capacity" or support only on an as-needed basis. This is because service "tiers" often leads to anti-patterns where support is needed from a person that is not empowered to be efforted in their duties (e.g., if they have been away from infrastructure for many months, and only after a series of escalations are needed to debug something). For more information on this, see [the Ironies of Automation](https://ckrybus.com/static/papers/Bainbridge_1983_Automatica.pdf) as well as [this post](https://blog.acolyer.org/2020/01/08/ironies-of-automation/) and [this post](https://www.thinkautomation.com/automation-advice/the-ironies-of-automation-explored/) explaining its relevance to technology and service delivery. -[^similar-models]: This is inspired by the **Shared Responsibility Model** that is often used to describe cloud services. For example, see [the AWS Shared Responsibility model for compliance](https://aws.amazon.com/compliance/shared-responsibility-model/) and for [Best Practices](https://aws.amazon.com/blogs/industries/applying-the-aws-shared-responsibility-model-to-your-gxp-solution/), the [GxP whitepaper from Google Cloud](https://cloud.google.com/security/compliance/cloud-gxp-whitepaper), and the [Azure Shared Responsibility Model](https://docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility). \ No newline at end of file +[^similar-models]: This is inspired by the **Shared Responsibility Model** that is often used to describe cloud services. 
For example, see [the AWS Shared Responsibility model for compliance](https://aws.amazon.com/compliance/shared-responsibility-model/) and for [Best Practices](https://aws.amazon.com/blogs/industries/applying-the-aws-shared-responsibility-model-to-your-gxp-solution/), the [GxP whitepaper from Google Cloud](https://cloud.google.com/security/compliance/cloud-gxp-whitepaper), and the [Azure Shared Responsibility Model](https://docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility). diff --git a/about/service/team.md b/about/service/team.md deleted file mode 100644 index 7fdab74..0000000 --- a/about/service/team.md +++ /dev/null @@ -1,97 +0,0 @@ -(about/roles-for-service)= -# Team structure and roles - -The Managed JupyterHub Service is a **collaborative cloud service** run in partnership with the communities that we serve. -This page describes the major teams and roles that are involved in running this service. - -## Teams and key stakeholders - -The Managed JupyterHub Service is composed of a {term}`Service Team` along with three sub-teams. - -```{figure} https://drive.google.com/uc?export=download&id=16r5xE7SguunLfMh5LhSynSUfjb7IXs_n -An overview of the major teams that collaborate around the cloud service in order to serve a community of users. There are three main teams, and this diagram shows the major traits of each team, as well as a few ways in which they interact with one another. -``` - -### Service team structure - -```{glossary} -Managed JupyterHub Service Team -Service Team - The group of people that collaborate together to run a collaborative cloud service. It is comprised of three major sub-teams: - - 1. The {term}`Cloud Engineering Team` - 2. The {term}`Community Support Team` - 3. The {term}`Partnerships Team` - 4. The {term}`Community Leadership Team` - -Cloud Engineering Team -Engineering Team - A team of engineers with expertise in cloud infrastructure and open source tools that we use as part of our services. 
This group of people oversees the cloud infrastructure that a community uses. They perform new development and upgrades, make changes per the request of {term}`Community Representatives`, and coordinate with the {term}`Community Support Team` during incidents and outages. - This is roughly equivalent to a [Site Reliability Engineering Team](https://en.wikipedia.org/wiki/Site_reliability_engineering). - - See [our Infrastructure documentation](https://infrastructure.2i2c.org/en/latest/) for more information. - -Community Support Team -Support Team - A team of expert practitioners with familiarity in user workflows as well as the technical use-cases that 2i2c's cloud services enable. This group acts as a bridge between the communities we work with and our {term}`Cloud Engineering Team`, facilitating information transfer, signal-boosting community needs and requests, and guiding communities in utilizing the infrastructure more effectively. - - See the {ref}`Support Team Documentation ` for more information. - -Partnerships Team - A team of experts in building cross-organizational collaborations, contracts, and grants. This team is tasked with forging new partnerships with communities and their organizations, identifying the resources needed to make these partnerships sustainable, and leading the contracting and invoicing process (when needed) to recover our costs. They are the primary interface with our {term}`tc:Fiscal Sponsor`, {term}`Code for Science and Society`. - -Community Leadership Team -Community Team - A team of leaders *within the community that we work with* who act as {term}`Community Representatives` on behalf of their community. This team coordinates more closely with our {term}`Community Support Team`, facilitates the transfer of knowledge between 2i2c teams and communities of users, and helps manage the structure and dynamics of these communities. 
They also define the strategic mission and goals of each user community, and help us define the definition of "success" for the hub service. -``` - -### Key stakeholders - -In addition, there are two groups of stakeholders that are not directly involved in running the service, but that are important to consider to ensure that each service has the impact that we wish to achieve. -Our {term}`Service Team` spends extra effort interacting with and getting feedback from these stakeholder communities. - -```{glossary} -User Community -User Communities - Anybody that uses the infrastructure on a given hub. These tend to be students, researchers, collaborators, or workshop attendees. They come from a variety of backgrounds and skillsets, but they are all considered to be a part of the community that a hub serves (even if only for a short time). This community is important to our services because the impact of the service is ultimately driven by the work that this community does. - -Open Source Communities -Open Source Community - The distributed communities that lead, develop, and support the open source infrastructure that is used in our collaborative cloud service. Members of the {term}`Service Team` are often also members of these open source communities, and act as liasons to help upstream improvements and lead discussions that are surfaced as part of running our cloud service together. This community is important to our services because part of 2i2c's mission involves using its resources and experience to support and improve the open source communities that underlie our service. -``` - -## Community roles - -The following roles are overseen by one or more members of the user community. -They help direct the infrastructure and service in order to help the community accomplish its goal, and act as leaders to empower the community in using the infrastructure. 
- -```{glossary} -Community Representative -Community Representatives - Acts as the primary point of contact for a community, and ensures that the interests of the {term}`Hub Community` are represented in the infrastructure, and that the hub serves their needs. - They have the authority to speak on behalf of the community, and make decisions about the infrastructure that the community uses. - - There must be **one or two community representatives for a given community**. - This role is usually filled by someone that is a member of the hub's community of practice. - - Their main responsibilities include: - - - The main point of contact between the hub engineer and the {term}`Hub Community`. - - Collect feedback and questions from users on a hub. - - Surface questions and requests to Hub Engineers via support tickets. - - Oversee the {term}`Hub Administrators`. - -Hub Administrator -Hub Administrators - Trusted community members that perform common administrative operations on a hub that do not require intervention from a Hub Engineer. - {term}`Community Representatives` are the first Hub Administrators, and they may add new Hub Administrators via the JupyterHub interface. - They are able to add users, start/stop servers, and generally have more control over operations on the hub. - - Their responsibilities include: - - - Provide support to users of a hub for common problems that don't require a Hub Engineer to resolve. - - Add new users to a hub, including administrative users. - - Surface major issues or requests to the Community Representative(s). -``` - -Roles that are specific to 2i2c are defined [in the 2i2c Team Compass](https://team-compass.2i2c.org). 
diff --git a/conf.py b/conf.py index 61031b9..cf6818f 100644 --- a/conf.py +++ b/conf.py @@ -75,11 +75,30 @@ def setup(app): app.add_css_file("custom.css") + app.add_crossref_type("team", "team") + app.add_crossref_type("role", "role") +# -- Custom scripts ------------------------------------------------- -# Scripts to run +# Generate the feature table import subprocess from pathlib import Path build_assets = Path("build_assets") build_assets.mkdir(exist_ok=True) subprocess.run(["python", "feature-table.py"], cwd="scripts") + +# Download figures we keep in Google Drive +from requests import get +figures = { + "https://drive.google.com/uc?export=download&id=1Mr51-s3D_KHPsAuTXbczaQ7mlPZUs9gm": "collaborative_learning_hub.png", + "https://drive.google.com/uc?export=download&id=16r5xE7SguunLfMh5LhSynSUfjb7IXs_n": "shared_responsibility_diagram.png", + "https://drive.google.com/uc?export=download&id=1gWAIQVKcB-uxuJsBHqlDlRTq88oki1zn": "scalable_research_hub.png", +} +for url, filename in figures.items(): + path_image = Path(__file__).parent / "images" / filename + if not path_image.exists(): + print(f"Downloading {filename}...") + resp = get(url) + path_image.write_bytes(resp.content) + else: + print(f"Diagram image exists, delete this file to re-download: {path_image}") diff --git a/index.md b/index.md index 367feb4..1a690a5 100644 --- a/index.md +++ b/index.md @@ -6,8 +6,8 @@ It is divided into a number of **roles and personas** with relevant topics for e :::{seealso} Here are a few other locations with relevant information about 2i2c's services. -- [`team-compass.2i2c.org/managed-hubs/index`](https://team-compass.2i2c.org/en/latest/projects/managed-hubs/index.html): Documentation about {term}`Service Team` processes that are primarily relevant to 2i2c team members. We put this documentation here to prevent [`docs.2i2c.org`](https://docs.2i2c.org) from getting too cluttered. 
-- [`infrastructure.2i2c.org`](https://infrastructure.2i2c.org): Our {term}`Cloud Engineering Team` and cloud infrastructure documentation. +- [`team-compass.2i2c.org/managed-hubs/index`](https://team-compass.2i2c.org/en/latest/projects/managed-hubs/index.html): Documentation about {team}`Service Team` processes that are primarily relevant to 2i2c team members. We put this documentation here to prevent [`docs.2i2c.org`](https://docs.2i2c.org) from getting too cluttered. +- [`infrastructure.2i2c.org`](https://infrastructure.2i2c.org): Our {team}`Cloud Engineering Team` and cloud infrastructure documentation. ::: This documentation is structured into sections that are meant for various **roles and personas**. @@ -23,7 +23,6 @@ They are meant for individuals who wish to learn about the service for their own :caption: About the service about/service/options about/service/index -about/infrastructure/index ``` ## Use the hub @@ -71,7 +70,7 @@ community/strategy.md ## Community representatives Documentation for those serving as _Community Representatives_. -These tend to cover technical, administrative, and collaborative processes for interacting with 2i2c's team on behalf of your community. +These tend to cover technical, administrative, invoicing, and collaborative processes for interacting with 2i2c's team on behalf of your community. ```{toctree} :caption: Community representatives @@ -80,6 +79,7 @@ These tend to cover technical, administrative, and collaborative processes for i admin/howto/new-hub admin/howto/replicate admin/howto/create-billing-account +topic/cloud-costs ``` ## Reference material @@ -91,4 +91,4 @@ Lists and programmatically-generated content to serve as a quick reference. 
:maxdepth: 2 about/terminology -``` \ No newline at end of file +``` diff --git a/noxfile.py b/noxfile.py index 31a61f3..f7f0e1c 100644 --- a/noxfile.py +++ b/noxfile.py @@ -11,6 +11,7 @@ def docs(session): AUTOBUILD_IGNORE = [ "_build", "build_assets", + "images/shared_responsibility_diagram.png", ] cmd = ["sphinx-autobuild"] for folder in AUTOBUILD_IGNORE: diff --git a/topic/cloud-costs.md b/topic/cloud-costs.md new file mode 100644 index 0000000..692d324 --- /dev/null +++ b/topic/cloud-costs.md @@ -0,0 +1,87 @@ + +(costs:cloud)= +# Estimat cloud costs + +We pass through cloud costs directly to our communities in a transparent manner. +This encourages us to continually reduce the cloud costs for our communities, and helps them understand how their decisions affect their cloud bill. + +## What components make up my cloud bill + +There are a few kinds of infrastructure that make up your cloud bill. +Here is a short summary: + +- **Nodes for user sessions**: A "node" is kind-of like a virtual machine or a dedicated computer. It is reserved cloud infrastructure that you can use as you wish. Nodes have resources allocated to them (e.g., `100GB` of RAM). JupyterHub uses dedicated nodes for user sessions, so more users == more nodes. You generally pay cloud providers by the minute for each node used. +- **Storage costs**: In order for users to persist their work over time, we must pay for filesystem storage. This is used to store user notebooks and content, data, etc. You generally pay cloud providers by the `GB` over time. +- **Nodes for hub infrastructure**: In addition to the cloud nodes for user sessions, there are also nodes to run the JupyterHub and supporting infrastructure to manage user log-ins, do monitoring and reporting of activity, etc. +- **Nodes for specialized computing**: For hubs that have scalable computing resources like a Dask Gateway, these generally request special nodes _on the fly_. 
When a scalable computation is executed, the cloud quickly requests many new nodes to complete the computation, and then removes them when it is done. You pay for the time used for each node during this computation. + +There are some other components that go into your cloud bill (e.g., "networking costs") but these are the major pieces. + +## User actions that impact cloud costs + +Cloud costs depend on a few key factors that you and your community have control over. +Here we list some major considerations (in decreasing order of importance): + +- **Base user resources needed**: The power and complexity of the user environment is the biggest driver of "base cost per user". This is largely driven by the amount of memory (RAM) each user needs. See below for a more in-depth explanation. +- **Community usage over time**: Resources are requested from the cloud "on-demand", meaning that your cloud costs will scale up and down with the number of active users at any given moment. +- **User storage over time**: User storage is different from on-demand resources, because it's "always being used" even when you're not logged in. We recommend storing large datasets in cloud object storage, which is much cheaper. +- **Dedicated vs. shared infrastructure**: If your community requires its own dedicated cloud infrastructure (for example, a dedicated Kubernetes cluster) then this will increase your cloud costs because you will not be sharing this cost with other communities. +- **Cloud optimizations**: There are many ways to make cloud infrastructure more efficient and scalable, and the 2i2c engineering team is constantly experimenting with ways to lower costs for communities. For many non-2i2c hubs, inefficiency is a large source of cloud cost, though the 2i2c hubs are already well-optimized. + +## Estimate my cloud costs + +The following is a very rough guideline for understanding and estimating what your cloud costs might be.
+These are similar whether you're using 2i2c to manage your hub, or running it yourself. + +Generally speaking, **the biggest technical driver of cloud costs is user memory (RAM)**. +This is because RAM must be "reserved" on a node, and each node has a finite amount of memory available to it. + +Let's say a user node costs `$100.00` an hour, and comes with `100GB` total RAM. +If each user is guaranteed `1GB` of RAM, then the node can theoretically fit `100` users at a time. +`100` simultaneous users will cost `$100.00` an hour, or roughly `$1 / user / hour`. + +If we double the guaranteed RAM available to users, then the node can now fit `50` users at once (`100 GB / 2 GB per user = 50 users total`). +We now need twice the number of nodes to handle the same number of users. +`100` simultaneous users will now cost `$200.00` an hour, or roughly `$2 / user / hour`. + +In practice, the cost per node depends heavily on the cloud provider, and is constantly in flux. +**To estimate your own cloud costs**, follow these steps: + +1. **Estimate memory available to each user**. The amount of RAM needed for each user is often the biggest driver of cloud cost. Decide the "maximum" amount of RAM that a user will generally need, and multiply that by 1.5. +2. **Determine how many simultaneous users, on average, you'd like a hub to support**. This isn't necessarily the total size of your community, but how many people you think will be using the hub *at the same time*. +3. **Look up the hourly price for an `n1-highmem-4` node**. This is a basic node type that serves most use-cases and can be used as a benchmark for comparison. + 1. [Go to the Google Cloud pricing page](https://cloud.google.com/compute/vm-instance-pricing). This lists prices for many kinds of nodes with Google Cloud Platform. + 2. Go to the `N1 high-memory machine types` section. This contains prices for all `N1` node types with high memory. + 3. Look at the hourly price for `n1-highmem-4`. + 4.
Divide this amount by `n_simultaneous_users_per_hour * GB_per_user`. + 5. This is your estimated extra cost per hour per user. +4. **Estimate storage costs**. Estimate your storage costs based on the expected storage used across all of your users. Most 2i2c hubs use standard NFS file storage, which has low latency for interactive computing. [Here are Google's file storage prices](https://cloud.google.com/storage/pricing#price-tables), for example. + +:::{seealso} +We recommend checking out the following resources to learn more about cloud costs. +None of these guarantee costs, but they should give you a general idea. + +- For general information and explanation, see [the Zero to JupyterHub cost projection documentation](z2jh:cost). +- For educational or "lightweight resources" hubs, see [this rough cost analysis notebook from the UC Berkeley DataHub](https://nbviewer.jupyter.org/github/berkeley-dsep-infra/datahub-usage-analysis/blob/master/notebooks/03-visualize-cost-and-usage.ipynb). +- For data- and compute-intensive hubs, see the Pangeo two-part series on their Kubernetes costs. ([part 1 link](https://medium.com/pangeo/pangeo-cloud-costs-part1-f89842da411d), [part 2 link](https://medium.com/pangeo/pangeo-cloud-cluster-design-9d58a1bf1ad3)) +::: + +## How we estimate cloud costs for communities + +The previous sections give a high-level overview of how to think about cloud costs and how they'll reflect your community's usage. +This section describes how the 2i2c team calculates cloud costs and passes them on to communities. + +Over time, we will refine this process to make it more precise and (as much as possible) directly tied to the usage a community incurs. + +### Shared Kubernetes clusters + +For hubs that run on **shared Kubernetes clusters**, we estimate their cloud costs via the following process: + +1.
Calculate the monthly cloud bill for this cluster. +2. Calculate the % usage for a specific community, based on the % of RAM requested throughout the month. +3. Estimate a community's cloud costs for that month by calculating `(monthly_cloud_bill_for_cluster * %_usage_for_this_community)`. + +### Dedicated Kubernetes clusters + +For hubs that run on a **dedicated Kubernetes cluster**, the cloud provider generates a cloud bill, 2i2c pays it in advance, and we include this cost in the next month's invoice. +This cost exactly reflects the cloud charges incurred by the hub in that period. From 52b9fa7fb250a00c32443a315a3f1b24e6621f4f Mon Sep 17 00:00:00 2001 From: Yuvi Panda Date: Wed, 30 Nov 2022 00:20:50 -0800 Subject: [PATCH 3/3] Fix typo --- topic/cloud-costs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topic/cloud-costs.md b/topic/cloud-costs.md index 692d324..14b4b53 100644 --- a/topic/cloud-costs.md +++ b/topic/cloud-costs.md @@ -1,6 +1,6 @@ (costs:cloud)= -# Estimat cloud costs +# Estimate cloud costs We pass through cloud costs directly to our communities in a transparent manner. This encourages us to continually reduce the cloud costs for our communities, and helps them understand how their decisions affect their cloud bill.
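
Reviewer note on `topic/cloud-costs.md`: the RAM-based arithmetic and the shared-cluster formula described in that file can be checked with a small sketch like the one below. This is only an illustration under stated assumptions, not code from this patch series or from 2i2c's billing process. The function names (`users_per_node`, `hourly_node_cost`, `community_share`) are hypothetical, and the sketch assumes every user is guaranteed the same amount of RAM and that RAM is the only constraint on how many users fit on a node.

```python
import math


def users_per_node(node_ram_gb: float, ram_per_user_gb: float) -> int:
    """Number of users one node can hold when each user is guaranteed a fixed amount of RAM."""
    fit = math.floor(node_ram_gb / ram_per_user_gb)
    if fit < 1:
        raise ValueError("a single user does not fit on this node")
    return fit


def hourly_node_cost(n_users: int, node_price_per_hour: float,
                     node_ram_gb: float, ram_per_user_gb: float) -> float:
    """Hourly price of the nodes needed to host n_users simultaneous users."""
    n_nodes = math.ceil(n_users / users_per_node(node_ram_gb, ram_per_user_gb))
    return n_nodes * node_price_per_hour


def community_share(monthly_cluster_bill: float, ram_fraction: float) -> float:
    """Shared-cluster estimate: cluster bill times the community's fraction of requested RAM."""
    return monthly_cluster_bill * ram_fraction


# Worked example from the document: a $100/hour node with 100 GB of RAM.
cost_1gb = hourly_node_cost(100, 100.0, 100, 1)  # 100.0, i.e. $1 / user / hour
cost_2gb = hourly_node_cost(100, 100.0, 100, 2)  # 200.0, i.e. $2 / user / hour
```

Doubling the guaranteed RAM halves the users per node, so the same 100 users need two nodes instead of one. The `community_share` helper takes the usage as a fraction (for example `0.25` for 25%), which is an assumption about how the `%_usage_for_this_community` value would be represented.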