1
+ .. sectnum ::
2
+ :depth: 3
3
+ :suffix: .
1
4
2
5
.. role :: raw-html(raw)
3
6
:format: html
@@ -12,26 +15,25 @@ DCP/2 Operating Procedures
12
15
==========================
13
16
14
17
|nn | Text in a block quote (like this) is not normative. It was included
15
- for informational purposes only and does not bear on the implementation.
16
- |ne |
18
+ for informational purposes only and does not bear on the implementation. |ne |
17
19
18
20
.. contents ::
19
21
20
22
21
23
22
24
23
-
24
25
Changing the HCA metadata schema
25
26
================================
26
27
27
- Changes to the HCA metadata schema, even seemingly minor ones, can have
28
- significant impact on the individual components of the DCP system. The DCP
29
- consortium therefore uses a review process that ensures all teams in the
30
- consortium approve of a schema change **before ** it
28
+ The HCA metadata schema defines the shape of metadata documents being
29
+ exchanged between components of the DCP/2. It therefore constitutes a
30
+ de-facto contract between those components. The DCP consortium reviews schema
31
+ changes collectively to ensure that all teams in the consortium approve of a
32
+ schema change **before ** it
31
33
32
34
- affects the master branch of the `metadata-schema `_ repository,
33
35
34
- - is published on `schema.humancellatlas.org `_, and
36
+ - is published on `schema.humancellatlas.org `_, or
35
37
36
38
- is used by metadata staged for import into the production instance of TDR
37
39
@@ -46,111 +48,147 @@ consortium approve of a schema change **before** it
46
48
.. _DCP/2 system design : dcp2_system_design.rst
47
49
48
50
51
+ Impact of Schema changes
52
+ -------------------------
53
+
54
+ Changes to the HCA metadata schema, including seemingly minor ones, can have
55
+ significant impact on the individual components of the DCP system. Even the
56
+ addition of a property can, in some case, require significant work in
57
+ metadata consumers like the Data Browser, especially if those consumers are
58
+ required to interpret the added property. It would be difficult to explicitly
59
+ define the criteria for classifing schema changes by impact. Instead, the
60
+ impact is determined individually for every schema change. Changes that are
61
+ deemed of low impactare fast-tracked through a lighter-weight review process.
62
+
63
+
49
64
Sizing schema changes
50
65
---------------------
51
66
52
- A *schema change * is a set of related modifications to one or more of the
53
- JSONSchema documents in the `metadata-schema `_ repository. That set must be
54
- chosen carefully. If a schema change *can * be made smaller by excluding a
55
- particular change to a schema document without sacrificing the desired end-user
56
- value, it *should * be made smaller, so as to not overwhelm the reviewers with a
57
- mix of orthogonal concerns. On the other hand, the set of changes shoudn't be
58
- too small either: all schema document changes that are necessary to achieve the
59
- desired end-user value should be combined in one overall schema change. If
60
- certain end user value is achieved in a series of many small schema changes,
61
- reviewers will likely not be able to see the big picture until they review the
62
- changes at the end of that series at which point it may be too late to reverse
63
- course.
67
+ A *schema change * is a carefully seleced set of related modifications to one
68
+ or more of the JSONSchema documents in the `metadata-schema `_ repository. If
69
+ a schema change *can * be made smaller by excluding a particular change to a
70
+ schema document without sacrificing the desired end-user value, it *should *
71
+ be made smaller, so as to not overwhelm the reviewers with a mix of
72
+ orthogonal concerns.
73
+
74
+ On the other hand, the set of changes shoudn't be too small either: all schema
75
+ document changes that are necessary to achieve the desired end-user value
76
+ should be combined in one schema change. If certain end user value is
77
+ achieved in a series of many small schema changes, reviewers will likely not
78
+ be able to see the big picture until they review the changes at the end of
79
+ that series, at which point it may be too late to reverse course.
64
80
65
81
Some schema changes warrant an update to the `DCP/2 System Design `_
66
- specification. Whether they do or not is established during the review process
67
- of *pull requests * (PRs) against the `metadata-schema ` and `schema-test-data `
68
- repositories. Specification updates are done as a PR against the `dcp2 `_
69
- repository.
82
+ specification. Whether they do or not is established collectively and early
83
+ on during the review process.
70
84
71
85
72
- Schema change authorship
73
- ------------------------
86
+ Responsiveness to submitter requirements
87
+ ----------------------------------------
74
88
75
- Anyone can author a PR. The PRs for one schema change don't all need to be by
76
- the same author, as long as there are no competing and open PRs in the same
77
- repository and for the same schema change. One person *should * author the PRs
78
- against the `metadata-schema `_ and `dcp2 `_ repositories. That person is referred
79
- to as the *schema change author *. The schema change author can delegate
80
- authorship of PRs to other members of the DCP consortium.
81
89
90
+ Schema changes are often motivated by the need to capture novel information
91
+ about the experimental design, phenotypes or sample collection process as
92
+ submitted by a participating laboratory, in a form that the current schema is
93
+ not able to sufficiently capture. Participating labs often don't have the
94
+ resources or tolerance for a prolonged submission process and are likely to
95
+ walk away if it takes too long to incorporate their submission into the
96
+ DCP/2.
82
97
83
- Schema change procedure
98
+ The DCP/2 therefore needs to balance the celerity with which it adopts schema
99
+ changes on one hand against the effort needed to make all components
100
+ compatible with those changes on the other. The earlier a proposed change
101
+ faces scrutiny by the entire DCP/2 consortium, the better will the consortium
102
+ be able to negotiate the right balance, simply because the change will not
103
+ yet have permeated through the system, making it easier to make adujustments
104
+ to the proposed schema changes.
105
+
106
+
107
+ Review process overview
84
108
-----------------------
85
109
86
- The overall procedure for authoring and reviewing a schema change consists of
87
- the following steps:
110
+ The process of reviewing schema changes utilizes Github's pull request
111
+ (PR) feature. Every schema change begins with a PR against the `develop `
112
+ branch of the `metadata-schema `_ repository. This is were the DCP/2
113
+ consortium determines
88
114
89
- 1) If it is considered unnecessary to update the specification and a
90
- specification update wasn't explicitly requested, proceed to step 4.
115
+ * the impact of the change
91
116
92
- 2) Prepare a PR against the `dcp2 `_ repository, updating the `DCP/2 System
93
- Design `_.
117
+ * whether the schema change is `sized <Sizing schema changes >`_ correctly
94
118
95
- 3) Initiate a `Pull request review `_ of your PR. Once the PR has been approved
96
- by the DCP, move on to the next step. Don't merge the PR just yet.
119
+ * whether the change requires updating the specification in the `dcp2 `_ repository
97
120
98
- 4) Prepare a PR against the ``develop `` branch of the `metadata-schema `_
99
- repository. It is OK to perform this step concurrently with the preceding
100
- steps.
121
+ * whether it requires updating the test metadata in the `schema-test-data `_
122
+ repository for other components to code unit tests against
101
123
102
- 5) Initiate a `Pull request review `_ of your PR. Once the PR has been approved
103
- by the DCP, move on to the next step. Don't merge the PR just yet.
124
+ If a change requires specification changes, the `metadata-schema `_ PR review is
125
+ suspended and a PR is filed against the `dcp2 `_ repository and reviewed there.
126
+ Once that PR is approved, the `metadata-schema `_ PR is revised and reviewed
127
+ again.
104
128
105
- Note that the above steps can be performed for several independent schema
106
- changes in parallel.
107
-
108
- 6) Merge the `metadata-schema `_ repository PR into the ``develop `` branch and
109
- from there into the ``staging `` branch.
129
+ Schema changes that are sized correctly and have negligible impact to other
130
+ components, are considered adopted by the DCP/2 as soon as the PR against the
131
+ `develop ` branch is approved. No further reviews of PRs against other branches or
132
+ repositories are required.
110
133
111
- 7) Prepare a PR against the `Test metadata `_ in the `schema-test-data `_
112
- repository. Don't merge the PR just yet.
134
+ Schema changes that are sized correctly, and that are of some impact to other
135
+ components, may require more involved testing. There are two ways of testing
136
+ schema changes:
113
137
114
- 8) Initiate a `Pull request review `_ of your PR. Once the PR has been approved
115
- by the DCP, move on to the next step. As part of the review, the Azul team
116
- updates their unit tests to validate the updated test metadata. If a test
117
- fails, they either improve the test or the team member reviewing the PR
118
- requests the necessary changes to the test metadata.
138
+ A) (Meta)data exhibiting the proposed changes are first staged and then
139
+ imported into the `dev ` instance of TDR after which downstream components
140
+ use the resulting `dev ` snapshot to test the code changes that are required
141
+ to support the schema change. The Ingest component can only populate
142
+ staging areas from its staging instance, so the `develop ` PR must be
143
+ approved(under the provision that the schema change may need to be amended
144
+ by another PR) and the changes are promoted to the staging branch.
119
145
120
- 9) Prepare a PR against the schema repository, targeting the master branch.
146
+ B) (Meta)data exhibiting the proposed schema changes are committed to a feature
147
+ branch of the `schema-test-data `_ repository, and a PR is filed for that
148
+ branch. See also `Test metadata `_ for details. The `schema-test-data `_
149
+ repository can only populated using the staging instance of the Ingest
150
+ component, so the PR against the `develop ` branch of the `metadata-schema `_
151
+ repository must be approved (under the condition that the schema change may
152
+ need to be amended in another PR) and the changes are promoted to the
153
+ staging branch.
121
154
122
- 10) Initiate a `Pull request review `_ of your PR. The review should be quick
123
- because reviewers will already have seen the changes while reviewing the PR
124
- against the ``develop `` branch. Reviewers should not request any substantial
125
- changes at this stage unless last minute considerations absolutely require
126
- it. Once the PR has been approved by the DCP, move on to the next step.
155
+ The reviewers of a PR can request either testing strategy. Strategy A will
156
+ likely be requested for for most changes with some impact on other components.
127
157
128
- 11) Merge all PRs. You may squash and/or rebase a PR before merging it but that
129
- should not affect the diff between the base and head commits of the PR
130
- branch, except for housekeeping like resolving merge conflicts or adjusting
131
- the host name of schema URLs in PRs against the `schema-test-data `_
132
- repository. No semantic changes may be introduced to a PR after it has been
133
- approved by the DCP. All such housekeeping should be done on the PR branch
134
- prior to merging it.
158
+ In some cases a lab submission not only involves a schema change but also
159
+ introduces a novel way of linking the metadata enties into subgraphs. In those
160
+ cases, reviewers are likely to request testing strategies B. A request for
161
+ either strategy should be reasonably motivated. Possible reasons include
135
162
163
+ - the need to not only review schema changes but also review how those changes
164
+ affect actual metadata documents
136
165
137
- Test metadata
138
- -------------
166
+ - the need to test future, unrelated code changes against the test metadata, so
167
+ as to make sure that those future changes don't introduce regressions.
168
+ Especially graph changes fall into that category.
139
169
140
- Every change to the schemas in the `metadata-schema `_ repository must be
141
- accompanied by a matching change to the test metadata in the `schema-test-data `_
142
- repository. Both changes must be done as PRs following the `Pull request
143
- review `_ process.
170
+ Schema changes that are sized incorrectly need to be revised.
144
171
145
- The test metadata should exercise all graph shapes in actual use.
146
172
147
- The schema references (``describedBy ``) in all metadata documents in the main
148
- branch of `schema-test-data `_ repository must match the schemas on the
149
- `master ` branch of the `metadata-schema `_ repository.
173
+ Schema change authorship
174
+ ------------------------
150
175
176
+ Anyone can author a schema change. The PRs for one schema change don't all need
177
+ to be by the same author, as long as there are no competing PRs pending in the
178
+ same repository, for the same schema change. One person *should * author the PRs
179
+ against the `metadata-schema `_ and `dcp2 `_ repositories. That person is
180
+ referred to as the *schema change author *. The schema change author can
181
+ delegate authorship of PRs to other members of the DCP consortium.
151
182
152
- Pull request review
153
- -------------------
183
+
184
+ Pull request reviews
185
+ --------------------
186
+
187
+ Pull requests against the `dcp2 `_, the `metadata-schema `_ or the
188
+ `schema-test-data ` repositories are reviewed and approved in exactly the same
189
+ way, provided they are involved in a semantic change to either the DCP/2
190
+ specification, a DCP/2 standard operating procedure or a metadata schema
191
+ change:
154
192
155
193
As a PR author
156
194
@@ -165,17 +203,26 @@ As a PR author
165
203
channel on Slack, @-mentioning requested reviewers that haven't yet
166
204
reviewed the PR.
167
205
168
- 5) The PR is considered * approved by the DCP* * if either
206
+ 5) If either
169
207
170
- a) all reviewers approve of the PR or
208
+ a) all reviewers approve of the PR without conditions or
171
209
172
210
b) two weeks have passed since step 1 and there are no binding reviews
173
- requesting changes.
174
-
175
- 6) Otherwise, if this is a PR against the `metadata-schema `_ or
176
- `schema-test-data `_ repositories and a reviewer requests that you first
177
- update the DCP/2 system design specifcation first, close this PR and
178
- proceed with step 2 in section `Changing the HCA metadata schema `_.
211
+ requesting changes,
212
+
213
+ merge the PR. You are done unless any of the approvals are conditional. A
214
+ conditional approval is one that's dependent on successful testing using
215
+ one of the strategies mentioned in `Review process overview_ `. If testing
216
+ fails, a reviewer may request amendmends to your changes. These requests
217
+ are made as comments to the original, now merged PR. Open another PR with
218
+ the requested amendmends and start at step 1.
219
+
220
+ 6) If a reviewer requests that you first update the DCP/2 system design
221
+ specifcation or standard operating procedures first, suspend work on this
222
+ PR and open another PR against the `dpc2 `_ repository. Start at step 1
223
+ there. Once that PR has been approved, update this PR branch with any
224
+ follow-up changes resulting from the `dcp2 `_ PR review process and resume
225
+ this PR at step 1.
179
226
180
227
7) Otherwise, respond to every review comment either by making a source code
181
228
change that you think appropriately addresses the comment or by replying
0 commit comments