-
Notifications
You must be signed in to change notification settings - Fork 335
/
Copy pathworkflows-repeated-amend.Rmd
240 lines (172 loc) · 8.6 KB
/
workflows-repeated-amend.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
# The Repeated Amend {#repeated-amend}
One of the principal joys of version control is the freedom to experiment without fear.
If you make a mess of things, you can always go back to a happier version of your project.
We describe several methods of such time travel in *link to come*.
But you must have a good commit to fall back to!
## Rock climbing analogy
<div class="figure">
<blockquote>
Using a Git commit is like using anchors and other protection when climbing. If you're crossing a dangerous rock face you want to make sure you've used protection to catch you if you fall. Commits play a similar role: if you make a mistake, you can't fall past the previous commit. Coding without commits is like free-climbing: you can travel much faster in the short-term, but in the long-term the chances of catastrophic failure are high! Like rock climbing protection, you want to be judicious in your use of commits. Committing too frequently will slow your progress; use more commits when you're in uncertain or dangerous territory. Commits are also helpful to others, because they show your journey, not just the destination.
</blockquote>
<p class="caption">
<a href="http://r-pkgs.had.co.nz/git.html#git-commit">R Packages, Hadley Wickham</a> (@r-pkgs-book)</p>
</div>
Let's talk about this:
> use more commits when you're in uncertain or dangerous territory
When I'm doing something tricky, I often proceed towards my goal in small increments, checking that everything still works along the way.
Yes it works?
Make a commit.
This is my new worst case scenario.
Keep going.
What's not to love?
This can lead to an awful lot of tiny commits.
This is absolutely fine and nothing to be ashamed of.
But one day you may start to care about the utility and aesthetics of your Git history.
The Repeated Amend is a pattern where, instead of cluttering your history with lots of tiny commits, you build up a "good" commit gradually, by amending.
*Yes, there are other ways to do this, e.g. via squashing and interactive rebase, but I think amending is the best way to get started.*
## Workflow sketch
### Initial condition
Start with your project in a functional state:
* R package? Run your tests or `R CMD check`.
* Data analysis? Re-run your script or re-render your `.Rmd` with the new chunk.
* Website or book? Make sure the project still compiles.
* You get the idea.
Make sure your "working tree is clean" and you are synced up with your GitHub remote. `git status` should show something like:
```console
~/tmp/myrepo % git status
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
```
### Get to work
Imagine we start at commit C, with previous commit B and, before that, A:
``` bash
... -- A -- B -- C
```
Make a small step towards your goal.
Re-check that your project "works".
Stage those changes with and make a commit with the message "WIP", meaning "work in progress".
Do this in RStudio or in the shell (Appendix \@ref(shell)):
```console
git add path/to/the/changed/file
git commit -m "WIP"
```
The message can be anything, but "WIP" is a common convention.
If you use it, whenever you return to a project where the most recent commit message is "WIP", you'll know that you were probably in the middle of something.
If you push a "WIP" commit, on purpose or by mistake, it signals to other people that more commits might be coming.
Your history now looks like this:
``` bash
A -- B -- C -- WIP*
```
**Don't push!**
The `*` above signifies a commit that exists only in your local repo, not (yet) on GitHub.
If you called `git status`, you'd see something like "Your branch is ahead of 'origin/main' by 1 commit.", which is also displayed in RStudio's Git pane.
Do a bit more work.
Re-check that your project is still in a functional state.
Stage and commit again, but this time **amend** your previous commit.
RStudio offers a check box for "Amend previous commit" or in the shell:
```console
git commit --amend --no-edit
```
The `--no-edit` part retains the current commit message of "WIP".
**Don't push!** Your history now looks like this:
``` bash
A -- B -- C -- WIP*
```
but the changes associated with the `WIP*` commit now represent your last two commits, i.e. all the accumulated changes since state C.
Keep going like this.
Let's say you've finally achieved your goal. One last time, check that your project is functional and in a state you're willing to share with others.
Commit, amending again, but with a real commit message this time.
Think of this as commit D.
Push.
Do this in RStudio or the shell:
```console
git commit --amend -m "Implement awesome feature; closes #43"
git push
```
Your history -- and that on GitHub -- look like this:
``` bash
A -- B -- C -- D
```
As far as the world knows, you implemented the feature in one fell swoop.
But you got to work on the task incrementally, with the peace of mind that you could never truly break things.
## What if I need to fall back?
Imagine you're in the middle of a Repeated Amend workflow:
```console
A -- B -- C -- WIP*
```
and you make some changes that break your project, e.g. tests start failing.
These bad changes are not yet committed, but they are saved.
You want to fall back to the last good state, represented by `WIP*`.
In Git lingo, you want to do a **hard reset** to the `WIP*` state.
Your local files will be forcibly reset to their state as of the `WIP*` commit.
With the command line:
```console
git reset --hard
```
which is implicitly the same as
```console
git reset --hard HEAD
```
which says: "reset my files to their state at the most recent commit".
This is also possible in RStudio.
In fact, the RStudio way makes it easier to selectively reset only specific files or only certain changes.
Click on "Diff" or "Commit".
Select a file with changes you do not want.
Use "Discard All" to discard all changes in that file.
Use "Discard chunk" to discard specific changes in a file.
Repeat this procedure for each affected file until you are back to an acceptable state.
Carry on.
If you committed a bad state, go to *link to come* for more reset scenarios.
## Why don't we push intermediate progress?
Amending a commit is an example of what's called "rewriting Git history".
Rewriting history that has already been pushed to GitHub -- and therefore potentially pulled by someone else -- is a controversial practice.
Like most controversial practices, lots of people still indulge in it, as do I.
But there is the very real possibility that you create headaches for yourself and others, so in Happy Git we must recommend that you abstain.
Once you've pushed something, consider it written in stone and move on.
## Um, what if I did push?
I told you not to!
But OK here we are.
Let's imagine you pushed this state to GitHub by mistake:
```console
A -- B -- C -- WIP (85bf30a)
```
and proceeded to `git commit --amend` again locally, leading to this state:
```console
A -- B -- C -- WIP* (6e884e6)
```
I'm deliberately showing two histories that sort of look the same, in terms of commit messages.
But the last SHA reveals they are actually different.
You are in a pickle now, as you can't do a simple push or pull.
A push will be rejected and a pull will probably lead to a merge that you don't want.
You have two choices:
* If you have collaborators who may have pulled the repo at commit
`WIP (85bf30a)`, you have to regard that particular history as being written
in stone now.
If there is any very precious work that only exists locally, such as a
specific file, save a copy of that to a new file path, temporarily.
Hard reset your local repo to `C` (`git reset --hard HEAD^`) and pull from
GitHub.
GitHub and local history now show this:
```console
A -- B -- C -- WIP (85bf30a)
```
If you saved some precious work to a temporary file path, bring it back into
the repo now; save, stage, commit, and push.
GitHub and local history now show this:
```console
A -- B -- C -- WIP (85bf30a) -- E
```
* If you have no collaborators or you have reason to believe they have not
pulled, you can rewrite history, even on GitHub.
You might as well make sure your local commit has a real, non-"WIP" message
at this point.
Force push your history to GitHub (`git push --force`).
GitHub and local history now show this:
```console
A -- B -- C -- D
```
In both cases, you've made the changes you want and your local repo and the
GitHub remote are synced up again.
The history is nicer in the second case, but that's a secondary issue.
*There are many different ways to rewrite history and rescue some of these situations, but we find the approaches described above to be very approachable.*