#!/bin/bash
set -o errexit
set -o nounset
set -o pipefail
SCRIPT_NAME="copy-with-rsync.sh"
VERSION_NUMBER="1.08"
abort ()
{
  echo >&2 && echo "Error in script \"$0\": $*" >&2
  exit 1
}
display_help ()
{
echo
echo "$SCRIPT_NAME version $VERSION_NUMBER"
echo "Copyright (c) 2014-2019 R. Diez - Licensed under the GNU AGPLv3"
echo
echo "Most of the time, I just want to copy files around with \"cp\", but, if a long transfer"
echo "gets interrupted, next time around I want it to resume where it left off, and not restart"
echo "from the beginning."
echo
echo "That's where rsync comes in handy. The trouble is, I can never remember the right options"
echo "for rsync, so I wrote this little wrapper script to help."
echo
echo "Syntax:"
echo " ./$SCRIPT_NAME src dest # Copies src (file or dir) to dest (file or dir)"
echo " ./$SCRIPT_NAME src_dir/ dest_dir # Copies src_dir's contents to dest_dir"
echo
echo "This script assumes that the contents of each file have not changed in the meantime."
echo "If you interrupt this script, modify a file, and resume the copy operation, you will"
echo "end up with a mixed mess of old and new file contents."
echo
echo "You probably want to run this script with \"background.sh\", so that you get a"
echo "visual indication when the transfer is complete."
echo
echo "Use environment variable PATH_TO_RSYNC to specify an alternative rsync tool to use."
echo "This is important on Microsoft Windows, as Cygwin's rsync is known to have problems."
echo "See this script's source code for details."
echo
echo "DETECTING DATA CORRUPTION"
echo "I have yet to copy a large number of files without some data corruption somewhere."
echo "I have had such trouble restarting file transfers over the network against Windows PCs using SMB/CIFS."
echo "My old laptop had silent USB data corruption issues, and slightly unreliable hard disks"
echo "also show up from time to time. No wonder I have become paranoid over the years."
echo "Your best bet is to calculate checksums of all files at the source, and verify them at the destination."
echo "Checksum creation:"
echo " rhash --recursive --crc32 --simple --percents --output=\"subdir-file-crcs.txt\" -- \"subdir/\""
echo "Checksum verification:"
echo " rhash --check --recursive --crc32 --simple --skip-ok -- \"subdir-file-crcs.txt\""
echo "Further notes:"
echo "- When creating the hashes, rhash option \"--update\" does not work well. I could not make it"
echo " add new file checksums to the list in a recursive manner."
echo " This is allegedly fixed in rhash v1.3.9, see modified --update=filename argument."
echo "- When verifying, do not enable the progress indication. Otherwise, it is hard to see"
echo " which files have failed. This is unfortunate."
echo "- Consider using GNU Parallel or \"xargs --max-procs\" if the CPU becomes a bottleneck"
echo " (which is unusual for simple checksums like CRC-32)."
}
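# Appends NEW_ELEMENT to the variable whose name is passed as the second argument,
# inserting a comma first if that variable is not empty.
# Example: add_to_comma_separated_list "progress2" PROGRESS_ARGS
add_to_comma_separated_list ()
{
  local NEW_ELEMENT="$1"
  local MSG_VAR_NAME="$2"

  # "printf -v" assigns to the named variable without passing the element through "eval".
  if [[ ${!MSG_VAR_NAME} = "" ]]; then
    printf -v "$MSG_VAR_NAME" "%s" "$NEW_ELEMENT"
  else
    printf -v "$MSG_VAR_NAME" "%s,%s" "${!MSG_VAR_NAME}" "$NEW_ELEMENT"
  fi
}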
# ------- Entry point -------
if [ $# -eq 0 ]; then
  display_help
  exit 0
fi
if [ $# -ne 2 ]; then
  abort "Invalid number of command-line arguments. Run this script without arguments for help."
fi
ARGS=""
ARGS+=" --no-inc-recursive" # Uses more memory and is somewhat slower, but improves progress indication.
# Otherwise, rsync is almost all the time stuck at a 99% completion rate.
if [[ $OSTYPE = "cygwin" ]]; then
# Using rsync on Windows is difficult. Over the years, I have encountered many problems with Cygwin's rsync.
# Some versions were just very slow, others would hang after a while, and all of them had problems
# with Windows' file permissions.
#
# I have always used rsync to just copy files locally (not in a client/server environment),
# with a user account that has full access to all files. This is arguably the easiest scenario,
# but it does not work straight away nevertheless.
#
# The first thing to do is to use cwRsync instead of Cygwin's rsync. cwRsync's Free Edition will suffice.
# Although it brings its own Cygwin DLL with it, this rsync version works fine.
#
# Then you need to avoid rsync's "--archive" flag, because it will attempt to copy file permissions,
# which has never worked properly for me. By the way, flag "--no-perms" seems to have no effect.
#
# If you are connecting to a network drive where you have full permissions,
# and you create a new directory with Windows' File Explorer, these are the
# Cygwin permissions you get, viewed on the PC sharing the disk:
#
# d---rwxrwx+ 1 Unknown+User Unknown+Group MyDir
#
# However, cwRsync generates the following permissions:
#
# drwxrwx---+ 1 Unknown+User Unknown+Group MyDir
#
# Normally, it does not matter much, as you still have read/write access to the files, but for some
# operations, like renaming directories, Windows Explorer will ask for admin permissions.
#
# The detailed permissions entries, as viewed with File Explorer's permissions dialog, are also different.
#
# A single file looks like this:
#
# -rwxrwx---+ SomeFile.txt
#
# With rsync's option "--chmod=ugo=rwX", which is often given as a work-around for the file permission issues,
# you get the following permissions:
#
# -rwxrwxr-x+ SomeFile.txt
#
# That is, "--chmod" does have an effect, but only on the permissions for "other" users (in this case),
# which does not really help.
#
# After finishing the copy operations, you can try using my ResetWindowsFilePermissions.bat script
# so that the copied files end up with the same permissions as if you had copied them with
# Windows File Explorer. Alternatively, these are the steps to reset the permissions manually (with the mouse):
#
# 1) Create a top-level directory in the usual way with Windows' File Explorer.
# 2) Temporarily move the just-copied directory (or directories) below the new top-level one.
# 3) Take ownership of all files inside the just-copied directory.
# 4) Reset all permissions of the just-copied directory to the ones inherited from the new top-level directory.
# 5) Move back the just-copied directory to its original location.
ARGS+=" --recursive"
ARGS+=" --times" # Copying the file modification times is necessary. Otherwise, all files
# will be copied again from scratch the next time around.
else
ARGS+=" --archive" # A quick way of saying you want recursion and want to preserve almost everything.
fi
# You may have to add flag --modify-window=1 if you are copying to or from a FAT filesystem,
# because FAT represents file modification times with a 2-second resolution.
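# A minimal sketch of how the flag mentioned above could be made opt-in without editing this script.
# The environment variable name COPY_WITH_RSYNC_FAT_TIMES is hypothetical (an assumption for
# illustration, not part of the original script). Only the value "1" is honoured, so that nothing
# unexpected ends up in ARGS.
# Example: COPY_WITH_RSYNC_FAT_TIMES=1 ./copy-with-rsync.sh src dest
if [[ ${COPY_WITH_RSYNC_FAT_TIMES:-} = "1" ]]; then
  ARGS+=" --modify-window=1"
fi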
ARGS+=" --append" # Continue partially-transferred files.
ARGS+=" --human-readable" # Display "60M" instead of "60,000,000" and so on.
# Unfortunately, there seems to be no way to display the estimated remaining time for the whole transfer.
PROGRESS_ARGS=""
# Instead of making a quiet pause at the beginning, display a file scanning progress indication.
# That is message "building file list..." and the increasing file count.
# If you are copying a large number of files and are logging to a file, the log file will be pretty big,
# as rsync (as of version 3.1.0) seems to refresh the file count in increments of 100. See below for more
# information about refreshing the progress indication too often.
add_to_comma_separated_list "flist2" PROGRESS_ARGS
# Display a global progress indication.
# If you are copying a large number of files and are logging to a file, the log file will
# grow very quickly, as rsync (as of version 3.1.0) seems to refresh the accumulated statistics
# once for every single file. I suspect the console's speed may end up limiting rsync's performance
# in such scenarios. I have reported this issue, see the following mailing list message:
# "Progress indication refreshes too often with --info=progress2"
# Wed Jun 11 04:50:26 MDT 2014
# https://lists.samba.org/archive/rsync/2014-June/029494.html
add_to_comma_separated_list "progress2" PROGRESS_ARGS
# Warn if files are skipped. Not sure if we need this.
add_to_comma_separated_list "skip1" PROGRESS_ARGS
# Warn if symbolic links are unsafe. Not sure if we need this.
add_to_comma_separated_list "symsafe1" PROGRESS_ARGS
# We want to see how much data there is (the "total size" value).
# Unfortunately, there is no way to suppress the other useless stats.
add_to_comma_separated_list "stats1" PROGRESS_ARGS
ARGS+=" --info=$PROGRESS_ARGS"
printf -v CMD "%q %s -- %q %q" "${PATH_TO_RSYNC:-rsync}" "$ARGS" "$1" "$2"
echo "$CMD"
eval "$CMD"
echo
echo "Copy operation finished successfully."