README-largescale.txt
We performed two large-scale Egalito evaluations on Debian packages. We
verified Egalito's jump table detection capabilities against ground truth from
GCC (debjumptable/), and ran Debian package tests after transformation with
Egalito (autopkgtest/).
Both experiments rely on internet access to a Debian mirror to download .deb
files, package sources, and build dependencies. The VM has pass-through
networking by default. All dependencies are downloaded into a chroot
environment called "pkgtest". We use Debian stretch in the chroot and in this
VM. Please edit /etc/apt/sources.list in the chroot and/or in the VM if you
wish to try using a different Debian version or a different mirror.
Important note: please run the following before attempting large-scale Debian
package experiments (since mirrors will update to newer package versions):
$ sudo chroot /srv/chroot/pkgtest/ apt update
The chroot environment is already set up, but we include instructions at the
end of this file on how to recreate it if you are interested. Using chroot
(as above) to enter the environment makes your changes persistent. When you
enter the chroot environment with "schroot -c pkgtest", you get a temporary
copy of the chroot (via a union filesystem): installed packages are
automatically erased when you log out, and only changes to /home (mapped to
the VM's /home) persist. This is the mode used by our scripts.
=== Jump table experiments (debjumptable/)
The idea of this experiment is to use the special GCC flag -fdump-rtl-dfinish
to look at the RTL intermediate representation and determine the ground truth
for which jump tables and jump table bounds the compiler generated. To do
this, we need to build Debian packages from source with this extra compiler
flag. We also need to map .dfinish files (which are produced for each .o file)
to executables (which can link together arbitrary .o files).
We proceed as follows. The script automate-jmptable4.sh downloads source
and build dependencies for the given package into the chroot. It builds all
binary packages therein with dpkg-buildpackage. Outside the chroot, the
packages are extracted. We find ELF executables/libraries and use readelf to
figure out the .o files they contain, and find the corresponding .dfinish files
in the build tree. We use the (newly built) debug packages to identify symbols
in the built executables.
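One plausible way to recover the symbol-to-file association from readelf output is to walk the symbol table in order: in ELF, a FILE symbol precedes the local symbols that belong to it, while global symbols carry no such association. The sketch below is an assumption about the approach, not the logic in our scripts. symbols_by_file is a hypothetical helper, and the sample rows are made up for illustration.

```shell
# Hypothetical helper: reads "readelf -sW" output on stdin and prints
# "function file" pairs, using "unknown" when no FILE symbol applies.
symbols_by_file() {
    awk '
        $4 == "FILE"                  { file = $8; next }
        $4 == "FUNC" && $5 == "LOCAL" { print $8, (file == "" ? "unknown" : file); next }
        $4 == "FUNC"                  { print $8, "unknown" }
    '
}

# Made-up rows in the column layout printed by "readelf -sW".
sample_symtab='    27: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS bzip2.c
    28: 0000000000003a10   212 FUNC    LOCAL  DEFAULT   14 testStream
    61: 0000000000002360  2541 FUNC    GLOBAL DEFAULT   14 main'

printf '%s\n' "$sample_symtab" | symbols_by_file
```

On a real binary the same helper could be fed from "readelf -sW program", assuming the symbol table is still present.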
We run Egalito on each executable, asking it to dump jump tables along with
the symbol that contains each one. Then we look at the RTL to determine all
ground-truth tables for each symbol. Unfortunately, readelf sometimes does not
specify which .o file a symbol comes from, or there are multiple .o files with
the same name, so there can be multiple matching symbols. We consider Egalito
to be correct if it matches any one of the possibilities. We also have to
handle symbol names that differ due to optimizations etc. (this logic is in
jumptable-diff-option.pl).
Optionally, we provide a mechanism to catch deleted .o files (this increases
the number of packages for which we can gather ground truth). To use it, run:
$ export PATH=~/intercept-bin:$PATH
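A minimal sketch of how a PATH-based intercept of this kind could work. The contents of ~/intercept-bin are not shown in this file, so the rm wrapper below is purely illustrative: the directory names, the SAVE_DIR variable, and the choice to wrap rm are all assumptions.

```shell
# Illustrative only: a wrapper directory placed first in PATH.
shim_dir=$(mktemp -d)
save_dir=$(mktemp -d)
export SAVE_DIR="$save_dir"

cat > "$shim_dir/rm" <<'EOF'
#!/bin/sh
# Copy any existing .o arguments aside, then run the real rm.
for arg in "$@"; do
    case $arg in
        *.o) if [ -f "$arg" ]; then cp "$arg" "$SAVE_DIR/"; fi ;;
    esac
done
exec /bin/rm "$@"
EOF
chmod +x "$shim_dir/rm"

# Commands resolved through PATH now find the wrapper first.
PATH="$shim_dir:$PATH"
work=$(mktemp -d)
touch "$work/demo.o"
rm "$work/demo.o"
ls "$save_dir"
```

The build itself is unchanged: the object file is still deleted, but a copy survives for later inspection.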
Here is a full example.
$ cd ~/debjumptable/gimple
$ export EGALITO_ROOT=~/egalito-head/
$ export LC_ALL=C
$ schroot -p -c pkgtest -- ./automate-jmptable4.sh bzip2
[...]
jumptable diff seems OK
$
After building bzip2 from source inside the chroot, we get build-bzip2:
$ ls build-bzip2/
build-logs bzip2_1.0.6.orig.tar.bz2
bzip2-1.0.6 exe
bzip2-dbgsym_1.0.6-8.1_amd64.deb extract-dbg
bzip2_1.0.6-8.1.debian.tar.bz2 install-root
bzip2_1.0.6-8.1.dsc libbz2-1.0-dbgsym_1.0.6-8.1_amd64.deb
bzip2_1.0.6-8.1_amd64.buildinfo libbz2-1.0_1.0.6-8.1_amd64.deb
bzip2_1.0.6-8.1_amd64.changes libbz2-dev_1.0.6-8.1_amd64.deb
bzip2_1.0.6-8.1_amd64.deb tree
$
tree is a symlink to bzip2-1.0.6, the build directory (useful to have a
consistent name). install-root is the result of extracting all binary packages.
extract-dbg contains debugging files. Finally, exe/ is the result of our
run, with one subdirectory per executable/library in install-root:
$ ls build-bzip2/exe/
0-bzcat 1-bunzip2 2-bzip2 3-bzip2recover
$ ls build-bzip2/exe/2-bzip2/
egalito.log program symbols tables.diff tables.egalito tables.gimple
$ file build-bzip2/exe/2-bzip2/program
build-bzip2/exe/2-bzip2/program: symbolic link to ../../install-root/bin/bzip2
$ cat build-bzip2/exe/2-bzip2/tables.egalito
jump table in [testStream] at 0x6510 with 7 entries
jump table in [uncompressStream] at 0x652c with 7 entries
jump table in [main] at 0x6548 with 74 entries
$ grep 'jump table' build-bzip2/exe/2-bzip2/tables.gimple
jump table in [testStream]:
jump table in [testStream] has 7 entries\n
jump table in [uncompressStream]:
jump table in [uncompressStream] has 7 entries\n
jump table in [main]: (from unknown FILE)
jump table in [main] has 74 entries\n
$ cat build-bzip2/exe/2-bzip2/tables.diff
Here the gimple version lists each option for a symbol on successive lines;
this example has only one option per symbol. Diffing is performed by
jumptable-diff-option.pl, and the empty tables.diff shows that both sides
agree.
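The comparison can be sketched by reducing both file formats to "symbol count" pairs and checking that every Egalito pair appears among the ground-truth options. This is a simplified illustration using the bzip2 listing above as sample data. It is not the logic of jumptable-diff-option.pl, which also handles differing symbol names.

```shell
# Sample data copied from the bzip2 listing (the literal \n is kept as-is).
egalito_tables='jump table in [testStream] at 0x6510 with 7 entries
jump table in [uncompressStream] at 0x652c with 7 entries
jump table in [main] at 0x6548 with 74 entries'

gimple_tables='jump table in [testStream]:
jump table in [testStream] has 7 entries\n
jump table in [uncompressStream]:
jump table in [uncompressStream] has 7 entries\n
jump table in [main]: (from unknown FILE)
jump table in [main] has 74 entries\n'

# Reduce both formats to "symbol count" pairs.
egalito_pairs=$(printf '%s\n' "$egalito_tables" |
    sed -n 's/^jump table in \[\(.*\)] at 0x[0-9a-f]* with \([0-9]*\) entries.*$/\1 \2/p')
gimple_pairs=$(printf '%s\n' "$gimple_tables" |
    sed -n 's/^jump table in \[\(.*\)] has \([0-9]*\) entries.*$/\1 \2/p')

# A table is matched if its pair appears among the ground-truth options.
mismatches=0
while read -r pair; do
    if ! printf '%s\n' "$gimple_pairs" | grep -qxF -- "$pair"; then
        echo "no ground-truth match for: $pair"
        mismatches=$((mismatches + 1))
    fi
done <<EOF
$egalito_pairs
EOF
echo "mismatches: $mismatches"
```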
The above is how our original experiments were performed, but disk space is
used up quickly (168 packages took about 100GB). On this virtual machine we
provide metaautomate.sh, which automatically deletes the build-* directory and
keeps only the important parts in result-*. Usage:
$ cd ~/debjumptable/gimple
$ export EGALITO_ROOT=~/egalito-head/
$ export LC_ALL=C
$ ./metaautomate.sh bzip2
running experiments
logs will be in automate-logs/
build-* automatically deleted to save disk, summarized in result-*
===
bzip2...
real 0m21.764s
user 0m11.384s
sys 0m2.764s
jumptable diff seems OK
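The disk-saving pattern can be sketched as a loop that runs one package at a time, copies a small summary into result-*, and removes the bulky build-* tree. This is a hypothetical illustration, not the contents of metaautomate.sh. run_one_package is a stand-in for the real per-package step, and which files count as "important" is an assumption.

```shell
# Hypothetical stand-in for the per-package step; the real scripts build
# the package inside the chroot instead.
run_one_package() {
    pkg=$1
    mkdir -p "build-$pkg/exe"
    echo "jumptable diff seems OK" > "build-$pkg/exe/summary.txt"
}

# Run each package, keep a small summary, and drop the build tree.
summarize_and_clean() {
    for pkg in "$@"; do
        run_one_package "$pkg" || continue
        mkdir -p "result-$pkg"
        cp -r "build-$pkg/exe" "result-$pkg/"
        rm -rf "build-$pkg"
    done
}

# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
summarize_and_clean bzip2 gzip
ls
```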
Since build directories are deleted, you can safely try many packages in
succession without exceeding the 20GB disk space limit of the VM (although
some large packages will likely not succeed). To run the set of packages we
tried (originally scraped from the Debian popcon list):
$ ./metaautomate.sh $(cat ../build_list_demo)
This takes several hours to complete. The results can be examined at the end
or during execution with the following script:
$ ./results2.pl automate-logs/*.log 2>/dev/null
good list: accountsservice acl alsa-utils anacron apache2 apache2-bin apache2-ut
ils appstream apt apt-utils aspell at-spi2-core avahi-daemon base-files bash bc
bind9-host bluez bsdmainutils bsdutils bzip2 colord coreutils cpio cracklib-runt
ime crda cups cups-browsed cups-daemon cups-filters dash dbus dbus-user-session
dc dconf-gsettings-backend dconf-service debianutils desktop-file-utils diffutil
s dirmngr dmsetup dpkg e2fsprogs evolution-data-server exim4-config exim4-daemon
-light file findutils fuse gconf-service gcr gir1.2-accountsservice-1.0 gir1.2-a
tk-1.0 gir1.2-atspi-2.0 gir1.2-freedesktop gir1.2-gck-1 gir1.2-gcr-3 gir1.2-gdes
ktopenums-3.0 gir1.2-gdkpixbuf-2.0 gir1.2-glib-2.0 gir1.2-gnomebluetooth-1.0 gir
1.2-gnomedesktop-3.0 gir1.2-gtk-3.0 gir1.2-ibus-1.0 gir1.2-pango-1.0 gir1.2-polk
it-1.0 gir1.2-soup-2.4 gir1.2-upowerglib-1.0 gjs glib-networking gnome-shell gnu
pg gnupg-agent gnupg2 gpgv grep gstreamer1.0-plugins-good gstreamer1.0-x gtk-upd
ate-icon-cache gvfs gvfs-backends gvfs-daemons gvfs-fuse gvfs-libs hdparm hostna
me hplip ifupdown iio-sensor-proxy imagemagick-6.q16 inkscape iproute2 iptables
iputils-ping irqbalance iw kbd klibc-utils kmod less lm-sensors login logrotate
lsb-release lsof lua-bitop lua-expat lua-socket lvm2 man-db mawk mesa-utils mini
ssdpd modemmanager mount mutter ncurses-bin net-tools ntfs-3g openssh-client ope
nssh-server p11-kit p11-kit-modules packagekit parted passwd pciutils pinentry-g
nome3 policykit-1 ppp printer-driver-foo2zjs printer-driver-hpcups procps psmisc
rpcbind rsync rsyslog rtkit shared-mime-info sudo system-config-printer-udev sy
svinit-utils tar udev udisks2 unzip upower usbmuxd usbutils wget whiptail wirele
ss-tools wpasupplicant x11-utils x11-xkb-utils x11-xserver-utils xbrlapi xdg-use
r-dirs xxd xz-utils zenity zlib1g
good count: 162 packages, 1207 exes, 3372 tables, 191 needed heuristic
fail list:
fail count: 0 packages, 0 exes, 0 tables, 0 needed heuristic
skip list: adduser apt-listchanges busybox console-setup console-setup-linux deb
conf firmware-linux-free gnome-backgrounds gnome-settings-daemon initramfs-tools
initramfs-tools-core keyboard-configuration linux-libc-dev lua-json media-playe
r-info mime-support netbase network-manager openssl popularity-contest samba-lib
s sensible-utils tzdata ucf unattended-upgrades ure wireless-regdb xdg-utils
skip count: 28 packages, 17 exes, 0 tables, 0 needed heuristic
FINAL: 100.00% of exes are good, 94.34% of tables had bounds
Here 1207 executables in 162 packages worked successfully. Of 3372 tables, 191
needed a heuristic to determine the bound (hence, 94.34% had an analytical
bound that we could determine from dataflow). Your results may vary: a package
that we could not build is reported as skipped, and Debian package sources are
sure to change over time. A small handful of cases that we investigated
manually are hardcoded in results2.pl to produce a specific report.
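The 94.34% figure follows directly from the counts in the summary: it is the share of tables that did not need a heuristic. A small check of that arithmetic (the variable names are ours, chosen for illustration):

```shell
# Counts taken from the "good count" line of the summary.
tables=3372
heuristic=191

# Share of tables whose bound was found without a heuristic.
pct=$(awk -v t="$tables" -v h="$heuristic" \
    'BEGIN { printf "%.2f", 100 * (t - h) / t }')
echo "$pct% of tables had bounds"
```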
=== Debian package tests (autopkgtest/)
Like the large-scale jump table tests, the Debian autopkgtests are performed by
downloading the source and build dependencies for a package within the chroot.
We run the tests that are provided by the package maintainers. To do this, we
use the Debian tool autopkgtest 4.4 (we provide a local copy of this tool,
because we originally modified it, but the stock version should work now).
The script autotest.sh performs tests on a single package. It enters the chroot
twice, once to build and once to run tests. We create our own .deb package with
the Egalito-transformed binaries. Full example:
$ cd ~/autopkgtest
$ export EGALITO_ROOT=~/egalito-head/
$ export LC_ALL=C
$ ./autotest.sh aragorn
[...]
=== Test return value: 0
87 sequences searched
Total tRNA genes = 87
Total tmRNA genes = 0
Nothing found in 1 sequences (98.85% sensitivity)
Configuration: aragorn debian/tests/testseq.fasta
autopkgtest [16:11:49]: test command1: -----------------------]
autopkgtest [16:11:49]: test command1: - - - - - - - - - - results - - - - - - - - - -
command1 PASS
autopkgtest [16:11:49]: @@@@@@@@@@@@@@@@@@@@ summary
command1 PASS
=== end test log
As before, we provide metaautomate.sh to run multiple packages and delete the
temporary build directories between runs to save disk space. Each build-*
directory is deleted, and a result-* directory containing only the test log is
created. Note that some Debian packages are not worth testing because they
take a lot of time and/or disk space to build (e.g. gnome-settings-daemon), so
keep this in mind when trying new packages.
$ cd ~/autopkgtest
$ export EGALITO_ROOT=~/egalito-head/
$ export LC_ALL=C
$ ./metaautomate.sh an aragorn
running experiments
logs will be in automate-logs/
build-* automatically deleted to save disk, summarized in result-*
===
an...
real 0m15.763s
user 0m6.872s
sys 0m2.396s
=== end test log
aragorn...
real 0m25.838s
user 0m12.192s
sys 0m3.068s
=== end test log
$
We provide count-results.sh which examines the testlogs and prints a summary.
SKIP means the package did not build properly, or did not contain tests.
$ ./count-results.sh
PASS: aragorn
FAIL:
SKIP: an
PASS 1
FAIL 0
SKIP 1
$
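A rough sketch of how a test log could be classified into PASS, FAIL, or SKIP, based on the autopkgtest summary lines shown in the aragorn log. This is an assumption about the approach rather than the contents of count-results.sh. classify_log is a hypothetical helper, and treating a log without any summary result as SKIP is our simplification.

```shell
# Hypothetical helper: reads one test log on stdin and prints a verdict.
classify_log() {
    awk '
        /@ summary$/               { in_summary = 1; next }
        in_summary && $2 == "FAIL" { failed = 1 }
        in_summary && $2 == "PASS" { passed = 1 }
        END {
            if (failed)      print "FAIL"
            else if (passed) print "PASS"
            else             print "SKIP"
        }'
}

# A shortened log in the format shown for aragorn.
sample_log='autopkgtest [16:11:49]: @@@@@@@@@@@@@@@@@@@@ summary
command1             PASS
=== end test log'

printf '%s\n' "$sample_log" | classify_log
```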
In our original experiment, we ran
$ ./metaautomate.sh $(cat package-list)
This has not been replicated on the virtual machine, but our results were as
follows:
$ ./count-results.sh
PASS: akonadiconsole amap-align aragorn atk1.0 bcftools bibtool bio-eagle biosqu
id bomstrip bowtie boxes boxshade bwa cd-hit clustalw cmake confget coz-profiler
cproto dbus deal dialign distro-info dolphin emboss exonerate ffmpeg freecad gd
isk genometools gnudatalanguage haveged hmmer imagemagick infernal ioping linuxi
nfo mafft maq marble miniasm minimap mp4h mummer muscle mustang nagios-plugins-c
ontrib ncompress ocrad okular onscripter parley pbsim pgpool2 pgtap phylip phyml
plink plink1.9 primer3 prips privoxy probcons python2.7 python-coverage python-
cups python-pyxattr python-scipy rc reapr rocs saods9 sim4 since snpomatic snp-s
ites spline swig tantan t-coffee tigr-glimmer timelimit valgrind velvet vifm vor
bis-tools wise yapet yorick zsh
FAIL: aegean apt aria2 augustus bedtools bonnie++ bowtie2 bs1770gain cdebootstra
p colord crash csound cups dcraw doxygen esorex exim4 fityk freeradius gjs gnome
-photos gnuplot grep gwenview haproxy hugin inspircd memcached mscgen ncbi-blast
+ njplot notify-osd pdf2djvu pgbouncer postfix poxml pyxplot quagga radvd rnahyb
rid rna-star ruby-hdfeos5 samtools screen sextractor spades subversion suricata
svn-all-fast-export syslog-ng tetgen theseus tracker umbrello upower util-linux
vlc wireshark x264
SKIP: abyss afterstep an antiword apachetop arping aspectc++ asterisk awesome ax
el bamtools beanstalkd beef bind9 bittwist bosh bti bzip2 cbm cflow cfortran chr
ony c-icap clasp colorized-logs corosync cryfs cutils daisy-player dash dcfldd d
ealer deborphan debsig-verify detox dhcpcd5 docbook-to-man dynare ed entr espeak
exifprobe fakeroot fastdnaml fig2dev fim firejail flite flog freemat fsarchiver
galleta gconjugue gdpc gff2aplot ghostscript gifsicle gnat goaccess gthumb gzip
httping icecast2 ifupdown ii initramfs-tools iproute2 ipset iptables last-align
lbdb lebiniou lighttpd logrotate lxc magicrescue makedumpfile mdk memdump metac
am minissdpd multipath-tools munge nbc nginx nqc nsca octave ola openconnect ope
n-iscsi openscad openssl osm2pgrouting pacemaker pagetools paperkey parcellite p
dfcrack perl pev poa policykit-1 postgis praat proda puredata re2c readseq rhash
safecopy sdate seaview sed sirikali smcroute sosi2osm ssh-agent-filter stunnel4
sunclock surf survex tcpxtract therion tor uchardet udisks2 unhide unrar-free u
px-ucl varnish virt-what vsftpd wml wmnd xapian-omega xfig xrootconsole xterm zi
le
PASS 90
FAIL 59
SKIP 140
$
Note that this count-results.sh script does not contain special cases. The
most common failure is a testlog containing "undefined symbol", which is a
known Egalito issue (the symbol version bloom filter is not implemented). The
remaining failures are still being investigated.
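To see how many logs show this failure signature, the test logs can be searched for the string. The sketch below runs on throwaway files; the result-*/testlog layout and the log contents are assumptions made for illustration.

```shell
# Throwaway logs standing in for result-*/testlog (hypothetical layout).
logdir=$(mktemp -d)
mkdir -p "$logdir/result-a" "$logdir/result-b"
echo "./prog: symbol lookup error: undefined symbol: foo" \
    > "$logdir/result-a/testlog"
echo "command1             PASS" > "$logdir/result-b/testlog"

# List the logs that mention the known failure signature, then count them.
affected=$(grep -l "undefined symbol" "$logdir"/result-*/testlog)
count=$(printf '%s\n' "$affected" | grep -c .)
echo "$count log(s) mention an undefined symbol"
```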
Some newer modular scripts for deb package testing are in ~/debtest. They are
included only in case the reader would like to try similar experiments.
=== How to recreate the chroot (/srv/chroot/pkgtest)
We already ran these commands in this virtual machine image. These
instructions are only included for the interested reader to attempt to
replicate the chroot on another machine.
$ sudo apt install schroot debootstrap
$ sudo mkdir -p /srv/chroot/pkgtest
$ sudo debootstrap stretch /srv/chroot/pkgtest
[...]
$ sudo bash -c "cat > /etc/schroot/chroot.d/pkgtest"
[pkgtest]
type=directory
description=Egalito pkgtest Environment
directory=/srv/chroot/pkgtest
users=egalito
root-users=egalito
union-type=overlay
$ sudo schroot -l
chroot:pkgtest
source:pkgtest
$ sudo schroot -c pkgtest
(pkgtest)# cat /etc/debian_version
9.11
(pkgtest)# logout
$ cat /etc/debian_version
9.11
$
By default, the chroot has an apt sources list that mentions deb (binary
packages) but not deb-src (for source packages). So, we copy in our host
system's sources.list, and assume it has deb-src:
$ sudo cp /etc/apt/sources.list /srv/chroot/pkgtest/etc/apt/sources.list
$ sudo chroot /srv/chroot/pkgtest
(pkgtest)# apt update
(pkgtest)# apt install build-essential file
(pkgtest)# apt install sudo
(pkgtest)# visudo # we will add the following line:
ALL ALL=NOPASSWD: /usr/bin/apt-get
(pkgtest)# exit
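Since the steps above assume that the host's sources.list contains deb-src entries, it can be worth checking before copying it. A minimal sketch follows; has_deb_src is a hypothetical helper, and the sample file contents are illustrative.

```shell
# Hypothetical helper: succeeds if the file has an uncommented deb-src line.
has_deb_src() {
    grep -Eq '^[[:space:]]*deb-src[[:space:]]' "$1"
}

# Sample sources.list in which the deb-src line is commented out.
sample=$(mktemp)
cat > "$sample" <<'EOF'
deb http://deb.debian.org/debian stretch main
# deb-src http://deb.debian.org/debian stretch main
EOF

if has_deb_src "$sample"; then
    status="deb-src present"
else
    status="deb-src missing"
fi
echo "$status"
```

On the host, the same check would be run against /etc/apt/sources.list.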
Now we can enter (a temporary copy of) the chroot as root with "sudo schroot -c
pkgtest", but files created in /home end up owned by root (and the chroot has
no non-root user to su to). We typically use "schroot -c pkgtest" to enter as
the egalito user, and use sudo apt-get to install packages. You can use "mount"
to see running copies of the chroot.