Forged ZIP archive manages to hide the Trojan from clamav #1014

Open
devgs opened this issue Aug 29, 2023 · 14 comments

@devgs

devgs commented Aug 29, 2023

Describe the bug

It seems that bad actors have found a 'hole' in the ZIP format that lets them inject hidden content in such a way that it isn't visible to clamav.

In our case, it's a virus, so I'm not even sure whether it's wise to attach it here. That's why I'm going to add .gif to the file name as a weak safety measure:

forged.zip.gif

If the archive is extracted with unzip on Linux, the resulting file (another ZIP archive) is detected as containing the virus.

Here's how the other parties see the original file.

How to reproduce the problem

1. Take the attached forged.zip.gif as an example.

2. Rename it to forged.zip

3. Run clamscan:

clamscan --verbose --archive-verbose -d /var/db/clamav/ /home/foo/2/forged.zip
Loading:    28s, ETA:   0s [========================>]   12.88M/12.88M sigs
Compiling:   4s, ETA:   0s [========================>]       42/42 tasks

Scanning /home/foo/2/forged.zip
/home/foo/2/forged.zip: OK

----------- SCAN SUMMARY -----------
Known viruses: 12878904
Engine version: 1.0.1
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 1.03 MB
Data read: 0.96 MB (ratio 1.06:1)
Time: 41.003 sec (0 m 41 s)
Start Date: 2023:08:29 14:40:21
End Date:   2023:08:29 14:41:02

4. Extract it in a shell (note the warning about 351470 extra bytes at beginning or within zipfile):

$ ll
total 1008
drwxrwxr-x 2 foo foo    4096 Aug 29 14:22 ./
drwxr-xr-x 8 foo foo   12288 Aug 29 14:22 ../
-rw-rw-r-- 1 foo foo 1013510 Aug 29 14:22 forged.zip
$ file forged.zip
forged.zip: Zip archive data, at least v2.0 to extract
$ unzip forged.zip
Archive:  forged.zip
warning [forged.zip]:  351470 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  inflating: Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf
$ ll
total 1656
drwxrwxr-x 2 foo foo    4096 Aug 29 14:23  ./
drwxr-xr-x 8 foo foo   12288 Aug 29 14:22  ../
-rw-rw-r-- 1 foo foo 1013510 Aug 29 14:22  forged.zip
-rw-rw-r-- 1 foo foo  661676 Aug 29 04:13 'Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf '
$ file 'Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf '
Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf : Zip archive data, at least v2.0 to extract
$ unzip 'Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf '
Archive:  Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf
  inflating: 1_Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf.exe
  inflating: Pax_18_08_23.jpg
$ ll
total 2548
drwxrwxr-x 2 foo foo    4096 Aug 29 14:24  ./
drwxr-xr-x 8 foo foo   12288 Aug 29 14:22  ../
-rw-rw-r-- 1 foo foo  639919 Aug 23 03:28  1_Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf.exe
-rw-rw-r-- 1 foo foo 1013510 Aug 29 14:22  forged.zip
-rw-rw-r-- 1 foo foo  269824 Aug 29 04:10  Pax_18_08_23.jpg
-rw-rw-r-- 1 foo foo  661676 Aug 29 04:13 'Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf '
$ file Pax_18_08_23.jpg
Pax_18_08_23.jpg: PE32 executable (GUI) Intel 80386, for MS Windows
$ file 1_Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf.exe
1_Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf.exe: PE32 executable (GUI) Intel 80386, for MS Windows
$ ll
total 2548
drwxrwxr-x 2 foo foo    4096 Aug 29 14:24  ./
drwxr-xr-x 8 foo foo   12288 Aug 29 14:22  ../
-rw-rw-r-- 1 foo foo  639919 Aug 23 03:28  1_Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf.exe
-rw-rw-r-- 1 foo foo 1013510 Aug 29 14:22  forged.zip
-rw-rw-r-- 1 foo foo  269824 Aug 29 04:10  Pax_18_08_23.jpg
-rw-rw-r-- 1 foo foo  661676 Aug 29 04:13 'Рахунок_фактура_вiд_18_08_2023р_Помилкове_зарахування.pdf '
$ sha256sum forged.zip
3676ee70492d30e156eac1d1b8e7a99e18deb7beabd64dc89bbf18c6a9326579  forged.zip

5. Running clamscan on the archive extracted by Linux's unzip now gives a match:

clamscan --verbose --archive-verbose -d /var/db/clamav/ /home/foo/2/\?\?\?\?\?\?\?_\?\?\?\?\?\?\?_\?i\?_18_08_2023\?_\?\?\?\?\?\?\?\?\?_\?\?\?\?\?\?\?\?\?\?\?.pdf\
Loading:    28s, ETA:   0s [========================>]   12.88M/12.88M sigs
Compiling:   5s, ETA:   0s [========================>]       42/42 tasks

Scanning /home/foo/2/???????_???????_?i?_18_08_2023?_?????????_???????????.pdf
Scanning /home/foo/2/???????_???????_?i?_18_08_2023?_?????????_???????????.pdf !ZIP:1_▒▒▒㭮▒_䠪▒▒▒_▒i▒_18_08_2023▒_▒▒▒▒▒▒▒▒▒_▒▒▒▒㢠▒▒▒.pdf.exe
/home/foo/2/???????_???????_?i?_18_08_2023?_?????????_???????????.pdf : Sanesecurity.Foxhole.Zip_pdf.UNOFFICIAL FOUND
/home/foo/2/???????_???????_?i?_18_08_2023?_?????????_???????????.pdf !(0)ZIP:1_▒▒▒㭮▒_䠪▒▒▒_▒i▒_18_08_2023▒_▒▒▒▒▒▒▒▒▒_▒▒▒▒㢠▒▒▒.pdf.exe: Sanesecurity.Foxhole.Zip_pdf.UNOFFICIAL FOUND

----------- SCAN SUMMARY -----------
Known viruses: 12878904
Engine version: 1.0.1
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.00 MB
Data read: 0.63 MB (ratio 0.00:1)
Time: 39.911 sec (0 m 39 s)
Start Date: 2023:08:29 14:30:45
End Date:   2023:08:29 14:31:25

Config file: clamd.conf

LogFile = "/var/log/clamav/clamd.log"
LogFileMaxSize = "268435456"
LogTime = "yes"
LogClean = "yes"
LogRotate = "yes"
ExtendedDetectionInfo = "yes"
PidFile = "/var/run/clamav/clamd.pid"
TemporaryDirectory = "/var/tmp"
LocalSocket = "/var/run/clamav/clamd.sock"
TCPSocket = "3310"
TCPAddr = "127.0.0.1"
MaxConnectionQueueLength = "100"
MaxThreads = "20"
SelfCheck = "3600"
ExitOnOOM = "yes"
User = "clamav"
DetectPUA = "yes"
AlertBrokenExecutables = "yes"
AlertEncryptedArchive = "yes"
AlertPartitionIntersection = "yes"
MaxFileSize = "52428800"

Config file: freshclam.conf

PidFile = "/var/run/clamav/freshclam.pid"
UpdateLogFile = "/var/log/clamav/freshclam.log"
DatabaseMirror = "database.clamav.net"

Config file: clamav-milter.conf

PidFile = "/var/run/clamav/clamav-milter.pid"
User = "clamav"
ClamdSocket = "unix:/var/run/clamav/clamd.sock"
MilterSocket = "/var/run/clamav/clmilter.sock"

Software settings

Version: 1.0.1
Optional features supported: MEMPOOL AUTOIT_EA06 BZIP2 LIBXML2 PCRE2 ICONV JSON RAR

Database information

Database directory: /var/db/clamav
[3rd Party] EK_BleedingLife.yar: 112 sigs
[3rd Party] Email_quota_limit_warning.yar: 31 sigs
[3rd Party] bofhland_malware_attach.hdb: 1836 sigs
[3rd Party] badmacro.ndb: 688 sigs
[3rd Party] spear.ndb: 1 sig
daily.cld: version 27015, sigs: 2040076, built on Tue Aug 29 10:39:45 2023
[3rd Party] CVE-2015-5119.yar: 22 sigs
[3rd Party] WShell_Drupalgeddon2_icos.yar: 26 sigs
[3rd Party] sigwhitelist.ign2: 16 sigs
[3rd Party] securiteinfohtml.hdb: 32966 sigs
[3rd Party] junk.ndb: 56489 sigs
main.cvd: version 62, sigs: 6647427, built on Thu Sep 16 15:32:42 2021
[3rd Party] ukrnet-archive.hsb: 9 sigs
[3rd Party] spearl.ndb: 1 sig
[3rd Party] lott.ndb: 2338 sigs
[3rd Party] rfxn.hdb: 13030 sigs
[3rd Party] CVE-2018-4878.yar: 39 sigs
[3rd Party] scam.ndb: 12922 sigs
[3rd Party] spamattach.hdb: 14 sigs
[3rd Party] securiteinfo.hdb: 49086 sigs
[3rd Party] CVE-2013-0422.yar: 25 sigs
[3rd Party] winnow.complex.patterns.ldb: 3 sigs
[3rd Party] phish.ndb: 29803 sigs
[3rd Party] foxhole_js.ndb: 4 sigs
[3rd Party] MiscreantPunch099-Low.ldb: 1199 sigs
[3rd Party] bofhland_cracked_URL.ndb: 40 sigs
[3rd Party] winnow_phish_complete_url.ndb: 53 sigs
[3rd Party] Sanesecurity_sigtest.yara: 54 sigs
[3rd Party] jurlbl.ndb: 19499 sigs
[3rd Party] bofhland_phishing_URL.ndb: 72 sigs
[3rd Party] CVE-2016-5195.yar: 40 sigs
[3rd Party] CVE-2010-0805.yar: 19 sigs
[3rd Party] bofhland_malware_URL.ndb: 4 sigs
[3rd Party] foxhole_filename.cdb: 3519 sigs
[3rd Party] ukrnet-archive.cdb: 12 sigs
[3rd Party] shelter.ldb: 61 sigs
[3rd Party] phishtank.ndb: 1 sig
[3rd Party] winnow_bad_cw.hdb: 1 sig
[3rd Party] porcupine.ndb: 2676 sigs
bytecode.cvd: version 334, sigs: 91, built on Wed Feb 22 23:33:21 2023
[3rd Party] hackingteam.hsb: 435 sigs
[3rd Party] whitelist.fp: 3081 sigs
[3rd Party] interserver256.hdb: 28766 sigs
[3rd Party] foxhole_generic.cdb: 214 sigs
[3rd Party] sanesecurity.ftm: 185 sigs
[3rd Party] securiteinfopdf.hdb: 3408 sigs
[3rd Party] CVE-2010-1297.yar: 20 sigs
[3rd Party] javascript.ndb: 10557 sigs
[3rd Party] malwarehash.hsb: 1031 sigs
[3rd Party] winnow.attachments.hdb: 1 sig
[3rd Party] spamimg.hdb: 222 sigs
[3rd Party] securiteinfoascii.hdb: 36181 sigs
[3rd Party] CVE-2015-1701.yar: 30 sigs
[3rd Party] securiteinfoandroid.hdb: 29652 sigs
[3rd Party] WShell_ASPXSpy.yar: 21 sigs
[3rd Party] interservertopline.db: 1141 sigs
[3rd Party] ukrnet.yara: 25 sigs
[3rd Party] Sanesecurity_spam.yara: 46 sigs
[3rd Party] CVE-2017-11882.yar: 66 sigs
[3rd Party] EMAIL_Cryptowall.yar: 52 sigs
[3rd Party] rfxn.ndb: 2053 sigs
[3rd Party] spam.ldb: 2 sigs
[3rd Party] scam.yar: 35 sigs
[3rd Party] foxhole_js.cdb: 48 sigs
[3rd Party] securiteinfo.ign2: 190 sigs
[3rd Party] ukrnet-hashes.hsb: 28 sigs
[3rd Party] malwarepatrol.db: 0 sig
[3rd Party] blurl.ndb: 2044 sigs
[3rd Party] winnow_extended_malware.hdb: 1 sig
[3rd Party] email_Ukraine_BE_powerattack.yar: 33 sigs
[3rd Party] winnow_malware.hdb: 1 sig
[3rd Party] winnow_spam_complete.ndb: 26 sigs
[3rd Party] jurlbla.ndb: 687 sigs
[3rd Party] Email_fake_it_maintenance_bulletin.yar: 29 sigs
[3rd Party] rogue.hdb: 5748 sigs
[3rd Party] ukrnet-archive.ndb: 306 sigs
[3rd Party] CVE-2015-2545.yar: 76 sigs
[3rd Party] CVE-2010-0887.yar: 22 sigs
[3rd Party] CVE-2018-20250.yar: 22 sigs
[3rd Party] porcupine.hsb: 216 sigs
[3rd Party] rfxn.yara: 11527 sigs
[3rd Party] winnow_malware_links.ndb: 133 sigs
[3rd Party] CVE-2013-0074.yar: 22 sigs
[3rd Party] CVE-2012-0158.yar: 27 sigs
[3rd Party] urlhaus.ndb: 4128 sigs
[3rd Party] CVE-2015-2426.yar: 49 sigs
[3rd Party] securiteinfoold.hdb: 3849866 sigs
[3rd Party] winnow_extended_malware_links.ndb: 1 sig
Total number of signatures: 12906759

Platform information

<Redacted for security>

Build information

Clang: 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6) (4.2.1)
sizeof(void*) = 8
Engine flevel: 161, dconf: 161

@teoberi
Contributor

teoberi commented Aug 30, 2023

$ file forged.zip
forged.zip: Zip archive data, at least v2.0 to extract

Maybe it would be interesting to know which version of unzip Clamav uses internally.

@devgs
Author

devgs commented Aug 30, 2023

As far as I know, clamav doesn't even use a 'standard' libarchive. Nor any Zip libraries. I believe it has its own implementation.

Which is actually for the best, since a vulnerability in a third-party library would delay the patch rollout; it's easier to patch in a single place that you are responsible for.

Also, libraries usually aim for the 'best outcome' for the client, not concerning themselves with what this can lead to (see "351470 extra bytes at beginning or within zipfile (attempting to process anyway)").

@teoberi
Contributor

teoberi commented Aug 30, 2023

> As far as I know, clamav doesn't even use a 'standard' libarchive. Nor any Zip libraries. I believe it has its own implementation.

In this case, the problem is in the implementation.

@devgs
Author

devgs commented Aug 30, 2023

> In this case, the problem is in the implementation.

That's a no-brainer. Otherwise, why create an issue here in the first place 🤷‍♂️?

@teoberi
Contributor

teoberi commented Aug 31, 2023

Sophos Protection for Linux detects the virus!
avscanner -a forged.zip

[11:34:54] Archive scanning enabled: yes

[11:34:54] Image scanning enabled: no

[11:34:54] PUA detection enabled: no

[11:34:54] Following symlinks: no

[11:34:54] Scanning /home/xxx/Downloads/forged.zip

[11:34:55] Detected "/home/xxx/Downloads/forged.zip/®¬¨«ª®¢¥_§ à å㢠­­ï.pdf /1_®¬¨«ª®¢¥_§ à å㢠­­ï.pdf.exe" is infected with Mal/Generic-S (On Demand)

[11:34:55] Detected "/home/xxx/Downloads/forged.zip/®¬¨«ª®¢¥_§ à å㢠­­ï.pdf /Pax_18_08_23.jpg" is infected with Mal/Generic-S (On Demand)

[11:34:55] End of Scan Summary:

[11:34:55] 1 file scanned in 1 second.

[11:34:55] 1 file out of 1 was infected.

[11:34:55] 2 Mal/Generic-S infections discovered.

@teoberi
Contributor

teoberi commented Sep 1, 2023

Some explanations from the creator of 7-Zip, Igor Pavlov, about the unzipping mechanisms:
https://sourceforge.net/p/sevenzip/discussion/45798/thread/99350b37eb/

@rsundriyal
Contributor

@devgs Thanks for sharing the findings. Looking into it.

@val-ms
Contributor

val-ms commented Sep 1, 2023

I see what you mean. And Igor is quite right.

ZIP archives have a central directory at the end of the file with a list of file records. They also have a recognizable "local file header" before each file. So yeah, 2 different ways to get to the files in a ZIP.

The local file header found in front of each file is really simplistic. So you can have false positives from time to time when looking for it. It's not great, but works okay.

AVs should definitely look for both the central directory and the file records. And ClamAV has the logic to do both. However, Clam only really indexes the file looking for the local file headers if there is no central directory, or if it failed to extract some portion of the files listed in the central directory. In your example, the central directory exists, but there are no file entries there. This is why Clam isn't extracting the embedded PDF.
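To make the two lookup paths concrete, here is a minimal Python sketch (not ClamAV's code). It builds a synthetic archive, not the sample from this issue, whose last end-of-central-directory record claims zero entries while a local file header is still present. The helper names (`central_directory_count`, `local_header_offsets`, `make_sample`) are made up for illustration.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory record signature
LFH_SIG = b"PK\x03\x04"   # local file header signature


def central_directory_count(data: bytes) -> int:
    """Entry count claimed by the last EOCD record, or -1 if there is none."""
    pos = data.rfind(EOCD_SIG)
    if pos < 0 or pos + 22 > len(data):
        return -1
    # EOCD layout: signature(4) disk(2) cd_disk(2) entries_on_disk(2) total_entries(2) ...
    (total_entries,) = struct.unpack_from("<H", data, pos + 10)
    return total_entries


def local_header_offsets(data: bytes) -> list:
    """Offsets of every local-file-header signature found by a linear scan."""
    offsets = []
    pos = data.find(LFH_SIG)
    while pos >= 0:
        offsets.append(pos)
        pos = data.find(LFH_SIG, pos + 4)
    return offsets


def make_sample() -> bytes:
    """A valid one-file ZIP followed by an EOCD record that lists no entries."""
    buf = io.BytesIO()
    info = zipfile.ZipInfo("payload.bin", date_time=(2023, 8, 29, 0, 0, 0))
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, b"hello")
    empty_eocd = EOCD_SIG + b"\x00" * 18  # 22-byte EOCD with all counts zero
    return buf.getvalue() + empty_eocd


sample = make_sample()
print("entries claimed by central directory:", central_directory_count(sample))
print("local file headers found by scanning:", local_header_offsets(sample))
```

A scanner that trusts only the last central directory sees nothing to extract here, while the signature scan still finds the stored file.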

It would be trivial for us to change this if() statement to look for file entries if there were no files found in the central directory:

if (0 < num_files_unzipped && num_files_unzipped <= (file_count / 4)) { /* FIXME: make up a sane ratio or remove the whole logic */
But that still leaves a gap in case there is 1 file listed in the central directory, and another file omitted.
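
As a rough illustration of that gap, the condition can be transliterated into Python and evaluated for a few cases. This assumes the `if()` guards the fallback search for local file headers; `relaxed_condition` is only one hypothetical way to write the tweak, not an actual patch.

```python
def current_condition(num_files_unzipped: int, file_count: int) -> bool:
    # Transliteration of the quoted C condition (integer division, as in C).
    return 0 < num_files_unzipped and num_files_unzipped <= file_count // 4


def relaxed_condition(num_files_unzipped: int, file_count: int) -> bool:
    # Hypothetical tweak: also fall back when nothing at all was unzipped.
    return num_files_unzipped <= file_count // 4


cases = [
    (0, 0),  # central directory lists nothing (the forged archive)
    (1, 1),  # one file listed and extracted, a second file hidden
    (1, 8),  # most listed files failed to extract
]
for unzipped, listed in cases:
    print(unzipped, listed,
          current_condition(unzipped, listed),
          relaxed_condition(unzipped, listed))
```

The `(1, 1)` case stays false under both variants, which is the remaining gap: a hidden file next to one honestly listed file would still be missed.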

However, if we always look for file entries in addition to parsing the central directory, then we'll extract every file twice, assuming it is a properly formatted file. That's terrible for performance.

What I think we should do is search for local file headers every time, and then cross-reference them with our list of files found in the central directory. It won't be great for performance because it is indeed "the slow way + the fast way". But it seems necessary. We can then extract each file at the location it is found by either the central directory entry, or by the local file header, or both. I imagine we could also raise a heuristic alert, if desired, when a file was found by one or the other but not both AND correctly decompressed (a.k.a. a real file entry and not a false positive). That would raise an alert for files like the one you shared.
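
The cross-referencing idea could look roughly like the Python sketch below. It is only an outline of the logic under simplifying assumptions: entries are identified by their local-header offset alone, and the function names, result keys, and offsets are invented for the example.

```python
def cross_reference(cd_offsets, lfh_offsets):
    """Split entry offsets by which lookup path found them."""
    cd, lfh = set(cd_offsets), set(lfh_offsets)
    return {
        "both": sorted(cd & lfh),
        "central_only": sorted(cd - lfh),
        "local_only": sorted(lfh - cd),
    }


def should_alert(result, decompressed_ok):
    """Heuristic: alert if an entry found by only one path really decompressed.

    decompressed_ok is the set of offsets whose data decompressed correctly,
    which filters out false-positive local-header signature matches.
    """
    one_sided = set(result["central_only"]) | set(result["local_only"])
    return any(offset in decompressed_ok for offset in one_sided)


# Hypothetical offsets: the central directory lists two files, scanning finds three.
result = cross_reference(cd_offsets=[0, 100], lfh_offsets=[0, 100, 5000])
print(result)
print(should_alert(result, decompressed_ok={0, 100, 5000}))
```

Every offset in any of the three lists would still be extracted and scanned; the alert is an extra signal on top of that.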

Clam already has some logic to index all the entries in the central directory, sort them by file offset (ascending), de-duplicate them, and alert if there are overlapping file entries.
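
For readers unfamiliar with that step, a minimal Python sketch of "sort by offset, de-duplicate, flag overlaps" follows. It is not the ClamAV implementation; the `Entry` type and its `size` field (bytes occupied by header plus data) are assumptions made for the example.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    offset: int  # where the entry starts in the archive
    size: int    # bytes it occupies (hypothetical field: header plus data)


def index_entries(entries):
    """Return (unique entries sorted by offset, entries overlapping an earlier one)."""
    unique = sorted(set(entries))  # de-duplicate, then sort by (offset, size)
    overlapping = []
    max_end = 0  # furthest byte covered by any entry seen so far
    for entry in unique:
        if entry.offset < max_end:
            overlapping.append(entry)
        max_end = max(max_end, entry.offset + entry.size)
    return unique, overlapping


entries = [Entry(100, 50), Entry(0, 100), Entry(0, 100), Entry(120, 10)]
unique, overlapping = index_entries(entries)
print(unique)
print(overlapping)
```

Tracking the furthest end seen so far, rather than comparing only neighbours, also catches an entry that sits inside a much earlier, larger one.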

So we'd just need to rewrite some of the logic of the while loop to index that list of sorted file entries and cross-reference them with the local file headers as they're found.

@Sanesecurity

Sanesecurity commented Sep 1, 2023

Only to add, once the code is changed...

Do zip cdb sigs work on both local and central directories?

Will local file start at 1 and what about central directories?

I.e.

FilePos: file position in container (counting from 1); absolute value or range

@val-ms
Contributor

val-ms commented Sep 6, 2023

@Sanesecurity that is a good question.

@teoberi
Contributor

teoberi commented Sep 11, 2023

What's going on? Nothing for two weeks.

@val-ms
Contributor

val-ms commented Sep 14, 2023

We have lots of things planned and in progress. Haven't dropped those to work on this. If anyone else wants to try to figure it out first, you're welcome to.
But also I have had to take a bunch of time off unexpectedly this week and will have to take off more in a couple weeks.

@candrews
Contributor

Has any progress been made on this issue? I'm quite interested in seeing it resolved :)

@val-ms
Contributor

val-ms commented Jan 16, 2024

I don't think anyone is working on it.
