Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set Gzip compression level to Deflater.BEST_COMPRESSION #12691

Merged
merged 4 commits into from
Dec 22, 2024

Conversation

touhidurrr
Copy link
Contributor

@touhidurrr touhidurrr commented Dec 21, 2024

resolves #12685

@touhidurrr touhidurrr changed the title Set Gzip compression level to Deflater.BEST_COMPRESSION Set Gzip compression level to Deflater.BEST_COMPRESSION Dec 21, 2024
@SomeTroglodyte
Copy link
Collaborator

Shorter and wouldn't change the IDE's icon for the file to "mixed":

Index: core/src/com/unciv/ui/screens/savescreens/Gzip.kt
<+>UTF-8
===================================================================
diff --git a/core/src/com/unciv/ui/screens/savescreens/Gzip.kt b/core/src/com/unciv/ui/screens/savescreens/Gzip.kt
--- a/core/src/com/unciv/ui/screens/savescreens/Gzip.kt	(revision b0bf22e6d149ac4a39dc7ab67d50e97ac098d66a)
+++ b/core/src/com/unciv/ui/screens/savescreens/Gzip.kt	(date 1734797380009)
@@ -5,6 +5,7 @@
 import java.io.ByteArrayInputStream
 import java.io.ByteArrayOutputStream
 import java.io.InputStreamReader
+import java.util.zip.Deflater
 import java.util.zip.GZIPInputStream
 import java.util.zip.GZIPOutputStream
 
@@ -15,7 +16,11 @@
 
     private fun compress(data: String): ByteArray {
         val bos = ByteArrayOutputStream(data.length)
-        val gzip = GZIPOutputStream(bos)
+        val gzip = object : GZIPOutputStream(bos) {
+            init {
+                def.setLevel(Deflater.BEST_COMPRESSION)
+            }
+        }
         gzip.write(data.toByteArray())
         gzip.close()
         val compressed = bos.toByteArray()

Don't get me wrong - not a criticism, just a matter of taste.

@SomeTroglodyte
Copy link
Collaborator

Also, since the difference between BEST_COMPRESSION and DEFAULT_COMPRESSION isn't obvious from digging into the Deflater source:
https://stackoverflow.com/questions/16485573/what-exactly-is-default-compression /cmp: https://www.zlib.net/manual.html
Synopsis: The pre-PR state using defaults defaulted to 6 (out of 1..9) a decade ago and still does, best compromise between speed and savings.

@touhidurrr
Copy link
Contributor Author

Noob question as I am new to kotlin.

-        val gzip = GZIPOutputStream(bos)
+        val gzip = object : GZIPOutputStream(bos) {
+            init {
+                def.setLevel(Deflater.BEST_COMPRESSION)
+            }
+        }

how does it work?

@yairm210 yairm210 merged commit 997648e into yairm210:master Dec 22, 2024
4 checks passed
@yairm210
Copy link
Owner

I was far too hasty merging this
Since we're designed to run on low-end devices, we REALLY need to compare time vs compression for these two options
It's entirely possible we're causing users to wait twice as long for like a 5% decrease in file size, this isn't free real estate

@yairm210
Copy link
Owner

Reverting for the deployment

yairm210 added a commit that referenced this pull request Dec 22, 2024
@SomeTroglodyte
Copy link
Collaborator

how does it work

You originally did an explicit subclass, that's an anonymous subclass. The object notation is better known for use as container for statics (or even as namespace kludge), but there's really no difference - object Foo { ... } creates a single global instance of an anonymous subclass of Any. There's tons of online docs, but refer to https://kotlinlang.org/docs/object-declarations.html when you read one of the simpler tutorials.

The other difference is default constructor vs. secondary constructor. Android Studio might even have an inspection warning that your original notation can be simplified. ... yes it does:
image

But I was negligent and thus wrong as well - the simplest form by far would be: val gzip = GZIPOutputStream(bos, Deflater.BEST_COMPRESSION) - not a notation or language thing but a thing of what the std library has prepared for us or reading documentation. 🙈 sorry.

@touhidurrr
Copy link
Contributor Author

touhidurrr commented Dec 22, 2024

val gzip = GZIPOutputStream(bos, Deflater.BEST_COMPRESSION)

How did I miss this though... I am pretty sure I read through all constructors of GZIPOutputStream before writing this. -_-

@SomeTroglodyte
Copy link
Collaborator

SomeTroglodyte commented Dec 22, 2024

... aaannd that's a buffer size not a level, so I was wrong AGAIN

Edit: "pretty sure I read through all constructors of GZIPOutputStream before writing this" - looks like you did!
Also: I got nicely dipped 🐕 in that 💩 by the unit test below

@SomeTroglodyte
Copy link
Collaborator

Dumb test using the biggest old save I had (2.3M uncompressed):
Deflate level 1: 268877 bytes, 53.085977ms
Deflate level 2: 249772 bytes, 44.123684ms
Deflate level 3: 241189 bytes, 42.973530ms
Deflate level 4: 218127 bytes, 52.882007ms
Deflate level 5: 200665 bytes, 59.617448ms
Deflate level 6: 190941 bytes, 82.739471ms
Deflate level 7: 187107 bytes, 112.158701ms
Deflate level 8: 183676 bytes, 159.312677ms
Deflate level 9: 183240 bytes, 152.431860ms

Patch to produce such measurements
Index: core/src/com/unciv/ui/screens/savescreens/Gzip.kt
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/core/src/com/unciv/ui/screens/savescreens/Gzip.kt b/core/src/com/unciv/ui/screens/savescreens/Gzip.kt
--- a/core/src/com/unciv/ui/screens/savescreens/Gzip.kt	(revision b0bf22e6d149ac4a39dc7ab67d50e97ac098d66a)
+++ b/core/src/com/unciv/ui/screens/savescreens/Gzip.kt	(date 1734876690913)
@@ -5,17 +5,24 @@
 import java.io.ByteArrayInputStream
 import java.io.ByteArrayOutputStream
 import java.io.InputStreamReader
+import java.util.zip.Deflater
 import java.util.zip.GZIPInputStream
 import java.util.zip.GZIPOutputStream
+import org.jetbrains.annotations.VisibleForTesting
 
 object Gzip {
 
     fun zip(data: String): String = encode(compress(data))
     fun unzip(data: String): String  = decompress(decode(data))
 
-    private fun compress(data: String): ByteArray {
+    @VisibleForTesting
+    fun compress(data: String, compressionLevel: Int = Deflater.DEFAULT_COMPRESSION): ByteArray {
         val bos = ByteArrayOutputStream(data.length)
-        val gzip = GZIPOutputStream(bos)
+        val gzip = object : GZIPOutputStream(bos) {
+            init {
+                def.setLevel(compressionLevel)
+            }
+        }
         gzip.write(data.toByteArray())
         gzip.close()
         val compressed = bos.toByteArray()
Index: tests/src/com/unciv/testing/CompressionTests.kt
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/tests/src/com/unciv/testing/CompressionTests.kt b/tests/src/com/unciv/testing/CompressionTests.kt
new file mode 100644
--- /dev/null	(date 1734876958011)
+++ b/tests/src/com/unciv/testing/CompressionTests.kt	(date 1734876958011)
@@ -0,0 +1,30 @@
+package com.unciv.testing
+
+
+import com.unciv.ui.screens.savescreens.Gzip
+import java.io.File
+import org.junit.Ignore
+import org.junit.Test
+import org.junit.runner.RunWith
+import kotlin.time.TimeSource
+import kotlin.time.measureTimedValue
+
+@RunWith(GdxTestRunner::class)
+class CompressionTests {
+    @Test
+    @RedirectOutput(RedirectPolicy.Show)
+    //@Ignore
+    fun compareCompressionLevels() {
+        val gameFile = File("../../tests/assets/bigsave")
+        println(gameFile.canonicalPath)
+        val testData = gameFile.readText()
+
+        Gzip.compress(testData) // warm up
+        for (level in 1..9) {
+            val result = TimeSource.Monotonic.measureTimedValue { 
+                Gzip.compress(testData, level)
+            }
+            println("Deflate level $level: ${result.value.size} bytes, ${result.duration}")
+        }
+    }
+}

Copy any save to your liking (or just any file at all, the test does not parsing) to /tests/assets/bigsave and run the test... Of course this shouldn't go into prod without at least an @Ignore annotation.

So for this random sample we can expect +84% CPU time for -4% bandwidth gain... I'd have expected a bit better. Of course it's true CPU is somewhat cheaper than metered bandwidth, but... You weigh it.

@touhidurrr
Copy link
Contributor Author

@SomeTroglodyte, read #12685 (comment). Already did a similar benchmark there for this topic. I covered compression / decompression times and size saving vs level 1 etc in there. Also set iterations to 100 for each test and took the average.

@touhidurrr touhidurrr deleted the gzip-level-9 branch December 23, 2024 07:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggestion: Use maximum Gzip compression level (level 9) for uploading jsons to the server.
3 participants