Commit: checkpoint
Greg Zoller committed Nov 15, 2023
1 parent ee5da35 commit caaa5e3
Showing 42 changed files with 1,956 additions and 2,593 deletions.
24 changes: 23 additions & 1 deletion LICENSE
@@ -1,4 +1,4 @@
## MIT License

Copyright (c) 2020 Greg Zoller

@@ -19,3 +19,25 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

## ZIO-Json Attribution

Parts of ScalaJack's JSON-reading code were used directly, or derived, from the
[ZIO-Json project](https://github.com/zio/zio-json),
licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
In most cases, "derived" means the removal of features not needed by ScalaJack, or
other changes needed to adapt the code to the ScalaJack ecosystem. Regardless,
these files are credited here as materially the same as those in the ZIO-Json project.

The files used directly or derived are:
* FastStringBuilder.scala
* FieldKeyDecoder.scala
* JsonDecoder.scala
* JsonParser.scala
* JsonReader.scala
* Numbers.scala
* StringMatrix.scala

The terms, privileges, and restrictions of the Apache License 2.0 fully apply to
these files wherever they differ from those of the MIT license, which applies to
the rest of ScalaJack's code.
68 changes: 47 additions & 21 deletions benchmark/README.md
@@ -10,37 +10,63 @@ of its life, again to be a more realistic use case.

Run benchmark from the ScalaJack/benchmark directory (not the main ScalaJack project directory):
```
sbt "jmh:run -i 10 -wi 10 -f 2 -t 1 co.blocke.*"
```

## Reading Performance:

| Benchmark | Mode | Count | Score | Error | Units |
|------------------|-------|-------:|----------------:|-------------:|-------|
| Jsoniter | thrpt | 20 | 987991.329 | ± 6645.992 | ops/s |
| **ScalaJack 8**  | thrpt | 20 | **633764.943**  | ± 10394.860  | ops/s |
| ZIOJson | thrpt | 20 | 586716.228 | ± 2542.783 | ops/s |
| Circe | thrpt | 20 | 266568.198 | ± 5695.754 | ops/s |
| Play | thrpt | 20 | 207737.560 | ± 842.108 | ops/s |
| Argonaut | thrpt | 20 | 197876.777 | ± 11181.751 | ops/s |

## Writing Performance:

| Benchmark        | Mode  | Count |           Score |        Error | Units |
|------------------|-------|------:|----------------:|-------------:|-------|
| Jsoniter         | thrpt |    20 |     2843150.452 |  ± 21478.503 | ops/s |
| Hand-Tooled      | thrpt |    20 |     2732571.374 |  ± 15129.007 | ops/s |
| Circe            | thrpt |    20 |     1958244.437 |  ± 23965.817 | ops/s |
| **ScalaJack 8**  | thrpt |    20 | **1729426.328** |   ± 4484.721 | ops/s |
| ZIO JSON         | thrpt |    20 |      794352.301 |  ± 32336.852 | ops/s |
| Argonaut         | thrpt |    20 |      690269.697 |   ± 6348.882 | ops/s |
| Play JSON        | thrpt |    20 |      438650.022 |  ± 23800.221 | ops/s |

### Interpretation

The Hand-Tooled case creates JSON manually in code. I included it to show the presumed
upper bound of achievable performance: no serializer, with whatever logic it must perform,
should be faster than hand-tooled code that is hard-wired in its output and requires zero logic.
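
To make "hand-tooled" concrete, here is a minimal sketch (my illustration, not the
benchmark's actual code) of serialization hard-wired to a single class; `Pet` is a
made-up example type:

```scala
// Hypothetical example type -- not part of the benchmark suite.
case class Pet(name: String, age: Int)

// Hard-wired writer: no reflection, no codecs, zero generic logic.
def writePetByHand(p: Pet): String =
  val sb = new StringBuilder(64)
  sb.append("{\"name\":\"").append(p.name)
    .append("\",\"age\":").append(p.age).append('}')
  sb.toString

@main def handTooledDemo(): Unit =
  println(writePetByHand(Pet("Fido", 3))) // {"name":"Fido","age":3}
```

Nothing here can be factored out or generalized, which is exactly why it serves as a
performance ceiling for comparison.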
Performance for ScalaJack has been... a journey. As I've explored the population of serializers
available for Scala, of which this benchmark is a sample of popular choices, I've observed
"generations" of designs. ScalaJack 8 has grown through each successive generation.

Focusing first on write performance, my original design was closely attuned to the internal
structure of ScalaJack. The code was clean and beautiful--and slow! I was able to get a write
score of only 30,000-50,000 ops/s, vs. Circe, my original benchmark standard, which was just
under 2 million. Not a good showing. After an extensive overhaul and re-think, performance
peaked at the 1.7 million mark, which I was happy with. That put ScalaJack ahead of everyone
else except Circe. Then something unexpected happened...

I tried Jsoniter, and was dumbstruck when it outperformed even the hand-tooled code, which I
had expected to be the natural theoretical maximum for write performance. How can a full
serializer, with whatever logic it must perform, be *faster* than hand-tooled code with zero
logic?! This breakout level of performance continued on the read tests, where Jsoniter was
roughly 4x faster than most entries and 2x the level of ZIO-Json, the previous front-runner. How?!

I observed that the older serializers processed JSON to and from an AST using conventional
parsing techniques--basically fortified editions of a simple JSON parser. ZIO-Json's
impressive read performance wasn't achieved by any one thing, but rather by a collection of
well-applied techniques, including *not* using an intermediate AST. So naturally I incorporated
some of ZIO-Json's approach (and a bit of their code), stripped, refitted, and adapted to
ScalaJack, and read performance jumped to 633K. Nice!

We also see in these results that Circe and ScalaJack remain very close in write
performance--close to each other and not far from hand-tooled code.
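
A toy illustration of the no-AST idea (my sketch, not ZIO-Json's actual technique): extract a
field by cursoring over the raw input instead of first parsing the whole document into a tree.
Real decoders handle escaping, nesting, and errors; this deliberately does not:

```scala
// Naive cursor-based field read: no intermediate AST is ever built.
// Assumes the field exists and its value contains no escaped quotes.
def readNameField(js: String): String =
  val key   = "\"name\":\""
  val start = js.indexOf(key) + key.length // jump the cursor past the key
  val end   = js.indexOf('"', start)       // scan to the closing quote
  js.substring(start, end)

@main def noAstDemo(): Unit =
  println(readNameField("""{"name":"Fido","age":3}""")) // Fido
```

Skipping the tree-building step avoids allocating and then immediately discarding an entire
intermediate structure per document.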

Circe is the gold standard among JSON serializers due to its many features, excellent
performance, and widespread adoption. The one cost Circe imposes is the same one virtually all
other serializers require: boilerplate must be provided to define encoders/decoders to aid the
serialization. Circe's boilerplate is actually not terrible; others require a fair bit of
extra code per class serialized.

Jsoniter, it turns out, achieves its breakneck speed by going deep--very deep. It uses a lot
of low-level byte arrays and bitwise operators, much as you'd expect in a C program,
to improve on the standard library functions everyone else uses. It works.
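
The flavor of that low-level style can be sketched like this (an illustration of the general
technique, not Jsoniter's code): parse an integer straight out of a UTF-8 byte array, with no
intermediate `String` and no `Integer.parseInt` call:

```scala
// Parse an unsigned decimal int directly from raw bytes [from, until).
def parseUInt(bytes: Array[Byte], from: Int, until: Int): Int =
  var i = from
  var n = 0
  while i < until do
    val b = bytes(i)
    require(b >= '0' && b <= '9', s"not a digit at index $i")
    n = n * 10 + (b - '0') // accumulate digit-by-digit
    i += 1
  n

@main def lowLevelDemo(): Unit =
  println(parseUInt("12345".getBytes("UTF-8"), 0, 5)) // 12345
```

Working on bytes end-to-end avoids char decoding and substring allocation on the hot path,
which is where much of the conventional approach's time goes.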

ScalaJack's focus is first and foremost to be frictionless--no drama for the user. ScalaJack
requires zero boilerplate--you can throw any Scala object (or even a Java object) at it with no
pre-preparation and it will serialize it. The slight difference in maximal performance is a
worthy expense--it's still blazing fast, and the enormous improvement of ScalaJack 8 over
ScalaJack 7 comes from moving everything possible into compile-time macros. For its intended
use cases, ScalaJack offers performance equal to, or exceeding, several widely-used
alternative choices.
2 changes: 2 additions & 0 deletions benchmark/build.sbt
@@ -44,6 +44,8 @@ lazy val benchmark = project
"org.typelevel" %% "fabric-io" % "1.12.6",
"org.typelevel" %% "jawn-parser" % "1.3.2",
"org.typelevel" %% "jawn-ast" % "1.3.2",
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "2.24.4",
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.24.4" % "compile-internal",
// "io.circe" %% "circe-derivation" % "0.15.0-M1",
// "io.circe" %% "circe-jackson29" % "0.14.0",
// "org.json4s" %% "json4s-jackson" % "4.0.4",
6 changes: 6 additions & 0 deletions benchmark/src/main/scala/co.blocke/Argonaut.scala
@@ -21,6 +21,12 @@ object ArgonautZ:
implicit val CodecRecord: CodecJson[Record] =
casecodec4(Record.apply, (a: Record) => Option((a.person, a.hobbies, a.friends, a.pets)))("person", "hobbies", "friends", "pets")


trait ArgonautReadingBenchmark {
@Benchmark
def readRecordArgonaut = Parse.decodeEither[Record](jsData)
}

trait ArgonautWritingBenchmark {
@Benchmark
def writeRecordArgonaut = record.asJson
32 changes: 13 additions & 19 deletions benchmark/src/main/scala/co.blocke/Benchmark.scala
@@ -43,24 +43,24 @@ trait HandTooledWritingBenchmark {
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
class ReadingBenchmark
    extends CirceZ.CirceReadingBenchmark
    with ScalaJackZ.ScalaJackReadingBenchmark
    with JsoniterZ.JsoniterReadingBenchmark
    with ZIOZ.ZIOJsonReadingBenchmark
    with PlayZ.PlayReadingBenchmark
    with ArgonautZ.ArgonautReadingBenchmark

@State(Scope.Thread)
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
class WritingBenchmark
    extends HandTooledWritingBenchmark
    with CirceZ.CirceWritingBenchmark
    with ScalaJackZ.ScalaJackWritingBenchmark
    with JsoniterZ.JsoniterWritingBenchmark
    with ZIOZ.ZIOJsonWritingBenchmark
    with PlayZ.PlayWritingBenchmark
    with ArgonautZ.ArgonautWritingBenchmark

// "Old-New" ScalaJack
// [info] Benchmark Mode Cnt Score Error Units
@@ -75,9 +75,3 @@ class WritingBenchmark
// Jawn (parse only + AST) 336384.617
// ScalaJack JsonParser3 (parse only + AST) 279456.523
// Fabric (new!) (parse only + AST) 270706.567




17 changes: 0 additions & 17 deletions benchmark/src/main/scala/co.blocke/Fabric.scala

This file was deleted.

24 changes: 24 additions & 0 deletions benchmark/src/main/scala/co.blocke/Fabric.scalax
@@ -0,0 +1,24 @@
package co.blocke

import org.openjdk.jmh.annotations._

object FabricZ:
import fabric.*
import fabric.io.*
import fabric.rw.*

implicit val rw: RW[Record] = RW.gen

trait FabricReadingBenchmark{
@Benchmark
def readRecordFabric = rw.write(JsonParser(jsData, Format.Json))
//JsonParser(jsData, Format.Json)
}

trait FabricWritingBenchmark{
@Benchmark
def writeRecordFabric = rw.read(record)
}

// No Fabric write test. Fabric has a different model... not simple serialization.
// Kinda like a query captured and compiled.
17 changes: 0 additions & 17 deletions benchmark/src/main/scala/co.blocke/Jawn.scala

This file was deleted.

17 changes: 17 additions & 0 deletions benchmark/src/main/scala/co.blocke/Jawn.scalax
@@ -0,0 +1,17 @@
package co.blocke

import org.openjdk.jmh.annotations._

object JawnZ:

import org.typelevel.jawn.ast.*

trait JawnReadingBenchmark{
@Benchmark
def readRecordJawn = JParser.parseFromString(jsData)
}

trait JawnWritingBenchmark {
@Benchmark
def writeRecordJawn = record.asJson
}
19 changes: 19 additions & 0 deletions benchmark/src/main/scala/co.blocke/Jsoniter.scala
@@ -0,0 +1,19 @@
package co.blocke

import org.openjdk.jmh.annotations._

object JsoniterZ:

import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

given codec: JsonValueCodec[Record] = JsonCodecMaker.make
trait JsoniterReadingBenchmark{
@Benchmark
def readRecordJsoniter = readFromString[Record](jsData)
}

trait JsoniterWritingBenchmark{
@Benchmark
def writeRecordJsoniter = writeToString(record)
}
15 changes: 6 additions & 9 deletions benchmark/src/main/scala/co.blocke/Run.scala
@@ -1,16 +1,13 @@
@@ -1,16 +1,13 @@
package co.blocke

import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

object RunMe extends App:

  given codec: JsonValueCodec[Record] = JsonCodecMaker.make
  println(readFromString[Record](jsData))
  println(writeToString(record))
  println("\nDone")
48 changes: 12 additions & 36 deletions src/main/scala/co.blocke.scalajack/ScalaJack.scala
@@ -3,13 +3,12 @@ package co.blocke.scalajack
import co.blocke.scala_reflection.{RTypeRef, TypedName}
import co.blocke.scala_reflection.reflect.ReflectOnType
import co.blocke.scala_reflection.reflect.rtypeRefs.ClassRef
import parser.ParseError
import scala.collection.mutable.{HashMap, Map}
import scala.quoted.*
import quoted.Quotes
import json.*

object sj: // Shorter and "lighter" than "ScalaJack" everywhere.

inline def write[T](a: T)(using cfg: JsonConfig = JsonConfig()): String = ${ writeImpl[T]('a, 'cfg) }

@@ -27,38 +27,26 @@ object ScalaJack:
def readImpl[T: Type](js: Expr[String], cfg: Expr[JsonConfig])(using q: Quotes): Expr[Either[ParseError, T]] =
import quotes.reflect.*

    try {
      val classRef = ReflectOnType[T](quotes)(TypeRepr.of[T], true)(using scala.collection.mutable.Map.empty[TypedName, Boolean])
      val decoder = JsonReader.refRead[T](classRef)
      '{
        $decoder.decodeJson($js)
      }
    } catch {
      case t: Throwable =>
        val error = Expr(t.getClass.getName())
        val msg = Expr(t.getMessage())
        '{ Left(ParseError($error + " was thrown with message " + $msg)) }
    }

