First tentative steps into reading
Greg Zoller committed Jan 26, 2024
1 parent 3a3001a commit facd89e
Showing 17 changed files with 1,006 additions and 150 deletions.
68 changes: 40 additions & 28 deletions benchmark/README.md
@@ -1,12 +1,12 @@
# Performance

JSON serialization benchmarks I found in various repos often measured (IMO) silly things like how fast
a parser could handle a small list of Int. For this benchmark I used a more substantial model + JSON.
JSON serialization benchmarks I found in various project repos often measured (IMO) silly things like how fast
a parser could handle a small list of Int. For this benchmark I used a slightly more substantial model.
It's still a small model, but it does have some nested objects and collections that make it a more
realistic test.
interesting test.

The test is run via jmh, a common and accepted benchmarking tool. The JVM is **stock**--not tuned to
within an inch of its life, again to be a more realistic use case.
The test is run via jmh. The JVM is **stock**--not tuned to within an inch of its life, to be a more realistic
use case.

Run benchmark from the ScalaJack/benchmark directory (not the main ScalaJack project directory):
```
sbt "jmh:run -i 10 -wi 10 -f 2 -t 1 co.blocke.*"
```
@@ -40,60 +40,72 @@
| Argonaut | thrpt | 20 | 690269.697 | ± 6348.882 | ops/s |
| Play JSON | thrpt | 20 | 438650.022 | ± 23800.221 | ops/s |

**Note:** Exact numbers aren't terribly important--they will vary depending on the platform
**Note:** Exact numbers aren't terribly important--they may vary widely depending on the platform
used. The important thing is the relative relationship between libraries given all tests
were performed on the same platform.

### Interpretation

Performance for ScalaJack has been a journey. ScalaJack is a mature product, and while it
was once (a long time ago) quite fast vs its competition, its performance has lagged
considerably. ScalaJack 8 changes that!
Performance for ScalaJack has been a journey. ScalaJack is a mature product--over 10 years old,
can you believe it? Long ago it was quite fast vs its competition. Over the years, though, its
performance has lagged considerably, to the point that it was one of the slower serialization
libraries. ScalaJack 8 changes that!

I was sampling and testing against a collection of popular serializers for Scala until
something quite unexpected happened. When I tested Jsoniter, its performance was through
the roof! Even faster than hand-tooled code. This was a shock. I had to learn how this
worked.
the roof! It far outpaced all competitors for raw speed. This was a shock. I had to
learn how this worked.

So full credit where credit is due: ScalaJack 8's reading/writing codec architecture
is heavily derived from Jsoniter.
is heavily informed by Jsoniter, so I'll post their license here:

[Jsoniter's License](https://github.com/plokhotnyuk/jsoniter-scala/blob/af23cf65a70d48834b8fecb792cc333b23409c6f/LICENSE)

There are a number of optimizations and design choices I elected not to bring over from
Jsoniter, and of course ScalaJack utilizes our own scala-reflection library to great effect.
Jsoniter, in many cases because ScalaJack doesn't need them for its intended feature set.
Of course ScalaJack utilizes our own macro-driven scala-reflection library to great effect,
which Jsoniter does not.

Jsoniter, it turns out, achieves its neck-breaking speed by going deep--very deep. They
use a lot of low level byte arrays and bitwise operators, much as you'd expect to see in
a C program, to improve on the standard library functions everyone else uses. It works.
Jsoniter achieves its breakneck speed by going deep--very deep into macro code
generation. They also use a lot of low-level byte arrays and bitwise operators, much as you'd
expect to see in a C program, to improve on the standard library functions everyone else uses.
It works.

ScalaJack's focus is first and foremost to be frictionless--no drama to the user. ScalaJack requires
zero boilerplate--you can throw any Scala object (or even a Java object) at it with no pre-preparation
and it will serialize it. For its intended use-cases, ScalaJack offers excellent performance, equal
to or exceeding a number of widely-used alternative choices.
and it will serialize it. For its intended use-cases, out-of-the-box ScalaJack 8 offers excellent
performance, equal to or exceeding a number of widely-used alternative choices.

If you're willing to suffer just 1 single line of boilerplate, ScalaJack 8 will reward you with
speed that's in the top one or two of its class ("fast mode" in the results).

### Technical Notes

Achieving extreme speed for ScalaJack was weeks of learning, trial, error,
and re-writes. I studied Jsoniter, Circe, and ZIO Json, and others to learn optimizations.
Achieving extreme speed for ScalaJack 8 was several weeks of learning, trial, error,
and re-writes. I studied Jsoniter, Circe, ZIO Json, and others to learn optimizations.
The tough news for anyone wanting to duplicate this kind of performance in your own code
is that there isn't one magic trick to achieve maximum performance. It's a basket
of techniques, each achieving marginal gains that add up, and you must decide when enough
is enough. Here's a partial list of learnings incorporated into ScalaJack:
of techniques, each achieving small marginal gains that add up, and you must decide when
enough is enough for you. Here's a partial list of learnings incorporated into ScalaJack 8:

* Being careful when using .asInstanceOf[]... in fact try to avoid it wherever possible
as it messes up CPU cache harming performance. This means a lot of very careful type
as it messes up CPU cache, harming performance. This means a lot of very careful type
management, and it's why the RTypeRefs from scala-reflection are now all typed
in the latest version

* Lots of specific typing. Don't make the computer think--provide detailed types wherever
* Lots of specific typing. Don't make the compiler think--provide detailed types wherever
you can

* For macro-based software like this--find every opportunity to do hard work at
compile-time

* Be mindful of what code your macros generate! You can paint by the numbers with quotes and
splices, like the documentation and blogs suggest, and you'll get something working. When you
examine the code this produces, you may be disappointed. If it looks kludgy it will be slow--rework
your macros until the code is smooth. For fastest performance you'll actually have to generate
custom functions as shown in ScalaJack's code (look at JsonCodecMaker.scala)
splices, like the documentation and blogs suggest, and you will get something working.
When you examine the code a "stock" macro use produces, you may be disappointed
if ultimate runtime speed is your goal. The generated code might look a little kludgy, and
it will not necessarily be speed optimized. Rework your macros carefully until the generated code
is as smooth as you might write by hand. Remember: your macro code doesn't have to win awards for
style or beauty--your generated code does! For the fastest performance you'll actually have
to generate custom functions as shown in ScalaJack's code (look at JsonCodecMaker.scala). This
isn't for the faint of heart. If it all looks like Greek, step back and layer yourself into
macros slowly, a piece at a time.
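
To make the "do hard work at compile-time" point concrete, here is a minimal single-file sketch. It is not ScalaJack's code: ScalaJack uses quoted macros and scala-reflection, while this stand-in uses only Scala 3's `inline` and `scala.compiletime` facilities with `Mirror`. The names `labels`, `fieldNames`, and `Person` are made up for illustration. The field labels of a case class are resolved while compiling, so no reflection is needed at runtime to discover them.

```scala
import scala.deriving.Mirror
import scala.compiletime.{constValue, erasedValue}

// Walk the tuple of label types at compile time and build a List[String].
// The inline match is resolved by the compiler, one case per field.
inline def labels[T <: Tuple]: List[String] =
  inline erasedValue[T] match
    case _: EmptyTuple => Nil
    case _: (h *: t)   => constValue[h].toString :: labels[t]

// Hypothetical helper: field names of any case class, from its Mirror.
inline def fieldNames[A](using m: Mirror.ProductOf[A]): List[String] =
  labels[m.MirroredElemLabels]

// Hypothetical model class, used only for this example.
final case class Person(name: String, age: Int)

@main def compileTimeDemo(): Unit =
  println(fieldNames[Person]) // prints List(name, age)
```

A codec generator can build on the same idea: anything known from the type alone (field names, field order, which writer to call per field) can be settled during compilation rather than on every call.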
10 changes: 6 additions & 4 deletions benchmark/build.sbt
@@ -36,15 +36,17 @@ lazy val benchmark = project
libraryDependencies ++= Seq(
"org.playframework" %% "play-json" % "3.0.1",
"io.argonaut" %% "argonaut" % "6.3.9",
"co.blocke" %% "scalajack" % "fc0b25_unknown",
"co.blocke" %% "scala-reflection" % "sj_fixes_edbef8",
"co.blocke" %% "scalajack" % "3a3001_unknown",
"co.blocke" %% "scala-reflection" % "sj_fixes_f43af7",
"dev.zio" %% "zio-json" % "0.6.1",
"org.typelevel" %% "fabric-core" % "1.12.6",
"org.typelevel" %% "fabric-io" % "1.12.6",
"org.typelevel" %% "jawn-parser" % "1.3.2",
"org.typelevel" %% "jawn-ast" % "1.3.2",
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "2.24.4",
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.24.4" % "compile-internal",
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "2.24.5-SNAPSHOT",
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.24.5-SNAPSHOT" % "compile-internal",
// "com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "2.24.4",
// "com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.24.4" % "compile-internal",
// "io.circe" %% "circe-derivation" % "0.15.0-M1",
// "io.circe" %% "circe-jackson29" % "0.14.0",
// "org.json4s" %% "json4s-jackson" % "4.0.4",
12 changes: 6 additions & 6 deletions benchmark/src/main/scala/co.blocke/Benchmark.scala
@@ -43,12 +43,12 @@ trait HandTooledWritingBenchmark {
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
class ReadingBenchmark
// extends CirceZ.CirceReadingBenchmark
extends ScalaJackZ.ScalaJackReadingBenchmark
// with JsoniterZ.JsoniterReadingBenchmark
// with ZIOZ.ZIOJsonReadingBenchmark
// with PlayZ.PlayReadingBenchmark
// with ArgonautZ.ArgonautReadingBenchmark
extends CirceZ.CirceReadingBenchmark
with ScalaJackZ.ScalaJackReadingBenchmark
with JsoniterZ.JsoniterReadingBenchmark
with ZIOZ.ZIOJsonReadingBenchmark
with PlayZ.PlayReadingBenchmark
with ArgonautZ.ArgonautReadingBenchmark

@State(Scope.Thread)
@BenchmarkMode(Array(Mode.Throughput))
24 changes: 21 additions & 3 deletions benchmark/src/main/scala/co.blocke/Run.scala
@@ -15,12 +15,30 @@ object RunMe extends App:
// deriveEncoder[Record]
// }

co.blocke.scalajack.internal.CodePrinter.code {
given codec: JsonValueCodec[Record] = JsonCodecMaker.make
}
// import co.blocke.scalajack.*
// import ScalaJack.*
// implicit val blah: ScalaJack[Record] = sj[Record]
// println(ScalaJack[Record].fromJson(jsData))


// co.blocke.scalajack.internal.CodePrinter.code {
// given codec: JsonValueCodec[Record] = JsonCodecMaker.make
// }
// given codec: JsonValueCodec[Record] = JsonCodecMaker.make
// println(readFromString[Record](jsData))

// println(writeToString(record))

import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

given codec: JsonValueCodec[Record] = JsonCodecMaker.make
println(readFromString[Record](jsData))
// }

// trait JsoniterWritingBenchmark{
// @Benchmark
// def writeRecordJsoniter = writeToString(record)
// }

println("\nDone")
@@ -1,4 +1,4 @@
package co.blocke.scalajack.json;
package co.blocke.scalajack.json.exp;

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
23 changes: 17 additions & 6 deletions src/main/scala/co.blocke.scalajack/ScalaJack.scala
@@ -7,6 +7,15 @@ import scala.quoted.*
import quoted.Quotes
import json.*

case class ScalaJack[T](jsonCodec: JsonCodec[T]): // extends JsonCodec[T] //with YamlCodec with MsgPackCodec
  def fromJson(js: String): T = // Either[JsonParseError, T] =
    jsonCodec.decodeValue(reading.JsonSource(js))

  val out = writing.JsonOutput() // let's clear & re-use JsonOutput--avoid re-allocating all the internal buffer space
  def toJson(a: T): String =
    jsonCodec.encodeValue(a, out.clear())
    out.result
/*
case class ScalaJack[T](jsonDecoder: reading.JsonDecoder[T], jsonEncoder: JsonCodec[T]): // extends JsonCodec[T] //with YamlCodec with MsgPackCodec
def fromJson(js: String): Either[JsonParseError, T] =
jsonDecoder.decodeJson(js)
@@ -15,6 +24,7 @@ case class ScalaJack[T](jsonDecoder: reading.JsonDecoder[T], jsonEncoder: JsonCo
def toJson(a: T): String =
jsonEncoder.encodeValue(a, out.clear())
out.result
*/
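
The `out.clear()` call above reuses one output buffer across calls instead of allocating a new one each time. The sketch below illustrates that pattern in isolation. `ReusableOutput` and `IntListWriter` are hypothetical stand-ins written for this example; they are not ScalaJack's `JsonOutput`, whose internals are not shown in this diff.

```scala
// Hypothetical reusable output buffer (an assumption, not ScalaJack's JsonOutput).
final class ReusableOutput(initialCapacity: Int = 64):
  private val sb = new java.lang.StringBuilder(initialCapacity)

  // Reset the length to zero but keep the already-allocated capacity.
  def clear(): ReusableOutput =
    sb.setLength(0)
    this

  def append(s: String): ReusableOutput =
    sb.append(s)
    this

  def result: String = sb.toString

// Hypothetical writer that owns a single buffer and clears it on every call.
final class IntListWriter:
  private val out = ReusableOutput()

  def write(xs: List[Int]): String =
    out.clear()
    out.append("[").append(xs.mkString(",")).append("]")
    out.result

@main def bufferReuseDemo(): Unit =
  val w = IntListWriter()
  println(w.write(List(1, 2, 3))) // prints [1,2,3]
  println(w.write(Nil))           // prints []
```

One trade-off of this sketch is that a writer instance holds mutable state, so sharing a single instance across threads would need extra care.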

// ---------------------------------------

@@ -27,20 +37,21 @@ object ScalaJack:
def sjImpl[T: Type](using Quotes): Expr[ScalaJack[T]] =
import quotes.reflect.*
val classRef = ReflectOnType[T](quotes)(TypeRepr.of[T], true)(using scala.collection.mutable.Map.empty[TypedName, Boolean])
val jsonDecoder = reading.JsonReader.refRead(classRef)
val jsonEncoder = writing.JsonCodecMaker.generateCodecFor(classRef, JsonConfig)
// val jsonDecoder = reading.JsonReader.refRead2(classRef)
// println(s"Decoder: ${jsonDecoder.show}")
val jsonCodec = writing.JsonCodecMaker.generateCodecFor(classRef, JsonConfig)

'{ ScalaJack($jsonDecoder, $jsonEncoder) }
'{ ScalaJack($jsonCodec) }

// ----- Use given JsonConfig
inline def sj[T](inline cfg: JsonConfig): ScalaJack[T] = ${ sjImplWithConfig[T]('cfg) }
def sjImplWithConfig[T: Type](cfgE: Expr[JsonConfig])(using Quotes): Expr[ScalaJack[T]] =
import quotes.reflect.*
val cfg = summon[FromExpr[JsonConfig]].unapply(cfgE)
val classRef = ReflectOnType[T](quotes)(TypeRepr.of[T], true)(using scala.collection.mutable.Map.empty[TypedName, Boolean])
val jsonDecoder = reading.JsonReader.refRead(classRef)
val jsonEncoder = writing.JsonCodecMaker.generateCodecFor(classRef, cfg.getOrElse(JsonConfig))
'{ ScalaJack($jsonDecoder, $jsonEncoder) }
// val jsonDecoder = reading.JsonReader.refRead2(classRef)
val jsonCodec = writing.JsonCodecMaker.generateCodecFor(classRef, cfg.getOrElse(JsonConfig))
'{ ScalaJack($jsonCodec) }

// refRead[T](classRef)

9 changes: 4 additions & 5 deletions src/main/scala/co.blocke.scalajack/json/JsonCodec.scala
@@ -2,14 +2,13 @@ package co.blocke.scalajack
package json

import writing.*
import reading.*

trait JsonCodec[A] {

// TBD... when we're ready to tackle reading!
// def decodeValue(in: JsonReader, default: A): A = ${
// if (cfg.encodingOnly) '{ ??? }
// else genReadVal(rootTpe :: Nil, 'default, cfg.isStringified, false, 'in)
// }
// def decodeValue(in: JsonReader, default: A): A =
// ${ genReadVal(rootTpe :: Nil, 'default, cfg.isStringified, false, 'in) }

def encodeValue(in: A, out: JsonOutput): Unit
def decodeValue(in: JsonSource): A
}
18 changes: 10 additions & 8 deletions src/main/scala/co.blocke.scalajack/json/JsonError.scala
@@ -1,21 +1,23 @@
package co.blocke.scalajack
package json

class JsonIllegalKeyType(msg: String) extends Throwable(msg)
class JsonNullKeyValue(msg: String) extends Throwable(msg)
class JsonUnsupportedType(msg: String) extends Throwable(msg)
class JsonConfigError(msg: String) extends Throwable(msg)
class JsonEitherLeftError(msg: String) extends Throwable(msg)
import scala.util.control.NoStackTrace

class ParseError(val msg: String) extends Throwable(msg):
class JsonIllegalKeyType(msg: String) extends Throwable(msg) with NoStackTrace
class JsonNullKeyValue(msg: String) extends Throwable(msg) with NoStackTrace
class JsonUnsupportedType(msg: String) extends Throwable(msg) with NoStackTrace
class JsonConfigError(msg: String) extends Throwable(msg) with NoStackTrace
class JsonEitherLeftError(msg: String) extends Throwable(msg) with NoStackTrace

class ParseError(val msg: String) extends Throwable(msg) with NoStackTrace:
val show: String = ""

// Thrown at compile-time only!
case class JsonTypeError(override val msg: String) extends ParseError(msg):
case class JsonTypeError(override val msg: String) extends ParseError(msg) with NoStackTrace:
override val show: String = ""

// Thrown at runtime only!
case class JsonParseError(override val msg: String, context: reading.JsonSource) extends ParseError(msg + " at position " + context.pos):
case class JsonParseError(override val msg: String, context: reading.JsonSource) extends ParseError(msg + " at position " + context.pos) with NoStackTrace:
override val show: String =
val js = context.js.toString
val (clip, dashes) = context.pos match {
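
The JsonError.scala change mixes `scala.util.control.NoStackTrace` into every error class. A small sketch of what that trait does, using two hypothetical classes (`TracedError` and `UntracedError`, not part of ScalaJack): with the mixin, the JVM skips capturing the stack when the throwable is constructed, which is typically the expensive part of creating an exception. This assumes the default JVM settings, since stack-trace suppression can be switched off with a system property.

```scala
import scala.util.control.NoStackTrace

// Hypothetical error types, for illustration only.
class TracedError(msg: String) extends Throwable(msg)
class UntracedError(msg: String) extends Throwable(msg) with NoStackTrace

@main def noStackTraceDemo(): Unit =
  val traced   = TracedError("boom")
  val untraced = UntracedError("boom")
  println(traced.getStackTrace.nonEmpty)  // stack frames were captured
  println(untraced.getStackTrace.isEmpty) // no frames captured
  println(untraced.getMessage)            // the message is still available
```

The cost is that such errors carry no location information, so the message itself has to say where things went wrong, as `JsonParseError` does by appending the position.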