#193: Drop support of Spark 2.4 #196

Merged
merged 7 commits into from
May 22, 2024
Changes from 2 commits
98 changes: 39 additions & 59 deletions build.sbt
@@ -14,41 +14,39 @@
* limitations under the License.
*/

import Dependencies._
import JacocoSetup._
import sbt.Keys.name
import sbt.*
import Dependencies.*
import Dependencies.Versions.spark3
import VersionAxes.*

ThisBuild / organization := "za.co.absa.atum-service"
sonatypeProfileName := "za.co.absa"

ThisBuild / scalaVersion := Versions.scala213 // default version
ThisBuild / scalaVersion := Setup.scala213 // default version

ThisBuild / versionScheme := Some("early-semver")

Global / onChangedBuildSource := ReloadOnSourceChanges

publish / skip := true

lazy val printSparkScalaVersion = taskKey[Unit]("Print Spark and Scala versions for atum-service is being built for.")
lazy val printScalaVersion = taskKey[Unit]("Print Scala versions for atum-service is being built for.")
initialize := {
Collaborator

Is the assignment necessary?

Contributor Author

To be honest, I don't know. This code block is a remnant of my attempts to keep Spark 2.4 support. I found it interesting, so I kept it for now, with the intention of asking what you think about having that information (the active Java version) there.
I just forgot to comment accordingly.
So what do you think, should we keep the info (with the commented-out code removed)?

Collaborator

@salamonpavel salamonpavel May 16, 2024

I think it's useful to have a log statement with the Java version in use. See an alternative implementation below.

val checkJavaVersion = taskKey[Unit]("Check Java version")

checkJavaVersion := {
  val javaVersionUsed = VersionNumber(sys.props("java.specification.version"))
  streams.value.log.info(s"Running on Java version $javaVersionUsed")
}

// Run the task when the project is loaded
Global / onLoad := (Global / onLoad).value.andThen(state => "checkJavaVersion" :: state)
(base) AB032LP@LFGQVV4V6D atum-service % sbt compile
[info] welcome to sbt 1.9.7 (Amazon.com Inc. Java 11.0.22)
[info] loading global plugins from /Users/AB032LP/.sbt/1.0/plugins
[info] loading settings for project atum-service-build-build-build from metals.sbt ...
[info] loading project definition from /Users/AB032LP/IdeaProjects/atum-service/project/project/project
[info] loading settings for project atum-service-build-build from metals.sbt ...
[info] loading project definition from /Users/AB032LP/IdeaProjects/atum-service/project/project
[success] Generated .bloop/atum-service-build-build.json
[success] Total time: 1 s, completed 16 May 2024, 10:05:28
[info] loading settings for project atum-service-build from metals.sbt,plugins.sbt ...
[info] loading project definition from /Users/AB032LP/IdeaProjects/atum-service/project
[success] Generated .bloop/atum-service-build.json
[success] Total time: 1 s, completed 16 May 2024, 10:05:29
[info] loading settings for project atum-service from build.sbt,publish.sbt ...
[info] set current project to atum-service (in build file:/Users/AB032LP/IdeaProjects/atum-service/)
[info] Running on Java version 11
[success] Total time: 0 s, completed 16 May 2024, 10:05:38
[info] Executing in batch mode. For better performance use sbt's shell
[info] Building atum-database with Scala 2.13.11
[info] Building atum-model with Scala 2.12.18
[info] Building atum-server with Scala 2.13.11
[info] Building atum-model with Scala 2.13.11
[info] Building atum-agent with Spark 3.3.2, Scala 2.12.18
[info] Building atum-agent with Spark 3.3.2, Scala 2.13.11
[success] Total time: 4 s, completed 16 May 2024, 10:05:42
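
The `Java version 11` line in the log above is the value of the `java.specification.version` system property. On Java 8 that property reads `1.8`, while later releases report a bare major number such as `11` or `17`, so a numeric minimum-version check has to handle both forms. The following is a minimal plain-Scala sketch of such a check, not code from this PR; the names `JavaVersionCheck`, `majorJavaVersion` and `meetsMinimum` are hypothetical and it deliberately avoids sbt's `VersionNumber` so that it runs outside a build.

```scala
object JavaVersionCheck {

  /** Extracts the major Java version from a specification version string.
    * "1.8" -> 8 (legacy scheme used up to Java 8), "11" -> 11, "17" -> 17.
    */
  def majorJavaVersion(specVersion: String): Int = {
    val parts = specVersion.split("\\.").toList.map(_.toInt)
    parts match {
      case 1 :: minor :: _ => minor // legacy "1.x" form
      case major :: _      => major // modern form: the first component is the major version
      case Nil             => throw new IllegalArgumentException(s"Cannot parse Java version '$specVersion'")
    }
  }

  /** True when the given specification version is at least `required`. */
  def meetsMinimum(specVersion: String, required: Int): Boolean =
    majorJavaVersion(specVersion) >= required

  def main(args: Array[String]): Unit = {
    val spec = sys.props("java.specification.version")
    println(s"Running on Java version ${majorJavaVersion(spec)}")
    // Hypothetical requirement used only for illustration
    println(s"Meets Java 11 minimum: ${meetsMinimum(spec, 11)}")
  }
}
```

Whether the build should only log the version or fail below some minimum is the open question in this thread; the sketch only shows the parsing that either option would need.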

val _ = initialize.value // Ensure previous initializations are run

val requiredJavaVersion = VersionNumber("17")
Collaborator

It is currently not applied, but in general, can you explain the intended requirement for Java 17?

Contributor Author

I hope the explanation above makes it clearer. This would be removed for sure (or maybe used in the information 🤔).

val current = VersionNumber(sys.props("java.specification.version"))
Collaborator

Consider renaming to something like javaVersion or javaVersionInUse.

Contributor Author

If the code is kept, I will rename it.

// Assert that the JVM meets the minimum required version, for example, Java 17
//assert(specVersion.toDouble >= 17, "Java 17 or above is required to run this project.")
println(s"Running on Java version $current")
}

lazy val commonSettings = Seq(
scalacOptions ++= Seq("-unchecked", "-deprecation", "-feature", "-Xfatal-warnings"),
scalacOptions ++= Setup.commonScalacOptions,
Test / parallelExecution := false,
jacocoExcludes := jacocoProjectExcludes()
jacocoExcludes := JacocoSetup.jacocoProjectExcludes()
)

val serverMergeStrategy = assembly / assemblyMergeStrategy := {
case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
case PathList("META-INF", "maven", "org.webjars", "swagger-ui", "pom.properties") => MergeStrategy.singleOrError
case PathList("META-INF", "resources", "webjars", "swagger-ui", _*) => MergeStrategy.singleOrError
case PathList("META-INF", _*) => MergeStrategy.discard
case PathList("META-INF", "versions", "9", xs@_*) => MergeStrategy.discard
case PathList("module-info.class") => MergeStrategy.discard
case "application.conf" => MergeStrategy.concat
case "reference.conf" => MergeStrategy.concat
case _ => MergeStrategy.first
}

enablePlugins(FlywayPlugin)
flywayUrl := FlywayConfiguration.flywayUrl
@@ -58,86 +56,68 @@ flywayLocations := FlywayConfiguration.flywayLocations
flywaySqlMigrationSuffixes := FlywayConfiguration.flywaySqlMigrationSuffixes
libraryDependencies ++= flywayDependencies

/**
* Module `server` is the service application that collects and stores measured data and, upon request, retrieves them
*/
lazy val server = (projectMatrix in file("server"))
.settings(
commonSettings ++ Seq(
name := "atum-server",
libraryDependencies ++= Dependencies.serverDependencies ++ testDependencies,
javacOptions ++= Seq("-source", "11", "-target", "11", "-Xlint"),
scalacOptions ++= Seq("-release", "11", "-Ymacro-annotations"),
javacOptions ++= Setup.serviceJavacOptions,
Compile / packageBin / publishArtifact := false,
printScalaVersion := {
val log = streams.value.log
log.info(s"Building ${name.value} with Scala ${scalaVersion.value}")
},
(Compile / compile) := ((Compile / compile) dependsOn printScalaVersion).value,
packageBin := (Compile / assembly).value,
artifactPath / (Compile / packageBin) := baseDirectory.value / s"target/${name.value}-${version.value}.jar",
testFrameworks += new TestFramework("zio.test.sbt.ZTestFramework"),
jacocoReportSettings := jacocoSettings(scalaVersion.value, "atum-server"),
serverMergeStrategy
Setup.serverMergeStrategy
): _*
)
.enablePlugins(AssemblyPlugin)
.enablePlugins(AutomateHeaderPlugin)
.jvmPlatform(scalaVersions = Seq(Versions.serviceScalaVersion))
.singleRow(Setup.serviceScalaVersion, Dependencies.serverDependencies)
.dependsOn(model)

/**
* Module `agent` is the library to be plugged into the Spark application to measure the data and send it to the server
*/
lazy val agent = (projectMatrix in file("agent"))
.settings(
commonSettings ++ Seq(
name := "atum-agent",
javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint"),
libraryDependencies ++= jsonSerdeDependencies ++ testDependencies ++ Dependencies.agentDependencies(
if (scalaVersion.value == Versions.scala211) Versions.spark2 else Versions.spark3,
scalaVersion.value
),
printSparkScalaVersion := {
val log = streams.value.log
val sparkVer = sparkVersionForScala(scalaVersion.value)
log.info(s"Building ${name.value} with Spark $sparkVer, Scala ${scalaVersion.value}")
},
(Compile / compile) := ((Compile / compile) dependsOn printSparkScalaVersion).value,
jacocoReportSettings := jacocoSettings(scalaVersion.value, "atum-agent")
javacOptions ++= Setup.clientJavacOptions
): _*
)
.jvmPlatform(scalaVersions = Versions.clientSupportedScalaVersions)
.sparkRow(SparkVersionAxis(spark3), Setup.clientSupportedScalaVersions, Dependencies.agentDependencies)
.dependsOn(model)

/**
* Module `model` is the data model for data exchange with server
*/
lazy val model = (projectMatrix in file("model"))
.settings(
commonSettings ++ Seq(
name := "atum-model",
javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint"),
libraryDependencies ++= jsonSerdeDependencies ++ testDependencies ++ Dependencies.modelDependencies(scalaVersion.value),
printScalaVersion := {
val log = streams.value.log
log.info(s"Building ${name.value} with Scala ${scalaVersion.value}")
},
(Compile / compile) := ((Compile / compile) dependsOn printScalaVersion).value,
jacocoReportSettings := jacocoSettings(scalaVersion.value, "atum-agent: model")
javacOptions ++= Setup.clientJavacOptions,
): _*
)
.jvmPlatform(scalaVersions = Versions.clientSupportedScalaVersions)
.scalasRow(Setup.clientSupportedScalaVersions, Dependencies.modelDependencies)

/**
* Module `database` is the source of database structures of the service
*/
lazy val database = (projectMatrix in file("database"))
.settings(
commonSettings ++ Seq(
name := "atum-database",
printScalaVersion := {
val log = streams.value.log
log.info(s"Building ${name.value} with Scala ${scalaVersion.value}")
},
libraryDependencies ++= Dependencies.databaseDependencies,
(Compile / compile) := ((Compile / compile) dependsOn printScalaVersion).value,
javacOptions ++= Setup.serviceJavacOptions,
test := {}
): _*
)
.jvmPlatform(scalaVersions = Seq(Versions.serviceScalaVersion))
.singleRow(Setup.serviceScalaVersion, Dependencies.databaseDependencies)

//----------------------------------------------------------------------------------------------------------------------
lazy val dbTest = taskKey[Unit]("Launch DB tests")

dbTest := {
println("Running DB tests")
(database.jvm(Versions.serviceScalaVersion) / Test / test).value
(database.jvm(Setup.serviceScalaVersion) / Test / test).value
}
73 changes: 31 additions & 42 deletions project/Dependencies.scala
@@ -13,20 +13,14 @@
* limitations under the License.
*/

import sbt._
import sbt.*

object Dependencies {

object Versions {
val spark2 = "2.4.8"
val spark3 = "3.3.2"

val scala211 = "2.11.12"
val scala212 = "2.12.18"
val scala213 = "2.13.11"

val serviceScalaVersion: String = scala213
val clientSupportedScalaVersions: Seq[String] = Seq(scala211, scala212, scala213)

val scalatest = "3.2.15"
val scalaMockito = "1.17.12"
@@ -54,6 +48,14 @@ object Dependencies {

val json4s_spark2 = "3.5.3"
val json4s_spark3 = "3.7.0-M11"
def json4s(scalaVersion: String): String = {
Versions.truncateVersion(scalaVersion, 2) match {
case "2.11" => json4s_spark2
case "2.12" => json4s_spark3
case "2.13" => json4s_spark3
case _ => throw new IllegalArgumentException("Only Scala 2.11, 2.12, and 2.13 are currently supported.")
}
}

val logback = "1.2.3"

@@ -72,41 +74,21 @@ object Dependencies {
val awssdk = "2.23.15"

val scalaNameof = "4.0.0"
}


private def truncateVersion(version: String, parts: Int): String = {
version.split("\\.").take(parts).mkString(".")
}

def getVersionUpToMinor(version: String): String = {
truncateVersion(version, 2)
}

def getVersionUpToMajor(version: String): String = {
truncateVersion(version, 1)
}
def truncateVersion(version: String, parts: Int): String = {
version.split("\\.").take(parts).mkString(".")
}

// this is just for the compile-depended printing task
def sparkVersionForScala(scalaVersion: String): String = {
truncateVersion(scalaVersion, 2) match {
case "2.11" => Versions.spark2
case "2.12" => Versions.spark3
case "2.13" => Versions.spark3
case _ => throw new IllegalArgumentException("Only Scala 2.11, 2.12, and 2.13 are currently supported.")
def getVersionUpToMinor(version: String): String = {
truncateVersion(version, 2)
}
}

def json4sVersionForScala(scalaVersion: String): String = {
truncateVersion(scalaVersion, 2) match {
case "2.11" => Versions.json4s_spark2
case "2.12" => Versions.json4s_spark3
case "2.13" => Versions.json4s_spark3
case _ => throw new IllegalArgumentException("Only Scala 2.11, 2.12, and 2.13 are currently supported.")
def getVersionUpToMajor(version: String): String = {
truncateVersion(version, 1)
}
}

def testDependencies: Seq[ModuleID] = {
private def testDependencies: Seq[ModuleID] = {
lazy val scalatest = "org.scalatest" %% "scalatest" % Versions.scalatest % Test
lazy val mockito = "org.mockito" %% "mockito-scala" % Versions.scalaMockito % Test

@@ -116,8 +98,8 @@ object Dependencies {
)
}

def jsonSerdeDependencies: Seq[ModuleID] = {
val json4sVersion = json4sVersionForScala(Versions.scala212)
private def jsonSerdeDependencies(scalaVersion: String): Seq[ModuleID] = {
val json4sVersion = Versions.json4s(scalaVersion)

lazy val jacksonModuleScala = "com.fasterxml.jackson.module" %% "jackson-module-scala" % Versions.jacksonModuleScala

@@ -209,12 +191,13 @@ object Dependencies {
zioTestSbt,
zioTestJunit,
sbtJunitInterface
)
) ++
testDependencies
}

def agentDependencies(sparkVersion: String, scalaVersion: String): Seq[ModuleID] = {
val sparkMinorVersion = getVersionUpToMinor(sparkVersion)
val scalaMinorVersion = getVersionUpToMinor(scalaVersion)
val sparkMinorVersion = Versions.getVersionUpToMinor(sparkVersion)
val scalaMinorVersion = Versions.getVersionUpToMinor(scalaVersion)

lazy val sparkCore = "org.apache.spark" %% "spark-core" % sparkVersion % Provided
lazy val sparkSql = "org.apache.spark" %% "spark-sql" % sparkVersion % Provided
@@ -238,14 +221,20 @@ object Dependencies {
sttp,
logback,
nameOf
)
) ++
testDependencies
}

def modelDependencies(scalaVersion: String): Seq[ModuleID] = {
lazy val specs2core = "org.specs2" %% "specs2-core" % Versions.specs2 % Test
lazy val typeSafeConfig = "com.typesafe" % "config" % Versions.typesafeConfig

Seq(specs2core, typeSafeConfig)
Seq(
specs2core,
typeSafeConfig
) ++
testDependencies ++
jsonSerdeDependencies(scalaVersion)
}

def databaseDependencies: Seq[ModuleID] = {
9 changes: 5 additions & 4 deletions project/JacocoSetup.scala
@@ -24,16 +24,17 @@ object JacocoSetup {
private val jacocoReportCommonSettings: JacocoReportSettings = JacocoReportSettings(
formats = Seq(JacocoReportFormats.HTML, JacocoReportFormats.XML)
)

private def now: String = {
val utcDateTime = ZonedDateTime.now.withZoneSameInstant(ZoneId.of("UTC"))
s"as of ${DateTimeFormatter.ofPattern("yyyy-MM-dd hh:mm Z z").format(utcDateTime)}"
}

def jacocoSettings(sparkVersion: String, scalaVersion: String, moduleName: String): JacocoReportSettings = {
val utcDateTime = ZonedDateTime.now.withZoneSameInstant(ZoneId.of("UTC"))
val now = s"as of ${DateTimeFormatter.ofPattern("yyyy-MM-dd hh:mm Z z").format(utcDateTime)}"
jacocoReportCommonSettings.withTitle(s"Jacoco Report on `$moduleName` for spark:$sparkVersion - scala:$scalaVersion [$now]")
}

def jacocoSettings(scalaVersion: String, moduleName: String): JacocoReportSettings = {
val utcDateTime = ZonedDateTime.now.withZoneSameInstant(ZoneId.of("UTC"))
val now = s"as of ${DateTimeFormatter.ofPattern("yyyy-MM-dd hh:mm Z z").format(utcDateTime)}"
jacocoReportCommonSettings.withTitle(s"Jacoco Report on `$moduleName` for scala:$scalaVersion [$now]")
}

62 changes: 62 additions & 0 deletions project/Setup.scala
@@ -0,0 +1,62 @@
import sbt.Keys.javacOptions
import sbt.TaskKey
import sbtassembly.AssemblyKeys.assemblyMergeStrategy
import sbtassembly.AssemblyPlugin.autoImport
import sbtassembly.AssemblyPlugin.autoImport.{MergeStrategy, assembly}
import sbtassembly.PathList

/*
* Copyright 2024 ABSA Group Limited
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/


object Setup {

//supported Scala versions
// val scala212 = Version.asSemVer("2.12.18") TODO
// val scala213 = Version.asSemVer("2.13.11")
val scala212 = "2.12.18"
val scala213 = "2.13.11"

val serviceScalaVersion: String = scala213
Collaborator

Maybe we could align the naming, as we sometimes use server and sometimes service. I would suggest using server everywhere, given that it's the name of the module.

Contributor Author

I think the original idea (discussed back then with @lsulak) was that:
server = REST server
service = all parts of the running service (REST server + database)
If it's not consistent somewhere, though, that is definitely wrong.

Collaborator

Alright, I comprehend the logic behind it. However, without prior knowledge, it might not be immediately clear. Hence, I would suggest being more descriptive, for instance, using a name like "serverAndDatabaseScalaVersion".

val clientSupportedScalaVersions: Seq[String] = Seq(scala212, scala213)

val commonScalacOptions: Seq[String] = Seq("-unchecked", "-deprecation", "-feature", "-Xfatal-warnings")

val serviceJavacOptions: Seq[String] = Seq("-source", "11", "-target", "11", "-Xlint")
val serviceScalacOptions: Seq[String] = Seq("-release", "11", "-Ymacro-annotations")

val clientJavacOptions: Seq[String] = Seq("-source", "1.8", "-target", "1.8", "-Xlint")
def clientScalacOptions(scalaVersion: String): Seq[String] = {
if (scalaVersion == scala213) {
Seq("-release", "8", "-Ymacro-annotations")
} else {
Seq("-target", "8", "-release", "8")
}
}
// val clientScalacOptions: Seq[String] = Seq("-Ymacro-annotations")

val serverMergeStrategy = assembly / assemblyMergeStrategy := {
case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
Contributor Author

This xs @ _* is marked with a warning by IntelliJ. (I just blindly moved it from build.sbt.) What does it actually mean, @salamonpavel, please?

Collaborator

@salamonpavel salamonpavel May 16, 2024

xs is an alias bound to the remaining elements; it's not needed here because we never reference it, so it can be removed. The _* matches the rest of the list's elements, however many there are.
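
To illustrate the difference between a bound and an unbound sequence wildcard, here is a small plain-Scala 2 sketch. It uses a `List[String]` as a stand-in for sbt-assembly's `PathList` extractor, which is an assumption made only so the example runs without sbt; `SeqWildcardDemo` and `classify` are hypothetical names.

```scala
object SeqWildcardDemo {

  /** Classifies a path given as its segments, e.g. List("META-INF", "services", "x"). */
  def classify(path: List[String]): String = path match {
    // `rest @ _*` binds the remaining segments to `rest` so they can be used on the right-hand side
    case List("META-INF", "services", rest @ _*) =>
      s"service file: ${rest.mkString("/")}"
    // a bare `_*` matches any number of remaining segments without binding them
    case List("META-INF", _*) =>
      "other META-INF entry"
    case _ =>
      "outside META-INF"
  }

  def main(args: Array[String]): Unit = {
    println(classify(List("META-INF", "services", "java.sql.Driver")))
    println(classify(List("META-INF", "MANIFEST.MF")))
    println(classify(List("application.conf")))
  }
}
```

When the bound name is never used, as in the merge strategy here, the bare `_*` form is equivalent and avoids the unused-binding warning.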

case PathList("META-INF", "maven", "org.webjars", "swagger-ui", "pom.properties") => MergeStrategy.singleOrError
case PathList("META-INF", "resources", "webjars", "swagger-ui", _*) => MergeStrategy.singleOrError
case PathList("META-INF", _*) => MergeStrategy.discard
case PathList("META-INF", "versions", "9", xs@_*) => MergeStrategy.discard
case PathList("module-info.class") => MergeStrategy.discard
case "application.conf" => MergeStrategy.concat
case "reference.conf" => MergeStrategy.concat
case _ => MergeStrategy.first
}
}