Consider supporting uploads to Kafka schema registry #27

Open
julianpeeters opened this issue Sep 1, 2016 · 7 comments

Comments

@julianpeeters
Owner

draft courtesy of @xelax: https://gitter.im/julianpeeters/avrohugger?at=57c732e129ee4a6705812df9

@mariussoutier
Contributor

So this would be an explicit command? Should we also store the schema version or would only the registry do that?

@julianpeeters
Owner Author

Yeah, I suppose that it would mirror the downloadSchemas task proposed in #26.

@julianpeeters
Owner Author

Not sure about storing locally tho. Nor am I currently well-versed in versioning through the registry. I like "simple", but maybe it will be a technical necessity? I'm also not sure if the task should be manual or tacked on to compile to be automatic, and that decision might also determine if local storage is necessary.

@xelax

xelax commented Sep 1, 2016

The way I am using it is to upload a schema after generating it from my .idl file:
java -jar ~/Downloads/avro-tools-1.8.0.jar idl2schemata src/main/avro-schema/my.idl somedir
cp somedir/MainClass.avsc src/main/avro/.
and then upload the content of src/main/avro to the schema-server.

that points out that another useful piece would be to add the transformation idl -> avsc to the methods of sbt-avrohugger.

the schema version is automatically assigned by the registry.

@mariussoutier
Contributor

So the schema registry only reads avsc?

In a team setting, how do you control which schema (latest in version control) corresponds to which version in the registry? That's why I asked about storing the version locally as well. Especially when we also want to support downloading schemas off the registry.

I'd also be interested in how the versions are handled later on, when you read a message off Kafka, how do you know which version the schema adheres to?

@xelax

xelax commented Sep 1, 2016

Yes, the schema server wants AVSC, that is the reason why it would be really interesting if I could support the transformations via sbt-avrohugger:

I edit the .idl
sbt converts .idl to .avsc
sbt publishes it to the schema server.
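
The publish step above could be sketched roughly as follows. This assumes the schema server is a Confluent-style Schema Registry, whose REST API registers a schema with POST /subjects/{subject}/versions and expects the AVSC embedded as an escaped string in a JSON body. The object name SchemaUpload, its helper names, and the registry URL and subject passed in are hypothetical, not part of sbt-avrohugger.

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets.UTF_8
import scala.io.Source

// Hypothetical helper, not part of sbt-avrohugger.
object SchemaUpload {
  // Escape a string so it can be embedded as a JSON string value.
  def jsonEscape(s: String): String =
    s.flatMap {
      case '"'  => "\\\""
      case '\\' => "\\\\"
      case '\n' => "\\n"
      case '\r' => "\\r"
      case '\t' => "\\t"
      // Remaining control characters become a backslash, 'u', and four hex digits.
      case c if c < ' ' => "\\" + "u" + "%04x".format(c.toInt)
      case c    => c.toString
    }

  // Request body for the registration endpoint: {"schema": "<escaped avsc>"}.
  def payload(avsc: String): String =
    "{\"schema\":\"" + jsonEscape(avsc) + "\"}"

  // POST the schema under the given subject and return the raw JSON response,
  // which on success contains the id assigned by the registry.
  def upload(registryUrl: String, subject: String, avsc: String): String = {
    val conn = new URL(s"$registryUrl/subjects/$subject/versions")
      .openConnection().asInstanceOf[HttpURLConnection]
    try {
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      conn.setRequestProperty("Content-Type", "application/vnd.schemaregistry.v1+json")
      val out = conn.getOutputStream
      try out.write(payload(avsc).getBytes(UTF_8)) finally out.close()
      Source.fromInputStream(conn.getInputStream, "UTF-8").mkString
    } finally conn.disconnect()
  }
}
```

A task built on this would read each .avsc under src/main/avro and call upload once per file. The subject name is whatever the registry setup expects; the Confluent serializers use "<topic>-value" by default.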

The way it works is that each Kafka topic is associated with a specific schema and schema id by the schema server. The id is globally unique (i.e. the id identifies a schema and a version), and each message contains the id of the schema it was encoded with. Reading from the queue works this way:
the deserializer talks to the schema server to find out the current schema associated with the topic and uses it to deserialize the object. The schemas are supposed to evolve in a compatible manner, so an old message is promoted to the latest version of the schema even if it was written with a previous iteration.
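
To illustrate how a message can carry its schema id, here is a minimal sketch that assumes the framing used by the Confluent serializers: one magic byte with value 0, then the schema id as a 4-byte big-endian integer, then the Avro-encoded payload. The object and method names are hypothetical.

```scala
import java.nio.ByteBuffer

// Hypothetical helper for inspecting a raw Kafka message value.
object WireFormat {
  // Returns the schema id if the message starts with the magic byte 0
  // followed by a 4-byte big-endian id; None otherwise.
  def schemaId(message: Array[Byte]): Option[Int] =
    if (message.length >= 5 && message(0) == 0)
      Some(ByteBuffer.wrap(message, 1, 4).getInt) // ByteBuffer is big-endian by default
    else None
}
```

For example, a message whose first five bytes are 0, 0, 0, 0, 42 yields Some(42), and a message with any other first byte yields None.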

In my projects I keep writers separate from readers: when I want to use a topic, I use my extension to sbt-avrohugger to re-generate the Scala classes from the schema server every time I rebuild the project. I do not rely on the idl file that I maintain in the writer project: they communicate only through the schema server.
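
The reader-side download could look something like the sketch below. It is not the extension mentioned above, only an illustration that assumes a Confluent-style registry, where GET /subjects/{subject}/versions/latest returns a JSON object with the fields subject, version, id and schema. The object name, method names, and the example URL and subject are hypothetical.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

// Hypothetical helper, not part of sbt-avrohugger.
object SchemaDownload {
  // URL of the latest registered version for a subject.
  def latestUrl(registryUrl: String, subject: String): String =
    s"${registryUrl.stripSuffix("/")}/subjects/$subject/versions/latest"

  // Fetch the raw JSON response; the "schema" field holds the AVSC as an
  // escaped string and still has to be extracted with a JSON parser.
  def fetchLatest(registryUrl: String, subject: String): String = {
    val conn = new URL(latestUrl(registryUrl, subject))
      .openConnection().asInstanceOf[HttpURLConnection]
    try Source.fromInputStream(conn.getInputStream, "UTF-8").mkString
    finally conn.disconnect()
  }
}
```

Because the response also carries the version and id, a download task could record them next to the generated sources if storing the version locally turns out to be useful.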

@xelax

xelax commented Sep 6, 2016

I also implemented a task that converts Avro IDL to AVSC and copies the top-level class schema to the source directory:

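def idlToSchemata(input: String, tmpDir: String, mainClass: String): Unit = {
   import scala.collection.JavaConverters._
   println(s"converting $input to AVSC")
   // Run avro-tools' idl2schemata, which writes the .avsc files into tmpDir
   (new org.apache.avro.tool.IdlToSchemataTool).run(System.in, System.out, System.err, List(input, tmpDir).asJava)
   // Copy only the top-level schema from tmpDir (previously hard-coded as "tt") to the source directory
   IO.copyFile(new File(s"$tmpDir/$mainClass.avsc"), new File(s"src/main/avro/$mainClass.avsc"))
}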

lazy val generateIdl = taskKey[Unit]("generate the idl")

generateIdl := {
   idlToSchemata("src/main/avro-schema/my.idl", "tt", "MyMainClass")
}

This brings in a dependency on "org.apache.avro" % "avro-tools" % "1.8.1".

3 participants