-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add schema naming strategy for Kafka Connect #126
base: master
Are you sure you want to change the base?
Add schema naming strategy for Kafka Connect #126
Conversation
import org.apache.avro.Schema; | ||
|
||
@Slf4j | ||
public class AWSSchemaNamingStrategyConnectNameImpl implements AWSSchemaNamingStrategy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this PR!
Few comments:
- In my opinion this is also useful for use-cases beyond Kafka connect. We can drop 'Connect' from the class name. Also, we support multiple data-formats like JSON / Protobuf (future) but logic here seems to be specific to Avro, I suggest either of the following suggestions:
1.1 Add a TODO / issue and a note that this needs to be improved to support other data formats. I can take up the improvement later. Or
1.2 Feel free to implement it for JSON format if you can. :)
Name suggestion: RecordSchemaNameNamingStrategy
or better.
- Can we also support Avro SpecificRecord?
- In order to have predictable behavior, can we throw an exception if
fullName
is null or it's not an Avro record. - Were you able to test this end-to-end from serializer to de-serializer? It would be great if you can add a unit-test.
Thanks again for the PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review and superb suggestions!
- I will rename it to AWSSchemaNamingStrategyRecordSchemaNameImpl
Unfortunately it seems to me that JSON Schema converter ignores value of both key.converter.schemaNameGenerationClass and value.converter.schemaNameGenerationClass parameters. So I'm unable to overload default implementation, unfortunately. - Yes, this works with key.converter.avroRecordType=SPECIFIC_RECORD and value.converter.avroRecordType=SPECIFIC_RECORD
- Will be done.
- Currently I'm testing in my Oracle RDBMS source and sink connector environment. I will try to backport some record examples for use in end-to-end unit test.
Regards,
Aleksei
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Thanks! We can drop
Impl
from the name. - Should the code be updated as well? From the code, it seems like it only works if the passed object is
GenericRecord
.
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR @averemee-si !
Would it be possible for you to implement the comments and resubmit this PR ?
Hi @blacktooth @averemee-si |
Hi Vitaly, This naming strategy is also available in separate repo - aws-glue-schema-naming Regards, |
@averemee-si Do you want to update this PR with suggested feedback? |
Issue #, if available: 125
#125
Description of changes:
Added a new implementation of AWSSchemaNamingStrategy that makes AWS Glue Schema names same as Kafka Connect schema names. Default AWSSchemaNamingStrategy implementation makes schema names same as topic name and this cause strange naming and versioning in AWS Glue Schema Registry for Kafka Connect records created using both key and value schemas.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.