-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse each file only once #200
Parse each file only once #200
Conversation
val importedSchemaOrProtocols = importedFiles.flatMap(file => { | ||
val res = infile.getName.split("\\.").last match { | ||
case "avro" => | ||
val gdr = new GenericDatumReader[GenericRecord] | ||
val dfr = new DataFileReader(infile, gdr) | ||
val schemas = unUnion(dfr.getSchema) | ||
schemas.map(Left(_)) | ||
case "avsc" => | ||
val schemas = tryParse(infile, parser) | ||
schemas.map(Left(_)) | ||
case "avpr" => | ||
val protocol = Protocol.parse(infile) | ||
List(Right(protocol)) | ||
case "avdl" => | ||
val idlParser = new Idl(infile, classLoader) | ||
val protocol = idlParser.CompilationUnit() | ||
/** | ||
* IDLs may refer to types imported from another file. When converted | ||
* to protocols, the imported types that share the IDL's namespace | ||
* cannot be distinguished from types defined within the IDL, yet | ||
* should not be generated as subtypes of the IDL's ADT and should | ||
* instead be generated in its own namespace. So, strip the protocol | ||
* of all imported types and generate them separately. | ||
*/ | ||
val importedFiles = IdlImportParser.getImportedFiles(infile, classLoader) | ||
val importedSchemaOrProtocols = importedFiles.flatMap { file => | ||
if (!processedFiles.contains(file.getCanonicalPath)) { | ||
processedFiles += file.getCanonicalPath | ||
val importParser = new Parser() // else attempts to redefine schemas | ||
getSchemaOrProtocols(file, format, classStore, classLoader, importParser) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here the code potentially revisits the same file multiple times
Impressive speed-up, thanks I'm inclined to release this as |
Just found another one of this uncached recursive calls, but this is all I have for now. It cuts down 92% time vs original in my case. |
thanks for round 2 avrohugger 2.11.0 and sbt-avrohugger 2.11.0 are up on sonatype and synching to maven central now |
part of #199 |
With this change I was able to reduce compilation time by 85% vs the original (about 40% vs round 1).