Sync Maven data to graph #2048
Comments
Fixed a typo @msrb :)
@sivaavkd description updated, is it better now?
Carrying forward from #1085 (comment): some intermittent failures were encountered in the graph layer. This seems to happen very rarely:
Figured out the cause of the above error from the Gremlin server logs:
Essentially, the queries are not formed correctly. The existing code should factor in special characters such as newlines when building them. The code in question:
drop_props.append('declared_licenses')
prp_version += " ".join(["ver.property('declared_licenses', '{}');".format(dl) for dl in declared_licenses])
# Create License Node and edge from EPV
for lic in declared_licenses:
    prp_version += "lic = g.V().has('lname', '{lic}').tryNext().orElseGet{{" \
                   "graph.addVertex('vertex_label', 'License', 'lname', '{lic}', " \
                   "'last_updated',{last_updated})}}; g.V(ver).out(" \
                   "'has_declared_license').has('lname', '{lic}').tryNext()." \
                   "orElseGet{{ver.addEdge('has_declared_license', lic)}};".format(
                       lic=lic, last_updated=str(time.time())
                   )
Happens for this package
Notice the newline.
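A minimal sketch of one possible fix, assuming the license strings keep being embedded in single-quoted Groovy literals as in the snippet above. The helper names `escape_groovy_literal` and `declared_license_props` are hypothetical and are not taken from the data-importer code base; the idea is only to escape backslashes, quotes, and line breaks before formatting the query.

```python
def escape_groovy_literal(value):
    """Escape a string for use inside a single-quoted Groovy literal.

    Hypothetical helper: not part of the existing importer code.
    """
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ("'", "\\'"),    # a bare quote would terminate the literal
        ("\r", "\\r"),   # raw line breaks are not allowed in the literal
        ("\n", "\\n"),
    ]
    for old, new in replacements:
        value = value.replace(old, new)
    return value


def declared_license_props(declared_licenses):
    """Build the 'declared_licenses' property statements with escaped values."""
    return " ".join(
        "ver.property('declared_licenses', '{}');".format(escape_groovy_literal(dl))
        for dl in declared_licenses
    )


if __name__ == "__main__":
    # The embedded newline ends up as the two characters backslash and 'n'.
    print(declared_license_props(["Apache License\nVersion 2.0", "O'Reilly"]))
```

The same escaping would need to be applied to `lic` in the License node and edge query. An alternative that avoids escaping altogether is to pass the values as request bindings instead of formatting them into the script, if the way the importer talks to the Gremlin server allows it.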
Last time I checked the progress of the Maven graph sync, it went till package named
@miteshvp can we query the graph for the exact number of Maven packages/components please?
In the first pass, less than
Some packages may have been skipped due to Gateway Timeouts. We can sync those once the first pass is complete.
The first pass is complete. Many of the Maven packages failed to sync to the graph for multiple reasons:
I have now scheduled the sync of those (pending) packages again.
Thanks Saleem 👍 I've created #2256 for improving test coverage in data-importer.
Late reply but here it is - there are total
@miteshvp How did you figure this out?
Maven is in graph, closing. Thanks @tuxdna 😉
Description
There were times when the data ingestion pipeline was broken or certain parts of the pipeline were disabled. During such times, we analyzed plenty of packages and stored the results in S3, but never ingested the data into the graph database. With #1085 implemented, we want to sync all missing data from S3 to the graph.
Acceptance criteria