Split Titan and Gremlin - impossible to scale #10
Totally agree with having only one TitanGraph instance.
My understanding (which may be completely wrong) is that Gremlin is just a thin wrapper sitting in front of Titan. So, ideally, we would also like to be able to scale Titan, as Titan is the component that does all the heavy lifting. Can we scale Titan as well?
If I understand the overall architecture correctly, Gremlin also performs query strategy evaluation and path traversal computations to optimize queries. So we would probably save something by scaling Gremlin too (besides possible connection reuse?).
I think this will not be doable - based on the configuration section, it is not possible to let two or more Titan instances talk to the same graph - see http://titan.thinkaurelius.com/wikidoc/0.3.1/Graph-Configuration.html
Do we have the current issues documented (for ingestion into the Gremlin layer)?
What kind of issues are we running into when more workers / threads / containers run in parallel and ingest into Gremlin? Any plan of action can be based on this.
We wanted to scale Gremlin when we were under heavy load (currently we can only scale Gremlin+Titan together, as they run in one container). It took quite a while to store/retrieve data from the graph database. I will need to check the communication parameters or find the bottleneck there. The UI also does retries because it takes so long to get data.

Currently the main issue is the data_model importer, which is an API that simply pushes data into the graph. There was a plan to remove this API service (saving one container in the deployment) and let workers communicate with Gremlin directly. From my perspective there is a nice opportunity to write a really small library that would help us with Gremlin communication, serializing query results, and constructing queries (it could directly use the schemas we maintain for task results). It could be used on the worker side and on the API side to unify work with the graph. This would also address other concerns we currently have.
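For illustration, a minimal sketch of what such a shared helper could look like, assuming Gremlin Server's HTTP endpoint is exposed on port 8182; the module layout, environment variable, and function names are placeholders, not an existing project API:

```python
# Minimal sketch of a shared Gremlin helper; assumes Gremlin Server's HTTP
# endpoint is enabled on port 8182. Names and the env variable are
# illustrative only, not the project's actual API.
import os

import requests

GREMLIN_URL = os.environ.get("GREMLIN_URL", "http://localhost:8182")


def execute(query, bindings=None):
    """POST a Gremlin script and return the deserialized result list."""
    payload = {"gremlin": query}
    if bindings:
        # Parameterized queries avoid fragile string concatenation.
        payload["bindings"] = bindings
    response = requests.post(GREMLIN_URL, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()["result"]["data"]


def vertex_count():
    """Example of a tiny wrapper that workers and the API could share."""
    return execute("g.V().count()")[0]
```

Both the workers and the API service could then import one such module instead of each building their own HTTP calls and result handling.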
We are using Titan 1.0.0. That document relates to a pretty old version of Titan.
Which mode are we using in production for Titan? The multiple or single item model?
@krishnapaparaju Do you mean the SINGLE vs. MULTI item model documented at https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.TitanDB.BestPractices.html? As per my understanding of the DynamoDB Titan plugin, Gremlin uses MULTI by default.
@krishnapaparaju - yes, we are using the multi-item model.
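For reference, a sketch of how the item model is selected in the DynamoDB storage backend for Titan; the excerpt below is illustrative, not a copy of our production properties file:

```properties
# Illustrative excerpt of a Titan graph properties file using the DynamoDB
# storage backend; values shown are examples, not our production settings.
storage.backend=com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager
# The item model is configured per store: SINGLE or MULTI.
storage.dynamodb.stores.edgestore.data-model=MULTI
storage.dynamodb.stores.graphindex.data-model=MULTI
```

Switching a test instance to SINGLE should therefore mean changing these per-store values (and similarly for the remaining stores) rather than touching any Gremlin code.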
Once these issues are resolved, we might want to move to JanusGraph (again with the AWS plugin); this would protect our prior investments both in Gremlin (code) and DynamoDB (deployment). I don't think we need to do anything related to JanusGraph anytime soon; this is more of an FYI about moving to a DB which is actively maintained, with very minimal rework. https://github.com/awslabs/dynamodb-janusgraph-storage-backend
@miteshvp thanks. We can start by experimenting with the SINGLE model (on a different instance) to check whether we get better WRITE performance. Good to know that changing to SINGLE would not lead to any READ-side challenges. Based on the results of that experiment, we can plan a course of action.
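A rough sketch of how that WRITE-side experiment could be measured against the test instance; the endpoint and property key are placeholders, and the script assumes the Gremlin Server HTTP endpoint is enabled:

```python
# Rough write-throughput probe for comparing a MULTI vs. SINGLE item-model
# test instance. Endpoint and property key are placeholders, not our schema.
import time

import requests

GREMLIN_URL = "http://localhost:8182"  # point this at the instance under test


def measure_writes(n=1000):
    """Insert n vertices one by one and return the observed writes per second."""
    start = time.time()
    for i in range(n):
        resp = requests.post(GREMLIN_URL, json={
            "gremlin": "graph.addVertex('bench_idx', i)",
            "bindings": {"i": i},
        }, timeout=30)
        resp.raise_for_status()
    return n / (time.time() - start)


if __name__ == "__main__":
    print("writes/sec:", measure_writes())
```

Running the same script against the MULTI and the SINGLE instance should give a first rough comparison before we commit to any change.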
Closing this issue w.r.t. #11.
The current stack looks like the following:
DynamoDB - Titan - Gremlin
We would like to split Titan and Gremlin into two standalone containers, which would allow us to scale Gremlin independently. Note that creating multiple Titan instances talking to the same DynamoDB tables results in data inconsistencies and data corruption, as stated in "Titan Limitations":
Source: http://titan.thinkaurelius.com/wikidoc/0.3.1/Titan-Limitations.html