Replacing FUSE with a notification system #339
Conversation
docker/bootstrap/lega.sh
Outdated
@@ -204,87 +150,48 @@ services:
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=lega
      - PGDATA=/ega/data
    hostname: db
    container_name: db
    image: postgres:latest
We should pin postgres to 9.6.
We even mention this in the docs: https://localega.readthedocs.io/en/latest/db.html?highlight=9.6
Also replace the other mentions of postgres:latest.
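A minimal sketch of the pinning, assuming the compose service shown in the diff above:

```yaml
# docker-compose fragment (sketch): pin the image instead of :latest
db:
  image: postgres:9.6
  hostname: db
  container_name: db
  environment:
    - POSTGRES_USER=${DB_USER}
    - POSTGRES_PASSWORD=${DB_PASSWORD}
    - POSTGRES_DB=lega
    - PGDATA=/ega/data
```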
This PR also adds a Travis stage (stages run in parallel, according to Stefan) where we run the simple Makefile. That Makefile has been adapted accordingly. I ran some numbers:
I don't know how you get 1GB for Apache Mina; it seems to be stable at roughly 345MB. The upload speed is not very good, mostly because we are using the default driver for the docker volumes (which are...on files). Normally, the speed should increase with better volume solutions. Let us know what you get in your respective deployments. OpenSSH seems 50% faster than Apache Mina.
@silverdaz it would be awesome if the history did not contain so many commits with "weird" (but funny) messages - the last 16 commits can and should be squashed.
It is according to: https://docs.travis-ci.com/user/build-stages#what-are-build-stages
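For reference, build stages are declared roughly like this (a sketch following the linked Travis docs; the stage names and scripts here are assumptions, not this repo's actual `.travis.yml`):

```yaml
# .travis.yml fragment (sketch): jobs in the same stage run in parallel,
# stages themselves run sequentially
jobs:
  include:
    - stage: tests
      script: make -C tests
    - stage: tests            # runs in parallel with the job above
      script: make -C extras/benchmark
```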
Got the information from OpenShift (my instance has been up for 22 days, and this grew over time). I would like to monitor the inbox solutions over more than 10 days of uptime; I don't know the uptime behind the "Memory Consumption" numbers illustrated, and load tests should probably be implemented to stress those solutions. But again, this is not the subject of this issue, and scalability is not addressed at this moment, as we have no indication of any usage expectations.
We observed this behaviour in a Kubernetes deployment as well (with an attached volume, even though the data is still there, the ids get overwritten). We would be more comfortable saying it is solved if we implemented some sort of UUID generation (some pitfalls of this are explained here: https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-careful-7b2aa3dcb439)
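A minimal sketch of the UUID idea in Python (illustrative only; the actual change would be in the postgres schema, and the pitfalls from the linked article still apply):

```python
import uuid

# Sketch: generate a random (version 4) UUID as an identifier, so that a
# re-created database cannot hand out an id that was already in use.
def new_file_id() -> str:
    return str(uuid.uuid4())

file_id = new_file_id()
print(file_id)  # e.g. '9f1c2e9a-...' (36 characters)
```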
- No need to update MAIN_REPO
- Using dd for file creation. FILESIZE is a variable.
- Adding a target to check the MQ messages, for successful ingestion
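The dd-based file creation can be sketched like this (an illustration; the Makefile's actual target and FILESIZE handling may differ):

```shell
# Sketch: create a test payload of FILESIZE megabytes with dd
FILESIZE=10   # in MB; kept as a variable, as in the Makefile
dd if=/dev/zero of=/tmp/payload bs=1M count="$FILESIZE" 2>/dev/null
stat -c %s /tmp/payload   # 10485760 bytes
```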
ef14a43 to 399751c
This addresses some of the discussions we had about using different SFTP servers for the inbox, related to user story #4 - using two different types of inboxes allows us to cover a wider range of servers.
I do not believe that #329 is addressed by this PR; it should be investigated further.
Additionally, it bumps the versions of some Python dependencies and includes other fixes.
Replacing FUSE with a notification system
This PR replaces the FUSE program with a notification system. We also upgraded the authentication mechanism to support multiple user ids.
It has the following benefits:

* No more moving uploaded files from `/lega/<user>` to `/ega/inbox/<user>`. In case something went wrong, the uploaded file would not end up in the inbox.
* No `cron` job to clean up the mountpoints (since we chroot the user in its home directory, `umount` would fail).
* No `/dev/fuse`, no FUSE python code. The file system calls do not go through the library, so we get performance.

Regarding the authentication mechanism:

* We no longer use a single `lega` account to impersonate all logged-in users. Instead, each user has its own id. Each site administration can configure in which range the user id lands (by shifting the user id). The fake CentralEGA is updated accordingly.
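The uid shifting can be sketched like this (the offset value and function name are assumptions, not the actual configuration keys):

```python
# Sketch: each site maps CentralEGA user ids into its own local range
# by adding a configurable offset.
UID_SHIFT = 10000  # assumed, site-configurable

def local_uid(cega_uid: int) -> int:
    """Map a CentralEGA user id into this site's uid range."""
    return UID_SHIFT + cega_uid

print(local_uid(42))  # 10042
```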
Regarding the SFTP server and the notification system:

* The notification listener is bound to the `lo` interface (no external access).

Moreover, this PR solves issue #329 by making the postgres data persistent in its own volume. Rebooting the database should pick up where it left off.

Finally, lega.yml is slightly updated in order to allow scalability of the ingest and verify workers.
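The persistence fix can be sketched as a named volume in the compose file (the volume name here is an assumption):

```yaml
# docker-compose fragment (sketch): keep PGDATA on a named volume so a
# rebooted or recreated container reuses the existing database files
services:
  db:
    image: postgres:9.6
    environment:
      - PGDATA=/ega/data
    volumes:
      - db-data:/ega/data
volumes:
  db-data:
```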
Note: we could send the message directly to the local broker instead of the listener, but using `pika` and `conf.ini` in the listener was so much simpler than using a C-based AMQPS client. After rudimentary tests, it seems performant enough.