Lustreinfiniband #289
…over infiniband (IPoIB) Changes to files to enable Infiniband functionality: lfsmaster.sh lfsoss.sh lfsclient.sh lfsrepo.sh Addition for correct drives placement of OSSes : installdrives.sh *installdrives.sh takes about 15 minutes to run, so please either remote this entity or wait it out.
…e using IP over infiniband (IPoIB) using the existing 700GB NVMe drives in the H series nodes Changes to files to enable Infiniband functionality: lfsmaster.sh lfsoss.sh lfsclient.sh lfsrepo.sh
…over infiniband (IPoIB) - lustre-rdma - This is a created implementation of Lustre using native Remote Direct Memory Access (RDMA) Changes to files to enable Infiniband functionality: lfsmaster.sh lfsoss.sh lfsclient.sh lfsrepo.sh lfspkgs.sh Addition for the installation of new OFED : installOFED.sh Addition for correct Lustre kernel : lustreinstall1.sh Lustre packages : lustreinstall2.sh Addition for rebooting of Lustre MDS/OSS: rebootlustre.sh Addition for pause after MDS/OSS reboot : waitforreboot.sh
…remove unnecessary lines
…remove unnecessary lines
…ition of lustre_rdma_avs
…y, so removing Infiniband components from headnode
- please use something smaller than hc44 for headnode (no hpc/ib requirements)
- do not rely on sleep for az resources to be made, better actively check
- why modify .ssh dir in lfs scripts? that should already be set up
- waagent isn't the only one trying to manage sdb; cloud-init is here as well...
- why install ofed on centos-hpc image?
- modify waagent.conf, but not restart?
- double modification of waagent.conf?
- sakey for saskey?
please update, so I can start functional tests.... thanks!
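One way to act on the "actively check" point above is a small polling helper that retries a check until it succeeds or a deadline passes. This is a minimal sketch, not code from this pull request: `wait_for` is a hypothetical helper name, and the resource group and VM names in the usage comment are placeholders.

```shell
# wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0 or the timeout expires.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    # Give up once the deadline has passed instead of looping forever.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Usage sketch (placeholder names): poll every 10 s for up to 10 minutes
# until the VM can be queried, rather than sleeping for a fixed time.
# wait_for 600 10 az vm show --resource-group my-rg --name my-oss-0 --output none
```

Unlike a fixed `sleep`, the helper returns as soon as the check passes and fails loudly when the resource never appears, so the calling script can stop instead of continuing against a missing resource.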
I have reduced the size of the headnode to a 'Standard_D8s_v3' since there is no InfiniBand connectivity with the Lustre servers anyway.
lustre_rdma_nvmedrives:
Changes to files to enable Infiniband functionality:
lfsmaster.sh
lfsoss.sh
lfsclient.sh
lfsrepo.sh
lfspkgs.sh
Addition for the installation of new Mellanox OFED (MOFED) for the Lustre kernel : installMOFED.sh
Addition for correct drives placement of OSSes : installdrives.sh
*installdrives.sh takes about 15 minutes to run, so please either remote this entity or wait it out.
Additions for correct Lustre kernel :
lustreinstall1.sh
lustreinstall2.sh
Addition for pause after MDS/OSS reboot : waitforreboot.sh
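For readers unfamiliar with the difference between the IPoIB and native RDMA variants mentioned in this pull request: in Lustre, the choice shows up in which LNet network type the `lnet` module is given, `tcp` over the IPoIB interface versus `o2ib` for native RDMA. The sketch below only prints the corresponding modprobe option line. It is an illustration, not taken from the scripts listed above: `lnet_modprobe_line` is a hypothetical helper, and the interface name `ib0` is an assumption.

```shell
# lnet_modprobe_line MODE [INTERFACE]
# Hypothetical helper: prints the lnet module option for the given mode.
# MODE is "ipoib" (TCP over the IPoIB interface) or "rdma" (native o2ib).
# INTERFACE defaults to ib0, which is an assumption about the node setup.
lnet_modprobe_line() {
  local mode="$1" iface="${2:-ib0}"
  case "$mode" in
    ipoib) printf 'options lnet networks=tcp0(%s)\n' "$iface" ;;
    rdma)  printf 'options lnet networks=o2ib0(%s)\n' "$iface" ;;
    *)     echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Usage sketch: a line like this would typically go into a file such as
# /etc/modprobe.d/lustre.conf before the Lustre modules are loaded.
# lnet_modprobe_line rdma
```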