This repository has been archived by the owner on Apr 28, 2021. It is now read-only.

headless rendering #146

Open
SimonDanisch opened this issue Feb 27, 2017 · 42 comments

Comments

@SimonDanisch
Member

SimonDanisch commented Feb 27, 2017

I've decided to go for hardware accelerated headless rendering only for now...

I found these resources, which should be relatively easy to try out, if one has the correct hardware:

https://devblogs.nvidia.com/parallelforall/egl-eye-opengl-visualization-without-x-server/
https://arrayfire.com/remote-off-screen-rendering-with-opengl/
EDIT: I got GLFW figured out

I will try out the last solution hopefully soon and set up https://github.com/SimiDCI/GLVisualizeCI.jl on a machine :)

CC: @timholy, @vchuravy

@timholy
Contributor

timholy commented Feb 27, 2017

This is huge. Given that so much of our visualization needs to happen on a server, it's been my main concern about taking the leap to GLVisualize. Thanks for working on this!

@SimonDanisch
Member Author

@timholy would you like to explore this some more? Is there an easy test we can do?
I think I'm ready to try a few things out, now that I got my first headless CI working :)
Would it be possible to install an X server on your headless systems? That'd make things a lot easier!

@Cody-G

Cody-G commented Mar 13, 2017

Yes we do have X server installed on our headless systems :)

@Cody-G

Cody-G commented Mar 13, 2017

I should add that we use Tesla GPUs in most of these systems, which might complicate things. We can avoid using the Teslas for this if needed. But if I understand #117 correctly then we may be able to use the Teslas?

@SimonDanisch
Member Author

Great!
Yes, Tesla might be harder. This will need some more research. What does glxinfo | grep OpenGL spit out for the Tesla?
On another system, you could try this to see if GLVisualize works already:

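using GLVisualize
window = glscreen(resolution = (500, 500), visible = false) # visible = false is optional
_view(visualize(rand(Float32, 32, 32))) # add a random 32x32 Float32 matrix to the scene
yield() # give pending tasks a chance to run
GLWindow.poll_glfw() # process pending window events
GLWindow.render_frame(window) # draw one frame
GLWindow.swapbuffers(window)
GLWindow.screenshot(window, path = homedir() * "/test.png") # save the frame as test.png in the home directory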

Make sure you're up to date with:

foreach(x-> Pkg.checkout(x), ("GLWindow", "GLAbstraction", "GLVisualize", "GLFW"))

@SimonDanisch
Member Author

You can also run the whole test suite (Pkg.test("GLVisualize")) if you set these environment variables:

ENV["CI"] = "true"
ENV["CI_REPORT_DIR"] = "path/where/renders/will/be/stored"
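
The same two variables can also be exported from the shell before Julia starts, since the Julia process inherits its environment. A minimal sketch, assuming julia is on PATH; the helper name setup_glvisualize_ci and the report directory are made up for illustration, and creating the directory up front is a precaution rather than a documented requirement:

```shell
#!/bin/sh
# Hypothetical helper: export the two variables named above (CI and
# CI_REPORT_DIR) and make sure the report directory exists.
setup_glvisualize_ci() {
    report_dir="$1"
    mkdir -p "$report_dir" || return 1
    export CI=true
    export CI_REPORT_DIR="$report_dir"
}

# Usage (assumes julia is on PATH and GLVisualize is installed):
#   setup_glvisualize_ci "$HOME/glvisualize-renders"
#   julia -e 'Pkg.test("GLVisualize")'
```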

@timholy
Contributor

timholy commented Mar 13, 2017

Really sorry I missed that ping 6 days ago.

If I ssh -X myserver then I get this:

tim@cannon:~$ glxinfo | grep OpenGL
X Error of failed request:  GLXBadContext
  Major opcode of failed request:  154 (GLX)
  Minor opcode of failed request:  6 (X_GLXIsDirect)
  Serial number of failed request:  22
  Current serial number in output stream:  22
tim@cannon:~$

I can try running it locally too. The server is in a closet, so obviously we don't want to sit in front of it for hours, but it does have an attached monitor and keyboard.
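
With ssh -X, GLX requests go to the X server on the client side, so glxinfo is not talking to the server's own display. A rough sketch for telling the two situations apart from the DISPLAY value; the function name is hypothetical and the pattern is only a heuristic based on the usual SSH forwarding convention:

```shell
#!/bin/sh
# Heuristic: SSH X11 forwarding usually sets DISPLAY to something like
# "localhost:10.0", while a local X server is typically ":0" or ":1".
# Checks the first argument, or $DISPLAY when no argument is given.
is_forwarded_display() {
    case "${1:-${DISPLAY:-}}" in
        localhost:*|127.0.0.1:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   if is_forwarded_display; then echo "DISPLAY looks SSH-forwarded"; fi
```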

@SimonDanisch
Member Author

That doesn't look too good :D
Are there even video drivers installed on there?

@timholy
Contributor

timholy commented Mar 13, 2017

If I'm sitting in front of the server using the keyboard and monitor, I get this

tim@cannon:~$ glxinfo | grep OpenGL
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla C2075/PCIe/SSE2
OpenGL core profile version string: 4.3.0 NVIDIA 375.26
OpenGL core profile shading language version string: 4.30 NVIDIA via Cg compiler
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.5.0 NVIDIA 375.26
OpenGL shading language version string: 4.50 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:
tim@cannon:~$ 

@SimonDanisch
Member Author

Interesting, that looks much better! I guess this means you simply have to start the X server!
This article should help you:
https://arrayfire.com/remote-off-screen-rendering-with-opengl/

@timholy
Contributor

timholy commented Mar 13, 2017

BTW we can also buy a dedicated video card; it will compete for slot space with other things (e.g., compute GPUs).

@SimonDanisch
Member Author

It walks you through all the different gotchas; I'm not sure which ones are relevant on your specific system ;)

@timholy
Contributor

timholy commented Mar 13, 2017

Will try. I have a couple of other things I have to do first, but back at you later today.

@SimonDanisch
Member Author

BTW we can also buy a dedicated video card

doesn't seem to be needed :)

OpenGL core profile version string: 4.3.0 NVIDIA 375.26

That means it should be possible to get hardware accelerated OpenGL in some way!
If you're on there with a keyboard and display, GLVisualize should already run.

@timholy
Contributor

timholy commented Mar 13, 2017

I just verified that GLVisualize works on that server if you're sitting in front of it. 🎆

Thanks for the link re remote access! Since that will involve rebooting the server, there will need to be some checking/scheduling with users of the server.

@SimonDanisch
Member Author

Great! Can't be that hard then!

@Cody-G

Cody-G commented Mar 25, 2017

So I got this working on one of our Tesla servers. Instead of following the ArrayFire instructions in the end I decided to use VirtualGL. Instructions:

  1. Make sure the nvidia driver is installed along with OpenGL libraries. So if you originally installed the drivers with the --no-opengl-libs option as suggested in nvidia's guide then you will need to reinstall the driver without that flag.
  2. Modify xorg.conf according to the instructions on page 15 of this guide from nvidia. If you just want to set up rendering on one gpu you can use this command, substituting in the bus ID of your gpu.
nvidia-xconfig --busid=PCI:4:0:0 --use-display-device=none

(You can find the bus ID with nvidia-xconfig --query-gpu-info)
It should be possible to use multiple GPUs by creating similar entries in xorg.conf and using vglrun with the -d flag to set the display, but I didn't test that.

  3. Install and configure VirtualGL on the server following instructions here.
  4. When logged into the server with a remote desktop client you can now run OpenGL applications by prefixing vglrun to the start command, i.e. start julia with vglrun ./julia when you want to use GLVisualize. I only tested with X2Go but it should work with other clients or simply within an ssh -X session.
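
The bus ID lookup in the xorg.conf step can be scripted. This is only a sketch: the helper name gpu_bus_ids is made up, and it assumes that nvidia-xconfig --query-gpu-info prints lines of the form "PCI BusID : PCI:4:0:0", which should be checked against the output on the actual machine:

```shell
#!/bin/sh
# Hypothetical helper: read `nvidia-xconfig --query-gpu-info` output on stdin
# and print one bus ID per GPU.
# Assumption: each GPU is reported with a line like "  PCI BusID : PCI:4:0:0".
gpu_bus_ids() {
    sed -n 's/^[[:space:]]*PCI BusID[[:space:]]*:[[:space:]]*\(PCI:[0-9:]*\).*$/\1/p'
}

# Usage (writing xorg.conf normally requires root):
#   id="$(nvidia-xconfig --query-gpu-info | gpu_bus_ids | head -n 1)"
#   nvidia-xconfig --busid="$id" --use-display-device=none
```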

You can test that it's working by comparing the output of glxinfo with and without vglrun.

without:

cody@hydra:~$ glxinfo | grep OpenGL
OpenGL vendor string: Mesa project: www.mesa3d.org
OpenGL renderer string: Mesa GLX Indirect
OpenGL version string: 1.2 (1.5 Mesa 6.4.1)
OpenGL extensions:

with:

cody@hydra:~$ vglrun glxinfo | grep OpenGL
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla M2090/PCIe/SSE2
OpenGL core profile version string: 4.5.0 NVIDIA 375.26
OpenGL core profile shading language version string: 4.50 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.5.0 NVIDIA 375.26
OpenGL shading language version string: 4.50 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:
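
The with/without comparison above can be reduced to the renderer line. A small sketch with hypothetical helper names; treating any renderer string that mentions Mesa as "not the NVIDIA driver" is a heuristic that fits the two outputs shown here, not a general rule:

```shell
#!/bin/sh
# Print the value of the first "OpenGL renderer string" line from glxinfo
# output supplied on stdin.
gl_renderer() {
    sed -n 's/^OpenGL renderer string: //p' | head -n 1
}

# Heuristic: succeed when the renderer string looks like a Mesa indirect or
# software path rather than a vendor GPU driver.
is_software_renderer() {
    case "$1" in
        *Mesa*|*llvmpipe*|*"Software Rasterizer"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a configured server:
#   glxinfo | gl_renderer          # expected to name Mesa
#   vglrun glxinfo | gl_renderer   # expected to name the GPU
```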

There is one issue with this. VirtualGL doesn't like it when GLVisualize windows are closed. When running the tests for GLVisualize everything works fine (I can play with the cat) but when I close the window I get this error:

julia> Pkg.test("GLVisualize")
INFO: Computing test dependencies for GLVisualize...
INFO: Installing Highlights v0.2.1
INFO: Testing GLVisualize
Now showing rotate_robj.jl:

[VGL] ERROR: in readback--
[VGL]    254: Window has been deleted by window manager
=============================[ ERROR: GLVisualize ]=============================

failed process: Process(`/home/cody/src/julia/usr/bin/julia -Cnative -J/home/cody/src/julia/usr/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/cody/v0.5_for_seg/v0.5/GLVisualize/test/runtests.jl`, ProcessExited(1)) [1]

================================================================================
INFO: Removing Highlights v0.2.1
ERROR: GLVisualize had test errors
 in #test#61(::Bool, ::Function, ::Array{AbstractString,1}) at ./pkg/entry.jl:740
 in (::Base.Pkg.Entry.#kw##test)(::Array{Any,1}, ::Base.Pkg.Entry.#test, ::Array{AbstractString,1}) at ./<missing>:0
 in (::Base.Pkg.Dir.##2#3{Array{Any,1},Base.Pkg.Entry.#test,Tuple{Array{AbstractString,1}}})() at ./pkg/dir.jl:31
 in cd(::Base.Pkg.Dir.##2#3{Array{Any,1},Base.Pkg.Entry.#test,Tuple{Array{AbstractString,1}}}, ::String) at ./file.jl:59
 in #cd#1(::Array{Any,1}, ::Function, ::Function, ::Array{AbstractString,1}, ::Vararg{Array{AbstractString,1},N}) at ./pkg/dir.jl:31
 in (::Base.Pkg.Dir.#kw##cd)(::Array{Any,1}, ::Base.Pkg.Dir.#cd, ::Function, ::Array{AbstractString,1}, ::Vararg{Array{AbstractString,1},N}) at ./<missing>:0
 in #test#3(::Bool, ::Function, ::String, ::Vararg{String,N}) at ./pkg/pkg.jl:258
 in test(::String, ::Vararg{String,N}) at ./pkg/pkg.jl:258

@Cody-G

Cody-G commented Mar 25, 2017

This sheds some light on the error:
https://sourceforge.net/p/virtualgl/mailman/message/33672284/

@Cody-G

Cody-G commented Apr 7, 2017

I'm thinking maybe there's something we can do within GLVisualize to prevent the above error. Per the link that I posted, maybe we can call XDestroyWindow() from GLVisualize? But I'm thinking this may give us only one valid OpenGL window per julia session, assuming one started julia with vglrun julia as I was doing. Do you think that's right? If so I guess we could move the call to vglrun into the Julia code, and call it once for each new window that opens, but I would rather avoid that complication. I know almost nothing about X, but I can give this a try with some guidance.

@timholy
Contributor

timholy commented Apr 7, 2017

I suspect that the issue is that the destroy call is being generated by the window manager on the client rather than the application on the server---so the server doesn't realize the window has been closed, tries to draw on it, and gets an error. If you run using include("rotate_robj.jl") (from inside GLVisualize/examples/introduction), is the error message more informative than it is from Pkg.test?

Ideally it would be nice not to have to set up any communication between VirtualGL and GLFW; instead, it would be great if GLFW would emit some signal that we can catch in a callback and terminate rendering. That will depend on where the failure occurs.

Another option might be to wrap this loop in a try/catch. (I'm assuming that this is what gets called by renderloop.) It's possible that we have to detect the problem before the error gets thrown, however.

@timholy
Contributor

timholy commented Apr 7, 2017

See also http://www.glfw.org/docs/latest/window_guide.html#window_close. GLFW.jl doesn't yet wrap that function, but it would only be a few lines. I'm not sure whether GLFW actually gets the signal in time, however; it's possible that GLFW only learns about the problem after the "pipe" is broken and it's too late to do anything about it.

You could also try commenting out the render_frame and swapbuffers lines in renderloop and see whether it's specifically drawing that causes the error. If that fixes the error, maybe inserting more !GLFW.WindowShouldClose(window) in various places might do the trick.

@Cody-G

Cody-G commented Apr 9, 2017

Well I didn't solve this, but I understand the problem a bit better. Note that one can replicate the problem without GLVisualize, simply by using virtualgl to run the example script in the GLFW.jl README.

Another option might be to wrap this loop in a try/catch. (I'm assuming that this is what gets called by renderloop.) It's possible that we have to detect the problem before the error gets thrown, however.

It seems we have to detect it before it's thrown.

See also http://www.glfw.org/docs/latest/window_guide.html#window_close. GLFW.jl doesn't yet wrap that function, but it would only be a few lines. I'm not sure whether GLFW actually gets the signal in time, however; it's possible that GLFW only learns about the problem after the "pipe" is broken and it's too late to do anything about it.

Are you referring to glfwSetWindowShouldClose? GLFW.jl actually does wrap that function already. Anyway, it seems that whatever the glfw library is doing comes too late. Here's my understanding of the order of what's happening:

  1. Whenever a new X window is created, virtualgl asks to be notified when the window manager tries to close the window. It does this by using the WM_DELETE_WINDOW protocol, described here. If virtualgl ever receives this signal, then it is already too late to prevent the error. Basically we want to avoid letting this line get executed, per the advice that I linked to in a previous post.
  2. When glfw creates a window it also listens for the close signal by using the same protocol, and executes this when it gets the signal.
  3. The user clicks the close button and both virtualgl and glfw get the signal, and it's a race condition from that point on. glfw gets time to execute a few more lines before virtualgl shuts things down. I was hoping that we could intervene with glfw's ability to set a window close callback (code here), but again it's too late by that time.

So we want to prevent virtualgl from getting the signal at all. There are examples of how to do this using calls to Xlib--here is one--but glfw seems to take a similar approach here, so I don't understand why glfw's interception is not sufficient. Maybe there's a difference between "intercepting" and sharing a signal, but I don't see how to distinguish between the two.

For now the workaround is to never close windows with X window buttons, and to always call GLFW.DestroyWindow(win) instead.

@timholy
Contributor

timholy commented Apr 9, 2017

I meant glfwSetWindowCloseCallback, which doesn't seem to be wrapped. I wonder if one could use this to prevent VirtualGL from actually closing the window, but allowing the WindowShouldClose mechanism to work.

If this is too troublesome we should file an issue with VirtualGL.

@Cody-G

Cody-G commented Apr 9, 2017

glfwSetWindowCloseCallback is also wrapped (it's a bit unclear looking at the code alone, but it happens here), and I did try it. I was referencing it in my 3rd point above. I tried assigning a callback in which I destroy the window, but either it doesn't work or the race condition I mentioned prevents my destroy command from finishing.

It seems that several people have raised the issue with virtualgl before, but the developers are adamant that they are doing the right thing. Could it be a bug in the way GLFW handles X11 window closing?

@timholy
Contributor

timholy commented Apr 9, 2017

Hmm, but if I do this:

julia> using GLFW

julia> window = GLFW.CreateWindow(800, 600, "Context creation")
GLFW.Window(Ptr{Void} @0x000000000304afe0,Function[#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef,#undef])

julia> while !GLFW.WindowShouldClose(window)
           GLFW.PollEvents()
       end

then when I click the close button, the loop terminates but the window stays open; I have to call GLFW.DestroyWindow(window) to actually close it. What happens if you try this through VirtualGL?

@timholy
Contributor

timholy commented Apr 9, 2017

Ah, thanks for explaining. Assuming the above demo is broken if you run it over VirtualGL, perhaps we should ask the devs how they think we should solve this?

@Cody-G

Cody-G commented Apr 9, 2017

So...surprisingly, your demo works for me! The key difference between that and the demo in the GLFW.jl README is that you don't call GLFW.MakeContextCurrent(window)

That seems bizarre to me. Do you understand why the current context might be important?

@timholy
Contributor

timholy commented Apr 9, 2017

Hmm, I'm not the best person to advise you on this, @SimonDanisch knows much more. Maybe try the GLFW_CONTEXT_RELEASE_BEHAVIOR hint? http://www.glfw.org/docs/latest/group__context.html#ga1c04dc242268f827290fe40aa1c91157

@Cody-G

Cody-G commented Apr 9, 2017

My guess is that somehow we're fooling VirtualGL; maybe it only monitors the thread associated with the active context? If so, a workaround might be to remove the current context by passing NULL to that function (see docs) before we destroy the window.

EDIT: That workaround seems impossible to use. It only works if we ensure that the current context does not match the window whenever the user might click the close button (all the time). I can bring the issue over to the VirtualGL folks, but I want to make sure that I can articulate the proper question to ask them. Simon, maybe you could help me form the question?

@timholy
Contributor

timholy commented Apr 10, 2017

Any thoughts here, @SimonDanisch?

@SimonDanisch
Member Author

Sorry, I lost track of this issue a bit. I will look into it after lunch!

@SimonDanisch
Member Author

Seems like you two have already done great detective work. I can't add much to it.
I'm pretty sure that messing with MakeContextCurrent shouldn't be a fix, though ;)
Please go ahead and file an issue with VirtualGL :)

@Cody-G

Cody-G commented Apr 10, 2017

Cool thanks Simon! I've opened an issue with VirtualGL. I tried to cc both of you but the @ mechanism wasn't working from there for some reason.

@Cody-G

Cody-G commented Apr 12, 2017

VirtualGL just posted a new pre-release build that fixes the issue, so we are in business!

@remoore

remoore commented Nov 19, 2017

(Quoting in full Cody-G's VirtualGL setup instructions from Mar 25, 2017, above.)

I've tried following these steps fairly carefully, but I can't seem to get it to work. When I run vglrun julia I get the following error message:

➜ ~ vglrun julia
[VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
[VGL] MYIP, the IP address of your SSH client.
WARNING: Error during initialization of module PCRE: ErrorException("could not load library "libpcre2-8" libpcre2-8.so: cannot open shared object file: No such file or directory")
WARNING: Error during initialization of module GMP: ErrorException("could not load library "libpcre2-8" libpcre2-8.so: cannot open shared object file: No such file or directory")
WARNING: Error during initialization of module Random: ErrorException("could not load library "libdSFMT" libdSFMT.so: cannot open shared object file: No such file or directory")
WARNING: Error during initialization of module LinAlg: ErrorException("could not load library "libopenblas64_" libopenblas64_.so: cannot open shared object file: No such file or directory")
fatal: error thrown and no exception handler available.
Base.InitError(mod=:LibGit2, error=ErrorException("could not load library "libpcre2-8" libpcre2-8.so: cannot open shared object file: No such file or directory"))
rec_backtrace at /home/centos/buildbot/slave/package_tarball64/build/src/stackwalk.c:84
record_backtrace at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:245
jl_throw at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:564
jl_errorf at /home/centos/buildbot/slave/package_tarball64/build/src/rtutils.c:77
jl_dlerror at /home/centos/buildbot/slave/package_tarball64/build/src/dlload.c:74 [inlined]
jl_load_dynamic_library_ at /home/centos/buildbot/slave/package_tarball64/build/src/dlload.c:205
jl_get_library at /home/centos/buildbot/slave/package_tarball64/build/src/runtime_ccall.cpp:159
jl_load_and_lookup at /home/centos/buildbot/slave/package_tarball64/build/src/runtime_ccall.cpp:170
unknown function (ip: 0x7f43ecacdb42)
compile at ./pcre.jl:97
compile at ./regex.jl:55
ismatch at ./regex.jl:145 [inlined]
ismatch at ./regex.jl:145 [inlined]
joinpath at ./path.jl:213
joinpath at ./path.jl:205
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
jl_apply at /home/centos/buildbot/slave/package_tarball64/build/src/julia.h:1424 [inlined]
jl_f__apply at /home/centos/buildbot/slave/package_tarball64/build/src/builtins.c:426
abspath at ./path.jl:277
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
__init__ at ./libgit2/libgit2.jl:914
unknown function (ip: 0x7f43ecd00838)
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
jl_apply at /home/centos/buildbot/slave/package_tarball64/build/src/julia.h:1424 [inlined]
jl_module_run_initializer at /home/centos/buildbot/slave/package_tarball64/build/src/toplevel.c:87
_julia_init at /home/centos/buildbot/slave/package_tarball64/build/src/init.c:733
julia_init at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:297
unknown function (ip: 0x401656)
__libc_start_main at /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:291
unknown function (ip: 0x40170c)

Any help would be appreciated.

Cheers
Rob
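
A log like the one above is easier to act on once it is reduced to the distinct libraries that failed to load. A minimal sketch; the helper name missing_libs is made up, and it relies on the exact "could not load library" wording seen in this log and on a grep that supports -o:

```shell
#!/bin/sh
# Hypothetical helper: read a Julia startup log on stdin and print each
# library named in a "could not load library" message once, sorted.
missing_libs() {
    grep -o 'could not load library "[^"]*"' \
        | sed 's/.*"\([^"]*\)"$/\1/' \
        | sort -u
}

# Usage:
#   vglrun julia 2>&1 | missing_libs
```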

@Cody-G

Cody-G commented Nov 20, 2017

It looks like a dependency problem -- either dependencies are missing or somehow the path isn't getting set correctly. Am I correct that you're using CentOS? I've never used it before, so the setup may be different than what worked on Ubuntu for me.

It looks like the dependency problems are all dependencies of julia. Is the command julia an alias on your system? You might try specifying the full path to the julia executable instead, i.e. vglrun /path/to/julia
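
Shell aliases are expanded only by the interactive shell, so a program such as vglrun never sees them. One way to find the real file to hand over is sketched below; the helper name resolve_exe is made up, and readlink -f is assumed to be available (it is on GNU/Linux):

```shell
#!/bin/sh
# Hypothetical helper: print the real file behind a command name found on
# PATH, following symlinks. Fails for aliases, functions, and builtins,
# because `command -v` does not report a file path for those.
resolve_exe() {
    path="$(command -v "$1")" || return 1
    case "$path" in
        /*) readlink -f "$path" ;;
        *) return 1 ;;
    esac
}

# Usage:
#   vglrun "$(resolve_exe julia)"
```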

@remoore

remoore commented Nov 20, 2017

Hi Cody,

Thanks for getting back to me.

Oddly enough I'm running Ubuntu 16.04 LTS. I'm not sure why centos is appearing in the output. Perhaps I've inadvertently installed the wrong version of VirtualGL?

Yes, I've got julia setup as an alias for /usr/local/bin/julia. I've already tried passing the absolute path to vglrun and that doesn't work either.

@SimonDanisch
Member Author

@Cody-G is right, that's all julia related. Does this even run without vglrun?

I'm running Ubuntu 16.04 LTS
Yes, I've got julia setup as an alias for /usr/local/bin/julia

Well, I don't know what's going on, but the path, /home/centos/buildbot/slave/package_tarball64/build/src/toplevel.c:87 should point to where you installed julia... How did you install julia? The paths look a lot like an automatic Julia install on a CI.

@remoore

remoore commented Nov 20, 2017

Yes, julia runs fine without the vglrun prefix.

I followed the instructions on this page when installing Julia: https://julialang.org/downloads/platform.html

@remoore

remoore commented Nov 20, 2017

And went for the symbolic link option

@remoore

remoore commented Nov 27, 2017

I've had another stab at getting this to work.

This time I built Julia from source instead of using a downloaded linux binary. This has solved the problems that I was experiencing previously and I can now run vglrun julia without getting any errors. It's still not working as I'd like though.

I can run using Plots; gr(); scatter(rand(1000), rand(1000)) within the Julia session, which creates a figure on my local machine. However, if I try using the GLVisualize backend by running using Plots; glvisualize() instead, I get the following error message: ERROR: GLFWError (PLATFORM_ERROR): X11: RandR gamma ramp support seems broken.

Any advice would be appreciated.

@SimonDanisch
Member Author

Well, GR doesn't use OpenGL, so it should work independent of the GPU setup.
This might be a regression in GLFW. I can look into this.
If you want, tinker with:
https://github.com/JuliaGL/GLFW.jl/blob/master/src/GLFW.jl#L40
And remove everything that throws. If I remember correctly, ERROR: GLFWError (PLATFORM_ERROR): X11: RandR gamma ramp support seems broken is a non-fatal error; I'm not sure why GLFW started throwing it again!
