
nn.GPU #835

Merged: 3 commits into torch:master, Jul 2, 2016
Conversation

@nicholas-leonard (Member) commented May 27, 2016

This PR adds nn.GPU, which can be used to distribute modules across multiple devices.

OMG why is it in nn and not in cunn? As discussed with @soumith, putting it in nn means that it can be used in CPU-only environments, which is common for production.

The unit tests are located in cunn to avoid a pcall(function() require 'cunn' end) in the nn unit tests (see PR torch/cunn#282).

nn.Sequential()
   :add(nn.GPU(nn.Linear(10000,10000), 1))
   :add(nn.GPU(nn.Linear(10000,10000), 2))
   :add(nn.GPU(nn.Linear(10000,10000), 3))
   :add(nn.GPU(nn.Linear(10000,10000), 4, cutorch.getDevice()))
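For readers new to the pattern, here is a minimal sketch of the decorator idea behind nn.GPU (not the actual merged source; nn.MyGPU is a hypothetical name, and the backward pass, type conversion, and output-device handling are omitted):

local MyGPU, parent = torch.class('nn.MyGPU', 'nn.Container')

function MyGPU:__init(module, device, outdevice)
   parent.__init(self)
   self.device = device
   self.outdevice = outdevice or device
   self.modules[1] = module
end

function MyGPU:updateOutput(input)
   if cutorch then
      -- run the wrapped module's forward on self.device; a full
      -- implementation would also move the output to self.outdevice
      cutorch.withDevice(self.device, function()
         self.output = self.modules[1]:updateOutput(input)
      end)
   else
      -- CPU-only environment: act as a transparent wrapper
      self.output = self.modules[1]:updateOutput(input)
   end
   return self.output
end

In the chained example above, each wrapped nn.Linear then computes its forward pass on its own device, spreading the model across GPUs 1 through 4.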
A Contributor commented on the line using cutorch.getDevice():

I am wondering if this line can run in CPU-only environments.

@nicholas-leonard (Member, Author) replied May 27, 2016:

The cutorch.getDevice() call will not. But it isn't mandatory to use cutorch.getDevice(); this is just an example.
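If one wanted the example itself to be CPU-safe, the optional lookup could be guarded (a sketch; model is a hypothetical container):

-- only query the current device when cutorch is actually loaded
local outdevice = cutorch and cutorch.getDevice() or nil
model:add(nn.GPU(nn.Linear(10000,10000), 4, outdevice))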

@nicholas-leonard force-pushed the GPU branch 2 times, most recently from b1d68d6 to dba06e5 on June 3, 2016
@nicholas-leonard (Member, Author)

I have tested the nn.GPU implementation on a language model task using 4 GPUs, and it works.

@iamalbert (Contributor)

This is cool, but the name and the order of the parameters are a little confusing.

:add(nn.OnGPU(1, nn.Linear(10000,10000)))
:add(nn.OnGPU(2, nn.Linear(10000,10000)))
:add(nn.OnGPU(3, nn.Linear(10000,10000)))
:add(nn.OnGPU(4, cutorch.getDevice(), nn.Linear(10000,10000)))

may be clearer than

:add(nn.GPU(nn.Linear(10000,10000), 1))
:add(nn.GPU(nn.Linear(10000,10000), 2))
:add(nn.GPU(nn.Linear(10000,10000), 3))
:add(nn.GPU(nn.Linear(10000,10000), 4, cutorch.getDevice()))

What do you think?

@nicholas-leonard (Member, Author) commented Jun 6, 2016

@iamalbert I prefer nn.GPU, but I like your order, except for the optional outdevice argument, which I think should still be last (as it is optional):

:add(nn.GPU(1, nn.Linear(10000,10000)))
:add(nn.GPU(2, nn.Linear(10000,10000)))
:add(nn.GPU(3, nn.Linear(10000,10000)))
:add(nn.GPU(4, nn.Linear(10000,10000), cutorch.getDevice()))

@szagoruyko (Member)

The order is already module, device in ModelParallel; it would be confusing to introduce another order.

@nicholas-leonard (Member, Author)

@szagoruyko Good point. The backwards-compatibility argument always wins (plus I am lazy). I will leave it as it is.

@szagoruyko (Member)

@nicholas-leonard there are a couple of places like self.output = output that break tensor sharing and optnet; can we avoid them?

@nicholas-leonard (Member, Author)

@szagoruyko How does it break tensor sharing? (I don't know anyone who shares outputs.) As for optnet, what does it do that cannot support self.output = output? I am pretty sure I do this in many other places.

@fmassa (Contributor) commented Jun 15, 2016

Hi,

Sorry for the delay in replying.

The current underlying requirement in optnet is that the tensor/storage objects corresponding to the output and gradInput of each module don't change across runs of forward/backward.
There are actually two cases:

  • When one wants to create graph visualizations, the tensor shouldn't change. Thus, doing something like the following will currently not work (although there is a pending PR which tries to fix it, I'm not 100% sure it doesn't have bad side effects):

function MySelectModule:updateOutput(input)
  self.output = input[1] -- creates a new tensor which shares the storage
  return self.output
end

  • When we care only about optimizing for memory, optnet only looks at the storages. So a module which allocates new memory for the output/gradInput during each forward/backward won't be able to reuse buffers, but I think it shouldn't cause any problems with respect to the correctness of the entire network's forward/backward.

About this PR: as it's a generic module that supports both tensors and tables, using set is not an option. But from a quick look I have the impression that this module could potentially work as is with optnet, though I haven't tried it to check.
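For contrast, here is the common nn idiom that keeps the same output tensor/storage across runs, which is what optnet's first case wants (a generic sketch; MyModule is a hypothetical module):

function MyModule:updateOutput(input)
   -- resize the existing buffer in place and copy into it, so the
   -- tensor/storage behind self.output stays stable across forwards
   self.output:resizeAs(input):copy(input)
   return self.output
end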

@nicholas-leonard (Member, Author)

So I guess this PR is ready to merge then :)

On these lines from @nicholas-leonard's diff:

self.modules[1] = module

if module:type() == 'torch.CudaTensor' then
   self:cuda()

a Member commented:

This :cuda() is not executing in the context of "device"; it needs a fix.
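A sketch of the kind of fix being suggested, wrapping the conversion in the device context (an assumption, not necessarily the exact patch that landed):

if module:type() == 'torch.CudaTensor' then
   -- perform the type conversion on the wrapped module's device,
   -- not on whichever device happens to be current
   cutorch.withDevice(self.device, function() self:cuda() end)
end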

@soumith merged commit 07d3bdd into torch:master on Jul 2, 2016
@soumith (Member) commented Jul 2, 2016

Thanks for your patience, Nicholas!

@szagoruyko (Member)

Just realized: we need to add support for CUDA data types other than float.

@soumith (Member) commented Jul 2, 2016

He seems to be checking for Cuda*Tensor; is that not sufficient?
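A sketch of what such a check could look like (an illustration, not the actual code in the PR; isCudaType is a hypothetical helper):

-- matches 'torch.CudaTensor', 'torch.CudaHalfTensor',
-- 'torch.CudaDoubleTensor', and so on
local function isCudaType(tensorType)
   return tensorType:find('torch%.Cuda.*Tensor') ~= nil
end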
