Request: Vector of Ptr<Float> as parameter #69

robiwano · 2018-11-03T08:11:19Z

In order to run an algorithm on batches of vectors, I'd like to be able to send a vector of pointers to arrays. Example:

void gpu_algo(Int n, Int m, Vector<Ptr<Float>> ins, Ptr<Float> out)
{
  Int inc = numQPUs() << 4;
  Float acc = 0;
  For (Int i = 0, i < n, i = i + 1)
    Ptr<Float> in = ins[i] + index() + (me() << 4);
    gather(in);            // prefetch the first block
    Float r0;
    For (Int j = 0, j < m, j = j + inc)
      gather(in + inc);    // prefetch the next block
      receive(r0);         // receive the current block
      // ... do stuff ...
      acc = acc + r0;
      in = in + inc;
    End
    store(acc, out);
    receive(r0);           // drain the final prefetch
  End
}

Is this possible?

robiwano · 2018-11-05T16:43:49Z

OK, I see that it is "half done". The parameter-passing code is there in Kernel.h, but the associated mkArg is not, resulting in a linker error. I would really appreciate it if this could be looked at :)

robiwano · 2018-11-12T16:18:53Z

@mn416, any thoughts on this?

mn416 · 2018-11-12T20:48:47Z

Hi @robiwano,

I agree, this is a desirable feature. There seem to be two possible approaches:

1. Support Ptr<Ptr<Float>> in kernel arguments.
2. Support a new variant of kernel call, say callWithUniforms, where the first argument is a std::vector, and this vector can be read in stream fashion inside the kernel. To read the next element of the stream, the kernel simply calls getUniform().
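
For illustration only, the streaming idea in approach 2 can be modelled on the host in plain C++. This is not QPULib code: UniformStream and sumFirstWords are hypothetical names made up for this sketch, the heap is just a std::vector of words, and only the name getUniform() comes from the proposal above. It shows a consumer pulling one address at a time from a stream supplied by the caller:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of a uniform stream: the host supplies a vector of
// 32-bit words and the kernel consumes them one at a time, in order.
class UniformStream {
 public:
  explicit UniformStream(std::vector<uint32_t> words)
      : words_(std::move(words)) {}

  // Returns the next word of the stream; reading past the end is a bug.
  uint32_t getUniform() {
    assert(next_ < words_.size() && "uniform stream exhausted");
    return words_[next_++];
  }

  // Number of words not yet consumed.
  std::size_t remaining() const { return words_.size() - next_; }

 private:
  std::vector<uint32_t> words_;
  std::size_t next_ = 0;
};

// Stand-in for a kernel body (hypothetical): reads `count` byte addresses
// from the stream and sums the word stored at each address in a
// word-indexed heap.
uint32_t sumFirstWords(UniformStream& uniforms,
                       const std::vector<uint32_t>& heap,
                       std::size_t count) {
  uint32_t total = 0;
  for (std::size_t i = 0; i < count; i++) {
    uint32_t byteAddr = uniforms.getUniform();
    total += heap.at(byteAddr >> 2);  // byte address -> word index
  }
  return total;
}
```

With this shape the kernel signature would not need a pointer-to-pointer type at all; the number of streamed addresses would be passed as an ordinary Int argument.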

As you say, it looks like (1) is half done. You could try adding

template <> inline Ptr<Ptr<Float>> mkArg< Ptr<Ptr<Float>> >() {
  Ptr<Ptr<Float>> x;
  x = getUniformPtr<Ptr<Float>>();
  return x;
}

to Kernel.h. If that works, then on the ARM side you can create a SharedArray<float*> as follows:

SharedArray<float> floatsA(256);
SharedArray<float> floatsB(256);
SharedArray<float*> floatPointers(16);
floatPointers[0] = floatsA.getPointer();
floatPointers[1] = floatsB.getPointer();

robiwano · 2018-11-13T06:54:00Z

> As you say, it looks like (1) is half done. You could try adding
>
> template <> inline Ptr<Ptr<Float>> mkArg< Ptr<Ptr<Float>> >() {
>   Ptr<Ptr<Float>> x;
>   x = getUniformPtr<Ptr<Float>>();
>   return x;
> }

Yes, I already tried that, and it does link. However, the emulator crashes with an access violation. This is the test kernel:

void gpu_test(Int n, Ptr<Ptr<Float>> a_s)
{
    Ptr<Float> p = a_s[0];
    Float val = *p;
    Print(val);
}

The crash is in Emulator.cpp:

          // LD2: wait for DMA completion
          case LD2: {
            assert(s->dmaLoad.active);
            uint32_t hp = (uint32_t) s->dmaLoad.addr.intVal;
            int vpmAddr = NUM_LANES *
                            (4*s->id + (s->dmaLoad.buffer == A ? 0 : 1));
            for (int i = 0; i < NUM_LANES; i++) {
              state.vpm[vpmAddr+i].intVal = emuHeap[hp>>2];  <<<< access violation
              hp += 4*(s->readStride+1);
            }
            s->dmaLoad.active = false;
            break;
          }

It seems that s->dmaLoad.addr.intVal holds an invalid value.

mn416 · 2018-11-13T08:50:43Z

Thanks for the debug info.

The getPointer() method in SharedArray.h looks wrong:

  T* getPointer() {
    return (T*) &emuHeap[address];
  }

I think it should be:

  T* getPointer() {
    return (T*) address;
  }

Of course, the return value should never actually be dereferenced on the ARM side, only inside a kernel.

I don't think I rely on the current getPointer() definition anywhere, but it might be worth doing a grep -r just to check.

mn416 · 2018-11-13T09:35:33Z

Correction, I think it's:

  T* getPointer() {
    return (T*) (address*4);
  }
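
As a rough illustration of why a byte address (address*4) fits here: the emulator excerpt above indexes the heap with emuHeap[hp>>2], so it expects byte addresses relative to the heap rather than host pointers. The sketch below is a standalone model under that assumption, not the QPULib implementation. The names toByteAddress, loadWord and loadThroughTable are hypothetical, the heap is a plain std::vector of words, and the 16-lane vector behaviour is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Converts a word index into the byte address that a kernel-side pointer
// would hold (mirrors the `address*4` in the proposed getPointer()).
uint32_t toByteAddress(uint32_t wordIndex) { return wordIndex * 4; }

// Loads one word given a byte address, in the style of emuHeap[hp>>2].
// std::vector::at throws std::out_of_range instead of reading out of bounds.
uint32_t loadWord(const std::vector<uint32_t>& heap, uint32_t byteAddr) {
  return heap.at(byteAddr >> 2);
}

// Follows a pointer table: entry i of the table holds the byte address of
// a data array, and the first word of that array is returned.
uint32_t loadThroughTable(const std::vector<uint32_t>& heap,
                          uint32_t tableByteAddr, uint32_t i) {
  uint32_t elemAddr = loadWord(heap, tableByteAddr + 4 * i);
  return loadWord(heap, elemAddr);
}
```

If the table entries held host pointers such as &emuHeap[address] instead, shifting them right by two would most likely give an index far outside the heap, which would match the invalid dmaLoad address reported above.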

robiwano · 2018-11-13T09:45:06Z

> Correction, I think it's:
>
>   T* getPointer() {
>     return (T*) (address*4);
>   }

You mean the same implementation as getAddress()?

robiwano · 2018-11-13T09:49:59Z

OK, that seems to work in the emulator! Tonight I'll be able to try it on the Pi Zero.

robiwano · 2018-11-13T14:27:36Z

Hmm... I seem to get a crash when doing gather/receive. My test kernel accumulates input vectors into an output vector:

#define USE_GATHER_RECEIVE 1
void gpu_test(Int num_inputs, Int inputs_length, Ptr<Ptr<Float>> inputs, Ptr<Float> output)
{
	For(Int i = 0, i < num_inputs, i = i + 1) {
		Ptr<Float> ptr_in = inputs[i];
		Ptr<Float> ptr_out = output;
#if USE_GATHER_RECEIVE
		gather(ptr_in); gather(ptr_out);
#endif
		Float val_in, val_out;
		For(Int n = 0, n < inputs_length, n = n + 16) {
#if USE_GATHER_RECEIVE
			gather(ptr_in + 16); gather(ptr_out + 16);
			receive(val_in); receive(val_out);
			store(val_in + val_out, ptr_out);
#else
			val_in = *ptr_in;
			val_out = *ptr_out;
			*ptr_out = val_in + val_out;
#endif
			ptr_in = ptr_in + 16;
			ptr_out = ptr_out + 16;
		} End
#if USE_GATHER_RECEIVE
		receive(val_in); receive(val_out);
#endif
	} End
}

Setting USE_GATHER_RECEIVE to 0 yields the correct output, but setting it to 1 causes an access violation in Emulator.cpp:

        case SPECIAL_HOST_INT: {
          return;
        }
        case SPECIAL_TMU0_S: {
          assert(s->loadBuffer->numElems < 8);
          Vec val;
          for (int i = 0; i < NUM_LANES; i++) {         <<< i == 3
            uint32_t a = (uint32_t) v.elems[i].intVal;   <<< a = 0xcdcdcdcd
            val.elems[i].intVal = emuHeap[a>>2];      <<< access violation
          }
          s->loadBuffer->append(val);
          return;
        }
        default:
          break;
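
A note on the value a = 0xcdcdcdcd: 0xCD is the byte pattern that the Microsoft debug C runtime writes into freshly allocated heap memory. Seeing it here is consistent with lane 3 carrying an address that was never written, though the excerpt alone does not prove that. As a sketch of how such a read could be made to fail with a diagnosis rather than an access violation, the standalone helpers below (hypothetical names checkedLoad and looksUninitialised, not part of Emulator.cpp) bounds-check a byte address before indexing a word heap:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reads the word at `byteAddr` from a word-indexed heap. Returns false,
// leaving `out` untouched, if the address falls outside the heap.
bool checkedLoad(const std::vector<uint32_t>& heap, uint32_t byteAddr,
                 uint32_t& out) {
  uint32_t wordIndex = byteAddr >> 2;
  if (wordIndex >= heap.size()) return false;
  out = heap[wordIndex];
  return true;
}

// True if `value` matches the 0xCD fill pattern used by the MSVC debug
// heap for uninitialised allocations.
bool looksUninitialised(uint32_t value) { return value == 0xCDCDCDCDu; }
```

A check like this in the TMU load path would turn the crash into a reportable error that names the offending lane and address.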
