
Is there any way to use self-defined threads when setting up multithreading? #1155

Closed
jiajia1417 opened this issue Jan 14, 2025 · 1 comment

jiajia1417 commented Jan 14, 2025

Problem description:
I have a matrix multiplication that needs to be called from externally managed threads.

arm_compute::IScheduler& scheduler = arm_compute::Scheduler::get();
scheduler.set_num_threads(total_thread);

void run_acl_gemm_thread(
    float* out, const float* x, const float* weight,
    int num_tokens, int in_channels, int out_channels,
    float alpha, float beta, int total_thread, int threadid) {

    arm_compute::Tensor tensor_x, tensor_weight, tensor_out;

    // ACL TensorShape takes (width, height): x is num_tokens x in_channels,
    // weight is out_channels x in_channels, out is num_tokens x out_channels.
    arm_compute::TensorShape shape_x(in_channels, num_tokens);
    arm_compute::TensorShape shape_weight(in_channels, out_channels);
    arm_compute::TensorShape shape_out(out_channels, num_tokens);

    // Describe each tensor as FP32 with the shapes above.
    tensor_x.allocator()->init(arm_compute::TensorInfo(shape_x, 1, arm_compute::DataType::F32));
    tensor_weight.allocator()->init(arm_compute::TensorInfo(shape_weight, 1, arm_compute::DataType::F32));
    tensor_out.allocator()->init(arm_compute::TensorInfo(shape_out, 1, arm_compute::DataType::F32));

    // Wrap the caller-owned buffers instead of allocating new memory.
    tensor_x.allocator()->import_memory(const_cast<float*>(x));
    tensor_weight.allocator()->import_memory(const_cast<float*>(weight));
    tensor_out.allocator()->import_memory(out);

    // D = alpha * A * B + beta * C (no C tensor here), with B pre-transposed.
    arm_compute::NEGEMM gemm;
    arm_compute::GEMMInfo gemm_info;
    gemm_info.set_pretranspose_B(true);
    gemm.configure(&tensor_x, &tensor_weight, nullptr, &tensor_out, alpha, beta, gemm_info);
    gemm.run();
}

But since I have already created other threads externally, is there any way to make the scheduler use my own multithreading?

std::vector<std::thread> threads;
for (int thread_id = 0; thread_id < total_thread; ++thread_id) {
    threads.emplace_back(run_acl_gemm_thread, out, x, weight,
                         num_tokens, in_channels, out_channels,
                         alpha, beta, total_thread, thread_id);
}

like this, so that the threads used by the scheduler point to these externally created threads.

@morgolock

Hi @jiajia1417

Just to clarify: arm_compute::IScheduler& scheduler is used internally by the NEGEMM function to create multiple threads for the computation. Each thread will run the same kernel on different non-overlapping slices of the data. For example, if there are 4 CPU cores, NEGEMM will use 4 threads and divide the work equally among them.
The scheduler's job is to create the threads for each kernel and execute the workloads. You do not have to set the scheduler explicitly; NEGEMM already knows how to use it.

From what I read above, you would like to run NEGEMM as a whole from multiple threads of your own, is this correct?
