-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cpu: move all the operators into a separate c++ file (except mul_mat) #1167
Conversation
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Thanks for the review comments! I've fixed the suggestions, except the one about ggml/src/ggml-cpu/ggml-cpu-impl.h Lines 42 to 74 in f06264e
Maybe we could move those lines into a different file (possibly in |
I missed those. In this case, let's leave the include as it is.
I think these CPU feature defines have to remain in |
Thanks. Should I take a deep breath and merge this in? I've tested on the runners in ggml-ci, and parsed the include dependency order a couple of times in my head (as much as I could). Will read through the PR one more time, with a fine comb. |
Yes, squash-merge and then I'll sync to |
…esent in ggml-cpu.c previously
Tested the latest changes (PR comments) on ggml-ci . All tests pass - 6f4ce79 Squash-merging.. |
The CI build failure on the I created a new branch (from master) and ran the CI again, and it ran successfully - https://github.com/ggml-org/ggml/commits/cmdr2-test-master-after-cpu-refactor/ And the CUDA tests pass on my local PC as well. The CI build failure on the Not sure why there was a transient error in the mean-square-error for |
Some of the tests occasionally exceed the error threshold, so it is expected to have them failing from time to time. I think this is such case, though let's keep an eye out and see if it happens again. |
// This shrinks the size of
ggml-cpu.c
to around 3500 lines (down from 15k lines a few weeks ago). A future PR will organizeops.cpp
into separate files, and use templates to simplify it further.It does not modify any lines in
ggml-cpu.c
, it only removes lines. Actual diff.This PR looks big, but only does a few simple things:
simd-mappings.h
vec.h
(andvec.cpp
for non-inline functions)ops.cpp
(exceptmul_mat
, which has some C-specific behavior)Detailed changes:
ggml_float
was being cast implicitly tofloat
. diffextern
.GGML_UNUSED
directly instead of theUNUSED
macro.float *
orggml_fp16_t *
in functions that expected that type, e.g.ggml_fp16_to_fp32_row()
.static_cast<ggml_op_pool>(op)
for converting an int to an enum.ggml_get_type_traits_cpu()
function instead of accessingtype_traits_cpu
directly.std::hardware_destructive_interference_size
instead ofhardware_destructive_interference_size
.GGML_UNUSED
directly instead of theUNUSED
macro.This change passes all the runners at ggml-ci.
Please let me know if this approach requires any changes, happy to fix!
Thanks!