nc::linalg::lstsq very slow #85
Comments
Hello, please feel free to scavenge things from my project; it might help you get rid of bottlenecks. I suspect a too-low tolerance setting: https://github.com/moe123/macadam/tree/master/macadam/details/numa
@dpilger26 not really working on that for now; however, the parameter description is missing, and I should add that. Check your implementation first: you might be trying to converge with too low a tolerance value, or with too-high expectations, which would explain why the reduction is very slow and runs to a very high number of iterations for nothing. BTW, by any standard an 8905 x 316 matrix is something very big, and beyond the dimensions, the very nature of the data must be known: is it well conditioned? The OP should reconsider his mathematical approach to the problem; maybe these are not the right tools here.
The NumPy lstsq function is faster. Benchmark details: lstsq of A (201, 200) and B (201, 1) takes 6 seconds in NumPy (Python) and about 98 seconds in NumCpp.
@Aditya-11 that does not mean anything; the two underlying implementations differ by a lot. What are your calling arguments, especially the tolerance?
Thanks @moe123, I will check this inTolerance parameter. Compiler used: clang. If the conditioning were bad, then the NumPy answer should also come out slower, I guess.
@Aditya-11 no, the NumPy implementation differs by a lot (it is a ~20-year-old project, so older than you): it also takes rcond into account, selects different strategies for decompositions, applies residual refinement steps, and so on. Conditioning is never "good" or "bad"; numerical analysis is not about pseudo-moralistic judgments but about facts; it is what it is, contextually. So your guess is just as good as your previous one, and a million thumbs-up won't change facts. What we are evidently talking about here is the logical relationship between a tolerance threshold and the nature of the input data, for a given algorithm using an SVD solver. The implied tolerance should not exceed something around ~1E-7 in the most optimistic scenario within the context of this implementation. What are your compiler flags? loop-vectorize? optimization for size? Also, is it too much to ask you all to upload your actual benchmarks? Until now, the discussion has been nothing but unverifiable assertions.
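To make the tolerance discussion concrete, here is an illustrative sketch (NumPy as a stand-in, not NumCpp code) of how the rcond-style cutoff of an SVD-based least-squares solver works: singular values below `rcond * sigma_max` are treated as zero, so the cutoff controls the effective rank the solver works with. The data is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((201, 200))   # same shape as the benchmark in this thread
b = rng.standard_normal((201, 1))

# numpy.linalg.lstsq exposes the singular-value cutoff directly via rcond:
# singular values smaller than rcond * largest_singular_value are zeroed.
x_loose, res, rank_loose, sv = np.linalg.lstsq(A, b, rcond=1e-7)
x_tight, _, rank_tight, _ = np.linalg.lstsq(A, b, rcond=None)  # machine-precision default

# Effective rank under each cutoff; for well-conditioned random data the
# cutoff changes nothing, but for ill-conditioned data it can differ a lot.
print(rank_loose, rank_tight)
```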
@moe123, I have changed the inTolerance argument to 1e-6, and it is still much slower than the Python implementation. Benchmark details: n = 200, a_1 (200 x 200), b_1 (200 x 1). In C++: 100-160 s; in Python: 6-8 s. Is there anything else to change? I had previously implemented the GMRES algorithm in Python using NumPy, then converted it to C++ using NumCpp, and it was slower, which is why I suspected lstsq might be the slow part.
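For reference, the 200 x 200 case above can be made reproducible on the NumPy side with a few lines; random data is an assumption here, since the actual matrices were never shared in the thread.

```python
import time
import numpy as np

n = 200
rng = np.random.default_rng(42)
a_1 = rng.standard_normal((n, n))     # stand-in for the OP's a_1 (200 x 200)
b_1 = rng.standard_normal((n, 1))     # stand-in for the OP's b_1 (200 x 1)

t0 = time.perf_counter()
x, residuals, rank, sv = np.linalg.lstsq(a_1, b_1, rcond=None)
elapsed = time.perf_counter() - t0

print(f"numpy lstsq: {elapsed:.4f} s, rank={rank}")
# sanity check: for a full-rank square system the least-squares
# solution solves the system exactly (up to rounding)
print(np.allclose(a_1 @ x, b_1))
```

Posting the equivalent NumCpp snippet and its timing alongside this would give the apples-to-apples comparison the thread keeps asking for.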
@Aditya-11 do you know the definition of science? The very definition is a set of methods such that what is done by one can be reproduced by another.* What is your Valgrind output or similar? Where does the CPU time actually go? (*) So would you share the content of your data, so that an apples-to-apples comparison can take place? The same remark goes for the OP. To address an issue, you need to understand it, which means literally circumscribing it; this is true of anything in this world, but it requires making an effort, and we are not there yet. -1. Does the fact that NumPy runs faster help by itself? No; it is a given reference which we would very much like to reproduce or approach. -2. Does the substance of this issue reside in repeating, over and over, that one is faster and the other slower, without exposing the entire problem properly and openly, in a context that can be shared and that anyone can reproduce? No; the real work begins once that simple basic step is achieved and resolved (for now, you have failed at it, and the same goes for the OP): namely, what set of rational options do we have for improving the actual algorithm?
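On the "where does the CPU time go" question: for the C++ side one would use Valgrind/callgrind or perf, but the idea can be sketched with Python's built-in profiler on the NumPy baseline (the profiler choice and random data are assumptions, not the OP's setup).

```python
import cProfile
import io
import pstats

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((201, 200))
b = rng.standard_normal((201, 1))

# Profile a single lstsq call and report where cumulative time is spent.
pr = cProfile.Profile()
pr.enable()
np.linalg.lstsq(A, b, rcond=None)
pr.disable()

s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(5)
report = s.getvalue()
print(report)
```

A comparable callgrind report for the NumCpp run would show whether the time sits in the SVD iteration itself or somewhere avoidable.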
Hi @Aditya-11, can you provide the flags used in CMake? It might be something related to the compilation of the library/executable itself, threading, etc. Both NumPy and this package rely on LAPACK, which, as others mentioned, is quite old and stable.
Describe the bug
It takes forever to do SVD::decompose()
To Reproduce
Run linalg.lstsq with a big NdArray; mine was 8905 rows and 316 cols.
Expected behavior
It should be much faster.