intrinsics module with alternative implementations #915
One philosophical question: should the fsum interface be renamed to

Regarding the Kahan versions, given that the accuracy gains of the pure chunked version and the Kahan one are close, I'm wondering what level of support should be provided for switching between them.
IMHO shorter names are better, and I don't see a problem if they overlap with the intrinsics. First, because one can always pick the right version:

    use stdlib_intrinsics, only: dot_product

vs.

    ! Force using intrinsic
    intrinsic :: dot_product

And then because they can be augmented by more/different arguments:

    c = dot_product(a,b) ! intrinsic
    c = dot_product(a,b,mode='kahan') ! stdlib
    c = dot_product(a,b,mode='blocked') ! stdlib
    ...

I find this more elegant and definitely not confusing.
Thank you @jalvesz. LGTM. It seems close to being ready for merging.
#### Description

The `stdlib_sum` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation that maximizes vectorization potential and reduces round-off error. This procedure is recommended when summing large arrays; for repeated summation of smaller arrays, consider the classical `sum`.
Why is it not for `integer`?
No specific reason. When implementing the first version, my first need was for reals, so that's what I proposed here. I can test whether it also brings benefits for integers and extend the template.
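
For readers following the thread, the chunked approach described in the `stdlib_sum` documentation above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the module name `chunked_sum_sketch`, the function `chunked_sum`, and the chunk size of 64 are all assumptions made for the example, and only `real(dp)` input is covered.

```fortran
module chunked_sum_sketch
  implicit none
  private
  public :: dp, chunked_sum
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: chunk = 64   ! assumed chunk size, not taken from the PR
contains
  pure function chunked_sum(a) result(s)
    real(dp), intent(in) :: a(:)
    real(dp) :: s
    real(dp) :: acc(chunk)           ! independent partial sums
    integer :: i, n, r

    n = size(a)
    r = mod(n, chunk)
    acc = 0.0_dp
    ! The leftover elements seed the first r partial sums
    acc(1:r) = a(1:r)
    ! Each pass adds one full chunk element-wise, so the additions are
    ! independent of each other and easy for a compiler to vectorize
    do i = r + 1, n, chunk
      acc = acc + a(i:i+chunk-1)
    end do
    ! Final reduction of the partial sums
    s = sum(acc)
  end function chunked_sum
end module chunked_sum_sketch
```

Because each partial sum only accumulates a fraction of the elements, the running totals stay closer in magnitude to the values being added, which is where the reduced round-off comes from. A caller would simply `use chunked_sum_sketch` and write `s = chunked_sum(x)`.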
#### Description

The `stdlib_dot_product_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation that maximizes vectorization potential, complemented by the same `elemental` kernel based on the [Kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) used for `stdlib_sum` to reduce round-off error.
Is the license of Wikipedia compatible with the MIT license of stdlib?
I did not take content directly from Wikipedia; I just cited the wiki page that summarizes the Kahan summation algorithm. Would such a citation be problematic?
Ok. Most content of Wikipedia is under CC BY-SA 4.0, which states: "Share Alike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original."
To avoid any potential confusion or issues, could you perhaps reference the original paper instead: https://doi.org/10.1145%2F363707.363723 ?
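
As a rough illustration of the documented design (a chunked loop combined with an `elemental` Kahan kernel), a sketch could look like the one below. It is not the PR's implementation: the names `kahan_dot_sketch`, `kahan_add`, and `kahan_dot`, as well as the chunk size, are assumptions for this example, and only `real(dp)` arrays of equal size are handled.

```fortran
module kahan_dot_sketch
  implicit none
  private
  public :: dp, kahan_dot
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: chunk = 64   ! assumed chunk size, not taken from the PR
contains
  ! One compensated update: add x to s, tracking the lost low-order part in c
  elemental subroutine kahan_add(x, s, c)
    real(dp), intent(in)    :: x
    real(dp), intent(inout) :: s, c
    real(dp) :: y, t
    y = x - c          ! apply the correction carried from the previous step
    t = s + y          ! low-order digits of y may be lost here
    c = (t - s) - y    ! recover what was lost
    s = t
  end subroutine kahan_add

  pure function kahan_dot(a, b) result(p)
    real(dp), intent(in) :: a(:), b(:)   ! assumes size(b) == size(a)
    real(dp) :: p
    real(dp) :: acc(chunk), comp(chunk), cp
    integer :: i, n, r

    n = size(a)
    r = mod(n, chunk)
    acc = 0.0_dp
    comp = 0.0_dp
    ! Leftover products seed the first r partial sums
    acc(1:r) = a(1:r)*b(1:r)
    ! The elemental kernel is applied to a whole chunk of products at once
    do i = r + 1, n, chunk
      call kahan_add(a(i:i+chunk-1)*b(i:i+chunk-1), acc, comp)
    end do
    ! Compensated reduction of the partial sums
    p = 0.0_dp
    cp = 0.0_dp
    do i = 1, chunk
      call kahan_add(acc(i), p, cp)
    end do
  end function kahan_dot
end module kahan_dot_sketch
```

Note that value-unsafe optimization flags (for example `-ffast-math` in GCC) allow a compiler to simplify `(t - s) - y` to zero, which would silently remove the compensation.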
!! This interface provides a standard-conforming call for the sum of elements of arrays of any rank.
!! The 1-D base implementation follows a chunked approach for optimizing performance and increasing accuracy.
!! The `N-D` interfaces call upon the `(N-1)-D` implementation.
!! Supported data types include `real` and `complex`.
Why are integers not supported?
I prefer to keep `use stdlib_intrinsics, only: dot_product => stdlib_dot_product`. With this approach, the user will not inadvertently use the stdlib implementation.
This approach would break backward compatibility with the intrinsics. IMO I prefer the previous approach (either an overlap, or a name with a prefix
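
To make the opt-in renaming discussed above concrete, here is a small sketch of how a prefixed name combined with a `use` rename behaves. The modules `demo_intrinsics` and `demo_caller` and the function `squared_norm` are hypothetical stand-ins, and the plain loop inside `stdlib_dot_product` is only a placeholder, not the library's implementation.

```fortran
! Hypothetical stand-in for stdlib_intrinsics: only the prefixed name is exported
module demo_intrinsics
  implicit none
  private
  public :: dp, stdlib_dot_product
  integer, parameter :: dp = kind(1.0d0)
contains
  pure function stdlib_dot_product(a, b) result(p)
    real(dp), intent(in) :: a(:), b(:)   ! assumes size(b) == size(a)
    real(dp) :: p
    integer :: i
    p = 0.0_dp
    do i = 1, size(a)
      p = p + a(i)*b(i)
    end do
  end function stdlib_dot_product
end module demo_intrinsics

! A caller that opts in explicitly: inside this module, dot_product refers to
! the renamed module procedure and no longer to the intrinsic
module demo_caller
  use demo_intrinsics, only: dp, dot_product => stdlib_dot_product
  implicit none
  private
  public :: squared_norm
contains
  pure function squared_norm(x) result(s)
    real(dp), intent(in) :: x(:)
    real(dp) :: s
    s = dot_product(x, x)   ! resolves to stdlib_dot_product via the rename
  end function squared_norm
end module demo_caller
```

Code that does not write the rename keeps getting the intrinsic `dot_product`, which is the "not inadvertently" property argued for in the comment; the trade-off is that every caller who wants the alternative version has to spell out the rename.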
Co-authored-by: Jeremie Vandenplas <[email protected]>
Add an intrinsics module containing replacements for intrinsic functions where an alternative offers an interesting feature: a faster implementation, better accuracy, or both.
This PR follows the discussion on Discourse https://fortran-lang.discourse.group/t/lfortran-now-supports-all-intrinsic-functions/8844/41 and is based on https://github.com/jalvesz/fast_math
- `stdlib_sum` and `stdlib_sum_kahan`
- `stdlib_dot_product` and `stdlib_dot_product_kahan`

cc: @fortran-lang/stdlib @perazz @certik @jvdp1