-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
intrinsics module with alternative implementations #915
base: master
Are you sure you want to change the base?
Changes from 30 commits
08ec0aa
c36251e
2207f41
2bc7af9
4625205
243ea6f
c38dcd6
bf1ce2f
cc9df61
671fd61
75945f1
d05903f
c0d96e5
4abd8d3
7c6e8a4
7cea1fd
eaffa4a
ad64162
a3d24e4
5a1fdcb
47396ac
ecb7050
87ef502
14be974
aaa68bc
cc232e1
6e36b6f
65175d7
8a35f38
16a0e96
f0ed271
316269b
52aab02
a6be0a0
3e171f7
332b748
537cef8
a4370c2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,158 @@ | ||||||||||
--- | ||||||||||
title: intrinsics | ||||||||||
--- | ||||||||||
|
||||||||||
# The `stdlib_intrinsics` module | ||||||||||
|
||||||||||
[TOC] | ||||||||||
|
||||||||||
## Introduction | ||||||||||
|
||||||||||
The `stdlib_intrinsics` module provides replacements for some of the well known intrinsic functions found in Fortran compilers for which either a faster and/or more accurate implementation is found which has also proven of interest to the Fortran community. | ||||||||||
jvdp1 marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
||||||||||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||||||||||
### `stdlib_sum` function | ||||||||||
|
||||||||||
#### Description | ||||||||||
|
||||||||||
The `stdlib_sum` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential as well as reducing the round-off error. This procedure is recommended when summing large arrays, for repetitive summation of smaller arrays consider the classical `sum`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is it not for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No specific reason, when implementing the first version my first need was for reals and so that's what I proposed here. I can test if it also brings benefits for integers and extend the template. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if it should support also There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From what I see in the code, implementing it would "just" be a matter of adding integer |
||||||||||
|
||||||||||
#### Syntax | ||||||||||
|
||||||||||
`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x [,mask] )` | ||||||||||
|
||||||||||
`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x, dim [,mask] )` | ||||||||||
|
||||||||||
#### Status | ||||||||||
|
||||||||||
Experimental | ||||||||||
|
||||||||||
#### Class | ||||||||||
|
||||||||||
Pure function. | ||||||||||
|
||||||||||
#### Argument(s) | ||||||||||
|
||||||||||
`x`: N-D array of either `real` or `complex` type. This argument is `intent(in)`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Suggested change
|
||||||||||
|
||||||||||
`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`. | ||||||||||
|
||||||||||
`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`. | ||||||||||
|
||||||||||
#### Output value or Result value | ||||||||||
|
||||||||||
If `dim` is absent, the output is a scalar of the same `type` and `kind` as to that of `x`. Otherwise, an array of rank n-1, where n equals the rank of `x`, and a shape similar to that of `x` with dimension `dim` dropped is returned. | ||||||||||
|
||||||||||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||||||||||
### `stdlib_sum_kahan` function | ||||||||||
|
||||||||||
#### Description | ||||||||||
|
||||||||||
The `stdlib_sum_kahan` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential complemented by an `elemental` kernel based on the [kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) strategy to reduce the round-off error: | ||||||||||
|
||||||||||
```fortran | ||||||||||
elemental subroutine kahan_kernel_<kind>(a,s,c) | ||||||||||
type(<kind>), intent(in) :: a | ||||||||||
type(<kind>), intent(inout) :: s | ||||||||||
type(<kind>), intent(inout) :: c | ||||||||||
type(<kind>) :: t, y | ||||||||||
y = a - c | ||||||||||
t = s + y | ||||||||||
c = (t - s) - y | ||||||||||
s = t | ||||||||||
end subroutine | ||||||||||
``` | ||||||||||
|
||||||||||
#### Syntax | ||||||||||
|
||||||||||
`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x [,mask] )` | ||||||||||
|
||||||||||
`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x, dim [,mask] )` | ||||||||||
|
||||||||||
#### Status | ||||||||||
|
||||||||||
Experimental | ||||||||||
|
||||||||||
#### Class | ||||||||||
|
||||||||||
Pure function. | ||||||||||
|
||||||||||
#### Argument(s) | ||||||||||
|
||||||||||
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`. | ||||||||||
|
||||||||||
`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`. | ||||||||||
|
||||||||||
#### Output value or Result value | ||||||||||
|
||||||||||
If `dim` is absent, the output is a scalar of the same `type` and `kind` as to that of `x`. Otherwise, an array of rank n-1, where n equals the rank of `x`, and a shape similar to that of `x` with dimension `dim` dropped is returned. | ||||||||||
|
||||||||||
#### Example | ||||||||||
|
||||||||||
```fortran | ||||||||||
{!example/intrinsics/example_sum.f90!} | ||||||||||
``` | ||||||||||
|
||||||||||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||||||||||
### `stdlib_dot_product` function | ||||||||||
|
||||||||||
#### Description | ||||||||||
|
||||||||||
The `stdlib_dot_product` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential as well as reducing the round-off error. This procedure is recommended when crunching large arrays, for repetitive products of smaller arrays consider the classical `dot_product`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
#### Syntax | ||||||||||
|
||||||||||
`res = ` [[stdlib_intrinsics(module):stdlib_dot_product(interface)]] ` (x, y)` | ||||||||||
|
||||||||||
#### Status | ||||||||||
|
||||||||||
Experimental | ||||||||||
|
||||||||||
#### Class | ||||||||||
|
||||||||||
Pure function. | ||||||||||
|
||||||||||
#### Argument(s) | ||||||||||
|
||||||||||
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`. | ||||||||||
|
||||||||||
#### Output value or Result value | ||||||||||
|
||||||||||
The output is a scalar of `type` and `kind` same as to that of `x` and `y`. | ||||||||||
|
||||||||||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||||||||||
### `stdlib_dot_product_kahan` function | ||||||||||
|
||||||||||
#### Description | ||||||||||
|
||||||||||
The `stdlib_dot_product_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential , complemented by the same `elemental` kernel based on the [kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) used for `stdlib_sum` to reduce the round-off error. | ||||||||||
jalvesz marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
||||||||||
#### Syntax | ||||||||||
|
||||||||||
`res = ` [[stdlib_intrinsics(module):stdlib_dot_product_kahan(interface)]] ` (x, y)` | ||||||||||
|
||||||||||
#### Status | ||||||||||
|
||||||||||
Experimental | ||||||||||
|
||||||||||
#### Class | ||||||||||
|
||||||||||
Pure function. | ||||||||||
|
||||||||||
#### Argument(s) | ||||||||||
|
||||||||||
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`. | ||||||||||
|
||||||||||
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`. | ||||||||||
|
||||||||||
#### Output value or Result value | ||||||||||
|
||||||||||
The output is a scalar of `type` and `kind` same as to that of `x` and `y`. | ||||||||||
|
||||||||||
```fortran | ||||||||||
{!example/intrinsics/example_dot_product.f90!} | ||||||||||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
ADD_EXAMPLE(sum) | ||
ADD_EXAMPLE(dot_product) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
program example_dot_product | ||
use stdlib_kinds, only: sp | ||
use stdlib_intrinsics, only: stdlib_dot_product, stdlib_dot_product_kahan | ||
implicit none | ||
|
||
real(sp), allocatable :: x(:), y(:) | ||
real(sp) :: total_prod(3) | ||
|
||
allocate( x(1000), y(1000) ) | ||
call random_number(x) | ||
call random_number(y) | ||
|
||
total_prod(1) = dot_product(x,y) !> compiler intrinsic | ||
total_prod(2) = stdlib_dot_product(x,y) !> chunked summation over inner product | ||
total_prod(3) = stdlib_dot_product_kahan(x,y) !> chunked kahan summation over inner product | ||
print *, total_prod(1:3) | ||
|
||
end program example_dot_product |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
program example_sum | ||
use stdlib_kinds, only: sp | ||
use stdlib_intrinsics, only: stdlib_sum, stdlib_sum_kahan | ||
implicit none | ||
|
||
real(sp), allocatable :: x(:) | ||
real(sp) :: total_sum(3) | ||
|
||
allocate( x(1000) ) | ||
call random_number(x) | ||
|
||
total_sum(1) = sum(x) !> compiler intrinsic | ||
total_sum(2) = stdlib_sum(x) !> chunked summation | ||
total_sum(3) = stdlib_sum_kahan(x)!> chunked kahan summation | ||
print *, total_sum(1:3) | ||
|
||
end program example_sum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.