Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update from Deltares #7

Merged
merged 5 commits into from
Nov 5, 2023
Merged

Update from Deltares #7

merged 5 commits into from
Nov 5, 2023

Conversation

Peter9192
Copy link
Contributor

No description provided.

Copy link
Contributor Author

@Peter9192 Peter9192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, thanks a lot for all the work in updating the sfincs BMI!

I've been trying to use the new functionality today, but I've run into some issues that I haven't been able to resolve yet. I've left some comments below to try and explain. It seems the main problem is that we need the grid sizes to pass destination arrays to the getters (otherwise I get segfaults). However, the get_grid_size/get_var_size function doesn't seem to return the expected shapes.

I might continue to try and fix things over the weekend. Any help (also on Monday still) would be appreciated.

#include "sfincs_bmi.hxx"

// sfincs_bmi.f90 implements/exports the following bmi functions:
extern "C" int initialize(char *c_config_file);
extern "C" int update(double dt); // Doesn't seem to update current time
extern "C" int initialize(const char *c_config_file);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work well

extern "C" int get_component_name(char *name);

extern "C" int update();
extern "C" int update_until(double t);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BMI spec expects the target time, I think what's implemented now is the dt, i.e. target time - current time. I suppose we can work around that.

extern "C" void set_var(const char *c_var_name, float *xptr); // should be set get_value?
extern "C" void get_var_shape(const char *c_var_name, int *var_shape); // should be get_grid_shape
// extern "C" void get_var_rank(char *c_var_name, int *rank); // should be get_grid_rank
extern "C" int get_value(const char *c_var_name, void *x); // not exported; should be get_value?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to work like before, or as expected. Calling with

grid_id = 0
grid_rank = model.get_grid_rank(grid_id)
shape = model.get_grid_shape(grid_id, np.empty(grid_rank, dtype=int))

model.get_value('zs', np.zeros(shape[0]))
# or
model.get_value('zs', np.zeros(shape[1]))
# or
model.get_value('zs', np.zeros(886*908))

gives segmentation fault.
What is the shape supposed to be?

model.get_grid_shape(grid_id, np.empty(grid_rank, dtype=int))

gives array([313509, 32614], that seems very big!! In sfincs.ini, I have

mmax                 = 886
nmax                 = 908

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conclusion: the second number is meaningless, first number is total points. It is less than nmax*mmax, I guess because some points in the grid are masked/boundaries/...

extern "C" int set_value(const char *c_var_name, float *xptr); // should be set get_value?
extern "C" int set_value_at_indices(const char *name, int *inds, int count, void *src);

extern "C" int get_var_shape(const char *c_var_name, int *var_shape); // should be get_grid_shape
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BMI doesn't specify get_var_shape. Instead, it defines get_grid_shape for a grid_id that you can get through get_var_grid. I tried to implement a workaround in add99b2

However, it seems that the get_var_shape implementation in fortran is not returning what we expect. According to bmi spec (also see get_grid_rank) this should return an array of len 2 (in case of a rank 2 array). In the fortran code, it seems a len(1) array is used for var_shape instead.

Note: we use get_grid_shape to transform the 1d array that we get from get_value back to a 2d array

src/sfincs_bmi.cxx Show resolved Hide resolved
@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

Okay, after looking into this some more:

  • Internal data structure of sfincs is a quadtree (1d array with complex mapping of indices to grid x/y)
  • SFINCS BMI currently returns all flat arrays in their quadtree representation (?)
  • Hence, in client we should treat it as a 1d, unstructured mesh.
  • get_grid_rank should be consistent with that, so I've fixed it to 1 in 96e1c51 I've forced it to return 1 for now. Note that get_var_rank in fortran seems would also be a workable fallback.

Next challenge: get_value always seems to give segfaults.

PS: also unstructured meshes can (and should) be rank 2 according to BMI spec.

@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

get_value always seems to give segfaults.

  • dtype in fortran is REAL*4, so should use float32 for dest arrays
  • length of the array, e.g. zs in fortran corresponds to np, which equals 313509 for our case.

Debugging in fortran code now:

  • Something seems off with these blocks:
   case("zs") ! water level
     call c_f_pointer(ptr, f_ptr, [np])
     do i = 1, np
       f_ptr(i) = zs(i)
     end do

The segfaults happen as soon as f_ptr(i) = zs(i) is called. I'm no fortran expert, but could it be that this answer hints at the problem? i.e. f_ptr is a pointer to an array instead of an array of pointers?

Changing the code above to

   case("zs") ! water level
     call c_f_pointer(ptr, f_ptr, [np])
     f_ptr => zs

solves the segfaults and I get data back through bmi, although the values seem unrealistic and not stable (i.e. calling multiple times gives different results).

  • Alternatively, I tried calling the fortran get_value_ptr function inside c++'s get_value. This also does return data, but I observe similar behaviour as above: not realistic nor consistent.

@BSchilperoort
Copy link
Member

  • SFINCS BMI currently returns all flat arrays in their quadtree representation (?)

  • Hence, in client we should treat it as a 1d, unstructured mesh.

Seems that that's similar to Wflow.jl then. They use an unstructured grid .

The required (grid) BMI functions would then be:

And get_var_location(name) which always returns "node"

In the ewatercycle-wflowjl plugin I have already managed to get the ewatercycle helper functions to work with unstructured grids.

@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

Trying to get xsrc for easier checking of the values through grpc4bmi now gives another error:

ValueError                                Traceback (most recent call last)
Cell In[25], line 1
----> 1 model.get_value('xsrc', np.zeros(25, dtype=np.float32))

File ~/mambaforge/envs/ewatercycle/lib/python3.10/site-packages/grpc4bmi/bmi_grpc_client.py:236, in BmiClient.get_value(self, name, dest)
    234 try:
    235     response = self.stub.getValue(bmi_pb2.GetVarRequest(name=name))
--> 236     numpy.copyto(src=BmiClient.make_array(response), dst=dest)
    237     return dest
    238 except grpc.RpcError as e:

File <__array_function__ internals>:180, in copyto(*args, **kwargs)

ValueError: could not broadcast input array from shape (313509,) into shape (25,)

grpc4bmi allocates it's own arrays as well via https://github.com/eWaterCycle/grpc4bmi/blob/main/grpc4bmi/reserve.py#L7-L13, so these need to be correct as well. get_grid_size is the culprit, this is hardcoded in fortran to np.

@Peter9192
Copy link
Contributor Author

after 2dc147a model.get_value('ysrc', np.zeros(25, dtype=np.float32)) works, but values are showing similar suspicious behaviour. Perhaps due to wrong dtype in https://github.com/eWaterCycle/grpc4bmi/blob/main/grpc4bmi/reserve.py#L7-L13?

@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

Set values seems to work as expected.

Calling model.set_value('qsrc_1', 15*np.ones(25, dtype=np.float32)) sets all values to 15

model.set_value('qsrc_1', 15*np.ones(25, dtype=np.float)) gives segfault

@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

Modified get_value in fortran code as follows:

   function get_value(var_name, dest, n) result(ierr) bind(C, name="get_value")
      !DEC$ ATTRIBUTES DLLEXPORT :: get_value

      integer(kind=c_int) :: ierr, n
      character(kind=c_char), intent(in) :: var_name(*)
      real(c_float), dimension(n) :: dest

      character(len=strlen(var_name)) :: f_var_name
      f_var_name = char_array_to_string(var_name, strlen(var_name))

      ierr = ret_code%success

      write(*,*)'get_value called with', f_var_name

      select case(f_var_name)
       case("z_xz") ! x grid cell centre
         dest = z_xz
       case("z_yz") ! y grid cell centre
         dest = z_yz
       case("zs") ! water level
         dest = zs
       case("zb") ! bed level
         if(subgrid) then
            dest = subgrid_z_zmin
         else
            dest = subgrid_z_zmin
         end if
       case("qsrc_1")
         dest = qsrc(:, 1)
       case("qsrc_2")
         dest = qsrc(:,2)
       case("xsrc")
         write(*,*)'xsrc', xsrc
         dest = xsrc
       case("ysrc")
         dest = ysrc
       case("tsrc")
         dest = tsrc
       case("zst_bnd")
         dest = zst_bnd
       case default
         write(*,*) 'get_value error'
         ierr = ret_code%failure
      end select

   end function get_value

makes it work for sz and qsrc_1 and qsrc_2, but xsrc and ysrc now give segfaults.
Note that I'm now passing in the length of the array through the c++ wrapper.

inspired by https://stackoverflow.com/a/34562060

@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

Modified get_value in fortran code as follows:

patch in Deltares/SFINCS@2a4f2a5

@Peter9192 Peter9192 merged commit 2feba1d into eWaterCycle:main Nov 5, 2023
1 check passed
@Peter9192
Copy link
Contributor Author

Peter9192 commented Nov 5, 2023

Merged via #8 , still some notes. Copying to issue #9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants