Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mesh_stats checks and field_operations features #51

Open
wants to merge 35 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
0e323e9
Added implementations of mesh statistic check and field assignment check
ryar9534 May 7, 2024
420a72c
Initial commit
alexbenedicto Jul 12, 2024
3e40aaa
Updating mesh_stats files from Aronson
alexbenedicto Jul 12, 2024
f8d19d5
Mesh stats completed with node sharing number and better parsing format.
alexbenedicto Aug 8, 2024
cde3385
Fields range of values are now being checked by mesh_stats
alexbenedicto Aug 13, 2024
656da06
yapf formatting
alexbenedicto Aug 14, 2024
4e34794
Check for disconnected nodes added
alexbenedicto Aug 15, 2024
50a0ee9
Test file created for mesh stats
alexbenedicto Aug 15, 2024
be2aa69
Disconnected nodes added to parsing
alexbenedicto Aug 15, 2024
38bde63
Check for NaN values added + fields from FieldData now checked
alexbenedicto Aug 16, 2024
b4c0f18
Number of cells neighbors + number of disconnected cells added
alexbenedicto Aug 16, 2024
6f774ec
To revert
alexbenedicto Sep 13, 2024
548ae2a
To revert
alexbenedicto Sep 16, 2024
6285d64
Merge branch 'origin/benedicto/followUpMeshStats' of https://github.c…
alexbenedicto Sep 16, 2024
eaab705
Mesh_stats tests done
alexbenedicto Oct 15, 2024
645f3e4
Bug fix where args of mesh_doctor.py execution would be mesh_stats
alexbenedicto Oct 21, 2024
404bc22
Change name from "add_fields" to "fields_manipulation"
alexbenedicto Nov 6, 2024
0676967
intermediate commit
alexbenedicto Nov 12, 2024
ea4eb84
Changed names from manipulations to operations
alexbenedicto Nov 13, 2024
983b3bc
vtk_utils new functions to read .vtm file directly or from a .pvd file
alexbenedicto Nov 15, 2024
a613e08
First version working with "cell" as support
alexbenedicto Nov 18, 2024
e863fd1
Added points as support
alexbenedicto Nov 20, 2024
7de5d87
Tests of vtk_utils added
alexbenedicto Nov 27, 2024
60fad80
Bugfix and update of test caes for vtk_utils
alexbenedicto Dec 6, 2024
23d1f6a
Added get_all_array_names to vtk_utils
alexbenedicto Dec 6, 2024
01b9af1
Add numexpr package for mesh_doctor
alexbenedicto Dec 13, 2024
9002e9c
copy and creation of fields added
alexbenedicto Dec 13, 2024
c0ce7f6
Testing of field_operations
alexbenedicto Dec 13, 2024
bd814da
Changed storages of fields from dict to tuple
alexbenedicto Dec 16, 2024
8c14003
Allow modification of options copy and create used depending on the a…
alexbenedicto Dec 18, 2024
67c7d0b
Update of tests
alexbenedicto Dec 18, 2024
479a1b2
Documentation of mesh_stats and field_operations with update from dev…
alexbenedicto Dec 18, 2024
4b200a7
Improved implementation of operations
alexbenedicto Dec 18, 2024
ce8f650
Merge branch 'main' into origin/benedicto/followUpMeshStats
alexbenedicto Dec 18, 2024
72741ee
yapf formatting
alexbenedicto Dec 19, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
68 changes: 63 additions & 5 deletions docs/geos-mesh.rst
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,47 @@ Cells with negative volumes will typically be an issue for ``geos`` and should b
-h, --help show this help message and exit
--min 0.0 [float]: The minimum acceptable volume. Defaults to 0.0.

``field_operations``
""""""""""""""""""""""""""

Using a source file containing PointData or CellData, allows to perform operations on that data from the source file to an output .vtu file.
The source file can be a .vtu, .vtm or .pvd file as long as the geometry of the multiblock corresponds to the geometry of the output .vtu file.
An example of source file can be the vtkOutput.pvd from a GEOS simulation and the output file can be your VTK mesh used in this simulation.
The term 'operations' covers two distinct categories:
'COPY' operations which copies data arrays from the source file to the output file, with the possibility to rename the arrays copied and to apply multiplication or addition to these arrays.
'CREATE' operations which uses the source file data to create new arrays by performing addition between several arrays, applying log or sqrt functions ... allowed by the numexpr library.

.. code-block::

$ python src/geos/mesh/doctor/mesh_doctor.py field_operations --help
usage: mesh_doctor.py field_operations [-h] [--support point, cell] [--source SOURCE] [--operations OPERATIONS]
[--which_vtm WHICH_VTM] --output OUTPUT [--data-mode binary, ascii]

options:
-h, --help show this help message and exit
--support point, cell
[string]: Where to define field.
--source SOURCE [string]: Where field data to use for operation comes from .vtu, .vtm or .pvd file.
--operations OPERATIONS
[list of string comma separated]: The syntax here is function0:new_name0, function1:new_name1, ...
Allows to perform a wide arrays of operations to add new data to your output mesh using the source file data.
Examples are the following:
1. Copy of the field 'poro' from the input to the ouput with 'poro:poro'.
2. Copy of the field 'PERM' from the input to the ouput with a multiplication of the values by 10 with 'PERM*10:PERM'.
3. Copy of the field 'TEMP' from the input to the ouput with an addition to the values by 0.5 and change the name of the field to 'temperature' with 'TEMP+0.5:TEMPERATURE'.
4. Create a new field 'NEW_PARAM' using the input 'PERM' field and having the square root of it with 'sqrt(PERM):NEW_PARAM'.
Another method is to use precoded functions available which are:
1. 'distances_mesh_center' will create a field where the distances from the mesh centerare calculated for all the elements chosen as support. To use: 'distances_mesh_center:NEW_FIELD_NAME'.
2. 'random' will create a field with samples from a uniform distribution over (0, 1). To use: 'random:NEW_FIELD_NAME'.
--which_vtm WHICH_VTM
[string]: If your input is a .pvd, choose which .vtm (each .vtm corresponding to a unique timestep) will be used for the operation.
To do so, you can choose amongst these possibilities: 'first' will select the initial timestep;
'last' will select the final timestep; or you can enter directly the index starting from 0 of the timestep (not the time).
By default, the value is set to 'last'.
--output OUTPUT [string]: The vtk output file destination.
--data-mode binary, ascii
[string]: For ".vtu" output format, the data mode can be binary or ascii. Defaults to binary.

``fix_elements_orderings``
""""""""""""""""""""""""""

Expand Down Expand Up @@ -229,6 +270,27 @@ The ``generate_global_ids`` can generate `global ids` for the imported ``vtk`` m
--data-mode binary, ascii
[string]: For ".vtu" output format, the data mode can be binary or ascii. Defaults to binary.

``mesh_stats``
"""""""""""""""""

Performs a summary over certain geometrical, topological and data painting mesh properties.
The future goal for this feature would be to provide a deeper mesh analysis and to evaluate the 'quality' of this mesh before using it in GEOS.

.. code-block::

$ python src/geos/mesh/doctor/mesh_doctor.py mesh_stats --help
usage: mesh_doctor.py mesh_stats [-h] --write_stats [0, 1] [--disconnected [0, 1]]
[--field_values [0, 1]] [--output OUTPUT]

options:
-h, --help show this help message and exit
--write_stats [0, 1] [int]: The stats of the mesh will be printed in a file to the folder specified in --output.
--disconnected [0, 1]
[int]: Display all disconnected nodes ids and disconnected cell ids.
--field_values [0, 1]
[int]: Display all range of field values that seem not realistic.
--output OUTPUT [string]: The output folder destination where the stats will be written.

``non_conformal``
"""""""""""""""""

Expand Down Expand Up @@ -339,8 +401,4 @@ API
^^^

.. automodule:: geos.mesh.conversion.abaqus_converter
:members:




:members:
1 change: 1 addition & 0 deletions docs/requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,3 +9,4 @@ vtk >= 9.1
networkx >= 2.4
tqdm
numpy
numexpr
1 change: 1 addition & 0 deletions geos-mesh/pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ dependencies = [
"networkx >= 2.4",
"tqdm",
"numpy",
"numexpr",
"meshio>=5.3.2",
]

Expand Down
276 changes: 276 additions & 0 deletions geos-mesh/src/geos/mesh/doctor/checks/field_operations.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,276 @@
import logging
from numexpr import evaluate
from dataclasses import dataclass
from geos.mesh.doctor.checks.vtk_utils import ( VtkOutput, get_points_coords_from_vtk, get_cell_centers_array,
get_vtm_filepath_from_pvd, get_vtu_filepaths_from_vtm,
get_all_array_names, read_mesh, write_mesh )
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move the "local" imports at the end

from numpy import array, empty, full, sqrt, int64, nan
from numpy.random import rand
from scipy.spatial import KDTree
from tqdm import tqdm
from vtkmodules.util.numpy_support import numpy_to_vtk, vtk_to_numpy
from vtkmodules.vtkCommonDataModel import vtkUnstructuredGrid


@dataclass( frozen=True )
class Options:
support: str # choice between 'cell' and 'point' to operate on fields
source: str # file from where the data is collected
operations: list[ tuple[ str ] ] # [ ( function0, new_name0 ), ... ]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the comment, it seems like we expect exactly 2 str in the tuple. Would something like list[ tuple[ str, str ] ] work?

vtm_index: int # useful when source is a .pvd or .vtm file
vtk_output: VtkOutput


@dataclass( frozen=True )
class Result:
info: bool


__SUPPORT_CHOICES = [ "point", "cell" ]
support_construction: dict[ str, tuple[ any ] ] = {
__SUPPORT_CHOICES[ 0 ]: get_points_coords_from_vtk,
__SUPPORT_CHOICES[ 1 ]: get_cell_centers_array
}


def get_distances_mesh_center( mesh: vtkUnstructuredGrid, support: str ) -> array:
f"""For a specific support type {__SUPPORT_CHOICES}, returns a numpy array filled with the distances between
their coordinates and the center of the mesh.

Args:
support (str): Choice between {__SUPPORT_CHOICES}.

Returns:
array: [ distance0, distance1, ..., distanceN ] with N being the number of support elements.
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a great opportunity to use your support_construction defined at L30.

if support == __SUPPORT_CHOICES[ 0 ]:
coords: array = get_points_coords_from_vtk( mesh )
elif support == __SUPPORT_CHOICES[ 1 ]:
coords = get_cell_centers_array( mesh )
else:
raise ValueError( f"For support, the only choices available are {__SUPPORT_CHOICES}." )

center = ( coords.max( axis=0 ) + coords.min( axis=0 ) ) / 2
distances = empty( coords.shape[ 0 ] )
for i in range( coords.shape[ 0 ] ):
distance_squared: float = 0.0
coord = coords[ i ]
for j in range( len( coord ) ):
distance_squared += ( coord[ j ] - center[ j ] ) * ( coord[ j ] - center[ j ] )
distances[ i ] = distance_squared
distances = sqrt( distances )
return distances
Comment on lines +46 to +62
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try to simplify the whole function, and do something like

Suggested change
if support == __SUPPORT_CHOICES[ 0 ]:
coords: array = get_points_coords_from_vtk( mesh )
elif support == __SUPPORT_CHOICES[ 1 ]:
coords = get_cell_centers_array( mesh )
else:
raise ValueError( f"For support, the only choices available are {__SUPPORT_CHOICES}." )
center = ( coords.max( axis=0 ) + coords.min( axis=0 ) ) / 2
distances = empty( coords.shape[ 0 ] )
for i in range( coords.shape[ 0 ] ):
distance_squared: float = 0.0
coord = coords[ i ]
for j in range( len( coord ) ):
distance_squared += ( coord[ j ] - center[ j ] ) * ( coord[ j ] - center[ j ] )
distances[ i ] = distance_squared
distances = sqrt( distances )
return distances
if support not in __SUPPORT_CHOICES:
raise ValueError( f"Only available choices for support are {__SUPPORT_CHOICES}." )
coords = support_construction[support]( mesh )
center = ( coords.max( axis=0 ) + coords.min( axis=0 ) ) / 2
distances = np.sqrt(((coords-center)**2).sum(axis=1))
return distances



def get_random_field( mesh: vtkUnstructuredGrid, support: str ) -> array:
f"""For a specific support type {__SUPPORT_CHOICES}, an array with samples from a uniform distribution over [0, 1).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the doc string, "returns an array..."


Args:
support (str): Choice between {__SUPPORT_CHOICES}.

Returns:
array: Array of size N being the number of support elements.
"""
if support == __SUPPORT_CHOICES[ 0 ]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's tempting to have a seperate function that would return the number of elements, especially if the support_choices list is intended to grow in the future.
That would avoid changing the code everywhere it is repeated, like L209.

number_elements: int = mesh.GetNumberOfPoints()
elif support == __SUPPORT_CHOICES[ 1 ]:
number_elements = mesh.GetNumberOfCells()
else:
raise ValueError( f"For support, the only choices available are {__SUPPORT_CHOICES}." )
return rand( number_elements, 1 )


create_precoded_fields: dict[ str, any ] = {
"distances_mesh_center": get_distances_mesh_center,
"random": get_random_field
}


def get_vtu_filepaths( options: Options ) -> tuple[ str ]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point, could maybe be moved to the vtk utils ?

"""Returns the vtu filepaths to use for the rest of the workflow.

Args:
options (Options): Options chosen by the user.

Returns:
tuple[ str ]: ( "file/path/0.vtu", ..., "file/path/N.vtu" )
"""
source_filepath: str = options.source
if source_filepath.endswith( ".vtu" ):
return ( source_filepath, )
elif source_filepath.endswith( ".vtm" ):
return get_vtu_filepaths_from_vtm( source_filepath )
elif source_filepath.endswith( ".pvd" ):
vtm_filepath: str = get_vtm_filepath_from_pvd( source_filepath, options.vtm_index )
return get_vtu_filepaths_from_vtm( vtm_filepath )
else:
raise ValueError( f"The source filepath '{options.source}' provided does not target a .vtu, a .vtm nor a " +
".pvd file." )


def get_reorder_mapping( kd_tree_grid_ref: KDTree, sub_grid: vtkUnstructuredGrid, support: str ) -> array:
"""Builds an array containing the indexes of the reference grid linked to every
cell ids / point ids of the subset grid.

Args:
kd_tree_grid_ref (KDTree): A KDTree of the nearest neighbor cell centers for every cells /
points coordinates for point of the reference grid.
sub_grid (vtkUnstructuredGrid): A vtk grid that is a subset of the reference grid.
support (str): Either "point" or "cell".

Returns:
np.array: [ cell_idK_grid, cell_idN_grid, ... ] or [ point_idK_grid, point_idN_grid, ... ]
"""
support_elements: array = support_construction[ support ]( sub_grid )
# now that you have the support elements, you can map them to the reference grid
number_elements: int = support_elements.shape[ 0 ]
mapping: array = empty( number_elements, dtype=int64 )
for cell_id in range( number_elements ):
_, index = kd_tree_grid_ref.query( support_elements[ cell_id ] )
mapping[ cell_id ] = index
return mapping


def get_array_names_to_collect_and_options( sub_vtu_filepath: str,
options: Options ) -> tuple[ list[ tuple[ str ] ], Options ]:
"""We need to have the list of array names that are required to perform copy and creation of new arrays. To build
global_arrays to perform operations, we need only these names and not all array names present in the sub meshes.

Args:
sub_vtu_filepath (str): Path to sub vtu file that can be used to find the names of the arrays within its data.
options (Options): Options chosen by the user.

Returns:
list[ str ]: Array names.
"""
ref_mesh: vtkUnstructuredGrid = read_mesh( sub_vtu_filepath )
all_array_names: dict[ str, dict[ str, int ] ] = get_all_array_names( ref_mesh )
if options.support == __SUPPORT_CHOICES[ 0 ]: # point
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comment L209

support_array_names: list[ str ] = list( all_array_names[ "PointData" ].keys() )
else: # cell
support_array_names = list( all_array_names[ "CellData" ].keys() )

to_use_arrays: set[ str ] = set()
to_use_operate: list[ tuple[ str ] ] = list()
for function_newname in options.operations:
funct: str = function_newname[ 0 ]
if funct in create_precoded_fields:
to_use_operate.append( function_newname )
continue

is_usable: bool = False
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there must be a more pythonic way to code this

for support_array_name in support_array_names:
if support_array_name in funct:
to_use_arrays.add( support_array_name )
is_usable = True
if is_usable:
to_use_operate.append( function_newname )
else:
logging.warning( f"Cannot perform operations with '{funct}' because some or all the fields do not " +
f"exist in '{sub_vtu_filepath}'." )

updated_options: Options = Options( options.support, options.source, to_use_operate, options.vtm_index,
options.vtk_output )
return ( list( to_use_arrays ), updated_options )


def merge_local_in_global_array( global_array: array, local_array: array, mapping: array ) -> None:
"""Fill the values of a global_array using the values contained in a local_array that is smaller or equal to the
size as the global_array. A mapping is used to copy the values from the local_array to the right indexes in
the global_array.

Args:
global_array (np.array): Array of size N.
local_array (np.array): Array of size M <= N that is representing a subset of the global_array.
mapping (np.array): Array of global indexes of size M.
"""
size_global, size_local = global_array.shape, local_array.shape
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a shape, I would rename in global_shape and local_shape

if size_global[ 0 ] < size_local[ 0 ]:
raise ValueError( "The global array to fill is smaller than the local array to merge." )
number_columns_global: int = size_global[ 1 ] if len( size_global ) == 2 else 1
number_columns_local: int = size_local[ 1 ] if len( size_local ) == 2 else 1
if number_columns_global != number_columns_local:
raise ValueError( "The arrays do not have same number of columns." )
# when converting a numpy array to vtk array, you need to make sure to have a 2D array
if len( size_local ) == 1:
local_array = local_array.reshape( -1, 1 )
global_array[ mapping ] = local_array


def implement_arrays( mesh: vtkUnstructuredGrid, global_arrays: dict[ str, array ], options: Options ) -> None:
"""Implement the arrays that are contained in global_arrays into the Data of a mesh.

Args:
mesh (vtkUnstructuredGrid): A vtk grid.
global_arrays (dict[ str, np.array ]): { "array_name0": np.array, ..., "array_nameN": np.array }
options (Options): Options chosen by the user.
"""
data = mesh.GetPointData() if options.support == __SUPPORT_CHOICES[ 0 ] else mesh.GetCellData()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaulting to mesh.getcelldata does not sound safe for future code evolution.

Either the support_choices is intended to grow, and in that case, it would be better to have if ... elif... else RaiseValue like you did L51 or L79.
Or, we are limited to cell and point, in which case the use of __SUPPORT_CHOICES[ 0 ] is a bit of an overkill, and
could be changed to 'point' for better readability.

number_elements: int = mesh.GetNumberOfPoints() if options.support == __SUPPORT_CHOICES[ 0 ] else \
mesh.GetNumberOfCells()

arrays_to_implement: dict[ str, array ] = dict()
# proceed operations
for function_newname in tqdm( options.operations, desc="Performing operations" ):
funct, new_name = function_newname
if funct in create_precoded_fields:
created_arr: array = create_precoded_fields[ funct ]( mesh, options.support )
else:
created_arr = evaluate( funct, local_dict=global_arrays )
arrays_to_implement[ new_name ] = created_arr

# once the data is selected, we can implement the global arrays inside it
for final_name, final_array in arrays_to_implement.items():
number_columns: int = final_array.shape[ 1 ] if len( final_array.shape ) == 2 else 1
if number_columns > 1: # Reshape the VTK array to match the original dimensions
vtk_array = numpy_to_vtk( final_array.flatten() )
vtk_array.SetNumberOfComponents( number_columns )
vtk_array.SetNumberOfTuples( number_elements )
else:
vtk_array = numpy_to_vtk( final_array )
vtk_array.SetName( final_name )
data.AddArray( vtk_array )


def __check( grid_ref: vtkUnstructuredGrid, options: Options ) -> Result:
sub_vtu_filepaths: tuple[ str ] = get_vtu_filepaths( options )
array_names_to_collect, new_options = get_array_names_to_collect_and_options( sub_vtu_filepaths[ 0 ], options )
if len( array_names_to_collect ) == 0:
raise ValueError( "No array corresponding to the operations suggested was found in the source" +
f" {new_options.support} data. Check your support and source file." )
# create the output grid
output_mesh: vtkUnstructuredGrid = grid_ref.NewInstance()
output_mesh.CopyStructure( grid_ref )
output_mesh.CopyAttributes( grid_ref )
# find the support elements to use and construct their KDTree
support_elements: array = support_construction[ new_options.support ]( output_mesh )
number_elements: int = support_elements.shape[ 0 ]
kd_tree_ref: KDTree = KDTree( support_elements )
# perform operations to construct the global arrays to implement in the output mesh from copy
global_arrays: dict[ str, array ] = dict()
for vtu_id in tqdm( range( len( sub_vtu_filepaths ) ), desc="Processing VTU files" ):
sub_grid: vtkUnstructuredGrid = read_mesh( sub_vtu_filepaths[ vtu_id ] )
sub_data = sub_grid.GetPointData() if new_options.support == __SUPPORT_CHOICES[ 0 ] else sub_grid.GetCellData()
usable_arrays: list[ tuple[ int, str ] ] = list()
for array_index in range( sub_data.GetNumberOfArrays() ):
array_name: str = sub_data.GetArrayName( array_index )
if array_name in array_names_to_collect:
usable_arrays.append( ( array_index, array_name ) )
if not array_name in global_arrays:
number_components: int = sub_data.GetArray( array_index ).GetNumberOfComponents()
global_arrays[ array_name ] = full( ( number_elements, number_components ), nan )

if len( usable_arrays ) > 0:
reorder_mapping: array = get_reorder_mapping( kd_tree_ref, sub_grid, new_options.support )
for index_name in usable_arrays:
sub_array: array = vtk_to_numpy( sub_data.GetArray( index_name[ 0 ] ) )
merge_local_in_global_array( global_arrays[ index_name[ 1 ] ], sub_array, reorder_mapping )
# The global arrays have been filled, so now we need to implement them in the output_mesh
implement_arrays( output_mesh, global_arrays, new_options )
write_mesh( output_mesh, new_options.vtk_output )
return Result( info="OK" )


def check( vtk_input_file: str, options: Options ) -> Result:
mesh = read_mesh( vtk_input_file )
return __check( mesh, options )
Loading
Loading