Pipeline terminates with pdal::pdal_error for filters.covariancefeatures #176

Open
G-Anjanappa opened this issue Sep 16, 2024 · 9 comments


@G-Anjanappa

G-Anjanappa commented Sep 16, 2024

Hello,

I am using parallel processing (Dask) to generate covariance features for a batch of point clouds. It is a part of a larger pipeline that includes various other steps like DBSCAN and CSF.

Occasionally, the pipeline fails with the following error:
    Terminate called after throwing an instance of 'pdal::pdal_error'
      what(): filters.covariancefeatures: Cannot perform eigen decomposition.

The pipelines run successfully for the failed files when tested individually.

How can I catch the pdal::pdal_error exception in Python? Catching RuntimeError doesn't seem to work.

Thank you for your assistance.
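
A note on the question above: the "terminate called after throwing an instance of" message is usually printed when a C++ exception goes uncaught and the runtime aborts the whole process. In that case there is no Python exception to catch, which would explain why `except RuntimeError` never fires. One possible workaround, sketched here under that assumption, is to run each file in its own child process and inspect the exit code. `run_isolated` and the two worker strings are hypothetical names; the workers are stand-ins for a script that would import pdal, build the pipeline for `sys.argv[1]`, and call `pipeline.execute()`.

```python
import subprocess
import sys


def run_isolated(script, *args, timeout=3600.0):
    """Run a Python snippet in a child interpreter.

    Returns (ok, returncode, stderr). A crash in native code, such as an
    abort caused by an uncaught C++ exception, only kills the child.
    """
    proc = subprocess.run(
        [sys.executable, "-c", script, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode == 0, proc.returncode, proc.stderr


# Stand-ins for the real worker script (assumption: the real one would
# build and execute the PDAL pipeline for the file named in sys.argv[1]).
HEALTHY_WORKER = "import sys; print('processed', sys.argv[1])"
CRASHING_WORKER = (
    "import os, sys; sys.stderr.write('boom'); sys.stderr.flush(); os.abort()"
)

for worker in (HEALTHY_WORKER, CRASHING_WORKER):
    ok, code, err = run_isolated(worker, "input.laz")
    print("ok" if ok else "failed", "exit code:", code)
```

The parent survives the abort and can log the file, retry it, or move on. The cost is one interpreter start per file, which is likely small next to the filter runtime.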

@abellgithub
Collaborator

Are you saying that you're getting this error for the SAME dataset that succeeds when not running with Dask?

@G-Anjanappa
Author

G-Anjanappa commented Sep 16, 2024

Yes, that's correct! The error occurs specifically when using Dask, but the pipeline runs successfully on the same dataset when tested individually using the inline pipeline or the JSON format pipeline.

For your reference, here is a part of my Dask pipeline:

import pdal

pipeline = (
    pdal.Reader('input.laz') |
    pdal.Filter("filters.csf", resolution=0.5, threshold=1, iterations=200) |
    pdal.Filter("filters.optimalneighborhood") |
    pdal.Filter("filters.covariancefeatures", knn=10, optimized=True,
                feature_set="Verticality, Linearity, SurfaceVariation, Scattering")
)

I am using this pipeline for over 1500 files, and it works fine for the majority of them. The issue only occurs with a very small subset of the files.

@hobu
Member

hobu commented Sep 16, 2024

The PDAL Python bindings currently release the GIL. We are also using Dask + PDAL Python for SilviMetric, but we are not using these filters. I suspect these filters are not thread safe because they have internal state that's being managed.

The solution is probably an option being added to the bindings that prevents them from releasing the GIL. It is not realistic to check every filter and make them thread safe.
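
Until such an option exists, one stopgap for thread-based schedulers might be to serialize only the execute step behind a process-wide lock, so the rest of the task can still run concurrently. This is a minimal sketch, assuming the problem really is shared state between concurrently executing pipelines. `execute_serialized` and `ConcurrencyProbe` are hypothetical names, and `run` stands for any callable such as a bound `pipeline.execute`. It does nothing across processes, since each worker process would hold its own lock.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# One lock per process: guards the call suspected of not being thread safe.
_PIPELINE_LOCK = threading.Lock()


def execute_serialized(run, *args, **kwargs):
    """Call run(*args, **kwargs) while holding the process-wide lock."""
    with _PIPELINE_LOCK:
        return run(*args, **kwargs)


class ConcurrencyProbe:
    """Stand-in for a pipeline; records how many calls overlap in time."""

    def __init__(self):
        self.active = 0
        self.max_active = 0
        self._guard = threading.Lock()

    def __call__(self, item):
        with self._guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # simulate work done while the GIL is released
        with self._guard:
            self.active -= 1
        return item


probe = ConcurrencyProbe()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda i: execute_serialized(probe, i), range(8)))
print("max overlapping executions:", probe.max_active)
```

The trade-off is that pipeline execution itself becomes sequential within each process, so any speedup has to come from the surrounding work or from running several single-threaded worker processes.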

@G-Anjanappa
Author

Thank you for the explanation.

Would you recommend any other approaches or workarounds that I can use to avoid these errors with Dask? Or is sequential processing for the failed files the only practical and quick solution in this case?

@abellgithub
Collaborator

If this is indeed a thread-safety issue, there's going to be nothing special about the files that didn't work -- just because a run with the files failed once doesn't mean it will fail again. If you're seeing consistent failure behavior with specific datasets then something else is going on and perhaps sharing a couple of datasets would be in order.

@hobu
Member

hobu commented Sep 17, 2024

You aren't by chance running NumPy 2, are you? I'm noticing some tests failing with NumPy 2 that appear to be threading or GIL-release related. I don't have it figured out quite yet though.

@hobu
Member

hobu commented Sep 17, 2024

NumPy 2.1, rather. I don't see any issues on 2.0.

@G-Anjanappa
Author

> If this is indeed a thread-safety issue, there's going to be nothing special about the files that didn't work -- just because a run with the files failed once doesn't mean it will fail again. If you're seeing consistent failure behavior with specific datasets then something else is going on and perhaps sharing a couple of datasets would be in order.

I did retry processing the files multiple times. Some of the files that initially failed did work on subsequent retries, which aligns with what you mentioned about potential thread-safety issues. However, there are still a few files that consistently fail regardless of the number of retries.
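
The distinction drawn here, between files that recover on retry and files that fail every time, can be made systematic. The sketch below reruns previously failed files one at a time and sorts them into the two groups. `classify_failures` is a hypothetical helper, and `process` is assumed to be any callable that returns True on success, for example a wrapper that runs the pipeline for one file and reports whether it completed.

```python
def classify_failures(failed_files, process, attempts=3):
    """Retry each previously failed file sequentially.

    Returns (recovered, persistent):
      recovered  - dict mapping path to the attempt number that succeeded,
                   which points toward a transient (possibly threading) issue
      persistent - list of paths that failed on every attempt, which points
                   toward something specific to the data
    """
    recovered, persistent = {}, []
    for path in failed_files:
        for attempt in range(1, attempts + 1):
            if process(path):
                recovered[path] = attempt
                break
        else:
            # The loop ended without a break: every attempt failed.
            persistent.append(path)
    return recovered, persistent


# Fake processor for illustration: 'b.laz' fails once, 'c.laz' always fails.
calls = {}


def fake_process(path):
    calls[path] = calls.get(path, 0) + 1
    if path == "c.laz":
        return False
    if path == "b.laz":
        return calls[path] > 1
    return True


recovered, persistent = classify_failures(["a.laz", "b.laz", "c.laz"], fake_process)
print("recovered:", recovered)
print("persistent:", persistent)
```

Files that end up in the persistent list are the ones worth sharing with the maintainers, since they fail without any parallelism involved.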

@G-Anjanappa
Author

G-Anjanappa commented Sep 17, 2024

> You aren't by chance running NumPy 2, are you? I'm noticing some tests failing with NumPy 2 that appear to be threading or GIL-release related. I don't have it figured out quite yet though.

No, I am using NumPy 1.26.4. PDAL is 2.7.1.
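
For reports like this one, the installed package versions can be collected in one place with the standard library. This is a small sketch; `installed_versions` is a hypothetical helper, and the distribution names "pdal", "numpy", and "dask" are assumptions about how the packages were installed. The "pdal" entry would reflect the Python bindings, which are versioned separately from the PDAL library itself.

```python
from importlib import metadata


def installed_versions(names=("pdal", "numpy", "dask")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(name, version if version is not None else "not installed")
```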
