Avoiding 2D-like structures which are just a "chain" of atoms #46

Open
marcelmbn opened this issue Sep 25, 2024 · 0 comments
How can we avoid such structures?

...I asked ChatGPT-4o this question and the answer was the following (imho, quite insightful):


How to Check if a Molecule is Nearly Planar Using Cartesian Coordinates

To determine if a molecule is effectively planar (close to a 2D plane), you can analyze its Cartesian coordinates to see how much the points deviate from a best-fit plane. Here are a few methods you can use:

1. Principal Component Analysis (PCA)

PCA identifies the orthogonal directions along which the atomic positions vary the most. For a planar molecule, the variance along the third (smallest) principal component should be much smaller than the variance along the other two.

  • Steps:
    1. Subtract the mean of the coordinates to center the molecule.
    2. Perform PCA on the centered coordinates.
    3. Check the eigenvalues of the covariance matrix. If the smallest eigenvalue is much smaller than the other two, the molecule is close to being planar.
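The steps above can be sketched with NumPy alone. This is a minimal sketch, not a definitive implementation: the function names `variance_fractions` and `is_nearly_planar` and the default threshold of 0.05 are assumptions chosen for illustration, and the input is assumed to be an (N, 3) array with at least two distinct positions.

```python
import numpy as np


def variance_fractions(coords):
    """Fraction of the total positional variance along each principal axis.

    Returned in descending order. Hypothetical helper; assumes an (N, 3)
    array with at least two distinct positions.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)    # step 1: center the molecule
    cov = centered.T @ centered / len(coords)  # 3x3 covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # steps 2-3: eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)      # remove tiny negative round-off
    return eigvals / eigvals.sum()


def is_nearly_planar(coords, threshold=0.05):
    """True if the smallest component carries less than `threshold` of the variance."""
    return variance_fractions(coords)[2] < threshold
```

A purely linear chain of atoms also passes this test, because both of its smaller fractions are close to zero. Checking `variance_fractions(coords)[1]` against the same kind of threshold would separate chain-like from genuinely planar structures.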

2. Plane Fitting Using Singular Value Decomposition (SVD)

This method directly fits a plane to the points and measures how much the points deviate from this plane.

  • Steps:
    1. Center the data by subtracting the mean of the coordinates.
    2. Perform SVD on the centered coordinates.
    3. The right singular vector belonging to the smallest singular value points along the direction of least variance (the plane normal). If this singular value is much smaller than the other two, the molecule is nearly planar.
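A minimal sketch of this plane fit with `numpy.linalg.svd` could look as follows. The function name `fit_plane_svd` is hypothetical, and the sketch assumes an (N, 3) array with at least three atoms so that three singular values exist.

```python
import numpy as np


def fit_plane_svd(coords):
    """Fit a plane through the centroid; return (centroid, normal, singular_values).

    Hypothetical helper; assumes an (N, 3) array with N >= 3.
    """
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    # Singular values come back sorted in descending order, so the last row
    # of vt is the direction of least variance, i.e. the plane normal.
    _, singular_values, vt = np.linalg.svd(coords - centroid, full_matrices=False)
    normal = vt[-1]
    return centroid, normal, singular_values
```

The squared singular values divided by the number of atoms equal the eigenvalues of the covariance matrix, so this is the same criterion as the PCA method, computed without forming the covariance matrix explicitly.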

3. Distance Calculation from a Best-Fit Plane

Fit a plane to the molecule and compute the distances of each atom from this plane.

  • Steps:
    1. Fit a plane to the coordinates using least squares.
    2. Calculate the distance of each point from the fitted plane.
    3. If the maximum distance or the standard deviation of the distances is small (below a tolerance you choose for your use case), the molecule is nearly planar.
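These steps can be sketched as below, assuming an orthogonal (total) least-squares fit, in which the best-fit plane passes through the centroid and its normal is the right singular vector with the smallest singular value. The function name `planarity_deviation` is hypothetical, and at least three atoms are assumed.

```python
import numpy as np


def planarity_deviation(coords):
    """Return (max_abs_distance, rms_distance) of the atoms from the best-fit plane.

    Hypothetical helper; assumes an (N, 3) array with N >= 3. Distances are in
    the same unit as the input coordinates.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]               # unit normal of the best-fit plane
    distances = centered @ normal  # signed distance of each atom from the plane
    return np.abs(distances).max(), np.sqrt(np.mean(distances**2))
```

Unlike the variance ratios, these values are absolute lengths, so a tolerance can be chosen directly in the units of the coordinates.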

4. Check Coordinate Spreads

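Examine the spread of coordinates along each axis. If one axis has a significantly smaller spread (e.g., standard deviation) than the other two, the molecule may be planar. Note that this check depends on the molecule's orientation: it only detects planarity when the molecular plane happens to be aligned with two of the Cartesian axes, so rotate the molecule into its principal axes first or prefer one of the rotation-invariant methods above.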
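The sketch below computes the per-axis spread and illustrates why this check should be used with care: the same planar square gives a very different picture after it is tilted by 45° about the x-axis. The helper name `axis_spreads` and the example coordinates are made up for illustration.

```python
import numpy as np


def axis_spreads(coords):
    """Standard deviation of the coordinates along x, y and z (hypothetical helper)."""
    return np.asarray(coords, dtype=float).std(axis=0)


# Illustrative planar square lying in the xy-plane
square = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0]])

# Rotate the same square by 45 degrees about the x-axis
angle = np.pi / 4
rot_x = np.array([[1.0, 0.0, 0.0],
                  [0.0, np.cos(angle), -np.sin(angle)],
                  [0.0, np.sin(angle), np.cos(angle)]])
tilted = square @ rot_x.T

print("aligned:", axis_spreads(square))  # z-spread vanishes
print("tilted: ", axis_spreads(tilted))  # y- and z-spreads are now equal
```

The tilted square is exactly as planar as the aligned one, but no single axis shows a small spread any more, which is why the PCA or SVD criteria are the more robust choice.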

Implementing the PCA Method in Python

Here is an example implementation of the PCA approach in Python:

import numpy as np
from sklearn.decomposition import PCA

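# Sample data (illustrative values, roughly a flat six-membered ring):
# replace with your molecule's coordinates
coordinates = np.array([
    [0.00, 0.00, 0.00],
    [1.40, 0.00, 0.02],
    [2.10, 1.21, -0.01],
    [1.40, 2.42, 0.01],
    [0.00, 2.42, -0.02],
    [-0.70, 1.21, 0.00],
])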

# Center the coordinates
centered = coordinates - np.mean(coordinates, axis=0)

# Apply PCA
pca = PCA(n_components=3)
pca.fit(centered)

# Get the explained variance ratios
variance_ratios = pca.explained_variance_ratio_

# Check the variance in the third principal component
print("Variance ratios:", variance_ratios)

# If the third component has a very low variance, the molecule is nearly planar
if variance_ratios[2] < 0.05:  # You can adjust the threshold as needed
    print("The molecule is nearly planar.")
else:
    print("The molecule is not planar.")
@marcelmbn marcelmbn added the enhancement New feature or request label Sep 25, 2024
@marcelmbn marcelmbn self-assigned this Sep 25, 2024