inconsistent program execution #24

Open · codercahol opened this issue Mar 18, 2024 · 4 comments

@codercahol
Owner
I get inconsistent results when I execute the code in database_creation.

Everything below is being run on the Carnegie cloud in a Jupyter notebook.

I execute the following to initialize the notebook:

import sys
chlamy_impi_module_path = "{path-to-local-repo}"
if chlamy_impi_module_path not in sys.path:
    sys.path.append(chlamy_impi_module_path)

The functions from the module are imported as follows (this block lives in a wrapper module, referenced below as cp):

from chlamy_impi.well_segmentation_preprocessing.main import (
    main as well_segmentation_preprocessing,
)
from chlamy_impi.database_creation.main import main as database_creation

# for dev
from chlamy_impi.database_creation import utils as db_utils
from chlamy_impi.database_creation import error_correction as db_error_correction
from chlamy_impi.database_creation import main as db_main
from chlamy_impi.lib import mask_functions
from chlamy_impi.lib import npq_functions
from chlamy_impi.lib import y2_functions
from chlamy_impi.lib import fv_fm_functions
import chlamy_impi.paths as paths

When I run database_creation/main.py:main() by executing cp.database_creation() or cp.db_main.main(), I get an error in database_creation/main.py:merge_plate_and_well_info_dfs() saying that the i and j columns of the dataframe are NaNs.

When I instead run all the sub-functions of main() individually in the notebook, it runs to completion with no error (i.e. I define and run local_main() as below):

def local_main():
    plat_info = cp.db_main.construct_plate_info_df()
    well_info = cp.db_main.construct_well_info_df()
    exptl_df = cp.db_main.merge_plate_and_well_info_dfs(plat_info, well_info)

    mut_df = cp.db_main.construct_mutations_dataframe()
    ident_df = cp.db_main.construct_identity_dataframe(mut_df)
    total = cp.db_main.merge_identity_and_experimental_dfs(exptl_df, ident_df)
    return total
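
For clarity, these are the two invocations that behave differently:

# fails inside merge_plate_and_well_info_dfs with NaN i and j columns:
cp.database_creation()
# or, equivalently:
cp.db_main.main()

# runs to completion with no error:
total = local_main()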

@codercahol
Owner Author

Further info:

- The wells that cause the issue are consistent, with the following indices: [4608, 8065, 16130, 18819, 22276, 34181]
- In the erroring run, the measurement times are all null, num_frames is 2x that of the working run, and the threshold values do not match; see the comparison sketch below.
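
A rough sketch of how the two runs can be compared at those indices (df_ok / df_fail stand for the dataframes from the working and failing paths; the variable and column names here are placeholders, not the actual ones from the repo):

bad_idx = [4608, 8065, 16130, 18819, 22276, 34181]
cols = ["measurement_time", "num_frames", "threshold"]  # placeholder column names

# side-by-side view of the problem wells in each run
print(df_ok.loc[bad_idx, cols])
print(df_fail.loc[bad_idx, cols])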

@codercahol
Owner Author

[Screenshot 2024-03-18 at 11 31 00 AM]

@codercahol
Owner Author

[Screenshot 2024-03-18 at 11 31 37 AM]

@murraycutforth
Collaborator

Hey, I've just had a quick read. If I've understood correctly, this is a code issue and not an issue with the underlying data?

It's not immediately apparent to me why there should be a difference in execution depending on which functions are imported and run in the scope of the notebook. If it works correctly when the sub-functions are manually imported, that would suggest to me that there may be a name collision in the scope of the notebook? But I would expect that to cause the code to fail catastrophically rather than produce this subtle difference in a few rows.

In general it can be dangerous to import these main scripts as modules, because everything not wrapped in an if __name__ == '__main__': guard is executed at import time. Not sure if that could be the cause...?
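
i.e. a guard along these lines (just the pattern, not the actual contents of main.py):

def main():
    print("constructing database")  # the real pipeline steps go here

# Anything at module level outside this guard runs as a side effect of
# `import chlamy_impi.database_creation.main`, which can quietly change
# state that the notebook run then depends on.
if __name__ == "__main__":
    main()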

Have you checked whether all the other rows in exptl_df are identical between the two methods of running it? Also, just running the main.py script directly from the terminal is how I used the code when I was working on it; is it possible to try that method on the cluster?
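
For the row check, something like this would do it (assuming both runs produce dataframes with identical indices and columns; df_a / df_b are placeholders for the two results):

# pandas DataFrame.compare requires identically-labelled frames and
# returns only the cells that differ (an empty frame means the runs agree)
diff = df_a.compare(df_b)
print(diff if not diff.empty else "all rows identical")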

Sorry I don't have enough time this week to try running anything myself!
