Suggestion for updating the closest function. #163
Comments
Hi, @WANGchuang715, have you considered filtering the dataframe by the distance column after applying closest with return_distance=True?
Hi Wang, if you are interested in many features in df2 around df1, perhaps
is what you are looking for, rather than a bf.closest operation?
I understand the functionality you mentioned, and in comparison, the "closest" feature aligns better with my requirements. I am using it to find the cis-mRNAs for lncRNAs, so I need to differentiate the upstream and downstream relationships within a certain distance and determine if there is any direct overlap. Currently, the "closest" functionality is able to meet my basic needs, and I also hope that you can consider my suggestion.
Can you formulate the problem more precisely? You mention that you are "unable to determine the appropriate value for k", so it sounds to me like what you really want is to make what is known as a "ball query" of some radius around lncRNAs (differentiating by strand, etc.)? That is, you want to catch all cis-mRNAs up to some given maximum distance away from each lncRNA in a particular direction. Regardless of how this functionality might be exposed, the task I just described would make more sense as an extension of the
I think they are two different filtering dimensions. Currently, the "closest" functionality filters the nearest k ranges without considering the distance. It selects k ranges that are closest in proximity. What I want is to filter N ranges that are within a certain distance. I believe both of these filtering approaches are necessary in practical applications.
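To make the two filtering dimensions concrete, here is a minimal pure-Python sketch that does not use bioframe at all. The `Interval` type and the helpers `relation_and_distance`, `k_nearest`, and `ball_query` are hypothetical names introduced for illustration; intervals are assumed to be half-open, and "upstream" is assumed to mean the lower-coordinate side on the `+` strand and the higher-coordinate side on the `-` strand.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # half-open interval: [start, end)
    end: int
    strand: str = "+"
    name: str = ""


def relation_and_distance(query, target):
    """Classify target relative to query and return (relation, distance in bp)."""
    if target.start < query.end and target.end > query.start:
        return "overlap", 0
    if target.end <= query.start:
        distance = query.start - target.end
        on_lower_side = True
    else:
        distance = target.start - query.end
        on_lower_side = False
    # On the minus strand, upstream is the higher-coordinate side.
    upstream = on_lower_side if query.strand != "-" else not on_lower_side
    return ("upstream" if upstream else "downstream"), distance


def _scored(query, targets):
    """All same-chromosome targets as (name, distance, relation), nearest first."""
    rows = []
    for t in targets:
        if t.chrom != query.chrom:
            continue
        relation, distance = relation_and_distance(query, t)
        rows.append((t.name, distance, relation))
    return sorted(rows, key=lambda row: row[1])


def k_nearest(query, targets, k):
    """Filter by count: the k nearest targets, however far away they are."""
    return _scored(query, targets)[:k]


def ball_query(query, targets, max_distance):
    """Filter by radius: every target within max_distance, however many there are."""
    return [row for row in _scored(query, targets) if row[1] <= max_distance]


lncrna = Interval("chr1", 1000, 2000, "-", "lnc1")
mrnas = [
    Interval("chr1", 100, 500, "+", "A"),
    Interval("chr1", 1500, 1800, "+", "B"),
    Interval("chr1", 2500, 3000, "+", "C"),
    Interval("chr1", 90000, 91000, "+", "D"),
    Interval("chr2", 1200, 1300, "+", "E"),
]

print(k_nearest(lncrna, mrnas, k=1))
print(ball_query(lncrna, mrnas, max_distance=1000))
```

With these toy inputs, `k_nearest` with `k=1` reports only the overlapping mRNA, while `ball_query` also reports the two flanking mRNAs 500 bp away and labels each as upstream or downstream of the minus-strand lncRNA.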
I suggest adding an option for distance-based filtering in the `closest` function. Currently, `closest` only allows selecting the number of closest intervals to report with the `k` parameter. It would be beneficial to include an option to filter intervals based on a maximum distance criterion. This enhancement would provide more flexibility and control in selecting intervals based on their proximity. I recommend considering the addition of this feature to improve the functionality of the `closest` function.