Assessing Caitlin's feedback #9
Comments
Regarding the WoRMS aphiaID: I wrote a quick helper function to get the aphiaID from the sciname before Jon correctly pointed out that the worrms library does exactly that, so that's what I'm using now. I still have to write a couple of small helper functions around it, because calling it inside the mutate functions repeats the same query for every row, which bloats the code's runtime. So I'm building a lookup table from the unique scinames/aphiaIDs; that way each scientific name is queried only once, and the columns in Surimi are built from the lookup table.
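The lookup-table idea above can be sketched roughly as follows. This is an illustrative Python sketch only, not the Surimi code (which is R and uses worrms). The function names, the `scientificname` and `aphiaid` column names, and the `fake_fetch` stand-in for the real WoRMS query are all assumptions, and the IDs are placeholders rather than real aphiaIDs.

```python
from typing import Callable, Dict, Iterable, List, Optional


def build_aphia_lookup(
    scinames: Iterable[str],
    fetch_id: Callable[[str], Optional[int]],
) -> Dict[str, Optional[int]]:
    """Query each distinct scientific name once and cache the result."""
    lookup: Dict[str, Optional[int]] = {}
    for name in scinames:
        if name not in lookup:  # skip names we have already resolved
            lookup[name] = fetch_id(name)
    return lookup


def add_aphia_ids(
    rows: List[dict],
    lookup: Dict[str, Optional[int]],
    name_col: str = "scientificname",  # assumed column name
) -> List[dict]:
    """Return copies of the rows with an 'aphiaid' column filled from the lookup."""
    return [{**row, "aphiaid": lookup.get(row[name_col])} for row in rows]


# Hypothetical stand-in for the real WoRMS query; it records every call
# so we can see how many queries were made. IDs are placeholders.
calls: List[str] = []
PLACEHOLDER_IDS = {"Species a": 1, "Species b": 2}


def fake_fetch(name: str) -> Optional[int]:
    calls.append(name)
    return PLACEHOLDER_IDS.get(name)


detections = [
    {"scientificname": "Species a"},
    {"scientificname": "Species b"},
    {"scientificname": "Species a"},
]

lookup = build_aphia_lookup((r["scientificname"] for r in detections), fake_fetch)
detections_with_ids = add_aphia_ids(detections, lookup)
```

With three rows and two distinct names, `fake_fetch` is called twice rather than three times; on a real detection file the saving scales with the number of repeated names.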
Code to get and add the aphiaID has been added; I'm going to write in the comments and packagify it before I merge it.
Added code to handle assigning receiver_project_name and tagging_project_name. The code runs and appears to behave correctly; we will verify that when we run the second set of tests after these issues are resolved.
Took the releases out of the detections dataframe before deriving the receivers from it.
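As a rough illustration of that step (a Python sketch, not the Surimi R code): release rows are dropped first so they cannot show up as a receiver. It assumes release rows are flagged by a `receiver` value of `"release"`; that marker, the function names, and the sample values are hypothetical.

```python
from typing import List


def drop_releases(
    detections: List[dict],
    receiver_col: str = "receiver",   # assumed column name
    release_marker: str = "release",  # assumed flag for release rows
) -> List[dict]:
    """Keep only rows that are real detections, not animal releases."""
    return [row for row in detections if row.get(receiver_col) != release_marker]


def derive_receivers(detections: List[dict], receiver_col: str = "receiver") -> List[str]:
    """Return the distinct receivers in first-seen order."""
    return list(dict.fromkeys(row[receiver_col] for row in detections))


# Placeholder rows: one release followed by three detections.
detections = [
    {"receiver": "release"},
    {"receiver": "RX-1"},
    {"receiver": "RX-2"},
    {"receiver": "RX-1"},
]

receivers = derive_receivers(drop_releases(detections))
```

Without the filter, `"release"` would be derived as if it were a receiver.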
Just checked, and it seems we are already using the release lat, lon, and datetime to fill in the appropriate columns, so we just need to handle locality.
While closing Remora tickets I found this: it looks like we decided on the mapping some time ago. receiver_name and receiver_project_name can be mapped to receiver and otn_array respectively.
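For reference, that mapping amounts to a simple column rename. The Python sketch below is illustrative only. The two mapped names come from the comment above, while the helper name, the `other_column` field, and the sample values are hypothetical.

```python
from typing import Dict, List

# IMOS column -> OTN column, as decided in the Remora tickets.
IMOS_TO_OTN_RECEIVER_COLUMNS: Dict[str, str] = {
    "receiver_name": "receiver",
    "receiver_project_name": "otn_array",
}


def rename_columns(rows: List[dict], mapping: Dict[str, str]) -> List[dict]:
    """Rename mapped columns and leave every other column untouched."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


# Placeholder row; values are not real receiver or project names.
imos_rows = [
    {
        "receiver_name": "RX-placeholder",
        "receiver_project_name": "PROJ-placeholder",
        "other_column": "unchanged",
    }
]

otn_rows = rename_columns(imos_rows, IMOS_TO_OTN_RECEIVER_COLUMNS)
```

Keeping the mapping in one dictionary means the reverse direction can be built by inverting it rather than maintaining a second hand-written list.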
@jackVanish do you have examples of each of the receiver_id and receiver_name values? Once these are provided, I should be able to provide some guidance.
While checking the IMOS -> OTN pipeline, I realized it refers to the OTN -> IMOS receiver/tag derivation functions. This needs to be fixed!
@naomitress Here are the IMOS data test files included in Remora, so these are what I would've been using to build towards Surimi's OTN -> IMOS pipeline. IMOS_animal_measurements.csv
IMOS_receiver_deployment_metadata.csv has IMOS_detections.csv has so, receiver_id should be ignored in favour of receiver_name as this is analogous to @jackVanish were
When I went back to build out the IMOS -> OTN piece, I found that the main imos_otn column mapping function was underbuilt because I had started building two separate functions to map receiver metadata and tag metadata. I finished those and then built a detections one based on the mapping files supplied in the now-closed Remora tickets. I think that'll get us most of the way through the IMOS -> OTN pipeline feedback; I've checked off the items that I know are solid. I will have some new test files to look at shortly.
Also, thanks for this, Naomi. I will favour receiver_name!
Also, in the IMOS test files above, we should be able to find the analogue to detectedby/project_code. I think we're handling this now with the coll_code parameter passed to otn_imos_column_map, but we can always double-check.
Re: the otn_detections output file (made from the IMOS test data): is this format supposed to match the schema.c_detections_YYYY table formats for OTN? Very few columns are included, so I am wondering about the reasoning behind this being our end product. What are we going to use it for? It doesn't match OTN detection extracts, for example. Also, my point still stands that receiver (the OTN column) needs to be completed (receiver_id is the IMOS column).
Right, the IMOS -> OTN Surimi output was erroneous, so I'm not surprised it's weird and bonkers. We won't be using that; we're aiming for something like an OTN detection extract. Shannon and I are working on building that out. That should also cap off the receiver_id piece, since it'll be handled in building out the new functions. Can you speak to the project_code/detectedby bit in the first group? The long version regarding the IMOS -> OTN piece is that when I generated the IMOS -> OTN output, I used a function that was a carbon copy of the OTN -> IMOS one, because I had forgotten that I'd decided to break it up into three separate functions (one for tags, one for receivers, one for detections). So the output was basically garbage, and that's my fault. I spent some time last week building out the two functions that already existed (receivers and tags, IMOS -> OTN) and working in a new one (IMOS -> OTN detections) based on the mappings given as part of building the code in Remora. The detections file is incomplete right now, which is what Shannon's helping out with, so the feedback here regarding columns in the IMOS -> OTN output will be addressed as part of building out those functions.
Caitlin helpfully checked out a set of Surimi output: an OTN detection extract converted to IMOS tripartite format, then from that same format to OTN tripartite format. She helped identify some of the places where the process loses data, so I've taken her feedback and assembled it into this checklist to keep track of the changes I need to make or have made. Her messages are first, followed by my notes in parentheses. A note: anything that can go from a detExtract to a tripartite file will ALSO have to be accounted for if someone HAS the tripartite OTN file, i.e., anything I fix in the 'derive' functions must be accounted for in the main function as well.
OTN DetExtract -> IMOS detections
OTN DetExtract -> IMOS receivers
OTN DetExtract -> IMOS tags
IMOS detections -> OTN detections
IMOS receivers -> OTN receivers
IMOS tags -> OTN tags