Shared input sample sizes for sex = 0 and sex = 3 rows #151
Comments
There's a long discussion about these calculations in #29. Quoting from that thread:
[edit: add line break to start new paragraph missing in original comment] I can try to explore some ways to apply a multiplier to the input sample size for unsexed fish based on the proportion unsexed. However, I think a reasonable approach is to just discard the unsexed fish for all years where they represent a small fraction of the total, and make all fish unsexed in the years where they represent the majority, which is another way to resolve the problem (though again it requires more work from the user). |
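As a rough sketch of the per-year rule described in that quote (drop the unsexed fish when they are a small fraction, treat everything as unsexed when they are the majority), something like the following could flag each year. The function name, the counts, and both thresholds (0.05 and 0.5) are assumptions for illustration only; nothing here is taken from {pacfintools}.

```python
def unsexed_treatment(n_sexed, n_unsexed, small=0.05, majority=0.5):
    """Suggest how to treat unsexed fish in one year.

    The thresholds are hypothetical defaults, not values from any package.
    """
    total = n_sexed + n_unsexed
    if total == 0:
        return "no data"
    prop_unsexed = n_unsexed / total
    if prop_unsexed <= small:
        # Unsexed fish are a small fraction: discard them, keep sex = 3 only.
        return "discard unsexed"
    if prop_unsexed > majority:
        # Unsexed fish dominate: pool every fish into a single sex = 0 row.
        return "treat all as unsexed"
    # Intermediate years are the ones that still need a user decision.
    return "needs user decision"


# Hypothetical (n_sexed, n_unsexed) counts by year.
counts = {1990: (200, 5), 1995: (20, 180), 2012: (120, 80)}
decisions = {year: unsexed_treatment(*c) for year, c in counts.items()}
```

The intermediate case is left open on purpose, since those are the years where separate sample sizes for sexed and unsexed fish would matter.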
Hmm, I am not sure I agree with the discussion about sample sizes. It does not make sense to me. I fully agree that {pacfintools} should only need to output sex = 0 and sex = 3, no need to do fancy stuff allowing for sex = 1 and 2. HOWEVER, if you have unsexed fish from 3 tows and sexed fish from 5 tows, then you have more information (weight) from sexed fish than you do from unsexed fish. Why would we want to weight those two multinomial draws similarly? @chantelwetzel-noaa says:
Yes. Why would you not want to do this? You have more data from sexed fish.
Again, I totally agree with this statement, but I come to the opposite conclusion. If you have sexed samples from more tows, they should be weighted more than the unsexed samples. Why would you weight them similarly? I should add, my understanding is there are two reasons for unsexed fish: 1) they are too small to sex (this is why we have unsexed fish in survey data) and 2) there is not time/capacity to sex them. My logic assumes (2) is much more common in fishery-dependent data. |
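To make the tow example concrete, here is a minimal sketch that counts distinct tows separately for sexed and unsexed fish. It assumes, purely for illustration, that the input sample size is the number of distinct tows; the actual calculation discussed in #29 may differ, and the record layout and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical fish records: (year, tow_id, sex), where sex is "F", "M", or "U".
fish = [
    (1990, "tow1", "F"), (1990, "tow2", "M"), (1990, "tow3", "F"),
    (1990, "tow4", "M"), (1990, "tow5", "F"),
    (1990, "tow6", "U"), (1990, "tow7", "U"), (1990, "tow8", "U"),
]


def tows_by_group(records):
    """Count distinct tows per (year, sex code), with 3 = sexed and 0 = unsexed."""
    tows = defaultdict(set)
    for year, tow_id, sex in records:
        code = 0 if sex == "U" else 3
        tows[(year, code)].add(tow_id)
    return {key: len(ids) for key, ids in tows.items()}


sample_sizes = tows_by_group(fish)
```

With these made-up records, the sex = 3 row gets 5 tows and the sex = 0 row gets 3, so the two rows are no longer weighted the same.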
Hold on. Does {pacfintools} include ALL fish when it makes the line for sex = 0? And then include only sexed fish when it makes the line for sex = 3? I had assumed the line for sex = 0 ONLY included unsexed fish. |
If the choice is sex = 0, it should be a combination of all unsexed and sexed fish. Any other sex designation should have the sexed values and a separate sex = 0 for the remaining unsexed fish.
--
Jason M. Cope, Ph.D.
Research Fishery Biologist, Fishery Resource Analysis and Monitoring Division
Northwest Fisheries Science Center, NOAA Fisheries
|
Why would we want to count an individual fish in two separate rows of the comp data? I feel like each fish should only appear once (either in sex = 3 OR sex = 0). It seems kind of fundamental to the multinomial distribution, that each entry is iid. If you are including the same individual fish from 1990 under sex = 0 and sex = 3, the sex = 0 and sex = 3 entries would not be independent, as the likelihood assumes. EDIT: wait, I am realizing @shcaba and I said the same thing. |
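The double-counting concern can be checked directly. The sketch below uses made-up fish IDs and compares two ways of building the rows: one where the sex = 0 row is built from all fish, and one where it is built from unsexed fish only. The data and variable names are hypothetical.

```python
# Hypothetical fish from one year: (fish_id, sex), with sex in {"F", "M", "U"}.
fish = [(1, "F"), (2, "M"), (3, "U"), (4, "F"), (5, "U")]

sexed = {fid for fid, sex in fish if sex != "U"}
unsexed = {fid for fid, sex in fish if sex == "U"}
all_fish = {fid for fid, _ in fish}

# Construction A: sex = 0 row built from ALL fish, sex = 3 row from sexed fish.
# Every sexed fish then contributes to both rows.
double_counted_a = all_fish & sexed

# Construction B: sex = 0 row built from unsexed fish only.
# The two rows partition the fish, so no individual appears twice.
double_counted_b = unsexed & sexed
```

Under construction A the three sexed fish show up in both rows; under construction B the overlap is empty and the two rows together still cover every fish.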
There are too many possible use cases for the generalized tools to create good defaults in every case, but here are a few that I can think of and some ideas on what to do:
In the first 3 cases, we don't have to worry about separate sample sizes for sexed and unsexed.
* With regard to yellowtail, it does look like there are some periods in the 90s and 2010s where we may want to include the unsexed fish as separate (case 5), which I assume is why @okenk posted this issue in the first place. |
Describe the bug
If you provide both sexed and unsexed biological data, default settings in the analysis pipeline lead to the same sample size for the row with sex = 0 and sex = 3, rather than the specific sample sizes of sexed and unsexed groups.
To Reproduce
For any species with sexed and unsexed biological data run:
Expected behavior
Input sample sizes for the rows with sex = 0 should differ from those in the rows with sex = 3.
Additional context
A sufficient but somewhat inefficient solution to this is to run everything from get_pacfin_expansions() down separately for sexed and unsexed data, and then put the data back together later when inputting it into the SS3 files.
Also, I totally may have missed an argument somewhere (probably would be in getComps()?) that would avoid this behavior; it was not immediately obvious to me though!