-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Function naming: pin_ctrl or valid_hetu ? #1
Comments
Personally I prefer tidyverse naming conventions: https://style.tidyverse.org/functions.html If we used that, the name could be check_validity. Of course it may be preferential to use the same names with sweidnumbr. I wonder if we can have good parts from both somehow without repeating the code.. |
Fully OK with me. Would be good to have a comment from sweidnumbr developer @MansMeg Re: replicating code: I went through sweidnumbr again last week and one possible problem is that the ID systems are somewhat different in different countries. There is a certain risk that unifying the code base will bring unnecessary amount of maintenance overhead, although in general I agree that it is good idea to sync the pkgs where possible. In principle we could have a shared class structure and corresponding methods, which could be imported from sweidnumbr for instance. Then the Finnish hetu package would be a set of wrappers around that. Also need to think of who would implement and which parts. |
Just let me know if you would need anything from sweidnumbr to facilitate it. I could export functions that you would find useful. |
I can also chacge to tidyverse naming convensions. The sweidnumber is older than the tidyverse... =) |
Should we perhaps start by planning/listing the functions that we need and their arguments. So we would have a list of function names and their arguments and short explanation what the function does. We can use functionality of sweidnumbr as a basis for this. After that it will be easier to go thinking about classes and other more technical stuff. Also it is good to have a clear naming of functions and arguments in the beginning so that one needs not to change them from multiple files later. |
Sounds good. I will probably implement that as wrappers initially (to minimize work). As Leo mentioned before, I think these packages will mainly be used internally in Sweden and Finland and I do not think many users would use both packages. The main benefit is probably to abide the tidyverse conventions and share functionality? What functionality from sweidnumbr would be useful in Finland? |
Also I think that not necessarily function names need to be same with sweidnumbr. Most likely users of hetu package will be finnish and get the idea of function better if the name contains word 'hetu' rather than 'pin'. To me pin brings phone pin code to my mind. I still think it will be useful to have the same naming conventions with sweidnumbr package on function arguments. Different function names also make it less likely that someone gets confused from which package the function is. Using wrappers can be one approach on hetu package as well, I think. |
Yes. The minimal set of functions and arguments would be something that is common between the two countries. In addition, there can be country-specific additional functions in each pkg. The following shared functionality comes immediately to mind:
Not sure if we should launch a separate planning document, or use Github repo wiki function to systematically collect and refine the ideas(?). |
.. and yes, not everything needs to be shared but if we use class structures then it might make sense to use a generic class structure that applies to both cases and can have shared methods (to retrieve age, sex etc). |
Perhaps we go with the generic names for the sake of sharing the class structure then and use package specific wrappers to implement good understandable names for the users of each package? Who knows if someone wants to make similar package for some third country later. |
Yes. Let's just be careful with overheads, complexity easily increases with synchronization, too. Having further countries in the loop would be ideal, let's prepare for that. |
I agree that "pin" may be strange in Finnish, but it is also strange in Swedish. I used the English name to enable non-Swedish speakers to understand the concepts (personal identity number and oin, organizational identity number). I think using English for the basic concepts is probably a reasonable approach also in Finnish? |
One possibility is to decide on the shared standards, then have wrappers / aliases to implement country-specific helper functions. |
Currently hetu and sweidnumbr share the following similarly named functions:
I personally find pin to be alright. Curiously it seemed that in English Wikipedia the term "Personal Identity Number" is used only in Sweden's case, whereas other countries call it national identification number, national identity number etc. Finnish authorities seem to prefer the term "personal identity code" (PIC) or even the old-fashioned "social security number". I am not so sure anymore whether sharing function names with sweidnumbr is wise, if the abbreviation "PIN" is neither used nor brings the correct connotations to mind (like phone pin code) when using it in the Finnish context. Using names like hetu_age, hetu_ctrl, hetu_sex and hetu_to_date (or hetu_birthdate) and rhetu might be clearer but would also remove the risk of using a function from wrong package when having both packages loaded simultenously. |
I would be ok with either solution. It is potentially useful to have same names across different countries but since those functions (with same name) come from different packages there is indeed a risk of confusion. The "hetu" naming would be well justified and intuively more clear in the Finnish context. One option is to have the "hetu" names as additional aliases. But this may add too much complexity. I am not sure how much the unification of the names between the two packages is needed. At least it could facilitate later harmonization. On the other hand, if we now use country specific acronyms (pin for SE and hetu for FI), then some later unified version could introduce an entirely new acronym that would avoid the confusion. If I would need to choose, I tend to prefer "hetu" names because the main userbase is in Finland anyway, and it is not clear how many use cases there will be for the two packages simultaneously. Might be useful to hear some feedback from others. In particular @MansMeg @juuussi |
I agree. I do not think there is a large need to harmonize the different packages for the users, so the API can definitely be hetu, if that well known in Finland. I actually didnt know that PIN was a swedish thing. :)
The things we would like to harmonize is probably underlying, common algorithms. For example, maybe the luhn algorithm. We could start out by the hetu package using sweidnumbr, and if then in a later stage move out common functionality to a separate package?
|
It is definitely useful to share common functionalities. @pitkant might be able to tell more about possible overlapping parts (if any) now after the latest developments. The functionality itself is similar (picking age, sex, etc) but the technical implementation (functions) are different because of the different structure of pins/hetus between the two countries. Hence, I am not sure how much can be shared but I agree that everything that can be naturally reused from sweidnumbr, should be reused. In my understanding, the Luhn algorithm is not used in the Finnish context but corrections are welcome. |
Sweidnumbr has all pin-related functions in the same pin.R file, whereas in hetu package different functions are in different files. In sweidnumbr many functions such as pin_sex and pin_to_date actually contain the code that is used to extract sex or birthdate out of the pin. In hetu package every pin goes through all the operations in hetu function and relevant information can be then extracted from the data frame, essentially meaning that hetu::pin_sex(pin) or hetu::pin_date(pin) are just fancier ways to write hetu(pin, extract = "sex") or hetu(pin, extract = "date"). Functions that are practically identical in both packages:
Functions that are quite similar and could probably be made more similar, if wanted:
Functions that share the same name but actually do slightly things:
Functions that are exclusive to sweidnumbr:
|
Great, thanks for the summary. Seem pretty ok like this to me at least. |
In version 1.0.1 the package uses pin_ctrl, pin_sex etc. type function names and hetu_ctrl, hetu_sex as aliases. The latter type is also used in all examples and vignettes and therefore the former style could quite easily become deprecated in the future versions, but that'll be a discussion for a possible future release. |
In earlier versions we had valid_hetu to check ID validity. This is now changed into pin_ctrl because the sweidnumbr package is using that name. Comments are welcome before we close this issue.
The text was updated successfully, but these errors were encountered: