common: added countries mapping #191

Closed
wants to merge 7 commits into from

@ErnestaP ErnestaP commented Jan 18, 2024

We parse countries differently from the old scoap3.
These workflows take the country value from its dedicated fields; if those fields do not exist, we split the affiliation value on commas and take the last element.
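
As a rough sketch of the behaviour described above (the function name and the `country` / `value` keys are hypothetical and not taken from the workflows):

```python
def country_from_affiliation(affiliation):
    """Return the dedicated country field if present, else the last comma-separated part."""
    # `affiliation` is assumed to be a dict with an optional "country" key
    # and a free-text "value" key; both key names are illustrative only.
    country = affiliation.get("country")
    if country:
        return country.strip()
    # Fallback: split the free-text affiliation on commas and keep the last non-empty part.
    parts = [part.strip() for part in affiliation.get("value", "").split(",")]
    parts = [part for part in parts if part]
    return parts[-1] if parts else None


print(country_from_affiliation({"value": "CERN, Geneva, Switzerland"}))
print(country_from_affiliation({"value": "Dept. of Physics, MIT, Cambridge, MA 02139"}))
```

The second call shows the weak spot of the fallback: when the affiliation does not end with a country, the last element (here a state and ZIP code) is returned as if it were one.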

The old scoap3 takes the whole affiliation value and a predefined countries dict, and looks for a country match by running a regex for each key of the dict against the affiliation.

if re.search(r'\b%s\b' % key, affiliation, flags=re.IGNORECASE):

https://github.com/SCOAP3/scoap3-next/blob/master/scoap3/utils/nations.py#L20-L23
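
For illustration, the regex-based lookup quoted above could be sketched as follows. The mapping entries and the function name are hypothetical (the real dict is in the linked `nations.py`), and `re.escape` is added here as a precaution; the quoted line does not use it.

```python
import re

# Hypothetical excerpt; the real mapping is defined in scoap3/utils/nations.py.
COUNTRIES_MAPPING = {
    "Switzerland": "Switzerland",
    "CERN": "CERN",
    "USA": "USA",
    "United States": "USA",
    "Georgia": "Georgia",
}


def find_country(affiliation):
    """Return the mapped country for the first key found as a whole word, else None."""
    for key, country in COUNTRIES_MAPPING.items():
        # \b limits the match to whole words; the search covers the full affiliation string.
        if re.search(r"\b%s\b" % re.escape(key), affiliation, flags=re.IGNORECASE):
            return country
    return None


print(find_country("CERN, Geneva, Switzerland"))
print(find_country("University of Georgia, Athens, GA"))
```

Because the search runs over the entire affiliation and stops at the first matching key, the result depends on the order of the dict, and an institution name that contains a country name (the second call) can be matched as that country.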

My questions:

  1. Should we do the same as the old one? (We have already had problems in the past with wrongly parsed affiliations using this approach.)
  2. Or should we keep it as it is, but look for a match using the already parsed countries and the predefined dict?
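
A minimal sketch of what option 2 could look like, assuming the country string has already been parsed out of the affiliation. The mapping contents and the function name are hypothetical; only the name `COUNTRIES_DEFAULT_MAPPING` appears in this PR.

```python
# Hypothetical excerpt of a predefined mapping; keys are lower-cased spelling variants.
COUNTRIES_DEFAULT_MAPPING = {
    "usa": "USA",
    "united states": "USA",
    "u.s.a.": "USA",
    "uk": "UK",
    "united kingdom": "UK",
    "switzerland": "Switzerland",
}


def normalize_parsed_country(parsed_country):
    """Map an already parsed country string to its canonical name, or None if unknown."""
    # Collapse repeated whitespace and ignore case before the dict lookup.
    key = " ".join(parsed_country.split()).lower()
    return COUNTRIES_DEFAULT_MAPPING.get(key)


print(normalize_parsed_country("  United   States "))
print(normalize_parsed_country("MA 02139"))
```

Returning `None` for unmatched values leaves it to the caller to decide whether to keep the raw string or flag the record for review. Since the lookup only sees the parsed fragment, institution names elsewhere in the affiliation cannot trigger a match.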

codecov-commenter commented Jan 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (90ae504) 92.46% compared to head (26e65da) 92.50%.

❗ Current head 26e65da differs from pull request most recent head 7270ee2. Consider uploading reports for the commit 7270ee2 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #191      +/-   ##
==========================================
+ Coverage   92.46%   92.50%   +0.03%     
==========================================
  Files         110      111       +1     
  Lines        4860     4880      +20     
==========================================
+ Hits         4494     4514      +20     
  Misses        366      366              

@ErnestaP ErnestaP requested a review from pamfilos January 22, 2024 14:35


def parse_country_from_value(affiliation_value):
    country = COUNTRY_PARSING_PATTERN.search(affiliation_value).group(0)

Do we need COUNTRIES_DEFAULT_MAPPING if we use pycountry?

Yes, because there are cases where pycountry returns more than one country, for example:

In [3]: pycountry.countries.search_fuzzy("USA")
Out[3]: 
[Country(alpha_2='US', alpha_3='USA', flag='🇺🇸', name='United States', numeric='840', official_name='United States of America'),
 Country(alpha_2='ID', alpha_3='IDN', flag='🇮🇩', name='Indonesia', numeric='360', official_name='Republic of Indonesia'),
 Country(alpha_2='AZ', alpha_3='AZE', flag='🇦🇿', name='Azerbaijan', numeric='031', official_name='Republic of Azerbaijan'),
 Country(alpha_2='PH', alpha_3='PHL', flag='🇵🇭', name='Philippines', numeric='608', official_name='Republic of the Philippines'),
 Country(alpha_2='TR', alpha_3='TUR', flag='🇹🇷', name='Turkey', numeric='792', official_name='Republic of Turkey'),
 Country(alpha_2='KR', alpha_3='KOR', common_name='South Korea', flag='🇰🇷', name='Korea, Republic of', numeric='410'),
 Country(alpha_2='OM', alpha_3='OMN', flag='🇴🇲', name='Oman', numeric='512', official_name='Sultanate of Oman'),
 Country(alpha_2='ZM', alpha_3='ZMB', flag='🇿🇲', name='Zambia', numeric='894', official_name='Republic of Zambia'),
 Country(alpha_2='EE', alpha_3='EST', flag='🇪🇪', name='Estonia', numeric='233', official_name='Republic of Estonia'),
 Country(alpha_2='IT', alpha_3='ITA', flag='🇮🇹', name='Italy', numeric='380', official_name='Italian Republic'),
 Country(alpha_2='KH', alpha_3='KHM', flag='🇰🇭', name='Cambodia', numeric='116', official_name='Kingdom of Cambodia'),
 Country(alpha_2='NA', alpha_3='NAM', flag='🇳🇦', name='Namibia', numeric='516', official_name='Republic of Namibia'),
 Country(alpha_2='PS', alpha_3='PSE', flag='🇵🇸', name='Palestine, State of', numeric='275', official_name='the State of Palestine')]
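
One way to combine the two, sketched under assumptions: consult the explicit mapping first and accept a pycountry fuzzy result only when it is unambiguous. The mapping values and the function name are hypothetical, and this is not necessarily how the PR implements it.

```python
try:
    import pycountry  # third-party package; treated as optional in this sketch
except ImportError:
    pycountry = None

# Hypothetical entries; the real COUNTRIES_DEFAULT_MAPPING may differ.
COUNTRIES_DEFAULT_MAPPING = {
    "USA": "United States",
    "UK": "United Kingdom",
}


def resolve_country(value):
    """Resolve a country string via the explicit mapping, then via pycountry."""
    value = value.strip()
    # 1. The explicit mapping wins, so known ambiguous inputs never reach the fuzzy search.
    if value in COUNTRIES_DEFAULT_MAPPING:
        return COUNTRIES_DEFAULT_MAPPING[value]
    if pycountry is None:
        return None
    # 2. Fall back to the fuzzy search; it raises LookupError when nothing matches.
    try:
        candidates = pycountry.countries.search_fuzzy(value)
    except LookupError:
        return None
    # 3. Accept the result only if exactly one candidate came back.
    return candidates[0].name if len(candidates) == 1 else None


print(resolve_country("USA"))
```

With this ordering, "USA" is resolved by the mapping and the multi-country fuzzy result shown above is never consulted for it.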

@pamfilos

closing as merged with #198

@pamfilos pamfilos closed this Apr 18, 2024