Get ethnicity from name. Currently, we cover the following ethnicities:
- Anglo-Saxon
- Arabic
- Chinese
- Fijian
- German
- Greek
- Hawaiian
- Hispanic (Spanish and Mexican)
- Indian (excluding Islamic states)
- Iranian
- Italian
- Japanese
- Khmer (Cambodian)
- Korean
- Polish
- Portuguese (Portuguese and Brazilian)
- Russian
- Samoan
- South-Slavic (Serbian, Croatian, Bosnian, Slovenian)
- Thai
- Turkish
- Vietnamese

Name data is drawn from sources such as:
- Surnames by race (the US Census data)
- Most common Japanese surnames
- Most common Indian surnames
- Most common Greek surnames
- Most common Italian surnames
- Samoan surnames
- Hawaiian surnames
- German surnames
- Most popular Spanish names
- Most frequent Spanish surnames
- Most common Portuguese surnames
- Most common names in Brazil
- Most common surnames in Brazil
- Most common Russian surnames
- Popular baby names in NSW, Australia
- Most common surnames in Saudi Arabia
- Most common surnames in Egypt
- Most common surnames in Lebanon
- Most common surnames in Morocco
- Most common Palestinian surnames
- Most common surnames in Kuwait
- Most common surnames in Europe
pip3 install ethnicity
from ethnicity import Ethnicity
# initialize and create dictionaries
e = Ethnicity().make_dicts()
# apply to a list of names
e.get(['emele kuoi', 'andrew miller', 'peter', 'andrey', 'nima al hassan', 'tomasz bolowski', 'christiano ronaldo', 'parisa karimi,', 'lisa bowen', 'melissa chan'])
# which returns a pandas DataFrame like the one below
                 Name    Ethnicity
0          Emele Kuoi       fijian
1       Andrew Miller  anglo-saxon
2               Peter          ---
3              Andrey      russian
4      Nima Al Hassan       arabic
5     Tomasz Bolowski       polish
6  Christiano Ronaldo   portuguese
7       Parisa Karimi      iranian
8          Lisa Bowen  anglo-saxon
9        Melissa Chan      chinese
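
Once you have the result, a common next step is to count the matched ethnicities and set aside the names that got no match (shown as --- in the sample output). The sketch below is one possible way to do that and is not part of the package: the `summarize` helper and the `UNMATCHED` constant are hypothetical names, and the rows are copied by hand from the sample output so the snippet runs without the package installed. With a real DataFrame `df`, passing `df.itertuples(index=False)` should yield equivalent (name, ethnicity) pairs, assuming the two columns shown above.

```python
from collections import Counter

# Marker that appears in the sample output for names with no match
# (assumed here to be the literal string "---").
UNMATCHED = "---"


def summarize(rows):
    """Count matched ethnicities and collect names that have no match.

    `rows` is any iterable of (name, ethnicity) pairs.
    """
    counts = Counter()
    unmatched = []
    for name, ethnicity in rows:
        if ethnicity == UNMATCHED:
            unmatched.append(name)
        else:
            counts[ethnicity] += 1
    return counts, unmatched


# Rows copied from the sample output above.
rows = [
    ("Emele Kuoi", "fijian"),
    ("Andrew Miller", "anglo-saxon"),
    ("Peter", "---"),
    ("Andrey", "russian"),
    ("Nima Al Hassan", "arabic"),
    ("Tomasz Bolowski", "polish"),
    ("Christiano Ronaldo", "portuguese"),
    ("Parisa Karimi", "iranian"),
    ("Lisa Bowen", "anglo-saxon"),
    ("Melissa Chan", "chinese"),
]

counts, unmatched = summarize(rows)
print(counts.most_common())  # ethnicities ordered by frequency
print(unmatched)             # names the lookup could not classify
```

Keeping the unmatched names in a separate list makes it easy to review them by hand or retry them with a fuller name, since a single given name such as "Peter" may not be enough for a match.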