A simple game where you guess whether given city names are real

chominskib/namegen

Fakename -- ML game

Is "Chwałek" the name of a real Polish city or village? It sounds quite right, but as it turns out, it is fake: it was generated by a short Python script using a simple Markov chain built from a database of all Polish city and village names.

Can you guess if a given place name is real or fake? You can find out by playing this game.

Instructions:

  1. Run the main.py script and have fun!

Remarks:

  • The schemes.py file contains some "strategies" that make the generated words more faithful to the language, and it is currently tuned for Polish. It is possible to tune it for English, but that will be a lot harder, since Polish spelling is much more closely bound to pronunciation than English spelling.
  • You can supply your own dataset to play with: write it to the data.txt file (no commas, spaces, or tabs, just one name per line) and run the train.py script. By default, data.txt contains all Polish city and village names, and the game comes pre-trained on it and ready to use.

How does it work?

The names in the training data are split into phones using the rules found in schemes.py. A Markov chain is then built in which each vertex holds a pair of consecutive phones, and the number on the edge from (a, b) to (b, c) is the probability that phone c follows phones a, b. This Markov chain is stored (rather effortlessly) in the file network.py, and training is run by the train.py script.
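To make the construction concrete, here is a minimal sketch of how such a chain could be built. It is not the code from train.py or schemes.py: the digraph list, the function names split_into_phones and build_chain, and the use of a space as both the start padding and the end marker are assumptions made for illustration, and it only handles single-word names.

```python
from collections import Counter, defaultdict

# Hypothetical digraph list; the real splitting rules live in schemes.py
# and may differ.
DIGRAPHS = ("ch", "cz", "dz", "dź", "dż", "rz", "sz")
END = " "  # assumed marker: a space pads the start and ends the name


def split_into_phones(name):
    """Greedy split: take a known digraph when possible, else one letter."""
    name = name.lower()
    phones, i = [], 0
    while i < len(name):
        if name[i:i + 2] in DIGRAPHS:
            phones.append(name[i:i + 2])
            i += 2
        else:
            phones.append(name[i])
            i += 1
    return phones


def build_chain(names):
    """Map each state (a, b) to a dict {c: P(c | a, b)}."""
    counts = defaultdict(Counter)
    for name in names:
        phones = [END, END] + split_into_phones(name) + [END]
        # Slide a window of three phones: (a, b) is the state, c the successor.
        for a, b, c in zip(phones, phones[1:], phones[2:]):
            counts[(a, b)][c] += 1
    return {
        state: {c: n / sum(nxt.values()) for c, n in nxt.items()}
        for state, nxt in counts.items()
    }
```

For example, build_chain(["Szczecin", "Cieszyn"]) gives the start state (" ", " ") two equally likely successors, "sz" and "c".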

Generating a fake name boils down to walking through this Markov chain, with one little tweak: to prevent absurdly short or long names, the probability of the next phone being a space (which ends the name) is set to zero until four phones have been generated, and it increases progressively after the seventh generated phone.
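The walk with the length tweak could look roughly like the sketch below. The exact way the end probability grows is not specified above, so the multiplicative boost factor, the max_len safety cap, and the names adjust_end_weight and generate are assumptions, not the behaviour of main.py. The chain is assumed to be a dict mapping a state (a, b) to {c: probability}, with a space as the end marker.

```python
import random

END = " "  # assumed end-of-name marker


def adjust_end_weight(weights, generated, min_len=4, boost_from=7, boost=2.0):
    """Return a copy of `weights` with the END weight rescaled by name length."""
    adjusted = dict(weights)
    if END in adjusted:
        if generated < min_len:
            adjusted[END] = 0.0  # too short: ending is forbidden
        elif generated >= boost_from:
            # Assumed growth rule: double the END weight for each extra phone.
            adjusted[END] *= boost ** (generated - boost_from + 1)
    return adjusted


def generate(chain, rng=random, max_len=30):
    """Walk the chain from the start state until END is drawn."""
    state, phones = (END, END), []
    while len(phones) < max_len:  # safety cap, not part of the description
        weights = adjust_end_weight(chain.get(state, {}), len(phones))
        options = [p for p, w in weights.items() if w > 0]
        if not options:
            break  # dead end: no allowed continuation from this state
        nxt = rng.choices(options, [weights[p] for p in options])[0]
        if nxt == END:
            break
        phones.append(nxt)
        state = (state[1], nxt)
    return "".join(phones).capitalize()
```

The weights passed to rng.choices do not need to sum to one, so the rescaled END weight can be used directly without renormalising.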

To-dos:

  • Create a roughly working schemes.py file for English.
