Make the notebook fully parametric #81

Open
grayswandyr opened this issue Nov 16, 2024 · 1 comment

Comments

@grayswandyr

grayswandyr commented Nov 16, 2024

Hi, thanks for the good work! I'd like to generate my own layout based on my personal corpus (my emails, code, LaTeX reports, etc., in both French and English). I tried to adapt the code, but there are too many parts I don't understand well enough, and some hard-coded data seems to be defined in several places in the code, so in the end I don't get a specific layout.

Would it be possible to adapt the code so that interested people only have to provide a few arrays at the beginning of the notebook and then just run everything to get candidate layouts at the end? In principle, it should be enough to provide a table of letter frequencies and a table of bigram frequencies, right?

As an example, this is what I did for my own letter and bigram frequencies:

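import numpy as np

# (letter, count) and (bigram, count) pairs from my corpus, most frequent first
my_24letters = [ ('E', 1286273), ('T', 911921), ('I', 785967), ('A', 767995), ... ]
my_bigrams = [('IN', 178498), ('ON', 149623), ('TH', 134033), ('TI', 132851), ('RE', 131569), ... ]

# Split the pairs into parallel tuples of letters and counts
letters24, instances24 = zip(*my_24letters)
# Highest count; this relies on the list being sorted in descending order
max_frequency = instances24[0]

# Same split for bigrams, converted to NumPy arrays
bigrams_arr, bigram_freqs_arr = zip(*my_bigrams)
bigrams = np.array(bigrams_arr)
bigram_frequencies = np.array(bigram_freqs_arr)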

I don't know where to go from there... but if the code were fully parametric, I think this would make Engram much more widely useful.
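
For anyone who wants to build tables in the same shape from their own text, here is a minimal sketch of how the counts could be produced. It is not taken from the Engram notebook: the function name `count_ngrams` and the sample string are hypothetical, and it assumes that bigrams are counted only within words and that case is ignored.

```python
from collections import Counter


def count_ngrams(text):
    """Return (letters, bigrams) as lists of (item, count), most frequent first.

    Hypothetical helper for illustration; not part of the Engram code.
    """
    letter_counts = Counter()
    bigram_counts = Counter()
    for word in text.upper().split():
        # Keep alphabetic characters only. Letters on either side of a
        # dropped character (e.g. an apostrophe) become adjacent.
        letters = [ch for ch in word if ch.isalpha()]
        letter_counts.update(letters)
        # Count adjacent pairs inside a word; pairs spanning a space are skipped.
        bigram_counts.update(a + b for a, b in zip(letters, letters[1:]))
    return letter_counts.most_common(), bigram_counts.most_common()


# Tiny sample text; a real run would read the whole corpus from files.
sample = "the theme then"
my_letters, my_bigrams = count_ngrams(sample)
```

The two returned lists have the same (item, count) layout as `my_24letters` and `my_bigrams` above, so they could be unpacked with `zip(*...)` in the same way. Accented letters are kept as separate characters here; whether they should be folded into their base letters depends on the target layout.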

@binarybottle
Owner

Thank you for reaching out! I appreciate your interest in tailoring a keyboard layout to a personalized corpus. You write at a particularly opportune time, as I am revisiting this project from scratch and developing a data-driven approach based on crowdsourced information, with a new software pipeline. I will keep your interest in mind as I progress in this project.
