TypeError: str is not iterable (Edgecase) #91

Open
dlinx opened this issue Jan 18, 2022 · 1 comment

dlinx commented Jan 18, 2022

I get the following error when using kuroshiro, but only in some cases: about 90% of the time no error is thrown. I do not have the input that triggers it, so I cannot test this case.

Stacktrace

TypeError: str is not iterable
    at toRawHiragana (/server/node_modules/kuroshiro/lib/util.js:177:14)
    at /server/node_modules/kuroshiro/lib/core.js:225:88
    at Generator.next (<anonymous>)
    at asyncGeneratorStep (/server/node_modules/kuroshiro/lib/core.js:10:103)
    at _next (/server/node_modules/kuroshiro/lib/core.js:12:194)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)

matthieu-locussol commented Mar 31, 2023

This issue occurs when converting a sentence that contains one or more ・ (U+30FB) characters. As a workaround, I replace that character with · (U+00B7) only for the duration of the conversion, and I no longer hit the problem.

Here is a minimal example that reproduces the problem (I encountered it in furigana mode, but it might occur in other modes too):

const Kuroshiro = require("kuroshiro");
const KuromojiAnalyzer = require("kuroshiro-analyzer-kuromoji");

const sample = async () => {
  const sentence1 = "映画『ジュラシック·パーク』の恐竜は本物そっくりだ。";
  const sentence2 = "映画『ジュラシック・パーク』の恐竜は本物そっくりだ。";

  const kuroshiro = new Kuroshiro();
  await kuroshiro.init(new KuromojiAnalyzer());

  // convert() returns a Promise, so await it to surface the error here
  await kuroshiro.convert(sentence1, { mode: "furigana", to: "hiragana" }); // Does not throw
  await kuroshiro.convert(sentence2, { mode: "furigana", to: "hiragana" }); // Throws
};

sample();

Two small helper functions can handle the conversion back and forth:

// Replace ・ (U+30FB) with · (U+00B7) before calling kuroshiro
const sanitizeJapaneseSentence = (sentence: string) => sentence.replace(/・/g, '·');
// Restore ・ (U+30FB) in the converted output
const unsanitizeJapaneseSentence = (sentence: string) => sentence.replace(/·/g, '・');
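
For illustration, here is a minimal sketch of how the two replacement steps could be wrapped around a conversion call. The names `Converter`, `sanitize`, `unsanitize` and `convertWithWorkaround` are hypothetical and not part of kuroshiro; the wrapper accepts any async function that maps a string to a string. One caveat of this approach: any · (U+00B7) that was already in the input will also come back as ・ (U+30FB).

```typescript
// Hypothetical types and helpers for illustration only; none of this is kuroshiro API.
type Converter = (text: string) => Promise<string>;

const KATAKANA_MIDDLE_DOT = "\u30FB"; // ・
const MIDDLE_DOT = "\u00B7"; // ·

// Swap every ・ for · so the converter never sees U+30FB.
const sanitize = (s: string): string =>
  s.split(KATAKANA_MIDDLE_DOT).join(MIDDLE_DOT);

// Swap every · back to ・ in the converted output.
// Note: this also rewrites any · that was present in the original input.
const unsanitize = (s: string): string =>
  s.split(MIDDLE_DOT).join(KATAKANA_MIDDLE_DOT);

// Run the supplied converter on the sanitized sentence, then restore the dots.
async function convertWithWorkaround(
  sentence: string,
  convert: Converter
): Promise<string> {
  const converted = await convert(sanitize(sentence));
  return unsanitize(converted);
}
```

With an initialized kuroshiro instance, this could presumably be called as `convertWithWorkaround(sentence2, (s) => kuroshiro.convert(s, { mode: "furigana", to: "hiragana" }))`.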

Hope this can help!
