add language select, abstract text transformations #584

Merged
merged 101 commits into from
Feb 17, 2024
Changes from 93 commits
5bbbe72
Copy functions from JapaneseUtil
toasted-nutbread Jan 21, 2024
52a5f5b
Remove JapaneseUtil
toasted-nutbread Jan 21, 2024
59fd9ed
Update usages of JapaneseUtil functions
toasted-nutbread Jan 21, 2024
188bab5
part1
StefanVukovic99 Jan 22, 2024
ed08427
frotend done?
StefanVukovic99 Jan 22, 2024
004fc0e
fix tests
StefanVukovic99 Jan 22, 2024
f07b236
offscreen and type complications
StefanVukovic99 Jan 22, 2024
2d71abb
add tests
StefanVukovic99 Jan 23, 2024
982323a
Merge branch 'text-transformation-tests' into language-select
StefanVukovic99 Jan 23, 2024
53ed36f
start fixing tests
StefanVukovic99 Jan 23, 2024
b1b4446
keep fixing tests
StefanVukovic99 Jan 23, 2024
c9b9675
fix tests
StefanVukovic99 Jan 23, 2024
e0ff55f
Copy functions from JapaneseUtil
toasted-nutbread Jan 21, 2024
51d1d22
Remove JapaneseUtil
toasted-nutbread Jan 21, 2024
e5c6994
Update usages of JapaneseUtil functions
toasted-nutbread Jan 21, 2024
1185860
Merge branch 'master' into text-transformation-tests
StefanVukovic99 Jan 27, 2024
bea7c56
Merge remote-tracking branch 'toasted-nutbread/japanese-util-refactor…
StefanVukovic99 Jan 27, 2024
beac9ea
Merge branch 'master' into language-select
StefanVukovic99 Jan 27, 2024
d6cb0cc
delete pt
StefanVukovic99 Jan 27, 2024
a262149
Merge branch 'master' into text-transformation-tests
djahandarie Jan 28, 2024
9437888
renames
StefanVukovic99 Jan 28, 2024
197b746
Merge branch 'master' into text-transformation-tests
StefanVukovic99 Jan 28, 2024
ee3c4c2
Merge branch 'text-transformation-tests' into language-select
StefanVukovic99 Jan 28, 2024
4ef38ae
add tests
StefanVukovic99 Jan 28, 2024
2925c3f
Merge branch 'master' into language-select
StefanVukovic99 Jan 28, 2024
42580fa
kebab-case filenames
StefanVukovic99 Jan 28, 2024
8292ec7
lint
StefanVukovic99 Jan 29, 2024
3e95962
Merge branch 'master' into language-select
StefanVukovic99 Jan 29, 2024
39ec942
Merge branch 'master' into language-select
StefanVukovic99 Jan 31, 2024
96a0b0e
Merge branch 'master' into language-select
StefanVukovic99 Jan 31, 2024
44ecc94
minor fixes
StefanVukovic99 Jan 31, 2024
b4eeb66
Merge branch 'master' into language-select
StefanVukovic99 Feb 1, 2024
77e7944
merge
StefanVukovic99 Feb 1, 2024
61e0ff6
Merge branch 'master' into language-select
StefanVukovic99 Feb 1, 2024
97d0564
Merge branch 'master' into language-select
StefanVukovic99 Feb 3, 2024
2580d24
fixes
StefanVukovic99 Feb 3, 2024
16c1e39
fix part of comments
StefanVukovic99 Feb 3, 2024
2865806
fix more comments
StefanVukovic99 Feb 3, 2024
d6b4708
delete unused types
StefanVukovic99 Feb 3, 2024
5abd90c
comment
StefanVukovic99 Feb 3, 2024
9b5bfa0
comment
StefanVukovic99 Feb 3, 2024
c088e23
do backend
StefanVukovic99 Feb 3, 2024
61c0236
other files
StefanVukovic99 Feb 3, 2024
0b59ceb
move fetch utils to own file
StefanVukovic99 Feb 3, 2024
46b782b
remove extra line
StefanVukovic99 Feb 3, 2024
5acad61
add extra line
StefanVukovic99 Feb 3, 2024
c3b594d
remove unnecessary export
StefanVukovic99 Feb 3, 2024
0142a91
simplify folder structure
StefanVukovic99 Feb 3, 2024
cd560a2
Merge branch 'fetch-utilities' into language-select
StefanVukovic99 Feb 3, 2024
1ed8653
remove redundant async
StefanVukovic99 Feb 3, 2024
b204d3e
fix param type in api
StefanVukovic99 Feb 3, 2024
a9b6f4e
fix language index
StefanVukovic99 Feb 3, 2024
986a417
undo changes to cssStyleApplier
StefanVukovic99 Feb 3, 2024
fc6cccf
Merge branch 'fetch-utilities' into language-select
StefanVukovic99 Feb 3, 2024
be400d4
Merge branch 'master' into language-select
StefanVukovic99 Feb 4, 2024
e9dfc45
undo changes to utilities.js
StefanVukovic99 Feb 4, 2024
7079ceb
undo changes to utilities.js
StefanVukovic99 Feb 4, 2024
79a6eb3
simplify language util
StefanVukovic99 Feb 4, 2024
b1314cd
lint
StefanVukovic99 Feb 4, 2024
996c721
undo phantom changes to anki integration
StefanVukovic99 Feb 4, 2024
0ed108a
require textTransformations options
StefanVukovic99 Feb 4, 2024
68e6fd8
explicit locale in localeCompare
StefanVukovic99 Feb 4, 2024
49d02e7
punctuate notes
StefanVukovic99 Feb 4, 2024
2ef75a8
prefer early exit
StefanVukovic99 Feb 4, 2024
17ceb75
rename LanguageOptionsObjectMap
StefanVukovic99 Feb 4, 2024
6557689
rename to textPreprocessor
StefanVukovic99 Feb 4, 2024
0611e06
tuple with names instead of boolean array
StefanVukovic99 Feb 4, 2024
12911af
safe data setting
StefanVukovic99 Feb 4, 2024
2c8a92a
optional chaining
StefanVukovic99 Feb 4, 2024
c1d2197
simplify LanguageOptions
StefanVukovic99 Feb 4, 2024
3b64fc2
encapsulate languages
StefanVukovic99 Feb 4, 2024
4699904
delete language util
StefanVukovic99 Feb 4, 2024
f032d99
nullable language in text preprocessors controller
StefanVukovic99 Feb 4, 2024
ddb9b97
Merge branch 'master' into language-select
StefanVukovic99 Feb 5, 2024
183215a
rename transform to process
StefanVukovic99 Feb 5, 2024
a99442f
Merge branch 'master' into language-select
StefanVukovic99 Feb 5, 2024
edce060
remove settings
StefanVukovic99 Feb 7, 2024
a1d3c8d
Merge branch 'master' into language-select
StefanVukovic99 Feb 7, 2024
ecc08ee
make translation advanced again
StefanVukovic99 Feb 7, 2024
fb249cf
remove unused getTextTransformations api call
StefanVukovic99 Feb 7, 2024
ab8112b
comments
StefanVukovic99 Feb 8, 2024
950979c
change language types
StefanVukovic99 Feb 8, 2024
3a89fae
RIP flags
StefanVukovic99 Feb 8, 2024
eda3b76
comments
StefanVukovic99 Feb 8, 2024
61788d5
Merge branch 'master' into language-select
StefanVukovic99 Feb 8, 2024
f416de0
fix tests
StefanVukovic99 Feb 8, 2024
9486269
Merge branch 'master' into language-select
StefanVukovic99 Feb 11, 2024
dbbdf0a
lint
StefanVukovic99 Feb 11, 2024
a3007d3
Text preprocessor type changes (#10)
toasted-nutbread Feb 11, 2024
4c387d6
Merge remote-tracking branch 'yezichak/language-select' into language…
StefanVukovic99 Feb 11, 2024
53433fa
lint
StefanVukovic99 Feb 11, 2024
e45f39b
update translator benchmark
StefanVukovic99 Feb 11, 2024
56ffd3f
undo markdown changes
StefanVukovic99 Feb 11, 2024
4ab1696
Merge branch 'master' into language-select
djahandarie Feb 12, 2024
f82987e
Merge branch 'master' into language-select
StefanVukovic99 Feb 12, 2024
65d6b31
Merge remote-tracking branch 'yezichak/language-select' into language…
StefanVukovic99 Feb 12, 2024
950e1b3
undo markdown changes
StefanVukovic99 Feb 12, 2024
d639e25
undo markdown changes
StefanVukovic99 Feb 12, 2024
bcd6d1d
Merge branch 'master' into language-select
StefanVukovic99 Feb 15, 2024
1d7a349
more merge
StefanVukovic99 Feb 15, 2024
000bfa7
simplify language controller
StefanVukovic99 Feb 16, 2024
4 changes: 4 additions & 0 deletions .eslintrc.json
@@ -502,9 +502,13 @@
"ext/js/general/object-property-accessor.js",
"ext/js/general/regex-util.js",
"ext/js/general/text-source-map.js",
+ "ext/js/language/en/language-english.js",
"ext/js/language/ja/japanese-wanakana.js",
"ext/js/language/ja/japanese.js",
+ "ext/js/language/ja/language-japanese.js",
"ext/js/language/language-transformer.js",
+ "ext/js/language/languages.js",
+ "ext/js/language/text-preprocessors.js",
"ext/js/language/translator.js",
"ext/js/media/audio-downloader.js",
"ext/js/media/media-util.js",
15 changes: 3 additions & 12 deletions benches/translator.bench.js
@@ -20,8 +20,8 @@ import {fileURLToPath} from 'node:url';
import path from 'path';
import {bench, describe} from 'vitest';
import {parseJson} from '../dev/json.js';
- import {createFindKanjiOptions, createFindTermsOptions} from '../test/utilities/translator.js';
import {createTranslatorContext} from '../test/fixtures/translator-test.js';
+ import {createFindKanjiOptions, createFindTermsOptions} from '../test/utilities/translator.js';

const dirname = path.dirname(fileURLToPath(import.meta.url));
const dictionaryName = 'Test Dictionary 2';
@@ -33,25 +33,16 @@ describe('Translator', () => {
const {optionsPresets, tests} = parseJson(readFileSync(testInputsFilePath, {encoding: 'utf8'}));

const findKanjiTests = tests.filter((data) => data.options === 'kanji');
- const findTermTests = tests.filter((data) => data.options === 'default');
- const findTermWithTextTransformationsTests = tests.filter((data) => data.options !== 'kanji' && data.options !== 'default');
+ const findTermTests = tests.filter((data) => data.options !== 'kanji');

- bench(`Translator.prototype.findTerms - no text transformations (n=${findTermTests.length})`, async () => {
+ bench(`Translator.prototype.findTerms - (n=${findTermTests.length})`, async () => {
for (const data of /** @type {import('test/translator').TestInputFindTerm[]} */ (findTermTests)) {
const {mode, text} = data;
const options = createFindTermsOptions(dictionaryName, optionsPresets, data.options);
await translator.findTerms(mode, text, options);
}
});

- bench(`Translator.prototype.findTerms - text transformations (n=${findTermWithTextTransformationsTests.length})`, async () => {
- for (const data of /** @type {import('test/translator').TestInputFindTerm[]} */ (findTermWithTextTransformationsTests)) {
- const {mode, text} = data;
- const options = createFindTermsOptions(dictionaryName, optionsPresets, data.options);
- await translator.findTerms(mode, text, options);
- }
- });
-
bench(`Translator.prototype.findKanji - (n=${findKanjiTests.length})`, async () => {
for (const data of /** @type {import('test/translator').TestInputFindKanji[]} */ (findKanjiTests)) {
const {text} = data;
3 changes: 3 additions & 0 deletions dev/jsconfig.json
@@ -28,6 +28,9 @@
"error": ["../types/ext/error"],
"event-listener-collection": ["../types/ext/event-listener-collection"],
"japanese-util": ["../types/ext/japanese-util"],
+ "language": ["../types/ext/language"],
+ "language-english": ["../types/ext/language-english"],
+ "language-japanese": ["../types/ext/language-japanese"],
"ext/json-schema": ["../types/ext/json-schema"],
"language-transformer": ["../types/ext/language-transformer"],
"language-transformer-internal": ["../types/ext/language-transformer-internal"],
116 changes: 57 additions & 59 deletions docs/anki-integration.md

Large diffs are not rendered by default.

41 changes: 5 additions & 36 deletions ext/data/schemas/options-schema.json
@@ -81,6 +81,7 @@
"type": "object",
"required": [
"enable",
+ "language",
"resultOutputMode",
"debugInfo",
"maxResults",
@@ -126,6 +127,10 @@
"type": "boolean",
"default": true
},
+ "language": {
+ "type": "string",
+ "default": "ja"
+ },
"resultOutputMode": {
"type": "string",
"enum": ["group", "merge", "split"],
@@ -722,12 +727,6 @@
"translation": {
"type": "object",
"required": [
- "convertHalfWidthCharacters",
- "convertNumericCharacters",
- "convertAlphabeticCharacters",
- "convertHiraganaToKatakana",
- "convertKatakanaToHiragana",
- "collapseEmphaticSequences",
"textReplacements",
"searchResolution"
],
@@ -740,36 +739,6 @@
],
"default": "letter"
},
- "convertHalfWidthCharacters": {
- "type": "string",
- "enum": ["false", "true", "variant"],
- "default": "false"
- },
- "convertNumericCharacters": {
- "type": "string",
- "enum": ["false", "true", "variant"],
- "default": "false"
- },
- "convertAlphabeticCharacters": {
- "type": "string",
- "enum": ["false", "true", "variant"],
- "default": "false"
- },
- "convertHiraganaToKatakana": {
- "type": "string",
- "enum": ["false", "true", "variant"],
- "default": "false"
- },
- "convertKatakanaToHiragana": {
- "type": "string",
- "enum": ["false", "true", "variant"],
- "default": "variant"
- },
- "collapseEmphaticSequences": {
- "type": "string",
- "enum": ["false", "true", "full"],
- "default": "false"
- },
"textReplacements": {
"type": "object",
"required": [
26 changes: 11 additions & 15 deletions ext/js/background/backend.js
@@ -34,6 +34,7 @@ import {DictionaryDatabase} from '../dictionary/dictionary-database.js';
import {Environment} from '../extension/environment.js';
import {ObjectPropertyAccessor} from '../general/object-property-accessor.js';
import {distributeFuriganaInflected, isCodePointJapanese, isStringPartiallyJapanese, convertKatakanaToHiragana as jpConvertKatakanaToHiragana} from '../language/ja/japanese.js';
+ import {getLanguageSummaries} from '../language/languages.js';
import {Translator} from '../language/translator.js';
import {AudioDownloader} from '../media/audio-downloader.js';
import {getFileExtensionFromAudioMediaType, getFileExtensionFromImageMediaType} from '../media/media-util.js';
@@ -183,7 +184,8 @@ export class Backend {
['textHasJapaneseCharacters', this._onApiTextHasJapaneseCharacters.bind(this)],
['getTermFrequencies', this._onApiGetTermFrequencies.bind(this)],
['findAnkiNotes', this._onApiFindAnkiNotes.bind(this)],
- ['openCrossFramePort', this._onApiOpenCrossFramePort.bind(this)]
+ ['openCrossFramePort', this._onApiOpenCrossFramePort.bind(this)],
+ ['getLanguageSummaries', this._onApiGetLanguageSummaries.bind(this)]
]);
/* eslint-enable @stylistic/no-multi-spaces */

@@ -907,6 +909,11 @@ export class Backend {
return {targetTabId, targetFrameId};
}

+ /** @type {import('api').ApiHandler<'getLanguageSummaries'>} */
+ _onApiGetLanguageSummaries() {
+ return getLanguageSummaries();
+ }
+
// Command handlers

/**
@@ -2364,15 +2371,9 @@ export class Backend {
if (typeof deinflect !== 'boolean') { deinflect = true; }
const enabledDictionaryMap = this._getTranslatorEnabledDictionaryMap(options);
const {
- general: {mainDictionary, sortFrequencyDictionary, sortFrequencyDictionaryOrder},
+ general: {mainDictionary, sortFrequencyDictionary, sortFrequencyDictionaryOrder, language},
scanning: {alphanumeric},
translation: {
- convertHalfWidthCharacters,
- convertNumericCharacters,
- convertAlphabeticCharacters,
- convertHiraganaToKatakana,
- convertKatakanaToHiragana,
- collapseEmphaticSequences,
textReplacements: textReplacementsOptions,
searchResolution
}
@@ -2397,16 +2398,11 @@ export class Backend {
sortFrequencyDictionary,
sortFrequencyDictionaryOrder,
removeNonJapaneseCharacters: !alphanumeric,
- convertHalfWidthCharacters,
- convertNumericCharacters,
- convertAlphabeticCharacters,
- convertHiraganaToKatakana,
- convertKatakanaToHiragana,
- collapseEmphaticSequences,
searchResolution,
textReplacements,
enabledDictionaryMap,
- excludeDictionaryDefinitions
+ excludeDictionaryDefinitions,
+ language
};
}

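The diff shows `_onApiGetLanguageSummaries` delegating to `getLanguageSummaries()` from `languages.js`, but that file's contents are not rendered here. As a rough sketch only, assuming a summary is a reduced view of each language `descriptor` (the `descriptors` array and the exact summary fields below are assumptions, not taken from the PR):

```javascript
// Hypothetical descriptor list, modelled on the `descriptor` objects this PR adds
// for Japanese and English. The real languages.js may store them differently.
const descriptors = [
    {name: 'Japanese', iso: 'ja', exampleText: '読め', textPreprocessors: {}},
    {name: 'English', iso: 'en', exampleText: 'read', textPreprocessors: {}}
];

// Assumed behaviour: return only the fields a settings page would need,
// leaving out the preprocessor functions themselves.
function getLanguageSummaries() {
    return descriptors.map(({name, iso, exampleText}) => ({name, iso, exampleText}));
}

console.log(getLanguageSummaries());
```

Leaving functions out of the summaries would keep the result serializable, which matters if it has to cross the extension's message boundary; whether the PR does exactly this is not visible in the rendered hunks.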
7 changes: 7 additions & 0 deletions ext/js/comm/api.js
@@ -361,6 +361,13 @@ export class API {
return this._invoke('openCrossFramePort', {targetTabId, targetFrameId});
}

+ /**
+ * @returns {Promise<import('api').ApiReturn<'getLanguageSummaries'>>}
+ */
+ getLanguageSummaries() {
+ return this._invoke('getLanguageSummaries', void 0);
+ }
+
// Utilities

/**
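The new API method and the new backend handler entry are two halves of one round trip: the client names an action, and the backend looks it up in its handler table. The following is a minimal sketch of that pattern only. `apiHandlers` and `invoke` are hypothetical stand-ins; the real `_invoke` sends a message to the background context rather than calling a local map.

```javascript
// Hypothetical stand-in for the backend's handler table (action name -> handler).
const apiHandlers = new Map([
    ['getLanguageSummaries', () => [{name: 'Japanese', iso: 'ja', exampleText: '読め'}]]
]);

// Hypothetical stand-in for API._invoke: resolve the handler by name and call it.
async function invoke(action, params) {
    const handler = apiHandlers.get(action);
    if (typeof handler === 'undefined') {
        throw new Error(`Unknown action: ${action}`);
    }
    return await handler(params);
}

// Same shape as the method added to API: the action takes no parameters,
// so `void 0` (undefined) is passed.
function getLanguageSummaries() {
    return invoke('getLanguageSummaries', void 0);
}

getLanguageSummaries().then((summaries) => {
    console.log(summaries.map(({iso}) => iso));
});
```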
28 changes: 27 additions & 1 deletion ext/js/data/options-util.js
@@ -521,7 +521,8 @@ export class OptionsUtil {
this._updateVersion21,
this._updateVersion22,
this._updateVersion23,
- this._updateVersion24
+ this._updateVersion24,
+ this._updateVersion25
];
/* eslint-enable @typescript-eslint/unbound-method */
if (typeof targetVersion === 'number' && targetVersion < result.length) {
@@ -1135,6 +1136,31 @@ export class OptionsUtil {
}
}

+ /**
+ * - Added general.language.
+ * - Modularized text preprocessors.
+ * @type {import('options-util').UpdateFunction}
+ */
+ _updateVersion25(options) {
+ const textPreprocessors = [
+ 'convertHalfWidthCharacters',
+ 'convertNumericCharacters',
+ 'convertAlphabeticCharacters',
+ 'convertHiraganaToKatakana',
+ 'convertKatakanaToHiragana',
+ 'collapseEmphaticSequences'
+ ];
+
+ for (const {options: profileOptions} of options.profiles) {
+ profileOptions.general.language = 'ja';
+
+ for (const preprocessor of textPreprocessors) {
+ delete profileOptions.translation[preprocessor];
+ }
+ }
+ }
+
+
/**
* @param {string} url
* @returns {Promise<chrome.tabs.Tab>}
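To make the effect of `_updateVersion25` concrete, here is the same loop as a standalone function applied to a sample options object. The sample is hypothetical and heavily trimmed; real profiles carry many more fields.

```javascript
// Keys removed from `translation`, as listed in _updateVersion25.
const textPreprocessors = [
    'convertHalfWidthCharacters',
    'convertNumericCharacters',
    'convertAlphabeticCharacters',
    'convertHiraganaToKatakana',
    'convertKatakanaToHiragana',
    'collapseEmphaticSequences'
];

// Standalone copy of the update step, for illustration only.
function updateVersion25(options) {
    for (const {options: profileOptions} of options.profiles) {
        profileOptions.general.language = 'ja';
        for (const preprocessor of textPreprocessors) {
            delete profileOptions.translation[preprocessor];
        }
    }
    return options;
}

// Hypothetical pre-migration options with a single profile.
const sample = {
    profiles: [{
        options: {
            general: {enable: true},
            translation: {
                convertKatakanaToHiragana: 'variant',
                collapseEmphaticSequences: 'false',
                searchResolution: 'letter'
            }
        }
    }]
};

updateVersion25(sample);
console.log(JSON.stringify(sample.profiles[0].options));
// → {"general":{"enable":true,"language":"ja"},"translation":{"searchResolution":"letter"}}
```

Existing profiles end up with `language` set to `'ja'`, matching the schema default, and any previously stored preprocessor settings are dropped rather than carried over.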
29 changes: 29 additions & 0 deletions ext/js/language/en/language-english.js
@@ -0,0 +1,29 @@
/*
* Copyright (C) 2024 Yomitan Authors
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*/

import {capitalizeFirstLetter, decapitalize} from '../text-preprocessors.js';

/** @type {import('language-english').EnglishLanguageDescriptor} */
export const descriptor = {
name: 'English',
iso: 'en',
exampleText: 'read',
textPreprocessors: {
capitalizeFirstLetter,
decapitalize
}
};
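The English descriptor imports `capitalizeFirstLetter` and `decapitalize` from `text-preprocessors.js`, which is not rendered in this part of the diff. As a sketch, assuming they follow the same `{name, description, options, process}` shape as the Japanese preprocessors and that the basic options are simply off and on (the bodies, names and descriptions below are assumptions):

```javascript
// Assumed shape of the shared on/off option list; the real
// basicTextPreprocessorOptions is not shown in this diff.
const basicOptions = [false, true];

// Sketch of a "decapitalize" preprocessor: lowercases the text when enabled.
const decapitalize = {
    name: 'Decapitalize text',
    description: 'CAPITALIZED TEXT → capitalized text',
    options: basicOptions,
    process: (str, setting) => (setting ? str.toLowerCase() : str)
};

// Sketch of a "capitalize first letter" preprocessor.
const capitalizeFirstLetter = {
    name: 'Capitalize first letter',
    description: 'lowercase text → Lowercase text',
    options: basicOptions,
    process: (str, setting) => (setting ? str.charAt(0).toUpperCase() + str.slice(1) : str)
};

console.log(decapitalize.process('READ', true));          // → read
console.log(capitalizeFirstLetter.process('read', true)); // → Read
console.log(capitalizeFirstLetter.process('read', false)); // → read
```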
77 changes: 77 additions & 0 deletions ext/js/language/ja/language-japanese.js
@@ -0,0 +1,77 @@
/*
* Copyright (C) 2024 Yomitan Authors
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*/

import {basicTextPreprocessorOptions} from '../text-preprocessors.js';
import {convertAlphabeticToKana} from './japanese-wanakana.js';
import {collapseEmphaticSequences, convertHalfWidthKanaToFullWidth, convertHiraganaToKatakana, convertKatakanaToHiragana, convertNumericToFullWidth} from './japanese.js';

/** @type {import('language-japanese').JapaneseLanguageDescriptor} */
export const descriptor = {
name: 'Japanese',
iso: 'ja',
exampleText: '読め',
textPreprocessors: {
convertHalfWidthCharacters: {
name: 'Convert half width characters to full width',
description: 'ﾖﾐﾁｬﾝ → ヨミチャン',
options: basicTextPreprocessorOptions,
/** @type {import('language').TextPreprocessorFunction<boolean>} */
process: (str, setting, sourceMap) => (setting ? convertHalfWidthKanaToFullWidth(str, sourceMap) : str)
},
convertNumericCharacters: {
name: 'Convert numeric characters to full width',
description: '1234 → １２３４',
options: basicTextPreprocessorOptions,
/** @type {import('language').TextPreprocessorFunction<boolean>} */
process: (str, setting) => (setting ? convertNumericToFullWidth(str) : str)
},
convertAlphabeticCharacters: {
name: 'Convert alphabetic characters to hiragana',
description: 'yomichan → よみちゃん',
options: basicTextPreprocessorOptions,
/** @type {import('language').TextPreprocessorFunction<boolean>} */
process: (str, setting, sourceMap) => (setting ? convertAlphabeticToKana(str, sourceMap) : str)
},
convertHiraganaToKatakana: {
name: 'Convert hiragana to katakana',
description: 'よみちゃん → ヨミチャン',
options: basicTextPreprocessorOptions,
/** @type {import('language').TextPreprocessorFunction<boolean>} */
process: (str, setting) => (setting ? convertHiraganaToKatakana(str) : str)
},
convertKatakanaToHiragana: {
name: 'Convert katakana to hiragana',
description: 'ヨミチャン → よみちゃん',
options: basicTextPreprocessorOptions,
/** @type {import('language').TextPreprocessorFunction<boolean>} */
process: (str, setting) => (setting ? convertKatakanaToHiragana(str) : str)
},
collapseEmphaticSequences: {
name: 'Collapse emphatic character sequences',
description: 'すっっごーーい → すっごーい / すごい',
options: [[false, false], [true, false], [true, true]],
/** @type {import('language').TextPreprocessorFunction<[collapseEmphatic: boolean, collapseEmphaticFull: boolean]>} */
process: (str, setting, sourceMap) => {
const [collapseEmphatic, collapseEmphaticFull] = setting;
if (collapseEmphatic) {
str = collapseEmphaticSequences(str, collapseEmphaticFull, sourceMap);
}
return str;
}
}
}
};
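Each preprocessor pairs a list of `options` with a `process` function, which suggests the consumer tries every option and collects the resulting text variants. The sketch below shows one plausible way to do that. It is not taken from the diff: `getTextVariants` is a hypothetical helper, the two kana converters are simplified code-point shifts standing in for the real helpers in `japanese.js`, and the `sourceMap` argument is ignored.

```javascript
// Simplified stand-in: shift katakana U+30A1..U+30F6 down by 0x60 to get hiragana.
function katakanaToHiragana(text) {
    let result = '';
    for (const char of text) {
        const codePoint = char.codePointAt(0);
        result += (codePoint >= 0x30a1 && codePoint <= 0x30f6) ? String.fromCodePoint(codePoint - 0x60) : char;
    }
    return result;
}

// Simplified stand-in: shift hiragana U+3041..U+3096 up by 0x60 to get katakana.
function hiraganaToKatakana(text) {
    let result = '';
    for (const char of text) {
        const codePoint = char.codePointAt(0);
        result += (codePoint >= 0x3041 && codePoint <= 0x3096) ? String.fromCodePoint(codePoint + 0x60) : char;
    }
    return result;
}

// Two preprocessors in the same {options, process} shape as the descriptor above.
const preprocessors = {
    convertKatakanaToHiragana: {
        options: [false, true],
        process: (str, setting) => (setting ? katakanaToHiragana(str) : str)
    },
    convertHiraganaToKatakana: {
        options: [false, true],
        process: (str, setting) => (setting ? hiraganaToKatakana(str) : str)
    }
};

// Hypothetical helper: apply every option of every preprocessor in turn,
// keeping the set of distinct strings produced so far.
function getTextVariants(text, textPreprocessors) {
    let variants = new Set([text]);
    for (const {options, process} of Object.values(textPreprocessors)) {
        const next = new Set();
        for (const variant of variants) {
            for (const option of options) {
                next.add(process(variant, option));
            }
        }
        variants = next;
    }
    return [...variants];
}

console.log(getTextVariants('ヨミちゃん', preprocessors));
```

Using a `Set` keeps the number of lookups down when different option combinations yield the same string, which happens often once several preprocessors are chained.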