Korean_FA: Korean Forced-Aligner

v.1.7.0(09.03.23)

Docker

Korean Forced Aligner (Korean_FA) can now run inside a Docker container. Running the Docker image starts a web interface where you can upload audio and text pairs and download the resulting TextGrid files. Follow the instructions below.
If Docker is not installed on your system, download and install it from Docker's official website.

Installation

(recommended) Pulling a Docker image
- Run the following command in your terminal (macOS, Linux, or Windows WSL).
```
 $ docker pull hyung8758/koreanfa
```
Building a Docker image
- To build a Docker image, navigate to the Korean_FA directory and execute the following command.
```
 $ bash ./build.sh
```
- Inside the "build.sh" file, you can customize the USE_BUILDX and USE_SUDO variables if needed. The script should work as-is, although system-specific errors may occur.
- After a successful build, "korean_fa_app" should appear in the list of Docker images.
```
 $ docker images # Displays a list of Docker images.
```
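To check for the image from a script rather than by eye, something like the following could be used. It is a minimal sketch: the function name `has_image` is hypothetical, and it assumes the image repository name is exactly "korean_fa_app" as stated above.

```shell
# Hypothetical helper: succeed if a local Docker image with the given
# repository name exists.
has_image() {
    local name="$1"
    # --format prints only the repository column, one image per line;
    # grep -x matches the whole line and -F treats the name literally.
    docker images --format '{{.Repository}}' | grep -qxF -- "$name"
}

# Example: has_image korean_fa_app && echo "image found"
```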

Usage

Starting the Container from the Image.

 $ docker run -d -p 31066:31066 --name korean_fa_web_server hyung8758/koreanfa

Once you've executed the command, you can access the Korean_FA web UI at http://localhost:31066
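The server may need a moment before it answers. The sketch below polls the URL until it responds. The function name `wait_for_ui` is hypothetical, and the sketch assumes that the root page returns a successful HTTP status once the web UI is ready, which the text above does not confirm.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_ui URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ui() {
    local url="$1" attempts="${2:-10}" delay="${3:-1}" i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -f: treat HTTP errors as failure, -o: discard the body.
        if curl -sf -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: wait_for_ui http://localhost:31066 30 1 && echo "web UI is up"
```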
Stopping a Docker Container.
```
 $ docker stop korean_fa_web_server
```
Restarting a Docker Container.
```
 $ docker start korean_fa_web_server
```

Removing Korean_FA Container and Image.

 $ docker rm korean_fa_web_server # Remove the container. Stop it first.
 $ docker rmi hyung8758/koreanfa # Remove the image.

Local Environment

Running Korean_FA from the Docker image is highly recommended. If you prefer to run it directly in a terminal, follow the steps below.

OS

macOS 11.0.1 (Big Sur): stable.
Linux (Ubuntu 18.04): stable.
Windows: unstable (not tested).

Prerequisite

Installing Kaldi

Run the following commands in a terminal.

 $ git clone https://github.com/kaldi-asr/kaldi.git kaldi --origin upstream
 $ cd kaldi
 $ git pull

Read the INSTALL file and follow its instructions.

Installing Dependencies
- You will need Python 3.8 or later. One way to get it is to set up a virtual environment with Conda.
- On macOS
```
 $ brew install sox coreutils
 $ pip install -r requirements.txt
```
- On Ubuntu
```
 $ apt-get install sox coreutils
 $ pip install -r requirements.txt
```

Usage

Navigate to the 'Korean_FA' directory.
Open 'forced_align.sh' in a text editor and set the path to your Kaldi directory.
- Change the value of the 'kaldi' variable (initial setting: kaldi=/home/kaldi).
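The same change can be made from the command line. The following is a minimal sketch, not part of Korean_FA: the function name `set_kaldi_path` is hypothetical, and it assumes the assignment sits at the start of a line in the form `kaldi=/home/kaldi`. The new path must not contain `|` or `&`, which are special in the sed expression used here.

```shell
# Hypothetical helper: point the 'kaldi' variable in a script at a new path.
# Assumes the assignment starts a line, e.g. kaldi=/home/kaldi.
set_kaldi_path() {
    local script="$1" kaldi_dir="$2"
    # -i.bak edits in place and keeps a backup copy; this form works with
    # both GNU sed (Linux) and BSD sed (macOS).
    sed -i.bak "s|^kaldi=.*|kaldi=${kaldi_dir}|" "$script"
}

# Example: set_kaldi_path forced_align.sh "$HOME/kaldi"
```

The original file is kept next to the edited one with a `.bak` suffix, so the change is easy to undo.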

Run the script with the path to the data directory you want to align.

$ bash forced_align.sh (options) (data directory)
$ bash forced_align.sh -nw ./example/readspeech
- Options
 	1. -h  | --help    : Display usage instructions.
 	2. -nj | --num-job : Align files in parallel to speed up processing.
 	3. -s  | --skip    : Skip files that have already been aligned.
 	4. -nw | --no-word : Remove the word tier.
 	5. -np | --no-phone: Remove the phone tier.

TextGrid file(s) will be saved in the data directory.
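After a run, it can be useful to list the audio files that did not receive a TextGrid. The sketch below does that under one assumption that the text above does not confirm: each output file shares the base name of its audio file and ends in `.TextGrid`. The function name `missing_textgrids` is hypothetical.

```shell
# Hypothetical helper: print every .wav file in a directory that has no
# TextGrid next to it (assumed naming: name01.wav -> name01.TextGrid).
missing_textgrids() {
    local dir="$1" wav
    for wav in "$dir"/*.wav; do
        [ -e "$wav" ] || continue   # the directory holds no .wav files
        [ -e "${wav%.wav}.TextGrid" ] || printf '%s\n' "$wav"
    done
}

# Example: missing_textgrids ./example/readspeech
```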

Materials (Data Preparation)

Audio files (.wav, 16,000 Hz sampling rate)
- Audio files must be in WAV format ('.wav') with a sampling rate of 16,000 Hz, the rate Korean_FA is designed to work with.
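Audio recorded at another rate can be converted beforehand with sox, which is already listed among the dependencies. The following is a minimal sketch: the function name `to_16k_mono` is hypothetical, and the mono, 16-bit settings are taken from the v.1.6.2 entry in the version history rather than from a stated input requirement.

```shell
# Hypothetical helper: write a 16 kHz, mono, 16-bit copy of an audio file.
to_16k_mono() {
    local src="$1" dst="$2"
    # Options placed before the output file name apply to the output:
    # -r sample rate, -c number of channels, -b bits per sample.
    sox "$src" -r 16000 -c 1 -b 16 "$dst"
}

# Example: to_16k_mono original.wav name01.wav
```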
Text files (.txt)
- When naming your transcription text files, please use ordered numbers as suffixes.
  - Example: name01.txt, name02.txt, ...
- Each text file should contain one complete sentence.
- Refrain from including any punctuation marks such as periods ('.') or commas (',') in the text file.
- The sentences should be written in the target language.
- Ensure there are no trailing white spaces or tabs at the end of each line.
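Two of the rules above are easy to check automatically before alignment. The sketch below flags punctuation and trailing white space in a transcription file. The function name `check_transcript` is hypothetical, and the punctuation set (`.`, `,`, `!`, `?`) is an assumption; extend it to match your data.

```shell
# Hypothetical helper: return non-zero if a transcription file breaks the
# punctuation or trailing-whitespace rule. Problems are reported on stderr.
check_transcript() {
    local file="$1" status=0
    if grep -q '[.,!?]' "$file"; then
        echo "$file: contains punctuation" >&2
        status=1
    fi
    if grep -q '[[:space:]]$' "$file"; then
        echo "$file: trailing white space or tab" >&2
        status=1
    fi
    return "$status"
}

# Example: for f in ./example/readspeech/*.txt; do check_transcript "$f"; done
```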

VERSION HISTORY

v.1.0(08/27/16): Introduced gmm, sgmm_mmi, and dnn-based Korean FA.
v.1.1(09/06/16): Updated g2p. Added the monophone model.
v.1.2(10/10/16): Simplified phoneset. Removed the option to choose models like dnn or gmm for forced alignment.
v.1.3(10/24/16): Introduced the ability to select specific labels in TextGrid. Changed the alignment procedure to align audio files one by one in the directory. This change increased alignment time but improved accuracy. Detailed alignment process information is now available in the log directory, along with more useful command-line information.
v.1.4(01.14.16): Improved error handling. Log files are now tagged with respect to each wave file name.
v.1.5(02.08.17): Made major changes for compatibility with the new g2p system. Added an option to skip alignment of audio files that already have TextGrids. Fixed a few minor bugs.
v.1.5.1(02.26.17): Fixed reported bugs, in particular a time mismatch in the word tier.
v.1.5.2(05.17.17): Made changes to return and exit procedures, resolved option errors, and fixed minor bugs. Added a skip option.
v.1.5.3(07.10.17): Enabled the alignment of long audio files. More information is now printed on the screen. Resolved a floating-point error causing a time mismatch between the start and end points of each phone segment.
v.1.5.4(11.14.17): Addressed a floating-point error in the Kaldi code. The error is now resolved during post-processing, and time mismatch errors are no longer expected to occur.
v.1.6(01.18.18): Introduced the num-job option to split multiple files into subgroups and align them simultaneously, speeding up the alignment process. Changed how log histories are printed and adjusted the structure of the main script.
v.1.6.1(10.28.20): Stabilized the lexicon process. Removed redundant jobs in main_fa.sh and fa_prep_data.sh.
v.1.6.2(01.03.21): Adjusted the audio processing part (sampling rate = 16,000, channel = 1, bit = 16) and changed log variables.
v.1.7.0(09.03.23): Introduced a web UI available in a Docker image.
