Merge pull request #3 from pettarin/master
aeneas v1.0.4: added boundary adjustment algorithm, run_vad, subtitle…
readbeyond committed Aug 9, 2015
2 parents 75a0ec8 + 186ba18 commit 45ffff7
Showing 70 changed files with 2,720 additions and 483 deletions.
31 changes: 23 additions & 8 deletions README.md
@@ -2,8 +2,8 @@

**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.

* Version: 1.0.3
* Date: 2015-06-12
* Version: 1.0.4
* Date: 2015-08-09
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -39,7 +39,7 @@ and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),

Moreover, the map can be output in several formats: SMIL for EPUB 3,
SRT/TTML/VTT for closed captioning, JS for Web usage,
or raw CSV/TSV/TXT/XML for further processing.
or raw CSV/SSV/TSV/TXT/XML for further processing.


## System Requirements, Supported Platforms and Installation
@@ -76,7 +76,8 @@ callable by the `subprocess` Python module.
A way to ensure the latter consists
in adding the three executables to your `$PATH`.
Alternatively, you can use VirtualBox
to run **aeneas** inside a virtualized Debian image.
to run **aeneas** inside a virtualized Debian image,
for example using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
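
As a quick sanity check that the three executables are reachable, the following is a minimal sketch and not part of **aeneas** itself. It assumes Python 3.3 or later (for `shutil.which`), and the helper name `find_missing` is hypothetical.

```python
import shutil

# The three external tools named in the requirements above.
REQUIRED_EXECUTABLES = ("ffmpeg", "ffprobe", "espeak")


def find_missing(executables=REQUIRED_EXECUTABLES):
    """Return the names that cannot be resolved on the current PATH.

    Hypothetical helper for illustration only; it is not an aeneas API.
    """
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required executables are on PATH.")
```

`shutil.which` looks the name up on the `PATH` of the current process, so an executable it cannot find is unlikely to be found when launched by name through `subprocess` either.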

### Installation

@@ -91,6 +92,15 @@ If the last command prints a success message,
you have all the required dependencies installed
and you can confidently run **aeneas** in production.

If you get an error, try running the
[provided `install_dependencies.sh` script](install_dependencies.sh)

```bash
$ sudo bash install_dependencies.sh
```

and then try running `check_dependencies.py` again.

Alternatively, consider using the [Vagrant box](http://www.vagrantup.com)
created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).

@@ -156,10 +166,12 @@ $ make html

Tutorial: [A Practical Introduction To The aeneas Package](http://www.albertopettarin.it/blog/2015/05/21/a-practical-introduction-to-the-aeneas-package.html)

Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https://groups.google.com/d/forum/aeneas-forced-alignment)


## Supported Features

* Input text files in plain, parsed or unparsed format
* Input text files in plain, parsed, subtitles, or unparsed format
* Text extraction from XML (e.g., XHTML) files using `id` and `class` attributes
* Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
* Input audio file formats: all those supported by `ffmpeg`
@@ -168,6 +180,7 @@ Tutorial: [A Practical Introduction To The aeneas Package](http://www.albertopet
* Supported (= tested) languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, TR, UK
* Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
* Code suitable for a Web app deployment (e.g., on-demand AWS instances)
* Adjustable splitting times, including a max character/second constraint for CC applications
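
To illustrate what a max character/second constraint means for closed captions, here is a minimal sketch. It is not the algorithm used by **aeneas**; the function names, the `(begin, end, text)` tuple layout and the threshold value are assumptions made for this example.

```python
def chars_per_second(text, begin, end):
    """Reading rate of one fragment, in characters per second."""
    duration = end - begin
    if duration <= 0:
        raise ValueError("fragment must have a positive duration")
    return len(text) / duration


def fragments_over_rate(fragments, max_rate):
    """Return the (begin, end, text) fragments whose rate exceeds max_rate."""
    return [
        fragment
        for fragment in fragments
        if chars_per_second(fragment[2], fragment[0], fragment[1]) > max_rate
    ]


if __name__ == "__main__":
    # Hypothetical sync map: times in seconds, one tuple per fragment.
    sync_map = [
        (0.0, 2.0, "Hello world"),
        (2.0, 2.5, "This line is far too long"),
    ]
    for begin, end, text in fragments_over_rate(sync_map, max_rate=21.0):
        print("Too fast: %.1f-%.1f %r" % (begin, end, text))
```

A fragment flagged this way is a candidate for having its boundaries moved so that it stays on screen longer.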


## Limitations and Missing Features
@@ -189,9 +202,8 @@ Tutorial: [A Practical Introduction To The aeneas Package](http://www.albertopet
* Reporting the alignment score
* Improving (removing?) dependency from `espeak`, `ffmpeg`, `ffprobe` executables
* Multilevel sync map granularity (e.g., multilevel SMIL output)
* Enforcing a max char/second constraint for CC applications
* Supporting input text encodings other than UTF-8
* Adding more languages
* Adding (testing) more languages
* Better documentation
* Testing other approaches, like HMM
* Publishing the package on PyPI
@@ -249,6 +261,10 @@ No copy rights were harmed in the making of this project.

## Supporting and Contributing

### Sponsors

* **July 2015**: [Michele Gianella](https://plus.google.com/+michelegianella/about) generously supported the development of the boundary adjustment code

### Supporting

Would you like supporting the development of **aeneas**?
@@ -334,4 +350,3 @@ helped shaping the structure of this package
for its asynchronous usage.
34 changes: 27 additions & 7 deletions README.txt
@@ -4,8 +4,8 @@ aeneas
**aeneas** is a Python library and a set of tools to automagically
synchronize audio and text.

- Version: 1.0.3
- Date: 2015-06-12
- Version: 1.0.4
- Date: 2015-08-09
- Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
- Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
- License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -43,7 +43,7 @@ audio file <aeneas/tests/res/container/job/assets/p001.mp3>`__,

Moreover, the map can be output in several formats: SMIL for EPUB 3,
SRT/TTML/VTT for closed captioning, JS for Web usage, or raw
CSV/TSV/TXT/XML for further processing.
CSV/SSV/TSV/TXT/XML for further processing.

System Requirements, Supported Platforms and Installation
---------------------------------------------------------
@@ -81,7 +81,8 @@ sure ``ffmpeg``, ``ffprobe`` and ``espeak`` are properly installed and
callable by the ``subprocess`` Python module. A way to ensure the latter
consists in adding the three executables to your ``$PATH``.
Alternatively, you can use VirtualBox to run **aeneas** inside a
virtualized Debian image.
virtualized Debian image, for example using
`aeneas-vagrant <https://github.com/readbeyond/aeneas-vagrant>`__.

Installation
~~~~~~~~~~~~
@@ -97,6 +98,15 @@ If the last command prints a success message, you have all the required
dependencies installed and you can confidently run **aeneas** in
production.

If you get an error, try running the `provided
``install_dependencies.sh`` script <install_dependencies.sh>`__

.. code:: bash

$ sudo bash install_dependencies.sh

and then try running ``check_dependencies.py`` again.

Alternatively, consider using the `Vagrant
box <http://www.vagrantup.com>`__ created by
`aeneas-vagrant <https://github.com/readbeyond/aeneas-vagrant>`__.
@@ -165,10 +175,12 @@ Generated from the source (requires ``sphinx``):
Tutorial: `A Practical Introduction To The aeneas
Package <http://www.albertopettarin.it/blog/2015/05/21/a-practical-introduction-to-the-aeneas-package.html>`__

Mailing list: https://groups.google.com/d/forum/aeneas-forced-alignment

Supported Features
------------------

- Input text files in plain, parsed or unparsed format
- Input text files in plain, parsed, subtitles, or unparsed format
- Text extraction from XML (e.g., XHTML) files using ``id`` and
``class`` attributes
- Arbitrary text fragment granularity (single word, subphrase, phrase,
@@ -183,6 +195,8 @@ Supported Features
of words, background noise/sporadic spikes
- Code suitable for a Web app deployment (e.g., on-demand AWS
instances)
- Adjustable splitting times, including a max character/second
constraint for CC applications

Limitations and Missing Features
--------------------------------
@@ -206,9 +220,8 @@ TODO List
- Improving (removing?) dependency from ``espeak``, ``ffmpeg``,
``ffprobe`` executables
- Multilevel sync map granularity (e.g., multilevel SMIL output)
- Enforcing a max char/second constraint for CC applications
- Supporting input text encodings other than UTF-8
- Adding more languages
- Adding (testing) more languages
- Better documentation
- Testing other approaches, like HMM
- Publishing the package on PyPI
@@ -265,6 +278,13 @@ No copy rights were harmed in the making of this project.
Supporting and Contributing
---------------------------

Sponsors
~~~~~~~~

- **July 2015**: `Michele
Gianella <https://plus.google.com/+michelegianella/about>`__
generously supported the development of the boundary adjustment code

Supporting
~~~~~~~~~~

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.0.3
1.0.4
4 changes: 3 additions & 1 deletion aeneas/__init__.py
@@ -6,6 +6,7 @@
to automagically synchronize audio and text.
"""

from aeneas.adjustboundaryalgorithm import AdjustBoundaryAlgorithm
from aeneas.analyzecontainer import AnalyzeContainer
from aeneas.audiofile import AudioFile
from aeneas.container import Container, ContainerFormat
@@ -27,6 +28,7 @@
from aeneas.synthesizer import Synthesizer
from aeneas.task import Task, TaskConfiguration
from aeneas.textfile import TextFile, TextFileFormat, TextFragment
from aeneas.vad import VAD
from aeneas.validator import Validator

__author__ = "Alberto Pettarin"
@@ -35,7 +37,7 @@
Copyright 2013-2015, ReadBeyond Srl (www.readbeyond.it)
"""
__license__ = "GNU AGPL v3"
__version__ = "1.0.3"
__version__ = "1.0.4"
__email__ = "[email protected]"
__status__ = "Production"
