Merge pull request #62 from gsantiago/feat/streams
Release v4.0.0
gsantiago authored Sep 19, 2020
2 parents 029b4a7 + 54f3489 commit abf7831
Showing 44 changed files with 69,114 additions and 15,003 deletions.
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -5,6 +5,14 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## [4.0.0] - 2020-09-19
- Fixes #6 by introducing the stream interface (`parse`, `stringify` and `resync` are now stream-based functions)
- Add `parseSync` and `stringifySync` as synchronous versions of `parse` and `stringify`
- Add `map` and `filter` to manipulate the parse stream
- Update the nodes tree so it can support more types than just a cue
- Refactor the internals by creating the Parser and Formatter classes
- Format types are now `"SRT"` and `"WebVTT"` instead of `"srt"` and `"vtt"`

## [3.0.0] - 2020-08-31
- Rewrite the project with TypeScript
- Fixes #43 and #39
@@ -15,7 +23,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
- `parseTimestamp(timestamp: string): number`
- `parseTimestamps(timestamps: string): Timestamp`
- `formatTimestamp(timestamp: number, options?: { format: 'srt' | 'vtt' }): string`
- `parse` supports optional indexes

## [2.0.5] - 2020-08-28
- Remove zero-fill dependency
276 changes: 252 additions & 24 deletions README.md
@@ -6,11 +6,18 @@
[![downloads](https://img.shields.io/npm/dm/subtitle?style=flat-square)](https://www.npmjs.com/package/subtitle)
[![npm](https://img.shields.io/npm/v/subtitle?style=flat-square)](https://www.npmjs.com/package/subtitle)

Parse, manipulate and stringify SRT (SubRip) format, with partial support for WebVTT.
Stream-based library for parsing and manipulating subtitle files.

>["Thanks for this rad package!"](https://github.com/gsantiago/subtitle.js/pull/15#issuecomment-282879854)
>John-David Dalton, creator of Lodash

:white_check_mark: Stream API<br>
:white_check_mark: Written in TypeScript<br>
:white_check_mark: SRT support<br>
:white_check_mark: Partial support for WebVTT (full support coming soon)<br>
:white_check_mark: 100% code coverage<br>
:white_check_mark: Actively maintained since 2015

## Installation

### npm
@@ -21,79 +28,227 @@ Parse, manipulate and stringify SRT (SubRip) format, with partial support for We

`yarn add subtitle`

## Usage

This library provides stream-based functions for working with subtitles. The following example parses an SRT file, resyncs it and outputs a VTT file:

```ts
import fs from 'fs'
import { parse, resync, stringify } from 'subtitle'

fs.createReadStream('./my-subtitles.srt')
  .pipe(parse())
  .pipe(resync(-100))
  .pipe(stringify({ format: 'WebVTT' }))
  .pipe(fs.createWriteStream('./my-subtitles.vtt'))
```

It also provides functions like `map` and `filter`:

```ts
import { parse, map, filter, stringify } from 'subtitle'

inputStream
  .pipe(parse())
  .pipe(
    filter(
      // strips all cues that contain "𝅘𝅥𝅮"
      node => !(node.type === 'cue' && node.data.text.includes('𝅘𝅥𝅮'))
    )
  )
  .pipe(
    map(node => {
      if (node.type === 'cue') {
        // convert all cues to uppercase
        node.data.text = node.data.text.toUpperCase()
      }

      return node
    })
  )
  .pipe(stringify({ format: 'WebVTT' }))
  .pipe(outputStream)
```

Besides the stream functions, this module also provides the synchronous functions `parseSync` and `stringifySync`. Prefer the stream-based functions where possible, since they perform better:

```ts
import { parseSync, stringifySync } from 'subtitle'

const nodes = parseSync(srtContent)

// do something with your subtitles
// ...

const output = stringifySync(nodes, { format: 'WebVTT' })
```

## API

The module exports the following functions:

* [`parse`](#parse)
* [`parseSync`](#parseSync)
* [`stringify`](#stringify)
* [`stringifySync`](#stringifySync)
* [`map`](#map)
* [`filter`](#filter)
* [`resync`](#resync)
* [`parseTimestamp`](#parseTimestamp)
* [`parseTimestamps`](#parseTimestamps)
* [`formatTimestamp`](#formatTimestamp)

### parse

- `parse(): DuplexStream`

It returns a Duplex stream for parsing subtitle contents (SRT or WebVTT).

```ts
import { parse } from 'subtitle'

inputStream
  .pipe(parse())
  .on('data', node => {
    console.log('parsed node:', node)
  })
  .on('error', console.error)
  .on('finish', () => console.log('parser has finished'))
```

Check out the [Examples](#examples) section for more examples.

### parseSync

- `parseSync(input: string): Node[]`

> **NOTE**: For better performance, consider using the stream-based `parse` function.

It receives a string containing SRT or WebVTT content and returns
an array of nodes:

```ts
import { parseSync } from 'subtitle'
import fs from 'fs'

const input = fs.readFileSync('awesome-movie.srt', 'utf8')

parseSync(input)

// returns an array like this:
[
  {
    type: 'cue',
    data: {
      start: 20000, // milliseconds
      end: 24400,
      text: 'Bla Bla Bla Bla'
    }
  },
  {
    type: 'cue',
    data: {
      start: 24600,
      end: 27800,
      text: 'Bla Bla Bla Bla',
      settings: 'align:middle line:90%'
    }
  },
  // ...
]
```
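The node shape shown above can be modeled as a discriminated union. The following is a minimal sketch rather than the library's own typings: the names `CueData`, `SubtitleNode`, `isCueNode` and `totalCueDuration` are hypothetical, and the `header` variant is assumed from the header/cue split this README describes.

```typescript
// Shapes inferred from the README samples; the library's real type names may differ.
interface CueData {
  start: number // milliseconds
  end: number // milliseconds
  text: string
  settings?: string // WebVTT cue settings, when present
}

type SubtitleNode =
  | { type: 'header'; data: string }
  | { type: 'cue'; data: CueData }

// Type guard that narrows a node to the cue variant.
function isCueNode(
  node: SubtitleNode
): node is { type: 'cue'; data: CueData } {
  return node.type === 'cue'
}

// Total on-screen time of all cues, in milliseconds.
function totalCueDuration(nodes: SubtitleNode[]): number {
  return nodes
    .filter(isCueNode)
    .reduce((sum, node) => sum + (node.data.end - node.data.start), 0)
}

const sampleNodes: SubtitleNode[] = [
  { type: 'header', data: 'WEBVTT - Header content' },
  { type: 'cue', data: { start: 20000, end: 24400, text: 'Bla Bla Bla Bla' } },
  {
    type: 'cue',
    data: {
      start: 24600,
      end: 27800,
      text: 'Bla Bla Bla Bla',
      settings: 'align:middle line:90%'
    }
  }
]

console.log(totalCueDuration(sampleNodes)) // 7600
```

Checking `node.type` before touching `node.data` keeps the code safe if more node types are added later.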

### stringify

- `stringify({ format: 'SRT' | 'WebVTT' }): DuplexStream`

It returns a Duplex that receives parsed nodes and transmits the node formatted in SRT or WebVTT:

```ts
import { parse, stringify } from 'subtitle'

inputStream
  .pipe(parse())
  .pipe(stringify({ format: 'WebVTT' }))
```

Check out the [Examples](#examples) section for more examples.

### stringifySync

- `stringifySync(nodes: Node[], options: { format: 'SRT' | 'WebVTT' }): string`

> **NOTE**: For better performance, consider using the stream-based `stringify` function.

It receives an array of nodes and returns a string in SRT (default), but it also supports WebVTT format through the options.

```ts
import { stringifySync } from 'subtitle'

stringifySync(nodes, { format: 'SRT' })
// returns a string in SRT format

stringifySync(nodes, { format: 'WebVTT' })
// returns a string in VTT format
```

### map

- `map(callback: function): DuplexStream`

A useful Duplex for manipulating parsed nodes. It works similarly to the `Array.map` function, but for streams:

```ts
import { parse, map, stringify } from 'subtitle'

inputStream
  .pipe(parse())
  .pipe(map((node, index) => {
    if (node.type === 'cue') {
      node.data.text = node.data.text.toUpperCase()
    }

    return node
  }))
  .pipe(stringify({ format: 'SRT' }))
  .pipe(outputStream)
```
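Any function that takes a node and returns a node can serve as the callback. As an illustration only, the sketch below defines a hypothetical `stripTags` callback that removes simple markup such as `<i>...</i>` from cue text. The `CueData` and `SubtitleNode` types are assumptions inferred from the node samples in this README, and the callback is applied with `Array.prototype.map` so the snippet runs without the library.

```typescript
// Assumed node shapes, inferred from the README samples (not the library's typings).
interface CueData {
  start: number
  end: number
  text: string
  settings?: string
}

type SubtitleNode =
  | { type: 'header'; data: string }
  | { type: 'cue'; data: CueData }

// Hypothetical callback: returns a copy of the node with tags removed from cue text.
// Non-cue nodes are passed through untouched.
function stripTags(node: SubtitleNode): SubtitleNode {
  if (node.type !== 'cue') {
    return node
  }

  return {
    ...node,
    data: { ...node.data, text: node.data.text.replace(/<[^>]+>/g, '') }
  }
}

const rawNodes: SubtitleNode[] = [
  { type: 'header', data: 'WEBVTT' },
  { type: 'cue', data: { start: 0, end: 1000, text: '<i>Hello</i> world' } }
]

// Array analog of piping the nodes through a map stream.
console.log(rawNodes.map(stripTags))
```

Unlike the example above, this callback returns a new object instead of mutating the node; either style should work with a map-like stream, but copying avoids surprises if the same node is referenced elsewhere.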

### filter

- `filter(callback: function): DuplexStream`

A useful Duplex for filtering parsed nodes. It works similarly to the `Array.filter` function, but for streams:

```ts
import { parse, filter, stringify } from 'subtitle'

inputStream
  .pipe(parse())
  .pipe(filter((node, index) => {
    return !(node.type === 'cue' && node.data.text.includes('𝅘𝅥𝅮'))
  }))
  .pipe(stringify({ format: 'SRT' }))
  .pipe(outputStream)
```
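Predicates can also be built from parameters. The sketch below is a hypothetical helper, `keepCuesLastingAtLeast`, that produces a predicate dropping cues shorter than a given duration while keeping every non-cue node. The node types are assumptions based on the samples in this README, and the predicate is exercised with `Array.prototype.filter` so the snippet is self-contained.

```typescript
// Assumed node shapes, inferred from the README samples (not the library's typings).
interface CueData {
  start: number
  end: number
  text: string
  settings?: string
}

type SubtitleNode =
  | { type: 'header'; data: string }
  | { type: 'cue'; data: CueData }

// Hypothetical predicate factory: keeps non-cue nodes and cues lasting at least minMs.
function keepCuesLastingAtLeast(minMs: number): (node: SubtitleNode) => boolean {
  return node =>
    node.type !== 'cue' || node.data.end - node.data.start >= minMs
}

const mixedNodes: SubtitleNode[] = [
  { type: 'header', data: 'WEBVTT' },
  { type: 'cue', data: { start: 1200, end: 1300, text: 'Blink' } }, // 100 ms
  { type: 'cue', data: { start: 2000, end: 4000, text: 'Readable' } } // 2000 ms
]

// Array analog of piping the nodes through a filter stream.
const kept = mixedNodes.filter(keepCuesLastingAtLeast(500))
console.log(kept.length) // 2
```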

### resync

- `resync(time: number): DuplexStream`

Resync all cues from the stream:

```ts
import { parse, resync, stringify } from 'subtitle'

// Advance subtitles by 1s
readableStream
  .pipe(parse())
  .pipe(resync(1000))
  .pipe(outputStream)

// Delay 250ms
stream.pipe(resync(-250))
```
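Conceptually, resyncing shifts each cue's timestamps by the given number of milliseconds. The sketch below illustrates that arithmetic under the assumption that the offset is simply added to both `start` and `end`; the `shiftNode` helper and the node types are hypothetical, and negative results are not clamped here.

```typescript
// Assumed node shapes, inferred from the README samples (not the library's typings).
interface CueData {
  start: number // milliseconds
  end: number // milliseconds
  text: string
  settings?: string
}

type SubtitleNode =
  | { type: 'header'; data: string }
  | { type: 'cue'; data: CueData }

// Hypothetical helper: adds offsetMs to a cue's start and end; other nodes pass through.
function shiftNode(node: SubtitleNode, offsetMs: number): SubtitleNode {
  if (node.type !== 'cue') {
    return node
  }

  return {
    ...node,
    data: {
      ...node.data,
      start: node.data.start + offsetMs,
      end: node.data.end + offsetMs
    }
  }
}

const original: SubtitleNode = {
  type: 'cue',
  data: { start: 1000, end: 2000, text: 'Hello' }
}

console.log(shiftNode(original, -250))
```

The duration of each cue is unchanged by the shift, since the same offset is applied to both ends.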

### parseTimestamp
@@ -130,7 +285,7 @@ parseTimestamps('12:34:56,789 --> 98:76:54,321 align:middle line:90%')

### formatTimestamp

- `formatTimestamp(timestamp: number, options?: { format: 'SRT' | 'WebVTT' }): string`

It receives a timestamp in milliseconds and returns it formatted as SRT or VTT:

@@ -140,10 +295,83 @@ import { formatTimestamp } from 'subtitle'
formatTimestamp(142542)
// => '00:02:22,542'

formatTimestamp(142542, { format: 'WebVTT' })
// => '00:02:22.542'
```

## Examples

### Nodes

This is what a list of nodes looks like:

```ts
[
  {
    type: 'header',
    data: 'WEBVTT - Header content'
  },
  {
    type: 'cue',
    data: {
      start: 150066, // timestamp in milliseconds
      end: 158952,
      text: 'With great power comes great responsibility'
    }
  },
  // ...
]
```

For now, it only supports two types of node: `header` and `cue`. Soon, it will support more types
like `comment`.

### Convert SRT file to WebVTT

```ts
import fs from 'fs'
import { parse, stringify } from 'subtitle'

fs.createReadStream('./source.srt')
  .pipe(parse())
  .pipe(stringify({ format: 'WebVTT' }))
  .pipe(fs.createWriteStream('./dest.vtt'))
```

### Extract subtitles from a video

The following example uses `rip-subtitles` to extract subtitles from an MKV video and save them
as WebVTT.

```ts
import fs from 'fs'
import extract from 'rip-subtitles'
import { parse, stringify } from 'subtitle'

extract('video.mkv')
  .pipe(parse())
  .pipe(stringify({ format: 'WebVTT' }))
  .pipe(fs.createWriteStream('./video.vtt'))
```

### Create subtitles

```ts
import { stringifySync } from 'subtitle'

const list = []

list.push({
  type: 'cue',
  data: {
    start: 1200,
    end: 1300,
    text: 'Something'
  }
})

stringifySync(list)
```

## License

MIT