Screenplay Parser

This is a screenplay parser that extracts dialogues between characters. Note that it extracts a dialogue only if the second character's line has a parenthetical. The scripts are crawled from http://www.imsdb.com/ .
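To make the parenthetical rule concrete, here is a minimal sketch of one way such a filter could look. It is not the code in html_list_parser.py. It assumes a plain-text script in which blank lines separate blocks, a character cue is a line in capitals, and a parenthetical is a line wrapped in parentheses right after the cue. The names `split_turns` and `exchanges_with_parenthetical` are hypothetical.

```python
import re

# Assumption: a character cue is a line written in capitals, e.g. "JOHN".
CUE = re.compile(r"^[A-Z][A-Z0-9 .'\-]*$")
# Assumption: a parenthetical is a whole line wrapped in parentheses.
PAREN = re.compile(r"^\(.*\)$")


def split_turns(script_text):
    """Split plain screenplay text into (speaker, parenthetical, speech) turns."""
    turns = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", script_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # A dialogue block needs a cue plus at least one more line;
        # single-line blocks such as scene headings are skipped.
        if len(lines) < 2 or not CUE.match(lines[0]):
            continue
        speaker, rest = lines[0], lines[1:]
        parenthetical = None
        if PAREN.match(rest[0]):
            parenthetical, rest = rest[0], rest[1:]
        turns.append((speaker, parenthetical, " ".join(rest)))
    return turns


def exchanges_with_parenthetical(turns):
    """Keep consecutive turns by different speakers where the reply has a parenthetical."""
    pairs = []
    for first, second in zip(turns, turns[1:]):
        if first[0] != second[0] and second[1] is not None:
            pairs.append((first, second))
    return pairs


sample = (
    "INT. KITCHEN - DAY\n\n"
    "JOHN\nWhere were you?\n\n"
    "MARY\n(nervously)\nOut.\n\n"
    "JOHN\nFine.\n"
)
for first, second in exchanges_with_parenthetical(split_turns(sample)):
    print(first[0], "->", second[0], second[1], second[2])
```

In this sample only the JOHN to MARY exchange is kept, because MARY's reply carries a parenthetical while JOHN's final line does not.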

Getting Started

Create a new environment
Clone the repository
Install the dependencies: pip install -r requirements.txt
Run scrapy: go to the brickset-scraper folder and run this in your terminal:
```
scrapy runspider scraper.py --output=data/names_links.json
```
This will generate data/names_links.json.
Run python json_parser.py data/names_links.json. This reads names_links.json and creates all_name_script.txt, which holds a movie name and a link to its script for each movie in the JSON file. Note that each script takes 1-2 seconds.
Run python html_list_parser.py. This reads all_name_script.txt and generates all_dialogues.txt, which holds all the relevant dialogues from the movie scripts.
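As an illustration of the JSON-to-text step above, the sketch below turns scraped items into one name and link per line. It is only a guess at the shape of the data, not the logic in json_parser.py: it assumes names_links.json is a JSON array of objects with string fields called `name` and `link` (hypothetical key names), uses a tab as the separator (also an assumption), and does not fetch anything from the web.

```python
import json


def items_to_lines(items, name_key="name", link_key="link"):
    """Turn scraped items into 'name<TAB>link' lines, skipping incomplete ones."""
    lines = []
    for item in items:
        # Assumption: both fields are plain strings; missing values become "".
        name = (item.get(name_key) or "").strip()
        link = (item.get(link_key) or "").strip()
        if name and link:
            lines.append(f"{name}\t{link}")
    return lines


def convert(json_path, txt_path):
    """Read the scraped JSON file and write one 'name<TAB>link' line per movie."""
    with open(json_path, encoding="utf-8") as src:
        items = json.load(src)
    lines = items_to_lines(items)
    with open(txt_path, "w", encoding="utf-8") as dst:
        dst.writelines(line + "\n" for line in lines)
    return len(lines)
```

Under these assumptions, `convert("data/names_links.json", "all_name_script.txt")` would return the number of lines written.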

Prerequisites

You need to have

Authors

Kamil Veli Toraman: kvtoraman

License

There is no license for now. You can use the code as you please. It uses a rule-based algorithm to parse movie scripts. If you know a better way, please let me know :)

Acknowledgments

This is the result of a 2-month internship at the Data Science Lab, KAIST.
