The Filmweb Cinemas Data Scraper is a Node.js application designed to scrape cinema and movie screening data from the Filmweb website. The application uses Puppeteer and JSDOM for web scraping and Prisma for database interactions.
- Scrapes cinema data including name, city, latitude, longitude, and screenings URL.
- Scrapes movie data including title, year, duration, director, description, main cast, and genres.
- Saves the scraped data into a PostgreSQL database using Prisma ORM.
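As a rough illustration of the cinema fields listed above, the sketch below normalizes one scraped record and hands it to a Prisma-style client. It is not taken from the project: the function names, the `cinema` model, the `screeningsUrl` field and its use as a unique key, and the `https://www.filmweb.pl` base URL are all assumptions made for the example.

```javascript
// Hypothetical helpers: field names follow the feature list, not the real schema.
const BASE_URL = 'https://www.filmweb.pl'; // assumed base for relative links

// Turn loosely typed scraped values into a clean cinema record.
function normalizeCinema(raw) {
  const latitude = Number.parseFloat(raw.latitude);
  const longitude = Number.parseFloat(raw.longitude);
  if (!Number.isFinite(latitude) || Math.abs(latitude) > 90) {
    throw new RangeError(`Invalid latitude: ${raw.latitude}`);
  }
  if (!Number.isFinite(longitude) || Math.abs(longitude) > 180) {
    throw new RangeError(`Invalid longitude: ${raw.longitude}`);
  }
  return {
    name: String(raw.name).trim(),
    city: String(raw.city).trim(),
    latitude,
    longitude,
    // Resolve relative links against the assumed site root.
    screeningsUrl: new URL(raw.screeningsUrl, BASE_URL).href,
  };
}

// `prisma` is expected to be a PrismaClient instance whose schema defines a
// `cinema` model with a unique `screeningsUrl` column (both are assumptions).
function saveCinema(prisma, raw) {
  const cinema = normalizeCinema(raw);
  return prisma.cinema.upsert({
    where: { screeningsUrl: cinema.screeningsUrl },
    update: cinema,
    create: cinema,
  });
}
```

Using an upsert keyed on a stable field means re-running the scraper updates existing rows instead of inserting duplicates, provided the chosen key really is unique in the schema.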
Prerequisites:
- Node.js
- PostgreSQL
- Clone the repository:
      git clone https://github.com/Biplo12/filmweb-cinemas-data-scraper.git
      cd filmweb-cinemas-data-scraper
- Install dependencies:
      npm install
- Set up environment variables:
  - Create a .env.local or .env.production file in the root directory.
  - Add the following environment variables, setting NODE_ENV to either development or production:
        NODE_ENV=development
        POSTGRES_PRISMA_URL=your_postgres_connection_string
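A small sketch of how NODE_ENV and POSTGRES_PRISMA_URL might be validated at startup. The `loadConfig` name, the error messages, and the pairing of `development` with `.env.local` and `production` with `.env.production` are assumptions for illustration; the project may load its configuration differently.

```javascript
// Hypothetical startup check; not the project's actual configuration code.
const ENV_FILES = {
  development: '.env.local',     // assumed pairing
  production: '.env.production', // assumed pairing
};

// Validate the two documented variables and report which env file applies.
function loadConfig(env = process.env) {
  const nodeEnv = env.NODE_ENV;
  if (!Object.keys(ENV_FILES).includes(nodeEnv)) {
    throw new Error(`NODE_ENV must be "development" or "production", got: ${nodeEnv}`);
  }
  const databaseUrl = env.POSTGRES_PRISMA_URL;
  if (!databaseUrl) {
    throw new Error('POSTGRES_PRISMA_URL is not set');
  }
  return { nodeEnv, envFile: ENV_FILES[nodeEnv], databaseUrl };
}
```

Calling `loadConfig()` with no argument reads `process.env`; passing a plain object instead makes the check easy to exercise without touching the real environment.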
To start the application, run:
    npm run start
To start the application in watch mode (useful for development), run:
    npm run start:watch
To reset the database, run the script for your environment.
For development:
    npm run prisma-reset-dev
For production:
    npm run prisma-reset-prod
To push the Prisma schema to the database, run the script for your environment.
For development:
    npm run prisma-push-dev
For production:
    npm run prisma-push-prod
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes.
- Commit your changes (git commit -am 'Add new feature').
- Push to the branch (git push origin feature-branch).
- Create a new Pull Request.
This project is licensed under the MIT License.