Scrapy AWS Lambda Deployment Code

This project is a web scraping application that uses Scrapy to extract data from web pages and store it in an AWS S3 bucket. It includes a Scrapy spider for data extraction, an AWS Lambda handler (app.py) that runs the spider, a Dockerfile for containerization, and configuration settings in conf.py.

Prerequisites

Python 3.8
Scrapy
Docker

Project Structure

app.py: Lambda function code that runs the Scrapy spider.
spider.py: The Scrapy spider that extracts data from web pages.
Dockerfile: Builds the container image for the project.
conf.py: Configuration settings for the project.
requirements.txt: Python dependencies installed with pip.

Getting Started

Install Dependencies: Make sure you have Python 3.8 installed, then use pip to install the dependencies listed in requirements.txt.
Configure Settings: Edit conf.py to match the site you are scraping and your AWS S3 setup.
Run Locally: Test the Scrapy spider locally with the following command:

scrapy runspider spider.py

Dockerize the Application: Containerize the project by building a Docker image from the provided Dockerfile.
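
For the Configure Settings step, it may help to see what a configuration module of this kind could look like. The sketch below is not the project's actual configuration file: every name and value in it (S3_BUCKET, S3_FILE_NAME, START_URLS, ITEM_XPATH, FIELD_XPATHS, FIELD_NAMES, extract_fields) is a hypothetical placeholder chosen for illustration. It only assumes that the settings cover an S3 bucket and file name, field names, and XPath selectors.

```python
# Hypothetical configuration sketch; every name and value here is a placeholder.

# AWS S3 destination for the scraped data.
S3_BUCKET = "my-scrapy-output"
S3_FILE_NAME = "items.csv"

# Page(s) the spider starts from.
START_URLS = ["https://example.com/products"]

# One XPath that matches each record, plus one relative XPath per field.
ITEM_XPATH = "//div[@class='product']"
FIELD_XPATHS = {
    "title": ".//h2/text()",
    "price": ".//span[@class='price']/text()",
}

# Output column order, derived from the selectors so the two cannot drift apart.
FIELD_NAMES = list(FIELD_XPATHS)


def extract_fields(node):
    """Build one record from a Scrapy selector for a single matched item.

    `node.xpath(...).get()` returns the first match as a string, or None
    when nothing matches.
    """
    return {name: node.xpath(xpath).get() for name, xpath in FIELD_XPATHS.items()}
```

Inside a spider's parse callback, a helper like this could be used as `for node in response.xpath(ITEM_XPATH): yield extract_fields(node)`, so adapting the scraper to a new site would mostly mean editing the selectors.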

Usage

The app.py script is the AWS Lambda handler. When the function is invoked, it runs the Scrapy spider.
The spider.py script defines the Scrapy spider. Customize it to match the structure of the target website and the data you want to extract.
The Dockerfile builds a container image of the project, which can be deployed to container platforms.
conf.py contains the project's configuration settings, such as the S3 bucket and file names, field names, and XPath selectors.
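
To illustrate how a Lambda handler can drive the spider, here is a minimal sketch. It is not the project's app.py, which is not reproduced in this README. It assumes the spider file is named spider.py (as listed in the repository), that the spider yields items Scrapy can export to CSV, and that the handler itself uploads the result. SPIDER_FILE, S3_BUCKET, S3_KEY, and build_command are hypothetical names. The sketch launches the same `scrapy runspider` command used for local testing in a child process, then uploads the output with boto3.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Placeholders: in a real project these would likely come from the config module.
SPIDER_FILE = "spider.py"        # path relative to the working directory
S3_BUCKET = "my-scrapy-output"   # assumed bucket name
S3_KEY = "items.csv"             # assumed object key


def build_command(output_path):
    """Return the equivalent of `scrapy runspider spider.py -o <output_path>`."""
    return [
        sys.executable, "-m", "scrapy",
        "runspider", SPIDER_FILE,
        "-o", str(output_path),
    ]


def handler(event, context):
    """Lambda entry point: run the crawl, then upload the feed file to S3."""
    # Imported here so the module loads even where boto3 is not installed.
    import boto3

    # Temporary files normally land under /tmp, which is writable on Lambda.
    with tempfile.TemporaryDirectory() as tmp_dir:
        output_path = Path(tmp_dir) / "items.csv"

        # A fresh process per invocation gives each crawl its own Twisted reactor.
        subprocess.run(build_command(output_path), check=True)

        if not output_path.exists():
            return {"statusCode": 200, "body": "spider produced no items"}

        boto3.client("s3").upload_file(str(output_path), S3_BUCKET, S3_KEY)

    return {"statusCode": 200, "body": f"uploaded s3://{S3_BUCKET}/{S3_KEY}"}
```

Running the crawl in a child process is a deliberate choice in this sketch: a warm Lambda container reuses the Python process between invocations, and Twisted's reactor cannot be restarted once it has stopped, so starting Scrapy in-process a second time would fail.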

Contributing

If you'd like to contribute to this project, please follow these guidelines.

Repository: shahbaz9221/Scrapy-Lambda-Deployment
