Merge pull request #9 from joseguerr/minor-SW10-add-dockerfile
Minor sw10 add dockerfile
joseguerr authored Sep 23, 2024
2 parents 5b5f01b + 8cf97d7 commit 2173f54
Showing 5 changed files with 48 additions and 2 deletions.
26 changes: 26 additions & 0 deletions Dockerfile
@@ -0,0 +1,26 @@
FROM python:3.11-slim

EXPOSE 4040

RUN echo "Installing Packages" && \
    apt-get update && \
    apt-get install -y default-jre wget tar make && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN echo "Installing Spark" && \
    cd /opt && \
    wget -q https://archive.apache.org/dist/spark/spark-3.5.2/spark-3.5.2-bin-hadoop3.tgz && \
    tar -xf spark-3.5.2-bin-hadoop3.tgz && \
    rm -f spark-3.5.2-bin-hadoop3.tgz && \
    pip install -e /opt/spark-3.5.2-bin-hadoop3/python

ENV HADOOP_CONF_DIR=/opt/spark-3.5.2-bin-hadoop3/conf

COPY . .

RUN make clean && \
    make setup && \
    make build

CMD ["/bin/bash"]
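The Dockerfile above repeats the Spark version and build profile in the download URL, the archive name, the `pip install -e` path, and `HADOOP_CONF_DIR`. As a rough sketch only (the `spark_paths` helper and its dictionary keys are hypothetical and not part of this commit), the following shows how all of those strings follow from a single version value, which may help when checking that a future version bump touches every occurrence:

```python
# Sketch: derive the Spark-related strings used in the Dockerfile from one version.
# The function name and dictionary keys are illustrative assumptions.
SPARK_VERSION = "3.5.2"     # version used in the Dockerfile and pyproject.toml
HADOOP_PROFILE = "hadoop3"  # build profile that appears in the archive name


def spark_paths(version: str = SPARK_VERSION, profile: str = HADOOP_PROFILE) -> dict:
    """Return the archive URL and install paths for a given Spark version."""
    dist = f"spark-{version}-bin-{profile}"  # e.g. spark-3.5.2-bin-hadoop3
    home = f"/opt/{dist}"                    # the Dockerfile extracts under /opt
    return {
        "url": f"https://archive.apache.org/dist/spark/spark-{version}/{dist}.tgz",
        "home": home,
        "python": f"{home}/python",  # path passed to `pip install -e`
        "conf": f"{home}/conf",      # value of HADOOP_CONF_DIR
    }


if __name__ == "__main__":
    for name, value in spark_paths().items():
        print(f"{name}: {value}")
```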
8 changes: 8 additions & 0 deletions Makefile
@@ -24,6 +24,14 @@ build: # Build and package the application and its dependencies to be used throu
	cp spark_web_events_etl/main.py app_config.yaml spark_web_events_etl/tasks/*/dq_checks_*.yaml deps
	poetry run python -m zipfile -c deps/libs.zip libs/*

.PHONY: docker-build
docker-build: # Build the application in a docker container.
	docker build -t jg-data-engineer-test .

.PHONY: docker-run
docker-run: # Run the application in a docker container.
	docker run -it jg-data-engineer-test

.PHONY: run-local
run-local: # Run a task locally (example: make run-local task=standardise execution-date=2023-04-12).
	poetry run spark-submit \
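Note that `EXPOSE 4040` in the Dockerfile only documents the port; `docker run` does not publish it to the host unless `-p` (or `-P`) is passed, and the `docker-run` target passes neither. The sketch below is hypothetical (the `docker_run_command` helper is not part of this commit) and only assembles the command line; it assumes the goal is to reach the Spark UI on the same port number on the host:

```python
import shlex

IMAGE = "jg-data-engineer-test"  # tag used by the docker-build target
SPARK_UI_PORT = 4040             # port declared by EXPOSE in the Dockerfile


def docker_run_command(image: str = IMAGE, port: int = SPARK_UI_PORT,
                       publish: bool = True) -> list:
    """Build the argv for `docker run`, optionally publishing the Spark UI port."""
    cmd = ["docker", "run", "-it"]
    if publish:
        # -p HOST:CONTAINER maps the container port onto the host.
        cmd += ["-p", f"{port}:{port}"]
    cmd.append(image)
    return cmd


# Print the command instead of executing it, so the sketch has no side effects.
print(shlex.join(docker_run_command()))
```

With `publish=False` the helper reproduces the command used by the `docker-run` target as committed.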
2 changes: 1 addition & 1 deletion README.md
@@ -30,7 +30,7 @@ Configuration is defined in [app_config.yaml](app_config.yaml) and managed by th
## Pre-requisites
- OpenJDK@17
- [email protected]
- [email protected].0
- [email protected].2
- [email protected]

## Execution instructions
13 changes: 12 additions & 1 deletion poetry.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -11,6 +11,7 @@ pyspark = "3.5.2"
dynaconf = "3.2.1"
delta-spark = "3.2"
soda-core-spark-df = "^3.3.5"
venv-pack = "^0.2.0"

[tool.poetry.group.dev.dependencies]
pytest = "^8.2.2"
