-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docker optimised #1676
base: master
Are you sure you want to change the base?
docker optimised #1676
Conversation
WalkthroughThe pull request focuses on updating the Dockerfile for improved Python version management and installation process. The changes primarily involve updating the Python version from 3.10.13 to 3.10.16, introducing Changes
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (1)
docker/Dockerfile (1)
40-40
: Consider multi-stage build and volume mounting for modelsTraining models during image build increases both build time and image size. This can impact CI/CD pipeline performance and container deployment.
Consider:
- Using multi-stage builds to separate the training environment from the runtime environment
- Moving model training to a separate process
- Using volumes to mount pre-trained models
Example multi-stage approach:
# Training stage FROM amazonlinux:latest as builder # ... (previous setup steps) ... RUN python -m rasa train --data ${TEMPLATE_DIR_DEFAULT}/data \ --config ${TEMPLATE_DIR_DEFAULT}/config.yml \ --domain ${TEMPLATE_DIR_DEFAULT}/domain.yml \ --out ${TEMPLATE_DIR_DEFAULT}/models # Runtime stage FROM amazonlinux:latest # ... (copy only necessary files and models from builder) ... VOLUME ["${TEMPLATE_DIR_DEFAULT}/models"]
RUN yum install -y wget make gcc zlib-devel tar xz gzip openssl openssl-devel bzip2-devel sqlite-devel libffi-devel openblas-devel ca-certificates git | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Optimize package installation and reduce image size
The current setup uses multiple RUN instructions and doesn't clean up package manager cache, leading to larger image size.
-RUN yum update -y
-RUN yum install -y wget make gcc zlib-devel tar xz gzip openssl openssl-devel bzip2-devel sqlite-devel libffi-devel openblas-devel ca-certificates git
+RUN yum update -y && \
+ yum install -y \
+ wget \
+ make \
+ gcc \
+ zlib-devel \
+ tar \
+ xz \
+ gzip \
+ openssl \
+ openssl-devel \
+ bzip2-devel \
+ sqlite-devel \
+ libffi-devel \
+ openblas-devel \
+ ca-certificates \
+ git && \
+ yum clean all && \
+ rm -rf /var/cache/yum
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
RUN yum install -y wget make gcc zlib-devel tar xz gzip openssl openssl-devel bzip2-devel sqlite-devel libffi-devel openblas-devel ca-certificates git | |
RUN yum update -y && \ | |
yum install -y \ | |
wget \ | |
make \ | |
gcc \ | |
zlib-devel \ | |
tar \ | |
xz \ | |
gzip \ | |
openssl \ | |
openssl-devel \ | |
bzip2-devel \ | |
sqlite-devel \ | |
libffi-devel \ | |
openblas-devel \ | |
ca-certificates \ | |
git && \ | |
yum clean all && \ | |
rm -rf /var/cache/yum |
ENV KAIRON_HOME=/app PYTHON_VERSION=3.10.16 TEMPLATE_DIR_DEFAULT=./template/use-cases/Hi-Hello-GPT RASA_TELEMETRY_ENABLED=false | ||
ENV PYENV_ROOT=/root/.pyenv | ||
ENV PATH=$PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Optimize Docker layers by combining ENV instructions
Multiple ENV instructions create unnecessary layers. Consider combining them for better optimization.
-ENV KAIRON_HOME=/app PYTHON_VERSION=3.10.16 TEMPLATE_DIR_DEFAULT=./template/use-cases/Hi-Hello-GPT RASA_TELEMETRY_ENABLED=false
-ENV PYENV_ROOT=/root/.pyenv
-ENV PATH=$PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
+ENV KAIRON_HOME=/app \
+ PYTHON_VERSION=3.10.16 \
+ TEMPLATE_DIR_DEFAULT=./template/use-cases/Hi-Hello-GPT \
+ RASA_TELEMETRY_ENABLED=false \
+ PYENV_ROOT=/root/.pyenv \
+ PATH=$PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
ENV KAIRON_HOME=/app PYTHON_VERSION=3.10.16 TEMPLATE_DIR_DEFAULT=./template/use-cases/Hi-Hello-GPT RASA_TELEMETRY_ENABLED=false | |
ENV PYENV_ROOT=/root/.pyenv | |
ENV PATH=$PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH | |
ENV KAIRON_HOME=/app \ | |
PYTHON_VERSION=3.10.16 \ | |
TEMPLATE_DIR_DEFAULT=./template/use-cases/Hi-Hello-GPT \ | |
RASA_TELEMETRY_ENABLED=false \ | |
PYENV_ROOT=/root/.pyenv \ | |
PATH=$PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH |
RUN curl https://pyenv.run | bash | ||
RUN echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc && \ | ||
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc && \ | ||
echo 'eval "$(pyenv init -)"' >> ~/.zshrc | ||
|
||
RUN source ~/.zshrc && \ | ||
pyenv install ${PYTHON_VERSION} && \ | ||
pyenv global ${PYTHON_VERSION} && \ | ||
pyenv rehash && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Security and optimization concerns in pyenv installation
Several issues need attention:
- Executing scripts directly from curl poses security risks
- Using .zshrc in amazonlinux (which typically uses bash)
- Multiple RUN instructions creating unnecessary layers
Consider this safer and more optimized approach:
-RUN curl https://pyenv.run | bash
-RUN echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc && \
- echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc && \
- echo 'eval "$(pyenv init -)"' >> ~/.zshrc
-
-RUN source ~/.zshrc && \
- pyenv install ${PYTHON_VERSION} && \
- pyenv global ${PYTHON_VERSION} && \
- pyenv rehash && \
+RUN set -ex && \
+ curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer -o pyenv-installer && \
+ chmod +x pyenv-installer && \
+ ./pyenv-installer && \
+ rm pyenv-installer && \
+ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc && \
+ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc && \
+ echo 'eval "$(pyenv init -)"' >> ~/.bashrc && \
+ . ~/.bashrc && \
+ pyenv install ${PYTHON_VERSION} && \
+ pyenv global ${PYTHON_VERSION} && \
+ pyenv rehash && \
python -m pip install --upgrade pip && \
mkdir data_generator training_data testing_data models /home/cache .rasa
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
RUN curl https://pyenv.run | bash | |
RUN echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc && \ | |
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc && \ | |
echo 'eval "$(pyenv init -)"' >> ~/.zshrc | |
RUN source ~/.zshrc && \ | |
pyenv install ${PYTHON_VERSION} && \ | |
pyenv global ${PYTHON_VERSION} && \ | |
pyenv rehash && \ | |
RUN set -ex && \ | |
curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer -o pyenv-installer && \ | |
chmod +x pyenv-installer && \ | |
./pyenv-installer && \ | |
rm pyenv-installer && \ | |
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc && \ | |
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc && \ | |
echo 'eval "$(pyenv init -)"' >> ~/.bashrc && \ | |
. ~/.bashrc && \ | |
pyenv install ${PYTHON_VERSION} && \ | |
pyenv global ${PYTHON_VERSION} && \ | |
pyenv rehash && \ | |
python -m pip install --upgrade pip && \ | |
mkdir data_generator training_data testing_data models /home/cache .rasa |
RUN cp kairon/shared/rule_policy.py /root/.pyenv/versions/${PYTHON_VERSION}/lib/python3.10/site-packages/rasa/core/policies/rule_policy.py | ||
RUN cp kairon/shared/schemas/domain.yml /root/.pyenv/versions/${PYTHON_VERSION}/lib/python3.10/site-packages/rasa/shared/utils/schemas/domain.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid direct modification of installed packages
Copying files directly into the Python packages directory is risky and can break during version updates. This approach bypasses package management and can lead to maintenance issues.
Consider these alternatives:
- Create proper patches and apply them during package installation
- Fork and maintain your own version of the packages
- Use Python's site-packages directory for custom overrides:
-RUN cp kairon/shared/rule_policy.py /root/.pyenv/versions/${PYTHON_VERSION}/lib/python3.10/site-packages/rasa/core/policies/rule_policy.py
-RUN cp kairon/shared/schemas/domain.yml /root/.pyenv/versions/${PYTHON_VERSION}/lib/python3.10/site-packages/rasa/shared/utils/schemas/domain.yml
+# Create a custom package directory
+RUN mkdir -p /app/custom_overrides/rasa/{core/policies,shared/utils/schemas}
+
+# Copy your custom files
+COPY kairon/shared/rule_policy.py /app/custom_overrides/rasa/core/policies/
+COPY kairon/shared/schemas/domain.yml /app/custom_overrides/rasa/shared/utils/schemas/
+
+# Add the custom directory to Python path
+ENV PYTHONPATH=/app/custom_overrides:$PYTHONPATH
Committable suggestion skipped: line range outside the PR's diff.
Summary by CodeRabbit
New Features
PYENV_ROOT
for better Python version management.Improvements