all app files

ScilifelabDataCentre · Nov 11, 2024 · 5556b39
1 parent 5467281

Showing 9 changed files with 239 additions and 2 deletions.
diff --git a/.github/workflows/docker-image-ghcr.yml b/.github/workflows/docker-image-ghcr.yml
@@ -0,0 +1,44 @@
+name: Create and publish a Docker image to ghcr.io
+
+on:
+  push:
+    branches:
+      - main
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}
+
+jobs:
+  build-and-push-image:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Log in to the Container registry
+        uses: docker/login-action@f054a8b539a109f9f41c372932f1ae047eff08c9
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata (tags, labels) for Docker
+        id: meta
+        uses: docker/metadata-action@v4
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+          tags: |
+            type=raw,value={{date 'YYYYMMDD-HHmmss' tz='Europe/Stockholm'}}
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@ad44023a93711e3deb337508980b4b5e9bcdc5dc
+        with:
+          context: .
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
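
Note on the tag format: the workflow above tags each image with a timestamp in `YYYYMMDD-HHmmss` form, evaluated in the `Europe/Stockholm` time zone. As a rough illustration only (the function name `image_tag` is hypothetical and not part of this commit), the same string can be reproduced in Python with the standard library, which can help when looking up which image corresponds to a given push:

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def image_tag(moment: datetime) -> str:
    """Format an aware datetime as YYYYMMDD-HHmmss in Stockholm local time."""
    local = moment.astimezone(ZoneInfo("Europe/Stockholm"))
    return local.strftime("%Y%m%d-%H%M%S")


# A push at 12:30:05 UTC on Nov 11, 2024 falls in CET (UTC+1).
example = datetime(2024, 11, 11, 12, 30, 5, tzinfo=ZoneInfo("UTC"))
print(image_tag(example))  # 20241111-133005
```

Because the tag has one-second resolution, two pushes to `main` would only collide if their tags were computed within the same second.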
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,36 @@
+FROM python:3.10.0-slim
+
+# Create user name and home directory variables. 
+# The variables are later used as $USER and $HOME. 
+ENV USER=username
+ENV HOME=/home/$USER
+
+# Add user to system
+RUN useradd -m -u 1000 $USER
+
+# Set working directory (this is where the code should go)
+WORKDIR $HOME/app
+
+# Update system and install dependencies.
+RUN apt-get update && apt-get install --no-install-recommends -y \
+    build-essential \
+    software-properties-common
+
+# Copy requirements.txt into the working directory ($HOME/app) and install the packages listed there with pip
+COPY app/requirements.txt $HOME/app/requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Test making a prediction using the model
+COPY app/test_predictions $HOME/app/test_predictions/
+RUN python test_predictions/DECIMER_test_prediction.py
+
+# Copy all files that the app needs
+COPY app/Main.py $HOME/app/Main.py
+COPY app/pages $HOME/app/pages/
+
+USER $USER
+EXPOSE 8501
+
+HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8501/_stcore/health', timeout=5)"
+
+ENTRYPOINT ["streamlit", "run", "Main.py", "--server.port=8501", "--server.address=0.0.0.0"]
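
Note on the health check: the `HEALTHCHECK` in the Dockerfile above amounts to requesting `/_stcore/health` on port 8501 and treating anything other than a successful response as a failure. A minimal Python sketch of that probe follows, assuming only the standard library. The name `is_healthy` and the injectable `opener` parameter are illustrative choices, not part of this commit.

```python
import urllib.request


def is_healthy(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True if `url` answers with a 2xx status, False on any network or HTTP error."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # urllib.error.URLError and HTTPError are both subclasses of OSError,
        # so refused connections, timeouts, and 4xx/5xx responses all land here.
        return False
```

Inside the container this could be called as `is_healthy("http://localhost:8501/_stcore/health")`, with the process exit code set from the result.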
diff --git a/README.md b/README.md
@@ -1,2 +1,25 @@
-# streamlit-image-to-smiles
-Web app that allows users to predict SMILES for chemical structures depicted in images files.
+# Predict SMILES encodings of chemical structure depictions in images
+
+This repository contains code for a web app that lets users either upload an image file or take a picture with their webcam and get a prediction of the chemical structure depicted in the image, in SMILES notation.
+
+This application was built with the [Streamlit](https://github.com/streamlit/streamlit) framework (Apache 2.0 license). It uses the [DECIMER Image Transformer](https://github.com/Kohulan/DECIMER-Image_Transformer) (MIT license) model to make predictions (as implemented in the [DECIMER Python package](https://pypi.org/project/decimer/)). In addition, the application lets users edit the predicted SMILES in the web-based
+molecule sketcher [Ketcher](https://github.com/epam/ketcher) (Apache 2.0 license).
+
+
+The live app can be found here: [image-to-smiles.serve.scilifelab.se](https://image-to-smiles.serve.scilifelab.se/).
+
+## Model behind the app
+
+The DECIMER Image Transformer model was developed by the [Cheminformatics and Computational Metabolomics research group](https://cheminf.uni-jena.de/) at Friedrich Schiller University Jena, Germany. You can find out more about the model in these publications:
+
+- Rajan, K., et al. "DECIMER.ai - An open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications." *Nat. Commun.* 14, 5045 (2023).
+- Rajan, K., et al. "DECIMER 1.0: deep learning for chemical image recognition using transformers." *J Cheminform* 13, 61 (2021).
+- Rajan, K., et al. "Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture." *J Cheminform* 16, 78 (2024).
+
+## Contributing
+
+We welcome suggestions and contributions. If you found a mistake or would like to make a suggestion, please create an issue in this repository. Those who wish are also welcome to submit pull requests.
+
+## Contact
+
+This web app was built by [SciLifeLab Data Centre](https://github.com/ScilifelabDataCentre) team members.
diff --git a/app/Main.py b/app/Main.py
@@ -0,0 +1,40 @@
+import streamlit as st
+
+st.title("Predict SMILES encodings of chemical structure depictions in images")
+
+intro = '''This application allows users to either [upload an image file](/SMILES_from_an_image_file) or [take a picture using their webcam](/SMILES_from_a_webcam_photo)
+and get a prediction of the chemical structure depicted in the image in SMILES notation.
+The chemical structure depiction can be machine-drawn or hand-drawn.
+
+This application uses the [DECIMER Image Transformer](https://github.com/Kohulan/DECIMER-Image_Transformer) (MIT license)
+model to make predictions (as implemented in the [DECIMER Python package](https://pypi.org/project/decimer/)). DECIMER (Deep lEarning for Chemical ImagE Recognition) addresses Optical Chemical Structure
+Recognition (OCSR) with the latest computational intelligence methods to provide an automated open-source software solution.
+
+In addition, the application lets users edit the predicted SMILES in the web-based
+molecule sketcher [Ketcher](https://github.com/epam/ketcher) (Apache 2.0 license).
+
+'''
+st.markdown(intro)
+
+st.subheader("Model behind the app", divider=None)
+st.markdown("The DECIMER Image Transformer model was developed by the [Cheminformatics and Computational Metabolomics research group](https://cheminf.uni-jena.de/) "
+         "at Friedrich Schiller University Jena, Germany. You can find out more about the model in these publications:")
+citations = '''
+1. Rajan, K., et al. "DECIMER.ai - An open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications." *Nat. Commun.* 14, 5045 (2023).
+2. Rajan, K., et al. "DECIMER 1.0: deep learning for chemical image recognition using transformers." *J Cheminform* 13, 61 (2021).
+3. Rajan, K., et al. "Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture." *J Cheminform* 16, 78 (2024).
+'''
+st.markdown(citations)
+
+st.subheader("App source code", divider=None)
+app_info = '''
+The source code of this web application 
+[can be found on GitHub](https://github.com/ScilifelabDataCentre/streamlit-image-to-smiles) with an 
+open source license so feel free to use it to build your own apps. The app was built using the
+ [Streamlit](https://github.com/streamlit/streamlit) framework (Apache 2.0 license).
+
+We welcome suggestions and contributions to this web application. If you found a mistake or would like to make 
+a suggestion, please create an issue in the app's GitHub repository. Those who wish are also welcome to submit 
+pull requests.
+'''
+st.markdown(app_info)
diff --git a/app/pages/01_SMILES_from_an_image_file.py b/app/pages/01_SMILES_from_an_image_file.py
@@ -0,0 +1,42 @@
+import streamlit as st
+from streamlit_ketcher import st_ketcher
+from DECIMER import predict_SMILES
+from PIL import Image
+
+st.header("Predict SMILES encodings of chemical structure depictions in an image file")
+
+st.markdown("Here you can upload an image of a chemical structure depiction (for example, a photo of a hand-drawn structure) "
+         "and get a SMILES prediction of the depicted structure from [DECIMER Image Transformer](https://github.com/Kohulan/DECIMER-Image_Transformer). "
+         "Allowed image formats: .jpg, .jpeg, .png. "
+         "Note that it may take a few minutes after you upload your file before you see the result. "
+         "You will then be able to see and edit the predicted structure in [Ketcher](https://github.com/epam/ketcher).")
+
+st.write("The image files are only held in memory (RAM) and are removed as soon as you close or reload the page.")
+
+st.subheader("Step 1. Upload a file", divider="gray")
+
+# Input widget for users to upload image files from their computer
+uploaded_file = st.file_uploader("Select an image", type=["jpg", "png", "jpeg"])
+
+if uploaded_file is not None:
+    # Display the uploaded image
+    image = Image.open(uploaded_file)
+    container = st.container(border=True)
+    container.image(image, caption="Uploaded image")
+
+    # Run SMILES prediction
+    SMILES = predict_SMILES(uploaded_file)
+
+    # Display the prediction
+    st.subheader("Step 2. See the prediction", divider="gray")
+    st.html(f"<h5>Predicted SMILES:</h5><p><span style='font-size: 1.25em; color: red;'>{SMILES}</span></p>")
+    #st.markdown(f"Predicted SMILES: ``{SMILES}``")
+
+    # Tool to edit the prediction
+    st.subheader("Step 3. Edit/fine-tune the prediction", divider="gray")
+    edited_SMILES = st_ketcher(SMILES)
+    st.html(f"<h5>SMILES from the Ketcher drawing:</h5><p><span style='font-size: 1.25em; color: green;'>{edited_SMILES}</span></p>")
+    #st.markdown(f"SMILES from the Ketcher drawing: ``{edited_SMILES}``")
+
+else:
+    st.write("Please upload an image to predict SMILES.")
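
Note on reruns: Streamlit re-executes a page script from top to bottom on each widget interaction, so the prediction step in the page above may run again whenever the user interacts with the page after uploading. One common mitigation is to key the result on the image content. The sketch below illustrates the idea in plain Python. The names `cached_prediction`, `_prediction_cache` and `slow_stand_in` are hypothetical, and the stand-in function replaces the real model call. In a Streamlit page a module-level dict like this would be re-created on every rerun, so the framework's own `st.cache_data` decorator or `st.session_state` would be the natural place to hold the result instead.

```python
import hashlib
from typing import Callable, Dict

# Hypothetical in-process cache: maps a SHA-256 digest of the image bytes
# to the SMILES string predicted for those bytes.
_prediction_cache: Dict[str, str] = {}


def cached_prediction(image_bytes: bytes, predict: Callable[[bytes], str]) -> str:
    """Return the cached prediction for these bytes, calling `predict` only on a miss."""
    key = hashlib.sha256(image_bytes).hexdigest()
    if key not in _prediction_cache:
        _prediction_cache[key] = predict(image_bytes)
    return _prediction_cache[key]


def slow_stand_in(image_bytes: bytes) -> str:
    """Stand-in for the real model call; always returns the SMILES for ethanol."""
    print("running the (pretend) model")
    return "CCO"


first = cached_prediction(b"same image", slow_stand_in)   # calls the stand-in
second = cached_prediction(b"same image", slow_stand_in)  # served from the cache
print(first == second)
```

Hashing the bytes rather than the file name means that re-uploading the same picture under a different name still hits the cache.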
diff --git a/app/pages/02_SMILES_from_a_webcam_photo.py b/app/pages/02_SMILES_from_a_webcam_photo.py
@@ -0,0 +1,42 @@
+import streamlit as st
+from streamlit_ketcher import st_ketcher
+from DECIMER import predict_SMILES
+from PIL import Image
+
+st.header("Predict SMILES encodings of chemical structure depictions in a webcam photo")
+
+st.write("Here you can take a picture of a chemical structure depiction using your webcam "
+         "and get a SMILES prediction of the depicted structure from [DECIMER Image Transformer](https://github.com/Kohulan/DECIMER-Image_Transformer). "
+         "You need to allow this page to access your webcam in your browser before you can take a picture. "
+         "Note that it may take a few minutes after you take the picture before you see the result. "
+         "You will then be able to see and edit the predicted structure in [Ketcher](https://github.com/epam/ketcher).")
+
+st.write("The pictures you take are only held in memory (RAM) and are removed as soon as you close or reload the page.")
+
+st.subheader("Step 1. Take a picture", divider="gray")
+
+# Input widget to take a photo with the user's webcam
+webcam_photo = st.camera_input("Take a picture")
+
+if webcam_photo is not None:
+    # Display the photo that was taken
+    image = Image.open(webcam_photo)
+    container = st.container(border=True)
+    container.image(image, caption="Webcam photo")
+
+    # Run SMILES prediction
+    SMILES = predict_SMILES(webcam_photo)
+
+    # Display the prediction
+    st.subheader("Step 2. See the prediction", divider="gray")
+    st.html(f"<h5>Predicted SMILES:</h5><p><span style='font-size: 1.25em; color: red;'>{SMILES}</span></p>")
+    #st.markdown(f"Predicted SMILES: ``{SMILES}``")
+
+    # Tool to edit the prediction
+    st.subheader("Step 3. Edit/fine-tune the prediction", divider="gray")
+    edited_SMILES = st_ketcher(SMILES)
+    st.html(f"<h5>SMILES from the Ketcher drawing:</h5><p><span style='font-size: 1.25em; color: green;'>{edited_SMILES}</span></p>")
+    #st.markdown(f"SMILES from the Ketcher drawing: ``{edited_SMILES}``")
+
+else:
+    st.write("Please take a photo to predict SMILES.")
diff --git a/app/requirements.txt b/app/requirements.txt
@@ -0,0 +1,5 @@
+streamlit==1.39.0
+streamlit_ketcher==0.0.1
+tensorflow==2.15.0
+decimer==2.7.1
+opencv-python-headless==4.10.0.84
diff --git a/app/test_predictions/DECIMER_test_prediction.py b/app/test_predictions/DECIMER_test_prediction.py
@@ -0,0 +1,5 @@
+from DECIMER import predict_SMILES
+
+image_path = "test_predictions/example_structure.png"
+SMILES = predict_SMILES(image_path)
+print(f"🎉 Decoded SMILES: {SMILES}")
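
Note on the build-time test: the script above prints whatever the model returns, so the Docker build step that runs it only fails if the call raises. If a stricter smoke test is wanted, the result could be checked before printing. A minimal sketch, in which the helper name `require_smiles` is hypothetical and the check is deliberately loose (a non-empty string, not chemical validity):

```python
def require_smiles(prediction: object) -> str:
    """Return the stripped prediction, or raise if it is not a non-empty string."""
    if not isinstance(prediction, str) or not prediction.strip():
        raise ValueError(f"unexpected prediction: {prediction!r}")
    return prediction.strip()


print(require_smiles(" CCO "))  # CCO
```

An uncaught exception makes the Python process exit with a non-zero status, which in turn would stop the `RUN` step in the Docker build.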
diff --git a/app/test_predictions/example_structure.png b/app/test_predictions/example_structure.png