Solution to the ML Engineer challenge by Mattia Delleani #1

Open. Wants to merge 68 commits into base: main
bf8c660
modified wget
mdell-temp Aug 7, 2024
c08dedc
add Docker
mdell-temp Aug 7, 2024
75df9f1
add Docker
mdell-temp Aug 7, 2024
086044a
add requirements for venv
mdell-temp Aug 7, 2024
bab8b6f
add src/README.md
mdell-temp Aug 7, 2024
2c0b64a
replace data folder
mdell-temp Aug 7, 2024
d2ff1a4
replace README
mdell-temp Aug 7, 2024
9a9c1fb
add Docker
mdell-temp Aug 7, 2024
e4a1e94
assigment notebook
mdell-temp Aug 7, 2024
0997094
add .gitignore
mdell-temp Aug 7, 2024
a06bb62
mod src/README.md
mdell-temp Aug 7, 2024
4f06f4d
EDA: till distribution analysis completed
mdell-temp Aug 8, 2024
6de17a8
EDA completed
mdell-temp Aug 8, 2024
7400e66
EDA: del garbage
mdell-temp Aug 8, 2024
00d1574
add src/data to git
mdell-temp Aug 8, 2024
2a89a02
add dataloader.py
mdell-temp Aug 8, 2024
60b2680
add src/data/*
mdell-temp Aug 8, 2024
0b1f4f9
add utils folder
mdell-temp Aug 8, 2024
4c462c8
remove name, update repo stucture
mdell-temp Aug 8, 2024
e31609b
add signal file processing
mdell-temp Aug 8, 2024
1d8493e
add pantompkins
mdell-temp Aug 8, 2024
1e0a912
cleaned useless libraries
mdell-temp Aug 8, 2024
9c827e9
useful plot for data
mdell-temp Aug 8, 2024
7d6d97b
add folder for models and dependecies
mdell-temp Aug 8, 2024
cae981f
add eval folder: visualize for plotting results
mdell-temp Aug 8, 2024
cb26ea6
cleaned and renamed libraries and functions
mdell-temp Aug 8, 2024
f965c70
ml classification: completed
mdell-temp Aug 8, 2024
6cdb4c9
plot suptitle fixed
mdell-temp Aug 8, 2024
d9003f2
cleaned pipeline.py
mdell-temp Aug 8, 2024
d66da53
mv setup logging to legger.py
mdell-temp Aug 9, 2024
28d0065
add experiments path
mdell-temp Aug 9, 2024
ddf1316
add log resourses functions
mdell-temp Aug 9, 2024
ef6bb3c
log_resourses functions
mdell-temp Aug 9, 2024
98b75f4
fixed: evaluate functions
mdell-temp Aug 9, 2024
61b1ba3
fixed eval, predict funct. rm cmd for training
mdell-temp Aug 9, 2024
c4a163c
rm unused libraries
mdell-temp Aug 9, 2024
5a53bce
add train.py file
mdell-temp Aug 9, 2024
fee2d83
renamed: train.py -> main.py + evaluation
mdell-temp Aug 9, 2024
d287f69
add logger verbosity
mdell-temp Aug 9, 2024
9705b74
rm: printing
mdell-temp Aug 9, 2024
b81bb80
upload example files
mdell-temp Aug 9, 2024
f5a03e3
rm: examples file
mdell-temp Aug 9, 2024
a605a7f
experiments upload for verbosity
mdell-temp Aug 9, 2024
8fe397c
upd: gitignore
mdell-temp Aug 9, 2024
8750f61
ML classification: DONE
mdell-temp Aug 9, 2024
b506018
upd: README, add structure and getting started
mdell-temp Aug 9, 2024
d2230a2
upd: README identation
mdell-temp Aug 9, 2024
cdb14b4
upd: README get start3
mdell-temp Aug 9, 2024
5985090
add MIT license, following references
mdell-temp Aug 9, 2024
2d6260f
add MIT license, following references
mdell-temp Aug 9, 2024
624ff2b
mv to src
mdell-temp Aug 9, 2024
5528c24
upd README
mdell-temp Aug 9, 2024
e6d5205
add: conclusion and thougths
mdell-temp Aug 9, 2024
63df6a1
mv License
mdell-temp Aug 9, 2024
1ab0d16
mv License outsid
mdell-temp Aug 9, 2024
a431a19
Create LICENSE
mdell-temp Aug 9, 2024
324c966
final changes
mdell-temp Aug 9, 2024
bec7ca4
Merge branch 'main' of github.com:mdell-temp/Idoven-DS-Machine-Learning
mdell-temp Aug 9, 2024
ba36f44
git ignore
mdell-temp Aug 9, 2024
ab7f9e6
Merge pull request #1 from mdell-temp/dev
mdell-temp Aug 9, 2024
50bf104
rm: doubled license
mdell-temp Aug 9, 2024
cfc5d42
link license
mdell-temp Aug 9, 2024
a310cb1
link license url
mdell-temp Aug 9, 2024
e50d77b
link license url
mdell-temp Aug 9, 2024
7881379
adjust comments
mdell-temp Aug 10, 2024
54b5d36
adjust README
mdell-temp Aug 10, 2024
eb35aa5
final comments p2
mdell-temp Aug 11, 2024
c0f08e5
Merge pull request #2 from mdell-temp/dev
mdell-temp Aug 11, 2024
upd: README, add structure and getting started
mdell-temp committed Aug 9, 2024
commit b50601866df30aa2b0af1c0580b8b3b6aef23d2d
72 changes: 50 additions & 22 deletions src/README.md
@@ -18,32 +18,50 @@ Here's the structure of the repository:

```plaintext
.
├── data
│ ├── ptbxl
├── data # folder with ECG data
│ ├── download_data.sh
│ └── README.md
├── src
│ ├── assignment.ipynb
│ ├── data
│ │ ├── dataset.py
├── src # folder with proposed approach
│ │
│ ├── data # folder with dataloading and processing modules
│ │ ├── data_augmentation.py
│ │ ├── data_plots.p
│ │ ├── dataloader.py
│ │ └── data_augmentation.py
│ ├── utils
│ │ ├── utilities.py
│ │ └── data_augmentation.py
│ │ ├── dataset.py
│ │ └── signal_data_processing.py
│ │
│ ├── evaluation # folder with evaluation modules
│ │ └── visualize.py
│ │
│ ├── experiments # folder with experiments
│ │ ├── EDA/
│ │ ├── logs/
│ │ └── results/
│ │
│ ├── models # folder with ML models
│ │ ├── architectures.py
│ │ └── pipeline.py
│ │
│ ├── utils # folder with useful functions
│ │ ├── logger.py
│ │ └── utilities.py
│ │
│ ├── assignment.ipynb # Jupyter Notebook as asked
│ ├── main.py # file for cmd line training
│ ├── requirements.txt
│ └── README.md
├── Dockerfile
├── README.md
└── references
└── reference_documentation.pdf
├── Dockerfile # Consistent setup across different platforms.
└── README.md
```

<!-- ├── scripts
│ ├── preprocess_data.py
│ ├── train_model.py
│ └── evaluate_model.py -->
**Structure Rationale**

The structure favors simplicity and clarity and follows the assignment requirements. It mirrors the workflow from data processing through model evaluation, so the steps are easy to follow and reproduce. Code is split into distinct modules and experiments are kept in their own folder, which keeps the repository readable and maintainable and keeps the focus on answering the assignment questions.

Main component:
- `assignment.ipynb`: The Jupyter Notebook containing the answers to the assignment, structured into sections for EDA, ML classification, and a conclusion with references, providing a comprehensive response to the task.

```
## Getting Started

### Prerequisites
@@ -57,8 +75,8 @@ Ensure you have the following installed:
Clone the repository:

```bash
git clone https://github.com/yourusername/your-repo.git # clone the repo
cd your-repo # move inside the repo folder
git clone https://github.com/mdell-temp/Idoven-DS-Machine-Learning.git # clone the repo
cd Idoven-DS-Machine-Learning # move inside the repo folder
```

**Docker**
@@ -84,7 +102,17 @@ docker run -p 8888:8888 -v %cd%\src:/app/src ecg-classification # CPU
# or
docker run --gpus=all -p 8888:8888 -v %cd%\src:/app/src ecg-classification # GPU
```
This will start a Docker container and expose port 8888. You can access Jupyter Notebook by navigating to http://localhost:8888 in your web browser.
This will start a Docker container and expose port 8888. You can access the JupyterLab session by navigating to http://localhost:8888 in your web browser.


3. Navigate through the project:

- Open and run `assignment.ipynb` to see the proposed solution: it performs the EDA, trains the ML models, and evaluates the trained models.

- Run the pipeline from the command line (check the accepted arguments):
```bash
python main.py [OPTIONS]
```
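
The README does not list the options `main.py` accepts, so the following is only a minimal sketch of how such a command-line interface could be declared with Python's standard `argparse` module. Every flag name and default here (`--data-dir`, `--epochs`, `--verbose`, `data/ptbxl`) is a hypothetical placeholder, not the repository's actual interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative CLI parser.

    All options below are assumptions made for this sketch; the real
    main.py may expose different arguments.
    """
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Train and evaluate an ECG classifier (illustrative sketch).",
    )
    # Hypothetical: location of the downloaded ECG data.
    parser.add_argument("--data-dir", default="data/ptbxl",
                        help="folder containing the ECG records")
    # Hypothetical: number of training epochs.
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    # Hypothetical: switch for more detailed logging.
    parser.add_argument("--verbose", action="store_true",
                        help="enable verbose logging")
    return parser


# Parse an explicit argument list instead of sys.argv so the example
# behaves the same wherever it is run.
args = build_parser().parse_args(["--epochs", "3", "--verbose"])
print(vars(args))
```

If the real script is built on `argparse`, `python main.py --help` would print the options it actually accepts, since `argparse` generates that flag automatically.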