Split Detection Model into New Repo #216

pastorep · 2024-11-22T06:03:51Z

Rather than maintaining the live system and the model training, validation, and evaluation code in the same repo, let's split them.

I'm considering a repo structure like the following:

Name: OrcaHelloDetection

Example File Structure: https://cookiecutter-data-science.drivendata.org/

pastorep · 2024-11-22T06:04:27Z

@BrunoGrandePhD, what do you think of this template?

bnestor · 2024-11-22T14:50:36Z

I am not familiar with the structure, but I support the split. We should probably be able to run the model for inference in a generic framework like PyTorch. With that flexibility, we could develop the model with any package or language and port it using ONNX. It would be much simpler for future hackathons if we did not have fastai dependencies. Another option is to run a FastAPI server on the infrastructure and make internal requests to it, for example 127.0.0.1:9000/query=<audio_packet_byte_encoded>&sample_rate=24000.
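As a rough illustration of the internal-request idea, the sketch below shows how a caller might pack an audio packet into a query string and how the service side might unpack it. It uses only the Python standard library rather than FastAPI itself; the `/query` path, the `audio` parameter name, and the base64 encoding are assumptions made for illustration, not part of the current system.

```python
import base64
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address of the local inference service (host and port from the example above).
BASE_URL = "http://127.0.0.1:9000/query"


def build_query_url(audio_packet: bytes, sample_rate: int) -> str:
    """Encode raw audio bytes and the sample rate into a GET URL."""
    # URL-safe base64 keeps arbitrary bytes valid inside a query string.
    encoded = base64.urlsafe_b64encode(audio_packet).decode("ascii")
    query = urlencode({"audio": encoded, "sample_rate": sample_rate})
    return f"{BASE_URL}?{query}"


def parse_query_url(url: str) -> tuple[bytes, int]:
    """Recover the audio bytes and sample rate on the service side."""
    params = parse_qs(urlsplit(url).query)
    audio_packet = base64.urlsafe_b64decode(params["audio"][0])
    sample_rate = int(params["sample_rate"][0])
    return audio_packet, sample_rate


if __name__ == "__main__":
    url = build_query_url(b"\x00\x01\x02\x03", sample_rate=24000)
    print(url)
    print(parse_query_url(url))
```

In practice, real audio packets would probably be sent in a POST body instead, since URLs have practical length limits.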

BrunoGrandePhD · 2024-11-22T16:25:19Z

I also think we should split off model training and evaluation into its own repo. I agree with @bnestor that we can adopt new practices in this new repo, which will facilitate iteration and model distribution (e.g. using ONNX and/or HuggingFace).

@pastorep: The structure you linked seems like a good place to start, and we can always adapt it as we go. I'm pretty sure I came across it while working on my PhD, when I was trying to structure my own analysis repo.

As a side note (since @bnestor brought it up), I'm personally interested in revisiting the model inference infrastructure at some point. It's something that I would like to explore since it aligns with my career goals. The current setup has limitations, notably that predictions below 0.5 are not stored. A FastAPI service is one approach that we can explore.
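To make the storage limitation concrete, here is a minimal sketch of one alternative: record every prediction with its confidence and apply the 0.5 cutoff only when reading. The `PredictionStore` class and its field names are hypothetical and do not reflect the existing inference code.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    timestamp: str      # time of the audio segment, e.g. an ISO 8601 string
    confidence: float   # model score in [0, 1]


@dataclass
class PredictionStore:
    """Hypothetical in-memory store that keeps every prediction."""
    records: list[Prediction] = field(default_factory=list)

    def add(self, timestamp: str, confidence: float) -> None:
        # No thresholding at write time, so low scores are kept too.
        self.records.append(Prediction(timestamp, confidence))

    def detections(self, threshold: float = 0.5) -> list[Prediction]:
        # The cutoff is applied at read time and can be changed later.
        return [p for p in self.records if p.confidence >= threshold]
```

Keeping the low-confidence scores would allow the threshold to be revisited later without re-running inference on old audio.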

BrunoGrandePhD mentioned this issue Nov 22, 2024

Upgrade model training to fastai v2 #215

Open
