Joint scikit-learn, scikit-image, dask sprint

scikit-learn and scikit-image are two of the major scientific Python toolboxes, enabling data-driven discoveries. The former provides simple yet efficient tools for data mining and data analysis, while the latter focuses on image processing algorithms. With the growing volume of data being processed and analysed, these two libraries face unprecedented scalability challenges.

One currently under-utilized avenue for addressing these scalability challenges is the Python library Dask, which provides flexible, parallelized counterparts to NumPy arrays and pandas DataFrames, the core numerical objects used in scientific Python. Our goal is thus to organize a sprint bringing together a small number of developers from scikit-learn, scikit-image, and Dask to experiment with and improve the three libraries.

Dates: May 28th to June 2nd

Location: Berkeley Institute for Data Science

Who

  • Nelle Varoquaux
  • Matt Rocklin
  • Stéfan van der Walt