🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly.
This curated list contains 390 awesome open-source projects with a total of 1.8M stars grouped into 28 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
🧙‍♂️ Discover other best-of lists or create your own.
📫 Subscribe to our newsletter for updates and trending projects.
- Data Serialization 16 projects
- Data Containers & Dataframes 30 projects
- Data Structures 15 projects
- Data Validation 15 projects
- Algorithms & Design Patterns 4 projects
- Date & Time Utilities 9 projects
- File & Path Utilities 10 projects
- Compatibility 7 projects
- Cryptography 7 projects
- Infrastructure & DevOps 20 projects
- Process Utilities 4 projects
- Asynchronous Programming 7 projects
- Configuration 9 projects
- CLI Development 19 projects
- Development Tools 1 project
- Data Caching 6 projects
- GUI Development 10 projects
- Computer & Machine Vision 2 projects
- Machine Learning & Data Engineering 1 project
- Text Data 12 projects
- Web Development 1 project
- Database Clients 64 projects
- Data Loading & Extraction 30 projects
- Data Pipelines & Streaming 43 projects
- File Formats 3 projects
- Code Inspection 4 projects
- General Utilities 15 projects
- Python Implementations 6 projects
- Others 21 projects
- 🥇🥈🥉 Combined project-quality score
- ⭐️ Star count from GitHub
- 🐣 New project (less than 6 months old)
- 💤 Inactive project (6 months no activity)
- 💀 Dead project (12 months no activity)
- 📈📉 Project is trending up or down
- ➕ Project was recently added
- ❗️ Warning (e.g. missing/risky license)
- 👨‍💻 Contributors count from GitHub
- 🔀 Fork count from GitHub
- 📋 Issue count from GitHub
- ⏱️ Last update timestamp on package manager
- 📥 Download count from package manager
- 📦 Number of dependent projects
- Pandas related project
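The activity markers in the legend above are plain age thresholds. As a rough sketch only (this is not the generator's actual code), they could be applied like this. The function name `activity_marker`, the 30-day approximation of a month, and the precedence of 💀 over 💤 over 🐣 are assumptions made for illustration.

```python
from datetime import date

# Assumption: a month is approximated as 30 days.
DAYS_PER_MONTH = 30


def activity_marker(created: date, last_activity: date, today: date) -> str:
    """Return the legend marker for a project, or "" if none applies.

    Thresholds come from the legend: dead after 12 months without
    activity, inactive after 6 months, new if younger than 6 months.
    The order of the checks is an assumption.
    """
    idle_months = (today - last_activity).days / DAYS_PER_MONTH
    age_months = (today - created).days / DAYS_PER_MONTH
    if idle_months >= 12:
        return "💀"
    if idle_months >= 6:
        return "💤"
    if age_months < 6:
        return "🐣"
    return ""


# Hypothetical project: created in 2020, last activity on 01.08.2023.
print(activity_marker(date(2020, 1, 1), date(2023, 8, 1), date(2024, 6, 6)))
```

With a reference date of 06.06.2024, a project whose last activity was about ten months earlier falls into the inactive bucket.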
protobuf (🥇52 · ⭐ 64K · 📉) - Protocol Buffers - Google's data interchange format. BSD-3
-
GitHub (👨‍💻 1.2K · 🔀 15K · 📥 44M · 📦 650K · 📋 6.2K - 6% open · ⏱️ 06.06.2024):
git clone https://github.com/protocolbuffers/protobuf
-
PyPi (📥 190M / month · 📦 6.8K · ⏱️ 23.05.2024):
pip install protobuf
-
Conda (📥 18M · ⏱️ 06.03.2024):
conda install -c conda-forge protobuf
-
npm (📥 7.6M / month · 📦 2.9K · ⏱️ 10.10.2022):
npm install google-protobuf
flatbuffers (🥇43 · ⭐ 22K) - FlatBuffers: Memory Efficient Serialization Library. Apache-2
-
GitHub (👨‍💻 680 · 🔀 3.2K · 📥 460K · 📦 110K · 📋 2.4K - 6% open · ⏱️ 03.06.2024):
git clone https://github.com/google/flatbuffers
-
PyPi (📥 19M / month · 📦 410 · ⏱️ 26.03.2024):
pip install flatbuffers
-
Conda (📥 1.1M · ⏱️ 26.03.2024):
conda install -c conda-forge flatbuffers
-
npm (📥 1.4M / month · 📦 230 · ⏱️ 26.03.2024):
npm install flatbuffers
marshmallow (🥈40 · ⭐ 6.9K) - A lightweight library for converting complex objects to and from.. MIT
orjson (🥈38 · ⭐ 5.7K) - Fast, correct Python JSON library supporting dataclasses, datetimes,.. Apache-2
jsonpickle (🥈36 · ⭐ 1.2K) - Python library for serializing any arbitrary object graph into.. BSD-3
msgpack (🥈35 · ⭐ 1.9K) - MessagePack serializer implementation for Python msgpack.org[Python]. Apache-2
ultrajson (🥉34 · ⭐ 4.3K) - Ultra fast JSON decoder and encoder written in C with Python bindings. BSD-3
simplejson (🥉34 · ⭐ 1.6K) - simplejson is a simple, fast, extensible JSON encoder/decoder for.. MIT
cloudpickle (🥉32 · ⭐ 1.6K) - Extended pickling support for Python objects. BSD-3
python-rapidjson (🥉29 · ⭐ 490) - Python wrapper around rapidjson. MIT
pysimdjson (🥉26 · ⭐ 630) - Python bindings for the simdjson project. MIT
General-purpose data containers as well as utilities & extensions for pandas.
polars (🥇44 · ⭐ 27K · 📈) - Dataframes powered by a multithreaded, vectorized query engine, written.. MIT
zarr (🥈36 · ⭐ 1.4K) - An implementation of chunked, compressed, N-dimensional arrays for Python. MIT
Bottleneck (🥈33 · ⭐ 1K) - Fast NumPy array functions written in C. BSD-2
TinyDB (🥈32 · ⭐ 6.6K · 💤) - TinyDB is a lightweight document oriented database optimized for your.. MIT
datasketch (🥉31 · ⭐ 2.4K) - MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog,.. MIT
Pandaral·lel (🥉27 · ⭐ 3.6K) - A simple and efficient tool to parallelize Pandas.. BSD-3
jupyter
StaticFrame (🥉27 · ⭐ 410) - Immutable and statically-typeable DataFrames with runtime type and.. MIT
Pandas Summary (🥉24 · ⭐ 490) - Engine for ML/Data tracking, visualization,.. Apache-2
Show 10 hidden projects...
- numpy (🥇51 · ⭐ 27K) - The fundamental package for scientific computing with Python.
❗Unlicensed
- Blaze (🥉31 · ⭐ 3.2K · 💀) - NumPy and Pandas interface to Big Data.
BSD-3
- Arctic (🥉29 · ⭐ 3K) - Arctic is a high performance datastore for numeric data.
❗️LGPL-2.1
- sklearn-pandas (🥉28 · ⭐ 2.8K · 💀) - Pandas integration with sklearn.
❗️Zlib
sklearn
- pandasql (🥉28 · ⭐ 1.3K · 💀) - sqldf for pandas.
MIT
- bcolz (🥉26 · ⭐ 960 · 💀) - A columnar data container that can be compressed.
BSD-3
- pickleDB (🥉22 · ⭐ 880 · 💀) - pickleDB is an open source key-value store using Python's json module.
BSD-3
- fletcher (🥉19 · ⭐ 230 · 💀) - Pandas ExtensionDType/Array backed by Apache Arrow.
MIT
- Bounter (🥉18 · ⭐ 940 · 💀) - Efficient Counter that uses a limited (bounded) amount of memory..
MIT
- PandaPy (🥉13 · ⭐ 550 · 💀) - PandaPy has the speed of NumPy and the usability of Pandas 10x to..
MIT
pyrsistent (🥇35 · ⭐ 2K · 💤) - Persistent/Immutable/Functional data structures for Python. MIT
python-sortedcontainers (🥇32 · ⭐ 3.3K) - Python Sorted Container Types: Sorted List, Sorted.. Apache-2
python-benedict (🥈29 · ⭐ 1.4K) - dict subclass with keylist/keypath support, built-in I/O.. MIT
immutables (🥉27 · ⭐ 1.1K · 💤) - A high-performance immutable mapping type for Python. Apache-2
munch (🥉27 · ⭐ 760 · 💤) - A Munch is a Python dictionary that provides attribute-style access (a.. MIT
python-box (🥉25 · ⭐ 2.4K · 💤) - Python dictionaries with advanced dot notation access. MIT
Show 4 hidden projects...
- addict (🥈29 · ⭐ 2.4K · 💀) - The Python Dict that's better than heroin.
MIT
- sqlitedict (🥈29 · ⭐ 1.1K · 💀) - Persistent dict, backed by sqlite3 and pickle, multithread-..
Apache-2
- ordered-set (🥉28 · ⭐ 210 · 💀) - A mutable set that remembers the order of its entries. One of..
MIT
- cleverdict (🥉15 · ⭐ 99 · 💀) - A JSON-friendly data structure which allows both object attributes..
MIT
jsonschema (🥇41 · ⭐ 4.5K · 📈) - An implementation of the JSON Schema specification for Python. MIT
validators (🥈35 · ⭐ 920) - Python Data Validation for Humans. MIT
voluptuous (🥈32 · ⭐ 1.8K) - CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data.. BSD-3
python-email-validator (🥉30 · ⭐ 1K) - A robust email syntax and deliverability validation.. Unlicense
dirty-equals (🥉21 · ⭐ 780 · 💤) - Doing dirty (but extremely useful) things with equals. MIT
Show 5 hidden projects...
- schematics (🥉30 · ⭐ 2.6K · 💀) - Python Data Structures for Humans.
BSD-3
- strictyaml (🥉27 · ⭐ 1.4K · 💀) - Type-safe YAML parser and validator.
MIT
- valideer (🥉19 · ⭐ 260 · 💀) - Lightweight data validation and adaptation Python library.
MIT
- typical (🥉19 · ⭐ 180 · 💀) - Typical: Fast, simple, & correct data-validation using Python 3 typing.
MIT
- dataklasses (🥉7 · ⭐ 780 · 💀) - A different spin on dataclasses.
❗Unlicensed
🔗 python-patterns ( ⭐ 40K) - Collection of design patterns/idioms in Python.
transitions (🥇34 · ⭐ 5.4K) - A lightweight, object-oriented finite state machine implementation.. MIT
algorithms (🥉29 · ⭐ 24K) - Minimal examples of data structures and algorithms in Python. MIT
python-dateutil (🥈35 · ⭐ 2.3K) - Useful extensions to the standard Python datetime features. Apache-2
dateparser (🥈34 · ⭐ 2.5K) - python parser for human readable dates. BSD-3
Show 2 hidden projects...
- parsedatetime (🥉29 · ⭐ 690 · 💀) - Parse human-readable date/time strings.
Apache-2
- isodate (🥉29 · ⭐ 140 · 💀) - ISO 8601 date/time parser.
BSD-3
filesystem_spec (🥇40 · ⭐ 920) - A specification that python filesystems should adhere to. BSD-3
scandir (🥉28 · ⭐ 530 · 💤) - Better directory iterator and faster os.walk(), now in the Python.. BSD-3
Show 4 hidden projects...
- zipp (🥈36 · ⭐ 52 · 📈) - Backport of pathlib-compatible object wrapper for zip files.
MIT
- appdirs (🥉31 · ⭐ 1K · 💀) - A small Python module for determining appropriate platform-specific..
MIT
- pyfilesystem2 (🥉30 · ⭐ 2K · 💀) - Python's Filesystem abstraction layer.
MIT
- Unipath (🥉22 · ⭐ 520 · 💀) - An object-oriented approach to Python file/directory operations.
MIT
typing (🥈34 · ⭐ 1.6K) - Python static typing home. Hosts the documentation and a user help.. Python-2.0
Show 4 hidden projects...
- contextlib2 (🥉28 · ⭐ 38) - contextlib2 is a backport of the standard library's contextlib..
❗️psfrag
- dataclasses (🥉27 · ⭐ 580 · 💀) - A backport of the dataclasses module for Python 3.6.
Apache-2
- futures (🥉27 · ⭐ 230 · 💀) - Backport of the concurrent.futures package to Python 2.6 and 2.7.
Python-2.0
- pathlib2 (🥉27 · ⭐ 81 · 💤) - Backport of pathlib aiming to support the full stdlib Python API.
MIT
cryptography (🥇47 · ⭐ 6.4K) - cryptography is a package designed to expose cryptographic.. Apache-2
pycryptodomex (🥈39 · ⭐ 2.7K) - A self-contained cryptographic library for Python. BSD-3
asn1crypto (🥉33 · ⭐ 320 · 💤) - Python ASN.1 library with a focus on performance and a pythonic API. MIT
ansible (🥇48 · ⭐ 62K) - Ansible is a radically simple IT automation platform that makes your.. ❗️GPL-3.0
docker-compose (🥈40 · ⭐ 33K) - Define and run multi-container applications with Docker. Apache-2
paramiko (🥈40 · ⭐ 8.9K · 📉) - The leading native Python SSHv2 protocol library. ❗️LGPL-2.1
kubernetes (🥈39 · ⭐ 6.5K) - Official Python client library for kubernetes. Apache-2
Show 6 hidden projects...
- sshtunnel (🥉31 · ⭐ 1.2K · 💀) - SSH tunnels to remote server.
MIT
- parallel-ssh (🥉26 · ⭐ 1.2K · 💀) - Asynchronous parallel SSH client library.
❗️LGPL-2.1
- storm (🥉24 · ⭐ 3.9K · 💀) - Manage your SSH like a boss.
MIT
- fabtools (🥉24 · ⭐ 1.2K · 💀) - Tools for writing awesome Fabric files.
BSD-2
- wssh (🥉17 · ⭐ 1.4K · 💀) - SSH to WebSockets Bridge.
MIT
- Grai (🥉14 · ⭐ 280) - Platform to programmatically manage, test, and debug data..
❗️MulanPSL-2.0
pexpect (🥇38 · ⭐ 2.5K · 💤) - A Python module for controlling interactive programs in a pseudo-.. ISC
supervisor (🥈36 · ⭐ 8.3K) - Supervisor process control system for Unix.. ❗️Repoze Public License
ptyprocess (🥉24 · ⭐ 210 · 💤) - Run a subprocess in a pseudo terminal. ISC
anyio (🥇37 · ⭐ 1.6K) - High level asynchronous concurrency and networking framework that works on.. MIT
python-dotenv (🥇38 · ⭐ 7.2K) - Reads key-value pairs from a .env file and can set them as.. BSD-3
python-decouple (🥉32 · ⭐ 2.7K) - Strict separation of config from code. MIT
omegaconf (🥉31 · ⭐ 1.8K) - Flexible Python configuration system. The last one you will ever need. BSD-3
gin-config (🥉29 · ⭐ 2K) - Gin provides a lightweight configuration framework for Python. Apache-2
Show 1 hidden projects...
rich (🥇43 · ⭐ 48K) - Rich is a Python library for rich text and beautiful formatting in the terminal. MIT
python-fire (🥈39 · ⭐ 26K) - Python Fire is a library for automatically generating command.. Apache-2
python-prompt-toolkit (🥈39 · ⭐ 9K) - Library for building powerful interactive command line.. BSD-3
argcomplete (🥈35 · ⭐ 1.4K) - Python and tab completion, better together. Apache-2
wcwidth (🥉33 · ⭐ 380) - Python library that measures the width of unicode strings rendered to a.. MIT
questionary (🥉30 · ⭐ 1.4K) - Python library to build pretty command line user prompts Easy to use.. MIT
asciimatics (🥉29 · ⭐ 3.6K) - A cross platform package to do curses-like operations, plus.. Apache-2
ConfigArgParse (🥉28 · ⭐ 700 · 💤) - A drop-in replacement for argparse that allows options to.. MIT
docopt-ng (🥉23 · ⭐ 180) - Humane command line arguments parser. Now with maintenance, typehints,.. MIT
Show 5 hidden projects...
- docopt (🥈36 · ⭐ 7.9K · 💀) - Create beautiful command-line interfaces with Python.
MIT
- blessings (🥉28 · ⭐ 1.4K · 💀) - A thin, practical wrapper around terminal capabilities in Python.
MIT
- clint (🥉24 · ⭐ 95 · 💀) - Python Command-line Application Tools.
ISC
- bashplotlib (🥉22 · ⭐ 1.8K · 💀) - plotting in the terminal.
MIT
- Click Extra (🥉22 · ⭐ 54) - Extra colorization and configuration loading for Click.
❗️GPL-2.0
🔗 best-of-python-dev ( ⭐ 930) - A ranked list of awesome python developer tools and libraries. Updated..
cachetools (🥇34 · ⭐ 2.2K) - Extensible memoizing collections and decorators. MIT
pylibmc (🥉27 · ⭐ 480 · 💤) - A Python wrapper around the libmemcached interface from TangentOrg. BSD-3
Show 1 hidden projects...
- cached-property (🥈30 · ⭐ 680 · 💀) - A decorator for caching properties in classes.
BSD-3
🔗 best-of-web-python - Web UI ( ⭐ 2.2K) - Collection of libraries to implement web-based UIs.
kivy (🥇41 · ⭐ 17K) - Open source UI framework written in Python, running on Windows, Linux, macOS,.. MIT
DearPyGui (🥈32 · ⭐ 12K) - Dear PyGui: A fast and powerful Graphical User Interface Toolkit for.. MIT
Show 5 hidden projects...
- PySimpleGUI (🥈35 · ⭐ 13K) - PySimpleGUI is a Python package that enables Python..
❗Unlicensed
- Eel (🥉31 · ⭐ 6.2K · 💀) - A little Python library for making simple Electron-like HTML/JS GUI apps.
MIT
- Gooey (🥉30 · ⭐ 20K · 💀) - Turn (almost) any Python command line program into a full GUI..
MIT
- enaml (🥉25 · ⭐ 1.5K) - Declarative User Interfaces for Python.
❗Unlicensed
- Phoenix (🥉24 · ⭐ 2.2K) - wxPython's Project Phoenix. A new implementation of wxPython,..
❗️wxWindows
🔗 best-of-ml-python - Computer Vision ( ⭐ 16K) - Collection of computer vision and image processing..
🔗 best-of-ml-python ( ⭐ 16K) - A ranked list of awesome machine learning Python libraries. Updated..
🔗 best-of-ml-python - NLP ( ⭐ 16K) - Collection of text processing and NLP libraries.
phonenumbers (🥇34 · ⭐ 3.4K) - Python port of Google's libphonenumber. Apache-2
inflect (🥇34 · ⭐ 930) - Correctly generate plurals, ordinals, indefinite articles; convert numbers.. MIT
python-slugify (🥈33 · ⭐ 1.5K) - Returns unicode slugs. MIT
chardet (🥈31 · ⭐ 2.1K · 💤) - Python character encoding detector. ❗️LGPL-2.1
-
GitHub (👨‍💻 48 · 🔀 250 · 📦 6 · 📋 150 - 42% open · ⏱️ 01.08.2023):
git clone https://github.com/chardet/chardet
-
PyPi (📥 68M / month · 📦 5.4K · ⏱️ 01.08.2023):
pip install chardet
-
Conda (📥 23M · ⏱️ 23.09.2023):
conda install -c conda-forge chardet
-
npm (📥 58 / month · 📦 5 · ⏱️ 20.08.2017):
npm install @pypi/chardet
pyahocorasick (🥉29 · ⭐ 900) - Python module (C extension and plain python) implementing Aho-.. BSD-3
price-parser (🥉21 · ⭐ 300 · 💤) - Extract price amount and currency symbol from a raw text.. BSD-3
Show 4 hidden projects...
🔗 best-of-web-python ( ⭐ 2.2K) - A ranked list of awesome python libraries for web development. Updated..
Libraries for connecting to, operating, and querying databases.
SQLAlchemy (🥇46 · ⭐ 9K) - The Database Toolkit for Python. MIT
azure-storage-blob (🥇43 · ⭐ 4.3K) - This repository is for active development of the Azure SDK.. MIT
google-cloud-storage (🥇42 · ⭐ 4.7K) - Google Cloud Client Library for Python. Apache-2
elasticsearch (🥇42 · ⭐ 4.2K) - Official Python client for Elasticsearch. Apache-2
python-bigquery (🥈39 · ⭐ 720) - Google BigQuery API client library. Apache-2
MongoEngine (🥈38 · ⭐ 4.2K) - A Python Object-Document-Mapper for working with MongoDB. MIT
AWS Data Wrangler (🥈38 · ⭐ 3.8K) - pandas on AWS - Easy integration with Athena, Glue,.. Apache-2
sqlmodel (🥈37 · ⭐ 13K · 📈) - SQL databases in Python, designed for simplicity, compatibility,.. MIT
pydantic
kafka-python (🥈37 · ⭐ 5.5K) - Python client for Apache Kafka. Apache-2
Elasticsearch DSL (🥈37 · ⭐ 3.8K) - High level Python client for Elasticsearch. Apache-2
SQLAlchemy-Utils (🥈36 · ⭐ 1.2K) - Various utility functions and datatypes for SQLAlchemy. BSD-3
tortoise-orm (🥈35 · ⭐ 4.3K) - Familiar asyncio ORM for python, built with relations in mind. Apache-2
s3transfer (🥈35 · ⭐ 200) - Amazon S3 Transfer Manager for Python. Apache-2
Prometheus Client (🥈34 · ⭐ 3.8K) - Prometheus instrumentation library for Python.. Apache-2
mysqlclient (🥈34 · ⭐ 2.4K) - MySQL database connector for Python (with Python 3 support). ❗️GPL-2.0
Cassandra Driver (🥈34 · ⭐ 1.4K) - DataStax Python Driver for Apache Cassandra. Apache-2
PyPika (🥉33 · ⭐ 2.4K) - PyPika is a python SQL query builder that exposes the full richness.. Apache-2
neo4j-driver (🥉33 · ⭐ 870) - Neo4j Bolt driver for Python. Apache-2
pandas-gbq (🥉33 · ⭐ 420) - Google BigQuery connector for pandas. BSD-3
libcloud (🥉32 · ⭐ 2K) - Apache Libcloud is a Python library which hides differences between.. Apache-2
cx-Oracle (🥉31 · ⭐ 880) - Python interface to Oracle Database now superseded by python-oracledb. BSD-3
confluent-kafka-python (🥉29 · ⭐ 3.6K) - Confluent's Kafka Python Client. Apache-2
ODMantic (🥉26 · ⭐ 1K) - Sync and Async ODM (Object Document Mapper) for MongoDB based on python.. ISC
aioprometheus (🥉21 · ⭐ 170) - A Prometheus Python client library for asyncio-based applications. MIT
psycopg3 (🥉19 · ⭐ 1.5K) - New generation PostgreSQL database adapter for the Python.. ❗️LGPL-3.0
-
GitHub (👨‍💻 56 · 🔀 150 · 📋 460 - 7% open · ⏱️ 04.06.2024):
git clone https://github.com/psycopg/psycopg
Show 17 hidden projects...
- psycopg2 (🥈38 · ⭐ 3.2K) - PostgreSQL database adapter for the Python..
❗️BSD-3-Clause-Attribution
- pyodbc (🥈35 · ⭐ 2.9K) - Python ODBC bridge.
❗️MIT-0
- google-cloud-bigtable (🥉31 · ⭐ 63) - Google Cloud Bigtable API client library.
Apache-2
- gino (🥉29 · ⭐ 2.7K · 💀) - GINO Is Not ORM - a Python asyncio ORM on SQLAlchemy core.
BSD-3
- redis-py-cluster (🥉29 · ⭐ 1.1K · 💀) - Python cluster client for the official redis cluster...
MIT
- umongo (🥉28 · ⭐ 440 · 💀) - sync/async MongoDB ODM, yes.
MIT
- cloudant (🥉28 · ⭐ 160 · 💀) - A Python library for Cloudant and CouchDB.
Apache-2
- mongo-connector (🥉27 · ⭐ 1.9K · 💀) - MongoDB data stream pipeline tools by YouGov (adopted..
Apache-2
- pyhdb (🥉24 · ⭐ 320 · 💀) - SAP HANA Connector in pure Python.
Apache-2
- PyMODM (🥉21 · ⭐ 350 · 💀) - A Pythonic, object-oriented interface for working with MongoDB.
Apache-2
- gsheets-db-api (🥉21 · ⭐ 210 · 💀) - A Python DB-API and SQLAlchemy dialect to Google Spreadsheets.
MIT
- py2neo (🥉21 · ⭐ 14 · 💤) - EOL! Py2neo is a comprehensive Neo4j driver library and toolkit for..
Apache-2
- PugSQL (🥉20 · ⭐ 670 · 💀) - A HugSQL-inspired database library for Python.
Apache-2
- db.py (🥉19 · ⭐ 1.2K · 💀) - db.py is an easier way to interact with your databases.
BSD-2
- Queries (🥉19 · ⭐ 260 · 💀) - PostgreSQL database access simplified.
BSD-3
- SuperSQLite (🥉15 · ⭐ 720 · 💀) - A supercharged SQLite library for Python.
MIT
- lazydata (🥉15 · ⭐ 630 · 💀) - Lazydata: Scalable data dependencies for Python projects.
Apache-2
Libraries for loading, collecting, and extracting data from a variety of data sources and formats.
Datasets (🥇43 · ⭐ 19K) - The largest hub of ready-to-use datasets for ML models with fast,.. Apache-2
xmltodict (🥈35 · ⭐ 5.4K) - Python module that makes working with XML feel like you are working.. MIT
python-magic (🥈35 · ⭐ 2.6K) - A python wrapper for libmagic. MIT
smart-open (🥈34 · ⭐ 3.1K) - Utils for streaming large files (S3, HDFS, gzip, bz2...). MIT
csvkit (🥈33 · ⭐ 5.9K) - A suite of utilities for converting to and working with CSV, the king of.. MIT
pandas-datareader (🥈32 · ⭐ 2.8K · 💤) - Extract data from a wide range of Internet sources.. BSD-3
Intake (🥈32 · ⭐ 990) - Intake is a lightweight package for finding, investigating, loading and.. BSD-2
snorkel (🥉31 · ⭐ 5.7K) - A system for quickly generating training data with weak supervision. Apache-2
img2dataset (🥉27 · ⭐ 3.4K) - Easily turn large sets of image urls to an image dataset. Can.. MIT
rows (🥉23 · ⭐ 860) - A common, beautiful interface to tabular data, no matter the format. ❗️LGPL-3.0
Upgini (🥉21 · ⭐ 300) - Data search & enrichment library for Machine Learning Easily find and add.. BSD-3
Squirrel (🥉17 · ⭐ 280) - A Python library that enables ML teams to share, load, and transform.. Apache-2
Show 10 hidden projects...
- xlrd (🥈33 · ⭐ 2.1K · 💀) - Please use openpyxl where you can...
BSD-3
- SDV (🥉31 · ⭐ 2.2K) - Synthetic data generation for tabular data.
❗️SSPL-1.0
- PDFMiner (🥉27 · ⭐ 5.2K · 💀) - Python PDF Parser (Not actively maintained). Check out pdfminer.six.
MIT
- tabulator-py (🥉27 · ⭐ 240 · 💀) - Python library for reading and writing tabular data via streams.
MIT
- Singer (🥉26 · ⭐ 1.2K · 💀) - Standard for moving data between databases, web APIs, files,..
❗️AGPL-3.0
- messytables (🥉24 · ⭐ 390 · 💀) - Tools for parsing messy tabular data. This is now superseded by..
MIT
- pyexcel-xlsx (🥉23 · ⭐ 110 · 💀) - A wrapper library to read, manipulate and write data in xlsx..
BSD-3
- borb (🥉22 · ⭐ 3.3K) - borb is a library for reading, creating and manipulating PDF files..
❗Unlicensed
- datatest (🥉21 · ⭐ 290 · 💀) - Tools for test driven data-wrangling and data validation.
Apache-2
- csvs-to-sqlite (🥉15 · ⭐ 860 · 💀) - Convert CSV files into a SQLite database.
Apache-2
Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
Airflow (🥇49 · ⭐ 35K) - Platform to programmatically author, schedule, and monitor workflows. Apache-2
-
GitHub (👨‍💻 3.3K · 🔀 14K · 📥 620K · 📦 11K · 📋 9.3K - 10% open · ⏱️ 06.06.2024):
git clone https://github.com/apache/airflow
-
PyPi (📥 24M / month · 📦 470 · ⏱️ 06.06.2024):
pip install apache-airflow
-
Conda (📥 1.1M · ⏱️ 07.05.2024):
conda install -c conda-forge airflow
-
Docker Hub (📥 1.3B · ⭐ 520 · ⏱️ 06.06.2024):
docker pull apache/airflow
Celery (🥇46 · ⭐ 24K) - Asynchronous task queue/job queue based on distributed message passing. BSD-3
Prefect (🥇43 · ⭐ 15K) - Prefect is a workflow orchestration tool empowering developers to.. Apache-2
Great Expectations (🥈40 · ⭐ 9.6K) - Always know what to expect from your data. Apache-2
luigi (🥈38 · ⭐ 17K · 📈) - Luigi is a Python module that helps you build complex pipelines of.. Apache-2
Kedro (🥈38 · ⭐ 9.4K) - Kedro is a toolbox for production-ready data science. It uses software.. Apache-2
dbt (🥈38 · ⭐ 9.1K) - dbt enables data analysts and engineers to transform their data using the.. Apache-2
Activeloop (🥈33 · ⭐ 7.8K) - Database for AI. Store Vectors, Images, Texts, Videos, etc. Use.. MPL-2.0
whylogs (🥈31 · ⭐ 2.6K) - Open standard for end-to-end data and ML monitoring for any scale in.. Apache-2
PyFunctional (🥉27 · ⭐ 2.4K) - Python library for creating data pipelines with chain functional.. MIT
streamparse (🥉25 · ⭐ 1.5K) - Run Python in Apache Storm topologies. Pythonic API, CLI.. Apache-2
dbnd (🥉25 · ⭐ 250) - DBND is an agile pipeline framework that helps data engineering teams.. Apache-2
Databolt Flow (🥉19 · ⭐ 950 · 💤) - Python library for building highly effective data science.. MIT
BatchFlow (🥉19 · ⭐ 200) - BatchFlow helps you conveniently work with random or sequential.. Apache-2
Mara Pipelines (🥉16 · ⭐ 2.1K) - A lightweight opinionated ETL framework, halfway between plain.. MIT
Show 16 hidden projects...
- mrjob (🥈31 · ⭐ 2.6K · 💀) - Run MapReduce jobs on Hadoop or Amazon Web Services.
Apache-2
- faust (🥉29 · ⭐ 6.7K · 💀) - Python Stream Processing.
BSD-3
- Optimus (🥉25 · ⭐ 1.4K · 💀) - Agile Data Preparation Workflows made easy with Pandas,..
Apache-2
spark
- bonobo (🥉24 · ⭐ 1.6K · 💀) - Extract Transform Load for Python 3.5+.
Apache-2
- Pypeline (🥉24 · ⭐ 1.5K · 💀) - Concurrent data pipelines in Python.
MIT
- pysparkling (🥉23 · ⭐ 260 · 💀) - A pure Python implementation of Apache Spark's RDD and DStream..
MIT
- dpark (🥉22 · ⭐ 2.7K · 💀) - Python clone of Spark, a MapReduce alike framework in Python.
BSD-3
spark
- pdpipe (🥉20 · ⭐ 720 · 💀) - Easy pipelines for pandas DataFrames.
MIT
- spark-deep-learning (🥉19 · ⭐ 2K · 💀) - Deep Learning Pipelines for Apache Spark.
Apache-2
spark
- mrq (🥉19 · ⭐ 880 · 💀) - Mr. Queue - A distributed worker task queue in Python using Redis & gevent.
MIT
- riko (🥉18 · ⭐ 1.6K · 💀) - A Python stream processing engine modeled after Yahoo! Pipes.
MIT
- bodywork-core (🥉17 · ⭐ 430 · 💀) - ML pipeline orchestration and model deployments on..
❗️AGPL-3.0
- kale (🥉16 · ⭐ 630 · 💀) - Kubeflow's superfood for Data Scientists.
Apache-2
jupyter
- Botflow (🥉15 · ⭐ 1.2K · 💀) - Python Fast Dataflow programming framework for Data pipeline work(..
BSD-3
- RasgoQL (🥉13 · ⭐ 270 · 💀) - Write python locally, execute SQL in your data warehouse.
❗️AGPL-3.0
- datajob (🥉13 · ⭐ 110 · 💀) - Build and deploy a serverless data pipeline on AWS with no effort.
Apache-2
XlsxWriter (🥉36 · ⭐ 3.5K) - A Python module for creating Excel XLSX files. BSD-2
Show 3 hidden projects...
- importlib-resources (🥈31 · ⭐ 58) - Backport of the importlib.resources module.
Apache-2
- typing_inspect (🥉25 · ⭐ 330 · 💀) - Runtime inspection utilities for Python typing module.
MIT
- entrypoints (🥉23 · ⭐ 74 · 💀) - Discover and load entry points from installed packages.
MIT
more-itertools (🥇39 · ⭐ 3.5K) - More routines for operating on iterables, beyond itertools. MIT
ubelt (🥉24 · ⭐ 710) - A Python utility library with a stdlib like feel and extra batteries... Apache-2
Show 6 hidden projects...
- python-dependency-injector (🥈32 · ⭐ 3.6K · 💀) - Dependency injection framework for Python.
BSD-3
- retrying (🥉27 · ⭐ 1.9K · 💀) - Retrying is an Apache 2.0 licensed general-purpose retrying..
Apache-2
- ratelimit (🥉25 · ⭐ 720 · 💀) - API Rate Limit Decorator.
MIT
- pinject (🥉24 · ⭐ 1.3K · 💀) - A pythonic dependency injection library.
Apache-2
- CommonRegex (🥉23 · ⭐ 1.6K · 💀) - A collection of common regular expressions bundled with an easy..
MIT
- pampy (🥉22 · ⭐ 3.5K · 💀) - Pampy: The Pattern Matching for Python you always dreamed of.
MIT
micropython (🥈33 · ⭐ 19K) - MicroPython - a lean and efficient Python implementation for.. Python-2.0
Show 4 hidden projects...
- grumpy (🥈23 · ⭐ 11K · 💀) - Grumpy is a Python to Go source code transcompiler and runtime.
Apache-2
- pyston (🥉22 · ⭐ 2.5K · 💀) - A faster and highly-compatible implementation of the Python..
Apache-2
- stackless (🥉17 · ⭐ 1K · 💀) - The Stackless Python programming language.
❗Unlicensed
- cl-python (🥉11 · ⭐ 360 · 💤) - An implementation of Python in Common Lisp.
❗Unlicensed
cookiecutter (🥇41 · ⭐ 22K) - A cross-platform command-line utility that creates projects from.. BSD-3
py4j (🥈35 · ⭐ 1.2K) - Py4J enables Python programs to dynamically access arbitrary Java objects. BSD-3
pyscaffold (🥉29 · ⭐ 2K · 💤) - Python project template generator with batteries included. MIT
Send2Trash (🥉27 · ⭐ 260) - Python library to natively send files to Trash (or Recycle bin) on.. BSD-3
python-mss (🥉25 · ⭐ 970) - An ultra fast cross-platform multiple screenshots module in pure.. MIT
Show 6 hidden projects...
- keyboard (🥉32 · ⭐ 3.7K · 💀) - Hook and simulate global keyboard events on Windows and Linux.
MIT
- pyscreenshot (🥉26 · ⭐ 500 · 💀) - Python screenshot library, replacement for the Pillow..
BSD-2
- openpyxl (🥉26 · ⭐ 78) - A Python library to read/write Excel 2010 xlsx/xlsm files.
MIT
- powerline-shell (🥉25 · ⭐ 6.2K · 💀) - A beautiful and useful prompt for your shell.
MIT
- pluginbase (🥉24 · ⭐ 1.1K · 💀) - A simple but flexible plugin system for Python.
BSD-3
- macropy (🥉22 · ⭐ 3.3K · 💀) - Macros in Python: quasiquotes, case classes, LINQ and more!.
MIT
- Best-of lists: Discover other best-of lists with awesome open-source projects on all kinds of topics.
- best-of-ml-python: A ranked list of awesome machine learning Python libraries.
- best-of-web-python: A ranked list of awesome Python libraries for web development.
- best-of-python-dev: A ranked list of awesome Python developer tools and libraries.
- awesome-python: A curated list of awesome Python frameworks, libraries, software and resources.
Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:
- Open an issue by selecting one of the provided categories from the issue page and filling in the requested information.
- Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.
If you would like to contribute to or share suggestions regarding the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.
For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.