
Commit

[README]
kyegomez committed Dec 19, 2024
1 parent c88a85a commit e0d1bbe
Showing 78 changed files with 729 additions and 5,664 deletions.
18 changes: 0 additions & 18 deletions .pre-commit-config.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions .readthedocs.yml

This file was deleted.

25 changes: 0 additions & 25 deletions Dockerfile

This file was deleted.

22 changes: 0 additions & 22 deletions Makefile

This file was deleted.

170 changes: 132 additions & 38 deletions README.md
[![Multi-Modality](agorabanner.png)](https://discord.com/servers/agora-999382051935506503)

# OmegaViT: A State-of-the-Art Vision Transformer with Multi-Query Attention, State Space Modeling, and Mixture of Experts

[![Join our Discord](https://img.shields.io/badge/Discord-Join%20our%20server-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/agora-999382051935506503) [![Subscribe on YouTube](https://img.shields.io/badge/YouTube-Subscribe-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/@kyegomez3242) [![Connect on LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/kye-g-38759a207/) [![Follow on X.com](https://img.shields.io/badge/X.com-Follow-1DA1F2?style=for-the-badge&logo=x&logoColor=white)](https://x.com/kyegomezb)

[![PyPI version](https://badge.fury.io/py/omegavit.svg)](https://badge.fury.io/py/omegavit)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Build Status](https://github.com/Agora-Lab-AI/OmegaViT/workflows/build/badge.svg)](https://github.com/Agora-Lab-AI/OmegaViT/actions)
[![Documentation Status](https://readthedocs.org/projects/omegavit/badge/?version=latest)](https://omegavit.readthedocs.io/en/latest/?badge=latest)

OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space modeling, and a mixture-of-experts design to achieve strong performance across a range of computer vision tasks. The model can process images of any resolution while maintaining computational efficiency.

## Key Features

- **Flexible Resolution Processing**: Handles arbitrary input image sizes through adaptive patch embedding
- **Multi-Query Attention (MQA)**: Reduces computational complexity while maintaining model expressiveness
- **Rotary Embeddings**: Enables better modeling of relative positions and spatial relationships
- **State Space Models (SSM)**: Integrates efficient sequence modeling every third layer
- **Mixture of Experts (MoE)**: Implements conditional computation for enhanced model capacity (see the routing sketch after this list)
- **Comprehensive Logging**: Built-in loguru integration for detailed execution tracking
- **Shape-Aware Design**: Continuous tensor shape tracking for reliable processing
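
To make the conditional-computation idea concrete, here is a minimal top-1 (switch-style) expert router in PyTorch. This is an illustrative sketch only — the class name and routing details are assumptions, not OmegaViT's actual implementation:

```python
import torch
import torch.nn as nn


class Top1MoE(nn.Module):
    """Illustrative top-1 mixture-of-experts layer (hypothetical, simplified)."""

    def __init__(self, dim: int = 768, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        flat = x.reshape(-1, C)                    # route each token independently
        gates = self.router(flat).softmax(dim=-1)  # (tokens, experts)
        weight, choice = gates.max(dim=-1)         # top-1 expert per token
        out = torch.zeros_like(flat)
        for e, expert in enumerate(self.experts):
            mask = choice == e                     # tokens assigned to expert e
            if mask.any():
                out[mask] = weight[mask, None] * expert(flat[mask])
        return out.reshape(B, N, C)


print(Top1MoE()(torch.randn(2, 196, 768)).shape)  # torch.Size([2, 196, 768])
```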

## Architecture

```mermaid
flowchart TB
    subgraph Input
        img[Input Image]
    end
    subgraph PatchEmbed[Flexible Patch Embedding]
        conv[Convolution]
        norm1[LayerNorm]
        conv --> norm1
    end
    subgraph TransformerBlocks[Transformer Blocks x12]
        subgraph Block1[Block n]
            direction TB
            mqa[Multi-Query Attention]
            ln1[LayerNorm]
            moe1[Mixture of Experts]
            ln2[LayerNorm]
            ln1 --> mqa --> ln2 --> moe1
        end
        subgraph Block2[Block n+1]
            direction TB
            mqa2[Multi-Query Attention]
            ln3[LayerNorm]
            moe2[Mixture of Experts]
            ln4[LayerNorm]
            ln3 --> mqa2 --> ln4 --> moe2
        end
        subgraph Block3[Block n+2 SSM]
            direction TB
            ssm[State Space Model]
            ln5[LayerNorm]
            moe3[Mixture of Experts]
            ln6[LayerNorm]
            ln5 --> ssm --> ln6 --> moe3
        end
    end
    subgraph Output
        gap[Global Average Pooling]
        classifier[Classification Head]
    end
    img --> PatchEmbed --> TransformerBlocks --> gap --> classifier
```
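
The stacking rule in the diagram — multi-query attention in two consecutive blocks, a state space model in every third, each followed by a mixture-of-experts feed-forward — reduces to a simple schedule (a sketch, assuming blocks are counted from one):

```python
def layer_schedule(num_layers: int = 12) -> list[str]:
    """Token mixer per block: an SSM every third block, MQA otherwise."""
    return ["ssm" if (i + 1) % 3 == 0 else "mqa" for i in range(num_layers)]


print(layer_schedule())
# ['mqa', 'mqa', 'ssm', 'mqa', 'mqa', 'ssm', 'mqa', 'mqa', 'ssm', 'mqa', 'mqa', 'ssm']
```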

## Multi-Query Attention Detail

```mermaid
flowchart LR
    input[Input Features]
    subgraph MQA[Multi-Query Attention]
        direction TB
        q[Q Linear]
        k[K Linear]
        v[V Linear]
        rotary[Rotary Embeddings]
        attn[Attention Weights]
        input --> q & k & v
        q & k --> rotary
        rotary --> attn
        attn --> v
    end
    MQA --> output[Output Features]
```
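
In code, the defining trait of multi-query attention is that every query head attends over a single shared key/value head, which shrinks the K/V projections. A minimal PyTorch sketch (illustrative, not OmegaViT's exact implementation; the rotary embeddings from the diagram are marked by a comment but omitted):

```python
import torch
import torch.nn as nn


class MultiQueryAttention(nn.Module):
    """Minimal multi-query attention: all query heads share one K/V head."""

    def __init__(self, dim: int = 768, num_heads: int = 12):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.q_proj = nn.Linear(dim, dim)                 # one Q projection per head
        self.kv_proj = nn.Linear(dim, 2 * self.head_dim)  # a single shared K/V head
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        q = self.q_proj(x).view(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        k, v = self.kv_proj(x).split(self.head_dim, dim=-1)
        # Rotary embeddings would rotate q and k here; omitted for brevity.
        k = k.unsqueeze(1)  # (B, 1, N, head_dim), broadcast across query heads
        v = v.unsqueeze(1)
        attn = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        out = attn.softmax(dim=-1) @ v  # (B, heads, N, head_dim)
        return self.out_proj(out.transpose(1, 2).reshape(B, N, C))


print(MultiQueryAttention()(torch.randn(2, 196, 768)).shape)  # torch.Size([2, 196, 768])
```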

## Installation

```bash
pip install omegavit
```

## Quick Start

```python
import torch
from omegavit import create_advanced_vit

# Create model
model = create_advanced_vit(num_classes=1000)

# Example forward pass
batch_size = 8
x = torch.randn(batch_size, 3, 224, 224)
output = model(x)
print(f"Output shape: {output.shape}") # [8, 1000]
```
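
Because the patch embedding adapts to the input size, the same model should accept other resolutions without any changes — a sketch based on the flexible-resolution feature described above (the exact set of supported sizes may depend on the patch size):

```python
import torch
from omegavit import create_advanced_vit

model = create_advanced_vit(num_classes=1000)

# Assumes the adaptive patch embedding handles these sizes unchanged.
for size in (224, 384, 512):
    x = torch.randn(2, 3, size, size)
    print(size, model(x).shape)  # expected: torch.Size([2, 1000]) each time
```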

## Model Configurations

| Parameter | Default | Description |
|-----------|---------|-------------|
| hidden_size | 768 | Dimension of transformer layers |
| num_attention_heads | 12 | Number of attention heads |
| num_experts | 8 | Number of expert networks in MoE |
| expert_capacity | 32 | Tokens per expert in MoE |
| num_layers | 12 | Number of transformer blocks |
| patch_size | 16 | Size of image patches |
| ssm_state_size | 16 | Hidden state size in SSM |
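
If the factory exposes these settings as keyword arguments — an assumption; check the source for the actual signature — a custom configuration might look like this:

```python
from omegavit import create_advanced_vit

# Hypothetical keyword arguments mirroring the table above; the real
# factory signature may differ.
model = create_advanced_vit(
    num_classes=1000,
    hidden_size=768,
    num_attention_heads=12,
    num_experts=8,
    expert_capacity=32,
    num_layers=12,
    patch_size=16,
    ssm_state_size=16,
)
```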

## Performance

*Note: Benchmarks coming soon*

## Citation

If you use OmegaViT in your research, please cite:

```bibtex
@article{omegavit2024,
  title={OmegaViT: A State-of-the-Art Vision Transformer with Multi-Query Attention, State Space Modeling, and Mixture of Experts},
  author={Agora Lab},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2024}
}
```

## Contributing

We welcome contributions! Please see our [contributing guidelines](CONTRIBUTING.md) for details.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

Special thanks to the Agora Lab AI team and the open-source community for their valuable contributions and feedback.
Binary file removed agorabanner.png
Binary file removed docs/.DS_Store
42 changes: 0 additions & 42 deletions docs/applications/customer_support.md

This file was deleted.

Empty file removed docs/applications/enterprise.md
Empty file.
64 changes: 0 additions & 64 deletions docs/applications/marketing_agencies.md

This file was deleted.

