Starred repositories

Python 10 1 Updated Mar 10, 2023

[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

Python 86 7 Updated Jul 4, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 234 30 Updated Jan 15, 2025

Automatically Update LLM Papers Daily using GitHub Actions. Ref: https://github.com/Vincentqyw/cv-arxiv-daily

Python 3 Updated Jan 19, 2025
Jupyter Notebook 426 45 Updated Jan 17, 2025

Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.

Python 83 12 Updated Jan 16, 2025

Megatts2 using HierSpeechpp's vocoder

Python 17 1 Updated Dec 2, 2024

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

Python 5,308 328 Updated Jan 18, 2025

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

Python 908 49 Updated Dec 6, 2024

Interface for OuteTTS models.

Python 870 72 Updated Jan 17, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

3,624 223 Updated Dec 5, 2024

first base model for full-duplex conversational audio

Python 1,686 113 Updated Jan 5, 2025

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,378 166 Updated Jun 25, 2024

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 17,840 1,860 Updated Oct 15, 2024

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,574 207 Updated Dec 5, 2024

Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

Python 863 56 Updated Oct 28, 2024

Robust recipes to align language models with human and AI preferences

Python 4,907 427 Updated Nov 21, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,259 346 Updated Jan 14, 2025

Real-time Speech-Text Foundation Model Toolkit (wip)

Python 126 11 Updated Oct 14, 2024

[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"

Python 72 3 Updated Nov 14, 2024

Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯

Python 789 30 Updated Dec 27, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,752 186 Updated Nov 14, 2024

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 2,181 163 Updated Dec 6, 2024

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Python 785 48 Updated Jan 4, 2025

Large Model Fundamentals: the basics of large models in one article

3,516 317 Updated Dec 25, 2024

An Open-Sourced LLM-empowered Foundation TTS System

Python 533 40 Updated Oct 17, 2024

A word list containing 25 000 of the most popular English words, divided into syllables.

43 12 Updated Jan 17, 2016

The open source code for SimpleSpeech series

Python 121 7 Updated Oct 8, 2024

Evaluation Protocol for Large-Scale Zero-Shot TTS Literature

Python 70 9 Updated Sep 26, 2024