Orpheus TTS

TTS Towards Human-Sounding Speech

Overview of Orpheus TTS

Orpheus TTS is a Text-to-Speech (TTS) model designed to produce human-sounding speech.

Goal: Generate speech with human-level quality.
Model: A state-of-the-art speech-LLM based on the Llama architecture.
Model Sizes: Available in several sizes, including Medium (3B parameters), Small (1B parameters), Tiny (400M parameters), and Nano (150M parameters).
Quality: Generates high-quality speech even with very small models.
Application: Suitable for production use; both pre-trained and fine-tuned models are provided.
Features: Supports zero-shot voice cloning and custom fine-tuning.
Real-time Streaming: Provides a Python package for real-time streaming with fast inference.

In short, Orpheus TTS pairs high-quality speech generation across a range of model sizes with voice cloning and real-time streaming.
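
To make the real-time streaming idea concrete, here is a minimal sketch of how a client might consume audio that arrives in chunks and write it to a WAV file as it comes in. It does not call the Orpheus package: fake_speech_chunks is a hypothetical stand-in that yields a plain tone, and the audio format (24 kHz, 16-bit, mono) is an assumption that should be checked against the model's documentation.

```python
import math
import struct
import wave
from typing import Iterator

SAMPLE_RATE = 24_000  # assumed output rate in Hz (not confirmed by this overview)
SAMPLE_WIDTH = 2      # assumed 16-bit PCM samples
CHANNELS = 1          # assumed mono audio


def fake_speech_chunks(seconds: float = 1.0, chunk_ms: int = 100) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS generator.

    Yields a 220 Hz tone as raw little-endian 16-bit PCM, one chunk at a time.
    """
    total = int(seconds * SAMPLE_RATE)
    per_chunk = SAMPLE_RATE * chunk_ms // 1000
    for start in range(0, total, per_chunk):
        n = min(per_chunk, total - start)
        samples = (
            int(0.3 * 32767 * math.sin(2 * math.pi * 220 * (start + i) / SAMPLE_RATE))
            for i in range(n)
        )
        yield struct.pack(f"<{n}h", *samples)


def write_stream(chunks: Iterator[bytes], path: str) -> float:
    """Write chunks to a WAV file as they arrive; return the duration in seconds."""
    frames = 0
    with wave.open(path, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(SAMPLE_RATE)
        for chunk in chunks:
            # Each chunk is written immediately instead of buffering the whole utterance.
            wf.writeframes(chunk)
            frames += len(chunk) // (SAMPLE_WIDTH * CHANNELS)
    return frames / SAMPLE_RATE


if __name__ == "__main__":
    duration = write_stream(fake_speech_chunks(seconds=1.0), "output.wav")
    print(f"wrote {duration:.2f} s of audio")
```

In real use, the stand-in generator would be replaced by whatever chunk iterator the Orpheus Python package exposes; the consumer loop stays the same as long as the chunks are raw PCM in the assumed format.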