voxcpm_rs

voxcpm_rs is a Rust reimplementation of VoxCPM-0.5B built on the burn deep learning framework.

Repository: madushan1000/voxcpm_rs
Upstream: OpenBMB/VoxCPM

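Note

This is an experimental project targeting VoxCPM-0.5B. It supports voice cloning from a reference audio clip (see the voice cloning usage below).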

Supported VoxCPM versions
Version	Status
VoxCPM 1.0 (0.5B)	✅ Supported (16 kHz sample rate hardcoded in the source)
VoxCPM 1.5	❌ Untested
VoxCPM 2	❌ Not supported

Key features

Pure Rust implementation: no dependency on a Python runtime
Converts Hugging Face format weights to burn format
CLI for text-to-speech synthesis and voice cloning

Prerequisites

Rust toolchain (stable)
A specific commit of the burn framework (see below)
openbmb/VoxCPM-0.5B weights

Installation

This project depends on a pinned commit of burn:

# Clone and checkout the required burn version
git clone https://github.com/tracel-ai/burn.git
cd burn
git checkout e0847cbf618395775bf534cbece9f0c7f0d897be
cd ..

# Download VoxCPM-0.5B weights
git clone https://huggingface.co/openbmb/VoxCPM-0.5B

# Clone and build voxcpm_rs
git clone https://github.com/madushan1000/voxcpm_rs.git
cd voxcpm_rs
cargo build --release
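
Because the build depends on one specific burn commit, it can help to confirm the checkout before running cargo build. The following is a minimal sketch and not part of voxcpm_rs: the function name check_checkout and the variable BURN_PINNED_COMMIT are hypothetical, and it assumes git is on the PATH and that burn was cloned next to voxcpm_rs as in the steps above.

```shell
# Commit pinned in the installation steps.
BURN_PINNED_COMMIT="e0847cbf618395775bf534cbece9f0c7f0d897be"

# Hypothetical helper: succeed only if the git checkout in $1 has HEAD
# at commit $2 (defaults to the pinned burn commit).
# Returns 0 on match, 1 on mismatch, 2 if HEAD cannot be read.
check_checkout() {
    repo_dir=$1
    expected=${2:-$BURN_PINNED_COMMIT}
    actual=$(git -C "$repo_dir" rev-parse HEAD 2>/dev/null) || {
        echo "cannot read HEAD in $repo_dir" >&2
        return 2
    }
    if [ "$actual" = "$expected" ]; then
        echo "ok: $repo_dir is at $expected"
    else
        echo "mismatch: $repo_dir is at $actual, expected $expected" >&2
        return 1
    fi
}
```

From inside the voxcpm_rs directory, `check_checkout ../burn` reports whether the sibling checkout matches the pinned commit.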

Basic usage

Convert weights

cargo run --release --bin voxcpm convert \
    --input-path ../VoxCPM-0.5B/ \
    --output-path burn-models/

Generate speech (zero-shot)

cargo run --release --bin voxcpm run \
    --model-path burn-models/ \
    --target-text "Hello, this is VoxCPM running in Rust."

# Play the output
mpv output.wav

Voice cloning

cargo run --release --bin voxcpm run \
    --model-path burn-models/ \
    --target-text "Cloned voice output." \
    --prompt-text "Reference transcript" \
    --prompt-wav-path ref_voice.wav \
    --max-len 2048

The output is saved as output.wav in the current directory.
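
Since 16 kHz is hardcoded in the source, it may be worth checking the sample rate of a reference clip before passing it to --prompt-wav-path. The sketch below is not part of voxcpm_rs and rests on two assumptions: that the hardcoded rate also applies to the reference WAV (this page does not say so), and that the file has a canonical header whose "fmt " chunk starts at byte 12, which places the sample rate at bytes 24 to 27. The function name wav_sample_rate is hypothetical.

```shell
# Hypothetical helper: print the sample rate stored in a WAV header.
# Assumes the canonical layout where the "fmt " chunk follows the
# 12-byte RIFF/WAVE preamble, so the rate sits at bytes 24..27.
wav_sample_rate() {
    wav_file=$1
    # Bytes 0..3 must be the ASCII tag "RIFF".
    magic=$(dd if="$wav_file" bs=4 count=1 2>/dev/null)
    if [ "$magic" != "RIFF" ]; then
        echo "not a RIFF file: $wav_file" >&2
        return 1
    fi
    # Read bytes 24..27 as four unsigned decimals so the result
    # does not depend on the host byte order.
    set -- $(od -An -t u1 -j 24 -N 4 "$wav_file")
    if [ "$#" -ne 4 ]; then
        echo "header too short: $wav_file" >&2
        return 1
    fi
    # Combine the bytes as a little-endian 32-bit integer.
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}
```

For example, `wav_sample_rate ref_voice.wav` prints the rate in Hz. If it is not 16000, resampling the clip with an external tool before cloning is one option, though whether that is required here is not confirmed by this page.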

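Limitations

VoxCPM-0.5B only: VoxCPM 1.5 has not been tested
Requires a pinned version of the burn framework
Limited documentation: see the source code for advanced usage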