Quick Start
===========

This page gets you from zero to a working VoxCPM setup as fast as possible. Follow it top to bottom and you will have generated audio through three different paths: the Python API, the CLI, and the web demo.

Install
*******

.. code-block:: sh

    pip install voxcpm

That's it. For other installation methods, such as a source checkout, see :doc:`./installation`.

Step 1: Python API
******************

Start with the current recommended release, ``VoxCPM 2``:

.. code-block:: python

    from voxcpm import VoxCPM
    import soundfile as sf

    model = VoxCPM.from_pretrained(
        "openbmb/VoxCPM2",
        load_denoiser=False,
    )

    wav = model.generate(
        text="VoxCPM 2 is the current recommended release for realistic multilingual speech synthesis.",
        cfg_value=2.0,
        inference_timesteps=10,
    )
    sf.write("demo.wav", wav, model.tts_model.sample_rate)
    print("saved: demo.wav")

The first run downloads model weights automatically. If you have trouble accessing Hugging Face, see the mirror setup in :doc:`./installation`.

This example does not enable the optional denoiser, which is only needed when you want to enhance prompt or reference audio for voice cloning. See :doc:`./usage_guide` for details.

If this script runs and produces ``demo.wav``, your installation is working.
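
To go one step beyond checking that the file exists, you can inspect its header. The following is a minimal sketch using only the standard library; ``wav_summary`` is a hypothetical helper, not part of VoxCPM, and it assumes ``demo.wav`` was written as PCM WAV (which the ``wave`` module requires).

.. code-block:: python

    import wave
    from pathlib import Path


    def wav_summary(path):
        """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
        with wave.open(str(path), "rb") as wf:
            rate = wf.getframerate()
            channels = wf.getnchannels()
            # Duration is the frame count divided by frames per second.
            duration = wf.getnframes() / rate
        return rate, channels, duration


    if __name__ == "__main__":
        target = Path("demo.wav")  # file produced by the script above
        if target.exists():
            rate, channels, duration = wav_summary(target)
            print(f"{target}: {rate} Hz, {channels} channel(s), {duration:.2f} s")
        else:
            print(f"{target} not found; run the synthesis script first")

A duration of zero, or a ``wave.Error`` on open, suggests the file was not written as expected.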

.. tip::

   Runtime device selection is automatic by default. ``VoxCPM.from_pretrained(..., device="auto")``
   prefers ``cuda -> mps -> cpu``. You can also force a device explicitly with
   ``device="cpu"``, ``device="mps"``, ``device="cuda"``, or ``device="cuda:0"``.
   If you hit platform-specific ``torch.compile`` issues, try ``optimize=False``.
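
To preview which device ``device="auto"`` is likely to choose on your machine, the sketch below reproduces the documented ``cuda -> mps -> cpu`` preference order. It is an illustration only, not VoxCPM's internal implementation; ``pick_device`` and ``detect_device`` are hypothetical helper names, and the PyTorch probe falls back to ``cpu`` when ``torch`` is not importable.

.. code-block:: python

    def pick_device(has_cuda, has_mps):
        """Apply the documented preference order: cuda, then mps, then cpu."""
        if has_cuda:
            return "cuda"
        if has_mps:
            return "mps"
        return "cpu"


    def detect_device():
        """Probe PyTorch for available backends (assumes torch is installed)."""
        try:
            import torch
        except ImportError:
            return "cpu"
        # The MPS backend module is absent in older PyTorch builds.
        mps = getattr(torch.backends, "mps", None)
        has_mps = mps is not None and mps.is_available()
        return pick_device(torch.cuda.is_available(), has_mps)


    if __name__ == "__main__":
        print("preferred device:", detect_device())

The returned string has the same form as the explicit values accepted by the ``device`` argument described in the tip above.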

.. note::

   For new projects, start with :doc:`./models/voxcpm2`, which is the current version. Earlier releases remain available from :doc:`./models/version_history` when you need an older checkpoint.

Step 2: CLI
***********

VoxCPM also provides a command-line interface. The CLI defaults to ``openbmb/VoxCPM2``, so you can use the recommended subcommands directly unless you want to override the checkpoint with ``--hf-model-id``:

.. code-block:: sh

    # Direct synthesis
    voxcpm design \
        --text "Hello from VoxCPM!" \
        --output out.wav

    # Reference-only cloning (VoxCPM 2)
    voxcpm clone \
        --text "This is a cloned voice sample." \
        --reference-audio path/to/voice.wav \
        --output out.wav \
        --denoise

    # Force CPU or MPS explicitly when needed
    voxcpm design --text "Hello from VoxCPM!" --device cpu --output out.wav
    voxcpm design --text "Hello from VoxCPM!" --device mps --no-optimize --output out.wav

    # Help
    voxcpm --help
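
If you want to drive the CLI from a script, for example to synthesize several lines in a row, one option is to build the command with ``subprocess``. This is a sketch under assumptions: ``build_design_cmd`` and ``synthesize`` are hypothetical helpers, and they use only the ``design`` subcommand and the ``--text``, ``--output``, and ``--device`` flags shown above.

.. code-block:: python

    import shlex
    import subprocess


    def build_design_cmd(text, output, device=None):
        """Build the argument list for a `voxcpm design` call."""
        cmd = ["voxcpm", "design", "--text", text, "--output", output]
        if device is not None:
            cmd += ["--device", device]
        return cmd


    def synthesize(text, output, device=None):
        """Run the CLI; raises CalledProcessError on a non-zero exit code."""
        subprocess.run(build_design_cmd(text, output, device), check=True)


    if __name__ == "__main__":
        lines = ["Hello from VoxCPM!", "This is a second sentence."]
        for i, line in enumerate(lines):
            cmd = build_design_cmd(line, f"out_{i}.wav", device="cpu")
            # Dry run: print the command. Call synthesize(...) to execute it.
            print(shlex.join(cmd))

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or other special characters. Note that each CLI call is a separate process and loads the model again, so the Python API from Step 1 is likely the better fit for large batches.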

Step 3: Web Demo
****************

The web demo requires a source checkout. Even if you already ran ``pip install voxcpm`` in the Install section above, you still need to clone the repository and install from it:

.. code-block:: sh

    git clone https://github.com/OpenBMB/VoxCPM.git
    cd VoxCPM
    pip install -e .
    python app.py

The web demo also downloads an additional ASR model (SenseVoice-Small) on first use for prompt audio transcription.

What's Next?
************

* Continue with :doc:`./usage_guide` for prompt strategy, voice cloning tips, and quality tuning.
* Check the pages under ``Models`` in the sidebar for version-specific features and migration notes.
* Fine-tune the model with :doc:`./finetuning/finetune` to adapt it to your use case.
* Deploy the model with :doc:`./deployment/nanovllm_voxcpm` for high-throughput serving.