
Convert Articles to Podcast Audio with Kokoro


I wanted to convert my English articles into audio so I could listen while walking. After looking around, Kokoro was the only option I found that is free, sounds good, and runs locally.


What Is Kokoro

Kokoro is an open-source TTS (text-to-speech) model with 82M parameters: lightweight, but surprisingly good. Its English quality approaches that of ElevenLabs' commercial voices, it offers 88 voices to choose from, and its Apache 2.0 license permits free commercial use.

We'll use kokoro-web, a project that wraps Kokoro in a local service. It launches with a single Docker command and includes a web UI, so it's ready to use out of the box.


Choosing a Voice

Kokoro voices are graded A to D by quality, with A the highest. For English, just pick from these:

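| Voice   | Grade | Style                       |
|---------|-------|-----------------------------|
| Heart   | A     | Warm female voice, top pick |
| Bella   | A-    | High-quality female voice   |
| Michael | C+    | Mature male voice           |
| Emma    | B-    | British female voice        |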

The first time you use a voice, its model file is downloaded automatically and cached, so later runs don't download it again.


Installing Docker

If you don't have Docker yet, install it first.

International network: download it from the official site, docker.com/products/docker-desktop

Chinese network: the official site can be slow, so use Aliyun's mirror instead:

https://mirrors.aliyun.com/docker-toolbox/mac/docker-for-mac/

Download the .dmg, double-click to install, open Docker Desktop, and wait for the 🐳 whale icon to appear in the menu bar.

Then configure the command-line path so Terminal can find the docker command:

echo 'export PATH="$PATH:/Applications/Docker.app/Contents/Resources/bin"' >> ~/.zshrc
source ~/.zshrc
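
Running the echo line above a second time appends the same export again. A minimal sketch of an idempotent variant, assuming a hypothetical helper named append_once (it is not part of Docker or zsh) that only appends when the exact line is missing:

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
  line=$1
  file=$2
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Usage (not executed here):
#   append_once 'export PATH="$PATH:/Applications/Docker.app/Contents/Resources/bin"' ~/.zshrc
```

Calling it repeatedly leaves a single copy of the line in the file.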

Verify the installation:

docker --version && docker compose version
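
If the version check fails with "command not found", the PATH entry is the most likely cause. A small sketch, using a hypothetical helper name require_cmd, that reports whether a command is visible to the current shell:

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 (check PATH)" >&2
    return 1
  fi
}

# Usage (not executed here):
#   require_cmd docker
```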

Deploying kokoro-web

1. Create a project directory

mkdir kokoro-web && cd kokoro-web

Make sure you cd into the folder before doing anything else. The docker compose commands must run inside this directory; otherwise you'll get a "no configuration file provided" error.
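
The "no configuration file provided" error means Compose could not find a file under one of its default names. A rough pre-flight sketch follows; has_compose_file is a hypothetical helper name, and it only inspects the given directory (Compose itself also looks in parent directories):

```shell
# Hypothetical helper: print the first default Compose file found in a directory.
has_compose_file() {
  dir=${1:-.}
  for name in compose.yaml compose.yml docker-compose.yaml docker-compose.yml; do
    if [ -f "$dir/$name" ]; then
      echo "$dir/$name"
      return 0
    fi
  done
  echo "no compose file in $dir" >&2
  return 1
}

# Usage (not executed here):
#   has_compose_file && docker compose up -d
```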

2. Create the config file

cat > compose.yaml << 'EOF'
services:
  kokoro-web:
    image: ghcr.io/eduardolat/kokoro-web:latest
    ports:
      - "3000:3000"
    environment:
      - KW_SECRET_API_KEY=my-secret-key
    volumes:
      - ./kokoro-cache:/kokoro/cache
    restart: unless-stopped
EOF

KW_SECRET_API_KEY is the access password for the local service. For local use any value works; just remember what you set. I'd suggest keeping my-secret-key, the value used in the config above.

3. Start the service

docker compose up -d

The first start pulls the image, which can be slow on some networks, so be patient.

Follow the logs; once the "Listening" line appears, the service is up:

docker compose logs -f
# Success: Listening on http://0.0.0.0:3000
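
Instead of watching the logs, a script can poll the port until the service responds. This is a sketch under assumptions: curl is installed, and the web UI answers a plain GET on http://localhost:3000 with a success status once it is ready (that readiness behavior is my assumption). wait_for_url is a hypothetical helper name:

```shell
# Hypothetical helper: poll a URL until it responds or a timeout (seconds) elapses.
wait_for_url() {
  url=$1
  timeout=${2:-120}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "timed out after ${timeout}s: $url" >&2
  return 1
}

# Usage (not executed here):
#   docker compose up -d && wait_for_url http://localhost:3000 300
```

The generous timeout in the usage line leaves room for the first image pull.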

Using the Web UI

Open your browser and go to http://localhost:3000.

Configure the API (important)

Click API Settings in the top right corner:

  • Base URL: http://localhost:3000/api/v1 (default, no change needed)
  • API Key: my-secret-key

Gotcha: the API Key is just the value itself. Don't include the - KW_SECRET_API_KEY= prefix from the compose file.

Click OK.
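
To double-check which value belongs in the API Key field, the key can be read back from the compose file. A small sketch, assuming the key is set in the list-style "- KW_SECRET_API_KEY=value" form used in the config; compose_api_key is a hypothetical helper name:

```shell
# Hypothetical helper: print only what follows "KW_SECRET_API_KEY=" in a compose file.
compose_api_key() {
  # -n with the p flag prints only lines where the substitution matched;
  # head keeps the first match in case the variable appears more than once.
  sed -n 's/^[[:space:]]*-[[:space:]]*KW_SECRET_API_KEY=//p' "${1:-compose.yaml}" | head -n 1
}

# Usage (not executed here), run inside the kokoro-web directory:
#   compose_api_key
```

Whatever this prints is what goes into the API Key field, with nothing in front of it.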

Generation Settings

Back on the main screen, a few key options:

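| Option             | Recommended       | Notes                             |
|--------------------|-------------------|-----------------------------------|
| Execution place    | API (Self-hosted) | Use local Docker, not Browser     |
| Model quantization | q8f16             | Best balance of speed and quality |
| Language accent    | English (US)      | For English content               |
| Voice              | Heart             | Top recommendation                |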

Always set Execution place to API (Self-hosted). The Browser option runs the model inside your browser, which is much slower and re-downloads the model every time.

Paste your article into Text to process, then click Generate.

Demo

Input text:

Google is renowned for its innovative and employee-centric work environment. The company's campuses, often called "Googleplexes," feature vibrant designs with open spaces, recreational facilities, free gourmet meals, and wellness centers. Employees enjoy flexible work hours, remote options, and a strong emphasis on collaboration through team projects and hackathons.

A culture of creativity thrives with "20% time," encouraging personal passion projects that have led to major products like Gmail. Diversity, inclusion, and continuous learning are prioritized through training programs and supportive leadership. This unique blend of fun, freedom, and purpose fosters high productivity and job satisfaction, making Google a top destination for tech talent worldwide.

Generated with the Heart voice (~150 words, ~30 seconds on local CPU): [embedded audio sample]


Processing Markdown Files

Kokoro only accepts plain text. Pasting a .md file directly will cause it to read out ##, **bold**, and other Markdown syntax aloud.

Use pandoc to convert first:

# Install pandoc
brew install pandoc

# Convert md to plain text, then copy-paste the output
pandoc article.md -t plain
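
For several articles at once, the same conversion can be looped. A sketch with a hypothetical helper md_to_txt, assuming pandoc is installed; --wrap=none is added so paragraphs are not hard-wrapped, and -o writes a .txt file next to each source file:

```shell
# Hypothetical helper: convert each given Markdown file to a plain-text file beside it.
md_to_txt() {
  for f in "$@"; do
    out="${f%.md}.txt"   # article.md -> article.txt
    pandoc "$f" -t plain --wrap=none -o "$out" && echo "wrote $out"
  done
}

# Usage (not executed here):
#   md_to_txt *.md
#   pbcopy < article.txt   # macOS: put the text on the clipboard for pasting
```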

Useful Commands

docker compose logs -f # view live logs
docker compose stop # stop the service
docker compose start # start the service
docker compose pull && docker compose up -d # pull the latest image, then recreate the container to use it

Environment: MacBook Pro M1 Pro · macOS · Docker Desktop 29.4.3