mdaudiobook - Markdown to Audiobook Converter¶

mdaudiobook is a command-line tool that converts Markdown content into high-quality audiobooks and podcasts using advanced text-to-speech technology. It transforms written content into engaging audio experiences with natural-sounding narration.

Overview¶

mdaudiobook bridges the gap between written content and audio consumption. Whether you want to create audiobooks from your documentation, convert blog posts to podcasts, or generate narrated tutorials, mdaudiobook provides professional-grade text-to-speech conversion with customizable voices and audio processing.

Key Features¶

High-Quality TTS - Natural-sounding voice synthesis
Multiple Voice Options - Various voices and languages
Audio Processing - Noise reduction and enhancement
Chapter Support - Automatic chapter detection and navigation
Metadata Tagging - ID3 tags for podcast compatibility
Batch Processing - Convert multiple files at once
Customizable Output - Adjust speed, pitch, and audio quality
Cross-Platform - Works on Linux, macOS, and Windows

Installation¶

Via UCLI (Recommended)¶

# Install ucli tool manager first
curl -fsSL https://install.ucli.tools | bash

# Install mdaudiobook
ucli build mdaudiobook

Manual Installation¶

# Clone and install
git clone https://github.com/ucli-tools/mdaudiobook.git
cd mdaudiobook
make install

Prerequisites¶

Required Software¶

Python 3.8+ - Runtime environment
TTS Engine - eSpeak-ng, Festival, or cloud TTS services
Audio Processing - FFmpeg for audio manipulation
Python Packages - Automatically installed via pip

Installation by Platform¶

Ubuntu/Debian¶

# Install system dependencies
sudo apt update
sudo apt install -y python3 python3-pip ffmpeg espeak-ng

# Install TTS engines (choose one or more)
sudo apt install -y festival festvox-us1  # Festival TTS
# OR
pip install pyttsx3  # Python TTS wrapper

macOS¶

# Install system dependencies
brew install python3 ffmpeg

# Install TTS
pip install pyttsx3
# OR use macOS built-in 'say' command

Windows¶

# Install Python from python.org
# Install FFmpeg from ffmpeg.org

# Install TTS engines
pip install pyttsx3
# OR install eSpeak: https://espeak.sourceforge.net/

Usage¶

Basic Conversion¶

Convert Markdown to Audiobook¶

mdaudiobook generate document.md

Specify Output File¶

mdaudiobook generate document.md audiobook.mp3

Convert with Custom Settings¶

mdaudiobook generate document.md \
  --voice female \
  --speed 1.2 \
  --output audiobook.wav

Command-Line Options¶

Voice and Audio Settings¶

--voice VOICE          # Voice selection (male/female/child)
--language LANG        # Language code (en/us/fr/de)
--speed FACTOR         # Speech speed (0.5-2.0, default: 1.0)
--pitch SHIFT          # Voice pitch adjustment
--volume LEVEL         # Audio volume (0.0-1.0)

Output Configuration¶

--format FORMAT        # Output format (mp3/wav/flac)
--bitrate RATE         # Audio bitrate (64k/128k/320k)
--sample-rate RATE     # Sample rate (22050/44100)
--channels MODE        # Audio channels (mono/stereo)

Content Processing¶

--chapter-marker TEXT  # Chapter separation marker
--pause-duration SEC   # Pause between sections
--metadata TITLE      # Audiobook title metadata
--author NAME         # Author metadata

Interactive Mode¶

Run without arguments for guided setup:

mdaudiobook

The interactive mode will prompt for: - Input file selection - Voice preferences - Output format options - Audio quality settings

Content Processing¶

Markdown Parsing¶

mdaudiobook intelligently processes Markdown content:

Headers Become Chapters¶

# Chapter 1: Introduction
This is the first chapter content...

## Section 1.1: Background
More detailed content...

# Chapter 2: Main Content
Next chapter begins automatically...

Formatting Handling¶

Bold/Italic: Emphasized speech delivery
Code blocks: Announced as "code block" with contents
Lists: Numbered items announced clearly
Links: URLs read out phonetically
Tables: Converted to descriptive text

Metadata Extraction¶

YAML frontmatter is used for audiobook metadata:

---
title: "My Audiobook"
author: "Author Name"
description: "A great audiobook"
language: "en"
duration: "2h 30m"
---

Voice and Audio Configuration¶

Voice Selection¶

Built-in Voices¶

# List available voices
mdaudiobook voices

# Use specific voice
mdaudiobook generate doc.md --voice "Zira"  # Windows
mdaudiobook generate doc.md --voice "Alex"  # macOS

TTS Engine Configuration¶

eSpeak-ng (Cross-platform):

# Configure voice
mdaudiobook generate doc.md --voice espeak --language en-us

Festival (Linux):

# Use Festival with custom voice
mdaudiobook generate doc.md --tts-engine festival --voice us1

Audio Enhancement¶

Quality Settings¶

# High-quality output
mdaudiobook generate doc.md \
  --format wav \
  --sample-rate 44100 \
  --bitrate 320k \
  --channels stereo

Noise Reduction¶

# Apply audio processing
mdaudiobook generate doc.md --denoise --normalize

Advanced Features¶

Batch Processing¶

Convert Multiple Files¶

# Convert all markdown files in directory
mdaudiobook batch-convert *.md --output-dir audiobooks/

Playlist Generation¶

# Create M3U playlist from multiple files
mdaudiobook generate *.md --playlist audiobook.m3u

Podcast Integration¶

RSS Feed Generation¶

# Generate podcast RSS feed
mdaudiobook podcast-generate chapters/ \
  --title "My Podcast" \
  --description "Weekly episodes" \
  --feed-url "https://example.com/feed.xml"

Chapter Markers¶

# Add chapter markers for podcast apps
mdaudiobook generate book.md \
  --chapters \
  --chapter-file chapters.json

Integration with Other Tools¶

Documentation Workflows¶

# Convert README to audio documentation
mdaudiobook generate README.md docs.mp3 \
  --voice female \
  --speed 0.9 \
  --metadata "Project Documentation"

Educational Content¶

# Create narrated tutorials
mdaudiobook generate tutorial.md tutorial.mp3 \
  --voice teacher \
  --pause-duration 0.5 \
  --chapter-marker "##"

Customization¶

Configuration Files¶

Create ~/.config/mdaudiobook/config.yaml:

# Default settings
voice: "female"
language: "en-us"
speed: 1.0
format: "mp3"
bitrate: "128k"
sample_rate: 44100

# TTS engine settings
tts_engine: "espeak"  # espeak, festival, pyttsx3, system
espeak:
  voice: "en-us"
  pitch: 50
  speed: 175

# Audio processing
denoise: true
normalize: true
pause_duration: 0.3

Voice Profiles¶

Define custom voice profiles:

voices:
  narrator:
    engine: "espeak"
    voice: "en-gb"
    speed: 0.9
    pitch: 60

  character:
    engine: "festival"
    voice: "us3"
    speed: 1.1
    pitch: 70

Troubleshooting¶

Common Issues¶

TTS Engine Not Found¶

# Check available TTS engines
mdaudiobook engines

# Install missing engines
# Ubuntu: sudo apt install espeak-ng festival
# macOS: brew install espeak

Audio Quality Issues¶

# Test audio output
mdaudiobook test-audio

# Check FFmpeg installation
ffmpeg -version

File Encoding Problems¶

# Ensure UTF-8 encoding
file document.md

# Convert encoding if needed
iconv -f latin1 -t utf8 document.md > document_utf8.md

Memory Issues with Large Files¶

# Process in chunks
mdaudiobook generate large-doc.md --chunk-size 1000

# Use streaming for very large files
mdaudiobook generate large-doc.md --stream

Performance Optimization¶

Fast Processing¶

# Use faster TTS engine
mdaudiobook generate doc.md --tts-engine espeak --speed 2.0

# Reduce audio quality for speed
mdaudiobook generate doc.md --format mp3 --bitrate 64k

Parallel Processing¶

# Process multiple files simultaneously
mdaudiobook batch-convert *.md --parallel 4

Debug Mode¶

Enable verbose output:

mdaudiobook generate --debug doc.md

Examples¶

Basic Audiobook Creation¶

# Simple conversion
mdaudiobook generate story.md audiobook.mp3

# With custom voice settings
mdaudiobook generate story.md \
  --voice female \
  --speed 0.9 \
  --language en-gb \
  audiobook.mp3

Podcast Episode¶

---
title: "Episode 1: Getting Started"
author: "Podcast Host"
description: "Introduction to our topic"
duration: "25m 30s"
---

# Welcome to Our Podcast

Hello and welcome to the first episode...

## Today's Topic

We'll be discussing...

## Key Points

- Point one: Important concept
- Point two: Another key idea
- Point three: Final thoughts

# Generate podcast episode
mdaudiobook generate episode1.md episode1.mp3 \
  --voice male \
  --speed 1.0 \
  --metadata "Podcast Name - Episode 1" \
  --author "Host Name"

Documentation Narration¶

# Create narrated API documentation
mdaudiobook generate api-docs.md api-guide.mp3 \
  --voice neutral \
  --speed 0.8 \
  --chapters \
  --format mp3 \
  --bitrate 128k

Educational Content¶

# Generate narrated lecture
mdaudiobook generate lecture.md lecture.mp3 \
  --voice professor \
  --pause-duration 0.7 \
  --chapter-marker "#" \
  --metadata "CS101 - Lecture 1"

Integration Examples¶

Documentation Pipeline¶

#!/bin/bash
# Automated documentation audio generation

# Generate API docs audio
mdaudiobook generate api.md api-docs.mp3 \
  --voice female \
  --speed 0.9 \
  --chapters

# Generate changelog audio
mdaudiobook generate CHANGELOG.md changelog.mp3 \
  --voice male \
  --speed 1.0

# Create playlist
echo "# Documentation Audio Guide" > docs.m3u
echo "api-docs.mp3" >> docs.m3u
echo "changelog.mp3" >> docs.m3u

CI/CD Integration¶

GitHub Actions¶

- name: Generate Documentation Audio
  run: |
    pip install mdaudiobook
    mdaudiobook generate README.md docs.mp3 \
      --voice female \
      --speed 0.9 \
      --metadata "Project Documentation"
- name: Upload Audio
  uses: actions/upload-artifact@v3
  with:
    name: documentation-audio
    path: docs.mp3

Content Management System¶

# Python script for automated audio generation
import subprocess
import os

def generate_audio(content_file, output_file, voice="female"):
    cmd = [
        "mdaudiobook", "generate",
        content_file,
        "--voice", voice,
        "--speed", "0.9",
        "--output", output_file
    ]
    subprocess.run(cmd, check=True)

# Generate audio for all blog posts
for md_file in os.listdir("posts/"):
    if md_file.endswith(".md"):
        audio_file = md_file.replace(".md", ".mp3")
        generate_audio(f"posts/{md_file}", f"audio/{audio_file}")

Contributing¶

Development Setup¶

# Clone repository
git clone https://github.com/ucli-tools/mdaudiobook.git
cd mdaudiobook

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
python -m pytest

# Install locally for testing
pip install -e .

Adding TTS Engines¶

Create new TTS engine class in mdaudiobook/engines/
Implement the TTSEngine interface
Add configuration options
Update documentation

Voice Development¶

Contribute new voice profiles by submitting PRs with: - Voice configuration - Audio samples - Performance benchmarks

Changelog¶

Version 2.0.0¶

Multiple TTS engine support
Batch processing capabilities
Podcast RSS generation
Advanced audio processing

Version 1.5.0¶

Chapter detection and navigation
Metadata tagging support
Multiple output formats
Voice customization options

Version 1.0.0¶

Initial release
Basic Markdown to audio conversion
Single TTS engine support
Cross-platform compatibility

Support¶

Documentation: docs.ucli.tools/tools/mdaudiobook
Issues: GitHub Issues
Discussions: Community Forums

License¶

Licensed under the Apache License 2.0. See LICENSE for details.

Ready to turn text into speech? Try ucli build mdaudiobook to get started!

UCLI Tools Ecosystem

Professional CLI tools for developers

UCLI Tools

ucli.tools

Install

install.ucli.tools

Registry

registry.ucli.tools

You are here

Documentation

docs.ucli.tools