mdaudiobook - Markdown to Audiobook Converter¶
mdaudiobook is a command-line tool that converts Markdown content into audiobooks and podcast episodes using text-to-speech (TTS) engines. It produces narrated audio with configurable voices, speed, and output formats.
Overview¶
mdaudiobook bridges the gap between written content and audio consumption. Use it to create audiobooks from documentation, convert blog posts to podcast episodes, or generate narrated tutorials, with customizable voices and audio processing.
Key Features¶
- High-Quality TTS - Natural-sounding voice synthesis
- Multiple Voice Options - Various voices and languages
- Audio Processing - Noise reduction and enhancement
- Chapter Support - Automatic chapter detection and navigation
- Metadata Tagging - ID3 tags for podcast compatibility
- Batch Processing - Convert multiple files at once
- Customizable Output - Adjust speed, pitch, and audio quality
- Cross-Platform - Works on Linux, macOS, and Windows
Installation¶
Via UCLI (Recommended)¶
# Install ucli tool manager first
curl -fsSL https://install.ucli.tools | bash
# Install mdaudiobook
ucli build mdaudiobook
Manual Installation¶
# Clone and install
git clone https://github.com/ucli-tools/mdaudiobook.git
cd mdaudiobook
make install
Prerequisites¶
Required Software¶
- Python 3.8+ - Runtime environment
- TTS Engine - eSpeak-ng, Festival, or cloud TTS services
- Audio Processing - FFmpeg for audio manipulation
- Python Packages - Automatically installed via pip
Installation by Platform¶
Ubuntu/Debian¶
# Install system dependencies
sudo apt update
sudo apt install -y python3 python3-pip ffmpeg espeak-ng
# Install TTS engines (choose one or more)
sudo apt install -y festival festvox-us1 # Festival TTS
# OR
pip install pyttsx3 # Python TTS wrapper
macOS¶
# Install system dependencies
brew install python3 ffmpeg
# Install TTS
pip install pyttsx3
# OR use macOS built-in 'say' command
Windows¶
# Install Python from python.org
# Install FFmpeg from ffmpeg.org
# Install TTS engines
pip install pyttsx3
# OR install eSpeak: https://espeak.sourceforge.net/
Usage¶
Basic Conversion¶
Convert Markdown to Audiobook¶
Specify Output File¶
Convert with Custom Settings¶
Command-Line Options¶
Voice and Audio Settings¶
--voice VOICE # Voice selection (male/female/child, or a named voice such as "Alex")
--language LANG # Language code (en/en-us/fr/de)
--speed FACTOR # Speech speed (0.5-2.0, default: 1.0)
--pitch SHIFT # Voice pitch adjustment
--volume LEVEL # Audio volume (0.0-1.0)
Output Configuration¶
--format FORMAT # Output format (mp3/wav/flac)
--bitrate RATE # Audio bitrate (64k/128k/320k)
--sample-rate RATE # Sample rate (22050/44100)
--channels MODE # Audio channels (mono/stereo)
Content Processing¶
--chapter-marker TEXT # Chapter separation marker
--pause-duration SEC # Pause between sections
--metadata TITLE # Audiobook title metadata
--author NAME # Author metadata
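When these options are driven from a script, it can help to validate values before invoking the tool. The following is a minimal sketch that assumes only the flags and ranges listed above and the positional output file used in the examples; `build_command` is a hypothetical helper, not part of mdaudiobook.

```python
from typing import List, Optional


def build_command(
    input_file: str,
    output_file: Optional[str] = None,
    voice: Optional[str] = None,
    speed: float = 1.0,
    volume: Optional[float] = None,
    fmt: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `mdaudiobook generate` (hypothetical helper)."""
    # Ranges and choices are taken from the option list above.
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed must be within 0.5-2.0, got {speed}")
    if volume is not None and not 0.0 <= volume <= 1.0:
        raise ValueError(f"volume must be within 0.0-1.0, got {volume}")
    if fmt is not None and fmt not in {"mp3", "wav", "flac"}:
        raise ValueError(f"unsupported format: {fmt}")

    cmd = ["mdaudiobook", "generate", input_file]
    if output_file:
        cmd.append(output_file)
    if voice:
        cmd += ["--voice", voice]
    cmd += ["--speed", str(speed)]
    if volume is not None:
        cmd += ["--volume", str(volume)]
    if fmt:
        cmd += ["--format", fmt]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with file names that contain spaces.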
Interactive Mode¶
Run without arguments for guided setup:
mdaudiobook
The interactive mode will prompt for:
- Input file selection
- Voice preferences
- Output format options
- Audio quality settings
Content Processing¶
Markdown Parsing¶
mdaudiobook intelligently processes Markdown content:
Headers Become Chapters¶
# Chapter 1: Introduction
This is the first chapter content...
## Section 1.1: Background
More detailed content...
# Chapter 2: Main Content
Next chapter begins automatically...
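To make the header-to-chapter mapping concrete, here is a minimal sketch of how a document like the one above could be split at level-1 headers. It illustrates the idea only and is not mdaudiobook's actual parser; `split_chapters` is a hypothetical name.

```python
import re
from typing import List, Tuple


def split_chapters(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown text into (title, body) pairs at level-1 headers.

    Simplifications: text before the first header is dropped, and '# '
    lines inside code blocks are not treated specially.
    """
    chapters: List[Tuple[str, str]] = []
    title = None
    body: List[str] = []
    for line in markdown.splitlines():
        # '# ' matches level-1 headers only; '## ' does not match.
        match = re.match(r"^# (.+)$", line)
        if match:
            if title is not None:
                chapters.append((title, "\n".join(body).strip()))
            title = match.group(1).strip()
            body = []
        elif title is not None:
            body.append(line)
    if title is not None:
        chapters.append((title, "\n".join(body).strip()))
    return chapters
```

Applied to the sample above, this yields two chapters, with the `## Section 1.1: Background` line kept inside the body of the first.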
Formatting Handling¶
- Bold/Italic: Emphasized speech delivery
- Code blocks: Announced as "code block" with contents
- Lists: Numbered items announced clearly
- Links: URLs read out phonetically
- Tables: Converted to descriptive text
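As an illustration of the table rule, the sketch below turns a pipe-delimited Markdown table into one spoken sentence per row. The wording of the output ("Row 1. ...") is an assumption made for this example; the phrasing mdaudiobook actually produces may differ, and `table_to_text` is a hypothetical helper.

```python
from typing import List


def table_to_text(table: str) -> str:
    """Turn a pipe-delimited Markdown table into one sentence per data row."""
    rows: List[List[str]] = []
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the header separator row, e.g. |------|:----:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    if len(rows) < 2:
        return ""  # a header without data rows has nothing to narrate
    header, data = rows[0], rows[1:]
    sentences = []
    for number, row in enumerate(data, start=1):
        parts = [f"{name}: {value}" for name, value in zip(header, row)]
        sentences.append(f"Row {number}. " + ", ".join(parts) + ".")
    return " ".join(sentences)
```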
Metadata Extraction¶
YAML frontmatter is used for audiobook metadata:
---
title: "My Audiobook"
author: "Author Name"
description: "A great audiobook"
language: "en"
duration: "2h 30m"
---
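For readers who want to inspect this metadata in their own scripts, the following sketch separates a frontmatter block from the Markdown body. It deliberately handles only flat `key: "value"` pairs like those shown above; for full YAML, a real parser such as PyYAML's `yaml.safe_load` is the better choice. `parse_frontmatter` is a hypothetical helper, not an mdaudiobook API.

```python
from typing import Dict, Tuple


def parse_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Return (metadata, body). Handles only flat `key: value` pairs."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    metadata: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return metadata, body
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip().strip("\"'")
    # No closing delimiter found: treat the whole text as body.
    return {}, text
```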
Voice and Audio Configuration¶
Voice Selection¶
Built-in Voices¶
# List available voices
mdaudiobook voices
# Use specific voice
mdaudiobook generate doc.md --voice "Zira" # Windows
mdaudiobook generate doc.md --voice "Alex" # macOS
TTS Engine Configuration¶
eSpeak-ng (Cross-platform):
Festival (Linux):
Audio Enhancement¶
Quality Settings¶
# High-quality output (WAV is uncompressed PCM, so --bitrate matters only for lossy formats such as mp3)
mdaudiobook generate doc.md \
--format wav \
--sample-rate 44100 \
--bitrate 320k \
--channels stereo
Noise Reduction¶
Advanced Features¶
Batch Processing¶
Convert Multiple Files¶
Playlist Generation¶
Podcast Integration¶
RSS Feed Generation¶
# Generate podcast RSS feed
mdaudiobook podcast-generate chapters/ \
--title "My Podcast" \
--description "Weekly episodes" \
--feed-url "https://example.com/feed.xml"
Chapter Markers¶
# Add chapter markers for podcast apps
mdaudiobook generate book.md \
--chapters \
--chapter-file chapters.json
Integration with Other Tools¶
Documentation Workflows¶
# Convert README to audio documentation
mdaudiobook generate README.md docs.mp3 \
--voice female \
--speed 0.9 \
--metadata "Project Documentation"
Educational Content¶
# Create narrated tutorials
mdaudiobook generate tutorial.md tutorial.mp3 \
--voice teacher \
--pause-duration 0.5 \
--chapter-marker "##"
Customization¶
Configuration Files¶
Create ~/.config/mdaudiobook/config.yaml:
# Default settings
voice: "female"
language: "en-us"
speed: 1.0
format: "mp3"
bitrate: "128k"
sample_rate: 44100
# TTS engine settings
tts_engine: "espeak" # espeak, festival, pyttsx3, system
espeak:
  voice: "en-us"
  pitch: 50
  speed: 175
# Audio processing
denoise: true
normalize: true
pause_duration: 0.3
Voice Profiles¶
Define custom voice profiles:
voices:
  narrator:
    engine: "espeak"
    voice: "en-gb"
    speed: 0.9
    pitch: 60
  character:
    engine: "festival"
    voice: "us3"
    speed: 1.1
    pitch: 70
Troubleshooting¶
Common Issues¶
TTS Engine Not Found¶
# Check available TTS engines
mdaudiobook engines
# Install missing engines
# Ubuntu: sudo apt install espeak-ng festival
# macOS: brew install espeak
Audio Quality Issues¶
File Encoding Problems¶
# Ensure UTF-8 encoding
file document.md
# Convert encoding if needed
iconv -f latin1 -t utf8 document.md > document_utf8.md
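The same check and conversion can be done from Python when `file` and `iconv` are not available, for example on Windows. This is a minimal sketch that assumes the only non-UTF-8 encoding you expect is Latin-1; `ensure_utf8` is a hypothetical helper.

```python
from pathlib import Path


def ensure_utf8(source: Path, target: Path, fallback: str = "latin-1") -> str:
    """Write a UTF-8 copy of `source` to `target`; return the encoding used to read it."""
    raw = source.read_bytes()
    try:
        text = raw.decode("utf-8")
        detected = "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this never fails,
        # but it yields wrong characters if the file uses another encoding.
        text = raw.decode(fallback)
        detected = fallback
    target.write_text(text, encoding="utf-8")
    return detected
```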
Memory Issues with Large Files¶
# Process in chunks
mdaudiobook generate large-doc.md --chunk-size 1000
# Use streaming for very large files
mdaudiobook generate large-doc.md --stream
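To show what chunked processing means in practice, the sketch below groups paragraphs into pieces of bounded size, so each piece can be synthesized separately. It assumes the chunk size is measured in characters, which the option list does not state; `chunk_paragraphs` is a hypothetical helper and not necessarily how `--chunk-size` is implemented.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 1000) -> List[str]:
    """Group paragraphs into chunks of at most `max_chars` characters where possible."""
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # A single oversized paragraph is kept whole rather than cut mid-sentence.
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting at paragraph boundaries keeps sentences intact, which tends to matter for speech prosody more than hitting the size limit exactly.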
Performance Optimization¶
Fast Processing¶
# Use faster TTS engine
mdaudiobook generate doc.md --tts-engine espeak --speed 2.0
# Reduce audio quality for speed
mdaudiobook generate doc.md --format mp3 --bitrate 64k
Parallel Processing¶
Debug Mode¶
Enable verbose output:
Examples¶
Basic Audiobook Creation¶
# Simple conversion
mdaudiobook generate story.md audiobook.mp3
# With custom voice settings
mdaudiobook generate story.md \
--voice female \
--speed 0.9 \
--language en-gb \
audiobook.mp3
Podcast Episode¶
---
title: "Episode 1: Getting Started"
author: "Podcast Host"
description: "Introduction to our topic"
duration: "25m 30s"
---
# Welcome to Our Podcast
Hello and welcome to the first episode...
## Today's Topic
We'll be discussing...
## Key Points
- Point one: Important concept
- Point two: Another key idea
- Point three: Final thoughts
Save this as episode1.md, then run:
# Generate podcast episode
mdaudiobook generate episode1.md episode1.mp3 \
--voice male \
--speed 1.0 \
--metadata "Podcast Name - Episode 1" \
--author "Host Name"
Documentation Narration¶
# Create narrated API documentation
mdaudiobook generate api-docs.md api-guide.mp3 \
--voice neutral \
--speed 0.8 \
--chapters \
--format mp3 \
--bitrate 128k
Educational Content¶
# Generate narrated lecture
mdaudiobook generate lecture.md lecture.mp3 \
--voice professor \
--pause-duration 0.7 \
--chapter-marker "#" \
--metadata "CS101 - Lecture 1"
Integration Examples¶
Documentation Pipeline¶
#!/bin/bash
# Automated documentation audio generation
# Generate API docs audio
mdaudiobook generate api.md api-docs.mp3 \
--voice female \
--speed 0.9 \
--chapters
# Generate changelog audio
mdaudiobook generate CHANGELOG.md changelog.mp3 \
--voice male \
--speed 1.0
# Create playlist
echo "# Documentation Audio Guide" > docs.m3u
echo "api-docs.mp3" >> docs.m3u
echo "changelog.mp3" >> docs.m3u
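The playlist step at the end of the script above can also be done from Python, for example when the track list is computed rather than hard-coded. A minimal sketch follows; `write_playlist` is a hypothetical helper, not an mdaudiobook function.

```python
from pathlib import Path
from typing import Iterable, List


def write_playlist(playlist: Path, tracks: Iterable[str], title: str = "") -> None:
    """Write a plain M3U playlist: one track path per line; '#' lines are comments."""
    lines: List[str] = []
    if title:
        lines.append(f"# {title}")
    lines.extend(tracks)
    playlist.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

For example, `write_playlist(Path("docs.m3u"), ["api-docs.mp3", "changelog.mp3"], title="Documentation Audio Guide")` writes the same three lines as the `echo` commands.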
CI/CD Integration¶
GitHub Actions¶
- name: Generate Documentation Audio
  run: |
    pip install mdaudiobook
    mdaudiobook generate README.md docs.mp3 \
      --voice female \
      --speed 0.9 \
      --metadata "Project Documentation"
- name: Upload Audio
  uses: actions/upload-artifact@v4
  with:
    name: documentation-audio
    path: docs.mp3
Content Management System¶
# Python script for automated audio generation
import os
import subprocess


def generate_audio(content_file, output_file, voice="female"):
    """Run mdaudiobook on one Markdown file; raises CalledProcessError on failure."""
    cmd = [
        "mdaudiobook", "generate",
        content_file,
        "--voice", voice,
        "--speed", "0.9",
        "--output", output_file,
    ]
    subprocess.run(cmd, check=True)


# Generate audio for all blog posts
os.makedirs("audio", exist_ok=True)  # the output directory must exist
for md_file in sorted(os.listdir("posts")):
    if md_file.endswith(".md"):
        audio_file = os.path.splitext(md_file)[0] + ".mp3"
        generate_audio(os.path.join("posts", md_file), os.path.join("audio", audio_file))
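When there are many posts, the conversions can run side by side, since each one is an independent process. The sketch below is one way to do that with the standard library; it uses the positional output file from the command-line examples, and `build_jobs` and `run_jobs` are hypothetical helpers. A thread pool is sufficient here because the heavy work happens in the child processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List


def build_jobs(posts_dir: Path, audio_dir: Path) -> List[List[str]]:
    """Create one `mdaudiobook generate` command per Markdown file."""
    jobs = []
    for md_file in sorted(posts_dir.glob("*.md")):
        output = audio_dir / md_file.with_suffix(".mp3").name
        jobs.append(["mdaudiobook", "generate", str(md_file), str(output)])
    return jobs


def run_command(cmd: List[str]) -> None:
    """Run one conversion; raises CalledProcessError if the tool fails."""
    subprocess.run(cmd, check=True)


def run_jobs(
    jobs: List[List[str]],
    runner: Callable[[List[str]], None] = run_command,
    workers: int = 4,
) -> None:
    """Run all jobs with at most `workers` conversions in flight."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces iteration so an exception from any job is re-raised here.
        list(pool.map(runner, jobs))
```

A typical call would be `run_jobs(build_jobs(Path("posts"), Path("audio")))`, after making sure the `audio` directory exists.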
Contributing¶
Development Setup¶
# Clone repository
git clone https://github.com/ucli-tools/mdaudiobook.git
cd mdaudiobook
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
python -m pytest
# Install locally for testing
pip install -e .
Adding TTS Engines¶
- Create a new TTS engine class in mdaudiobook/engines/
- Implement the TTSEngine interface
- Add configuration options
- Update documentation
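The steps above can be pictured with a small sketch. The real TTSEngine interface is defined in the repository and is not reproduced here, so the method name `synthesize`, its signature, and the `SilentEngine` class are all assumptions made for illustration.

```python
import wave
from abc import ABC, abstractmethod
from pathlib import Path


class TTSEngine(ABC):
    """Assumed shape of the engine interface; the project's real one may differ."""

    name: str = "base"

    @abstractmethod
    def synthesize(self, text: str, output_path: Path) -> Path:
        """Render `text` to an audio file at `output_path` and return the path."""


class SilentEngine(TTSEngine):
    """Toy engine that writes one second of silence; handy as a test double."""

    name = "silent"

    def synthesize(self, text: str, output_path: Path) -> Path:
        with wave.open(str(output_path), "wb") as wav:
            wav.setnchannels(1)       # mono
            wav.setsampwidth(2)       # 16-bit samples
            wav.setframerate(22050)   # one of the sample rates listed in the options
            wav.writeframes(b"\x00\x00" * 22050)
        return output_path
```

A real engine would replace the body of `synthesize` with a call to eSpeak-ng, Festival, or another backend, and would typically read its settings from the configuration file.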
Voice Development¶
Contribute new voice profiles by submitting PRs with:
- Voice configuration
- Audio samples
- Performance benchmarks
Changelog¶
Version 2.0.0¶
- Multiple TTS engine support
- Batch processing capabilities
- Podcast RSS generation
- Advanced audio processing
Version 1.5.0¶
- Chapter detection and navigation
- Metadata tagging support
- Multiple output formats
- Voice customization options
Version 1.0.0¶
- Initial release
- Basic Markdown to audio conversion
- Single TTS engine support
- Cross-platform compatibility
Support¶
- Documentation: docs.ucli.tools/tools/mdaudiobook
- Issues: GitHub Issues
- Discussions: Community Forums
License¶
Licensed under the Apache License 2.0. See LICENSE for details.
Ready to turn text into speech? Try ucli build mdaudiobook to get started!