Professional Voice Generation
Create professional IVR prompts, announcements, and voicemail greetings without recording studios or voice actors. Update instantly at any time.
Overview
Text-to-Speech (TTS) converts written text into natural-sounding speech using neural AI voices. Use it to create dynamic IVR prompts, announcements, and greetings that you can update instantly without re-recording.
Key Benefits:
- Cost Savings: Eliminate recording studio and voice actor costs
- Instant Updates: Change prompts in seconds, not days
- Multi-Language: Support 40+ languages with native accents
- Consistency: Maintain consistent voice across all prompts
- Personalization: Dynamic content based on caller information
Use Cases:
- IVR menu prompts and instructions
- Business hours and holiday announcements
- Queue position and wait time messages
- Voicemail greetings and away messages
- Emergency notifications and alerts
- Personalized caller greetings
Features
Neural AI Voices
Natural-Sounding Speech:
- Advanced neural network technology (WaveNet, Azure Neural TTS)
- Human-like intonation and emotion
- Natural pauses and breathing patterns
- Consistent pronunciation and clarity
Voice Catalog:
200+ Professional Voices
40+ Languages
Multiple accents per language
Male and female options
Various speaking styles (friendly, professional, casual, authoritative)
Popular Voice Examples:
English (US):
👩 Jennifer - Warm, friendly customer service tone
👩 Joanna - Professional, clear business voice
👨 Matthew - Authoritative, confident narrator
👨 Joey - Casual, conversational style
English (UK):
👩 Emma - British, professional
👨 Brian - British, formal
Spanish (ES):
👩 Lucia - Neutral Spanish accent
👨 Enrique - Professional, clear
French (FR):
👩 Céline - Parisian accent
👨 Mathieu - Professional French
German (DE):
👩 Marlene - Standard German
👨 Hans - Professional, clear
Dynamic Content
Variable Substitution:
Create prompts with dynamic content that changes based on context:
Text Input:
"Hello {{caller_name}}, thank you for calling {{company_name}}.
Your account balance is {{account_balance}}."
TTS Output (for John Smith):
"Hello John Smith, thank you for calling Acme Corporation.
Your account balance is two hundred forty-five dollars and thirty cents."
Available Variables:
{{caller_name}} - Caller's name from contact
{{caller_number}} - Caller's phone number
{{company_name}} - Your company name
{{current_time}} - Current time
{{current_date}} - Current date
{{queue_position}} - Position in call queue
{{wait_time}} - Estimated wait time
{{account_balance}} - From CRM/database
{{agent_name}} - Assigned agent name
{{custom_field}} - Any custom data
Use Case Examples:
Personalized Greeting:
"Good {{time_of_day}}, {{caller_name}}. Welcome back to {{company_name}}."
Output: "Good afternoon, Sarah Johnson. Welcome back to Acme Corporation."
Queue Status:
"You are number {{queue_position}} in line. Estimated wait time is {{wait_time}} minutes."
Output: "You are number three in line. Estimated wait time is four minutes."
Account Info:
"Your order number {{order_number}} is {{order_status}} and will arrive on {{delivery_date}}."
Output: "Your order number A B C one two three four is shipped and will arrive on Friday, November twenty-fourth."
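The substitution step can be sketched in a few lines of Python. This is an illustration of the idea only, not the platform's actual renderer: the function name `render_prompt` and the `fallbacks` argument are assumptions, and block helpers such as `{{#if}}` are out of scope.

```python
import re
from typing import Mapping, Optional

# Matches simple {{name}} placeholders. Block helpers such as {{#if ...}}
# are not handled in this sketch.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(
    template: str,
    values: Mapping[str, object],
    fallbacks: Optional[Mapping[str, str]] = None,
) -> str:
    """Replace each {{name}} with its value, else its fallback, else nothing."""
    fallbacks = fallbacks or {}

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        value = values.get(name)
        if value is None or value == "":
            value = fallbacks.get(name, "")
        return str(value)

    return PLACEHOLDER.sub(substitute, template)


greeting = "Good {{time_of_day}}, {{caller_name}}. Welcome back to {{company_name}}."
print(render_prompt(
    greeting,
    {"time_of_day": "afternoon", "company_name": "Acme Corporation"},
    fallbacks={"caller_name": "valued customer"},
))
# Good afternoon, valued customer. Welcome back to Acme Corporation.
```

Supplying a fallback for every caller-specific variable keeps the prompt grammatical when a lookup returns nothing.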
SSML Control
Speech Synthesis Markup Language (SSML) provides fine-grained control over pronunciation, pacing, and emphasis:
Basic SSML Example:
<speak>
Welcome to <emphasis level="strong">Acme Corporation</emphasis>.
<break time="500ms"/>
Please listen carefully as our menu options have changed.
</speak>
Common SSML Tags:
Pauses & Breaks:
<break time="500ms"/> <!-- Half-second pause -->
<break time="1s"/> <!-- One-second pause -->
<break strength="strong"/> <!-- Sentence-level pause -->
Emphasis & Volume:
<emphasis level="strong">Important message</emphasis>
<emphasis level="moderate">Notice</emphasis>
<prosody volume="loud">Attention!</prosody>
<prosody volume="soft">Quiet message</prosody>
Speed & Pitch:
<prosody rate="slow">Speak slowly for clarity</prosody>
<prosody rate="fast">Quick information</prosody>
<prosody pitch="high">Higher voice pitch</prosody>
<prosody pitch="low">Lower voice pitch</prosody>
Number & Date Formatting:
<say-as interpret-as="digits">123456</say-as>
<!-- Output: "one two three four five six" -->
<say-as interpret-as="cardinal">123</say-as>
<!-- Output: "one hundred twenty-three" -->
<say-as interpret-as="ordinal">3</say-as>
<!-- Output: "third" -->
<say-as interpret-as="telephone">+1-555-123-4567</say-as>
<!-- Output: "plus one, five five five, one two three, four five six seven" -->
<say-as interpret-as="date" format="mdy">11/22/2025</say-as>
<!-- Output: "November twenty-second, two thousand twenty-five" -->
Spelling Out Words:
Your confirmation code is <say-as interpret-as="spell-out">ABC123</say-as>
<!-- Output: "Your confirmation code is A B C one two three" -->
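If prompts are assembled programmatically, the tags above can be generated rather than typed by hand. The following is a minimal sketch using only the Python standard library; `menu_ssml` is a hypothetical helper, not part of the product. It escapes user-supplied text so that characters such as `&` do not break the markup, then checks that the result is well-formed XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def menu_ssml(greeting, options, pause_ms=300):
    """Build an SSML menu from (label, digit) pairs, escaping user text."""
    lines = ["<speak>", escape(greeting), '<break time="500ms"/>']
    for label, digit in options:
        lines.append(
            f"For {escape(label)}, press "
            f'<say-as interpret-as="digits">{escape(digit)}</say-as>.'
        )
        lines.append(f'<break time="{int(pause_ms)}ms"/>')
    lines.append("</speak>")
    return "\n".join(lines)


ssml = menu_ssml("Welcome to Acme Corporation.", [("sales", "1"), ("support", "2")])
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
print(ssml)
```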
Setup & Usage
Step 1: Access TTS in Admin Portal
Navigate to TTS:
- Log into Cloud-PBX Admin Portal
- Go to Settings → Text-to-Speech
- Or access directly from IVR Builder → Add Prompt → Generate from Text
Step 2: Create Your First TTS Prompt
Simple Prompt Creation:
In IVR Builder:
- Click Add Prompt or edit existing prompt
- Select Text-to-Speech (instead of Upload Audio)
- Enter your text:
Thank you for calling Acme Corporation. For sales, press 1. For support, press 2. For billing, press 3.
- Choose voice: Joanna (English, US, Female, Professional)
- Click Preview to hear sample
- Click Save to generate and use in IVR
Generated Prompt:
- Audio file automatically created
- Stored in your prompt library
- Ready to use immediately in IVR flows
Step 3: Advanced Options
Voice Settings:
Voice: Joanna (English US, Female)
Language: English (US) 🇺🇸
Speaking Rate: Normal (100%)
- Slow: 75%
- Normal: 100%
- Fast: 125%
Pitch: Normal (0)
- Low: -2
- Normal: 0
- High: +2
Volume: Normal (0 dB)
- Soft: -6 dB
- Normal: 0 dB
- Loud: +6 dB
Preview Options:
- 🔊 Listen to preview
- 📥 Download audio file
- 🔄 Regenerate with different settings
- 💾 Save to prompt library
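For reference, the rate, pitch, and volume settings above correspond roughly to the SSML prosody attributes. The sketch below shows one possible mapping. It assumes the pitch steps are semitones, which this page does not state, and `apply_voice_settings` is a hypothetical helper, not a portal API.

```python
from xml.sax.saxutils import escape, quoteattr


def apply_voice_settings(text, rate_percent=100, pitch_steps=0, volume_db=0):
    """Wrap text in a <prosody> element for any setting that is not the default."""
    attrs = {}
    if rate_percent != 100:
        attrs["rate"] = f"{rate_percent}%"
    if pitch_steps != 0:
        attrs["pitch"] = f"{pitch_steps:+d}st"  # assumption: steps are semitones
    if volume_db != 0:
        attrs["volume"] = f"{volume_db:+d}dB"
    if not attrs:
        return f"<speak>{escape(text)}</speak>"
    attr_text = " ".join(f"{name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<speak><prosody {attr_text}>{escape(text)}</prosody></speak>"


# Slow, low, soft
print(apply_voice_settings("Please hold.", rate_percent=75, pitch_steps=-2, volume_db=-6))
# <speak><prosody rate="75%" pitch="-2st" volume="-6dB">Please hold.</prosody></speak>
```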
Step 4: Use SSML for Advanced Control
Enable SSML Mode:
- Toggle Advanced Mode → SSML Enabled
- Enter SSML markup instead of plain text
- Preview to verify pronunciation and timing
- Save when satisfied
Example SSML Prompt:
<speak>
<emphasis level="strong">Welcome</emphasis> to Acme Corporation.
<break time="500ms"/>
Please listen carefully, as our menu options have changed.
<break time="300ms"/>
For <prosody rate="slow">sales</prosody>, press <say-as interpret-as="digits">1</say-as>.
<break time="300ms"/>
For <prosody rate="slow">technical support</prosody>, press <say-as interpret-as="digits">2</say-as>.
<break time="300ms"/>
For <prosody rate="slow">billing</prosody>, press <say-as interpret-as="digits">3</say-as>.
<break time="500ms"/>
To repeat these options, press <say-as interpret-as="digits">9</say-as>.
</speak>
Step 5: Dynamic Prompts with Variables
Enable Dynamic Content:
- In prompt editor, toggle Dynamic Content → Enabled
- Use {{variable_name}} syntax for placeholders
- Configure variable sources (CRM, database, caller info)
- Preview with sample data
Example Dynamic Prompt:
Good {{time_of_day}}, {{caller_name}}.
Thank you for calling {{company_name}}.
{{#if has_open_ticket}}
I see you have an open support ticket, number {{ticket_number}},
regarding {{ticket_subject}}.
Press 1 to speak with {{assigned_agent}}, or press 2 for the main menu.
{{else}}
For sales, press 1. For support, press 2.
{{/if}}
Variable Configuration:
Variable: {{caller_name}}
Source: Caller ID Lookup → CRM Contact
Fallback: "valued customer"
Variable: {{time_of_day}}
Source: System Time
6am-12pm: "morning"
12pm-5pm: "afternoon"
5pm-9pm: "evening"
Variable: {{has_open_ticket}}
Source: CRM API Query
Query: "SELECT COUNT(*) FROM tickets WHERE phone = {{caller_number}} AND status = 'Open'"
Use Cases & Examples
Professional IVR Menu
Scenario: Replace outdated recorded IVR with modern TTS
Traditional Approach:
- Hire voice actor: $300-500
- Studio recording: $200-400
- Editing and mastering: $100-200
- Total: $600-1,100
- Update time: 2-5 business days
TTS Approach:
- Text entry: 5 minutes
- Voice selection: 2 minutes
- Preview and adjust: 3 minutes
- Total cost: ~$0.15
- Update time: 10 minutes
TTS Script:
<speak>
<emphasis level="strong">Welcome</emphasis> to Acme Corporation.
<break time="500ms"/>
For <prosody rate="slow">sales and new orders</prosody>,
press <say-as interpret-as="digits">1</say-as>.
<break time="400ms"/>
For <prosody rate="slow">customer support and technical assistance</prosody>,
press <say-as interpret-as="digits">2</say-as>.
<break time="400ms"/>
For <prosody rate="slow">billing and account questions</prosody>,
press <say-as interpret-as="digits">3</say-as>.
<break time="400ms"/>
To hear these options again, press <say-as interpret-as="digits">9</say-as>.
<break time="500ms"/>
Or, stay on the line for the next available representative.
</speak>
Result: A professional, clear IVR that can be updated at any time at negligible cost.
Queue & Wait Time Messages
Dynamic Queue Status:
Thank you for your patience.
You are currently number {{queue_position}} in line.
Estimated wait time is {{wait_time}} minutes.
{{#if queue_position > 5}}
To receive a callback when an agent is available, press 1.
{{else}}
An agent will be with you shortly.
{{/if}}
To return to the main menu, press the star key.
Real-Time Variables:
- {{queue_position}}: Updated in real time as the queue moves
- {{wait_time}}: Calculated based on average handle time
- Queue position > 5: Offer callback option
- Regenerated automatically with current values
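How the wait time is derived from average handle time is not spelled out here. A common rough estimate, used purely as an assumption in this sketch, is queue position multiplied by average handle time and divided by the number of active agents. The function names are illustrative.

```python
import math


def estimate_wait_minutes(queue_position, avg_handle_minutes, active_agents):
    """Rough estimate: callers ahead times handle time, shared across agents."""
    if queue_position < 1 or active_agents < 1:
        raise ValueError("queue_position and active_agents must be at least 1")
    return math.ceil(queue_position * avg_handle_minutes / active_agents)


def queue_message(queue_position, wait_minutes, callback_threshold=5):
    """Assemble the queue prompt, offering a callback past the threshold."""
    lines = [
        "Thank you for your patience.",
        f"You are currently number {queue_position} in line.",
        f"Estimated wait time is {wait_minutes} minutes.",
    ]
    if queue_position > callback_threshold:
        lines.append("To receive a callback when an agent is available, press 1.")
    else:
        lines.append("An agent will be with you shortly.")
    return " ".join(lines)


wait = estimate_wait_minutes(queue_position=3, avg_handle_minutes=4.0, active_agents=3)
print(queue_message(3, wait))
```

Rounding up avoids announcing a wait of zero minutes to a caller who is still in the queue.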
Business Hours Announcement
Smart Hours Message:
{{#if is_holiday}}
Thank you for calling Acme Corporation.
Our office is closed today for {{holiday_name}}.
We will reopen on {{next_business_day}} at 9 AM Eastern Time.
For urgent matters, please press 9 to reach our emergency support line.
Otherwise, please leave a message, and we will return your call when we reopen.
{{else if is_business_hours}}
Thank you for calling Acme Corporation.
Our business hours are Monday through Friday, 9 AM to 6 PM Eastern Time.
All of our representatives are currently assisting other customers.
Please hold, and the next available agent will be with you shortly.
{{else}}
Thank you for calling Acme Corporation.
You have reached us outside of our normal business hours.
Our office is open Monday through Friday, 9 AM to 6 PM Eastern Time.
Please leave a message after the tone, and we will return your call on the next business day.
For urgent matters, press 9 to reach our emergency support line.
{{/if}}
Automatic Schedule Updates:
- Business hours from calendar
- Holiday schedule from admin settings
- Next business day calculated automatically
- No manual prompt updates needed
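The order of the checks matters: a holiday can fall on a weekday, and a weekday evening is neither a weekend nor a holiday. One way to make sure every call lands in exactly one announcement is to test for holidays first, then open hours, and treat everything else as after hours. The sketch below assumes timestamps are already in the office's local time; the function and the holiday table are illustrative, not platform settings.

```python
from datetime import date, datetime, time

OPEN, CLOSE = time(9, 0), time(18, 0)  # 9 AM to 6 PM, Monday through Friday


def pick_announcement(now, holidays):
    """Return which announcement to play: holiday, business_hours, or after_hours."""
    if now.date() in holidays:
        return "holiday"
    if now.weekday() < 5 and OPEN <= now.time() < CLOSE:
        return "business_hours"
    return "after_hours"


holidays = {date(2025, 11, 27): "Thanksgiving"}
print(pick_announcement(datetime(2025, 11, 27, 10, 0), holidays))  # holiday
print(pick_announcement(datetime(2025, 11, 24, 10, 0), holidays))  # business_hours
print(pick_announcement(datetime(2025, 11, 24, 20, 0), holidays))  # after_hours
```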
Personalized Caller Greeting
VIP Customer Recognition:
Welcome back, {{caller_name}}.
Thank you for being a valued {{account_tier}} member since {{customer_since}}.
{{#if has_recent_order}}
Your recent order, number {{order_number}}, is {{order_status}}.
{{#if order_status == 'shipped'}}
Tracking shows delivery expected {{delivery_date}}.
{{/if}}
{{/if}}
{{#if has_assigned_account_manager}}
To speak with your dedicated account manager, {{manager_name}}, press 1.
Otherwise, press 2 for our main menu.
{{else}}
For sales, press 1. For support, press 2. For billing, press 3.
{{/if}}
Data Sources:
- CRM: Account tier, customer since date, account manager
- Order system: Recent orders, status, delivery tracking
- Smart routing based on customer relationship
Multi-Language Support
Language Selection with Regional Voices:
English (US):
Voice: Joanna (Professional US Female)
"Welcome to Acme Corporation. For English, press 1.
Para español, oprima el dos."
Spanish (ES):
Voice: Lucia (Professional Spanish Female)
"Bienvenido a Acme Corporation. Para continuar en español,
oprima el uno. For English, press two."
French (FR):
Voice: Céline (Professional French Female)
"Bienvenue chez Acme Corporation. Pour continuer en français,
appuyez sur le un. For English, press two."
Implementation:
- Language detection from caller ID or IVR selection
- Switch voices based on chosen language
- Consistent prompts across all languages
- Update all language versions simultaneously
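Switching voices by language amounts to a lookup with a default. A minimal sketch, using the voices named on this page; the language codes and the `pick_voice` helper are assumptions for illustration.

```python
# Native-accent voice per language, taken from the examples on this page.
VOICES = {
    "en-US": "Joanna",
    "es-ES": "Lucia",
    "fr-FR": "Céline",
    "de-DE": "Marlene",
}


def pick_voice(language_code, default_language="en-US"):
    """Return the voice for a language, falling back to the default language."""
    return VOICES.get(language_code, VOICES[default_language])


print(pick_voice("es-ES"))  # Lucia
print(pick_voice("ja-JP"))  # Joanna (no voice configured, so the default is used)
```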
Best Practices
Writing Effective TTS Scripts
Do's:
- ✅ Write conversationally (how people speak, not write)
- ✅ Use short sentences and phrases
- ✅ Spell out numbers for clarity ("one" not "1")
- ✅ Use punctuation for natural pauses
- ✅ Test pronunciation with preview before deploying
- ✅ Consider caller's perspective and information needs
Don'ts:
- ❌ Don't use overly formal or complex language
- ❌ Don't write run-on sentences (listeners can't rewind)
- ❌ Don't assume pronunciation (spell phonetically if needed)
- ❌ Don't overuse emphasis or special effects
- ❌ Don't forget pauses between menu options
Pronunciation Control
Common Challenges:
Acronyms & Abbreviations:
<!-- Wrong: TTS might say "P.B.X." as "Pibix" -->
PBX system
<!-- Right: Force spelling -->
<say-as interpret-as="spell-out">PBX</say-as> system
<!-- Output: "P B X system" -->
Company & Product Names:
<!-- If mispronounced, use phonetic spelling -->
<!-- Wrong pronunciation -->
Acme
<!-- Force phonetic spelling -->
<phoneme alphabet="ipa" ph="ˈæk.mi">Acme</phoneme>
<!-- Or spell it out -->
<say-as interpret-as="spell-out">ACME</say-as>
Numbers & Codes:
<!-- Account numbers: spell out -->
Your account number is <say-as interpret-as="digits">123456</say-as>
<!-- Output: "one two three four five six" -->
<!-- Prices: use currency -->
Your balance is <say-as interpret-as="currency" language="en-US">$245.30</say-as>
<!-- Output: "two hundred forty-five dollars and thirty cents" -->
<!-- Phone numbers -->
<say-as interpret-as="telephone">555-1234</say-as>
<!-- Output: "five five five, one two three four" -->
Voice Selection Tips
Match Voice to Use Case:
Customer Service IVR:
Best: Joanna (US), Emma (UK) - Professional, friendly
Avoid: Joey - Too casual for business
Emergency/Security Alerts:
Best: Matthew (US), Brian (UK) - Authoritative, clear
Avoid: Soft voices - May lack urgency
Marketing/Sales Prompts:
Best: Jennifer (US), Amy (UK) - Warm, engaging
Avoid: Overly formal voices
Technical Instructions:
Best: Joanna (US), Emma (UK) - Clear, moderate pace
Avoid: Fast-paced voices
Multi-Lingual Support:
Best: Native accent voices for each language
Avoid: English voice attempting other languages
Performance Optimization
Caching & Pre-Generation:
Static Prompts (rarely change):
Generate Once, Cache Forever:
- Main IVR menu
- Company greeting
- Standard instructions
Benefits:
✅ Instant playback (no generation delay)
✅ No usage fees after initial generation
✅ Consistent experience
Dynamic Prompts (change per call):
Generate On-Demand:
- Personalized greetings with {{caller_name}}
- Queue position messages with {{queue_position}}
- Account-specific information
Considerations:
⏱️ 100-300ms generation latency
💰 Per-generation usage fees
🔄 Short cache TTL (5-10 seconds)
Hybrid Approach:
Pre-generate templates with common variables:
"Welcome back, {{caller_name}}" → Cache multiple versions
"You are number [X] in line" → Pre-generate 1-20
Result: Fast playback + personalization
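The hybrid approach can be pictured as a cache keyed by voice and text, warmed up with the phrases that recur most often. The sketch below is a simplified in-memory version; `PromptCache` and `fake_synthesize` are placeholders, and a real deployment would call the TTS engine and store audio persistently.

```python
import hashlib


class PromptCache:
    """Generate each (voice, text) pair once and reuse the audio afterwards."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # callable(voice, text) -> audio bytes
        self._audio = {}

    @staticmethod
    def key(voice, text):
        return hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()

    def get(self, voice, text):
        cache_key = self.key(voice, text)
        if cache_key not in self._audio:
            self._audio[cache_key] = self._synthesize(voice, text)
        return self._audio[cache_key]

    def pregenerate_queue_positions(self, voice, upto=20):
        for position in range(1, upto + 1):
            self.get(voice, f"You are number {position} in line.")


generated = []


def fake_synthesize(voice, text):
    """Stand-in for the TTS call; records what was generated."""
    generated.append(text)
    return b"audio-bytes"


cache = PromptCache(fake_synthesize)
cache.pregenerate_queue_positions("Joanna")
cache.get("Joanna", "You are number 3 in line.")  # served from cache
print(len(generated))  # 20
```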
Pricing
Usage-Based Pricing
Standard Voices:
Cost: $4 per 1 million characters
Per-character: ~$0.000004
Examples:
"Thank you for calling." (22 chars) = $0.000088
Full IVR menu (500 chars) = $0.002
1,000 uncached menu generations/month = $2.00/month
Neural Voices (Recommended):
Cost: $16 per 1 million characters
Per-character: ~$0.000016
Examples:
"Thank you for calling." (22 chars) = $0.000352
Full IVR menu (500 chars) = $0.008
1,000 uncached menu generations/month = $8.00/month
SSML Characters:
SSML markup does NOT count toward character usage.
Example:
<speak>
<emphasis>Hello</emphasis> <break time="500ms"/> world
</speak>
Billable characters: 11 ("Hello world")
Non-billable: SSML tags (<speak>, <emphasis>, <break>)
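To estimate a prompt's cost before generating it, strip the markup, count what remains, and multiply by the per-character rate. The sketch below uses the prices listed above. How the billing system treats whitespace is not stated here, so collapsing runs of whitespace to a single space is an assumption, and both function names are illustrative.

```python
import xml.etree.ElementTree as ET

PRICE_PER_MILLION = {"standard": 4.00, "neural": 16.00}  # USD, from the tables above


def billable_characters(ssml_or_text):
    """Count characters outside SSML tags; plain text is counted as-is."""
    try:
        text = "".join(ET.fromstring(ssml_or_text).itertext())
    except ET.ParseError:
        text = ssml_or_text  # not XML, so treat it as plain text
    return len(" ".join(text.split()))  # assumption: whitespace runs collapse


def generation_cost(characters, voice_type="neural"):
    """Cost in USD of generating the given number of characters once."""
    return characters * PRICE_PER_MILLION[voice_type] / 1_000_000


menu = """<speak>
<emphasis>Hello</emphasis> <break time="500ms"/> world
</speak>"""
print(billable_characters(menu))       # 11
print(generation_cost(500, "neural"))  # 0.008
```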
Cost Optimization
Strategies to Reduce Costs:
1. Cache Static Prompts:
Generate once, use unlimited times:
Main menu: Generate once = $0.008
Played 10,000 times = Still $0.008 total
Savings: 99.99% vs. generating every time ($0.008 instead of $80.00)
2. Use Standard Voices for Non-Critical Prompts:
Neural voice: $0.008 per menu
Standard voice: $0.002 per menu
Suitable for internal or on-hold announcements where voice quality is less critical
Savings: 75%
3. Optimize Text Length:
❌ Wordy: "We would like to take this opportunity to thank you for taking the time to call our company today." (98 chars)
✅ Concise: "Thank you for calling." (22 chars)
Savings: 78% fewer characters = 78% lower cost
4. Pre-Generate Common Variables:
Instead of: "You are number {{queue_position}} in line"
Pre-generate: "You are number 1 in line" through "You are number 20 in line"
Generate 20 versions once vs. thousands of dynamic generations
Savings: 95%+ for high-traffic queues
Troubleshooting
Pronunciation Issues
Problem: TTS mispronounces company name, product, or acronym
Solutions:
1. Phonetic Spelling:
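Written: "TheVoĉo" (the engine may guess the pronunciation)
Intended: "VOH-koh" (as encoded in the IPA below; substitute your own transcription)
Fix: Specify the pronunciation with a phoneme tag
<phoneme alphabet="ipa" ph="ˈvoʊ.koʊ">TheVoĉo</phoneme>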
2. Spell Out:
<say-as interpret-as="spell-out">Voco</say-as>
<!-- Output: "V O C O" -->
3. Alternative Spelling:
Written: "Acme" (intended pronunciation "Ack-mee")
Try: "Ackmee" or "Ack-me"
4. Build Custom Lexicon:
<!-- Admin → TTS → Custom Pronunciations -->
Word: SQL
Pronunciation: "sequel" or "S Q L"
Word: Kubernetes
Pronunciation: "koo-ber-net-eez"
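Where a custom lexicon is not available, a similar effect can be approximated by rewriting the text before synthesis with the standard SSML `<sub alias="...">` element. This is a sketch under that assumption, not a description of how the Custom Pronunciations feature works internally. `apply_lexicon` is a hypothetical helper, and how well an engine reads a respelling such as "koo-ber-net-eez" varies.

```python
import re
from xml.sax.saxutils import escape, quoteattr

LEXICON = {"SQL": "sequel", "Kubernetes": "koo-ber-net-eez"}


def apply_lexicon(text, lexicon=LEXICON):
    """Wrap each lexicon word in <sub alias="..."> and escape the rest."""
    if not lexicon:
        return f"<speak>{escape(text)}</speak>"
    # Longest words first so that longer entries win over their prefixes.
    words = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    pieces, last = [], 0
    for match in pattern.finditer(text):
        word = match.group(1)
        pieces.append(escape(text[last:match.start()]))
        pieces.append(f"<sub alias={quoteattr(lexicon[word])}>{escape(word)}</sub>")
        last = match.end()
    pieces.append(escape(text[last:]))
    return "<speak>" + "".join(pieces) + "</speak>"


print(apply_lexicon("Our SQL service runs on Kubernetes."))
```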
Voice Quality Issues
Problem: TTS voice sounds robotic or unnatural
Solutions:
1. Upgrade to Neural Voices:
Standard Voice → Neural Voice
Cost: 4x more, but significantly more natural
2. Add SSML Prosody:
<!-- Add natural variation -->
<prosody rate="95%" pitch="-1st">
Welcome to our company.
</prosody>
3. Use Punctuation:
❌ "Welcome to Acme Corporation for sales press 1 for support press 2"
✅ "Welcome to Acme Corporation. For sales, press 1. For support, press 2."
4. Add Breaks:
<speak>
Welcome to Acme Corporation.
<break time="500ms"/>
For sales, press 1.
<break time="300ms"/>
For support, press 2.
</speak>
Generation Failures
Problem: TTS generation fails or takes too long
Diagnostic:
Check TTS Dashboard:
- API Status: ✅ Operational
- Queue Depth: 0 requests
- Average Generation Time: 250ms
- Error Rate: 0.0%
Common Causes:
1. Invalid SSML:
<!-- Wrong: Unclosed tag -->
<speak>
<emphasis>Hello
</speak>
<!-- Right: Properly closed -->
<speak>
<emphasis>Hello</emphasis>
</speak>
2. Text Too Long:
Max characters: 3,000 per generation
Solution: Split into multiple prompts
3. Unsupported Characters:
Remove special characters: ©, ™, ®, emoji
Use: (c), (TM), (R), and words in place of emoji
4. Rate Limiting:
Limit: 100 requests per minute
Solution: Implement request queuing or upgrade plan
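Several of these causes can be caught before a request is sent. The sketch below shows three simple pre-flight checks in Python: a well-formedness test for SSML, replacement of the symbols listed above, and splitting plain text at sentence boundaries to stay under the 3,000-character limit. The function names are illustrative. The splitter is meant for plain text only, since cutting SSML at arbitrary points would break its tags.

```python
import re
import xml.etree.ElementTree as ET

MAX_CHARS = 3000
REPLACEMENTS = {"©": "(c)", "™": "(TM)", "®": "(R)"}


def is_well_formed(ssml):
    """True if the SSML parses as XML (catches unclosed tags)."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


def replace_symbols(text):
    """Swap unsupported symbols for plain-text equivalents."""
    for symbol, plain in REPLACEMENTS.items():
        text = text.replace(symbol, plain)
    return text


def split_text(text, limit=MAX_CHARS):
    """Split plain text into chunks of at most `limit` characters."""
    chunks, current = [], ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks


print(is_well_formed("<speak><emphasis>Hello</speak>"))  # False
print(replace_symbols("Acme™ Corporation"))              # Acme(TM) Corporation
print(split_text("One. Two. Three.", limit=9))           # ['One. Two.', 'Three.']
```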
Advanced Features
Voice Cloning (Enterprise)
Custom Voice Creation:
Create a custom voice based on your company spokesperson or brand:
Process:
- Record 30-60 minutes of high-quality audio
- Submit audio for voice training (2-4 weeks)
- Custom voice becomes available in TTS engine
- Use custom voice across all IVR prompts
Benefits:
- Consistent brand voice across all channels
- Professional spokesperson without ongoing recording costs
- Update prompts anytime in spokesperson's voice
- Multi-language support with same voice characteristics
Cost: Contact sales for pricing (Enterprise plan)
A/B Testing
Test Voice Effectiveness:
Scenario: Which voice/script converts better?
Version A:
Voice: Matthew (Authoritative Male)
"Press 1 for sales."
Version B:
Voice: Jennifer (Friendly Female)
"If you'd like to speak with our sales team, press 1."
Metrics:
- Conversion rate (press 1 vs. hang up)
- Average time to decision
- Caller satisfaction
Implementation:
- Route 50% of calls to each version
- Track metrics for 1-2 weeks
- Deploy winning version to 100%
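A 50/50 split is easiest to analyze when each caller always hears the same version. One way to get that, shown here as a sketch rather than a built-in feature, is to hash the caller's number together with an experiment name and use the result to pick a variant. The names `assign_variant` and `sales-prompt` are illustrative.

```python
import hashlib


def assign_variant(caller_number, experiment="sales-prompt"):
    """Deterministically assign a caller to version A or B (about 50/50)."""
    digest = hashlib.sha256(f"{experiment}:{caller_number}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


# The same caller gets the same version on every call.
print(assign_variant("+15551234567") == assign_variant("+15551234567"))  # True
```

Including the experiment name in the hash means a caller's assignment in one test does not determine their assignment in the next.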
Getting Help
TTS Support
Need help with Text-to-Speech?
Common Questions:
- Pronunciation issues: Submit word with desired pronunciation
- Voice selection: Request voice samples for your use case
- SSML help: Provide desired effect, we'll suggest markup
- Cost optimization: Share your usage, we'll recommend strategies
Contact:
- Email: [email protected]
- Include: Text, desired output, current result (if applicable)
- Response time: < 4 hours (business hours)
Resources:
- 📹 Video Tutorial: Creating Professional IVR with TTS (8 minutes)
- 📄 SSML Quick Reference Guide (PDF)
- 🎧 Voice Samples: Listen to all available voices
- 📚 TTS Best Practices Handbook
Next Steps
Get Started with TTS:
- ✅ Log into Admin Portal → IVR Builder
- ✅ Create or edit existing IVR
- ✅ Add TTS prompt with simple text
- ✅ Preview and adjust voice settings
- ✅ Deploy and test with live call
- ✅ Iterate based on caller feedback
Explore Advanced Features:
- Dynamic prompts with variables
- SSML for advanced control
- Multi-language support
- A/B testing for optimization