
AI Transcription

Real-time speech-to-text transcription powered by AI. Convert meetings, voice chats, and audio content into searchable text automatically.

5 min read · Updated December 2025

Overview

Jyv Desktop's AI transcription feature automatically converts speech to text in real-time. Perfect for meetings, interviews, podcasts, lectures, and any scenario where you need accurate transcripts without manual note-taking.

Local Processing: All transcription runs on your computer using an on-device AI model. Your audio never leaves your device, and transcripts stay local unless you explicitly share them.

Key Features

Real-Time Transcription: Live captions with <1 second delay
95%+ Accuracy: In the recommended Balanced mode (90% in Fast mode, 98% in Accurate mode)
90+ Languages: Support for major world languages
Speaker Identification: Automatically detect and label different speakers
Punctuation & Formatting: Intelligent capitalization and punctuation
Searchable Archive: Find past conversations instantly
Export Options: TXT, DOCX, PDF, SRT, VTT formats

Features

Real-Time Transcription

Watch your words appear as you speak with minimal lag:

Live Display: Floating overlay shows real-time captions
Auto-Correction: AI refines transcripts as context becomes clear
Confidence Scoring: See which words may need manual review
Instant Search: Find specific topics during live conversations
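
Confidence scores could be consumed along these lines. This is a minimal sketch, assuming each transcribed word is an object with hypothetical `text` and `confidence` (0 to 1) fields; the field names and the 0.8 threshold are illustrative and do not come from Jyv Desktop.

```javascript
// Hypothetical word shape: { text: string, confidence: number between 0 and 1 }.
// Returns the words whose confidence falls below the review threshold.
function wordsNeedingReview(words, threshold = 0.8) {
  return words.filter((word) => word.confidence < threshold);
}

// Example usage with made-up data.
const sampleWords = [
  { text: "quarterly", confidence: 0.97 },
  { text: "Kubernetes", confidence: 0.62 },
  { text: "rollout", confidence: 0.91 },
];
console.log(wordsNeedingReview(sampleWords).map((w) => w.text)); // logs: [ 'Kubernetes' ]
```

Raising the threshold flags more words for review, which may suit transcripts that will be published.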

Speaker Diarization

Automatically identify and label different speakers in conversations:

Speaker Detection Example

{
  "transcription": {
    "speakers": [
      {
        "id": "speaker_1",
        "label": "John (You)",
        "voiceprint": "...",
        "color": "#4dabf7"
      },
      {
        "id": "speaker_2",
        "label": "Sarah",
        "voiceprint": "...",
        "color": "#51cf66"
      }
    ],
    "autoDetect": true,
    "maxSpeakers": 10
  }
}

Jyv Desktop learns to recognize frequent conversation partners and automatically labels them in future transcripts.
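
To illustrate how speaker labels might be applied when rendering a transcript, here is a small sketch. The speaker list follows the Speaker Detection Example above; the segment shape (`speakerId`, `text`) and the function name are assumptions made for illustration.

```javascript
// Speaker list in the shape of the Speaker Detection Example.
const knownSpeakers = [
  { id: "speaker_1", label: "John (You)" },
  { id: "speaker_2", label: "Sarah" },
];

// Hypothetical segment shape: { speakerId: string, text: string }.
// Falls back to the raw speaker id when no label is known yet.
function renderSegments(segments, speakers) {
  const labels = new Map(speakers.map((s) => [s.id, s.label]));
  return segments
    .map((seg) => `${labels.get(seg.speakerId) ?? seg.speakerId}: ${seg.text}`)
    .join("\n");
}

console.log(
  renderSegments(
    [
      { speakerId: "speaker_2", text: "Can you share the draft?" },
      { speakerId: "speaker_3", text: "I can send it today." },
    ],
    knownSpeakers
  )
);
// logs:
// Sarah: Can you share the draft?
// speaker_3: I can send it today.
```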

Smart Punctuation

AI adds proper punctuation, capitalization, and formatting:

Sentence Detection: Automatic periods, commas, question marks
Capitalization: Proper names, sentence starts, acronyms
Paragraph Breaks: Detect topic changes and speaker switches
Numbers & Dates: Convert spoken numbers to digits
Filler Word Removal: Optionally remove "um", "uh", "like"
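
Filler word removal can be approximated with a pattern match. The sketch below is an illustration only, not the model-based approach the feature presumably uses; the function name is hypothetical.

```javascript
// Removes standalone "um" and "uh" (plus a trailing comma) from a transcript.
// "like" is deliberately left alone: it is often a real word, so removing it
// safely needs context that a simple pattern cannot provide.
function removeFillers(text) {
  return text
    .replace(/\b(?:um+|uh+)\b,?\s*/gi, "")
    .replace(/\s{2,}/g, " ")
    .trim();
}

console.log(removeFillers("Um, I think, uh, we should ship on Friday."));
// logs: I think, we should ship on Friday.
```

The word boundaries keep words such as "umbrella" intact.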

Live Captions

Display real-time subtitles on screen:

Caption Display Options:

Floating Overlay: Transparent window anywhere on screen
Bottom Bar: Classic subtitle-style display
Notification Style: Toast notifications for key points
OBS Integration: Send captions directly to streaming software

Transcript History

All transcripts are automatically saved and searchable:

Unlimited Storage: Keep all transcripts (configurable retention)
Full-Text Search: Find specific words or phrases instantly
Date/Time Filtering: Browse by meeting date and time
Tagging System: Organize transcripts with custom tags
Favorites: Mark important conversations
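
The search, date filter, and tag options above combine naturally into a single query. The following sketch assumes a hypothetical transcript shape (`title`, `text`, `tags`, `date` as `YYYY-MM-DD`); it shows the filtering logic only and says nothing about how Jyv Desktop stores or indexes transcripts.

```javascript
// Hypothetical transcript shape: { title, text, tags: string[], date: "YYYY-MM-DD" }.
function searchTranscripts(transcripts, { query = "", tag, from, to } = {}) {
  const needle = query.toLowerCase();
  return transcripts.filter(
    (t) =>
      t.text.toLowerCase().includes(needle) &&
      (tag === undefined || t.tags.includes(tag)) &&
      (from === undefined || t.date >= from) && // ISO dates sort correctly as strings
      (to === undefined || t.date <= to)
  );
}

const archive = [
  { title: "Sprint planning", text: "We agreed to ship the beta in March.", tags: ["work"], date: "2025-11-03" },
  { title: "Physics lecture", text: "Entropy never decreases in a closed system.", tags: ["study"], date: "2025-11-10" },
  { title: "Retro", text: "The beta slipped by a week.", tags: ["work"], date: "2025-12-01" },
];

console.log(searchTranscripts(archive, { query: "beta", from: "2025-12-01" }).map((t) => t.title));
// logs: [ 'Retro' ]
```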

Setup & Configuration

Enable Transcription

Open Jyv Desktop → Settings → Features → Transcription.
Toggle "Enable AI Transcription" to ON.
First-time setup downloads the AI model (~500 MB). Subsequent sessions start immediately, with no further download.

Select Language

Choose your primary language from the dropdown:

- Auto-Detect: Automatically identify the language being spoken
- Single Language: Better accuracy if you always speak the same language
- Multi-Language: Detect and switch between languages mid-conversation

Configure Speaker Detection

Enable "Identify Speakers" to distinguish between different voices.

Train the system by speaking for 10-15 seconds to create your voiceprint.

Speaker Configuration

{
  "speakerDetection": {
    "enabled": true,
    "minSpeakers": 1,
    "maxSpeakers": 10,
    "autoLabel": true,
    "trainOnFirst": true        // Create voiceprint from first words
  }
}

Set Quality vs. Performance

Choose a transcription quality level:

- Fast Mode: 90% accuracy, low CPU, real-time
- Balanced: 95% accuracy, moderate CPU (recommended)
- Accurate: 98% accuracy, high CPU, slight delay

Enable Live Captions (Optional)

Toggle "Show Live Captions" to display text on screen.
Customize appearance, position, and size in caption settings.
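
The trade-off between the three quality levels can be expressed as a simple lookup. The accuracy and CPU figures below are the ones listed for Fast, Balanced, and Accurate; the table name and helper function are hypothetical and only illustrate the selection logic.

```javascript
// Accuracy and CPU figures as listed for the three quality levels,
// ordered from lightest to heaviest CPU load.
const QUALITY_MODES = [
  { name: "Fast", accuracy: 90, cpu: "low" },
  { name: "Balanced", accuracy: 95, cpu: "moderate" },
  { name: "Accurate", accuracy: 98, cpu: "high" },
];

// Returns the lightest mode that reaches the target accuracy, or null if none does.
function lightestModeFor(targetAccuracy) {
  return QUALITY_MODES.find((mode) => mode.accuracy >= targetAccuracy) ?? null;
}

console.log(lightestModeFor(93).name); // logs: Balanced
```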

Language Support

Jyv Desktop supports 90+ languages with varying levels of accuracy:

Tier 1: 98%+ Accuracy

English (US, UK, AU)
Spanish (ES, LATAM)
French
German
Italian
Portuguese (BR, PT)
Mandarin Chinese
Japanese

Tier 2: 95%+ Accuracy

Korean
Russian
Dutch
Polish
Turkish
Swedish
Arabic
Hindi

Tier 3: 90%+ Accuracy

Czech
Danish
Finnish
Greek
Hebrew
Norwegian
Thai
+ 70 more languages

Multi-Language Mode: Select Multi-Language in the language dropdown to transcribe conversations in which people switch between languages. Jyv Desktop detects each language switch automatically.

Use Cases

Meeting Transcription

Scenario: Capture every detail from Zoom/Teams meetings

Setup:

Enable transcription for Zoom/Teams in meeting profile
Turn on speaker detection to identify participants
Auto-export transcript at meeting end
Share via email or save to note-taking app

Meeting Transcription Config

{
  "meetingTranscription": {
    "autoStart": true,
    "speakerDetection": true,
    "removeFillerWords": true,
    "highlightActionItems": true,
    "autoSummarize": true,
    "exportFormat": ["docx", "pdf"]
  }
}

Content Creation

Scenario: Transcribe podcast interviews and YouTube videos

Setup:

High accuracy mode for clean transcripts
Speaker labels for host and guests
Timestamp markers every 30 seconds
Export as SRT/VTT for video subtitles
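
For readers who post-process transcripts themselves, the SRT format is simple to generate: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the caption text, and a blank line. The sketch below assumes a hypothetical segment shape (`start` and `end` in seconds, `speaker`, `text`); it is not Jyv Desktop's exporter.

```javascript
// Formats a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Hypothetical segment shape: { start, end (seconds), speaker, text }.
// Each cue ends with a newline; cues are separated by a blank line.
function toSrt(segments) {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(seg.end)}\n${seg.speaker}: ${seg.text}\n`
    )
    .join("\n");
}

console.log(
  toSrt([
    { start: 0, end: 2.5, speaker: "Host", text: "Welcome to the show." },
    { start: 2.5, end: 5, speaker: "Guest", text: "Thanks for having me." },
  ])
);
```

WebVTT uses the same structure with a `WEBVTT` header line and a period instead of a comma before the milliseconds.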

Accessibility

Scenario: Live captions for deaf/hard-of-hearing users

Setup:

Enable floating caption window
Large font size for readability
High contrast color scheme
Fast mode for minimal delay
Send captions to second monitor

Learning & Education

Scenario: Transcribe lectures and study sessions

Setup:

Auto-save all transcripts with tags
Enable keyword highlighting (key terms)
Create summary notes automatically
Export to Notion, OneNote, or Evernote

Export & Integration

Export Formats

Export transcripts in multiple formats:

Plain Text (.txt): Simple, compatible everywhere
Microsoft Word (.docx): Formatted with speakers and timestamps
PDF: Professional, shareable format
SRT/VTT: Video subtitle formats for editing software
JSON: Raw data with metadata for developers
Markdown: For documentation and note-taking apps
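
As an illustration of how the JSON export might be turned into another format, the sketch below converts a transcript object to Markdown. The input shape (`title`, `date`, `segments` with `speaker` and `text`) is an assumption; the actual JSON export schema is not documented here.

```javascript
// Hypothetical transcript shape: { title, date, segments: [{ speaker, text }] }.
// Produces a heading, an italic date line, and one bold-labeled paragraph per segment.
function toMarkdown(transcript) {
  const lines = [`# ${transcript.title}`, "", `*${transcript.date}*`, ""];
  for (const seg of transcript.segments) {
    lines.push(`**${seg.speaker}:** ${seg.text}`, "");
  }
  return lines.join("\n").trimEnd() + "\n";
}

console.log(
  toMarkdown({
    title: "Weekly Standup",
    date: "2025-12-01",
    segments: [
      { speaker: "Sarah", text: "The release is on track." },
      { speaker: "John", text: "I will update the changelog." },
    ],
  })
);
```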

Integration with Note Apps

Automatically send transcripts to your favorite apps:

Integration Settings

{
  "integrations": {
    "notion": {
      "enabled": true,
      "autoSend": true,
      "database": "Meeting Notes"
    },
    "onenote": {
      "enabled": true,
      "notebook": "Work Notes",
      "section": "Meetings"
    },
    "obsidian": {
      "enabled": true,
      "vault": "Personal",
      "folder": "Transcripts"
    }
  }
}

API Access

Developers can access transcripts via REST API:

API Example

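// Uses top-level await: run as an ES module or inside an async function.
// Send the API key with every request.
const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };

// Fetch recent transcripts
const listResponse = await fetch('http://localhost:8080/api/transcripts', { headers });
if (!listResponse.ok) throw new Error(`Request failed: ${listResponse.status}`);
const transcripts = await listResponse.json();

// Get a specific transcript by ID
const response = await fetch('http://localhost:8080/api/transcripts/abc123', { headers });
if (!response.ok) throw new Error(`Request failed: ${response.status}`);

const data = await response.json();
console.log(data.text);       // Full transcript text
console.log(data.speakers);   // Speaker information
console.log(data.timestamps); // Word-level timestamps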
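
In application code it can help to wrap the request in a small helper that checks the HTTP status before parsing. The sketch below reuses the base URL and Bearer header from the API example; the helper name and the injectable `fetchImpl` parameter are illustrative choices, not part of the Jyv Desktop API.

```javascript
// Fetches one transcript by ID and throws on non-2xx responses.
// fetchImpl defaults to the global fetch; passing a stub allows offline testing.
async function getTranscript(id, apiKey, fetchImpl = fetch) {
  const url = `http://localhost:8080/api/transcripts/${encodeURIComponent(id)}`;
  const response = await fetchImpl(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Example usage (requires Jyv Desktop to be running locally):
// getTranscript("abc123", "YOUR_API_KEY").then((data) => console.log(data.text));
```

Encoding the ID guards against characters that would otherwise change the request path.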

Privacy & Security

100% Local Processing: All transcription happens on your computer. Your audio and transcripts never leave your device unless you explicitly export them.

Data Storage

Encrypted Database: Transcripts stored in encrypted SQLite database
Local Only: No cloud storage or external servers
Configurable Retention: Auto-delete transcripts after a set number of days (for example, 90)
Manual Deletion: Delete individual transcripts anytime
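
The retention rule amounts to comparing each transcript's age with a cutoff. A minimal sketch, assuming a hypothetical transcript shape with an `id` and a `createdAt` date; how Jyv Desktop actually schedules deletion is not specified here.

```javascript
// Hypothetical transcript shape: { id: string, createdAt: Date }.
// Returns the transcripts that are older than retentionDays relative to `now`.
function expiredTranscripts(transcripts, retentionDays, now = new Date()) {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  return transcripts.filter((t) => t.createdAt.getTime() < cutoff);
}

const storedTranscripts = [
  { id: "old", createdAt: new Date("2025-09-01T00:00:00Z") },
  { id: "recent", createdAt: new Date("2025-12-01T00:00:00Z") },
];
const referenceDate = new Date("2025-12-31T00:00:00Z");
console.log(expiredTranscripts(storedTranscripts, 90, referenceDate).map((t) => t.id));
// logs: [ 'old' ]
```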

Security Features

Privacy Settings

{
  "privacy": {
    "encryptTranscripts": true,
    "encryptionKey": "user-password-derived",
    "retentionDays": 90,           // Auto-delete after 90 days
    "excludeApps": ["signal.exe"], // Never transcribe these apps
    "pauseDuringPrivacy": true,    // Stop when screen locked
    "redactSensitive": true        // Hide credit cards, SSNs, etc.
  }
}
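
To give a sense of what the `redactSensitive` option might do, here is a pattern-based sketch. It is an assumption-laden illustration, not Jyv Desktop's implementation: it covers only US Social Security numbers and 13 to 16 digit card numbers, and the replacement markers are made up.

```javascript
// Masks US Social Security numbers (NNN-NN-NNNN) and 13 to 16 digit card
// numbers written with optional single spaces or hyphens between digits.
function redactSensitive(text) {
  return text
    .replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED SSN]")
    .replace(/\b(?:\d[ -]?){12,15}\d\b/g, "[REDACTED CARD]");
}

console.log(redactSensitive("My SSN is 123-45-6789 and my card is 4111 1111 1111 1111."));
// logs: My SSN is [REDACTED SSN] and my card is [REDACTED CARD].
```

Pattern matching of this kind can miss spoken variants such as digits read out as words, so redacted transcripts may still warrant a manual check.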

Troubleshooting

Transcription Not Starting

Solution:

Check Feature Status

Verify that transcription is enabled in Settings → Features → Transcription.

Verify AI Model Download

Ensure the AI model is fully downloaded (Settings shows "Ready" status).

Check Microphone Input

Confirm that audio is being received in Jyv Desktop's audio meter.

Restart Transcription Engine

Go to Settings → Transcription → Restart Engine.

Poor Transcription Accuracy

Cause: Background noise, poor microphone, or wrong language

Solution:

Enable Accurate Mode for better quality
Ensure Noise Suppression is active
Verify correct language is selected
Use better microphone or improve room acoustics
Speak clearly and at consistent volume

Speaker Detection Not Working

Solution:

Train your voiceprint in transcription settings
Increase minimum speaker detection confidence
Manually label speakers after transcription
Ensure speakers have distinct voices (not too similar)
Reduce background noise for better voice separation

High CPU Usage

Solution:

Switch from Accurate to Fast or Balanced mode
Enable GPU Acceleration if available
Disable speaker detection if not needed
Close other resource-intensive applications
Reduce transcription sample rate (24 kHz instead of 48 kHz)

For transcription issues, check our Transcription Troubleshooting Guide or contact support.

