Models for Conversational AI

Krisp Audio SDK can be integrated into a server-side audio pipeline to remove background noise and background voices from audio streams. For example, placing Krisp NC before Voice Activity Detection (VAD) in the pipeline improves turn detection for Conversational AI, which reduces unwanted interruptions.
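To illustrate why stage order matters, here is a minimal sketch of a pipeline that runs a denoising stage before VAD. It does not use the Krisp SDK: `noise_gate` is a hypothetical placeholder for the NC stage, `energy_vad` is a toy energy-based detector, and the thresholds are arbitrary values chosen for the example.

```python
import math
from typing import Callable, List, Optional

Frame = List[float]


def noise_gate(frame: Frame, floor: float = 0.1) -> Frame:
    """Placeholder for the noise-cancellation stage (NOT the Krisp API).

    Zeroes every sample whose magnitude is below `floor`.
    """
    return [s if abs(s) >= floor else 0.0 for s in frame]


def energy_vad(frame: Frame, threshold: float = 0.02) -> bool:
    """Toy VAD: reports speech when the frame's RMS exceeds `threshold`."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms > threshold


def detect_speech(frame: Frame,
                  denoise: Optional[Callable[[Frame], Frame]] = None) -> bool:
    """Run the optional denoise stage first, then VAD on its output."""
    cleaned = denoise(frame) if denoise is not None else frame
    return energy_vad(cleaned)


# Synthetic frames: low-level background noise and louder speech.
noise_frame = [0.05, -0.05] * 80
speech_frame = [0.5, -0.5] * 80

print(detect_speech(noise_frame))                      # VAD alone
print(detect_speech(noise_frame, denoise=noise_gate))  # denoise, then VAD
print(detect_speech(speech_frame, denoise=noise_gate))
```

In this toy setup, the noise-only frame triggers the VAD when it is fed in raw but not after the denoising stage, while the speech frame is detected either way. A real deployment would replace `noise_gate` with the actual Krisp NC call and `energy_vad` with the pipeline's VAD.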

Use Cases

  • Conversational AI
  • Speech-to-Speech models
  • AI Voice Agents

Multiple Audio Stream Support

The SDK supports real-time processing of multiple audio streams using a single model loaded into memory, ensuring efficient memory utilization.
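The sharing pattern described above can be sketched as follows. This is an illustration of the general idea only, not the Krisp SDK API: `SharedModel`, `StreamSession`, the `model.bin` path, and the stream IDs are all hypothetical names. The model object is created once, and each stream keeps only its own small state.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class SharedModel:
    """Hypothetical stand-in for model weights loaded once per process."""

    load_count = 0  # counts how many times a model was loaded

    def __init__(self, path: str) -> None:
        SharedModel.load_count += 1
        self.path = path

    def process(self, frame: List[float], state: Dict[str, int]) -> List[float]:
        # Placeholder processing: pass audio through, update per-stream state.
        state["frames_seen"] = state.get("frames_seen", 0) + 1
        return list(frame)


@dataclass
class StreamSession:
    """Per-stream handle: references the shared model, owns its own state."""

    model: SharedModel
    state: Dict[str, int] = field(default_factory=dict)

    def process(self, frame: List[float]) -> List[float]:
        return self.model.process(frame, self.state)


model = SharedModel("model.bin")  # hypothetical path, loaded once
sessions = {sid: StreamSession(model)
            for sid in ("caller-1", "caller-2", "caller-3")}

sessions["caller-1"].process([0.0] * 160)
sessions["caller-1"].process([0.0] * 160)
sessions["caller-2"].process([0.0] * 160)
```

Memory for the weights is paid once regardless of the number of sessions; only the per-stream state grows with the number of concurrent streams.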

Supported Server Types

  • Linux x86_64, armv8-a
  • macOS ARM, x86_64
  • Windows x86_64

Supported Languages

Supported Frameworks

  • KrispFilter is built into Pipecat
  • KrispNoiseFilter is included in LiveKit

Demo

Demo with Daily using Pipecat and Google Gemini Live

Recommended Models

Krisp offers different noise cancellation models optimized for different use cases, as shown in the following table.

  • NC (noise cancellation) models remove background noises and background chatter
  • BVC (background voice cancellation) models remove background noises and keep only the primary speaker's voice
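
Audio Source                               | Mic Requirement             | NC | BVC | Krisp Model
-------------------------------------------|-----------------------------|----|-----|-----------------------------
Telephony, Cellular, Landline (8 kHz)      | Any                         | +  | N/A | inbound NC (c7.n.s.9f4389)
Mobile, Desktop, Browser (WebRTC, 16 kHz+) | Close-field talk or Headset | +  | +   | outbound BVC (c6.f.s.de56df)
Mobile, Desktop, Browser (WebRTC, 16 kHz+) | Any                         | +  | N/A | outbound NC (c8.f.s.026300)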

For more details on models refer to Model Guide.
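
The recommendations in the table above can be encoded as a small lookup helper. This is a sketch under assumptions: the function name `recommend_model`, the `source` labels `"telephony"` and `"webrtc"`, and the `close_talk_mic` flag are invented for this example and are not part of the Krisp SDK. Only the model names and identifiers come from the table.

```python
from typing import NamedTuple


class ModelChoice(NamedTuple):
    name: str       # model name as listed in the table
    model_id: str   # model identifier as listed in the table
    bvc: bool       # True if the model also removes background voices


INBOUND_NC = ModelChoice("inbound NC", "c7.n.s.9f4389", False)
OUTBOUND_BVC = ModelChoice("outbound BVC", "c6.f.s.de56df", True)
OUTBOUND_NC = ModelChoice("outbound NC", "c8.f.s.026300", False)


def recommend_model(source: str, close_talk_mic: bool = False) -> ModelChoice:
    """Pick a model following the table's three rows.

    `source` is a hypothetical label: "telephony" for 8 kHz telephony,
    cellular, or landline audio; "webrtc" for 16 kHz+ mobile, desktop,
    or browser audio.
    """
    if source == "telephony":
        # Telephony audio: any microphone, NC only.
        return INBOUND_NC
    if source == "webrtc":
        # BVC is listed only for close-field talk or headset microphones.
        return OUTBOUND_BVC if close_talk_mic else OUTBOUND_NC
    raise ValueError(f"unknown audio source: {source!r}")


choice = recommend_model("webrtc", close_talk_mic=True)
print(choice.name, choice.model_id)
```

Raising an error for unknown sources, rather than falling back to a default, keeps a misconfigured pipeline from silently running the wrong model.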