Prerequisites
- Install ffmpeg
  - macOS: brew install ffmpeg
  - Ubuntu: sudo apt-get install ffmpeg
  - Windows: Download from https://ffmpeg.org/download.html
- Install the mlx-whisper library: pip install mlx-whisper
- Prepare audio files
  - Create a 'storage/audio' directory
  - Place your audio files in this directory
  - Supported formats: mp3, mp4, wav, etc.
- Download sample audio (optional)
  - Visit audio-samples (as an example) and save the audio file to the storage/audio directory.
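The setup steps above can be checked from Python before running anything heavier. This is a minimal sketch using only the standard library; the helper names `ffmpeg_available` and `ensure_audio_dir` are hypothetical and not part of the toolkit.

```python
import shutil
from pathlib import Path


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def ensure_audio_dir(audio_dir: Path) -> Path:
    """Create the audio directory (and any missing parents) and return it."""
    audio_dir.mkdir(parents=True, exist_ok=True)
    return audio_dir
```

For example, `ensure_audio_dir(Path("storage/audio"))` creates the directory used in the steps above, and `ffmpeg_available()` reports whether the ffmpeg installation succeeded.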
Example
The following agent will use MLX Transcribe to transcribe audio files.
cookbook/tools/mlx_transcribe_tools.py
Toolkit Params
| Parameter | Type | Default | Description | 
|---|---|---|---|
| base_dir | Path | Path.cwd() | Base directory for audio files | 
| enable_read_files_in_base_dir | bool | True | Whether to register the read_files function | 
| path_or_hf_repo | str | "mlx-community/whisper-large-v3-turbo" | Path or HuggingFace repo for the model | 
| verbose | bool | None | Enable verbose output | 
| temperature | float or Tuple[float, ...] | None | Temperature for sampling | 
| compression_ratio_threshold | float | None | Compression ratio threshold | 
| logprob_threshold | float | None | Log probability threshold | 
| no_speech_threshold | float | None | No speech threshold | 
| condition_on_previous_text | bool | None | Whether to condition on previous text | 
| initial_prompt | str | None | Initial prompt for transcription | 
| word_timestamps | bool | None | Enable word-level timestamps | 
| prepend_punctuations | str | None | Punctuations to prepend | 
| append_punctuations | str | None | Punctuations to append | 
| clip_timestamps | str or List[float] | None | Clip timestamps | 
| hallucination_silence_threshold | float | None | Hallucination silence threshold | 
| decode_options | dict | None | Additional decoding options | 
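Most parameters in the table default to None. One plausible reading is that unset options are not forwarded, so mlx-whisper's own defaults apply. The sketch below illustrates that idea under that assumption; it is not the toolkit's actual implementation, and the helper names `build_transcribe_kwargs` and `transcribe_file` are hypothetical. The call to `mlx_whisper.transcribe` with `path_or_hf_repo` comes from the mlx-whisper package, which requires Apple Silicon.

```python
from typing import Any, Dict, Optional


def build_transcribe_kwargs(
    path_or_hf_repo: str = "mlx-community/whisper-large-v3-turbo",
    decode_options: Optional[Dict[str, Any]] = None,
    **options: Any,
) -> Dict[str, Any]:
    """Collect keyword arguments, dropping every option left at None."""
    kwargs: Dict[str, Any] = {"path_or_hf_repo": path_or_hf_repo}
    kwargs.update({k: v for k, v in options.items() if v is not None})
    # Additional decoding options are merged last (an assumption of this sketch).
    if decode_options:
        kwargs.update(decode_options)
    return kwargs


def transcribe_file(audio_path: str, **options: Any) -> str:
    """Transcribe one audio file and return its text (needs mlx-whisper)."""
    # Imported lazily so build_transcribe_kwargs works without mlx-whisper.
    import mlx_whisper

    result = mlx_whisper.transcribe(audio_path, **build_transcribe_kwargs(**options))
    return result["text"]
```

With this approach, `build_transcribe_kwargs(word_timestamps=True, verbose=None)` forwards only the model path and `word_timestamps`, leaving every other setting at the library default.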
Toolkit Functions
| Function | Description | 
|---|---|
| transcribe | Transcribes an audio file using MLX Whisper | 
| read_files | Lists all audio files in the base directory |
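To make the read_files behavior concrete, here is a minimal sketch of listing audio files in a base directory with pathlib. It only illustrates the idea: the function body and the extension set are assumptions, not the toolkit's code.

```python
from pathlib import Path
from typing import List

# Illustrative subset of extensions; adjust to the formats you actually use.
AUDIO_EXTENSIONS = {".mp3", ".mp4", ".wav"}


def list_audio_files(base_dir: Path) -> List[str]:
    """Return the sorted names of audio files directly inside base_dir."""
    if not base_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in base_dir.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

The names returned by a listing like this are what an agent would pick from before passing a file to transcribe.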