August 20th, 2024

Llamafile v0.8.13 (and Whisperfile)

Llamafile version 0.8.13 adds support for the Gemma 2B model and the Whisper speech-to-text model, letting users transcribe audio files. Audio must be in 16kHz .wav format, and transcription is significantly faster with GPU acceleration on an M2 Max.


The latest release of llamafile (version 0.8.13) introduces support for the Gemma 2B model along with significant performance enhancements. It also adds compatibility with the Whisper speech-to-text model, using Georgi Gerganov's C++ implementation of Whisper (whisper.cpp). To set up whisperfile, users download the executable from GitHub and obtain the whisper-tiny.en-q5_1.bin model from Hugging Face. Transcription is run from the command line, with options to suppress debug output and to save transcripts in JSON format. Audio recordings must first be converted to 16kHz .wav files with ffmpeg for compatibility.

An update notes that new whisperfiles uploaded to Hugging Face can automatically resample various audio formats. In performance tests, transcribing a 10-minute audio file took 11 seconds with the tiny model and 1 minute 49 seconds with the larger medium model. Using the GPU on an M2 Max MacBook Pro significantly reduced CPU usage and improved transcription speed.
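Based on the setup steps described above, a typical session might look like the sketch below. The filenames are taken from the post, but the download URLs are omitted (see the GitHub releases page and Hugging Face), and the flag names follow whisper.cpp conventions, so they may differ slightly in whisperfile.

```shell
# After downloading the whisperfile executable from GitHub
# and whisper-tiny.en-q5_1.bin from Hugging Face:
chmod +x whisperfile

# Transcribe a 16kHz .wav recording.
# --no-prints suppresses debug output and -oj writes the transcript
# as JSON (flag names per whisper.cpp; whisperfile may differ).
./whisperfile -m whisper-tiny.en-q5_1.bin -f recording.wav --no-prints -oj
```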

- Llamafile v0.8.13 adds support for Gemma 2B and Whisper speech-to-text model.

- Users can transcribe audio files using whisperfile with various command options.

- Audio files must be converted to 16kHz .wav format for compatibility.

- New whisperfiles can automatically resample different audio formats.

- GPU usage can enhance transcription speed and reduce CPU load.
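For builds that still require 16kHz input, the conversion step mentioned above can be done with ffmpeg. This is a minimal sketch with an illustrative filename; the post does not give the exact command, so the flags here are standard ffmpeg options for 16kHz mono 16-bit PCM output.

```shell
# Resample any audio file to a 16kHz mono 16-bit PCM .wav,
# the format the original whisperfile release expects.
in="recording.mp3"
out="${in%.*}.wav"          # recording.mp3 -> recording.wav
ffmpeg -i "$in" -ar 16000 -ac 1 -c:a pcm_s16le "$out"
```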

Related

Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU


The article discusses the release of the open-source Llama3 70B model, highlighting its performance compared to GPT-4 and Claude3 Opus. It emphasizes training enhancements, data quality, and the competition between open- and closed-source models.

Gemma 2 on AWS Lambda with Llamafile


Google released Gemma 2 9B, a compact language model rivaling GPT-3.5. Mozilla's llamafile simplifies deploying models like LLaVA 1.5 and Mistral 7B Instruct, enhancing accessibility to powerful AI models across various systems.

Llama 3.1 Official Launch


Meta introduces Llama 3.1, an open-source AI model available in 8B, 70B, and 405B versions. The 405B model is highlighted for its versatility across use cases, including multilingual agents and analyzing large documents. Users can leverage coding assistants, real-time or batch inference, and fine-tuning capabilities. Meta emphasizes open-source AI and offers subscribers updates via a newsletter.

Meta Llama 3.1 405B


The Meta AI team unveils Llama 3.1, a 405B model optimized for dialogue applications. It competes well with GPT-4o and Claude 3.5 Sonnet, offering versatility and strong performance in evaluations.

Show HN: LLM Aided Transcription Improvement


The LLM-Aided Transcription Improvement Project on GitHub enhances audio transcription quality using a multi-stage pipeline. It supports local and cloud-based models and requires Python 3.12 for installation and execution.
