August 22nd, 2024

LM Studio 0.3.0

LM Studio 0.3.0 enhances the desktop application with document chat, Retrieval Augmented Generation (RAG), a Structured Output API, multiple UI themes, improved regeneration, and simplified migration of previous chats.


LM Studio has released version 0.3.0, a major update to its desktop application for running large language models (LLMs) locally and offline. The release introduces the ability to chat with documents: users can attach documents for the LLM to reference, and for lengthy documents the application uses Retrieval Augmented Generation (RAG) to pull in only the relevant sections. The update also adds an OpenAI-like Structured Output API for reliable JSON outputs.

On the interface side, users can choose from multiple UI themes (Dark, Light, and Sepia), organize chats into folders, and rely on the application to automatically configure model load parameters based on their hardware. A new "Serve on Network" feature makes the LM Studio server reachable from other devices, and the regeneration feature now supports multiple generations for each chat. The release also brings a refreshed UI, enhanced model loading, support for embedding models, and initial translations for several languages, and previous chats can be migrated easily. Overall, LM Studio 0.3.0 aims to provide a more user-friendly and versatile experience for managing local LLMs.

- LM Studio 0.3.0 introduces document chat functionality and Retrieval Augmented Generation (RAG).

- The update supports an OpenAI-like Structured Output API for reliable JSON outputs (a request sketch follows this list).

- Users can now choose from multiple UI themes and organize chats into folders.

- The application automatically configures load parameters based on hardware capabilities.

- Migration from previous versions is simplified, allowing users to retain their chat history.
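
Since the Structured Output API is described as OpenAI-like, a minimal sketch of a request against the local server might look like the following. This is an assumption-laden illustration, not the documented API: the port (1234 is LM Studio's usual default), the model name, and the schema are all placeholders.

```python
# Hedged sketch: a structured-output request to LM Studio's local,
# OpenAI-compatible server. Assumes the server is running on the default
# port (1234); the model name and schema are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio serves whichever model is loaded
    messages=[{"role": "user", "content": "Extract name and age from: 'Ada, 36'"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    },
)

print(response.choices[0].message.content)  # should be schema-conforming JSON
```

Because the server speaks the OpenAI wire format, existing OpenAI client code can in principle be pointed at it just by swapping the base URL.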

Related

LLMs on the Command Line

Simon Willison presented a Python command-line utility for accessing Large Language Models (LLMs) efficiently, supporting OpenAI models and plugins for various providers. The tool enables running prompts, managing conversations, accessing specific models like Claude 3, and logging interactions to a SQLite database. Willison highlighted using LLM for tasks like summarizing discussions and emphasized the importance of embeddings for semantic search, showcasing LLM's support for content similarity queries and extensibility through plugins and OpenAI API compatibility.
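
For flavor, here is a hedged sketch of driving the tool from Python rather than the shell, based on its documented Python API; the model and embedding-model names are examples, and prompting a hosted model assumes an API key is configured (e.g. via the OPENAI_API_KEY environment variable).

```python
# Hedged sketch of Simon Willison's `llm` package used from Python
# (pip install llm). Model names are examples; hosted models assume an
# API key is available in the environment.
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Summarize this discussion in one sentence.")
print(response.text())

# Embeddings for semantic search follow a similar pattern:
embedding_model = llm.get_embedding_model("3-small")
vector = embedding_model.embed("a sentence to index for similarity search")
print(len(vector))  # dimensionality of the embedding
```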

LLMs can solve hard problems

LLMs like Claude 3.5 'Sonnet' efficiently handle tasks such as generating podcast transcripts, identifying speakers, and creating episode synopses. Their successful application demonstrates practicality and versatility in problem-solving.

Llama 3.1 Official Launch

Meta introduces Llama 3.1, an open-source AI model family available in 8B, 70B, and 405B versions. The 405B model is highlighted for its versatility across use cases, including multilingual agents and analyzing large documents. Users can leverage coding assistants, real-time or batch inference, and fine-tuning capabilities. Meta emphasizes open-source AI and offers subscribers updates via a newsletter.

Llama 3.1: Our most capable models to date

Meta has launched Llama 3.1 405B, an advanced open-source AI model supporting diverse languages and extended context length. It introduces new features like Llama Guard 3 and aims to enhance AI applications with improved models and partnerships.

An Open Course on LLMs, Led by Practitioners

A new free course, "Mastering LLMs," offers over 40 hours of content on large language models, featuring workshops by 25 experts, aimed at enhancing AI product development for technical individuals.

AI: What people are saying
The comments reflect a mix of experiences and opinions regarding LM Studio and its recent updates.
  • Users appreciate the enhancements in version 0.3.0, particularly the new features like document chat and Retrieval Augmented Generation.
  • Some users express frustration over the lack of open-source availability and the limitations of the licensing model.
  • Comparisons are made between LM Studio and other tools like Ollama, with some users preferring the latter for its integration capabilities.
  • Several comments highlight the ease of use and accessibility of LM Studio for local AI experimentation.
  • Users seek more information on system requirements, changelogs, and performance benchmarks across different setups.
22 comments
By @yags - 6 months
Hello Hacker News, Yagil here, founder and original creator of LM Studio (now built by a team of 6!). I had the initial idea to build LM Studio after seeing the OG LLaMa weights ‘leak’ (https://github.com/meta-llama/llama/pull/73/files) and then later trying to run some TheBloke quants during the heady early days of ggerganov/llama.cpp. In my notes LM Studio was first “Napster for LLMs”, which later evolved to “GarageBand for LLMs”.

What LM Studio is today is an IDE / explorer for local LLMs, with a focus on format universality (e.g. GGUF) and data portability (you can go to the file explorer and edit everything). The main aim is to give you an accessible way to work with LLMs and make them useful for your purposes.

Folks point out that the product is not open source. However I think we facilitate distribution and usage of openly available AI and empower many people to partake in it, while protecting (in my mind) the business viability of the company. LM Studio is free for personal experimentation and we ask businesses to get in touch to buy a business license.

At the end of the day LM Studio is intended to be an easy yet powerful tool for doing things with AI without giving up personal sovereignty over your data. Our computers are super capable machines, and everything that can happen locally w/o the internet, should. The app has no telemetry whatsoever (you’re welcome to monitor network connections yourself) and it can operate offline after you download or sideload some models.

0.3.0 is a huge release for us. We added (naïve) RAG, internationalization, UI themes, and set up foundations for major releases to come. Everything underneath the UI layer is now built using our SDK which is open source (Apache 2.0): https://github.com/lmstudio-ai/lmstudio.js. Check out specifics under packages/.

Cheers!

-Yagil

By @pcf - 6 months
In some brief testing, I discovered that the same models (Llama 3 7B and one more I can't remember) are running MUCH slower in LM Studio than in Ollama on my MacBook Air M1 2020.

Has anyone found the same thing, or was that a fluke and I should try LM Studio again?

By @smcleod - 6 months
Nice, it’s a solid product! It’s just a shame it’s not open source and its license doesn’t permit work use.
By @mythz - 6 months
Originally started out with LM Studio, which was pretty nice, but ended up switching to Ollama since I only want to use one app to manage all the large model downloads, and there are many more tools and plugins that integrate with Ollama, e.g. in IDEs and text editors.
By @xeromal - 6 months
I never could get anything local working a few years ago, and someone on reddit told me about LM Studio and I finally managed to "run an AI" on my machine. Really cool, and now I'm tinkering with it using the built-in HTTP server.
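
For anyone wanting to reproduce that setup, here is a minimal sketch of calling the built-in server over plain HTTP; it assumes the server was started in the app and listens on the usual default port (1234), and the "model" value is a placeholder.

```python
# Hedged sketch: plain-HTTP request to LM Studio's built-in server.
# Assumes the server is running on the default port (1234) with a model
# loaded; the "model" value is a placeholder.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder; the loaded model handles the request
        "messages": [{"role": "user", "content": "Hello from my own machine!"}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```
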
By @pornlover - 6 months
LM Studio is great, although I wish recommended prompts were part of the data of each LLM. I probably just don't know enough, but I feel like I get a hunk of magic data and then I'm mostly on my own.

Similarly with images, LLMs and ML in general feel like DOS and config.sys and autoexec.bat and qemm days.

By @TeMPOraL - 6 months
Does anyone know if there's a changelog/release notes available for all historical versions of this? This is one of those programs with the annoying habit of surfacing only the list of changes in the most recent version, and their release cadence is such that there are some 3 to 5 updates between the times I run it, and then I have no idea what changed.
By @swalsh - 6 months
I LOVE LM Studio; it's super convenient for testing model capabilities, and the OpenAI server makes it really easy to spin up a server and test. My typical process is to load it up in LM Studio, test it, and when I'm happy with the settings, move to vLLM.
By @qwertox - 6 months
Yesterday I wanted to find a snippet from a ChatGPT conversation I had maybe 1 or 2 weeks ago. Searching for a single keyword would have been enough to find it.

How is it possible that there's still no way to search through your conversations?

By @mark_l_watson - 6 months
Question for everyone: I am using the MLX version of Flux to generate really good images from text on my M2 Mac, but I don’t have an easy setup for doing text + base image to a new image. I want to be able to use base images of my family and put them on Mount Everest, etc.

Does anyone have a recommendation?

For context: I have almost ten years experience with deep learning, but I want something easy to set up in my home M2 Mac, or Google Colab would be OK.

By @fallinditch - 6 months
Does anyone know what advantages LM Studio has over Ollama, and vice versa?
By @webprofusion - 6 months
Cool, though it's a bit weird that the Windows download is 32-bit; it should be 64-bit by default, and there's no need for a 32-bit Windows version at all.
By @IronWolve - 6 months
Been using LM Studio for months on Windows; it's so easy to use: simple install, just search for the LLM off Hugging Face and it downloads and just works. I don't need to set up a Python environment in conda; it's way easier for people to play and enjoy. It's what I tell people who want to start enjoying LLMs without the hassle.
By @dgreensp - 6 months
I filed a GitHub issue two weeks ago about a bug that was enough for me to put it down for a bit, and there’s been not even a response. Their development velocity seems incredible, though. I’m not sure what to make of it.
By @alok-g - 6 months
See also: Msty.app

It allows both local and cloud models.

* Not associated with them in any way. Am a happy user.

By @2browser - 6 months
Running this on Windows on an AMD card. Llama 3.1 Instruct 8B runs really well on this if anyone wants to try.
By @BaculumMeumEst - 6 months
If you're hopping between these products instead of learning and understanding how inference works under the hood, and familiarizing yourself with the leading open source projects (i.e. llama.cpp), you are doing yourself a great disservice.
By @Tepix - 6 months
Neat! Can I use it with Brave browser's local LLM feature?
By @a1o - 6 months
What are the recommended system settings for this?
By @grigio - 6 months
Can somebody share benchmarks on AMD Ryzen AI with and without the NPU?
By @navaed01 - 6 months
Congrats! I'm a big fan of the existing product, and these are some great updates that make the app even more accessible and powerful.