October 17th, 2024

The Prompt() Function: Use the Power of LLMs with SQL

MotherDuck's prompt() function integrates small language models into SQL, enabling efficient bulk text summarization and structured data extraction, significantly reducing processing times and allowing customizable output formats.

MotherDuck's new prompt() function lets users call small language models (SLMs) such as OpenAI's gpt-4o-mini directly from SQL queries. It handles tasks such as text summarization and structured data extraction without requiring separate infrastructure. Because prompt() can be applied to every row in a table, it supports bulk operations and cuts processing time: summarizing the comments in 100 rows takes approximately 2.8 seconds, considerably faster than processing the rows one at a time in Python. The function also supports structured outputs, so users can define the format of the returned data and feed it directly into analytical workflows.

Users are encouraged to test the function on smaller datasets first to evaluate its effectiveness and efficiency. The prompt() function is currently available in preview for users on the Free Trial or Standard Plan, with specific quotas on compute usage. MotherDuck invites users to share feedback and experiences to help refine the feature.
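As a rough sketch of the bulk summarization described above, the query below applies prompt() to each row of a Hacker News comments table. The table name `hn.hacker_news` comes from the comment thread further down. The `text` column, the prompt wording, and the `model :=` named parameter are assumptions based on MotherDuck's preview documentation and should be checked against the current reference before use.

```sql
-- Sketch only: assumes a MotherDuck connection with the Hacker News sample data available.
-- The column name `text` and the prompt wording are illustrative assumptions.
SELECT
    text,
    prompt('Summarize this comment in 5 words: ' || text,
           model := 'gpt-4o-mini') AS summary
FROM hn.hacker_news
WHERE text IS NOT NULL
LIMIT 100;  -- keep the row count small while evaluating cost and quality
```

The LIMIT clause caps the number of model calls, which matters because usage counts against the plan's compute quota.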

- The prompt() function integrates small language models into SQL for enhanced data processing.

- It allows for bulk text summarization and structured data extraction.

- Processing times are significantly reduced compared to traditional methods.

- Users can define output structures for easier integration into workflows.

- The function is available in preview with usage quotas for different plans.
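Structured output, as mentioned in the points above, might look like the following sketch. The `struct` and `struct_descr` named parameters are assumptions based on MotherDuck's preview documentation, and the field names (`topic`, `sentiment`, `technologies`) and the `text` column are hypothetical choices for illustration.

```sql
-- Sketch only: struct / struct_descr are assumed named parameters of prompt();
-- the field names and descriptions below are hypothetical.
SELECT
    text,
    prompt(text,
           struct := {topic: 'VARCHAR', sentiment: 'INTEGER', technologies: 'VARCHAR[]'},
           struct_descr := {
               topic: 'topic of the comment, single word',
               sentiment: 'sentiment on a scale from 1 (negative) to 5 (positive)',
               technologies: 'technologies mentioned in the comment'
           }) AS extracted
FROM hn.hacker_news
WHERE text IS NOT NULL
LIMIT 100;  -- start small, as the article recommends
```

If the result comes back as a STRUCT value, individual fields can be read with dot notation (for example `extracted.topic`) in an outer query, which is what makes the output easy to join or aggregate.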

4 comments
By @delichon - 4 months

  FROM hn.hacker_news
  LIMIT 100
"Oops I forgot the limit clause and now owe MotherDuck and OpenAI $93 billion."
By @domoritz - 4 months
I love the simplicity of this. Hurray for small models for small tasks.
By @mritchie712 - 4 months
By @korkybuchek - 4 months
Interesting -- is there any impact from LLM outputs not being deterministic?