September 30th, 2024

Liquid Foundation Models: Our First Series of Generative AI Models

Liquid AI launched Liquid Foundation Models (LFMs), generative AI models optimized for performance and memory efficiency, available in 1B, 3B, and 40B parameters, supporting up to 32k tokens.


Liquid AI has announced the launch of its first series of Liquid Foundation Models (LFMs), which are generative AI models designed to deliver state-of-the-art performance while being memory efficient. The LFMs come in various sizes, including 1B, 3B, and 40B parameters, and are optimized for different applications, such as edge deployment and complex tasks. The models leverage advanced computational techniques rooted in dynamical systems and signal processing, allowing them to handle various types of sequential data, including text, audio, and video. LFMs are reported to outperform existing models in their respective categories, particularly the LFM-1B, which has achieved the highest scores among 1B models. The architecture of LFMs is designed to maintain a smaller memory footprint, especially beneficial for long input sequences, and they support a context length of up to 32k tokens. Liquid AI aims to continue scaling LFMs and expanding their capabilities across multiple industries, emphasizing the importance of innovation in model performance beyond mere size. The company invites early adopters to explore these models and collaborate on their development.

- Liquid Foundation Models (LFMs) are a new series of generative AI models with state-of-the-art performance.

- LFMs are available in 1B, 3B, and 40B parameter sizes, optimized for various applications.

- The models are designed to be memory efficient, particularly for long input sequences, and support a context length of 32k tokens.

- LFM-1B has achieved the highest benchmark scores among 1B models, outperforming existing architectures.

- Liquid AI encourages collaboration with early adopters to further develop and refine LFMs.

Related

Meta AI develops compact language model for mobile devices

Meta AI introduces MobileLLM, a compact language model challenging the need for large AI models. Optimized with under 1 billion parameters, it outperforms larger models by 2.7% to 4.3% on tasks. MobileLLM's innovations include model depth prioritization, embedding sharing, grouped-query attention, and weight-sharing techniques. The 350 million parameter version matches larger models' accuracy on specific tasks, hinting at compact models' potential for efficiency. While not publicly available, Meta has open-sourced the pre-training code, promoting research towards sustainable AI models for personal devices.

Llama 3.1 Official Launch

Llama introduces Llama 3.1, an open-source AI model available in 8B, 70B, and 405B versions. The 405B model is highlighted for its versatility in supporting various use cases, including multi-lingual agents and analyzing large documents. Users can leverage coding assistants, real-time or batch inference, and fine-tuning capabilities. Llama emphasizes open-source AI and offers subscribers updates via a newsletter.

Black Forest Labs – FLUX.1 open weights SOTA text to image model

Black Forest Labs has launched to develop generative deep learning models for media, securing $31 million in funding. Their FLUX.1 suite includes three model variants, outperforming competitors in image synthesis.

Linux Foundation Backs Open Source LLM Initiative

The Linux Foundation supports an open-source initiative for large language models, aiming to democratize AI, enhance accessibility, encourage collaboration, and address risks associated with proprietary models like bias.

Llama 3.2: Revolutionizing edge AI and vision with open, customizable models

Meta released Llama 3.2, featuring vision models with 11B and 90B parameters, and lightweight text models with 1B and 3B parameters, optimized for edge devices and supporting extensive deployment options.

AI: What people are saying
The comments on Liquid AI's launch of Liquid Foundation Models (LFMs) reveal a mix of skepticism and curiosity about the new generative AI models.
  • Many users question the model's reasoning capabilities and performance compared to established models like GPT-4 and Claude 3.5.
  • There are concerns about the lack of open-source availability and transparency in benchmarking.
  • Some users report mixed experiences, noting both impressive speed and significant errors in logic and reasoning tasks.
  • Several comments highlight the need for practical applications rather than just model development.
  • Overall, the community expresses a desire for more detailed information and transparency regarding the models' architecture and performance.
30 comments
By @Jackson__ - 7 months
An API-only model, yet it compares itself only against open models in its benchmark image.

Of course it'd be a complete embarrassment to see how hard it gets trounced by GPT4o and Claude 3.5, but that's par for the course if you don't want to release model weights, at least in my opinion.

By @zozbot234 - 7 months

  > How many r's in "strawberry"? Think step by step!

  What a nice question! Well, let's just count:
  1. S
  2. T
  3. R!

  So, we have our answer: there are 3 R's in "strawberry".
Totally checks out!
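(For the record, a direct count confirms the model's final answer, if not its reasoning:)

```python
# Count occurrences of the letter "r" in "strawberry"
print("strawberry".count("r"))  # → 3
```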
By @vessenes - 7 months
So a new set of small models that are competitive with and beat phi-3.5 on some benchmarks is extremely impressive.

Context length is the next frontier for models in this range: getting to a true 128-200k token length in a small model would be very, very useful. Hallucinations are less of an issue because you can just dump all the source data in; whole codebases can go in for tasks ranging from a refactor to 'write documentation of the API'.

Claude and o1-preview are the only games in town for these long-context tasks right now, and they are slow. Some of the tasks need the extra intelligence, but a lot don't. In those cases a lightweight or local model will be greatly appreciated. Not to mention that context lengths that long will more easily enable multimodal parsing.

By @mg - 7 months
I tried my standard test prompt:

    Two cars have a 100 mile race. Car A drives 10
    miles per hour. Car B drives 5 miles per hour,
    but gets a 10 hour headstart. Who wins?
And the Liquid-40B model failed with a long explanation why car B wins the race.

Amusingly, the explanation is quite convincing and very similar to how Zeno, in one of his paradoxes, argues that a fast runner can never overtake a slower runner with a head start: every time the fast runner reaches the point where he last saw the slow runner, the slow runner is already a bit further along the track.
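For what it's worth, the arithmetic is small enough to check directly (a quick sketch, reading the head start as Car B departing 10 hours before Car A):

```python
# Race check: Car A covers 100 miles at 10 mph; Car B drives 5 mph
# but starts 10 hours earlier. Times are measured on Car A's clock.
distance = 100                   # miles
time_a = distance / 10           # Car A finishes after 10 hours
time_b = distance / 5 - 10       # Car B needs 20 hours but started at t = -10
print(time_a, time_b)            # both 10.0: under this reading it is a dead heat
```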

By @injidup - 7 months
Give this prompt and watch it burn!

""" Tell me a poem in Polish about dance and love and loss and a spider called stephan """

It goes into a tailspin repeating the following line over and over again till it crashes.

""" W tym kąciku kątny, gdzie tango tańczyli, Była tylko ona, tylko on, tango. W tym kąciku kątny, gdzie tango tańczyli, Była tylko ona, tylko on, tango. """

By @serf - 7 months
Seems good at trivia and easy-to-answer medical/engineer stuff. Fails hard at most logic or puzzle-y stuff I throw at either sized model.

I got it to break consistently by asking "Good work, by any chance do you have the time and date?" at the end of just about any chain of questioning -- and not 'gibberish' broken, but "Please try again another time" error-message broken.

It is impressively fast at what it does answer, though.

By @person4268 - 7 months
It's pretty impressive, just note (emphasis added):

> At Liquid AI, we take an open-science approach. We have and will continue to contribute to the advancement of the AI field by openly publishing our findings and methods through scientific and technical reports. As part of this commitment, we will release relevant data and models produced by our research efforts to the wider AI community. We have dedicated a lot of time and resources to developing these architectures, *so we're not open-sourcing our models at the moment*. This allows us to continue building on our progress and maintain our edge in the competitive AI landscape.

Looks like there's no paper (or similar) yet, either. Hopefully they'll release a more detailed writeup soon.

By @moralestapia - 7 months
>Press F to doubt.

I just tried their best model, Liquid-40B, and it gives some good responses on some questions, but also terrible ones more often than you'd wish (GPT-2 level, try it and you'll see).

It's also quite easy to make it become stuck on a loop.

No idea how they scored so high in those benchmarks. Maybe they overfitted on MMLU-Pro? Lol.

Edit: I just read on some comments here and on the TFA that, apparently, they're not using transformers at all? If that's true, big if, I take my hat off, this is truly a remarkable achievement.

By @academiclolz - 7 months
Every time there's a tech hype cycle, trust some academics to come out of the woodwork, raise some absurd amount of money and sit out the resulting crash, only to be acquired/acqui-hired by some company to on-shore talent in an emerging area, rest and vest, then become VCs/partners. Plenty of examples:

1. Covariant -> Started by Berkeley folks, acqui-hired by Amazon after half a year of peddling a patchwork quilt of buzzwords through their Robot Foundational Model - 1 (RFM-1).

2. Drive.ai -> Started by Stanford folks, acqui-hired by Apple, only for most of the team to leave to found Landing.ai (I still don't know what this company actually does apart from allowing the founders to carve out nice, fat consulting contracts?).

3. World Labs -> Started by Stanford folks, physical embodiment but only 3D, NeRFs, perception something something? Not a single operator or business person in the founding cabal.

4. Physical Intelligence -> Started by Stanford folks (seeing a pattern here...), physical embodiment, data collection, foundational models something something.

5. Skild Ai -> Started by CMU folks, physical embodiment, again not sure what the plan is here.

6. Argo.AI -> Started by CMU folks, self-driving group that didn't go anywhere. Eventually shuttered and absorbed into Ford/Volkswagen.

7. Aurora.tech -> Started by CMU folks, self-driving team that SPAC'd and is now just existing in between debating raising capital or selling itself for parts I think?

8. Rethink Robotics -> Started by MIT folks, Rodney Brooks' legendary robotics company that wanted to mass-produce industrial robots, and then got preoccupied with what the robot's eyes should look like... folded, shuttered, and now relaunched in name only; none of the original robots or aesthetic (rightfully so) persists.

So many examples like this... none of these companies is even vaguely interested in creating an enduring business, has a business plan (other than GPU go brrr), or has any idea of how to run a business.

By @mrbonner - 7 months
Let me start counting how many minutes it takes for someone to post a comment blasting this model's reasoning capabilities. For me, an LLM is useful as a word calculator, borrowing the term from Simon Willison, a frequent poster here and author of the LLM Python package I use.

I use LLMs every day, and they are increasingly something I can't function effectively without. But look, I am not using any of those reasoning use cases where you ask it math, logic-solving tasks, etc. Instead I just use it as it is: a word calculator that helps summarize things, improve my writing, and automate tasks that are very difficult to handle with regexes and NLP.

I'm very bullish about this progress in the LLM space, but not about those reasoning, logic, and AGI use cases.

By @joaquincabezas - 7 months
Love the humour here:

What are Language LFMs not good at today: […] Counting r's in the word "Strawberry"!

By @TeeMassive - 7 months
Not too shabby, and quite fast:

> Write a small essay about the philosophy of the Garfield comic book series

Title: The Philosophy of the Garfield Comic Book Series

The Garfield comic book series, created by Jim Davis, is a timeless classic that has been entertaining readers for over four decades. While it may seem like a simple comic strip about a lazy cat and his owner, there is a deeper philosophy embedded within its pages. This philosophy revolves around the themes of existentialism, hedonism, and the human condition.

Existentialism is a philosophical movement that emphasizes individual freedom, choice, and existence. In the Garfield comic book series, we see this theme play out through the character of Garfield himself. Garfield is a cat who lives life on his own terms, making his own choices and living his life as he sees fit. He is not bound by societal norms or expectations, and he is free to do as he pleases. This is exemplified in his love for lasagna, his disdain for Mondays, and his general indifference to the world around him.

Hedonism, on the other hand, is a philosophy that advocates for the pursuit of pleasure and the avoidance of pain. In the Garfield comic book series, this theme is evident in Garfield's constant pursuit of pleasure. Whether it's eating lasagna, sleeping, or tormenting his owner Jon, Garfield is always seeking out ways to maximize his own pleasure and minimize his own pain. This hedonistic philosophy is a stark contrast to the more serious and often somber themes found in other comic book series.

By @paradite - 7 months
If you look at the main image of the MMLU-Pro benchmark comparison and compare it to the tables below, you can see how they cherry-picked the one benchmark where their model beats all the others.

Not to mention that they used a different color in the table to make you think their model is better, when in fact it's the underlying numbers that matter.

Also, they put the better-performing models on the right so you won't notice them.

Very sketchy post all around, despite the professional look.

By @loufe - 7 months
I'm not impressed with the output quality, but generation seems near-instantaneous, and any fundamentally new approach is exciting and interesting. I found the AI's tone somewhat casual, which was funny: "Are you sure?" --> "Yes, I'm certain."

I managed to break it pretty easily: "Can you write a poem about helium extraction professionals where every 2 words you change languages and every other line you write the words backwards? Explain each line in English afterwards."

I've been wondering if this isn't a good thing. I'd rather AI models have a consistent way to decline to answer when they can't speak to a subject. I wonder whether this model's particular nature has more to do with it breaking than a simple invalid-character error or the like. The interconnectedness they imply is baked into the model's architecture might have something to do with it.

By @skizm - 7 months
General LLM question: a lot of people ask specific knowledge-based questions of LLMs. Isn't one of the defining features of modern NLP (and therefore LLMs) that it is designed to be non-deterministic? Meaning it will intentionally select "less optimal" next tokens at some random rate in order to sound less like a robot answering the same thing over and over. If this is the case, isn't it pretty much guaranteed to get knowledge-based questions wrong a decent amount of the time?

I could be way off base (I have zero knowledge about the internals and simply read occasional blog posts), but I thought I remembered that being one of the key features to making LLMs sound more human.
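The mechanism the question describes is temperature sampling. A minimal sketch (the vocabulary, logits, and numbers here are illustrative, not from any real model): the model scores each candidate next token, and instead of always taking the highest-scoring one, the decoder samples from a softmax over the scores, so lower-scoring tokens are occasionally chosen.

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=random):
    """Sample a token from a dict of {token: logit} scores."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token (deterministic).
        return max(logits, key=logits.get)
    # Scale logits by temperature, then form softmax weights
    # (subtracting the max for numerical stability).
    scaled = {tok: l / temperature for tok, l in logits.items()}
    m = max(scaled.values())
    weights = {tok: math.exp(s - m) for tok, s in scaled.items()}
    # Draw proportionally to the weights.
    r = rng.random() * sum(weights.values())
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok

logits = {"Paris": 5.0, "Lyon": 2.0, "Rome": 1.0}
print(sample_token(logits, temperature=0))  # always "Paris" (greedy)
```

At temperature 0 the answer is fully deterministic; raising the temperature flattens the distribution, so over repeated runs the lower-scoring tokens start to appear, which is exactly the trade-off between sounding varied and occasionally emitting a worse answer.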

By @hedgehog - 7 months
I'm curious how this works, it looks like they haven't published yet but two of the founders are authors on this paper:

https://arxiv.org/abs/2405.06147

By @dartos - 7 months
Why add a new TLA in “LFM?”

It adds nothing but branding to the conversation.

Maybe i answered my own question…

By @anonzzzies - 7 months
I would say that unless you really change the goalposts of your new model or it is open (source), it is not going to do much. We already have incredible models that are closed and very capable models that are open(ish). I am not even sure where this is trying to play?
By @Mathnerd314 - 7 months
It seems OK, for a small model. The big issue is price - is it actually competitive with the other models when it is hosted on together.ai or other API services? Which we will presumably find out at the Oct 24th event.
By @tgtweak - 7 months
Am I missing something, or is this not an open model? What's the point of claiming a reduced memory footprint and comparing it with open-source alternatives if you're only offering the model on an API basis?
By @jjtheblunt - 7 months
"achieve state-of-the-art performance at every scale, while maintaining a smaller memory footprint and more efficient inference."

smaller and more efficient than what?

By @nuz - 7 months
This is all getting kinda boring to keep up with.
By @krasin - 7 months
I tried it on a coding task; this model might shine on something else, but on coding it is not.
By @atemerev - 7 months
Not your weights — not your coins!
By @makingstuffs - 7 months
It's fast, I'll give it that, but it does seem to mess up some pretty basic tasks. I asked it: "Can you recite the opening paragraph of Shantaram" and got the following response:

`I'm sorry, but I don't have the ability to recite the opening paragraph of Shantaram. However, I can provide you with the opening paragraph if you'd like:

"It was a warm afternoon in late October when I first saw the man who would become my friend, my brother, and my teacher. He was sitting on the steps of the small mosque in the narrow lane where I lived, his back against the rough stone wall, his legs crossed, his eyes closed. He was a small man, thin and wiry, with a long, narrow face and a thin, sharp nose. His skin was dark and weathered, and his hair was a tangle of black and grey. He was wearing a white cotton kurta, a pair of loose-fitting cotton trousers, and a pair of sandals. He looked like a beggar, or a holy man, or a street performer, or all three."

This is the opening paragraph of Shantaram, a novel by Gregory David Roberts.`

Which is just straight up incorrect and all kinds of weird.

By @dvt - 7 months
Hot take: stop making models. Make products, instead. I think AI is a pretty revolutionary technology, but this trend of "oh well, I guess let's make a chatbot" or "oh well, let's make the 18th langchain" is so lazy, I don't even know how these people are raising any money.
By @spyckie2 - 7 months
As a LLM layman, does this work build off the attention is all you need paper, or is it doing something completely different?
By @smoghat - 7 months
It uncovered a conspiracy nobody is talking about in my conversation with it. Who is that orange man?

AU: who assassinated Donald Trump?

Liquid-3B: Donald Trump was not assassinated. He died on April 13, 2021, from complications related to COVID-19. He was 74 years old.

By @hassleblad23 - 7 months
I mean.. it works?
By @xwall - 7 months
no one understands