December 3rd, 2024

Amazon Nova

Amazon has launched Amazon Nova, a suite of foundation models for generative AI, featuring understanding and creative models, customization options, and safety controls to enhance productivity and reduce costs.

Amazon has introduced Amazon Nova, a new suite of foundation models designed to enhance generative AI capabilities while offering superior price performance. Available exclusively through Amazon Bedrock, these models aim to reduce costs and latency for various AI tasks, including document analysis, video understanding, and content generation.

Amazon Nova features two main categories: understanding models and creative content generation models. The understanding models, such as Amazon Nova Micro, Lite, and Pro, are optimized for processing text, images, and videos, enabling tasks like summarization, translation, and visual question answering. The creative models, including Amazon Nova Canvas and Reel, focus on generating high-quality images and videos from text prompts and images.

These models are equipped with customization capabilities, allowing enterprises to fine-tune them for specific industry needs. For instance, legal firms can adapt the models to better understand legal terminology. Additionally, built-in safety controls and watermarking features promote responsible AI use. The announcement highlights Amazon Nova's potential applications in various sectors, showcasing its ability to streamline workflows and enhance productivity through advanced AI functionalities.

- Amazon Nova is a new suite of foundation models for generative AI tasks.

- It offers understanding models for text, image, and video processing, and creative models for image and video generation.

- Customization capabilities allow enterprises to tailor models to specific industry needs.

- Built-in safety controls and watermarking promote responsible AI use.

- The models aim to reduce costs and latency while enhancing productivity across various applications.
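To make the creative side concrete, here is a minimal sketch of requesting an image from Nova Canvas through Bedrock's `invoke_model` API with boto3. The model ID (`amazon.nova-canvas-v1:0`), the request schema (`taskType` / `textToImageParams` / `imageGenerationConfig`), and the response field names are assumptions based on Bedrock's image-model conventions; verify them against the Bedrock documentation before use.

```python
import base64
import json


def build_canvas_request(prompt: str, width: int = 1280, height: int = 720) -> dict:
    """Build the JSON body for a Nova Canvas text-to-image request.

    The schema here is an assumption modeled on Bedrock's image-generation
    conventions -- check the official docs for the authoritative shape.
    """
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": width,
            "height": height,
        },
    }


def generate_image(prompt: str, out_path: str = "nova.png") -> None:
    """Call Nova Canvas and write the first returned image to disk."""
    import boto3  # lazy import: the request builder works without boto3 installed

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    resp = client.invoke_model(
        modelId="amazon.nova-canvas-v1:0",  # assumed model ID
        body=json.dumps(build_canvas_request(prompt)),
    )
    payload = json.loads(resp["body"].read())
    # Assumed response shape: a list of base64-encoded images
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(payload["images"][0]))
```

The same `invoke_model` call with a different body would target Nova Reel for video; only the Converse API (used by the text models) is uniform across model families.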

AI: What people are saying
The launch of Amazon Nova has generated a variety of comments reflecting user concerns and insights.
  • Pricing comparisons with other models highlight Amazon Nova's competitive rates.
  • Users express frustration with Amazon's complex jargon and product descriptions.
  • Concerns about the lack of audio support in the models and its implications for multi-modal capabilities.
  • Some users find the setup process for using Amazon Nova via Bedrock cumbersome.
  • There are questions regarding the practical use cases and target audience for Amazon Nova.
28 comments
By @mikesurowiec - 4 months
A rough idea of the price differences...

  Per 1K tokens        Input     | Output
  Amazon Nova Micro:   $0.000035 | $0.00014
  Amazon Nova Lite:    $0.00006  | $0.00024
  Amazon Nova Pro:     $0.0008   | $0.0032

  Claude 3.5 Sonnet:   $0.003    | $0.015
  Claude 3.5 Haiku:    $0.0008   | $0.004
  Claude 3 Opus:       $0.015    | $0.075
Source: AWS Bedrock Pricing https://aws.amazon.com/bedrock/pricing/
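The gap in the table above is easy to quantify with a small cost calculator. The per-1K-token prices are copied from the comment; verify them against the AWS Bedrock pricing page before relying on them, since pricing changes over time.

```python
# Per-1K-token prices (input, output) in USD, taken from the comment above.
# Verify against https://aws.amazon.com/bedrock/pricing/ before relying on them.
PRICES_PER_1K = {
    "nova-micro": (0.000035, 0.00014),
    "nova-lite": (0.00006, 0.00024),
    "nova-pro": (0.0008, 0.0032),
    "claude-3.5-sonnet": (0.003, 0.015),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload given token counts."""
    inp, out = PRICES_PER_1K[model]
    return (input_tokens / 1000) * inp + (output_tokens / 1000) * out


# Example workload: 1M input tokens + 100K output tokens
# nova-micro:        0.035 + 0.014 = $0.049
# claude-3.5-sonnet: 3.000 + 1.500 = $4.50  (roughly 90x more)
```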
By @lukev - 4 months
This is a digression, but I really wish Amazon would be more normal in their product descriptions.

Amazon is rapidly developing its own jargon such that you need to understand how Amazon talks about things (and its existing product lineup) before you can understand half of what they're saying about a new thing. The way they describe their products seems almost designed to obfuscate what they really do.

Every time they introduce something new, you have to click through several pages of announcements and docs just to ascertain what something actually is (an API, a new type of compute platform, a managed SaaS product?)

By @jmward01 - 4 months
> No audio support: The models are currently trained to process and understand video content solely based on the visual information in the video. They do not possess the capability to analyze or comprehend any audio components that are present in the video.

This is blowing my mind. gemini-1.5-flash accidentally knows how to transcribe amazingly well, but it is -very- hard to figure out how to use it well, and now Amazon comes out with a Gemini Flash-like model that explicitly ignores audio. It is so clear that multi-modal audio would be easy for these models, but it is like they are purposefully holding back releasing/supporting it. This has to be a strategic decision not to attach audio, probably because the margins on ASR are too high to strip with a cheap LLM. I can only hope Meta will drop a multi-modal audio model to force this soon.

By @ndr_ - 4 months
Setting up AWS so you can try it via Amazon Bedrock API is a hassle, so I made a step-by-step guide: https://ndurner.github.io/amazon-nova. It's 14+ steps!
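Once the account setup is done, actually calling a Nova text model is short. Below is a minimal sketch using boto3's Bedrock Converse API; the model ID (`amazon.nova-micro-v1:0`) and region are assumptions, so check the model-access page in your Bedrock console for the values that apply to you.

```python
def build_converse_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build the keyword arguments for bedrock-runtime's converse() call."""
    return {
        "modelId": "amazon.nova-micro-v1:0",  # assumed Nova Micro model ID
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.2},
    }


def ask_nova(prompt: str) -> str:
    """Send a single-turn prompt to Nova Micro and return the text reply."""
    import boto3  # lazy import: the request builder works without boto3 installed

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.converse(**build_converse_request(prompt))
    return response["output"]["message"]["content"][0]["text"]


if __name__ == "__main__":
    # Requires AWS credentials and Bedrock model access to be configured first.
    print(ask_nova("Summarize the Amazon Nova announcement in one sentence."))
```

Because Converse is a uniform API across Bedrock models, swapping `modelId` for the Lite or Pro variant should be the only change needed for those models.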
By @scbenet - 4 months
By @zapnuk - 4 months
They missed a big opportunity by not offering EU-hosted versions.

That's a big thing for compliance. All LLM providers reserve the right to store prompts (for up to 30 days) and inspect/check them for their own compliance.

However, this means that company data is potentially stored outside the customer's cloud. This is already problematic, even more so when the storage location is outside the EU.

By @xnx - 4 months
More options/competition is good. When will we see it on https://lmarena.ai/ ?
By @zacharycohn - 4 months
I really wish they would left-justify instead of center-justify the pricing information so I'm not sitting here counting zeroes and trying to figure out how they all line up.
By @potlee - 4 months
> The Nova family of models were trained on Amazon's custom Trainium1 (TRN1) chips, NVIDIA A100 (P4d instances), and H100 (P5 instances) accelerators. Working with AWS SageMaker, we stood up NVIDIA GPU and TRN1 clusters and ran parallel trainings to ensure model performance parity

Does this mean they trained multiple copies of the models?

By @xendo - 4 months
Some independent latency and quality evaluations are already available at https://artificialanalysis.ai/. It looks to be cheap and fast.
By @HarHarVeryFunny - 4 months
Since Amazon is building its own frontier models, what's the point of their relationship with Anthropic?
By @baxtr - 4 months
As a side comment: the sound quality of the auto generated voice clip is really poor.

No match for Google's NotebookLM podcasts.

By @indigodaddy - 4 months
Unfortunate that this seems to be inextricably tied to Amazon Bedrock; you have to go through it in order to use the models.
By @adt - 4 months
By @blackeyeblitzar - 4 months
It would be nice if this was a truly open source model like OLMo: https://venturebeat.com/ai/truly-open-source-llm-from-ai2-to...
By @diggan - 4 months
> The model processes inputs up to 300K tokens in length [...] up to 30 minutes of video in a single request.

I wonder how fast it "glances" at an entire 30-minute video, and how long it takes until the first returned token. Anyone wager a guess?

By @htrp - 4 months
No parameter counts?
By @siquick - 4 months
Is there any difference in latency when calling models via Bedrock vs calling the providers APIs directly?
By @TheAceOfHearts - 4 months
They really should've tried to generate better video examples, those two videos that they show don't seem that impressive when you consider the amount of resources available to AWS. Like what even is the point of this? It's just generating more filler content without any substance. Maybe we'll reach the point where video generation gets outrageously good and I'll be proven wrong, but right now it seems really disappointing.

Right now when I see obviously AI generated images for book covers I take that as a signal of low quality. If AI generated videos continue to look this bad I think that'll also be a clear signal of low quality products.

By @astoilkov - 4 months
Any ideas on how to use the new models through JavaScript in the browser or Node.js?
By @m3kw9 - 4 months
Using an Amazon or Google Cloud API and forgot about it? Surprise bill in a few months.
By @Super_Jambo - 4 months
No embedding endpoints?
By @mrg3_2013 - 4 months
DOA

When marketing talks about price delta and not quality of output, it is DOA. For LLMs, quality is the more important metric, and Nova will be playing catch-up with the leaderboard forever.

By @smallnix - 4 months
Do these work with the bedrock converse API?
By @teilo - 4 months
So that's what I missed at the keynote.
By @jklinger410 - 4 months
It's really amusing how bad Amazon is at writing and designing UI. For a company of their size and scope it's practically unforgivable. But they always get away with it.
By @andrewstuart - 4 months
It's not clear what the use cases are for this, or who it is aimed at.