Amazon launched the Amazon Nova series of AI models during the AWS re:Invent conference in Las Vegas on December 3, 2024. The Nova family includes several models aimed at providing a wide range of capabilities from text generation to multimedia content creation:
- Amazon Nova Micro: A text-only model designed for low-latency responses at a very low cost. It’s suited for tasks like text summarization, translation, and basic coding, with a context length of 128K tokens.
- Amazon Nova Lite: A very low-cost, multimodal model that processes image, video, and text inputs to generate text outputs. It’s praised for its speed and cost-efficiency, making it ideal for interactive applications requiring quick responses.
- Amazon Nova Pro: This model combines high accuracy with speed and cost-effectiveness for a variety of tasks. It’s multimodal, supporting inputs of text, image, and video, and can handle complex tasks like video summarization, mathematical reasoning, and software development.
- Amazon Nova Premier: Scheduled for release in early 2025, this will be Amazon’s most capable multimodal model, focusing on complex reasoning tasks and serving as an excellent teacher model for custom model distillation.
Table of Contents
Additionally, Amazon introduced two creative models:
- Amazon Nova Canvas: Focused on generating studio-quality images from text or image prompts, offering features like inpainting, outpainting, and background removal.
- Amazon Nova Reel: A video generation model capable of producing short videos from text prompts or images, with controls over visual style and pacing. It’s noted for outperforming similar models in human evaluations, particularly in video quality and consistency.
These models are available through Amazon Bedrock, AWS’s platform for foundational AI models, which simplifies integration with existing AWS infrastructure. Amazon emphasizes the cost-effectiveness and speed of these models, claiming they are at least 75% cheaper and faster than competitors. The models also support fine-tuning and knowledge distillation for customization to specific business needs. Amazon Nova models are built with integrated safety measures, including watermarking for responsible AI use.
This launch positions Amazon more aggressively in the AI sector, directly competing with other tech giants in generative AI technology. Amazon’s strategy appears to focus on providing scalable, cost-effective solutions tailored to enterprise needs, leveraging its cloud infrastructure prowess to gain an edge.
Some details about Amazon Nova Pro
Amazon Nova Pro is a key model within Amazon’s newly announced “Nova” family of AI foundation models, introduced at the AWS re:Invent conference on December 3, 2024. Here are the detailed features and capabilities of Amazon Nova Pro:
- Multimodal Capabilities: Amazon Nova Pro is a highly capable multimodal model, meaning it can process text, images, and videos as inputs to generate text outputs. This makes it versatile for a wide array of applications that require understanding and interaction across different types of data.
- Performance: It’s noted for its balance of accuracy, speed, and cost. Amazon claims that Nova Pro achieves state-of-the-art performance across various benchmarks including visual question answering and video understanding. It’s also highlighted for its prowess in analyzing financial documents and processing code bases.
- Context Window: Nova Pro has a context length of up to 300K tokens, which allows it to handle complex tasks involving large amounts of text or multimedia, like processing over fifteen thousand lines of code or 30 minutes of video in a single request.
- Agentic Workflows: The model is particularly effective in agentic workflows, where it can call APIs and tools to execute complex, multi-step tasks. This is crucial for applications like AI assistants that need to interact with external systems or perform actions based on user queries.
- Fine-tuning and Distillation: It supports fine-tuning on text, image, and video inputs, allowing for customization based on proprietary data. Moreover, Nova Pro can serve as a “teacher” model for knowledge distillation, enabling the creation of smaller, more efficient models (like Nova Micro and Lite) without compromising on accuracy for specific use cases.
- Language Support: Amazon Nova Pro, like other Nova models, supports understanding and generation across over 200 languages, with particular strengths in major languages like English, German, and Spanish.
- Cost and Speed: Amazon states that Nova Pro is among the fastest models in its class within Amazon Bedrock and is at least 75% less expensive than competing models with similar capabilities.
- Integration with Amazon Bedrock: It’s available through Amazon Bedrock, Amazon’s platform for foundation models, where developers can easily integrate these models into their applications, manage data flows, and leverage AWS’s infrastructure for AI development.
- Responsible AI Use: Amazon emphasizes that its models, including Nova Pro, come with built-in controls like watermarking to ensure responsible use, aiming to combat issues like misinformation and harmful content generation.
Amazon Nova understanding models
Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are understanding models that accept text, image, or video inputs and generate text output. They provide a broad selection of capability, accuracy, speed, and cost operation points.
- Fast and cost-effective inference across intelligence classes
- State-of-the-art text, image, and video understanding
- Fine-tuning on text, image, and video input
- Leading agentic and multimodal retrieval augmented generation (RAG) capabilities
- Easy integration to proprietary data and applications with Amazon Bedrock
Learn more: Benchmarks and examples
Amazon Nova creative content generation models
Amazon Nova Canvas and Amazon Nova Reel are creative content generation models that accept text and image inputs and produce image or video outputs. They are designed to deliver customizable high-quality images and videos for visual content generation.
- State-of-the-art image and video generation
- Control over your visual content generation
- Multiple approaches to customize and edit visual content
- Support for safe and responsible use of AI with watermarking and content moderation
Model versions
Amazon Nova Micro
Amazon Nova Micro is a text only model that delivers the lowest latency responses at very low cost. It is highly performant at language understanding, translation, reasoning, code completion, brainstorming, and mathematical problem-solving. With its generation speed of over 200 tokens per second, Amazon Nova Micro is ideal for applications that require fast responses.
Max tokens: 128k
Languages: 200+ languages
Fine-tuning supported: Yes, with text input.
Amazon Nova Lite
Amazon Nova Lite is a very low-cost multimodal model that is lightning fast for processing image, video, and text inputs. Amazon Nova Lite’s accuracy across a breadth of tasks, coupled with its lightning-fast speed, makes it suitable for a wide range of interactive and high-volume applications where cost is a key consideration.
Max tokens: 300k
Languages: 200+ languages
Fine-tuning supported: Yes, with text, image, and video input.
Amazon Nova Pro
Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Pro’s capabilities, coupled with its industry-leading speed and cost efficiency, makes it a compelling model for almost any task, including video summarization, Q&A, mathematical reasoning, software development, and AI agents that can execute multi-step workflows. In addition to state-of-the-art accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction following and agentic workflows as measured by Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web.
Max tokens: 300k
Languages: 200+ languages
Fine-tuning supported: Yes, with text, image, and video input.
Amazon Nova Premier
Coming soon
Amazon Nova Canvas
Amazon Nova Canvas is a state-of-the-art image generation model that creates professional grade images from text or images provided in prompts. Amazon Nova Canvas also provides features that make it easy to edit images using text inputs, controls for adjusting color scheme and layout, and built-in controls to support safe and responsible use of AI.
Max input characters: 1024
Languages: English
Fine-tuning supported: Coming soon
Amazon Nova Reel
Amazon Nova Reel is a state-of-the-art video generation model that allows customers to easily create high quality video from text and images. Amazon Nova Reel supports use of natural language prompts to control visual style and pacing, including camera motion control, and built-in controls to support safe and responsible use of AI.
Max input characters: 512
Languages: English
Fine-tuning supported: Coming soon