Amazon Nova Lite vs Amazon Nova Micro
Compare Amazon Nova Lite and Amazon Nova Micro across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for multimodal applications versus benchmark-led evaluation.
Overview Comparison
Structured side-by-side differences for the highest-signal model metadata.
Provider
The entity that currently provides this model.
Model ID
The routed model identifier exposed by upstream providers.
Input Context Window
The number of tokens supported by the input context window.
Maximum Output Tokens
The number of tokens that can be generated by the model in a single request.
Open Source
Whether the model's code is available for public use.
Release Date
When the model was first released.
Knowledge Cut-off Date
When the model's knowledge was last updated.
API Providers
The providers that currently expose the model through an API.
Modalities
Types of data each model can process or return.
Pricing Comparison
Compare current token pricing before you choose the cheaper or more scalable API option.
Capabilities Comparison
See where each model overlaps, where they differ, and which one supports more of the features you care about.
Benchmark Comparison
Shared benchmark rows make it easier to compare performance where both models have published scores.
| Benchmark | Amazon Nova Lite | Amazon Nova Micro |
|---|---|---|
| AIME 2024: American math olympiad problems | | |
| GPQA Diamond: PhD-level science questions (biology, physics, chemistry) | | |
| HLE: Questions that challenge frontier models across many domains | | |
| LiveCodeBench: Real-world coding tasks from recent competitions | | |
| MATH-500: Undergraduate and competition-level math problems | | |
| MMLU-Pro: Expert knowledge across 14 academic disciplines | | |
| SciCode: Scientific research coding and numerical methods | | |
What Reddit discussions say about Amazon Nova Lite vs Amazon Nova Micro
Amazon Nova Lite and Amazon Nova Micro are both surfacing live Reddit discussions, giving this comparison a community layer beyond specs and benchmarks.
The most visible threads right now are clustered in r/aws, r/LLMDevs, and r/BlackboxAI_. The feed below mixes discussion threads surfaced for each model so you can quickly spot where community sentiment overlaps or diverges.
Amazon just launched Nova 2 Lite models on Bedrock.
Now, you can use those models directly with Claude Code, and set automatic preferences on when to invoke the model for specific coding scenarios. Sample config below. This way you can mix/match different models based on coding use cases. Details in the demo folder here: [https://github.com/katanemo/archgw/tree/main/demos/use\_cases/claude\_code\_router](https://github.com/katanemo/archgw/tree/main/demos/use_cases/claude_code_router)
If you think this is useful, then don't forget to star the project 🙏
# Anthropic Models
- model: anthropic/claude-sonnet-4-5
  access_key: $ANTHROPIC_API_KEY
  routing_preferences:
    - name: code understanding
      description: understand and explain existing code snippets, functions, or libraries
- model: amazon_bedrock/us.amazon.nova-2-lite-v1:0
  default: true
  access_key: $AWS_BEARER_TOKEN_BEDROCK
  base_url: https://bedrock-runtime.us-west-2.amazonaws.com
  routing_preferences:
    - name: code generation
      description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
- model: anthropic/claude-haiku-4-5
  access_key: $ANTHROPIC_API_KEY
None of the MS models seem to be working for me. I get an error like:
`[API Error: 404 litellm.NotFoundError: NotFoundError: OpenrouterException - {"error":{"message":"No endpoints found that support tool use. To learn more about provider routing, visit:`
`https://openrouter.ai/docs/guides/routing/provider-selection","code":404}}. Received Model Group=blackboxai/microsoft/phi-4Available Model Group Fallbacks=None]`
Separately, the amazon/nova-lite-v1 model is s\*\*t... Offers vague recommendations and no specific fix for any code.
Today, the AI model that hallucinates the least is Google Gemini 2.0 Flash 001, with a factual consistency rate of 99.3%. This score is encouraging because it means that we're relatively close to solving the hallucination problem.
https://github.com/vectara/hallucination-leaderboard
What would happen if we built an AI agent that would first query Google Gemini 2.5 Pro about something, (because it is currently the most powerful model, completely dominating the Chatbot Arena Leaderboard by almost 40 points) and then ran the answer it generated by other models to catch any inaccuracies it may have generated?
https://lmarena.ai/?leaderboard
We presume that the different AI developers use different data sets to build their models, so while one may hallucinate about a certain query, it's possible that another would not. What would happen if we instructed our AI agent to run the content Gemini 2.5 generated through the next ten models by other developers, asking them each to analyze the answer for factual consistency?
Could this be a way to arrive at a factual consistency for answers that is perhaps 99.9% or higher? Could this be done relatively inexpensively and completely automatically?
Below are ten top models that our AI agent would run Gemini 2.5's answer through, ranked according to their factual consistency rate score. I asked 2.5 to comment on the usefulness of this approach, and point out some possible flaws and limitations, and end this post with its answer. While the technique may be relatively expensive, in some cases the usefulness of the accuracy may justify the extra expense.
OpenAI-o3-mini-high-reasoning - 99.2%
Zhipu AI GLM-4-9B-Chat - 98.7%
Amazon Nova-Micro-V1 - 98.4%
Qwen2-72B-Instruct - 95.3%
Mixtral-8x22B-Instruct-v0.1 - 95.3%
Anthropic Claude-3-5-haiku - 95.1%
01-AI Yi-1.5-9B-Chat - 95.1%
Cohere Command-R - 95.1%
IBM Granite-3.0-8B-Instruct - 93.5%
Microsoft phi-2 - 93.3%
Gemini 2.5 Pro:
"Okay, here's a concise summary of the analysis:
Your proposal involves using Gemini 2.5 Pro to generate an answer and then having 10 other models (ranked high for factual consistency) verify it. The goal is automated, inexpensive, near-perfect (99.9%+) factual accuracy.
Potential & Feasibility:
* Improved Accuracy: Yes, this ensemble approach could significantly increase factual accuracy compared to using a single model by catching diverse errors. Reaching 99.9%+ is ambitious but directionally plausible.
* Automation: The process is fully automatable using APIs.
Key Challenges & Limitations:
* Cost & Speed: Running 11 models per query will be substantially more expensive and slower than using one. "Relatively inexpensive" is unlikely.
* Consensus Logic: Deciding how to handle disagreements among the 10 verifiers is complex. A simple majority vote might miss valid corrections or reinforce errors.
* Verifier Imperfection: The verification models themselves aren't perfect and can make mistakes (hallucinate) or fail to spot errors.
* Shared Errors: Models might share common misconceptions from similar training data, leading to false consensus.
* Complexity: Designing effective verification prompts and consensus mechanisms is non-trivial.
In Short:
The idea leverages model diversity for better accuracy and is automatable. However, it faces significant hurdles in cost, speed, and the complexity of managing verification results. While promising, it's not a simple or cheap solution for achieving near-perfect factual consistency."
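The verification loop described in this thread can be sketched in a few lines. This is a minimal illustration, not the poster's implementation: the `Verifier` type, the `min_agreement` threshold, and the two toy verifiers are all hypothetical stand-ins for real model API calls, and a plain agreement threshold is only one of several possible consensus rules.

```python
from collections import Counter
from typing import Callable, Sequence

# A verifier takes (question, answer) and returns True if it judges the answer
# factually consistent. In practice each one would wrap a call to a different
# model's API; here they are hypothetical stand-ins.
Verifier = Callable[[str, str], bool]


def verify_with_ensemble(
    question: str,
    answer: str,
    verifiers: Sequence[Verifier],
    min_agreement: float = 0.8,  # assumed threshold, not from the thread
) -> dict:
    """Collect one verdict per verifier and apply a simple agreement threshold."""
    if not verifiers:
        raise ValueError("at least one verifier is required")
    verdicts = [verifier(question, answer) for verifier in verifiers]
    counts = Counter(verdicts)
    agreement = counts[True] / len(verdicts)
    return {
        "accepted": agreement >= min_agreement,
        "agreement": agreement,
        "flags": counts[False],  # number of verifiers that objected
    }


# Toy stand-ins for real model calls, used only to exercise the logic.
def always_agrees(question: str, answer: str) -> bool:
    return True


def checks_year(question: str, answer: str) -> bool:
    return "1969" in answer


result = verify_with_ensemble(
    "When did Apollo 11 land on the Moon?",
    "Apollo 11 landed in 1969.",
    [always_agrees, checks_year, always_agrees],
)
print(result["accepted"], result["flags"])
```

The `always_agrees` verifier shows the "verifier imperfection" problem raised above: a verifier that never objects adds cost without adding signal, so the threshold only helps when the verifiers fail in different ways.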
Hey everyone,
I just ran into an issue trying to call **Claude Sonnet 4.5** via the **AWS Bedrock Runtime API**, and I’m hoping someone here might have insights or has faced the same thing.
**Setup:**
* **Account type:** Channel program account (via AWS Partner / Distributor)
* **Region:** `us-east-1`
* **API key:** Valid — works fine for `amazon.nova-micro-v1:0`
* **Model I’m calling:** `anthropic.claude-sonnet-4-5-20250929-v1:0`
Here’s the cURL command I used:
curl -X POST "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-sonnet-4-5-20250929-v1:0/converse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <valid-token>" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello"}]
      }
    ]
  }'
And here’s the **error response** I got back:
{
"message": "Invocation of model ID anthropic.claude-sonnet-4-5-20250929-v1:0 with on-demand throughput isn't supported. Retry your request with the ID or ARN of an inference profile that contains this model."
}
After reaching out to AWS Support, I also got this message:
>
Has anyone here successfully accessed Claude Sonnet 4.5 under a channel program account, or know how to obtain the required inference profile ARN?
It seems I can't use any Claude model variant, but I can use the AWS Nova variants.
Any clarification or workaround would be super appreciated 🙏
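The error in this thread asks for the ID or ARN of an inference profile in place of the bare model ID. The sketch below only builds the Converse request and does not send it. It assumes a US cross-region inference profile exists for this model and follows the `us.` prefix pattern seen in the Nova 2 Lite config earlier in this feed. `build_converse_request` and `ASSUMED_PROFILE_ID` are hypothetical names, and whether a channel program account is allowed to use such a profile is not something this sketch can answer.

```python
import json
from urllib.parse import quote


def build_converse_request(model_or_profile_id: str, text: str, region: str = "us-east-1"):
    """Return (url, body) for a Bedrock Runtime Converse call, without sending it."""
    # Percent-encode the ID so ARNs (which contain "/") are safe in the URL path.
    path_id = quote(model_or_profile_id, safe="")
    url = f"https://bedrock-runtime.{region}.amazonaws.com/model/{path_id}/converse"
    body = json.dumps(
        {"messages": [{"role": "user", "content": [{"text": text}]}]}
    )
    return url, body


# Assumption: the profile ID is the model ID with a "us." prefix.
# Confirm the real ID or ARN available to your account before relying on it.
ASSUMED_PROFILE_ID = "us.anthropic.claude-sonnet-4-5-20250929-v1:0"

url, body = build_converse_request(ASSUMED_PROFILE_ID, "Hello")
print(url)
```

The returned URL and body can be passed to the same cURL call shown in the post, keeping the existing `Authorization` header.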
AI tools related to Amazon Nova Lite vs Amazon Nova Micro
These tools are closely connected to one or both models in this comparison and can help you evaluate real-world fit.
PartyRock
PartyRock is a playground powered by Amazon Bedrock that allows you to build AI-generated apps. It offers a fast, engaging way to explore generative AI, providing access to foundation models through an intuitive, code-free interface designed for learning prompt engineering and AI fundamentals.
StoryBee
StoryBee is an AI-powered story generator designed to spark creativity and imagination in children. The platform enables users to create personalized children's stories, bedtime tales, and educational narratives in seconds by providing a simple hint or theme. It is built for parents, teachers, and young readers.
GPT-trainer
GPT-trainer is an AI chatbot builder that enables users to create custom chatbots trained on their own data. It supports multiple data ingestion methods, including direct file uploads, cloud drive imports, URL scraping, and manual text entry. These chatbots can be embedded on websites or integrated into Slack to provide context-aware responses, with a focus on accuracy, data privacy, and seamless platform integration.
Unifyr
Unifyr is a data aggregation platform that provides executives with a 360-degree view of their business operations and automates reporting. By syncing your existing tech stack, the platform enables you to build dashboards and share insights, effectively removing the need for manual data collection. Leveraging AI, Unifyr converts complex data into actionable insights and improved productivity.
Which model should you choose?
Use the summary below to decide which model better fits your workflow, budget, and feature requirements.
Amazon Nova Lite
Amazon Nova Lite is a stronger fit for tool-augmented workflows, multimodal applications, and cost-efficient scale.
Amazon Nova Micro
Amazon Nova Micro is a stronger fit for tool-augmented workflows, cost-efficient scale, and benchmark-led evaluation.
Both models are listed as a fit for tool-augmented workflows and cost-efficient scale. Choose Amazon Nova Lite if you also need multimodal applications. Choose Amazon Nova Micro if your workflow depends more on benchmark-led evaluation.
Common questions about Amazon Nova Lite vs Amazon Nova Micro
What is the main difference between Amazon Nova Lite and Amazon Nova Micro?
Both models are listed for tool-augmented workflows and cost-efficient scale. The main difference is that Amazon Nova Lite also leans toward multimodal applications, while Amazon Nova Micro is better suited to benchmark-led evaluation.
Which model is cheaper: Amazon Nova Lite or Amazon Nova Micro?
Amazon Nova Micro starts lower on input pricing at $0.0400 per 1M input tokens, compared with $0.0600 for Amazon Nova Lite.
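To see what that price gap means for a given workload, the arithmetic can be sketched as follows. The two prices are the input prices listed on this page. The `monthly_tokens` figure is a hypothetical workload, and output-token pricing is not included because it is not stated here.

```python
# Input prices in USD per 1M tokens, as listed on this page.
INPUT_PRICE_PER_MILLION = {
    "Amazon Nova Lite": 0.06,
    "Amazon Nova Micro": 0.04,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Estimate input-token cost only; output pricing is not included."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION[model]


monthly_tokens = 500_000_000  # hypothetical workload
for model in INPUT_PRICE_PER_MILLION:
    print(f"{model}: ${input_cost_usd(model, monthly_tokens):.2f}")
```

At this volume the listed prices work out to roughly $30 of input cost for Amazon Nova Lite versus $20 for Amazon Nova Micro.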
Which model has the larger context window: Amazon Nova Lite or Amazon Nova Micro?
Amazon Nova Lite is listed with a context window of 300,000, while Amazon Nova Micro is listed with 128,000.
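A quick way to apply these limits is to check which model can hold a prompt of a given size. This is a rough sketch using the context window figures listed above. `models_that_fit` and the `reserve_tokens` headroom parameter are hypothetical, and token counts depend on each model's tokenizer, so treat the result as an estimate.

```python
# Context window sizes in tokens, as listed on this page.
CONTEXT_WINDOW_TOKENS = {
    "Amazon Nova Lite": 300_000,
    "Amazon Nova Micro": 128_000,
}


def models_that_fit(prompt_tokens: int, reserve_tokens: int = 0) -> list[str]:
    """Return the models whose listed context window can hold the prompt.

    reserve_tokens is optional headroom (an assumption, e.g. for a system
    prompt or tool definitions) added on top of the prompt size.
    """
    needed = prompt_tokens + reserve_tokens
    return [
        model
        for model, window in CONTEXT_WINDOW_TOKENS.items()
        if needed <= window
    ]


print(models_that_fit(100_000))  # both models fit
print(models_that_fit(200_000))  # only Amazon Nova Lite fits
```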
How should I evaluate Amazon Nova Lite vs Amazon Nova Micro for my use case?
This comparison currently includes 7 shared benchmark rows, helping you compare practical performance across overlapping evaluations.