aws-account-sharding

Here is 1 public repository matching this topic...

aws-samples / sample-resilient-llm-inference

This repo presents resilience patterns for scaling inference for Generative AI workloads on AWS: Bedrock cross-Region inference, AWS account sharding, and intelligent routing with LLM gateways.

fallback throttling load-balancing quotable-api litellm-ai-gateway bedrock-cross-region-inference genai-resilience aws-account-sharding

Updated Oct 2, 2025
Python
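
The account-sharding and fallback ideas named in the description above can be illustrated with a minimal sketch. This is not code from the listed repository. `AccountShardRouter`, `ThrottledError`, and `fake_invoke` are hypothetical names, and the per-account call is a plain callable standing in for a real inference client, so the example runs without any AWS dependency. It assumes each account has its own independent quota, so a throttled request can be retried against the next account.

```python
import itertools
from typing import Callable, Sequence


class ThrottledError(Exception):
    """Hypothetical stand-in for a provider throttling error (e.g. HTTP 429)."""


class AccountShardRouter:
    """Round-robin over account shards, falling back to the next on throttling."""

    def __init__(self, account_ids: Sequence[str]) -> None:
        if not account_ids:
            raise ValueError("at least one account is required")
        self._accounts = list(account_ids)
        # Each request starts at the next shard so load spreads evenly.
        self._start = itertools.cycle(range(len(self._accounts)))

    def invoke(self, call: Callable[[str], str]) -> str:
        first = next(self._start)
        count = len(self._accounts)
        for offset in range(count):
            account = self._accounts[(first + offset) % count]
            try:
                return call(account)
            except ThrottledError:
                continue  # this shard is out of quota; try the next one
        raise ThrottledError("all account shards are throttled")


def fake_invoke(account: str) -> str:
    """Hypothetical backend: pretends that 'acct-a' has exhausted its quota."""
    if account == "acct-a":
        raise ThrottledError(account)
    return f"served by {account}"


router = AccountShardRouter(["acct-a", "acct-b", "acct-c"])
responses = [router.invoke(fake_invoke) for _ in range(3)]
```

In a real deployment the callable would wrap a per-account client, and a gateway layer would typically add backoff and health tracking on top of simple rotation.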

