Build, Manage and Deploy AI/ML Systems
Updated Oct 24, 2025 · Python
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
Efficient Deep Learning Systems course materials (HSE, YSDA)
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.
🚀 Metadata tracking and UI service for Metaflow!
A Collection of GitHub Actions That Facilitate MLOps
Utilities for preprocessing text for deep learning with Keras
MONAI Deploy App SDK offers a framework and associated tools to design, develop and verify AI-driven applications in the healthcare imaging domain.
Deploy ML infrastructure and MLOps tooling anywhere, quickly and with best practices, using a single command.
Run GPU inference and training jobs on serverless infrastructure that scales with you.
The official Python package for NimbleBox. Exposes all APIs as CLIs and contains modules to make ML 🌸
Run GPU or batch jobs directly from your dev environment
A standalone inference server for trained Rubix ML estimators.
Kubeflow blog based on fastpages
Example ML projects that use the Determined library.
Render Jupyter Notebooks With Metaflow Cards
A tool for training models on Vertex on Google Cloud Platform.
GPU-aware inference mesh for large-scale AI serving
RFlow - A workflow framework for agile machine learning