Changelog
Stay up to date with the latest features, improvements, and fixes.
v0.9.0 · April 1, 2026
Public Beta Launch
We're thrilled to announce the public beta of Inferactx! This release includes everything you need to deploy and scale AI models.
- New: Public beta availability for all users
- New: 500+ pre-optimized models in the model hub
- New: Auto-scaling from 0 to millions of requests
- New: Real-time monitoring dashboard
- New: Python and JavaScript SDKs (quick-start sketch below)
- Improved: Reduced cold start times by 80%
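As a quick taste of the new Python SDK, here is a minimal sketch of deploying a hub model and running one inference request. The `inferactx` package name, the `Client` class, and the `deploy`/`predict` methods are illustrative assumptions, not the documented API; see the SDK reference for the real interface.

```python
# Minimal sketch of the Python SDK workflow. The package name
# (inferactx), the Client class, and deploy()/predict() are
# hypothetical names; consult the SDK docs for the actual API.
from inferactx import Client

client = Client(api_key="YOUR_API_KEY")

# Deploy a pre-optimized model from the model hub.
deployment = client.deploy(model="llama-3-8b-instruct")

# Run a single inference request against the deployment.
result = deployment.predict({"prompt": "Summarize this changelog in one line."})
print(result)
```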
v0.8.5 · March 25, 2026
Performance Improvements
Major performance optimizations and bug fixes based on private beta feedback.
- Improved: 2x faster inference for LLaMA-3 models
- Improved: Batching efficiency
- Fixed: Memory leak in long-running deployments
- Fixed: Resolved WebSocket connection drops
v0.8.0 · March 15, 2026
Multi-Model Orchestration
Run multiple models together with intelligent routing and load balancing.
- New: Multi-model deployment support
- New: Intelligent request routing
- New: A/B testing for models (config sketch after this list)
- Improved: Better error messages and debugging
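As an illustration of A/B testing, a weighted split between a current model and a candidate might be declared like this. The `deploy_group` call, the `variants` list, and the `weight` field are hypothetical names for the sake of the sketch; the router then compares the two variants on live traffic.

```python
# Hypothetical sketch of a multi-model deployment with an A/B split.
# deploy_group(), variants, and weight are illustrative names only.
from inferactx import Client

client = Client(api_key="YOUR_API_KEY")

# Send 90% of requests to the current model and 10% to a candidate,
# so the two can be compared on real traffic before a full cutover.
group = client.deploy_group(
    name="summarizer",
    variants=[
        {"model": "llama-3-8b-instruct", "weight": 0.9},
        {"model": "llama-3-8b-instruct-v2", "weight": 0.1},
    ],
)

result = group.predict({"prompt": "Hello"})
print(result)
```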
v0.7.0 · March 1, 2026
Custom Model Support
Upload and deploy your own custom models alongside pre-built options.
- New: Custom model upload via CLI
- New: Support for PyTorch, TensorFlow, and ONNX (export sketch below)
- New: Automatic model optimization on upload
- Improved: Faster model warm-up times
- Fixed: Rate limiting calculation
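Since ONNX is one of the supported formats, a common preparation step is exporting a PyTorch model to ONNX before uploading it. The export below uses the standard `torch.onnx.export` API; the CLI command in the trailing comment is placeholder syntax, not the documented upload command.

```python
# Export a small PyTorch model to ONNX for upload as a custom model.
# torch.onnx.export is the standard PyTorch API; the CLI line in the
# final comment is placeholder syntax for the real upload command.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.eval()

dummy_input = torch.randn(1, 16)
torch.onnx.export(model, (dummy_input,), "classifier.onnx")

# Then upload it with the CLI, e.g. (placeholder syntax):
#   inferactx models upload classifier.onnx --name my-classifier
```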
v0.6.0 · February 15, 2026
Global Edge Network
Deploy models closer to your users for lower latency.
- New: Edge deployment in 12 regions (deploy sketch below)
- New: Automatic geo-routing
- New: Regional model replication
- Improved: 50% latency reduction for global users
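For example, a deployment might be replicated to a handful of edge regions like this; the `regions` parameter and the region identifiers are assumptions for illustration, since the changelog doesn't spell out the deploy options. With automatic geo-routing, each request is then served by the nearest replica.

```python
# Hypothetical sketch of an edge deployment with regional replication.
# The regions parameter and region identifiers are illustrative only.
from inferactx import Client

client = Client(api_key="YOUR_API_KEY")

# Replicate the model to three of the twelve edge regions; geo-routing
# sends each request to the closest one.
deployment = client.deploy(
    model="llama-3-8b-instruct",
    regions=["us-east", "eu-west", "ap-southeast"],
)
```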
v0.5.0 · February 1, 2026
Private Beta Launch
Initial release to private beta testers.
- New: Core inference API
- New: Basic monitoring
- New: Python SDK
- New: Dashboard v1