Changelog
Stay up to date with the latest features, improvements, and fixes.
v0.9.0 · April 1, 2026
Public Beta Launch
We're thrilled to announce the public beta of Inferactx! This release includes everything you need to deploy and scale AI models.
- New: Public beta availability for all users
- New: 500+ pre-optimized models in the model hub
- New: Auto-scaling from 0 to millions of requests
- New: Real-time monitoring dashboard
- New: Python and JavaScript SDKs (quick-start sketch below)
- Improved: Reduced cold start times by 80%
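As a quick taste of the new Python SDK, here is a minimal sketch of deploying a hub model and running one inference request. The `inferactx` package name, the `Client` class, and the `deploy`/`predict` methods are illustrative assumptions, not the documented API; see the SDK reference for the real interface.

```python
# Minimal sketch of the Python SDK workflow. The package name
# (inferactx), the Client class, and deploy()/predict() are
# hypothetical names; consult the SDK docs for the actual API.
from inferactx import Client

client = Client(api_key="YOUR_API_KEY")

# Deploy a pre-optimized model from the model hub.
deployment = client.deploy(model="llama-3-8b-instruct")

# Run a single inference request against the deployment.
result = deployment.predict({"prompt": "Summarize this changelog in one line."})
print(result)
```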
v0.8.5 · March 25, 2026
Performance Improvements
Major performance optimizations and bug fixes based on private beta feedback.
- Improved: 2x faster inference for LLaMA-3 models
- Improved: Batching efficiency
- Fixed: Memory leak in long-running deployments
- Fixed: Resolved WebSocket connection drops
v0.8.0 · March 15, 2026
Multi-Model Orchestration
Run multiple models together with intelligent routing and load balancing.
- New: Multi-model deployment support
- New: Intelligent request routing
- New: A/B testing for models (config sketch after this list)
- Improved: Better error messages and debugging
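As an illustration of A/B testing, a weighted split between a current model and a candidate might be declared like this. The `deploy_group` call, the `variants` list, and the `weight` field are hypothetical names for the sake of the sketch; the router then compares the two variants on live traffic.

```python
# Hypothetical sketch of a multi-model deployment with an A/B split.
# deploy_group(), variants, and weight are illustrative names only.
from inferactx import Client

client = Client(api_key="YOUR_API_KEY")

# Send 90% of requests to the current model and 10% to a candidate,
# so the two can be compared on real traffic before a full cutover.
group = client.deploy_group(
    name="summarizer",
    variants=[
        {"model": "llama-3-8b-instruct", "weight": 0.9},
        {"model": "llama-3-8b-instruct-v2", "weight": 0.1},
    ],
)

result = group.predict({"prompt": "Hello"})
print(result)
```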
v0.7.0 · March 1, 2026
Custom Model Support
Upload and deploy your own custom models alongside pre-built options.
- New: Custom model upload via CLI
- New: Support for PyTorch, TensorFlow, and ONNX (export sketch below)
- New: Automatic model optimization on upload
- Improved: Faster model warm-up times
- Fixed: Rate limiting calculation
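Since ONNX is one of the supported formats, a common preparation step is exporting a PyTorch model to ONNX before uploading it. The export below uses the standard `torch.onnx.export` API; the CLI command in the trailing comment is placeholder syntax, not the documented upload command.

```python
# Export a small PyTorch model to ONNX for upload as a custom model.
# torch.onnx.export is the standard PyTorch API; the CLI line in the
# final comment is placeholder syntax for the real upload command.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.eval()

dummy_input = torch.randn(1, 16)
torch.onnx.export(model, (dummy_input,), "classifier.onnx")

# Then upload it with the CLI, e.g. (placeholder syntax):
#   inferactx models upload classifier.onnx --name my-classifier
```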
v0.6.0 · February 15, 2026
Global Edge Network
Deploy models closer to your users for lower latency.
- New: Edge deployment in 12 regions (deploy sketch below)
- New: Automatic geo-routing
- New: Regional model replication
- Improved: 50% latency reduction for global users
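For example, a deployment might be replicated to a handful of edge regions like this; the `regions` parameter and the region identifiers are assumptions for illustration, since the changelog doesn't spell out the deploy options. With automatic geo-routing, each request is then served by the nearest replica.

```python
# Hypothetical sketch of an edge deployment with regional replication.
# The regions parameter and region identifiers are illustrative only.
from inferactx import Client

client = Client(api_key="YOUR_API_KEY")

# Replicate the model to three of the twelve edge regions; geo-routing
# sends each request to the closest one.
deployment = client.deploy(
    model="llama-3-8b-instruct",
    regions=["us-east", "eu-west", "ap-southeast"],
)
```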
v0.5.0 · February 1, 2026
Private Beta Launch
Initial release to private beta testers.
- New: Core inference API
- New: Basic monitoring
- New: Python SDK
- New: Dashboard v1