Changelog

Stay up to date with the latest features, improvements, and fixes.

v0.9.0 (April 1, 2026)

Public Beta Launch

We're thrilled to announce the public beta of Inferactx! This release includes everything you need to deploy and scale AI models.

  • New: Public beta availability for all users
  • New: 500+ pre-optimized models in the model hub
  • New: Auto-scaling from 0 to millions of requests
  • New: Real-time monitoring dashboard
  • New: Python and JavaScript SDKs
  • Improved: Reduced cold start times by 80%

v0.8.5 (March 25, 2026)

Performance Improvements

Major performance optimizations and bug fixes based on private beta feedback.

  • Improved: 2x faster inference for LLaMA-3 models
  • Improved: Batching efficiency
  • Fixed: Memory leak in long-running deployments
  • Fixed: Resolved WebSocket connection drops
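The batching improvement can be pictured with a minimal sketch of how requests are grouped before hitting the accelerator. This is an illustration only, not the actual Inferactx internals; the function name and signature are invented for the example:

```python
from typing import Any

def batch_requests(requests: list[Any], max_batch_size: int) -> list[list[Any]]:
    """Group incoming requests into batches of at most max_batch_size.

    Larger batches amortize per-call overhead on the accelerator,
    which is the idea behind a batching-efficiency improvement.
    """
    return [
        requests[i : i + max_batch_size]
        for i in range(0, len(requests), max_batch_size)
    ]

# Ten queued requests with a max batch size of 4 yield batches of 4, 4, and 2.
batches = batch_requests(list(range(10)), max_batch_size=4)
```
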

v0.8.0 (March 15, 2026)

Multi-Model Orchestration

Run multiple models together with intelligent routing and load balancing.

  • New: Multi-model deployment support
  • New: Intelligent request routing
  • New: A/B testing for models
  • Improved: Better error messages and debugging
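One common way to implement A/B testing for models is a deterministic, hash-based traffic split, so a given user always sees the same variant. The sketch below shows that technique under stated assumptions; it is not the Inferactx routing code, and `assign_variant` is an invented name:

```python
import hashlib

def assign_variant(user_id: str, split: float = 0.5) -> str:
    """Deterministically assign a user to model variant "a" or "b".

    Hashing the user ID keeps assignments stable across requests,
    which avoids a user flip-flopping between model versions.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "a" if bucket < split else "b"

# The same user ID always lands on the same variant.
variant = assign_variant("user-42")
```
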

v0.7.0 (March 1, 2026)

Custom Model Support

Upload and deploy your own custom models alongside pre-built options.

  • New: Custom model upload via CLI
  • New: Support for PyTorch, TensorFlow, and ONNX
  • New: Automatic model optimization on upload
  • Improved: Faster model warm-up times
  • Fixed: Rate-limiting calculation

v0.6.0 (February 15, 2026)

Global Edge Network

Deploy models closer to your users for lower latency.

  • New: Edge deployment in 12 regions
  • New: Automatic geo-routing
  • New: Regional model replication
  • Improved: 50% latency reduction for global users
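The selection step behind geo-routing can be sketched in a few lines: given latency measurements to each region, route the client to the closest one. A production geo-router would use client IP geolocation or anycast rather than explicit measurements; this example, including the `route_to_region` name, is illustrative only:

```python
def route_to_region(latencies_ms: dict[str, float]) -> str:
    """Pick the edge region with the lowest measured round-trip latency."""
    return min(latencies_ms, key=latencies_ms.get)

# A client with these round-trip times would be routed to eu-west.
region = route_to_region({"us-east": 120.0, "eu-west": 35.0, "ap-south": 210.0})
```
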

v0.5.0 (February 1, 2026)

Private Beta Launch

Initial release to private beta testers.

  • New: Core inference API
  • New: Basic monitoring
  • New: Python SDK
  • New: Dashboard v1