Integrate native auto-scaling capabilities for embedded Python workloads within InterSystems IRIS. This feature would dynamically allocate or deallocate resources (CPU, memory) to Python processes running inside IRIS containers based on real-time demand, without manual intervention. It would leverage Kubernetes-style orchestration logic but operate seamlessly within standalone IRIS deployments.
2️⃣ Target Audience
Developers building AI/ML microservices using IRIS Embedded Python.
System Administrators managing IRIS infrastructure for variable workloads.
Enterprises running resource-intensive Python workloads (e.g., real-time analytics, NLP, predictive modeling).
Problem: Python workloads in IRIS (e.g., PyTorch inference, pandas data processing) often experience unpredictable spikes in demand. Manually scaling resources is inefficient, leading to:
Over-provisioning: Wasted resources during low-usage periods.
Under-provisioning: Performance degradation or failures during peak loads.
Operational Overhead: DevOps teams must constantly monitor and adjust resources.
Efficiency: Optimizes resource usage (cost savings of 30–50% for cloud deployments).
Stability: Prevents crashes due to memory/CPU exhaustion during traffic surges.
Reliability: Ensures consistent performance for time-sensitive workloads (e.g., healthcare analytics).
Scalability: Enables hands-free handling of workload fluctuations.
Scenario: A hospital uses IRIS to process real-time patient data from IoT devices. Python scripts analyze ECG streams for anomalies using TensorFlow.
Without Auto-Scaling:
At 9:00 AM, 1,000 patients trigger simultaneous monitoring. Python processes overload, causing delays in critical alerts. Administrators scramble to manually allocate resources.
With Auto-Scaling:
Low Demand (3:00 AM):
5 active patient monitors → IRIS runs 2 Python pods (low CPU/memory).
Peak Demand (9:00 AM):
1,000 patients trigger monitoring → IRIS detects load spike.
Automatically spins up 20 Python pods in <10 seconds.
Post-Peak (10:00 AM):
Pods scale down to 5, freeing resources.
Outcome: ECG alerts are processed in real time, resource costs drop by 40%, and clinical staff receive uninterrupted insights.
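The scaling behavior in this scenario could be sketched as a simple threshold policy. The sketch below is illustrative only: the constants (`MIN_PODS`, `MAX_PODS`, `MONITORS_PER_POD`) and the `desired_pods` function are assumptions for this example, not part of any existing IRIS API.

```python
# Hypothetical threshold-based scaling policy matching the scenario above.
# All names and capacity figures are assumptions, not an IRIS feature.

MIN_PODS = 2           # floor during quiet periods (e.g. 3:00 AM)
MAX_PODS = 20          # ceiling during peak load (e.g. 9:00 AM)
MONITORS_PER_POD = 50  # capacity assumption: one pod handles ~50 ECG streams

def desired_pods(active_monitors: int) -> int:
    """Map current demand to a pod count, clamped to [MIN_PODS, MAX_PODS]."""
    needed = -(-active_monitors // MONITORS_PER_POD)  # ceiling division
    return max(MIN_PODS, min(MAX_PODS, needed))
```

With these assumptions, 5 active monitors yield the 2-pod floor, 1,000 monitors hit the 20-pod ceiling, and a post-peak load of around 250 monitors settles at 5 pods, mirroring the timeline above.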
Real-World Alignment: Targets booming AI/ML use cases in IRIS.
Cloud-Native: Aligns with DevOps trends (Kubernetes, serverless).
Tangible ROI: Reduces costs while boosting reliability.
Thank you for submitting the idea. The status has been changed to "Future consideration".
Stay tuned!
Should this be assigned to @Benjamin De Boe?
Since Embedded Python actually runs inside an irisdb process, isn't this really asking for auto-scaling of IRIS itself?