Integrate native auto-scaling capabilities for embedded Python workloads within InterSystems IRIS. This feature would dynamically allocate or deallocate resources (CPU, memory) to Python processes running inside IRIS containers based on real-time demand, without manual intervention. It would leverage Kubernetes-style orchestration logic but operate seamlessly within standalone IRIS deployments.
2️⃣ Target Audience
Developers building AI/ML microservices using IRIS Embedded Python.
System Administrators managing IRIS infrastructure for variable workloads.
Enterprises running resource-intensive Python workloads (e.g., real-time analytics, NLP, predictive modeling).
Problem: Python workloads in IRIS (e.g., PyTorch inference, pandas data processing) often experience unpredictable spikes in demand. Manually scaling resources is inefficient, leading to:
Over-provisioning: Wasted resources during low-usage periods.
Under-provisioning: Performance degradation or failures during peak loads.
Operational Overhead: DevOps teams must constantly monitor and adjust resources.
Efficiency: Optimizes resource usage (cost savings of 30–50% for cloud deployments).
Stability: Prevents crashes due to memory/CPU exhaustion during traffic surges.
Reliability: Ensures consistent performance for time-sensitive workloads (e.g., healthcare analytics).
Scalability: Enables hands-free handling of workload fluctuations.
Scenario: A hospital uses IRIS to process real-time patient data from IoT devices. Python scripts analyze ECG streams for anomalies using TensorFlow.
Without Auto-Scaling:
At 9:00 AM, 1,000 patients trigger simultaneous monitoring. Python processes overload, causing delays in critical alerts. Administrators scramble to manually allocate resources.
With Auto-Scaling:
Low Demand (3:00 AM):
5 active patient monitors → IRIS runs 2 Python pods (low CPU/memory).
Peak Demand (9:00 AM):
1,000 patients trigger monitoring → IRIS detects load spike.
Automatically spins up 20 Python pods in <10 seconds.
Post-Peak (10:00 AM):
Pods scale down to 5, freeing resources.
Outcome: ECG alerts are processed in real time, resource costs drop by 40%, and clinical staff receive uninterrupted insights.
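The scaling behavior in this scenario could be sketched as a simple threshold policy. The sketch below is illustrative only: the constants (`MIN_PODS`, `MAX_PODS`, `MONITORS_PER_POD`) and the `desired_pods` function are assumptions for this example, not part of any existing IRIS API.

```python
# Hypothetical threshold-based scaling policy matching the scenario above.
# All names and capacity figures are assumptions, not an IRIS feature.

MIN_PODS = 2           # floor during quiet periods (e.g. 3:00 AM)
MAX_PODS = 20          # ceiling during peak load (e.g. 9:00 AM)
MONITORS_PER_POD = 50  # capacity assumption: one pod handles ~50 ECG streams

def desired_pods(active_monitors: int) -> int:
    """Map current demand to a pod count, clamped to [MIN_PODS, MAX_PODS]."""
    needed = -(-active_monitors // MONITORS_PER_POD)  # ceiling division
    return max(MIN_PODS, min(MAX_PODS, needed))
```

With these assumptions, 5 active monitors yield the 2-pod floor, 1,000 monitors hit the 20-pod ceiling, and a post-peak load of around 250 monitors settles at 5 pods, mirroring the timeline above.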
Real-World Alignment: Targets booming AI/ML use cases in IRIS.
Cloud-Native: Aligns with DevOps trends (Kubernetes, serverless).
Tangible ROI: Reduces costs while boosting reliability.
Thank you for submitting the idea. The status has been changed to "Future consideration".
Stay tuned!
Should this be assigned to @Benjamin De Boe?
Since Embedded Python actually runs inside an irisdb process, isn't this really asking for auto-scaling of IRIS itself?