InterSystems Ideas
We love hearing from our users. Tell us what you want to see next and upvote ideas from the community.
* Bugs and troubleshooting requests should, as usual, go through InterSystems Support.
Status Future consideration
Categories InterSystems IRIS
Created by din ba
Created on Jul 3, 2025

Auto-Scaling for Embedded Python Workloads in IRIS

Integrate native auto-scaling for embedded Python workloads within InterSystems IRIS. This feature would dynamically allocate or deallocate resources (CPU, memory) to Python processes running inside IRIS containers based on real-time demand, without manual intervention. It would leverage Kubernetes-style orchestration logic while operating seamlessly within standalone IRIS deployments.

2️⃣ Target Audience

  • Developers building AI/ML microservices using IRIS Embedded Python.

  • System Administrators managing IRIS infrastructure for variable workloads.

  • Enterprises running resource-intensive Python workloads (e.g., real-time analytics, NLP, predictive modeling).


3️⃣ Problem Solved

Problem: Python workloads in IRIS (e.g., PyTorch inference, pandas data processing) often experience unpredictable spikes in demand. Manually scaling resources is inefficient, leading to:

  • Over-provisioning: Wasted resources during low-usage periods.

  • Under-provisioning: Performance degradation or failures during peak loads.

  • Operational Overhead: DevOps teams must constantly monitor and adjust resources.


4️⃣ Impact on Product Efficiency, Stability, and Reliability

  • Efficiency: Optimizes resource usage (cost savings of 30–50% for cloud deployments).

  • Stability: Prevents crashes due to memory/CPU exhaustion during traffic surges.

  • Reliability: Ensures consistent performance for time-sensitive workloads (e.g., healthcare analytics).

  • Scalability: Enables hands-free handling of workload fluctuations.


5️⃣ Specific Use Case Scenario

Scenario: A hospital uses IRIS to process real-time patient data from IoT devices. Python scripts analyze ECG streams for anomalies using TensorFlow.

  • Without Auto-Scaling:
    At 9:00 AM, 1,000 patients trigger simultaneous monitoring. Python processes overload, causing delays in critical alerts. Administrators scramble to manually allocate resources.

  • With Auto-Scaling:

    1. Low Demand (3:00 AM):

      • 5 active patient monitors → IRIS runs 2 Python pods (low CPU/memory).

    2. Peak Demand (9:00 AM):

      • 1,000 patients trigger monitoring → IRIS detects load spike.

      • Automatically spins up 20 Python pods in <10 seconds.

    3. Post-Peak (10:00 AM):

      • Pods scale down to 5, freeing resources.

Outcome: ECG alerts are processed in real-time, resource costs drop by 40%, and clinical staff receive uninterrupted insights.
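The scale-up/scale-down behavior in the scenario above can be sketched as a simple threshold-based policy. This is an illustrative assumption only, not an existing IRIS API: the function name `target_pods`, the `monitors_per_pod` ratio, and the pod limits are hypothetical values chosen to reproduce the numbers in the scenario.

```python
def target_pods(active_monitors: int,
                min_pods: int = 2,
                max_pods: int = 20,
                monitors_per_pod: int = 50) -> int:
    """Return the desired number of Python worker pods for the current load.

    Illustrative policy only: one pod per `monitors_per_pod` active
    patient monitors, clamped to the range [min_pods, max_pods].
    """
    needed = -(-active_monitors // monitors_per_pod)  # ceiling division
    return max(min_pods, min(max_pods, needed))

# Scenario from the idea above:
print(target_pods(5))     # 3:00 AM, 5 monitors    -> 2 pods
print(target_pods(1000))  # 9:00 AM, 1,000 monitors -> 20 pods
```

A real implementation would evaluate such a policy on a loop against live metrics (queue depth, CPU, memory) and reconcile the actual pod count toward the target, in the same way a Kubernetes HorizontalPodAutoscaler does.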


Why This Idea Stands Out

  • Real-World Alignment: Targets booming AI/ML use cases in IRIS.

  • Cloud-Native: Aligns with DevOps trends (Kubernetes, serverless).

  • Tangible ROI: Reduces costs while boosting reliability.

  • ADMIN RESPONSE
    Aug 20, 2025

    Thank you for submitting the idea. The status has been changed to "Future consideration".

    Stay tuned!

  • Daniel Palevski
    Aug 6, 2025

Should this be assigned to @Benjamin De Boe?

  • Raj Singh
    Jul 18, 2025

Since Embedded Python actually runs inside an irisdb process, isn't this really asking for auto-scaling of IRIS itself?