Data Science AI Quick Actions v2.0.1

Services: Data Science
Release Date: February 25, 2026

Data Science AI Quick Actions v2.0.1 includes the following updates:

Expanded Service-Cached Models: Granite 4.0-h: Data Science continues to expand its service-cached catalog. Granite 4.0-h models are now available as service-cached entries, so you no longer need to download model artifacts before deploying them. Because the service uses precached states for these models, deployments have lower cold-start latency and deliver more consistent inference performance for enterprise-scale AI applications.

Upgraded Inference Container: vLLM 0.13: The service-managed inference container in Data Science has been upgraded to vLLM 0.13. This version improves compatibility with the latest machine learning frameworks and increases operational efficiency for AI workloads. The upgrade also introduces memory management optimizations that support high-concurrency processing, consistent with the service's existing support for model deployments that use custom containers and environments.

LMCache Integration for Enhanced Performance: The service-managed container now includes LMCache, which is optimized for conversational workloads. With LMCache enabled, you can achieve a nearly 2× improvement in throughput and a reduction of more than 50% in time-to-first-token (TTFT) latency. These gains make multi-turn dialogues faster and more responsive.
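
To check figures like these against your own workload, you can compute TTFT and throughput from client-side timestamps. The following is a minimal sketch, not part of AI Quick Actions: the StreamTrace class, the helper functions, and the sample timestamps are all hypothetical, and the numbers are placeholders chosen for easy arithmetic, not measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StreamTrace:
    """Hypothetical record of one streamed response, in seconds."""
    request_sent: float       # when the client sent the request
    token_times: List[float]  # arrival time of each generated token


def time_to_first_token(trace: StreamTrace) -> float:
    """Seconds between sending the request and receiving the first token."""
    return trace.token_times[0] - trace.request_sent


def tokens_per_second(trace: StreamTrace) -> float:
    """Generated tokens divided by the total wall-clock time of the request."""
    elapsed = trace.token_times[-1] - trace.request_sent
    return len(trace.token_times) / elapsed


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return 100.0 * (before - after) / before


# Placeholder timestamps (assumed values, not benchmark results).
without_cache = StreamTrace(request_sent=0.0, token_times=[1.0, 1.5, 2.0])
with_cache = StreamTrace(request_sent=0.0, token_times=[0.5, 0.75, 1.0])

ttft_drop = percent_reduction(
    time_to_first_token(without_cache), time_to_first_token(with_cache)
)
speedup = tokens_per_second(with_cache) / tokens_per_second(without_cache)

print(f"TTFT reduction: {ttft_drop:.1f}%")
print(f"Throughput speedup: {speedup:.2f}x")
```

In practice you would record these timestamps over many multi-turn conversations and compare averages or percentiles, since a single request says little about cache behavior.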

AI Deployment Utilities: UI Policy Verification and Shape Calculator

To streamline the deployment lifecycle, two new utilities have been added to the AI Quick Actions suite:

UI Policy Verification Tool: Check the policies required for model registration, deployment, evaluation, and fine-tuning with one click to minimize configuration errors.
Shape Calculator: Get instant, data-driven shape recommendations for any model to ensure the infrastructure is correctly sized for your specific AI workloads.
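
For intuition about what sizing a shape involves, the sketch below applies a common back-of-the-envelope rule: weight memory is roughly the parameter count times the bytes per parameter, plus headroom for the KV cache and activations. This is an illustration only and is not the Shape Calculator's method. The function names, the 20% overhead fraction, and the example GPU memory sizes are assumptions.

```python
import math


def estimate_weight_memory_gb(num_params_billion: float,
                              bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (1 GB = 10^9 bytes).

    bytes_per_param=2 corresponds to 16-bit weights (FP16/BF16).
    """
    return num_params_billion * bytes_per_param


def min_gpus_needed(num_params_billion: float,
                    gpu_memory_gb: float,
                    bytes_per_param: int = 2,
                    overhead_fraction: float = 0.2) -> int:
    """Smallest GPU count whose combined memory fits weights plus overhead.

    overhead_fraction is an assumed allowance for KV cache and activations;
    real requirements depend on context length, batch size, and concurrency.
    """
    weights_gb = estimate_weight_memory_gb(num_params_billion, bytes_per_param)
    required_gb = weights_gb * (1 + overhead_fraction)
    return math.ceil(required_gb / gpu_memory_gb)


# Hypothetical examples: a 7B model on 24 GB GPUs, a 70B model on 80 GB GPUs.
print(min_gpus_needed(7, gpu_memory_gb=24))
print(min_gpus_needed(70, gpu_memory_gb=80))
```

An estimate like this is only a starting point. Use the Shape Calculator's recommendation for the actual deployment.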