Data Science AI Quick Actions v2.0

Data Science AI Quick Actions v2.0 includes:

Support for OpenAI Endpoint Model Deployment
Deploy models to multiple, configurable OpenAI endpoints, with support for streaming and advanced parameters.
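For illustration, a deployment that exposes an OpenAI-compatible chat completions route can be called with the standard openai Python client. This is a minimal sketch, not the AI Quick Actions API itself; the endpoint URL, credential, and model name are placeholders to replace with values from your own deployment.

```python
# Minimal sketch: call a model deployment that exposes an OpenAI-compatible
# chat completions route, with streaming enabled. The base_url, api_key, and
# model name are placeholders, not AI Quick Actions defaults.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-model-deployment-host>/v1",  # placeholder endpoint
    api_key="<your-api-key-or-token>",                   # placeholder credential
)

stream = client.chat.completions.create(
    model="<deployed-model-name>",  # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize what quantization does."}],
    stream=True,          # receive tokens as they are generated
    temperature=0.2,      # example of an advanced sampling parameter
    max_tokens=256,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```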
Stacked Model Deployment
Enable several fine-tuned variants to share the same base model deployment. This unified setup improves GPU usage compared to managing separate instances for each variant.
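The pattern behind a stacked deployment is that the base model weights are loaded once and each request names the fine-tuned variant it wants. A rough sketch against an OpenAI-compatible stacked endpoint might look like the following; the endpoint, credential, and variant names are assumptions for illustration only.

```python
# Sketch: send requests to different fine-tuned variants hosted on a single
# stacked deployment. Endpoint, credential, and variant names are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://<stacked-deployment-host>/v1",  # placeholder endpoint
    api_key="<your-api-key-or-token>",                # placeholder credential
)

# The base model weights stay loaded once on the GPU; each request selects a
# fine-tuned variant (for example, a LoRA adapter) via the model field.
for variant in ["<base-model>", "<fine-tuned-variant-a>", "<fine-tuned-variant-b>"]:
    reply = client.chat.completions.create(
        model=variant,  # hypothetical variant names for illustration
        messages=[{"role": "user", "content": "Which variant answered this?"}],
        max_tokens=64,
    )
    print(variant, "->", reply.choices[0].message.content)
```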
Support for Quantization
Use quantization to reduce memory requirements, enabling large language model deployments on smaller, cost-effective compute shapes.
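As a sketch of the general technique rather than the AI Quick Actions workflow itself, the example below loads a pre-quantized (AWQ) checkpoint with vLLM, one of the inference engines covered by this release. The model name and quantization format are assumptions for illustration.

```python
# Sketch: load a pre-quantized (AWQ) checkpoint with vLLM so the model fits on
# a smaller GPU shape. The model name is a placeholder; the weights must
# already be quantized in the chosen format.
from vllm import LLM, SamplingParams

llm = LLM(
    model="<org>/<model>-AWQ",    # placeholder: a pre-quantized checkpoint
    quantization="awq",           # low-bit weights cut memory vs. fp16
    gpu_memory_utilization=0.90,  # fraction of GPU memory vLLM may use
)

outputs = llm.generate(
    ["Explain why quantization lowers GPU memory requirements."],
    SamplingParams(max_tokens=128, temperature=0.2),
)
print(outputs[0].outputs[0].text)
```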
Support for Llama 4 Fine-Tuning
Fine-tune Llama 4 models for greater customization and control to address your unique AI needs.
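Managed fine-tuning of large models commonly uses a parameter-efficient approach such as LoRA; the sketch below shows that setup with the transformers and peft libraries, not the exact configuration AI Quick Actions runs. The model ID, target module names, and hyperparameters are illustrative assumptions.

```python
# Sketch: a parameter-efficient (LoRA) fine-tuning setup of the kind a managed
# fine-tuning job typically runs. Model ID, target modules, and hyperparameters
# are illustrative assumptions, not AI Quick Actions defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "<llama-4-checkpoint-id>"  # placeholder: a model you have access to
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Train small low-rank adapter matrices instead of updating all base weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```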
Support for vLLM 0.11 and llama.cpp 0.3.16
Updated versions of the vLLM and llama.cpp inference frameworks support your AI model deployment needs.
For more information, see the AI Quick Actions model documentation.