Models
Learn about Model Explorer in AI Quick Actions.
Under Models, you can find Model Explorer that shows all the foundation models supported by AI Quick Actions and your fine-tuned models. The model cards include a tag to indicate the family of shapes that are supported for the model. Enter text in the Search and filter models text box to search for a model in the list. Or click the text box and select an option on which to filter the list of models. Under My models are service cached foundation models, ready-to-register models, and models you have registered. Service cached models are models whose configuration have been verified by the Data Science team and are ready to be used without the downloading of model artifacts. Ready-to-register models are models whose configurations have been verified by the Data Science team, and which you can bring into AI Quick Actions through the model registration process. Under Fine-tuned models are models you have fine-tuned.
Service Cached Models
Service cached models have been tested by Data Science and the model artifacts are downloaded to a bucket in the service's object storage. They are ready to be used.
- codellama-34b-instruct-hf
- codellama-13b-instruct-hf
- codellama-7b-instruct-hf
-
mistralai/Mixtral-8x7b-v0.1
-
mistralai/Mistral-7b-Instruct-v0.3
- mixtral-8x7b-instruct-v0.1
- mistral-7b-instruct-v0.2
- mistral-7b-v0.1
- mistral-7b-instruct-v0.1
- falcon-7b
- phi-2
- Phi-3-mini-4k-instruct-fp16.gguf
- Phi-3-mini-4k-instruct-q4.gguf
- jais-13b
- falcon-40b-instruct
-
microsoft/Phi-3-vision-128k-instruct
-
microsoft/Phi-3-mini-128k-instruct
-
microsoft/Phi-3-mini-4k-instruct
-
microsoft/Phi-3-mini-4k-instruct-gguf-fp16
-
microsoft/Phi-3-mini-4k-instruct-gguf-q4
Ready-to-Register Models
Ready-to-Register models have been tested by Data Science, and they can be used in AI Quick Actions through the Model Registration process.
- core42/jais-13b-chat
- core42/jais-13b
- llama-3-70b-instruct
- llama-3-8b-instruct
-
meta-llama/Meta-Llama-3.1-8B
-
meta-llama/Meta-Llama-3.1-8B-Instruct
-
meta-llama/Meta-Llama-3.1-70B
-
meta-llama/Meta-Llama-3.1-70B-Instruct
-
meta-llama/Meta-Llama-3.1-405B-Instruct-FP8
-
meta-llama/Meta-Llama-3.1-405B-FP8
-
meta-llama/Llama-3.2-1B
-
meta-llama/Llama-3.2-1B-Instruct
-
meta-llama/Llama-3.2-3B
-
meta-llama/Llama-3.2-3B-Instruct
-
meta-llama/Llama-3.2-11B-Vision
-
meta-llama/Llama-3.2-90B-Vision
-
meta-llama/Llama-3.2-11B-Vision-Instruct
-
meta-llama/Llama-3.2-90B-Vision-Instruct
- meta-llama-3-8b
- meta-llama-3-70b
- elyza/ELYZA-japanese-Llama-2-13b-instruct
- elyza/ELYZA-japanese-Llama-2-7b-instruct
- elyza/ELYZA-japanese-Llama-2-13b
- elyza/ELYZA-japanese-Llama-2-7b
- google/gemma-1.1-7b-it
- google/gemma-2b-it
- google/gemma-2b
- google/gemma-7b
- google/codegemma-2b
- google/codegemma-1.1-7b-it
- google/codegemma-1.1-2b
- google/codegemma-7b
- intfloat/e5-mistral-7b-instruct
The meta-llama/Meta-Llama-3.1 and meta-llama/Llama-3.2 models aren't available in EU regions.
Working with Multimodal Models
AI Quick Actions supports deployment of multimodal models. For an example of deploying and testing a multimodal model, see the AI Quick Actions samples in the Data Science section on GitHub.
To work with image payload and mulitmodal models, when creating a Model Deployment, under Advanced
options, select /v1/chat/completions
as the Inference
Mode.