Choosing a Fine-Tuning Method in Generative AI

When you create a custom model, OCI Generative AI fine-tunes the pretrained base model that you select with a training method that matches that base model.

Important

Some OCI Generative AI foundational pretrained models supported for the dedicated serving mode are now deprecated and will retire no sooner than six months after the release of the first replacement model. You can fine-tune and host a foundational pretrained model on a dedicated AI cluster (dedicated serving mode) until that model is retired. For dedicated serving mode retirement dates, see Retiring the Models.

The following table lists the method that Generative AI uses to train each type of base model:

Pretrained Base Model | Training Method
cohere.command-r-16k
  • T-Few
meta.llama-3.1-70b-instruct
  • LoRA
cohere.command (deprecated)
  • T-Few
  • Vanilla
cohere.command-light (deprecated)
  • T-Few
  • Vanilla
meta.llama-3-70b-instruct (deprecated)
  • LoRA
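
The mapping in the table translates directly to the training config that you pass when creating a custom model. The following is a minimal sketch using the OCI Python SDK (the oci package); the OCIDs, Object Storage location, and display name are placeholders, and field names can vary by SDK version, so check the SDK reference before running it.

# Sketch: create a fine-tuned custom model with the OCI Python SDK.
# All OCIDs and Object Storage values below are placeholders.
import oci

config = oci.config.from_file()  # reads ~/.oci/config by default
client = oci.generative_ai.GenerativeAiClient(config)

# Pick the training config class that matches the base model, for example:
# TFewTrainingConfig for cohere.command-r-16k,
# LoraTrainingConfig for meta.llama-3.1-70b-instruct.
training_config = oci.generative_ai.models.TFewTrainingConfig()  # default hyperparameters

response = client.create_model(
    oci.generative_ai.models.CreateModelDetails(
        compartment_id="ocid1.compartment.oc1..example",
        display_name="my-tfew-custom-model",
        base_model_id="ocid1.generativeaimodel.oc1..example",  # cohere.command-r-16k
        fine_tune_details=oci.generative_ai.models.FineTuneDetails(
            dedicated_ai_cluster_id="ocid1.generativeaidedicatedaicluster.oc1..example",
            training_dataset=oci.generative_ai.models.ObjectStorageDataset(
                namespace_name="my-namespace",
                bucket_name="training-data",
                object_name="dataset.jsonl",
            ),
            training_config=training_config,
        ),
    )
)
print(response.data.lifecycle_state)

Swapping in LoraTrainingConfig or VanillaTrainingConfig selects the other methods for the base models that support them.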
Note

For information about the hyperparameters used for each training method, see Hyperparameters for Fine-Tuning a Model in Generative AI.

Choosing Between T-Few and Vanilla

For the cohere.command and cohere.command-light models, OCI Generative AI offers two training methods: T-Few and Vanilla. Use the following guidelines to choose the best training method for your use case.

Feature | Options and Recommendations
Training methods for cohere.command and cohere.command-light
  • T-Few
  • Vanilla
Dataset Size
  • Use T-Few for small datasets (a few thousand samples or fewer).
  • Use Vanilla for large datasets (from a hundred thousand to millions of samples).

Using small datasets with the Vanilla method might cause overfitting. Overfitting happens when the trained model performs well on the training data but can't generalize to unseen data.

Complexity
  • Use T-Few for format following or instruction following.
  • Use Vanilla to improve complex semantic understanding, such as improving a model's understanding of medical cases.
Hosting
  • Use T-Few if you plan to host several fine-tuned models on the same hosting dedicated AI cluster. If all the models are trained on the same base model, you can host them on the same cluster. This stacked-serving feature saves cost and performs well when user traffic to each T-Few fine-tuned model is relatively low. See Adding Endpoints to Hosting Clusters and the sketch after this table.
  • Each model that's fine-tuned with the Vanilla method requires its own hosting dedicated AI cluster.
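
Because stacked serving places several T-Few fine-tuned models on one hosting cluster, each model still gets its own endpoint. The following is a minimal sketch, again using the OCI Python SDK with placeholder OCIDs and hypothetical display names; verify the CreateEndpointDetails fields against the SDK reference for your version.

# Sketch: stacked serving. Two T-Few models fine-tuned from the same
# base model share one hosting dedicated AI cluster, each behind its
# own endpoint. All OCIDs are placeholders.
import oci

config = oci.config.from_file()
client = oci.generative_ai.GenerativeAiClient(config)

hosting_cluster_id = "ocid1.generativeaidedicatedaicluster.oc1..example"

for name, model_id in [
    ("support-summarizer", "ocid1.generativeaimodel.oc1..tfew1"),
    ("ticket-classifier", "ocid1.generativeaimodel.oc1..tfew2"),
]:
    endpoint = client.create_endpoint(
        oci.generative_ai.models.CreateEndpointDetails(
            compartment_id="ocid1.compartment.oc1..example",
            display_name=name,
            model_id=model_id,
            dedicated_ai_cluster_id=hosting_cluster_id,  # same cluster for both
        )
    )
    print(name, endpoint.data.lifecycle_state)

By contrast, each Vanilla fine-tuned model needs a hosting dedicated AI cluster of its own, so this pattern doesn't apply to Vanilla.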