Exploring mergekit for Model Merge and AutoEval for Model Evaluation | by Wenqi Glantz | Jan, 2024

My observations from experimenting with model merge, evaluation, and fine-tuning

Towards Data Science
Image generated by DALL-E 3 by the author

Let’s continue our learning journey through Maxime Labonne’s llm-course, which is pure gold for the community. This time, we will focus on model merging and evaluation.

Maxime has a great article titled Merge Large Language Models with mergekit. I highly recommend you check it out first. We will not repeat the steps he has already laid out in his article, but we will explore some details I came across that might be helpful to you.

We are going to experiment with model merging and model evaluation in the following steps:

  • Merge two models from the Hugging Face Hub, mistralai/Mistral-7B-Instruct-v0.2 and jan-hq/trinity-v1, using LazyMergekit.
  • Run AutoEval on the base model mistralai/Mistral-7B-Instruct-v0.2.
  • Run AutoEval on the merged model MistralTrinity-7b-slerp.
  • Fine-tune the merged model with a customized instruction dataset.
  • Run AutoEval on the fine-tuned model.
Diagram by author
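
As a reference point, a SLERP merge of two models like these can be described in a mergekit YAML config along the following lines. The layer ranges and interpolation weights here are illustrative placeholders, not the exact values used in this experiment:

```yaml
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: jan-hq/trinity-v1
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t:
    # Illustrative per-module interpolation schedules
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5   # default weight for everything else
dtype: bfloat16
```

Saved as `config.yaml`, such a config can be run locally with mergekit’s CLI, e.g. `mergekit-yaml config.yaml ./MistralTrinity-7b-slerp`; LazyMergekit wraps this same workflow in a Colab notebook.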

Let’s dive in.

First, how do we select which models to merge?

Determining whether two or more models can be merged involves evaluating several key considerations:

  1. Model Architecture: Model architecture is a crucial consideration when merging models. Ensure the models share a compatible architecture (e.g., both transformer-based). Merging dissimilar architectures is often challenging. The Hugging Face model card usually details a model’s architecture. If you cannot find the architecture info, you can fall back on trial and error with Maxime’s LazyMergekit, which we will explore later. If you encounter an error, it’s usually because the model architectures are incompatible.
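As a sketch of that compatibility check, you could compare the key fields of each model’s `config.json` (viewable on the Hub) before attempting a merge. The helper below is purely illustrative, assuming hand-copied config dicts; it is not part of mergekit’s API:

```python
# Hypothetical helper: compare the config.json fields that matter most
# for a merge. The function name and field choices are illustrative.
def merge_compatible(cfg_a: dict, cfg_b: dict) -> bool:
    """Return True if two model config dicts look mergeable."""
    critical_fields = (
        "architectures",        # e.g. ["MistralForCausalLM"]
        "hidden_size",          # embedding width must match
        "num_hidden_layers",    # layer counts must line up for slicing
        "num_attention_heads",
        "vocab_size",
    )
    return all(cfg_a.get(f) == cfg_b.get(f) for f in critical_fields)


# Values taken from Mistral-7B's published config.json
mistral_cfg = {
    "architectures": ["MistralForCausalLM"],
    "hidden_size": 4096,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "vocab_size": 32000,
}
# jan-hq/trinity-v1 is a Mistral-7B derivative, so its config matches
trinity_cfg = dict(mistral_cfg)

print(merge_compatible(mistral_cfg, trinity_cfg))  # True
```

A mismatch in any of these fields (say, a different `hidden_size`) is a strong signal that a straightforward layer-wise merge will fail.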
  2. Dependencies and Libraries: Ensure that…

