Version 1.0
LLM Fine Tuning with QLoRA - Dataset Generation
Using GPT-3.5 and GPT-4 to create a set of instructions and results that can be used for fine-tuning an LLM.
Obioma Anomnachi
Engineer @ Anant
Fine Tuning Overview
● Fine-tuning is a method of continuing the training of an already trained large language model
○ Less computationally intensive than the pre-training required to produce the base LLM
○ Originally viewed as a way of “teaching” facts to an LLM; this turned out to be less useful and practical than methods like retrieval-augmented generation (RAG)
○ More useful for shaping the output format of an LLM; these days it is mostly used by the original owners of LLMs to produce instruction-following or code-generating versions of their existing models
○ Also used by people working with diffusion image models to control art style, resolution, or other variables
LLM Architecture
● LLMs are transformers with attention, a type of deep neural network
● Like other deep learning models (image processing, GANs, etc.), they are composed of blocks with different structures for different purposes
● LoRA takes advantage of this structure by freezing the existing blocks in place and attaching new, smaller low-rank adapter sections in between, which are the only parts the training process is allowed to change
● QLoRA adds quantization to this process, storing the frozen base weights at reduced (4-bit) precision, with groups of nearby weights sharing quantization constants; this sharply reduces the memory needed for training, since only the small adapter weights are optimized at higher precision (a minimal code sketch follows)
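As a concrete illustration, here is a minimal sketch of what a QLoRA setup can look like in code. The Hugging Face transformers/peft/bitsandbytes stack, the base model name, and the hyperparameters are assumptions for demonstration; the slides do not name a specific library or model.

```python
# Minimal QLoRA setup sketch (stack, model name, and hyperparameters are
# illustrative assumptions, not the project's actual configuration).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantization config: frozen base weights are stored in 4-bit NF4,
# with blocks of nearby weights sharing quantization constants.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # hypothetical base model
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA config: freeze the existing blocks and attach small low-rank
# adapters to the attention projections; only these are trained.
lora_config = LoraConfig(
    r=16,                                 # rank of the adapter matrices
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # which blocks get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are a tiny fraction of the total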
Fine Tuning Types
● Instruction fine-tuning
○ Changes the way the model responds to prompts so that it follows commands rather than simply completing prompts with whatever text would naturally follow.
● Transfer learning
○ Fine-tuning on a minimal set of task-specific examples to create a marginal improvement over a pre-trained model.
● Task-specific fine-tuning
○ Fine-tuning on a specific task or domain with many more examples to work from. Runs the risk of catastrophic forgetting.
● Multi-task learning
○ Including all the desired tasks in the training set so that those specific tasks are not degraded by catastrophic forgetting.
● Sequential fine-tuning
○ Task-specific fine-tuning performed in sequence, moving from more general to more specific domains or tasks.
Instruction Format
● For instruction fine-tuning, we want specific instructions and results, even when the model is accomplishing a task other than instruction following.
○ So an instruction that tells the model to do sentiment analysis might look like this:
■ “Analyze the sentiment of the following text and identify if it is positive.”
○ This could be categorized as plain instruction fine-tuning, or we could conceptualize it as a sort of multi-task learning where the model is learning to respond to instructions while also protecting its skills at sentiment analysis or other specific tasks from catastrophic forgetting.
● Our instruction dataset is based around performing tasks on snippets from technical articles about Cassandra.
○ It is therefore composed of instruction, input, and output sections. This is similar to other instruction-tuning datasets like alpaca-cleaned, except that the input field is always populated (see the example record below).
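To make the format concrete, here is a hypothetical record in the instruction/input/output shape described above; the Cassandra snippet and wording are invented for illustration, and the prompt template follows the common alpaca-style convention rather than anything specified in the slides.

```python
# A hypothetical instruction-tuning record (contents are invented).
example = {
    "instruction": "Summarize the following excerpt from a technical "
                   "article about Cassandra.",
    "input": "Apache Cassandra is a distributed NoSQL database that "
             "achieves high availability through a masterless, "
             "peer-to-peer architecture...",
    "output": "The excerpt introduces Cassandra as a highly available, "
              "masterless, distributed NoSQL database.",
}

# At fine-tuning time the three fields are combined into a single training
# string, e.g. in the style used by alpaca-format training scripts:
prompt = (
    f"### Instruction:\n{example['instruction']}\n\n"
    f"### Input:\n{example['input']}\n\n"
    f"### Response:\n{example['output']}"
)
```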
Cassandra Article Dataset
● Starting from the collection of Cassandra articles powering Cassandra.Link, we used OpenAI models to build an instruction dataset
○ The collection covers a variety of topics related to Cassandra, so not every type of instruction is applicable to every article
■ For example, if an article includes definitions, we can instruct the model to explain a specific concept
Instruction Generation Process
● First, the entire article is attached to the main outer prompt and put through GPT-4
○ This prompt is used to determine which of the specific instruction types the article is suited for
○ We created 14 different instruction types, all of which require some context from the source article
○ The prompt is tuned to return minimal text in order to minimize the cost of using GPT-4
● We use traditional string-parsing code to extract the instruction type numbers from the GPT-4 result
● Then each valid instruction type is attached to a system prompt and the article text, and passed to GPT-3.5 with function calling, which ensures that the output is valid JSON
○ The result is a JSON object containing the instruction, input, and output fields that are later combined into a full instruction example for fine-tuning (a simplified sketch of this pipeline follows)
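Below is a simplified sketch of the two-stage pipeline just described, using the OpenAI Python client. The prompt wording, function schema, and helper names are illustrative placeholders, not the project's actual prompts.

```python
# Simplified sketch of the two-stage generation pipeline (prompt wording,
# function schema, and helper names are illustrative placeholders).
import json
import re
from openai import OpenAI

client = OpenAI()

def classify_instruction_types(article: str) -> list[int]:
    """Stage 1: GPT-4 reports which of the 14 instruction types fit the article."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Reply with only the numbers (1-14) of the instruction "
                       "types this article supports.\n\n" + article,
        }],
    )
    # Stage 2: plain string parsing pulls the type numbers out of the reply.
    return [int(n) for n in re.findall(r"\d+", response.choices[0].message.content)]

def generate_example(article: str, type_system_prompt: str) -> dict:
    """Stage 3: GPT-3.5 with function calling returns guaranteed-valid JSON."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": type_system_prompt},
            {"role": "user", "content": article},
        ],
        functions=[{
            "name": "record_example",
            "description": "Record one instruction-tuning example.",
            "parameters": {
                "type": "object",
                "properties": {
                    "instruction": {"type": "string"},
                    "input": {"type": "string"},
                    "output": {"type": "string"},
                },
                "required": ["instruction", "input", "output"],
            },
        }],
        function_call={"name": "record_example"},
    )
    return json.loads(response.choices[0].message.function_call.arguments)
```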
Demo
Resources
● https://github.com/Anant/llm-fine-tuning-qlora/tree/main
● https://www.analyticsvidhya.com/blog/2023/08/fine-tuning-large-language-models/
● https://huggingface.co/transformers/v4.10.1/custom_datasets.html
● https://wandb.ai/capecape/alpaca_ft/reports/How-to-Fine-Tune-an-LLM-Part-1-Preparing-a-Dataset-for-Instruction-Tuning--Vmlldzo1NTcxNzE2
● https://www.turing.com/resources/finetuning-large-language-models
● https://www.lakera.ai/blog/llm-fine-tuning-guide
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037