Domain Adaptation of LLMs Using RAFT, the Hybrid of RAG and SFT

We are witnessing the arrival of new and improved Large Language Models (LLMs) on a frequent basis, and with every release they improve on previous benchmarks. Their impressive advancements in general-purpose tasks are being harnessed for practical applications across diverse industries. However, their knowledge is restricted to the corpus on which they were trained. When exposed to questions with no support in that training data, they can hallucinate.

For domain-specific adaptation, practitioners either use methods such as in-context learning through Retrieval-Augmented Generation (RAG) or Supervised Fine-Tuning (SFT). While these methods are promising, they come with their own pros and cons.

Retrieval-Augmented Generation (RAG)

Pros

  • Improved Accuracy with Relevant Information

    RAG allows LLMs to dynamically access and leverage relevant information during task completion. This can lead to more accurate and informative responses, especially for factual tasks like question answering.

  • Reduced Hallucination

    LLMs can sometimes generate responses that are seemingly plausible but factually incorrect (“hallucinations”). RAG helps mitigate this by grounding the response in retrieved documents, promoting factual consistency. The retrieved sources can also be cited, making the results more trustworthy.

  • Reduced Forgetting

    RAG avoids the catastrophic forgetting that can occur in fine-tuning, where the LLM loses general knowledge while specializing in a new domain.

Cons

  • Reliance on Retrieval Quality

    The effectiveness of RAG depends heavily on the quality of the retrieval process. Inaccurate or irrelevant retrieved documents can lead to misleading or incorrect responses.

  • Information Loss

    Information can be lost at various stages of the RAG pipeline, such as document chunking, embedding creation, similarity-based retrieval, and response generation. This can affect the accuracy and completeness of the final answer.

  • Increased Complexity

    RAG adds a layer of complexity to the LLM architecture and requires additional computational resources for retrieval, processing the retrieved information, and generating the final response.

 

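To make the retrieval step concrete, here is a minimal, self-contained sketch of a RAG pipeline. The bag-of-words “embedding” and all function names are illustrative assumptions for this sketch; a production system would use a neural embedding model and a vector database instead.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words term-count vector.
    # Real systems use neural embedding models here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    # Rank documents by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, documents, k=2):
    # Assemble the retrieval-augmented prompt sent to the LLM.
    context = "\n\n".join(retrieve(query, documents, k))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

docs = [
    "RAFT combines retrieval-augmented generation with fine-tuning.",
    "Catastrophic forgetting occurs when fine-tuning erases general knowledge.",
    "The capital of France is Paris.",
]
prompt = build_prompt("How does RAFT combine retrieval and fine-tuning?", docs, k=1)
```

Note how the cons above show up even in this toy version: if the similarity ranking puts an irrelevant document on top, the prompt grounds the LLM in the wrong context.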
Supervised Fine-Tuning (SFT)

Pros

  • Adaptability

    Fine-tuning enables LLMs to be adapted to various domains and applications. This flexibility makes them valuable tools for a wide range of tasks and industries.

  • Improved Accuracy on Specific Tasks

    Fine-tuning allows LLMs to specialize in a particular domain or task. By focusing on a specific dataset and objective, the model can learn the nuances of that domain and achieve higher accuracy on relevant tasks.

  • Reduced Data Requirements and Faster Training

    Compared to training a large LLM from scratch, fine-tuning requires less data. The LLM already possesses a vast amount of general knowledge, and fine-tuning leverages this baseline while focusing on domain-specific information. This reduces training time and allows quicker development and deployment of LLMs for specific tasks.

Cons

  • Catastrophic Forgetting

    Fine-tuning can lead to a phenomenon called catastrophic forgetting, where the LLM loses its ability to perform tasks it was previously trained on (general knowledge tasks) as it focuses on the new domain.

  • Overfitting

    If the fine-tuning dataset is too small or not fully representative of the domain, the LLM can overfit to the training data. This can lead to inferior performance on unseen data and a lack of generalizability.

  • Limited Interpretability

    Understanding how a fine-tuned LLM arrives at its answers can be challenging due to the complexity of the process. This limits the ability to debug errors or improve performance.

 

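SFT starts from a labeled dataset of domain examples. As a minimal sketch, the snippet below formats domain Q&A pairs into the prompt/completion JSONL shape that many fine-tuning pipelines accept; the field names and the example pairs are illustrative assumptions, not a universal API.

```python
import json

# Hypothetical domain Q&A pairs; in practice these come from domain
# experts or a curated corpus.
qa_pairs = [
    {"question": "What does RAFT stand for?",
     "answer": "Retrieval-Augmented Fine-Tuning."},
    {"question": "Which failure mode does fine-tuning risk?",
     "answer": "Catastrophic forgetting of general knowledge."},
]

def to_sft_record(pair):
    # Format one Q&A pair as a prompt/completion record, a common
    # (but not universal) convention for supervised fine-tuning data.
    return {
        "prompt": f"Question: {pair['question']}\nAnswer:",
        "completion": f" {pair['answer']}",
    }

# One JSON object per line (JSONL), ready to feed a training job.
jsonl = "\n".join(json.dumps(to_sft_record(p)) for p in qa_pairs)
```

The overfitting risk above is easy to see from this shape: two records like these would memorize two answers, not a domain, which is why dataset size and coverage matter.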
While both RAG and SFT are promising, neither alone delivers the best results in enterprise settings. RAFT, which stands for Retrieval-Augmented Fine-Tuning, addresses the limitations of both RAG and SFT by combining them strategically.

How RAFT Works

  • Fine-Tuning

    RAFT leverages Supervised Fine-Tuning to train the LLM on a dataset specific to the target domain. This helps the model acquire domain-specific knowledge and improve its understanding of the domain’s language and concepts.

  • Retrieval-Augmented Generation

    RAFT incorporates RAG by allowing the LLM to access and reference relevant documents retrieved from the domain-specific data during task completion (like question answering). This ensures the response is grounded in factual information.

  • Focus on Robustness

    RAFT goes beyond simply incorporating retrieved documents. It trains the LLM to be robust against retrieval errors. This means the model can manage situations where some retrieved documents might be irrelevant or misleading. It learns to discern reliable information and utilize it effectively to generate accurate answers.
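The robustness step above comes from how RAFT training examples are constructed: each question is paired with a context that mixes the answer-bearing (“oracle”) document with irrelevant distractor documents, and a fraction of examples omit the oracle entirely. Here is a minimal sketch of that construction; the function name, field names, and the 0.8 oracle fraction are illustrative choices, not the paper's exact recipe, and a full pipeline would also attach a chain-of-thought answer citing the oracle document.

```python
import random

def make_raft_example(question, oracle_doc, distractor_pool,
                      num_distractors=3, p_oracle=0.8, rng=random):
    # Build one RAFT training example: a question plus a context of
    # documents. With probability p_oracle the context contains the
    # oracle (answer-bearing) document shuffled among distractors;
    # otherwise it contains only distractors, which teaches the model
    # to cope with retrieval failures.
    distractors = rng.sample(distractor_pool, num_distractors)
    if rng.random() < p_oracle:
        context = distractors + [oracle_doc]
        rng.shuffle(context)  # hide the oracle's position
    else:
        context = distractors  # deliberately oracle-free example
    return {"question": question, "context": context}

pool = [f"distractor document {i}" for i in range(10)]
rng = random.Random(0)  # fixed seed for reproducibility
example = make_raft_example("What is RAFT?", "oracle: RAFT paper text",
                            pool, rng=rng)
```

During fine-tuning on such examples, the model is rewarded for answering from the oracle when it is present and for not being misled when it is absent, which is exactly the robustness property described above.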

What Are the Benefits of RAFT Compared with RAG and Supervised Fine-Tuning?

  • Overcomes Limitations of RAG

    RAG relies heavily on accurate document retrieval. RAFT addresses this by training the LLM to be less susceptible to retrieval errors, leading to more reliable responses even with imperfect retrieval.

  • Improves Upon Supervised Fine-Tuning

    While fine-tuning improves domain knowledge, it might neglect the open-ended nature of real-world scenarios where additional information might be available. RAFT allows the LLM to leverage retrieved documents during task completion, mimicking a real-world situation where one might need to consult additional resources.

  • Enhanced Accuracy

    By combining domain-specific knowledge with access to relevant information, RAFT has the potential to generate more accurate and informative responses compared with either RAG or Supervised Fine-Tuning alone.

  • Reduced Forgetting

    Unlike fine-tuning, RAFT avoids the issue of catastrophic forgetting, where the LLM loses general knowledge while specializing in a new domain.

Results

According to the paper RAFT: Adapting Language Model to Domain Specific RAG, RAFT consistently outperforms supervised fine-tuning, both with and without RAG, on benchmark datasets such as PubMed, HotpotQA, and Gorilla. This demonstrates RAFT’s effectiveness as a simple yet powerful technique for enhancing in-domain RAG performance in pre-trained LLMs.

Conclusion

Overall, RAFT offers a promising approach for adapting LLMs to specialized domains. It combines the benefits of both Supervised Fine-Tuning and RAG, leading to potentially more accurate, robust, and informative responses.


Jayachandran Ramachandran


Jayachandran has over 25 years of industry experience and is an AI thought leader, consultant, design thinker, inventor, and speaker at industry forums with extensive...
