Domain Adaptation of LLMs Using RAFT, the Hybrid of RAG and SFT
We are witnessing the arrival of new and improved Large Language Models (LLMs) at a rapid pace, and with every release they improve on previous benchmarks. Their impressive advances in general-purpose tasks are being harnessed for practical applications across diverse industries. However, their knowledge is restricted to the corpus on which they were trained: when exposed to questions with no support in that training data, they can hallucinate.
For domain-specific adaptation, practitioners typically turn to in-context learning through Retrieval-Augmented Generation (RAG) or to Supervised Fine-Tuning (SFT). While both methods are promising, each comes with its own pros and cons.
Retrieval-Augmented Generation (RAG)
| Pros | Cons |
|---|---|
| Improved Accuracy with Relevant Information | Reliance on Retrieval Quality |
| Reduced Hallucination | Information Loss |
| Reduced Forgetting | Increased Complexity |
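To make the RAG side of the comparison concrete, here is a minimal sketch of a RAG pipeline. The word-overlap retriever is a deliberately naive stand-in for a real embedding-based retriever, and all function names here are illustrative, not from any particular library:

```python
def retrieve(query, corpus, k=2):
    """Rank documents by naive word overlap with the query
    (a stand-in for a real embedding-based retriever) and
    return the top k documents."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def rag_prompt(query, corpus, k=2):
    """Build a prompt that grounds the model's answer in the
    retrieved context instead of its parametric memory."""
    context = "\n".join(f"- {d}" for d in retrieve(query, corpus, k))
    return (
        "Answer using only the context below.\n"
        f"{context}\n"
        f"Question: {query}"
    )

corpus = [
    "Paris is the capital of France.",
    "The Nile flows through Egypt.",
    "Mount Everest is the highest peak on Earth.",
]
prompt = rag_prompt("What is the capital of France", corpus)
```

Note how the "Reliance on Retrieval Quality" con shows up directly: if `retrieve` ranks the wrong document first, the grounded answer inherits that error.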
Supervised Fine-Tuning (SFT)
| Pros | Cons |
|---|---|
| Adaptability | Catastrophic Forgetting |
| Improved Accuracy on Specific Tasks | Overfitting |
| Reduced Data Requirements and Faster Training | Limited Interpretability |
While both RAG and SFT are promising, neither alone delivers the best results in enterprise settings. RAFT, which stands for Retrieval-Augmented Fine-Tuning, addresses the limitations of both by combining them strategically.
How RAFT Works
- Fine-Tuning: RAFT leverages Supervised Fine-Tuning to train the LLM on a dataset specific to the target domain. This helps the model acquire domain-specific knowledge and improve its understanding of the domain's language and concepts.
- Retrieval-Augmented Generation: RAFT incorporates RAG by allowing the LLM to access and reference relevant documents retrieved from the domain-specific data during task completion (such as question answering). This grounds the response in factual information.
- Focus on Robustness: RAFT goes beyond simply incorporating retrieved documents. It trains the LLM to be robust against retrieval errors, so the model can handle situations where some retrieved documents are irrelevant or misleading. It learns to discern reliable information and use it effectively to generate accurate answers.
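The recipe above can be sketched as a training-data builder. This is a simplified illustration rather than the paper's exact pipeline: with some probability the oracle (answer-bearing) document is kept in the context alongside distractors, and otherwise only distractors appear, which pushes the model to learn the domain instead of leaning solely on retrieval. All function and field names are hypothetical:

```python
import random

def build_raft_example(question, oracle_doc, distractor_docs,
                       answer, p_oracle=0.8, num_distractors=3,
                       rng=None):
    """Assemble one RAFT-style training example.

    With probability p_oracle the oracle (golden) document is
    included alongside distractor documents; otherwise the context
    contains distractors only, teaching the model to fall back on
    memorized domain knowledge when retrieval fails.
    """
    rng = rng or random.Random()
    docs = rng.sample(distractor_docs, k=num_distractors)
    if rng.random() < p_oracle:
        docs.append(oracle_doc)
    rng.shuffle(docs)
    context = "\n\n".join(f"<DOCUMENT>{d}</DOCUMENT>" for d in docs)
    prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
    return {"prompt": prompt, "completion": answer}

example = build_raft_example(
    question="Which enzyme unwinds DNA during replication?",
    oracle_doc="Helicase unwinds the DNA double helix at the replication fork.",
    distractor_docs=[
        "Ligase joins Okazaki fragments on the lagging strand.",
        "Polymerase III synthesizes the new DNA strand.",
        "Primase lays down short RNA primers.",
        "Topoisomerase relieves supercoiling ahead of the fork.",
    ],
    answer="Helicase",
    rng=random.Random(0),
)
# example["prompt"] interleaves 3-4 <DOCUMENT> blocks with the question
```

Fine-tuning on a mix of oracle-present and oracle-absent examples is what gives the model its robustness to imperfect retrieval at inference time.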
What Are the Benefits of RAFT Compared with RAG and Supervised Fine-Tuning?
- Overcomes Limitations of RAG: RAG relies heavily on accurate document retrieval. RAFT addresses this by training the LLM to be less susceptible to retrieval errors, leading to more reliable responses even with imperfect retrieval.
- Improves Upon Supervised Fine-Tuning: While fine-tuning improves domain knowledge, it can neglect the open-ended nature of real-world scenarios, where additional information may be available at inference time. RAFT lets the LLM leverage retrieved documents during task completion, mimicking real-world situations where one consults additional resources.
- Enhanced Accuracy: By combining domain-specific knowledge with access to relevant information, RAFT has the potential to generate more accurate and informative responses than either RAG or Supervised Fine-Tuning alone.
- Reduced Forgetting: Unlike plain fine-tuning, RAFT avoids the issue of catastrophic forgetting, where the LLM loses general knowledge while specializing in a new domain.
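One way to make the "less susceptible to retrieval errors" claim measurable is to score the same model twice per question: once with the oracle document present in the retrieved context and once with distractors only. The harness below is a hypothetical evaluation sketch, not from the RAFT paper; `answer_fn` stands in for any model call:

```python
def retrieval_robustness(answer_fn, eval_set, num_distractors=3):
    """Compare accuracy with the oracle document present vs. absent.

    answer_fn(question, docs) -> answer string. Each eval_set item
    carries 'question', 'oracle', 'distractors', and the gold
    'answer'. A robust (RAFT-style) model should degrade gracefully
    when the oracle document is missing from the context.
    """
    hits_with = hits_without = 0
    for item in eval_set:
        distractors = item["distractors"][:num_distractors]
        if answer_fn(item["question"],
                     distractors + [item["oracle"]]) == item["answer"]:
            hits_with += 1
        if answer_fn(item["question"], distractors) == item["answer"]:
            hits_without += 1
    n = len(eval_set)
    return hits_with / n, hits_without / n

# Toy model: answers correctly only if the evidence appears in context.
def toy_model(question, docs):
    return ("Helicase"
            if any("helicase" in d.lower() for d in docs)
            else "unknown")

eval_set = [{
    "question": "Which enzyme unwinds DNA?",
    "oracle": "Helicase unwinds the DNA double helix.",
    "distractors": ["Ligase joins fragments.", "Primase lays primers."],
    "answer": "Helicase",
}]
acc_with, acc_without = retrieval_robustness(toy_model, eval_set)
```

A purely retrieval-dependent model, like the toy one above, collapses when the oracle document is absent; the gap between the two scores is exactly what RAFT's training regime aims to shrink.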
Results
As reported in the recent paper RAFT: Adapting Language Model to Domain Specific RAG, RAFT consistently outperforms supervised fine-tuning, both with and without RAG, on benchmark datasets such as PubMed, HotpotQA, and Gorilla. This demonstrates RAFT's effectiveness as a simple yet powerful technique for enhancing in-domain RAG performance in pre-trained LLMs.
Conclusion
Overall, RAFT offers a promising approach for adapting LLMs to specialized domains. It combines the benefits of both Supervised Fine-Tuning and RAG, leading to potentially more accurate, robust, and informative responses.