Retrieval-Augmented Generation
Categories: ai
Retrieval-Augmented Generation (RAG)
Reviewing what others are saying about RAG and how to use it, mainly to get better at explaining what I know to others.
What is Retrieval-Augmented Generation (RAG)?
The LLM is supplemented with content retrieved from a dedicated corpus of primary-source documents. Grounding the answer in retrieved data increases confidence in the output.
RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models
Retrieval Augmented Generation
Query -> Model -> Corpus
Converts the queries and documents into vector embeddings.
- Searches for semantically similar documents
- Allows for updated information and domain specific knowledge
- Adds additional latency to the response of the model.
- Vector embeddings can be expensive
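The embed-and-search step above can be sketched in plain Python. This is a toy stand-in: real systems use a learned embedding model, not bag-of-words counts, but the retrieval logic (embed the corpus off-line, then rank by similarity at query time) is the same.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

corpus = [
    "Our refund policy allows returns within 30 days",
    "The office is closed on public holidays",
]
vectors = [embed(doc) for doc in corpus]  # off-line: embed the corpus once

query = "when can I return a purchase for a refund"
best = max(range(len(corpus)), key=lambda i: cosine(embed(query), vectors[i]))
print(corpus[best])
```

Note the latency point from the list: the per-query cost is one embedding call plus a similarity search over the whole index.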
Fine-Tuning
Takes an existing model and gives it a focused data set to develop expertise on. Changes the model weights, generally via backpropagation with supervised learning on domain-specific request/response pairs. Inference is very fast since the expertise is baked into the model weights.
Down sides:
- Requires thousands of training examples
- Computational cost of training a new model
- Catastrophic forgetting – forgets skills unrelated to the specialized training.
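The weight-update idea can be illustrated with supervised gradient descent on a one-parameter toy model (purely illustrative, not an LLM; the model, data, and learning rate are all made up):

```python
# Toy illustration of fine-tuning: supervised examples nudge a weight.
# A real fine-tune does this over thousands of (request, response) pairs.
w = 1.0                # "pretrained" weight
x, target = 2.0, 5.0   # domain-specific input and desired output
lr = 0.1

for _ in range(50):
    pred = w * x
    grad = 2 * (pred - target) * x  # d/dw of the squared error (w*x - target)^2
    w -= lr * grad                  # backpropagation step updates the weight

print(round(w, 3))  # converges toward target / x = 2.5
```

This also hints at catastrophic forgetting: the weight moves to fit the new data with nothing constraining it to keep its old value.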
Prompt Engineering
Benefits:
- No infrastructure required
- Immediate results
- Prompting is an art: cheap to iterate on
Drawbacks:
- Limited to the model's existing knowledge
- Cannot add new information to the model
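Prompt engineering in practice is just string assembly: instructions and few-shot examples shape the answer without touching weights or adding knowledge. A minimal sketch (the assistant persona and example Q/A here are hypothetical):

```python
# Prompt engineering sketch: structure and a few-shot example steer the model.
# No infrastructure, no training - just a better-shaped input string.
def build_prompt(question: str) -> str:
    return (
        "You are a concise support assistant.\n"
        "Answer in one sentence.\n\n"
        "Example:\n"
        "Q: How do I reset my password?\n"
        "A: Use the 'Forgot password' link on the login page.\n\n"
        f"Q: {question}\n"
        "A:"
    )

print(build_prompt("How do I contact support?"))
```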
Summary
All three techniques are generally used together.
LangChain RAG: Optimizing AI Models for Accurate Responses
Great chart to pull from at about the 2-minute mark.
Walks through how RAG works, specifically the two flows: on-line queries and off-line loading. They note the search space is specific to the target model.
Off-line loading
They use retrievers to load multiple web pages into a LangChain Chroma database as a set of vectors, which lets later queries find semantically related documents.
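The off-line flow can be sketched with the standard library alone (the video uses LangChain's loaders and a Chroma vector store; this mimics the same steps: load pages, split into chunks, embed each chunk, store the vectors; the page text and chunk size are made up):

```python
# Stdlib-only sketch of off-line loading: load -> split -> embed -> store.
from collections import Counter

def split(text: str, size: int = 5) -> list[str]:
    # Break a page into fixed-size word chunks for indexing.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(chunk: str) -> Counter:
    return Counter(chunk.lower().split())  # toy stand-in for an embedding model

pages = ["RAG retrieves relevant documents and adds them to the prompt before generation"]
store = [(chunk, embed(chunk)) for page in pages for chunk in split(page)]
print(len(store))  # number of indexed chunks
```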
On-line queries
A prompt is assembled from instructions + search results + question and sent to the LLM, which then generates its answer from that combined context.
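The assembly step above can be sketched directly (the instruction text and retrieved result here are hypothetical):

```python
# On-line flow sketch: stuff retrieved results into the prompt alongside
# fixed instructions and the user's question, then send it to the LLM.
def assemble_prompt(instructions: str, results: list[str], question: str) -> str:
    context = "\n".join(f"- {r}" for r in results)
    return f"{instructions}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"

prompt = assemble_prompt(
    "Answer using only the context below.",
    ["Refunds are accepted within 30 days."],
    "Can I return my order after two weeks?",
)
print(prompt)
```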