Enhancing Our Chatbot with Document Retrieval: Exploring LLMs — 6
Hey there, fellow tech enthusiasts! Remember our last adventure where we taught a chatbot to answer questions using a PostgreSQL database? Well, hold onto your keyboards, because we're about to kick things up a notch. Ever wondered if a chatbot could flip through a document before tapping into a database? Spoiler alert: it can!
In this installment of our Exploring LLMs series, we're supercharging our chatbot with the power of Document Retrieval. If you're new here or need a refresher, you can catch up on the previous posts in this series.
Curious? Let's unravel how we pulled this off, step by step, without the fluff. And yes, the code for this chapter is right here if you're itching to get your hands dirty.
Overview of the New Features
Document Retrieval: Our chatbot can now peruse a text document to find answers before bothering the database.
LLM-Based Classification: We're using a language model to decide if the document's answer hits the nail on the head.
Fallback Mechanism: If the document leaves us hanging, we fall back to querying the database, just like old times.
Let's dive into the code modifications and see how this magic happens.
Before We Dive In
For our mission, we'll create a new document file named text_doc.md. Think of it as the chatbot's little black book, containing the hobbies and ages of employees from our database, but told in a narrative style. Here's a sneak peek:
John Doe, a 34-year-old from New York, spends most of his weekends hiking and exploring nature, with a personal goal to visit every national park in the U.S. before turning 50. Jane Smith, 29, is passionate about painting and often spends her evenings working on abstract art pieces; she dreams of one day opening her own gallery. Emily Johnson, at 41, is an avid cyclist who participates in local races and has a life goal to complete a triathlon. Michael Brown, 37, loves photography and hopes to publish a book of his work, capturing unique moments from his travels around the world. Lastly, Sarah Davis, a 25-year-old yoga enthusiast, is focused on personal growth and mindfulness, with the goal of becoming a certified instructor and opening her own wellness retreat.
Our goal? To have the chatbot tap into this juicy info when answering questions.
Step 1: Importing the Magic Ingredients
First things first, we need to bring in some extra tools to handle document loading, embeddings, and vector storage. Here's what we're adding to the mix:
from langchain.document_loaders import TextLoader
from langchain.embeddings import HuggingFaceEmbeddings # Make sure this fits your setup
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
Why these?
TextLoader: So our chatbot can read the document—because literacy is important!
HuggingFaceEmbeddings: To turn text into numerical vectors that capture meaning.
FAISS: Think of it as a super-fast librarian who can find similar texts in a jiffy.
RetrievalQA: The chain that combines retrieval and question-answering.
Embeddings and Vector Databases? Sounds Fancy!
Imagine trying to find a book in a library without any cataloging system—nightmare, right? Embeddings are like assigning a unique ID to every book based on its content, making similar books have similar IDs. A vector database stores all these IDs (vectors) so we can quickly find what we're looking for.
Embeddings: Convert text into high-dimensional vectors. Similar texts = similar vectors.
FAISS: An efficient way to store and search through these vectors.
By using embeddings and FAISS, our chatbot can swiftly find relevant pieces of text from our document.
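Want to see what "similar texts = similar vectors" actually looks like? Here's a minimal sketch, assuming the default sentence-transformers model that HuggingFaceEmbeddings downloads on first use:
from langchain.embeddings import HuggingFaceEmbeddings
import numpy as np

embeddings = HuggingFaceEmbeddings()

vec_painting = embeddings.embed_query("Jane is passionate about painting.")
vec_art = embeddings.embed_query("Jane loves creating abstract art pieces.")
vec_cycling = embeddings.embed_query("Emily is an avid cyclist.")

def cosine(a, b):
    # Cosine similarity: closer to 1.0 means closer in meaning
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vec_painting, vec_art))      # relatively high: related meanings
print(cosine(vec_painting, vec_cycling))  # lower: different topics
FAISS takes care of the "finding" part: it stores all these vectors and, given a query vector, returns the nearest ones efficiently.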
Step 2: Modifying the __init__ Method
Time to teach our chatbot some new tricks!
Loading the Text Document
We need to load text_doc.md so our chatbot can access it:
if document_path:
    # Load the document
    loader = TextLoader(document_path)
    documents = loader.load()
TextLoader reads the content of the file.
documents is a list containing the loaded text.
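If you're curious what the loader actually hands back, a quick check (a hypothetical snippet, run from the folder containing text_doc.md) shows it's a list with a single Document holding the whole file:
loader = TextLoader("text_doc.md")
documents = loader.load()
print(len(documents))                   # 1: TextLoader returns the whole file as one Document
print(documents[0].page_content[:100])  # the first 100 characters of our narrative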
Creating Embeddings and the Vector Store
Next, we turn the text into embeddings and store them:
# Create embeddings and vectorstore
embeddings = HuggingFaceEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)
How does this work?
Embedding Generation: Each piece of text is converted into a numerical vector.
Vector Storage: We store these vectors in FAISS for quick retrieval.
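At this point you can already poke at the vector store directly, before any chain is involved. A quick hypothetical test:
docs = vectorstore.similarity_search("Who enjoys painting?", k=1)
print(docs[0].page_content)  # the best-matching chunk; here, our whole one-paragraph document
Note that the match is by meaning, not keywords: "enjoys painting" still lines up with "passionate about painting".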
Setting Up the Retriever and the Retrieval QA Chain
Now, let's set up the system that fetches relevant text based on a query:
# Create a retriever
retriever = vectorstore.as_retriever()
# Create a RetrievalQA chain
self.retrieval_qa_chain = RetrievalQA.from_chain_type(
    llm=self.llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)
Retriever: Finds relevant text snippets by comparing vectors.
RetrievalQA Chain: Uses the retriever and the LLM to generate answers. The chain_type="stuff" means the chain simply "stuffs" every retrieved snippet into a single prompt for the LLM, which is fine for a small document like ours.
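With the chain built, asking it something is a one-liner. And because we set return_source_documents=True, the response also carries the snippets the answer was grounded in, which is handy for debugging:
response = self.retrieval_qa_chain({"query": "What are Jane Smith's hobbies?"})
print(response["result"])  # the generated answer
for doc in response["source_documents"]:
    print(doc.page_content)  # the text the answer was based on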
Handling the Absence of a Document
In case we don't have a document, we ensure our chatbot doesn't throw a tantrum:
else:
    self.retrieval_qa_chain = None
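Pulling Step 2 together, here's how the document-handling part fits around the existing setup (a condensed sketch; self.llm, the memory, and the SQL chain come from the earlier posts in this series):
def __init__(self, document_path=None):
    # ... existing setup from previous posts: LLM, memory, SQL chain ...
    if document_path:
        loader = TextLoader(document_path)
        documents = loader.load()
        embeddings = HuggingFaceEmbeddings()
        vectorstore = FAISS.from_documents(documents, embeddings)
        retriever = vectorstore.as_retriever()
        self.retrieval_qa_chain = RetrievalQA.from_chain_type(
            llm=self.llm,
            chain_type="stuff",
            retriever=retriever,
            return_source_documents=True
        )
    else:
        self.retrieval_qa_chain = None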
Step 3: Adding a Classification Step
But wait, how do we know if the document's answer is good enough? Let's add a step where the LLM acts as a quality checker.
Creating a Classification Prompt
We craft a prompt to ask the LLM if the answer contains the needed information:
self.classification_prompt = PromptTemplate.from_template(
    """
    Given the user's question and the assistant's answer, determine whether the assistant's answer addresses the user's question. A partially correct answer is acceptable, as long as it is not completely empty of information. If the answer does address the question, start your response with "yes"; otherwise, start with "no".
    Question: {question}
    Answer: {answer}
    """
)
Purpose: Have the LLM judge the answer.
Constraint: Ask for a response that starts with "yes" or "no", so the verdict is easy to parse.
Creating the Classification Chain
Now, we tie it all together:
self.classification_chain = LLMChain(
    llm=self.llm,
    prompt=self.classification_prompt
)
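To get a feel for it, here's a hypothetical round trip; the exact wording of the verdict depends on your model, which is exactly why we'll only look at how it starts:
verdict = self.classification_chain.run({
    "question": "What are Jane Smith's hobbies?",
    "answer": "Jane Smith is passionate about painting."
})
print(verdict)  # e.g. "Yes, the answer describes Jane's hobby."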
Step 4: Modifying the get_response Method
Time to update how our chatbot responds to questions.
Attempting to Answer Using the Document
First, we try to answer using the document:
if self.retrieval_qa_chain:
    try:
        # Get the answer from the RetrievalQA chain
        response = self.retrieval_qa_chain({"query": question})
        answer = response["result"]
retrieval_qa_chain attempts to find an answer in the document.
answer is what the LLM comes up with.
Classifying the Answer
Now, we check if the answer is satisfactory:
# Use the LLM to classify whether the answer contains the information
classification_input = {
    "question": question,
    "answer": answer
}
classification_result = self.classification_chain.run(classification_input).strip().lower()
classification_result holds the LLM's lowercased verdict; we check only how it starts, since the model may elaborate beyond a bare "yes" or "no".
Deciding Whether to Use the Retrieved Answer
Based on the classification, we decide our next move:
if classification_result.startswith("yes"):
    # startswith(), not ==, because the LLM may elaborate after "yes"
    # Update memory
    self.memory.save_context({"question": question}, {"answer": answer})
    return answer
else:
    # Proceed to the SQL chain
    pass
If "Yes": We return the answer.
If "No": Time to query the database.
Exception Handling
We make sure to handle any hiccups gracefully:
except Exception as e:
    # If there is any error, proceed to the SQL chain
    print(f"Error in RetrievalQA chain: {e}")
Proceeding to Query the Database
If the document didn't help, we fall back to our trusty SQL chain:
# Prepare the inputs
inputs = {
    "question": question,
}
# Call the chain
response = self.chain.invoke(inputs)
# Update memory
self.memory.save_context({"question": question}, {"answer": response})
return response
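For reference, here's the whole get_response flow assembled in one place (a condensed sketch of the snippets above):
def get_response(self, question):
    if self.retrieval_qa_chain:
        try:
            # First, try to answer from the document
            response = self.retrieval_qa_chain({"query": question})
            answer = response["result"]
            # Let the LLM judge whether that answer is actually useful
            classification_result = self.classification_chain.run(
                {"question": question, "answer": answer}
            ).strip().lower()
            if classification_result.startswith("yes"):
                self.memory.save_context({"question": question}, {"answer": answer})
                return answer
        except Exception as e:
            # Any hiccup here just means we fall through to the database
            print(f"Error in RetrievalQA chain: {e}")
    # Fall back to the SQL chain from the previous post
    response = self.chain.invoke({"question": question})
    self.memory.save_context({"question": question}, {"answer": response})
    return response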
Step 5: How It All Comes Together
User Interaction Flow
User asks a question: "What are Jane Smith's hobbies?"
Document Retrieval Attempt:
The chatbot searches text_doc.md and finds information about Jane's passion for painting.
Answer Classification:
The LLM confirms the answer is relevant.
The chatbot returns: "Jane Smith is passionate about painting and dreams of opening her own gallery."
Fallback to Database Query:
If the user asks: "What is Jane Smith's salary?"
Document lacks this info.
Classification returns "No."
Chatbot queries the database and provides the salary.
Acknowledging Limitations and Future Improvements
Now, let's address the elephant in the room. This isn't a production-ready solution—yet.
Gaps and Areas for Improvement:
Scalability: Handling larger documents or multiple files efficiently.
Enhanced Retrieval: Smarter ways to chunk and retrieve documents.
Error Handling: More robust mechanisms to catch and log errors.
Security: Sanitizing inputs to prevent nasty surprises like SQL injections.
Our Intent: This series is all about giving you a taste of how LLMs can be harnessed. Moving forward, we'll dive deeper into each of these components—embeddings, vector databases, retrieval mechanisms, and more—to truly understand the nuts and bolts.
Conclusion
By adding a document retrieval mechanism and an LLM-based classification step, we've made our chatbot smarter and more resourceful. It's like giving it a library card before sending it to the database. This approach, known as Retrieval-Augmented Generation (RAG), makes the chatbot more efficient and user-friendly.
We've introduced key concepts like embeddings and vector databases, essential tools in the AI toolkit. While there's room for improvement, we've laid a solid foundation.
Key Takeaways
Embeddings: Turning text into numbers to capture meaning.
Vector Databases: Storing and searching these numerical representations efficiently.
Retrieval-Augmented Generation (RAG): Combining retrieval with generation for better answers.
LLM-Based Classification: Letting the language model judge answer relevance.
Continuous Improvement: Recognizing limitations and planning for enhancements is crucial in developing effective AI systems.
Stay tuned as we dive deeper into these components in upcoming posts. Until then, happy coding!
P.S. Got stuck or have questions? Drop a comment below or reach out—I'm all ears!