Unveiling the Power of RAG and LangChain in Building an Advanced Chatbot for Customer Queries

Perumal Babu
Sep 24, 2023 · 4 min read

In the realm of customer service, technology has emerged as a powerful catalyst for transformative change. The continuous evolution of Natural Language Processing (NLP) has paved the way for increasingly intricate, precise, and rapid response systems. This blog post delves into the intricacies of constructing a cutting-edge question-answering system tailored to address customer inquiries. It explores the application of advanced techniques such as Retrieval Augmented Generation (RAG) and Large Language Models (LLMs), and the integration of the LangChain library.

The Role of RAG in NLP-Based Solutions

RAG, or Retrieval-augmented generation, is a technique that enhances the capabilities of traditional language models. It works by first retrieving a set of relevant documents from an external knowledge base like Wikipedia or a scientific paper database. These retrieved documents are then combined with the original prompt and fed into an LLM, which subsequently generates a response.
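The retrieve-then-generate pattern can be sketched in a few lines. The knowledge base, scoring function, and "LLM" below are toy stand-ins to show the data flow, not the actual models used in this post:

```python
# Minimal sketch of the retrieve-then-generate pattern behind RAG.
# The knowledge base, scoring, and generate() below are toy stand-ins.

def retrieve(query, knowledge_base, k=2):
    """Rank documents by naive word overlap with the query."""
    def score(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(knowledge_base, key=score, reverse=True)[:k]

def generate(prompt):
    """Stand-in for an LLM call; a real system would invoke the model here."""
    return f"Answering using context:\n{prompt}"

knowledge_base = [
    "Order 1001 was shipped on Sep 20 and delivered on Sep 22.",
    "Order 1002 is still being processed in the warehouse.",
    "Our return policy allows returns within 30 days.",
]

query = "When was order 1001 shipped?"
context = "\n".join(retrieve(query, knowledge_base))
answer = generate(f"Context:\n{context}\n\nQuestion: {query}")
print(answer)
```

In a production system, `retrieve` becomes a vector-similarity lookup and `generate` becomes a call to the LLM, but the shape of the pipeline stays the same.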

Here is a quick demo of how it works.

Here is a logical view of the architecture of an LLM-enabled chatbot that I built as a proof of concept.

Data Preparation and Transformation

Before feeding any data into the system, it’s essential to prepare it appropriately. In our case, we used raw order data from customers. This data was transformed into natural language payloads using a template-based transformer. Alternatively, an LLM could be employed for more nuanced results.
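A template-based transformer can be as simple as string formatting over each record. The field names below are illustrative, not the actual OMS schema:

```python
# Sketch of a template-based transformer: a raw order record is
# rendered into a natural-language payload. Field names are
# illustrative, not the actual order schema.

ORDER_TEMPLATE = (
    "Customer {customer} placed order {order_id} for {quantity} x "
    "{product} on {date}. The order status is {status}."
)

def to_natural_language(order: dict) -> str:
    """Render one order record as a plain-English sentence."""
    return ORDER_TEMPLATE.format(**order)

order = {
    "customer": "Alice",
    "order_id": "1001",
    "quantity": 2,
    "product": "wireless mouse",
    "date": "2023-09-20",
    "status": "shipped",
}
print(to_natural_language(order))
```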

When dealing with large files, the data was split into smaller chunks to make it more manageable.
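A minimal fixed-size chunker with overlap, similar in spirit to the text splitters LangChain ships (e.g. `CharacterTextSplitter`), looks like this:

```python
# Minimal fixed-size chunker with overlap between consecutive chunks,
# so that sentences spanning a chunk boundary are not lost entirely.

def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50):
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks

chunks = split_into_chunks("a" * 1200, chunk_size=500, overlap=50)
print(len(chunks))
```

The overlap keeps a little shared context between adjacent chunks, which helps retrieval when the answer straddles a boundary.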

Embeddings and Storage

The transformed data was then embedded using the model [Instructor-Large from Hugging Face](https://huggingface.co/hkunlp/instructor-large). Once converted into vectors, it was stored in a vector database. For our implementation, we used Chroma DB.
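What the embedding model and vector database do can be shown in miniature: documents are mapped to vectors, and a query is matched to the nearest one by cosine similarity. The toy `embed()` below (a letter-frequency vector) is a stand-in for Instructor-Large, and the in-memory list is a stand-in for Chroma DB:

```python
# Miniature of what the vector store does: embed documents, then match
# a query by cosine similarity. embed() is a toy stand-in for a real
# embedding model such as Instructor-Large.
import math

def embed(text: str):
    """Toy embedding: a 26-dim letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

docs = ["order shipped september", "refund policy thirty days"]
index = [(doc, embed(doc)) for doc in docs]  # stand-in for Chroma DB

query_vec = embed("when did my order ship")
best = max(index, key=lambda pair: cosine(query_vec, pair[1]))
print(best[0])
```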

Setting Up the User Interface

A simple interface was developed to accept questions from users, enhancing the system’s usability.

Building the Question-Answering Chain with LangChain

We leveraged the LangChain library to create a question-answering chain. The specific chain used was RetrievalQA, a base class designed for QA systems that use a retriever and an LLM to generate answers.

Configuration Details

  • LLM: Llama 2 was used to generate natural-language answers from the retrieved context.
  • Chain Type: “Stuff,” a straightforward document chain, was employed. It takes the list of retrieved documents, concatenates them into a single string, and forwards that string to the LLM to generate an answer.
  • Retriever: Chroma, our chosen vector database, was used for retrieving relevant documents.
  • Return Source Documents: An optional setting where the relevant documents are returned along with the answer.
  • chain_type_kwargs: A dictionary with two keys — ‘prompt’ and ‘memory’ — was passed. The ‘prompt’ specifies the prompt used for generating the answer, and ‘memory’ provides an object for storing and retrieving information across multiple conversational turns.
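The configuration above maps onto a `RetrievalQA.from_chain_type` call. This is a configuration sketch, not a runnable script: it assumes `llm` (a Llama 2 wrapper) and `db` (the Chroma store) have already been initialized, and the prompt template text is illustrative:

```python
from langchain.chains import RetrievalQA
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

# Assumes `llm` (a Llama 2 wrapper) and `db` (the Chroma vector store)
# are already initialized; the template text is illustrative.
prompt = PromptTemplate(
    input_variables=["history", "context", "question"],
    template=(
        "Use the context and the conversation history to answer.\n"
        "History: {history}\nContext: {context}\n"
        "Question: {question}\nAnswer:"
    ),
)
memory = ConversationBufferMemory(memory_key="history", input_key="question")

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",                      # concatenate retrieved docs
    retriever=db.as_retriever(),             # Chroma as the retriever
    return_source_documents=True,            # also return the source docs
    chain_type_kwargs={"prompt": prompt, "memory": memory},
)

result = qa({"query": "What is the status of order 1001?"})
```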

Putting it All Together

The data from the OMS was loaded as knowledge artifacts after a template-based transformation into plain English. This knowledge was ingested into a vector store, Chroma DB. At query time, the relevant data is fetched from Chroma DB, and the prompt and historical chat conversation are supplied as additional context to the Llama 2 model; for this purpose we use LangChain’s RetrievalQA module. The response is then presented to the user.

Demo and Result

Our demo showcases a small chatbot capable of answering customer queries about products. The blend of LangChain, RAG, and other NLP techniques has proved highly effective in constructing a powerful question-answering system. This approach is versatile and can be extended to various other tasks such as summarization and creative writing. By diving deeper into each component, you can unlock a wide range of possibilities in NLP.

I would not say that these systems are production ready. They need considerable optimization for both accuracy and performance. I have seen instances where the POC I created could not fetch information across multiple customers accurately; however, it retrieved information for a single customer much more reliably.

So go ahead, explore these tools and techniques, and take your chatbot to the next level. Happy coding! The POC was inspired by the PrivateGPT design by Iván Martínez; I have added a link to the repo in my references.

Reference:

https://github.com/imartinez/privateGPT
