In this blog, we'll break down a project that involves creating a private GPT chatbot using Python and various modern technologies. We'll go through the configuration, the main components, and the logic behind them. Let's dive in!
GitHub repo: Local AI Chatbot with HuggingFace
First, let's look at the config.yaml file, which holds configuration details for the chatbot.
model_path:
  small: "./model/mistral-7b-instruct-v0.1.Q4_K_M.gguf"
  large: "./model/capybarahermes-2.5-mistral-7b.Q5_K_M.gguf"
model_type: "mistral"
embeddings_path: "BAAI/bge-large-en-v1.5"
model_config: {'max_new_tokens': 512, 'temperature': 0.7, 'context_length': 4096, 'gpu_layers': 0}
chat_history_path: "./chat_sessions"
Model Paths: Paths to the quantized GGUF model files used to generate responses; a smaller and a larger Mistral variant.
Model Type: The model family, here "mistral", which tells the loader how to interpret the weights.
Embeddings Path: The Hugging Face ID of the pre-trained embedding model used to represent text as vectors.
Model Config: Generation settings: the maximum number of new tokens, temperature (controls randomness), context length, and the number of layers to offload to the GPU.
Chat History Path: The directory where chat sessions are saved.
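To make the structure concrete, here is a minimal sketch of reading these values with PyYAML; it is essentially what llm_chain.py does later in this post:

import yaml

# A minimal sketch: load the configuration shown above with PyYAML
with open("config.yaml", "r") as f:
    config = yaml.safe_load(f)

print(config["model_path"]["small"])          # ./model/mistral-7b-instruct-v0.1.Q4_K_M.gguf
print(config["model_config"]["temperature"])  # 0.7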
Why These Choices:
YAML for Configuration: YAML is human-readable and easy to edit, making it a great choice for configuration files.
Alternatives:
JSON or INI Files: JSON is also readable and widely used. INI files are simpler but less flexible for nested configurations.
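For comparison, a quick sketch of the JSON route; config.json is hypothetical here, since the project ships config.yaml, but only the loader would change:

import json

# Hypothetical alternative: the same settings kept in a config.json file.
# The rest of the code would read the resulting dict identically.
with open("config.json", "r") as f:
    config = json.load(f)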
The main file, gpt.py, uses Streamlit to create a user interface for the chatbot.
import streamlit as st
from llm_chain import load_normal_chain
from langchain.memory import StreamlitChatMessageHistory

def load_chain(chat_history):
    return load_normal_chain(chat_history)

def clear_input_field():
    # Move the submitted text out of the widget and clear it for the next message
    st.session_state.user_question = st.session_state.user_input
    st.session_state.user_input = ""

def set_send_input():
    # on_change callback: flag that a message was submitted, then clear the field
    st.session_state.send_input = True
    clear_input_field()

def main():
    st.title("Private GPT")
    chat_container = st.container()

    # Initialize session-state keys on the first run so later reads don't raise KeyError
    if "send_input" not in st.session_state:
        st.session_state.send_input = False
        st.session_state.user_question = ""

    chat_history = StreamlitChatMessageHistory(key="history")
    llm_chain = load_chain(chat_history)

    user_input = st.text_input("Type your message here", key="user_input", on_change=set_send_input)
    send_button = st.button("Send", key="send_button")

    if send_button or st.session_state.send_input:
        if st.session_state.user_question != "":
            with chat_container:
                st.chat_message("user").write(st.session_state.user_question)
                # The reply is saved to chat_history by the chain's memory
                llm_response = llm_chain.run(st.session_state.user_question)
            st.session_state.user_question = ""
        st.session_state.send_input = False  # reset the flag until the next submission

    # The memory records both user and AI turns, so this renders the whole conversation
    if chat_history.messages != []:
        with chat_container:
            st.write("Chat History:")
            for message in chat_history.messages:
                st.chat_message(message.type).write(message.content)

if __name__ == "__main__":
    main()
Streamlit: Creates a web app interface.
Chat History: Manages and displays past messages.
User Input Handling: Captures user input and triggers responses from the chatbot.
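With gpt.py saved, the interface can be launched locally with streamlit run gpt.py (assuming Streamlit is installed in the environment).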
Why These Choices:
Streamlit: Streamlit is easy to set up and allows for rapid prototyping of web applications with Python. It's especially useful for data science projects.
Alternatives:
Flask or Django: These are more robust web frameworks that offer greater flexibility and control but require more setup and knowledge of web development.
Gradio: Another tool similar to Streamlit, designed for creating interactive demos and web apps for machine learning models; see the sketch below.
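To make the comparison concrete, here is a minimal hedged sketch of the same chat loop in Gradio; respond() is a hypothetical placeholder that, in this project, would call the chain built in llm_chain.py below:

import gradio as gr

# A minimal hedged sketch of the same chat idea in Gradio.
# respond() is a hypothetical stand-in for a call into the LangChain chain.
def respond(message, history):
    return f"You said: {message}"  # placeholder reply

gr.ChatInterface(respond).launch()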
The next file, llm_chain.py (imported by gpt.py above), handles the creation and management of the language model chain.
from prompt_templates import memory_prompt_template
from langchain.chains import StuffDocumentsChain, LLMChain, ConversationalRetrievalChain
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import PromptTemplate
from langchain.llms import CTransformers
from langchain.vectorstores import Chroma
import yaml

with open("config.yaml", "r") as f:
    config = yaml.safe_load(f)

def create_llm(model_path=config["model_path"]["large"], model_type=config["model_type"], model_config=config["model_config"]):
    # Load the quantized GGUF model through ctransformers (CPU-only here, since gpu_layers is 0)
    llm = CTransformers(model=model_path, model_type=model_type, config=model_config)
    return llm

def create_embeddings(embeddings_path=config["embeddings_path"]):
    # Pydantic-based class, so the model name must be passed as a keyword argument
    return HuggingFaceInstructEmbeddings(model_name=embeddings_path)

def create_chat_memory(chat_history):
    # Keep only the last k exchanges in the prompt to stay within the context window
    return ConversationBufferWindowMemory(memory_key="history", chat_memory=chat_history, k=9)

def create_prompt_from_template(template):
    return PromptTemplate.from_template(template)

def create_llm_chain(llm, chat_prompt, memory):
    return LLMChain(llm=llm, prompt=chat_prompt, memory=memory)

def load_normal_chain(chat_history):
    return chatChain(chat_history)

class chatChain:
    def __init__(self, chat_history):
        self.memory = create_chat_memory(chat_history)
        llm = create_llm()
        chat_prompt = create_prompt_from_template(memory_prompt_template)
        self.llm_chain = create_llm_chain(llm, chat_prompt, self.memory)

    def run(self, user_input):
        # stop=["Human:"] keeps the model from generating the next user turn itself
        return self.llm_chain.run(human_input=user_input, history=self.memory.chat_memory.messages, stop=["Human:"])
LLM Creation: Loads the quantized model through CTransformers with the settings from config.yaml.
Embeddings: A helper that loads the pre-trained embedding model; it isn't used by this basic chat chain, but is available for retrieval features.
Chat Memory: A windowed buffer (k=9) that keeps only the most recent exchanges in the prompt.
Prompt Template: Wraps the template string so the history and the user's input can be injected.
LLM Chain: Ties the model, prompt, and memory together to generate responses.
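Putting the pieces together, here is a minimal sketch of driving the chain from a plain Python session, assuming the model files and config.yaml are in place:

from langchain.memory import ChatMessageHistory
from llm_chain import load_normal_chain

# Hypothetical smoke test: exercise the chain without the Streamlit UI
history = ChatMessageHistory()
chain = load_normal_chain(history)
print(chain.run("What is a quantized GGUF model?"))
print(len(history.messages))  # 2: the human turn plus the AI reply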
Finally, prompt_templates.py contains the prompt template that shapes the chatbot's responses.
memory_prompt_template = """<s>[INST] You are an AI chatbot having a conversation with a human. Answer their questions.[/INST]
Previous conversation: {history}
Human: {human_input}
AI:"""
Templates: Provide a structure for how the chatbot should respond based on the conversation history and user input.
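To see how the placeholders get filled in, a quick sketch of rendering the template directly with sample values:

from langchain.prompts import PromptTemplate
from prompt_templates import memory_prompt_template

# Render the template with sample values to inspect the final prompt string
prompt = PromptTemplate.from_template(memory_prompt_template)
print(prompt.format(history="Human: Hi!\nAI: Hello, how can I help?", human_input="Tell me a joke."))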
This project demonstrates how to build a private GPT chatbot using modern technologies. The configuration file sets up the model paths and parameters, while the main Python scripts handle the user interface, logic, and language model chaining. With these components, you can create an interactive and responsive chatbot tailored to your needs.