
What is LangChain and Why is it so damn popular? A Step-by-Step Guide using OpenAI, LangChain, and Streamlit


LangChain is an open-source framework that makes it easy to build applications using large language models (LLMs). It was created by Harrison Chase and released in October 2022. LangChain has over 41,900 stars on GitHub and over 800 contributors.

What is LangChain used for?

You can use LangChain to build chatbots or personal assistants, to summarize, analyze, or generate Q&A over documents or structured data, to write or understand code, to interact with APIs, and to create other applications that take advantage of generative AI.

More specifically, LangChain can be used for a variety of tasks, including:

  • Chatbots
  • Question answering
  • Text summarization
  • Code generation
  • Natural language inference
  • Text generation
  • Data analysis

Chatbots

Chatbots are one of the central use cases for LLMs. They are able to have long-running conversations with users because they can access and process large amounts of information. They can also remember past interactions, which allows them to provide more personalized and relevant responses.


Image source: https://python.langchain.com/docs/use_cases/chatbots

The core components of a chatbot are:

  • Natural language processing (NLP): This is the ability to understand and process human language. NLP is used to break down user input into its constituent parts, such as words, phrases, and entities.
  • Large language model (LLM): This is a statistical model that is trained on a massive dataset of text and code. LLMs are used to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.
  • Memory: This is the ability to store and retrieve information. Memory is used to store past interactions with the user, as well as information about the domain that the chatbot is operating in.
  • Retrieval: This is the ability to find information that is stored in memory. Retrieval is used to find information that is relevant to the user’s query.

Here’s how these pieces fit together when a chatbot built on an LLM handles a single request:

  • Prompt: the message the user sends to the chatbot. It triggers the LLM and gives it context for the response.
  • LLM: the large language model that generates the chatbot’s response, as described above.
  • Memory and retrieval: store past interactions and domain information, and look up whatever is relevant to the current query, as described above.
  • Answer: the chatbot’s response to the user’s prompt, generated by the LLM and informed by what is stored in memory.

The chatbot works by first receiving a prompt from the user. The prompt is then passed to the LLM, which generates a response. The response is then passed to the memory, which stores it for future reference. If the user asks a question, the chatbot can then retrieve the answer from memory.
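To make this loop concrete, here is a minimal sketch using LangChain’s ConversationChain with ConversationBufferMemory. It uses the pre-1.0 langchain import paths to match the rest of this post, and it assumes your OpenAI key is set in the OPENAI_API_KEY environment variable:

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# The LLM generates responses; assumes OPENAI_API_KEY is set in the environment.
llm = OpenAI(temperature=0.7)

# ConversationBufferMemory stores past turns so the model sees the full history.
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

print(conversation.predict(input="Hi, my name is Sam."))
# The second turn can reference the first because it was stored in memory.
print(conversation.predict(input="What is my name?"))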

The retrieval component is optional: if the chatbot can find the answer to the user’s question in memory, it uses that; otherwise, it falls back on the LLM to generate a response.
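A common way to implement the retrieval step is to embed documents into a vector store and search it for passages relevant to the query. Here is a minimal sketch; it additionally assumes the faiss-cpu package is installed, and the texts are placeholders:

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Embed a few example documents into an in-memory FAISS index.
db = FAISS.from_texts(
    ["LangChain was released in October 2022.",
     "Streamlit turns Python scripts into web apps."],
    OpenAIEmbeddings(),
)

# Retrieval: find the stored text most relevant to the user's question.
docs = db.similarity_search("When did LangChain come out?", k=1)
print(docs[0].page_content)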

The chatbot can be improved by increasing the size of the LLM, by adding more information to the memory, and by improving the retrieval algorithm.

There are currently two versions of LangChain: one in Python and one in TypeScript/JavaScript.

At a high level, LangChain connects LLMs to external sources like Google, Wikipedia, Notion, and Wolfram. It also provides abstractions and tools to help developers interface between text input and output. LLM models and components are linked together in a pipeline, which makes it easy to rapidly prototype robust applications.

In other words, LangChain is a toolkit that helps developers build applications that are powered by LLMs. It provides a variety of features that make it easy to connect LLMs to external sources, interface with text input and output, and rapidly prototype applications.

Key Features of LangChain

Here are some of the key features of LangChain:

  • Connects LLMs to external sources: LangChain makes it easy to connect LLMs to external sources like Google, Wikipedia, Notion, and Wolfram. This allows developers to access a wider range of data and information when building their applications.
  • Provides abstractions and tools: LangChain provides a variety of abstractions and tools to help developers interface between text input and output. This makes it easier for developers to build applications that can understand and respond to natural language.
  • Links LLM models and components into a pipeline: LangChain links LLM models and components together in a pipeline. This makes it easy for developers to rapidly prototype robust applications (see the sketch after this list).
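
For example, a prompt template and an LLM can be linked into a single runnable pipeline. Here is a minimal sketch using the pre-1.0 langchain import paths; the prompt text and temperature are illustrative:

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# A prompt template is one component; the LLM is another.
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in one sentence.",
)

# LLMChain links the two components into a pipeline.
chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=prompt)
print(chain.run(topic="LangChain"))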

Benefits of Using LangChain

  • It makes it easier to develop LLM-powered applications.
  • It provides a standard interface for interacting with LLMs.
  • It has a library of pre-built components for common tasks.
  • It supports multiple LLMs.
  • It has tools for debugging and monitoring applications.

Key Modules of LangChain

Here are the six key modules of LangChain:

  • Model I/O is the module responsible for connecting the LLM model to the user input and output. It takes the user input, which is typically a prompt, and passes it to the LLM model. The LLM model then generates an output, which is typically text, and the Model I/O module returns the output to the user.
  • Data connection is the module responsible for loading, transforming, storing, and querying user data. It can be used to load data from a variety of sources, such as files, databases, and APIs. It can also be used to transform data into a format that the LLM model can understand (see the sketch after this list).
  • Memory is the module responsible for storing short-term and long-term memories. This allows LangChain to remember previous interactions with the user, which can be used to improve the quality of the output.
  • Chains are a way to combine several components or other chains in a single pipeline. This allows LangChain to build complex applications that can perform a variety of tasks.
  • Agents are responsible for deciding on a course of action to take based on the input. They can use the data connection, memory, and chains modules to get the information they need to make a decision.
  • Callbacks are functions that are triggered to perform at specific points during the execution of an LLM run. This can be used to monitor the progress of the run or to take corrective action if necessary.
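
To make the data connection module concrete, here is a minimal sketch that loads a local text file and splits it into chunks an LLM can work with. The file name is a placeholder, and the import paths are from the pre-1.0 langchain package:

from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load a document from disk; "notes.txt" is a placeholder file name.
docs = TextLoader("notes.txt").load()

# Split it into overlapping chunks sized for an LLM's context window.
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)
print(f"{len(chunks)} chunks ready for embedding or querying")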

In short, the six key modules of LangChain provide a powerful and flexible framework for building applications that use LLMs. They can be used to connect the LLM model to the user, load and transform data, store memories, build complex pipelines, make decisions, and monitor the execution of an LLM run.
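As a sketch of the chains and callbacks modules together, here is a pipeline that composes two LLMChains with SimpleSequentialChain, so the output of the first step becomes the input of the second; verbose=True uses the built-in stdout callback to print each intermediate step. The prompts are illustrative:

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm = OpenAI(temperature=0.7)

# Step 1: propose a name for a product.
name_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template("Suggest a name for a {product} company."),
)

# Step 2: write a tagline for whatever name step 1 produced.
tagline_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template("Write a tagline for the company {name}."),
)

# Chains module: the two steps run as one pipeline; verbose=True triggers
# the stdout callback so each intermediate output is printed.
pipeline = SimpleSequentialChain(chains=[name_chain, tagline_chain], verbose=True)
print(pipeline.run("sock"))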

If you are interested in building applications using LLMs, LangChain is a great option. It is a powerful and versatile framework that can help you build applications that are both efficient and effective. In this blog post, you will learn how to build a simple LLM-powered app using OpenAI, LangChain, and Streamlit.

Getting started

To get started, you will need to:

Get an OpenAI API key

You can do this by following the instructions on the OpenAI website.

Set up a coding environment with Python and the following libraries:

  • streamlit
  • openai
  • langchain

You can install these libraries using the following commands:

pip install streamlit
pip install openai
pip install langchain
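
One caveat: this post uses the original langchain.llms import path, which newer LangChain releases have moved or removed. If the imports below fail in your environment, pinning an older release should restore them (the exact version range is an assumption; check what works for you):

pip install "langchain<0.2"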

Building the app

The app takes in the user’s input text, uses the OpenAI API to generate AI-generated content, and displays the output in a blue box.

The code for the app is as follows:

import streamlit as st
from langchain.llms import OpenAI

st.title('Quickstart App')

# Read the OpenAI API key from the sidebar so it is not hard-coded.
openai_api_key = st.sidebar.text_input('OpenAI API Key', type='password')

def generate_response(input_text):
    # Instantiate the LLM wrapper and display its completion in a blue info box.
    llm = OpenAI(temperature=0.7, openai_api_key=openai_api_key)
    st.info(llm(input_text))

with st.form('my_form'):
    text = st.text_area('Enter text:', 'What are the three key pieces of advice for learning how to code?')
    submitted = st.form_submit_button('Submit')
    if not openai_api_key.startswith('sk-'):
        st.warning('Please enter your OpenAI API key!', icon='⚠')
    if submitted and openai_api_key.startswith('sk-'):
        generate_response(text)

To run the app, you can save the code in a file called streamlit_app.py and then run the following command in the terminal:

streamlit run streamlit_app.py

Deploying the app

You can deploy the app to the cloud by creating a GitHub repository and deploying it to Streamlit Community Cloud.

To do this, follow these steps:

  • Create a GitHub repository for the app.
  • In Streamlit Community Cloud, click the New app button, then specify the repository, branch, and main file path.
  • Click the Deploy! button.
  • Once the app is deployed, you can access it by clicking the URL that is displayed in the Streamlit Community Cloud dashboard.

Conclusion

In this blog post, you learned how to build a simple LLM-powered app using OpenAI, LangChain, and Streamlit.

Here are some resources for further learning:

  • OpenAI website: https://openai.com/
  • LangChain documentation: https://langchain.readthedocs.io/en/latest/
  • Streamlit documentation: https://docs.streamlit.io/en/stable/

Have Queries? Join https://launchpass.com/collabnix

Avinash Bendigeri Avinash is a developer-turned Technical writer skilled in core content creation. He has an excellent track record of blogging in areas like Docker, Kubernetes, IoT and AI.