Originally published as the blog post "langchain source code reading".
This is the third installment of the langchain source-code reading series; this installment covers the Chain module 👇:
Practical notes on building LLM applications
Chain definition
A chain is defined as a sequence of calls to components, and may itself include other chains. The idea of composing components inside a chain is simple but powerful: it greatly simplifies the implementation of complex applications and makes them more modular, which in turn makes applications easier to debug, maintain, and improve.
The Chain base class is the entry point for all chain objects. It interacts with the user program: it handles user input, prepares inputs for the other modules, and provides memory and callback capabilities. Every other Chain class inherits from this base class and implements its specific functionality as needed.
class Chain(BaseModel, ABC):
    memory: BaseMemory
    callbacks: Callbacks

    def __call__(
        self,
        inputs: Any,
        return_only_outputs: bool = False,
        callbacks: Callbacks = None,
    ) -> Dict[str, Any]:
        ...
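The `__call__` flow of the base class can be sketched in plain Python. This is a simplified illustration, not the actual LangChain source: input preparation, memory, and callbacks are reduced to the bare input-merge behavior.

```python
from typing import Any, Dict

class MiniChain:
    """Simplified sketch of Chain.__call__: run _call, then merge inputs and outputs."""

    output_key = "text"

    def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
        # Subclasses implement the real logic; here we just echo the input.
        return {self.output_key: f"echo: {inputs['question']}"}

    def __call__(
        self, inputs: Dict[str, Any], return_only_outputs: bool = False
    ) -> Dict[str, Any]:
        outputs = self._call(inputs)       # run the chain body
        if return_only_outputs:
            return outputs                 # only the chain's own outputs
        return {**inputs, **outputs}       # otherwise merge inputs and outputs

result = MiniChain()({"question": "hi"})
print(result)  # {'question': 'hi', 'text': 'echo: hi'}
```

This mirrors why `llm_chain("...")` later in this article returns a dict containing both the input variables and the output key.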
Implementing a custom chain
from __future__ import annotations

from typing import Any, Dict, List, Optional

from pydantic import Extra

from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import (
    AsyncCallbackManagerForChainRun,
    CallbackManagerForChainRun,
)
from langchain.chains.base import Chain
from langchain.prompts.base import BasePromptTemplate


class MyCustomChain(Chain):
    prompt: BasePromptTemplate
    llm: BaseLanguageModel
    output_key: str = "text"

    class Config:
        extra = Extra.forbid
        arbitrary_types_allowed = True

    @property
    def input_keys(self) -> List[str]:
        """Dynamic variables in the prompt."""
        return self.prompt.input_variables

    @property
    def output_keys(self) -> List[str]:
        """Dynamic variables allowed as direct output."""
        return [self.output_key]

    # Synchronous call
    def _call(
        self,
        inputs: Dict[str, Any],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Dict[str, str]:
        # Below is a custom logic implementation
        prompt_value = self.prompt.format_prompt(**inputs)
        # When calling a language model or another chain, pass a callback handler
        # so the inner run can be traced through this callback.
        response = self.llm.generate_prompt(
            [prompt_value], callbacks=run_manager.get_child() if run_manager else None
        )
        # Log output when the callback fires
        if run_manager:
            run_manager.on_text("Log something about this run")
        return {self.output_key: response.generations[0][0].text}

    # Asynchronous call
    async def _acall(
        self,
        inputs: Dict[str, Any],
        run_manager: Optional[AsyncCallbackManagerForChainRun] = None,
    ) -> Dict[str, str]:
        prompt_value = self.prompt.format_prompt(**inputs)
        response = await self.llm.agenerate_prompt(
            [prompt_value], callbacks=run_manager.get_child() if run_manager else None
        )
        if run_manager:
            await run_manager.on_text("Log something about this run")
        return {self.output_key: response.generations[0][0].text}

    @property
    def _chain_type(self) -> str:
        return "my_custom_chain"
Subclasses of Chain fall into two main types:
General-purpose utility chains: they control the order of chain calls and whether a chain is called at all, and they can be used to compose other chains.
Special-purpose chains: compared with general-purpose chains, these carry out a specific task; they can be combined with general-purpose chains or used on their own. Some Chain classes handle text data, some handle image data, some handle audio data, and so on.
Loading chains from LangChainHub
LangChainHub hosts a number of high-quality prompts, agents, and chains that can be used directly in langchain.
def test_mathchain():
    from langchain.chains import load_chain

    chain = load_chain("lc://chains/llm-math/chain.json")
    """
    > Entering new chain...
    2+2等于幾Answer: 4
    > Finished chain.
    Answer: 4
    """
    print(chain.run("2+2等于幾"))
Five ways to run an LLM chain
from langchain import PromptTemplate, OpenAI, LLMChain

prompt_template = "給做 {product} 的公司起一個(gè)名字?"
llm = OpenAI(temperature=0)
llm_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(prompt_template)
)

print(llm_chain("兒童玩具"))                   # __call__: returns a dict of inputs and outputs
print(llm_chain.run("兒童玩具"))               # run: returns the output string directly
llm_chain.apply([{"product": "兒童玩具"}])     # apply: runs a list of inputs, returns a list of output dicts
llm_chain.generate([{"product": "兒童玩具"}])  # generate: returns a full LLMResult with generations
llm_chain.predict(product="兒童玩具")          # predict: like run, but takes keyword arguments
General-purpose utility chains
- MultiPromptChain: dynamically selects the prompt most relevant to a given question, then answers the question using that prompt.
- EmbeddingRouterChain: uses embeddings and similarity to dynamically select the next chain.
- LLMRouterChain: uses an LLM to dynamically select the next chain.
- SimpleSequentialChain/SequentialChain: compose multiple chains into a sequential pipeline; SimpleMemory supports passing context between the chains.
- TransformChain: a chain that applies a custom function as a dynamic transformation.
def transform_func(inputs: dict) -> dict:
    text = inputs["text"]
    shortened_text = "\n\n".join(text.split("\n\n")[:3])
    return {"output_text": shortened_text}

transform_chain = TransformChain(
    input_variables=["text"], output_variables=["output_text"], transform=transform_func
)

template = """Summarize this text:

{output_text}

Summary:"""
prompt = PromptTemplate(input_variables=["output_text"], template=template)
llm_chain = LLMChain(llm=OpenAI(), prompt=prompt)
sequential_chain = SimpleSequentialChain(chains=[transform_chain, llm_chain])
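The transform step is ordinary Python, so its behavior can be checked in isolation: given a text with more than three blank-line-separated paragraphs, it keeps only the first three.

```python
def transform_func(inputs: dict) -> dict:
    # Keep only the first three blank-line-separated paragraphs.
    text = inputs["text"]
    shortened_text = "\n\n".join(text.split("\n\n")[:3])
    return {"output_text": shortened_text}

sample = "p1\n\np2\n\np3\n\np4"
print(transform_func({"text": sample}))  # {'output_text': 'p1\n\np2\n\np3'}
```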
Document-combining chains (special-purpose chains)
BaseCombineDocumentsChain comes in four different modes:
def load_qa_chain(
    llm: BaseLanguageModel,
    chain_type: str = "stuff",
    verbose: Optional[bool] = None,
    callback_manager: Optional[BaseCallbackManager] = None,
    **kwargs: Any,
) -> BaseCombineDocumentsChain:
    """Load question answering chain.

    Args:
        llm: Language Model to use in the chain.
        chain_type: Type of document combining chain to use. Should be one of "stuff",
            "map_reduce", "map_rerank", and "refine".
        verbose: Whether chains should be run in verbose mode or not. Note that this
            applies to all chains that make up the final chain.
        callback_manager: Callback manager to use for the chain.

    Returns:
        A chain to use for question answering.
    """
    loader_mapping: Mapping[str, LoadingCallable] = {
        "stuff": _load_stuff_chain,
        "map_reduce": _load_map_reduce_chain,
        "refine": _load_refine_chain,
        "map_rerank": _load_map_rerank_chain,
    }
StuffDocumentsChain
Takes a list of documents, stuffs them all into the prompt context, and passes that to the LLM (suitable for small documents).
def _load_stuff_chain(
    llm: BaseLanguageModel,
    prompt: Optional[BasePromptTemplate] = None,
    document_variable_name: str = "context",
    verbose: Optional[bool] = None,
    callback_manager: Optional[BaseCallbackManager] = None,
    callbacks: Callbacks = None,
    **kwargs: Any,
) -> StuffDocumentsChain:
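Conceptually, the "stuff" strategy concatenates every document into a single prompt and makes one LLM call. A plain-Python sketch of the idea (not the LangChain implementation; the stub `llm` just stands in for a real model):

```python
def stuff_answer(docs: list, question: str, llm) -> str:
    # Concatenate all documents into one context and ask the LLM once.
    context = "\n\n".join(docs)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)

# Stub "LLM" that echoes its prompt, to show the single-call shape.
print(stuff_answer(["doc one", "doc two"], "what?", lambda p: p))
```

Because everything goes into one call, the whole document set must fit in the model's context window; the other three modes exist to lift that limit.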
RefineDocumentsChain
A further refinement of the Stuff approach: it loops over the input documents and iteratively updates its answer to obtain the best final result. Concretely, it passes all non-document inputs, the current document, and the latest intermediate answer to the LLM. (Suitable when the documents cannot all fit into the LLM's context window.)
def _load_refine_chain(
    llm: BaseLanguageModel,
    question_prompt: Optional[BasePromptTemplate] = None,
    refine_prompt: Optional[BasePromptTemplate] = None,
    document_variable_name: str = "context_str",
    initial_response_name: str = "existing_answer",
    refine_llm: Optional[BaseLanguageModel] = None,
    verbose: Optional[bool] = None,
    callback_manager: Optional[BaseCallbackManager] = None,
    callbacks: Callbacks = None,
    **kwargs: Any,
) -> RefineDocumentsChain:
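The refine loop can be sketched as: get an initial answer from the first document, then revisit the answer once per remaining document. A plain-Python illustration with a stub LLM (not the LangChain implementation):

```python
def refine_answer(docs: list, question: str, llm) -> str:
    # Initial answer from the first document.
    answer = llm(f"Question: {question}\nDocument: {docs[0]}\nAnswer:")
    # Refine once per remaining document, feeding back the latest answer.
    for doc in docs[1:]:
        answer = llm(
            f"Question: {question}\nExisting answer: {answer}\n"
            f"Refine using: {doc}\nRefined answer:"
        )
    return answer

calls = []
def fake_llm(prompt):
    calls.append(prompt)
    return f"answer v{len(calls)}"

print(refine_answer(["d1", "d2", "d3"], "q?", fake_llm))  # answer v3 (one LLM call per document)
```

Note the calls are inherently sequential: each refinement needs the previous answer, so this mode cannot be parallelized the way map_reduce can.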
MapReduceDocumentsChain
Applies an LLM chain to each individual document (the Map step) and treats the chain's outputs as new documents. All the new documents are then passed to a separate combine-documents chain to produce a single output (the Reduce step). The mapped documents can first be compressed (collapsed) to make sure they fit into the combine-documents chain, and this collapsing can be applied recursively until the requirement is met. (Suitable for large collections of documents.)
def _load_map_reduce_chain(
    llm: BaseLanguageModel,
    question_prompt: Optional[BasePromptTemplate] = None,
    combine_prompt: Optional[BasePromptTemplate] = None,
    combine_document_variable_name: str = "summaries",
    map_reduce_document_variable_name: str = "context",
    collapse_prompt: Optional[BasePromptTemplate] = None,
    reduce_llm: Optional[BaseLanguageModel] = None,
    collapse_llm: Optional[BaseLanguageModel] = None,
    verbose: Optional[bool] = None,
    callback_manager: Optional[BaseCallbackManager] = None,
    callbacks: Callbacks = None,
    **kwargs: Any,
) -> MapReduceDocumentsChain:
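The map/reduce flow in plain Python (a conceptual sketch with a stub LLM, not the LangChain implementation; the optional collapse step is omitted):

```python
def map_reduce_answer(docs: list, question: str, llm) -> str:
    # Map: run the LLM over each document independently.
    mapped = [llm(f"Question: {question}\nDocument: {doc}\nPartial answer:") for doc in docs]
    # Reduce: combine all partial answers in a single final call.
    summaries = "\n".join(mapped)
    return llm(f"Question: {question}\nPartial answers:\n{summaries}\nFinal answer:")

calls = []
def fake_llm(prompt):
    calls.append(prompt)
    return f"result {len(calls)}"

print(map_reduce_answer(["d1", "d2", "d3"], "q?", fake_llm))  # result 4: three map calls + one reduce call
```

Since each map call only sees one document, the map step can run over arbitrarily many documents (and in parallel); only the joined partial answers must fit into the final reduce call.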
MapRerankDocumentsChain
Runs an initial prompt on each document, has the LLM assign a score to each corresponding output, and returns the highest-scoring answer.
def _load_map_rerank_chain(
    llm: BaseLanguageModel,
    prompt: BasePromptTemplate = map_rerank_prompt.PROMPT,
    verbose: bool = False,
    document_variable_name: str = "context",
    rank_key: str = "score",
    answer_key: str = "answer",
    callback_manager: Optional[BaseCallbackManager] = None,
    callbacks: Callbacks = None,
    **kwargs: Any,
) -> MapRerankDocumentsChain:
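The rerank idea in plain Python: score each per-document answer and keep the best. This is a stub sketch, with a toy scoring function standing in for the LLM's self-reported score (the `rank_key`/`answer_key` pair in the real chain):

```python
def map_rerank_answer(docs: list, question: str, llm_with_score) -> str:
    # Run the prompt on every document; each call returns (answer, score).
    scored = [llm_with_score(f"Question: {question}\nDocument: {doc}") for doc in docs]
    # Keep the highest-scoring answer.
    best_answer, best_score = max(scored, key=lambda pair: pair[1])
    return best_answer

# Toy stand-in: the "score" is just the prompt length.
fake = lambda prompt: (f"answer from {prompt.splitlines()[1]}", len(prompt))
print(map_rerank_answer(["short", "a much longer document"], "q?", fake))
# answer from Document: a much longer document
```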
Chains for retrieving domain knowledge (special-purpose chains)
APIChain lets LLMs interact with APIs to retrieve relevant information. The chain is built by asking questions against the provided API documentation.
Below is an example of querying a podcast API:
from langchain.llms import OpenAI
from langchain.chains.api import podcast_docs
from langchain.chains import APIChain

listen_api_key = 'xxx'
llm = OpenAI(temperature=0)
headers = {"X-ListenAPI-Key": listen_api_key}
chain = APIChain.from_llm_and_api_docs(llm, podcast_docs.PODCAST_DOCS, headers=headers, verbose=True)
chain.run("搜索關(guān)于ChatGPT的節(jié)目, 要求超過30分鐘,只返回一條")
Common use cases for document-combining chains
Conversation (the most common)
How ConversationalRetrievalChain works: it merges the chat history (passed in explicitly or retrieved from the provided memory) and the question into a standalone question, looks up relevant documents with the retriever, and finally passes those documents and the question to a question-answering chain to produce the response.
def test_conversation():
    from langchain.chains import ConversationalRetrievalChain
    from langchain.document_loaders import TextLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.llms import OpenAI
    from langchain.memory import ConversationBufferMemory
    from langchain.text_splitter import CharacterTextSplitter
    from langchain.vectorstores import Chroma

    loader = TextLoader("./test.txt")
    documents = loader.load()
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    documents = text_splitter.split_documents(documents)
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(documents, embeddings)
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), memory=memory)
    query = "這本書包含哪些內(nèi)容?"
    result = qa({"question": query})
    print(result)
    chat_history = [(query, result["answer"])]
    query = "還有要補(bǔ)充的嗎"
    result = qa({"question": query, "chat_history": chat_history})
    print(result["answer"])
Database question answering
def test_db_chain():
    from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

    db = SQLDatabase.from_uri("sqlite:///../user.db")
    llm = OpenAI(temperature=0, verbose=True)
    db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True, use_query_checker=True)
    db_chain.run("有多少用戶?")
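SQLDatabaseChain's flow — question → generated SQL → execute → natural-language answer — can be sketched with the standard library, using a stub in place of the LLM's text-to-SQL step. The table name `user` here is a hypothetical matching the `user.db` above; this is not the LangChain implementation.

```python
import sqlite3

def sql_chain(question: str, conn, text_to_sql) -> str:
    sql = text_to_sql(question)           # the LLM would translate the question to SQL
    rows = conn.execute(sql).fetchall()   # run the generated query against the database
    return f"{question} -> {rows[0][0]}"  # the LLM would phrase the final answer

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER)")
conn.executemany("INSERT INTO user VALUES (?)", [(1,), (2,), (3,)])

# Stub standing in for the LLM's text-to-SQL step.
stub = lambda q: "SELECT COUNT(*) FROM user"
print(sql_chain("How many users?", conn, stub))  # How many users? -> 3
```

The `use_query_checker=True` flag above adds one more LLM pass that reviews the generated SQL before it is executed.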
Summarization
def test_summary():
    from langchain.chains.summarize import load_summarize_chain
    from langchain.docstore.document import Document
    from langchain.llms import OpenAI
    from langchain.text_splitter import CharacterTextSplitter

    text_splitter = CharacterTextSplitter()
    with open("./測(cè)試.txt") as f:
        state_of_the_union = f.read()
    texts = text_splitter.split_text(state_of_the_union)
    docs = [Document(page_content=t) for t in texts[:3]]
    chain = load_summarize_chain(OpenAI(temperature=0), chain_type="map_reduce")
    chain.run(docs)
Question answering
def test_qa():
    from langchain.chains import RetrievalQA
    from langchain.chains.question_answering import load_qa_chain
    from langchain.document_loaders import TextLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.llms import OpenAI
    from langchain.text_splitter import CharacterTextSplitter
    from langchain.vectorstores import Chroma

    loader = TextLoader("./測(cè)試.txt")
    documents = loader.load()
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.split_documents(documents)
    embeddings = OpenAIEmbeddings()
    docsearch = Chroma.from_documents(texts, embeddings)
    qa_chain = load_qa_chain(OpenAI(temperature=0), chain_type="map_reduce")
    qa = RetrievalQA(combine_documents_chain=qa_chain, retriever=docsearch.as_retriever())
    qa.run("這本書包含哪些內(nèi)容?")  # run takes the query string