■ Shows how to search a Chroma vector store using the chat history.
※ The value of the OPENAI_API_KEY environment variable is defined in the .env file.
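As a minimal sketch, the key can be checked before any model call so that a missing .env entry fails early with a clear message. The helper name `require_api_key` is hypothetical and does not appear in main.py; it only reads a mapping such as `os.environ`, which `load_dotenv()` fills from the .env file.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the value stored under `name`, or raise if it is missing or blank.

    Hypothetical helper for illustration; main.py itself does not define it.
    """
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; define it in the .env file")
    return value


if __name__ == "__main__":
    # In main.py this check would run right after load_dotenv().
    try:
        require_api_key(os.environ)
        print("OPENAI_API_KEY found")
    except RuntimeError as error:
        print(error)
```

Passing the mapping in as an argument, rather than reading `os.environ` inside the function, keeps the check easy to exercise with a plain dictionary.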
▶ main.py
import bs4

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import MessagesPlaceholder
from langchain.chains import create_history_aware_retriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

load_dotenv()

chatOpenAI = ChatOpenAI(model = "gpt-4o")

webBaseLoader = WebBaseLoader(
    web_paths = ("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs = dict(parse_only = bs4.SoupStrainer(class_ = ("post-content", "post-title", "post-header")))
)

documentList = webBaseLoader.load()

recursiveCharacterTextSplitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 200)

splitDocumentList = recursiveCharacterTextSplitter.split_documents(documentList)

openAIEmbeddings = OpenAIEmbeddings()

chroma = Chroma.from_documents(documents = splitDocumentList, embedding = openAIEmbeddings)

vectorStoreRetriever = chroma.as_retriever()

systemMessage1 = "Given a chat history and the latest user question which might reference context in the chat history, formulate a standalone question which can be understood without the chat history. Do NOT answer the question, just reformulate it if needed and otherwise return it as is."

chatPromptTemplate1 = ChatPromptTemplate.from_messages(
    [
        ("system", systemMessage1),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)

runnableBinding1 = create_history_aware_retriever(chatOpenAI, vectorStoreRetriever, chatPromptTemplate1)

systemMessage2 = "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}"

chatPromptTemplate2 = ChatPromptTemplate.from_messages(
    [
        ("system", systemMessage2),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

runnableBinding2 = create_stuff_documents_chain(chatOpenAI, chatPromptTemplate2)

runnableBinding3 = create_retrieval_chain(runnableBinding1, runnableBinding2)

chatMessageHistoryDictionary = {}

def GetChatMessageHistoryDictionary(session_id : str) -> BaseChatMessageHistory:
    if session_id not in chatMessageHistoryDictionary:
        chatMessageHistoryDictionary[session_id] = ChatMessageHistory()
    return chatMessageHistoryDictionary[session_id]

runnableWithMessageHistory = RunnableWithMessageHistory(
    runnableBinding3,
    GetChatMessageHistoryDictionary,
    input_messages_key = "input",
    history_messages_key = "chat_history",
    output_messages_key = "answer",
)

responseDictionary1 = runnableWithMessageHistory.invoke(
    {"input" : "What is Task Decomposition?"},
    config = {"configurable" : {"session_id" : "abc123"}}
)

answer1 = responseDictionary1["answer"]

print(answer1)
print("-" * 50)

"""
Task Decomposition is a technique used to break down complex tasks into smaller, manageable steps. It is often implemented using methods like Chain of Thought (CoT) or Tree of Thoughts, which help in systematically exploring and reasoning through various possibilities. This approach enhances model performance by allowing a step-by-step analysis and execution of tasks.
"""

responseDictionary2 = runnableWithMessageHistory.invoke(
    {"input" : "What are common ways of doing it?"},
    config = {"configurable" : {"session_id" : "abc123"}}
)

answer2 = responseDictionary2["answer"]

print(answer2)
print("-" * 50)

from langchain_core.messages import AIMessage

for message in chatMessageHistoryDictionary["abc123"].messages:
    if isinstance(message, AIMessage):
        prefix = "AI"
    else:
        prefix = "User"
    print(f"{prefix} : {message.content}")

print("-" * 50)

"""
Task decomposition is the process of breaking down a complex task into smaller, more manageable steps or subgoals. This approach, often used in conjunction with techniques like Chain of Thought (CoT), helps enhance model performance by enabling step-by-step reasoning. It can be achieved through prompting, task-specific instructions, or human inputs.
--------------------------------------------------
Common ways of performing task decomposition include using straightforward prompts like "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?", employing task-specific instructions such as "Write a story outline" for writing a novel, and incorporating human inputs.
--------------------------------------------------
User : What is Task Decomposition?
AI : Task decomposition is the process of breaking down a complex task into smaller, more manageable steps or subgoals. This approach, often used in conjunction with techniques like Chain of Thought (CoT), helps enhance model performance by enabling step-by-step reasoning. It can be achieved through prompting, task-specific instructions, or human inputs.
User : What are common ways of doing it?
AI : Common ways of performing task decomposition include using straightforward prompts like "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?", employing task-specific instructions such as "Write a story outline" for writing a novel, and incorporating human inputs.
--------------------------------------------------
"""
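The per-session lookup that main.py hands to RunnableWithMessageHistory can be sketched without any network access. The version below is a standard-library stand-in, not the LangChain implementation: `SimpleHistory`, `get_history`, and `record_turn` are hypothetical names, and `record_turn` only imitates the bookkeeping of appending the user input and the answer to the history selected by `session_id`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SimpleHistory:
    """Stand-in for ChatMessageHistory: an ordered list of (role, text) pairs."""
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))


# One history object per session id, created lazily.
store: Dict[str, SimpleHistory] = {}


def get_history(session_id: str) -> SimpleHistory:
    """Create the history on first use, then return the same object each time."""
    if session_id not in store:
        store[session_id] = SimpleHistory()
    return store[session_id]


def record_turn(session_id: str, question: str, answer: str) -> int:
    """Append one user/AI exchange and return the session's message count."""
    history = get_history(session_id)
    history.add("User", question)
    history.add("AI", answer)
    return len(history.messages)


if __name__ == "__main__":
    # Placeholder answers; no model is called in this sketch.
    record_turn("abc123", "What is Task Decomposition?", "(first answer)")
    record_turn("abc123", "What are common ways of doing it?", "(second answer)")
    record_turn("xyz789", "An unrelated question", "(third answer)")
    for session_id, history in store.items():
        print(session_id, len(history.messages))
```

Because both calls in main.py use the session id "abc123", the second question is answered with the first exchange available as chat history, which is what lets "it" be resolved to task decomposition. A different session id would start from an empty history.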
▶ requirements.txt
aiohappyeyeballs==2.4.3
aiohttp==3.11.7
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.6.2.post1
asgiref==3.8.1
attrs==24.2.0
backoff==2.2.1
bcrypt==4.2.1
beautifulsoup4==4.12.3
bs4==0.0.2
build==1.2.2.post1
cachetools==5.5.0
certifi==2024.8.30
charset-normalizer==3.4.0
chroma-hnswlib==0.7.6
chromadb==0.5.20
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
dataclasses-json==0.6.7
Deprecated==1.2.15
distro==1.9.0
durationpy==0.9
fastapi==0.115.5
filelock==3.16.1
flatbuffers==24.3.25
frozenlist==1.5.0
fsspec==2024.10.0
google-auth==2.36.0
googleapis-common-protos==1.66.0
greenlet==3.1.1
grpcio==1.68.0
h11==0.14.0
httpcore==1.0.7
httptools==0.6.4
httpx==0.27.2
httpx-sse==0.4.0
huggingface-hub==0.26.2
humanfriendly==10.0
idna==3.10
importlib_metadata==8.5.0
importlib_resources==6.4.5
jiter==0.8.0
jsonpatch==1.33
jsonpointer==3.0.0
kubernetes==31.0.0
langchain==0.3.8
langchain-chroma==0.1.4
langchain-community==0.3.8
langchain-core==0.3.21
langchain-openai==0.2.10
langchain-text-splitters==0.3.2
langsmith==0.1.146
markdown-it-py==3.0.0
marshmallow==3.23.1
mdurl==0.1.2
mmh3==5.0.1
monotonic==1.6
mpmath==1.3.0
multidict==6.1.0
mypy-extensions==1.0.0
numpy==1.26.4
oauthlib==3.2.2
onnxruntime==1.20.1
openai==1.55.1
opentelemetry-api==1.28.2
opentelemetry-exporter-otlp-proto-common==1.28.2
opentelemetry-exporter-otlp-proto-grpc==1.28.2
opentelemetry-instrumentation==0.49b2
opentelemetry-instrumentation-asgi==0.49b2
opentelemetry-instrumentation-fastapi==0.49b2
opentelemetry-proto==1.28.2
opentelemetry-sdk==1.28.2
opentelemetry-semantic-conventions==0.49b2
opentelemetry-util-http==0.49b2
orjson==3.10.12
overrides==7.7.0
packaging==24.2
posthog==3.7.3
propcache==0.2.0
protobuf==5.28.3
pyasn1==0.6.1
pyasn1_modules==0.4.1
pydantic==2.10.2
pydantic-settings==2.6.1
pydantic_core==2.27.1
Pygments==2.18.0
PyPika==0.48.9
pyproject_hooks==1.2.0
pyreadline3==3.5.4
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rich==13.9.4
rsa==4.9
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.35
starlette==0.41.3
sympy==1.13.3
tenacity==9.0.0
tiktoken==0.8.0
tokenizers==0.20.4
tqdm==4.67.1
typer==0.13.1
typing-inspect==0.9.0
typing_extensions==4.12.2
urllib3==2.2.3
uvicorn==0.32.1
watchfiles==1.0.0
websocket-client==1.8.0
websockets==14.1
wrapt==1.17.0
yarl==1.18.0
zipp==3.21.0
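※ The command pip install python-dotenv langchain langchain-community langchain-openai bs4 was run. main.py also imports langchain_chroma, which this command does not install; requirements.txt pins it as langchain-chroma==0.1.4.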
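To confirm that the environment has the packages main.py needs, a short standard-library check can report the installed version of each one. This is only a sketch: the function name `installed_versions` is hypothetical, and the list of names is an assumption drawn from the imports in main.py and the pins in requirements.txt.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, List, Optional


def installed_versions(packages: List[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names as they appear in requirements.txt.
    wanted = [
        "python-dotenv",
        "langchain",
        "langchain-community",
        "langchain-openai",
        "langchain-chroma",
        "bs4",
    ]
    for name, found_version in installed_versions(wanted).items():
        print(f"{name}: {found_version or 'not installed'}")
```

Any name reported as not installed can then be added to the pip install command or installed from requirements.txt.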