■ Demonstrates how to use the with_structured_output method of the ChatOpenAI class so that a RAG application returns, as part of a structured response, the sources the model used for its answer.
※ The OPENAI_API_KEY environment variable value is defined in the .env file.
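※ For reference, a minimal .env file for this example contains a single line in the form below; the key value shown is a placeholder, not a real key.

OPENAI_API_KEY=sk-...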
▶ main.py
import bs4
from dotenv import load_dotenv
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from typing_extensions import TypedDict
from typing_extensions import Annotated
from typing import List
from langchain_core.runnables import RunnablePassthrough

# Load the OPENAI_API_KEY environment variable from the .env file.
load_dotenv()

# Load the blog post, keeping only the title, header, and body content.
webBaseLoader = WebBaseLoader(
    web_paths = ("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs = dict(parse_only = bs4.SoupStrainer(class_ = ("post-content", "post-title", "post-header")))
)

documentList = webBaseLoader.load()

# Split the document into overlapping chunks for embedding.
recursiveCharacterTextSplitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 200)

splitDocumentList = recursiveCharacterTextSplitter.split_documents(documentList)

# Index the chunks in a Chroma vector store and expose it as a retriever.
chroma = Chroma.from_documents(documents = splitDocumentList, embedding = OpenAIEmbeddings())

vectorStoreRetriever = chroma.as_retriever()

# Retrieval step : extract the question from the input and fetch the relevant chunks.
runnableSequence1 = (lambda x : x["input"]) | vectorStoreRetriever

# Concatenate the retrieved chunks into a single context string.
def getStringFromDocumentList(documentList):
    return "\n\n".join(document.page_content for document in documentList)

chatPromptTemplate = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}"),
        ("human", "{input}")
    ]
)

chatOpenAI = ChatOpenAI(model = "gpt-4o-mini")

# Desired response structure : the model must return the answer together with the sources it used.
class ResponseModel(TypedDict):
    """An answer to the question, with sources."""

    answer : str
    sources : Annotated[List[str], ..., "List of sources (author + year) used to answer the question"]

# Generation step : format the context, fill in the prompt, and force the structured output.
runnableSequence2 = (
    {
        "input" : lambda x : x["input"],
        "context" : lambda x : getStringFromDocumentList(x["context"])
    }
    | chatPromptTemplate
    | chatOpenAI.with_structured_output(ResponseModel)
)

# Full chain : keep the input, attach the retrieved context, then attach the structured answer.
runnableSequence3 = RunnablePassthrough.assign(context = runnableSequence1).assign(answer = runnableSequence2)

responseDictionary = runnableSequence3.invoke({"input" : "What is Chain of Thought?"})

print(responseDictionary)

"""
{
    'input' : 'What is Chain of Thought?',
    'context' : [
        Document(
            metadata = {'source' : 'https://lilianweng.github.io/posts/2023-06-23-agent/'},
            page_content = 'Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\nTask decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.'
        ),
        Document(
            metadata = {'source' : 'https://lilianweng.github.io/posts/2023-06-23-agent/'},
            page_content = 'Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.'
        ),
        Document(
            metadata = {'source' : 'https://lilianweng.github.io/posts/2023-06-23-agent/'},
            page_content = 'Chain of Hindsight (CoH; Liu et al. 2023) encourages the model to improve on its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback. Human feedback data is a collection of $D_h = \\{(x, y_i , r_i , z_i)\\}_{i=1}^n$, where $x$ is the prompt, each $y_i$ is a model completion, $r_i$ is the human rating of $y_i$, and $z_i$ is the corresponding human-provided hindsight feedback. Assume the feedback tuples are ranked by reward, $r_n \\geq r_{n-1} \\geq \\dots \\geq r_1$ The process is supervised fine-tuning where the data is a sequence in the form of $\\tau_h = (x, z_i, y_i, z_j, y_j, \\dots, z_n, y_n)$, where $\\leq i \\leq j \\leq n$. The model is finetuned to only predict $y_n$ where conditioned on the sequence prefix, such that the model can self-reflect to produce better output based on the feedback sequence. The model can optionally receive multiple rounds of instructions with human annotators at test time.'
        ),
        Document(
            metadata = {'source' : 'https://lilianweng.github.io/posts/2023-06-23-agent/'},
            page_content = 'Fig. 2. Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).\nIn both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.\nReflexion (Shinn & Labash 2023) is a framework to equips agents with dynamic memory and self-reflection capabilities to improve reasoning skills. Reflexion has a standard RL setup, in which the reward model provides a simple binary reward and the action space follows the setup in ReAct where the task-specific action space is augmented with language to enable complex reasoning steps. After each action $a_t$, the agent computes a heuristic $h_t$ and optionally may decide to reset the environment to start a new trial depending on the self-reflection results.'
        )
    ],
    'answer' : {
        'answer' : 'Chain of Thought (CoT), introduced by Wei et al. in 2022, is a prompting technique that encourages language models to decompose complex tasks into smaller, manageable steps by instructing them to "think step by step." This method enhances model performance by utilizing more test-time computation and provides insights into the model\'s reasoning process. CoT transforms larger tasks into a series of simpler sub-tasks, making problem-solving more structured and interpretable.',
        'sources' : ['Wei et al. 2022']
    }
}
"""
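※ For reference, with_structured_output also accepts a Pydantic model in place of the TypedDict used above, which adds runtime validation of the returned fields. The sketch below is an illustrative variation, not part of the listing; the PydanticResponseModel and structuredChatOpenAI names are assumptions introduced here.

from pydantic import BaseModel, Field
from typing import List
from langchain_openai import ChatOpenAI

# Illustrative variation : the same response structure declared as a Pydantic model.
class PydanticResponseModel(BaseModel):
    """An answer to the question, with sources."""

    answer : str = Field(description = "The answer to the question")
    sources : List[str] = Field(description = "List of sources (author + year) used to answer the question")

chatOpenAI = ChatOpenAI(model = "gpt-4o-mini")

# with_structured_output now returns validated PydanticResponseModel instances
# instead of plain dictionaries.
structuredChatOpenAI = chatOpenAI.with_structured_output(PydanticResponseModel)

Swapping this into runnableSequence2 would make the 'answer' value a model object rather than a dictionary, so downstream code that expects a dictionary would need a small change (for example, calling .model_dump()).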
▶ requirements.txt
aiohappyeyeballs==2.4.4
aiohttp==3.11.9
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.6.2.post1
asgiref==3.8.1
attrs==24.2.0
backoff==2.2.1
bcrypt==4.2.1
beautifulsoup4==4.12.3
bs4==0.0.2
build==1.2.2.post1
cachetools==5.5.0
certifi==2024.8.30
charset-normalizer==3.4.0
chroma-hnswlib==0.7.6
chromadb==0.5.20
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
dataclasses-json==0.6.7
Deprecated==1.2.15
distro==1.9.0
durationpy==0.9
fastapi==0.115.5
filelock==3.16.1
flatbuffers==24.3.25
frozenlist==1.5.0
fsspec==2024.10.0
google-auth==2.36.0
googleapis-common-protos==1.66.0
greenlet==3.1.1
grpcio==1.68.1
h11==0.14.0
httpcore==1.0.7
httptools==0.6.4
httpx==0.28.0
httpx-sse==0.4.0
huggingface-hub==0.26.3
humanfriendly==10.0
idna==3.10
importlib_metadata==8.5.0
importlib_resources==6.4.5
jiter==0.8.0
jsonpatch==1.33
jsonpointer==3.0.0
kubernetes==31.0.0
langchain==0.3.9
langchain-chroma==0.1.4
langchain-community==0.3.9
langchain-core==0.3.21
langchain-openai==0.2.10
langchain-text-splitters==0.3.2
langsmith==0.1.147
markdown-it-py==3.0.0
marshmallow==3.23.1
mdurl==0.1.2
mmh3==5.0.1
monotonic==1.6
mpmath==1.3.0
multidict==6.1.0
mypy-extensions==1.0.0
numpy==1.26.4
oauthlib==3.2.2
onnxruntime==1.20.1
openai==1.56.0
opentelemetry-api==1.28.2
opentelemetry-exporter-otlp-proto-common==1.28.2
opentelemetry-exporter-otlp-proto-grpc==1.28.2
opentelemetry-instrumentation==0.49b2
opentelemetry-instrumentation-asgi==0.49b2
opentelemetry-instrumentation-fastapi==0.49b2
opentelemetry-proto==1.28.2
opentelemetry-sdk==1.28.2
opentelemetry-semantic-conventions==0.49b2
opentelemetry-util-http==0.49b2
orjson==3.10.12
overrides==7.7.0
packaging==24.2
posthog==3.7.4
propcache==0.2.1
protobuf==5.29.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pydantic==2.10.2
pydantic-settings==2.6.1
pydantic_core==2.27.1
Pygments==2.18.0
PyPika==0.48.9
pyproject_hooks==1.2.0
pyreadline3==3.5.4
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rich==13.9.4
rsa==4.9
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.36
starlette==0.41.3
sympy==1.13.3
tenacity==9.0.0
tiktoken==0.8.0
tokenizers==0.21.0
tqdm==4.67.1
typer==0.14.0
typing-inspect==0.9.0
typing_extensions==4.12.2
urllib3==2.2.3
uvicorn==0.32.1
watchfiles==1.0.0
websocket-client==1.8.0
websockets==14.1
wrapt==1.17.0
yarl==1.18.3
zipp==3.21.0
※ The packages were installed by running the command pip install python-dotenv langchain langchain-community langchain-openai langchain-chroma bs4.
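※ The requirements.txt above pins the exact versions that command resolved to; to reproduce the same environment, pip install -r requirements.txt can be run instead.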