■ This example shows how to use a prompt string directly with the ChatPromptTemplate class and obtain document IDs and quotes from the model's structured response.
※ The value of the OPENAI_API_KEY environment variable is defined in the .env file.
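A missing or empty key is easier to diagnose with an explicit check than with an error raised from inside the OpenAI client. The following is a minimal sketch using only the standard library, assuming the .env file holds a line of the form `OPENAI_API_KEY=...`. The helper name `getRequiredEnvironmentVariable` and its optional `environment` parameter are hypothetical and exist only for illustration; they are not part of python-dotenv or LangChain.

```python
import os
from typing import Mapping, Optional


def getRequiredEnvironmentVariable(name : str, environment : Optional[Mapping[str, str]] = None) -> str:
    """Return the named variable's value, or raise a clear error if it is missing or blank."""
    # Fall back to the real process environment when no mapping is passed in.
    if environment is None:
        environment = os.environ
    value = environment.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; define it in the .env file or in the shell.")
    return value


# Demonstration with an explicit mapping, so the real environment is not touched.
sampleEnvironment = {"OPENAI_API_KEY" : "sk-example"}
print(getRequiredEnvironmentVariable("OPENAI_API_KEY", sampleEnvironment))
```

In main.py, a call such as `getRequiredEnvironmentVariable("OPENAI_API_KEY")` would go right after `load_dotenv()`, before `ChatOpenAI` is constructed.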
▶ main.py
from dotenv import load_dotenv
from langchain_community.retrievers import WikipediaRetriever
from typing import List
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import XMLOutputParser

load_dotenv()

wikipediaRetriever = WikipediaRetriever(top_k_results = 6, doc_content_chars_max = 2000)

runnableSequence1 = (lambda x: x["input"]) | wikipediaRetriever

def getStringFromDocumentList(documentList : List[Document]) -> str:
    targetList = []
    for index, document in enumerate(documentList):
        documentXML = f"""\
<source id=\"{index}\">
    <title>{document.metadata['title']}</title>
    <article_snippet>{document.page_content}</article_snippet>
</source>"""
        targetList.append(documentXML)
    return "\n\n<sources>" + "\n".join(targetList) + "</sources>"

systemString = """You're a helpful AI assistant. Given a user question and some Wikipedia article snippets, answer the user question and provide citations. If none of the articles answer the question, just say you don't know.

Remember, you must return both an answer and citations. A citation consists of a VERBATIM quote that justifies the answer and the ID of the quote article. Return a citation for every quote across all articles that justify the answer. Use the following format for your final output :

<cited_answer>
    <answer></answer>
    <citations>
        <citation><source_id></source_id><quote></quote></citation>
        <citation><source_id></source_id><quote></quote></citation>
        ...
    </citations>
</cited_answer>

Here are the Wikipedia articles : {context}"""

chatPromptTemplate = ChatPromptTemplate.from_messages(
    [
        ("system", systemString),
        ("human" , "{input}"   )
    ]
)

chatOpenAI = ChatOpenAI(model = "gpt-4o-mini")

runnableSequence2 = (
    RunnablePassthrough.assign(context = (lambda x : getStringFromDocumentList(x["context"]))) |
    chatPromptTemplate |
    chatOpenAI |
    XMLOutputParser()
)

runnableSequence3 = RunnablePassthrough.assign(context = runnableSequence1).assign(answer = runnableSequence2)

responseDictionary = runnableSequence3.invoke({"input" : "How fast are cheetahs?"})

print(responseDictionary["answer"])

"""
{
    'cited_answer' : [
        {'answer' : 'Cheetahs are capable of running at speeds of 93 to 104 km/h (58 to 65 mph).'},
        {
            'citations' : [
                {
                    'citation' : [
                        {'source_id' : '0'},
                        {'quote' : 'The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail.'}
                    ]
                },
                {
                    'citation' : [
                        {'source_id' : '3'},
                        {'quote' : 'The fastest land animal is the cheetah.'}
                    ]
                }
            ]
        }
    ]
}
"""
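The parsed result printed at the end of main.py wraps every XML element in a single-key dictionary inside a list, so reading out the document IDs and quotes takes a little unpacking. The following is a minimal sketch, assuming the answer has exactly the shape shown in that sample output. The helper names `mergeSingleKeyDictionaryList` and `getAnswerAndCitationList` are hypothetical and not part of LangChain, and the sample dictionary is copied from the output above rather than produced by a live model call.

```python
from typing import Any, Dict, List, Tuple


def mergeSingleKeyDictionaryList(dictionaryList : List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge a list such as [{'a' : 1}, {'b' : 2}] into {'a' : 1, 'b' : 2}."""
    targetDictionary : Dict[str, Any] = {}
    for dictionary in dictionaryList:
        targetDictionary.update(dictionary)
    return targetDictionary


def getAnswerAndCitationList(answerDictionary : Dict[str, Any]) -> Tuple[str, List[Tuple[int, str]]]:
    """Return the answer text and a list of (source ID, quote) pairs."""
    citedAnswerDictionary = mergeSingleKeyDictionaryList(answerDictionary["cited_answer"])
    answer = citedAnswerDictionary.get("answer") or ""
    citationItemList = citedAnswerDictionary.get("citations")
    # Assumption: an empty <citations> element may not parse to a list, so treat it as no citations.
    if not isinstance(citationItemList, list):
        citationItemList = []
    citationList = []
    for citationItem in citationItemList:
        citationDictionary = mergeSingleKeyDictionaryList(citationItem["citation"])
        # source_id arrives as a string; convert it so it can index the retrieved document list.
        citationList.append((int(citationDictionary["source_id"]), citationDictionary["quote"]))
    return answer, citationList


# Sample data in the shape printed by main.py (second quote shortened only by being the short one).
sampleAnswerDictionary = {
    "cited_answer" : [
        {"answer" : "Cheetahs are capable of running at speeds of 93 to 104 km/h (58 to 65 mph)."},
        {
            "citations" : [
                {"citation" : [{"source_id" : "0"}, {"quote" : "The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail."}]},
                {"citation" : [{"source_id" : "3"}, {"quote" : "The fastest land animal is the cheetah."}]}
            ]
        }
    ]
}

answer, citationList = getAnswerAndCitationList(sampleAnswerDictionary)
print(answer)
for sourceID, quote in citationList:
    print(sourceID, quote)
```

In main.py the same function could be called as `getAnswerAndCitationList(responseDictionary["answer"])`, and each returned source ID could then index `responseDictionary["context"]` to look up the cited document's title. Since the model is not guaranteed to follow the requested format, real output may need extra validation.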
▶ requirements.txt
aiohappyeyeballs==2.4.4
aiohttp==3.11.9
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.6.2.post1
attrs==24.2.0
beautifulsoup4==4.12.3
certifi==2024.8.30
charset-normalizer==3.4.0
colorama==0.4.6
dataclasses-json==0.6.7
defusedxml==0.7.1
distro==1.9.0
frozenlist==1.5.0
greenlet==3.1.1
h11==0.14.0
httpcore==1.0.7
httpx==0.28.0
httpx-sse==0.4.0
idna==3.10
jiter==0.8.0
jsonpatch==1.33
jsonpointer==3.0.0
langchain==0.3.9
langchain-community==0.3.9
langchain-core==0.3.21
langchain-openai==0.2.10
langchain-text-splitters==0.3.2
langsmith==0.1.147
marshmallow==3.23.1
multidict==6.1.0
mypy-extensions==1.0.0
numpy==2.1.3
openai==1.56.0
orjson==3.10.12
packaging==24.2
propcache==0.2.1
pydantic==2.10.2
pydantic-settings==2.6.1
pydantic_core==2.27.1
python-dotenv==1.0.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
requests-toolbelt==1.0.0
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.36
tenacity==9.0.0
tiktoken==0.8.0
tqdm==4.67.1
typing-inspect==0.9.0
typing_extensions==4.12.2
urllib3==2.2.3
wikipedia==1.4.0
yarl==1.18.3
※ The packages above were installed by running the command pip install python-dotenv langchain langchain-community langchain-openai wikipedia defusedxml.