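■ Shows how to obtain a vector store retriever by passing the search_type argument to the as_retriever method of the FAISS class.
※ By default, a vector store retriever uses similarity search.
※ If the underlying vector store supports maximal marginal relevance (MMR) search, that can be specified as the search type instead.
※ The value of the OPENAI_API_KEY environment variable is defined in the .env file.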
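※ To make the difference between similarity search and MMR search more concrete, the following is a minimal pure-Python sketch of greedy MMR selection. It is an illustration only, not the code LangChain or FAISS runs; the function names (cosine_similarity, mmr_select) and the toy 2-D vectors are assumptions made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Return the indices of k candidates picked by greedy MMR.

    lambda_mult = 1.0 ranks by relevance only (plain similarity search);
    smaller values penalize candidates that resemble earlier picks.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_index, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine_similarity(query, candidates[i])
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine_similarity(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
        remaining.remove(best_index)
    return selected


# Toy data (hypothetical): two near-duplicate vectors and one distinct vector.
query = [1.0, 0.0]
candidates = [
    [1.0, 0.1],    # index 0: closest to the query
    [1.0, 0.12],   # index 1: near-duplicate of index 0
    [0.8, -0.6],   # index 2: less relevant, but different from index 0
]

print(mmr_select(query, candidates, k=2, lambda_mult=1.0))  # [0, 1]
print(mmr_select(query, candidates, k=2, lambda_mult=0.5))  # [0, 2]
```

With lambda_mult set to 1.0 the two near-duplicates are returned, which is what a pure similarity ranking would do. With 0.5 the second pick switches to the distinct vector, because the duplicate is penalized for resembling the first pick.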
▶ main.py
from dotenv import load_dotenv
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Load OPENAI_API_KEY from the .env file into the environment.
load_dotenv()

# Load the source document.
textLoader = TextLoader("state_of_the_union.txt")

documentList = textLoader.load()

# Split the document into chunks of up to 1000 characters with no overlap.
characterTextSplitter = CharacterTextSplitter(chunk_size = 1000, chunk_overlap = 0)

splitDocumentList = characterTextSplitter.split_documents(documentList)

# Embed the chunks and build the FAISS vector store.
openAIEmbeddings = OpenAIEmbeddings()

faiss = FAISS.from_documents(splitDocumentList, openAIEmbeddings)

# Create a retriever that uses MMR search instead of the default similarity search.
vectorStoreRetriever = faiss.as_retriever(search_type = "mmr")

responseDocumentList = vectorStoreRetriever.invoke("what did the president say about ketanji brown jackson?")

print(len(responseDocumentList))
print(responseDocumentList[0].page_content)

"""
4
Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections.

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
"""
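※ The example above relies on the default MMR settings. As far as I know, as_retriever also accepts a search_kwargs dictionary, and for MMR search the keys k, fetch_k and lambda_mult are understood (with FAISS defaults of 4, 20 and 0.5 respectively, which matches the 4 documents printed above). The following sketch wraps that call in a small helper; build_mmr_retriever is a hypothetical name introduced for this example, not part of LangChain.

```python
def build_mmr_retriever(vector_store, k=4, fetch_k=20, lambda_mult=0.5):
    """Hypothetical helper: create an MMR retriever with explicit settings.

    vector_store is assumed to be any object exposing as_retriever(),
    such as the FAISS instance built in main.py.
    k           : number of documents to return
    fetch_k     : number of candidates fetched before MMR re-ranking
    lambda_mult : 1.0 favors relevance only, 0.0 favors maximum diversity
    """
    return vector_store.as_retriever(
        search_type="mmr",
        search_kwargs={"k": k, "fetch_k": fetch_k, "lambda_mult": lambda_mult},
    )
```

In main.py this could be used as vectorStoreRetriever = build_mmr_retriever(faiss, k = 2, lambda_mult = 0.25), which would return two documents and weight diversity more heavily than the default.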
▶ requirements.txt
aiohappyeyeballs==2.3.5
aiohttp==3.10.3
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
async-timeout==4.0.3
attrs==24.2.0
certifi==2024.7.4
charset-normalizer==3.3.2
dataclasses-json==0.6.7
distro==1.9.0
exceptiongroup==1.2.2
faiss-gpu==1.7.2
frozenlist==1.4.1
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
idna==3.7
jiter==0.5.0
jsonpatch==1.33
jsonpointer==3.0.0
langchain==0.2.12
langchain-community==0.2.11
langchain-core==0.2.29
langchain-openai==0.1.21
langchain-text-splitters==0.2.2
langsmith==0.1.98
marshmallow==3.21.3
multidict==6.0.5
mypy-extensions==1.0.0
numpy==1.26.4
openai==1.40.3
orjson==3.10.7
packaging==24.1
pydantic==2.8.2
pydantic_core==2.20.1
python-dotenv==1.0.1
PyYAML==6.0.2
regex==2024.7.24
requests==2.32.3
sniffio==1.3.1
SQLAlchemy==2.0.32
tenacity==8.5.0
tiktoken==0.7.0
tqdm==4.66.5
typing-inspect==0.9.0
typing_extensions==4.12.2
urllib3==2.2.2
yarl==1.9.4