■ This example shows how to build a summarization chain that passes the retrieved documents and a prompt to the Llama 2 model.
▶ main.py
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_community.llms import LlamaCpp
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Load the web document.
webBaseLoader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
documentList = webBaseLoader.load()

# Split the document into chunks.
recursiveCharacterTextSplitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 0)
splitDocumentList = recursiveCharacterTextSplitter.split_documents(documentList)

# Set up a vector store for embedding generation and similarity search.
chroma = Chroma.from_documents(documents = splitDocumentList, embedding = HuggingFaceEmbeddings())

# Run a similarity search against the vector store.
resultDocumentList = chroma.similarity_search("What are the approaches to Task Decomposition?")

# Configure the LLM.
gpuLayerCount = 1
batchSize = 512 # Must be between 1 and n_ctx.
llamaCpp = LlamaCpp(
    model_path = "./llama-2-13b-chat.Q4_0.gguf", # Path to the llama-2-13b-chat.Q4_0.gguf file.
    n_gpu_layers = gpuLayerCount,
    n_batch = batchSize,
    n_ctx = 2048,
    f16_kv = True, # Must be set to True; otherwise problems occur after a few calls.
    verbose = False
)

# Set up the prompt template.
promptTemplate = PromptTemplate.from_template("Summarize the main themes in these retrieved docs: {documentPageContent}")

# Define a function that merges the page contents of the documents.
def mergeDocumentPageContent(documentList):
    return "\n\n".join(doc.page_content for doc in documentList)

# Set up the runnable chain.
runnableSequence = {"documentPageContent" : mergeDocumentPageContent} | promptTemplate | llamaCpp | StrOutputParser()

# Invoke the chain on the retrieved documents.
resultString = runnableSequence.invoke(resultDocumentList)

print(resultString)

# Example output:
"""
(1) Task decomposition with LLMs: Simple prompting like "Steps for XYZ" or "Subgoals for achieving YZ" can be used to elicit task decomposition from LLMs. This approach is useful when we have a high-level understanding of the desired output but struggle to break it down into smaller parts.

(2) Task-specific instructions: Providing task-specific instructions, e.g., "Write a story outline" for writing a novel, can effectively guide the LLM towards decomposing the task into smaller components. This approach leverages domain knowledge and tailors the prompting to the specific task at hand.

(3) Human inputs: For tasks that are difficult to decompose or require a high degree of creativity, human inputs can be used to provide additional guidance. This could involve providing examples, outlining the desired outcome, or offering feedback on the LLM's performance.
"""
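One detail in the chain definition is worth spelling out: LCEL coerces a plain dict into a RunnableParallel and a plain Python function into a RunnableLambda, which is why {"documentPageContent" : mergeDocumentPageContent} can be piped straight into the prompt template. Below is a minimal sketch of the same chain with these coercions written out explicitly; it reuses the objects defined in main.py above and behaves identically.

from langchain_core.runnables import RunnableLambda, RunnableParallel

# Explicit equivalent of {"documentPageContent" : mergeDocumentPageContent}:
# the RunnableParallel hands the chain input (the retrieved document list)
# to each of its branches, so mergeDocumentPageContent receives the list and
# the joined text is bound to the documentPageContent prompt variable.
explicitSequence = (
    RunnableParallel(documentPageContent = RunnableLambda(mergeDocumentPageContent))
    | promptTemplate
    | llamaCpp
    | StrOutputParser()
)

resultString = explicitSequence.invoke(resultDocumentList)
print(resultString)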
▶ requirements.txt
aiohttp==3.9.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
asgiref==3.8.1
async-timeout==4.0.3
attrs==23.2.0
backoff==2.2.1
bcrypt==4.1.3
beautifulsoup4==4.12.3
bs4==0.0.2
build==1.2.1
cachetools==5.3.3
certifi==2024.6.2
charset-normalizer==3.3.2
chroma-hnswlib==0.7.3
chromadb==0.5.0
click==8.1.7
coloredlogs==15.0.1
dataclasses-json==0.6.7
Deprecated==1.2.14
diskcache==5.6.3
dnspython==2.6.1
email_validator==2.1.1
exceptiongroup==1.2.1
fastapi==0.111.0
fastapi-cli==0.0.4
filelock==3.15.1
flatbuffers==24.3.25
frozenlist==1.4.1
fsspec==2024.6.0
google-auth==2.30.0
googleapis-common-protos==1.63.1
greenlet==3.0.3
grpcio==1.64.1
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.23.3
humanfriendly==10.0
idna==3.7
importlib_metadata==7.1.0
importlib_resources==6.4.0
Jinja2==3.1.4
joblib==1.4.2
jsonpatch==1.33
jsonpointer==3.0.0
kubernetes==30.1.0
langchain==0.2.4
langchain-chroma==0.1.1
langchain-community==0.2.4
langchain-core==0.2.6
langchain-huggingface==0.0.3
langchain-text-splitters==0.2.1
langsmith==0.1.77
llama_cpp_python==0.2.78
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
mdurl==0.1.2
mmh3==4.1.0
monotonic==1.6
mpmath==1.3.0
multidict==6.0.5
mypy-extensions==1.0.0
networkx==3.3
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
oauthlib==3.2.2
onnxruntime==1.18.0
opentelemetry-api==1.25.0
opentelemetry-exporter-otlp-proto-common==1.25.0
opentelemetry-exporter-otlp-proto-grpc==1.25.0
opentelemetry-instrumentation==0.46b0
opentelemetry-instrumentation-asgi==0.46b0
opentelemetry-instrumentation-fastapi==0.46b0
opentelemetry-proto==1.25.0
opentelemetry-sdk==1.25.0
opentelemetry-semantic-conventions==0.46b0
opentelemetry-util-http==0.46b0
orjson==3.10.5
overrides==7.7.0
packaging==24.1
pillow==10.3.0
posthog==3.5.0
protobuf==4.25.3
pyasn1==0.6.0
pyasn1_modules==0.4.0
pydantic==2.7.4
pydantic_core==2.18.4
Pygments==2.18.0
PyPika==0.48.9
pyproject_hooks==1.1.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-multipart==0.0.9
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.3
requests-oauthlib==2.0.0
rich==13.7.1
rsa==4.9
safetensors==0.4.3
scikit-learn==1.5.0
scipy==1.13.1
sentence-transformers==3.0.1
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
soupsieve==2.5
SQLAlchemy==2.0.30
starlette==0.37.2
sympy==1.12.1
tenacity==8.3.0
threadpoolctl==3.5.0
tokenizers==0.19.1
tomli==2.0.1
torch==2.3.1
tqdm==4.66.4
transformers==4.41.2
triton==2.3.1
typer==0.12.3
typing-inspect==0.9.0
typing_extensions==4.12.2
ujson==5.10.0
urllib3==2.2.1
uvicorn==0.30.1
uvloop==0.19.0
watchfiles==0.22.0
websocket-client==1.8.0
websockets==12.0
wrapt==1.16.0
yarl==1.9.4
zipp==3.19.2
※ The dependencies were installed by running pip install langchain langchain_chroma langchain-community langchain-huggingface bs4 sentence-transformers llama-cpp-python.
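※ Alternatively, the exact pinned versions listed above can be restored in one step with pip install -r requirements.txt; the list itself was presumably captured with pip freeze after the command above succeeded.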