■ This example shows how to set up a data connector that fetches the subtitles of a YouTube video by using the load_data method of the YoutubeTranscriptReader class.
▶ main.py
import os

from llama_index.core import download_loader, GPTVectorStoreIndex

# Replace the placeholder with your own OpenAI API key.
os.environ["OPENAI_API_KEY"] = "<OPENAI_API_KEY>"

# Fetch the YoutubeTranscriptReader loader class and instantiate it.
YoutubeTranscriptReader = download_loader("YoutubeTranscriptReader")
youtubeTranscriptReader = YoutubeTranscriptReader()

# Load the video's transcript as a list of documents.
documentList = youtubeTranscriptReader.load_data(
    ytlinks=["https://www.youtube.com/watch?v=oc6RV5c1yd0"]
)

# Build a vector index over the transcript and create a query engine from it.
vectorStoreIndex = GPTVectorStoreIndex.from_documents(documentList)
retrieverQueryEngine = vectorStoreIndex.as_query_engine()

# The query is in Korean: "What message does this video want to convey? Please answer in Korean."
response = retrieverQueryEngine.query("이 동영상에서 전하고 싶은 말은 무엇인가요? 한국어로 대답해 주세요.")
print(response)
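The execution log for main.py shows that the reader lives in the llama-index-readers-youtube-transcript package, so it can also be imported directly instead of going through download_loader. The sketch below assumes that package is installed. The extract_video_id and load_transcripts helpers are hypothetical names introduced here for illustration (they are not part of llama-index); they only recognize two common link shapes, which may be stricter than what the reader itself accepts.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(link: str) -> Optional[str]:
    """Return the video ID of a watch-style or youtu.be link, else None.

    Hypothetical helper for illustration; not part of llama-index.
    """
    parsed = urlparse(link)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"} and parsed.path == "/watch":
        # Watch links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def load_transcripts(links: List[str]):
    """Reject unrecognized links, then load the transcripts as documents."""
    invalid = [link for link in links if extract_video_id(link) is None]
    if invalid:
        raise ValueError(f"Not a recognized YouTube video link: {invalid}")
    # Imported lazily so the validation above works even without llama-index.
    from llama_index.readers.youtube_transcript import YoutubeTranscriptReader

    return YoutubeTranscriptReader().load_data(ytlinks=links)


if __name__ == "__main__":
    # Only the offline check runs here; load_transcripts needs network access.
    print(extract_video_id("https://www.youtube.com/watch?v=oc6RV5c1yd0"))
```

Checking the links first makes a mistyped URL fail with a clear message before any network request is made. The list returned by load_transcripts can be passed to from_documents in the same way as documentList.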
▶ Execution result
Requirement already satisfied: llama-index-readers-youtube-transcript in ./env/lib/python3.10/site-packages (0.1.4)
Requirement already satisfied: youtube-transcript-api>=0.5.0 in ./env/lib/python3.10/site-packages (from llama-index-readers-youtube-transcript) (0.6.2)
Requirement already satisfied: llama-index-core<0.11.0,>=0.10.1 in ./env/lib/python3.10/site-packages (from llama-index-readers-youtube-transcript) (0.10.43)
Requirement already satisfied: numpy in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.26.4)
Requirement already satisfied: nltk<4.0.0,>=3.8.1 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.8.1)
Requirement already satisfied: tenacity<9.0.0,>=8.2.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (8.3.0)
Requirement already satisfied: typing-inspect>=0.8.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.9.0)
Requirement already satisfied: requests>=2.31.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.32.3)
Requirement already satisfied: pillow>=9.0.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (10.3.0)
Requirement already satisfied: pandas in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.2.2)
Requirement already satisfied: fsspec>=2023.5.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2024.6.0)
Requirement already satisfied: tqdm<5.0.0,>=4.66.1 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (4.66.4)
Requirement already satisfied: PyYAML>=6.0.1 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (6.0.1)
Requirement already satisfied: aiohttp<4.0.0,>=3.8.6 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.9.5)
Requirement already satisfied: wrapt in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.16.0)
Requirement already satisfied: llamaindex-py-client<0.2.0,>=0.1.18 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.1.19)
Requirement already satisfied: typing-extensions>=4.5.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (4.12.2)
Requirement already satisfied: deprecated>=1.2.9.3 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.2.14)
Requirement already satisfied: httpx in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.27.0)
Requirement already satisfied: networkx>=3.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.3)
Requirement already satisfied: tiktoken>=0.3.3 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.7.0)
Requirement already satisfied: SQLAlchemy[asyncio]>=1.4.49 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.0.30)
Requirement already satisfied: dataclasses-json in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.6.6)
Requirement already satisfied: nest-asyncio<2.0.0,>=1.5.8 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.6.0)
Requirement already satisfied: dirtyjson<2.0.0,>=1.0.8 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.0.8)
Requirement already satisfied: openai>=1.1.0 in ./env/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.33.0)
Requirement already satisfied: multidict<7.0,>=4.5 in ./env/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (6.0.5)
Requirement already satisfied: aiosignal>=1.1.2 in ./env/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.3.1)
Requirement already satisfied: frozenlist>=1.1.1 in ./env/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.4.1)
Requirement already satisfied: yarl<2.0,>=1.0 in ./env/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in ./env/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (4.0.3)
Requirement already satisfied: attrs>=17.3.0 in ./env/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (23.2.0)
Requirement already satisfied: pydantic>=1.10 in ./env/lib/python3.10/site-packages (from llamaindex-py-client<0.2.0,>=0.1.18->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.7.3)
Requirement already satisfied: sniffio in ./env/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.3.1)
Requirement already satisfied: idna in ./env/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.7)
Requirement already satisfied: certifi in ./env/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2024.6.2)
Requirement already satisfied: anyio in ./env/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (4.4.0)
Requirement already satisfied: httpcore==1.* in ./env/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.0.5)
Requirement already satisfied: h11<0.15,>=0.13 in ./env/lib/python3.10/site-packages (from httpcore==1.*->httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.14.0)
Requirement already satisfied: regex>=2021.8.3 in ./env/lib/python3.10/site-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2024.5.15)
Requirement already satisfied: click in ./env/lib/python3.10/site-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (8.1.7)
Requirement already satisfied: joblib in ./env/lib/python3.10/site-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.4.2)
Requirement already satisfied: distro<2,>=1.7.0 in ./env/lib/python3.10/site-packages (from openai>=1.1.0->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.9.0)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./env/lib/python3.10/site-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.2.1)
Requirement already satisfied: charset-normalizer<4,>=2 in ./env/lib/python3.10/site-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.3.2)
Requirement already satisfied: greenlet!=0.4.17 in ./env/lib/python3.10/site-packages (from SQLAlchemy[asyncio]>=1.4.49->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.0.3)
Requirement already satisfied: mypy-extensions>=0.3.0 in ./env/lib/python3.10/site-packages (from typing-inspect>=0.8.0->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.0.0)
Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in ./env/lib/python3.10/site-packages (from dataclasses-json->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (3.21.3)
Requirement already satisfied: python-dateutil>=2.8.2 in ./env/lib/python3.10/site-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in ./env/lib/python3.10/site-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2024.1)
Requirement already satisfied: tzdata>=2022.7 in ./env/lib/python3.10/site-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2024.1)
Requirement already satisfied: exceptiongroup>=1.0.2 in ./env/lib/python3.10/site-packages (from anyio->httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.2.1)
Requirement already satisfied: packaging>=17.0 in ./env/lib/python3.10/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (23.2)
Requirement already satisfied: annotated-types>=0.4.0 in ./env/lib/python3.10/site-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.18->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (0.7.0)
Requirement already satisfied: pydantic-core==2.18.4 in ./env/lib/python3.10/site-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.18->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (2.18.4)
Requirement already satisfied: six>=1.5 in ./env/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-readers-youtube-transcript) (1.16.0)
이 동영상에서 전하고 싶은 메시지는 GPT-4가 OpenAI에서 개발된 최신 AI 시스템이며, 문제 해결 능력에서의 엄청난 발전을 이루었다는 것입니다. 또한 GPT-4는 다양한 작업을 수행할 수 있으며, 사용자의 요구에 더 안전하고 맞춰진 방식으로 개발되었습니다.

(The answer is in Korean because the query requested it. In English: "The message this video wants to convey is that GPT-4 is the latest AI system developed by OpenAI and that it has made tremendous progress in problem-solving ability. In addition, GPT-4 can perform a variety of tasks and was developed to be safer and better aligned with users' needs.")
▶ requirements.txt
aiohttp==3.9.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
async-timeout==4.0.3
attrs==23.2.0
beautifulsoup4==4.12.3
certifi==2024.6.2
charset-normalizer==3.3.2
click==8.1.7
dataclasses-json==0.6.6
Deprecated==1.2.14
dirtyjson==1.0.8
distro==1.9.0
exceptiongroup==1.2.1
frozenlist==1.4.1
fsspec==2024.6.0
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
idna==3.7
joblib==1.4.2
jsonpatch==1.33
jsonpointer==2.4
langchain==0.2.3
langchain-core==0.2.5
langchain-text-splitters==0.2.1
langsmith==0.1.75
llama-index==0.10.43
llama-index-agent-openai==0.2.7
llama-index-cli==0.1.12
llama-index-core==0.10.43
llama-index-embeddings-openai==0.1.10
llama-index-indices-managed-llama-cloud==0.1.6
llama-index-legacy==0.9.48
llama-index-llms-openai==0.1.22
llama-index-multi-modal-llms-openai==0.1.6
llama-index-program-openai==0.1.6
llama-index-question-gen-openai==0.1.3
llama-index-readers-file==0.1.23
llama-index-readers-llama-parse==0.1.4
llama-index-readers-youtube-metadata==0.1.0
llama-index-readers-youtube-transcript==0.1.4
llama-parse==0.4.4
llamaindex-py-client==0.1.19
marshmallow==3.21.3
multidict==6.0.5
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.3
nltk==3.8.1
numpy==1.26.4
openai==1.33.0
orjson==3.10.3
packaging==23.2
pandas==2.2.2
pillow==10.3.0
pydantic==2.7.3
pydantic_core==2.18.4
pypdf==4.2.0
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.3
six==1.16.0
sniffio==1.3.1
soupsieve==2.5
SQLAlchemy==2.0.30
striprtf==0.0.26
tenacity==8.3.0
tiktoken==0.7.0
tqdm==4.66.4
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.1
wrapt==1.16.0
yarl==1.9.4
youtube-transcript-api==0.6.2
※ The command pip install openai langchain llama-index llama_index.readers.youtube_metadata was run.
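The last package name in that command is written with underscores and dots, whereas requirements.txt lists it as llama-index-readers-youtube-metadata. Both spellings refer to the same project, because pip compares names after normalizing them as described in PEP 503. The following is a minimal sketch of that rule. The function name normalize_name is chosen here for illustration and is not a pip API.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a project name following the PEP 503 rule.

    Runs of '-', '_' and '.' collapse into a single '-', and the
    result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    # The spelling used in the pip install command.
    print(normalize_name("llama_index.readers.youtube_metadata"))
    # Spellings that appear in requirements.txt.
    print(normalize_name("pydantic_core"))
    print(normalize_name("PyYAML"))
```

The same rule explains why the execution log shows pydantic-core and typing-extensions while requirements.txt lists pydantic_core and typing_extensions.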