■ Shows how to create a LlamaTokenizerFast object using the from_pretrained class method of the AutoTokenizer class. (hf-internal-testing/llama-tokenizer)
▶ main.py
from transformers import AutoTokenizer

llamaTokenizerFast = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", legacy=False)
▶ requirements.txt
certifi==2024.12.14
charset-normalizer==3.4.1
filelock==3.16.1
fsspec==2024.12.0
huggingface-hub==0.27.1
idna==3.10
Jinja2==3.1.5
MarkupSafe==3.0.2
mpmath==1.3.0
networkx==3.4.2
numpy==2.2.1
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
packaging==24.2
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.2
sympy==1.13.1
tokenizers==0.21.0
torch==2.5.1
tqdm==4.67.1
transformers==4.48.0
triton==3.1.0
typing_extensions==4.12.2
urllib3==2.3.0
※ The packages above were installed by running the pip install transformers torch command.
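As a rough sketch of what can be done with the tokenizer created above, the snippet below encodes a sample sentence, inspects the resulting tokens, and decodes the ids back to text. The sample text and variable names are illustrative additions, not part of the original listing.

```python
from transformers import AutoTokenizer

# Create the fast Llama tokenizer; legacy=False selects the
# newer (fixed) tokenization behavior.
llamaTokenizerFast = AutoTokenizer.from_pretrained(
    "hf-internal-testing/llama-tokenizer", legacy=False
)

text = "Hello, world!"                               # illustrative sample input
ids = llamaTokenizerFast.encode(text)                # token ids (BOS is prepended)
tokens = llamaTokenizerFast.convert_ids_to_tokens(ids)
decoded = llamaTokenizerFast.decode(ids, skip_special_tokens=True)

print(ids)
print(tokens)
print(decoded)
```

Because this is a fast (Rust-backed) tokenizer, llamaTokenizerFast.is_fast is True, and calling the tokenizer directly (llamaTokenizerFast(text)) returns a BatchEncoding with input_ids and attention_mask as well.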