■ Shows how to use the convert_lists argument of the RecursiveJsonSplitter class's split_json method so that lists are also treated as split targets.
▶ main.py
import requests

from langchain_text_splitters import RecursiveJsonSplitter

# Download the LangSmith OpenAPI document and parse it into a dictionary.
response = requests.get("https://api.smith.langchain.com/openapi.json")

jsonDictionary = response.json()

recursiveJsonSplitter = RecursiveJsonSplitter(max_chunk_size = 300)

# convert_lists = True turns lists into index-keyed dictionaries so they can be split too.
jsonDictionaryList = recursiveJsonSplitter.split_json(json_data = jsonDictionary, convert_lists = True)

# Print the first three chunks.
for splitDictionary in jsonDictionaryList[:3]:
    print(splitDictionary)
    print()

"""
{'openapi': '3.1.0', 'info': {'title': 'LangSmith', 'version': '0.1.0'}, 'paths': {'/api/v1/sessions/{session_id}': {'get': {'tags': {'0': 'tracer-sessions'}, 'summary': 'Read Tracer Session', 'description': 'Get a specific session.'}}}}

{'paths': {'/api/v1/sessions/{session_id}': {'get': {'operationId': 'read_tracer_session_api_v1_sessions__session_id__get', 'security': {'0': {'API Key': {}}, '1': {'Tenant ID': {}}, '2': {'Bearer Auth': {}}}}}}}

{'paths': {'/api/v1/sessions/{session_id}': {'get': {'parameters': {'0': {'name': 'session_id', 'in': 'path', 'required': True, 'schema': {'type': 'string', 'format': 'uuid', 'title': 'Session Id'}}}}}}}
"""
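In the output above, values that are lists in the OpenAPI document (for example tags and security) appear as dictionaries keyed by '0', '1', '2'. The following is a minimal sketch of that kind of conversion, written without langchain so it can be run on its own. The helper name lists_to_index_dicts and the sample data are hypothetical, chosen only for illustration; this is not the library's own code, just an approximation consistent with the printed chunks.

```python
import json
from typing import Any


def lists_to_index_dicts(data: Any) -> Any:
    """Recursively replace each list with a dict keyed by "0", "1", ...

    Hypothetical helper for illustration; not part of langchain_text_splitters.
    """
    if isinstance(data, dict):
        return {key: lists_to_index_dicts(value) for key, value in data.items()}
    if isinstance(data, list):
        return {str(index): lists_to_index_dicts(item) for index, item in enumerate(data)}
    return data


# Hypothetical fragment shaped like part of the OpenAPI document used in main.py.
sample = {
    "get": {
        "tags": ["tracer-sessions"],
        "security": [{"API Key": []}, {"Tenant ID": []}, {"Bearer Auth": []}],
    }
}

# The original structure is left untouched; a converted copy is returned.
converted = lists_to_index_dicts(sample)
print(json.dumps(converted, indent=2))
```

Once every list has become a dictionary, a splitter that only recurses into dictionaries can break up long arrays element by element, which would explain why the security entries above carry index keys.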
▶ requirements.txt
annotated-types==0.7.0
certifi==2024.6.2
charset-normalizer==3.3.2
idna==3.7
jsonpatch==1.33
jsonpointer==3.0.0
langchain-core==0.2.10
langchain-text-splitters==0.2.2
langsmith==0.1.82
orjson==3.10.5
packaging==24.1
pydantic==2.7.4
pydantic_core==2.18.4
PyYAML==6.0.1
requests==2.32.3
tenacity==8.4.2
typing_extensions==4.12.2
urllib3==2.2.2
※ The packages above were installed by running the command pip install langchain-text-splitters.