[PYTHON/LANGCHAIN] ChatOpenAI class: Using a multimodal model (image recognition)

icodebroker LANGCHAIN 2025-01-10

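■ Shows how to use a multimodal model with the ChatOpenAI class by sending a local image to the model as a Base64 data URL and asking for a description. (image recognition)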

※ The OPENAI_API_KEY environment variable is defined in a .env file.
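
※ A .env file holds one KEY=value pair per line, so the entry here would look like OPENAI_API_KEY=(your key). The check below is a minimal sketch, not part of the original example: require_api_key is a hypothetical helper that fails early with a clear message when the variable is missing, instead of waiting for the first API call to fail.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key found in env, or raise a clear error if it is missing.

    Hypothetical helper for illustration; env defaults to the live process
    environment, so values loaded by load_dotenv() are visible to it.
    """
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; define it in .env or in the shell environment")
    return value
```

Calling require_api_key() right after load_dotenv() in main.py would surface a missing key before any request is sent.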

▶ main.py


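import base64

from dotenv                  import load_dotenv
from langchain_core.messages import HumanMessage
from langchain_openai        import ChatOpenAI

# Load OPENAI_API_KEY from the .env file into the process environment.
load_dotenv()

def getBASE64StringFromFile(filePath):
    """Read a file in binary mode and return its contents as a Base64 string."""
    with open(filePath, "rb") as bufferedReader:
        imageBytes   = bufferedReader.read()
        base64Bytes  = base64.b64encode(imageBytes)
        base64String = base64Bytes.decode("utf-8")
        return base64String

# gpt-4o accepts image input together with text.
chatOpenAI = ChatOpenAI(
    model_name  = "gpt-4o",
    max_tokens  = 2048,
    temperature = 0.1
)

imageBASE64String = getBASE64StringFromFile("GrandTetons.jpg")

# A multimodal message is a list of content parts: one text part and one image part.
# The image is embedded inline as a data URL, so no public image URL is needed.
humanMessage = HumanMessage(
    content = [
        {
            "type" : "text",
            "text" : "Please describe the image."
        },
        {
            "type"      : "image_url",
            "image_url" : {"url" : f"data:image/jpeg;base64,{imageBASE64String}"}
        }
    ]
)

responseAIMessage = chatOpenAI.invoke([humanMessage])

print(responseAIMessage.content)

"""
The image shows a majestic mountain range. Clouds float in the sky, and a dense forest spreads out below the mountains. The peaks are sharp, and some appear to be covered with snow. The scene conveys the beauty and grandeur of nature well.
"""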

▶ requirements.txt


aiohappyeyeballs==2.4.4
aiohttp==3.11.11
aiosignal==1.3.2
annotated-types==0.7.0
anyio==4.8.0
async-timeout==4.0.3
attrs==24.3.0
certifi==2024.12.14
charset-normalizer==3.4.1
distro==1.9.0
exceptiongroup==1.2.2
frozenlist==1.5.0
greenlet==3.1.1
h11==0.14.0
httpcore==1.0.7
httpx==0.28.1
idna==3.10
jiter==0.8.2
jsonpatch==1.33
jsonpointer==3.0.0
langchain==0.3.14
langchain-core==0.3.29
langchain-openai==0.2.14
langchain-text-splitters==0.3.5
langsmith==0.2.10
multidict==6.1.0
numpy==1.26.4
openai==1.59.4
orjson==3.10.13
packaging==24.2
propcache==0.2.1
pydantic==2.10.4
pydantic_core==2.27.2
python-dotenv==1.0.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
requests-toolbelt==1.0.0
sniffio==1.3.1
SQLAlchemy==2.0.36
tenacity==9.0.0
tiktoken==0.8.0
tqdm==4.67.1
typing_extensions==4.12.2
urllib3==2.3.0
yarl==1.18.3

※ The packages above were installed by running pip install python-dotenv langchain langchain-openai.

GrandTetons.jpg
