Now, as long as you have some credit, anyone can call OpenAI's multimodal models such as GPT-4o and GPT-4 Turbo. I summarized some OpenAI API usage more than a year ago, and I've noticed things have been updated a bit since then. My main reference is here: https://platform.openai.com/docs/guides/vision
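For an image that is already hosted somewhere on the web, the vision guide shows you can simply pass its URL in the message instead of uploading the bytes. A minimal sketch along those lines (the URL below is only a placeholder, and you still plug in your own API key):

from openai import OpenAI

client = OpenAI(api_key="Your_API_Key")

# A publicly reachable image can be referenced by URL directly,
# with no base64 encoding needed (the URL here is just a placeholder).
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/some_image.png"}},
            ],
        }
    ],
)
print(completion.choices[0].message.content)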
It's actually quite simple: for a local image, you just need to base64-encode it first and then send it along. Here's an example that should make it clear at a glance (the images are in the Processed folder):
from openai import OpenAI
import os
import base64

client = OpenAI(api_key="Your_API_Key")

# Encode a local image file as a base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Send every PNG in the Processed folder to GPT-4o and print its reply
fig_path = 'Processed'
for filename in os.listdir(fig_path):
    if filename.endswith('.png'):
        image_path = os.path.join(fig_path, filename)
        print(image_path)
        base64_image = encode_image(image_path)
        messages = [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's in this image?"},
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}},
                ],
            }
        ]
        completion = client.chat.completions.create(model="gpt-4o", messages=messages)
        answer = completion.choices[0].message.content
        print(f'ChatGPT: {answer}')
Of course, do keep an eye on the cost when you use this; it still feels a bit pricey right now.
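If cost is a concern, the same vision guide documents a detail field inside image_url: setting it to "low" makes the model work from a low-resolution version of the image and uses far fewer tokens, at the price of coarser answers. A small sketch, reusing base64_image from the loop above:

# Same request as before, but asking for low-fidelity image understanding
# to cut token usage (and therefore cost).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/png;base64,{base64_image}",
                    "detail": "low",  # options: "low", "high", "auto"
                },
            },
        ],
    }
]
completion = client.chat.completions.create(model="gpt-4o", messages=messages)
print(completion.choices[0].message.content)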