Image to text using GPT4Vision

If you need to extract text from image or answer questions about an image at scale, this blog is for you. Image to text is extremely easy with BulkGPT. We leverage GPT4Vision – the flagship AI tool from OpenAI as our engine for image to text.

Step 1 – Open up BulkGPT workflow

Visit BulkGPT and open up your workspace. Under workspace, go to Workflow section.

Step 2 – Enter a list of image URL or upload a csv with a list image URLs

Now simply enter a list of image url that you want to analyze. You can either input them manually or upload a list of image urls in a CSV. For example, you can find lots of free images to try out here.

Step 3 – Add a GPT4Vision task

Then click on the “+” icon, and you’ll see a list of tasks that you can add. Simply click on GPT4 Vision

Step 4 – Reference your image to text input image and add your prompt

Reference your input image and add your prompts

Step 5 – Click run and wait for your image to text results

The result are in a table format and you can now export them!

Step 6 – Iterate and improve your results

The best way to use BulkGPT is to just try out a few tasks. Iterate on your prompt. When you are happy with the result then run all your tasks at once.

Prompt engineering will usually help you significantly improve your result. Check out this guide on prompt engineering.

If you have a complex workflow, such as first scraping content from a website then let ChatGPT generate text based on scraped content, then you need to checkout BulkGPT Workflow. This feature is designed for you to build your own customized workflow then run them in batch.

Summary

That’s it, now you should be able to mass automate your image to text batch request! Try out your first automation at https://bulkgpt.ai! For more complex workflows, please checkout our future tutorials on BulkGPT Workflow feature!