Python Example code for Image Captioning


Here is a concise example of Python code for image captioning using a pre-trained model:

```python
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

# Load the pre-trained model, feature extractor, and tokenizer.
# ViTImageProcessor replaces ViTFeatureExtractor, which is deprecated in recent Transformers releases.
model = VisionEncoderDecoderModel.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
feature_extractor = ViTImageProcessor.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
tokenizer = AutoTokenizer.from_pretrained("nlpconnect/vit-gpt2-image-captioning")

# Load the image, convert it to RGB (grayscale or RGBA files would otherwise fail),
# and preprocess it into a tensor of pixel values
image = Image.open("image.jpg").convert("RGB")
pixel_values = feature_extractor(images=image, return_tensors="pt").pixel_values

# Generate the caption token IDs with beam search and keep the first (only) sequence
output_ids = model.generate(pixel_values, max_length=50, num_beams=4, early_stopping=True)[0]
# Decode the token IDs into text, dropping special tokens and stray whitespace
caption = tokenizer.decode(output_ids, skip_special_tokens=True).strip()

print(f"The caption for the image is: {caption}")
```

This code loads the pre-trained "nlpconnect/vit-gpt2-image-captioning" model (a ViT image encoder paired with a GPT-2 text decoder) from the Hugging Face Hub through the Transformers library and uses it to generate a caption for an image. The key steps are:

1. Load the pre-trained model, feature extractor, and tokenizer.
2. Load and preprocess the image using the feature extractor.
3. Generate the caption token IDs using the model's `generate()` method, which takes the preprocessed pixel values as input and, in this example, runs beam search (`num_beams=4`) up to a maximum length of 50 tokens.
4. Decode the output token IDs into a readable caption string.
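
The four steps above can be wrapped in a small helper so that the model is loaded once and reused across several images. The following is a minimal sketch rather than part of the original example: the function names `load_captioner` and `caption_images` are hypothetical, and it assumes a Transformers release that provides `ViTImageProcessor` (the successor to `ViTFeatureExtractor`).

```python
from typing import List


def load_captioner(checkpoint: str = "nlpconnect/vit-gpt2-image-captioning"):
    """Load the model, image processor, and tokenizer once (downloads on first use)."""
    # Imported lazily so that caption_images() can be used without loading these packages up front.
    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    processor = ViTImageProcessor.from_pretrained(checkpoint)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    return model, processor, tokenizer


def caption_images(images, model, processor, tokenizer,
                   max_length: int = 50, num_beams: int = 4) -> List[str]:
    """Caption a list of PIL images in one batch, returning one string per image."""
    # Step 2: normalize every image to RGB and preprocess the whole batch at once.
    rgb_images = [image.convert("RGB") for image in images]
    if not rgb_images:
        return []
    pixel_values = processor(images=rgb_images, return_tensors="pt").pixel_values

    # Step 3: generate token IDs for all images with beam search.
    output_ids = model.generate(
        pixel_values, max_length=max_length, num_beams=num_beams, early_stopping=True
    )

    # Step 4: decode each sequence and trim surrounding whitespace.
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return [caption.strip() for caption in captions]
```

With Pillow installed, a call such as `caption_images([Image.open("image.jpg")], *load_captioner())` would return a one-element list containing the caption. Passing the loaded objects in as arguments keeps the expensive `from_pretrained()` calls out of the per-image path.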

The generated caption is then printed to the console. This is a simple example; the Hugging Face Transformers library offers further options, such as alternative decoding settings for `generate()` and support for fine-tuning the image captioning model on your own data. [1][4]
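
As one illustration of such customization, decoding behavior can be changed through the keyword arguments passed to `generate()`. The sketch below is an assumption-laden starting point, not a recommendation: the preset names and the helper `generation_kwargs` are hypothetical, while `num_beams`, `early_stopping`, `do_sample`, `top_k`, `top_p`, and `max_length` are standard `generate()` parameters. The specific values are illustrative and would need tuning for a given dataset.

```python
# Hypothetical presets; the keys inside each dict are standard generate() arguments.
GENERATION_PRESETS = {
    # Pick the single most likely token at every step.
    "greedy": {"num_beams": 1, "do_sample": False},
    # Track several candidate captions in parallel, as in the example above.
    "beam": {"num_beams": 4, "early_stopping": True},
    # Sample from the top candidates for more varied captions.
    "sampling": {"num_beams": 1, "do_sample": True, "top_k": 50, "top_p": 0.95},
}


def generation_kwargs(preset: str = "beam", max_length: int = 50) -> dict:
    """Return keyword arguments for generate() for the named preset."""
    if preset not in GENERATION_PRESETS:
        raise ValueError(f"Unknown preset {preset!r}; choose from {sorted(GENERATION_PRESETS)}")
    return {"max_length": max_length, **GENERATION_PRESETS[preset]}
```

With the `model` and `pixel_values` from the example, this could be used as `output_ids = model.generate(pixel_values, **generation_kwargs("sampling"))[0]`. Because sampling is random, repeated calls with that preset may produce different captions for the same image.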

Citations:
[1] https://www.geeksforgeeks.org/image-captioning-using-python/
[2] https://github.com/topics/image-caption-generator
[3] https://github.com/cobanov/image-captioning
[4] https://www.youtube.com/watch?v=okSrioyYnHw
[5] https://data-flair.training/blogs/python-based-project-image-caption-generator-cnn/