Python Example Code for Image Captioning


Here is a concise example of Python code for image captioning using a pre-trained model:

```python
from PIL import Image
from transformers import AutoTokenizer, VisionEncoderDecoderModel, ViTImageProcessor

MODEL_NAME = "nlpconnect/vit-gpt2-image-captioning"

# Load the pre-trained model, feature extractor, and tokenizer.
# ViTImageProcessor replaces the deprecated ViTFeatureExtractor.
model = VisionEncoderDecoderModel.from_pretrained(MODEL_NAME)
feature_extractor = ViTImageProcessor.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Load the image and convert it to RGB, since the model expects three channels
image = Image.open("image.jpg").convert("RGB")
pixel_values = feature_extractor(images=image, return_tensors="pt").pixel_values

# Generate the caption with beam search and keep the first (only) sequence
output_ids = model.generate(pixel_values, max_length=50, num_beams=4, early_stopping=True)[0]
caption = tokenizer.decode(output_ids, skip_special_tokens=True).strip()

print(f"The caption for the image is: {caption}")
```

This code loads the pre-trained "nlpconnect/vit-gpt2-image-captioning" model through the Hugging Face Transformers library and uses it to generate a caption for an image. The model pairs a ViT image encoder with a GPT-2 text decoder. The key steps are:

1. Load the pre-trained model, feature extractor, and tokenizer.
2. Load and preprocess the image using the feature extractor.
3. Generate the caption using the model's `generate()` method, which takes the preprocessed pixel values as input and returns the token IDs of the predicted caption.
4. Decode the output token IDs into a readable caption string.
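
Step 3 passes `num_beams=4`, which asks `generate()` to use beam search instead of picking the single most likely token at every step. The toy sketch below illustrates the idea in plain Python. The probability table `NEXT_TOKEN_PROBS` and the function `beam_search` are hypothetical names invented for this illustration. This is a simplification, not the library's implementation: a real model conditions on the image and the full prefix, and the library also handles details such as length penalties.

```python
import math

# Hypothetical next-token probabilities, keyed by the previous token only.
# This table is a stand-in so the search can run without downloading a model.
NEXT_TOKEN_PROBS = {
    "<bos>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.4, "<eos>": 0.1},
    "the": {"dog": 0.9, "<eos>": 0.1},
    "dog": {"<eos>": 1.0},
    "cat": {"<eos>": 1.0},
}


def beam_search(num_beams, max_length):
    """Return the best token sequence found while tracking `num_beams` candidates."""
    beams = [(0.0, ["<bos>"])]  # (cumulative log-probability, tokens so far)
    finished = []
    for _ in range(max_length):
        # Extend every live beam by every possible next token
        candidates = []
        for score, tokens in beams:
            for token, prob in NEXT_TOKEN_PROBS[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the `num_beams` highest-scoring extensions
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:num_beams]:
            if tokens[-1] == "<eos>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    best = max(finished + beams, key=lambda c: c[0])
    # Drop the special tokens, as skip_special_tokens=True does when decoding
    return [t for t in best[1] if t not in ("<bos>", "<eos>")]


print(beam_search(num_beams=1, max_length=10))
print(beam_search(num_beams=2, max_length=10))
```

With this table, one beam commits to the locally best first token ("a") and ends with a sequence of probability 0.30, while two beams keep "the" alive and find a sequence of probability 0.36. Wider beams cost more computation per step, which is the trade-off behind the `num_beams` setting.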

The generated caption is then printed to the console. This is a simple example; the Transformers library also exposes further generation options, such as sampling and limits on repeated n-grams, and supports fine-tuning the captioning model on your own data. [1][4]
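
As a rough sketch of how generation settings might be varied, the helper below builds keyword arguments for `generate()` under three decoding strategies. The function name `generation_kwargs`, the strategy labels, and the specific values are assumptions made for this illustration. Only the dictionary keys correspond to `generate()` parameters, and suitable values depend on the model and the images.

```python
def generation_kwargs(strategy="beam", max_length=50):
    """Return keyword arguments for `model.generate()` for a named strategy.

    The strategy labels are invented for this sketch; the values are
    illustrative starting points, not tuned recommendations.
    """
    common = {"max_length": max_length}
    if strategy == "greedy":
        # Always take the single most likely next token
        return {**common, "num_beams": 1, "do_sample": False}
    if strategy == "beam":
        # Track several candidates and forbid repeated bigrams
        return {**common, "num_beams": 4, "early_stopping": True,
                "no_repeat_ngram_size": 2}
    if strategy == "sample":
        # Draw tokens at random from a truncated distribution
        return {**common, "do_sample": True, "top_k": 50, "top_p": 0.95}
    raise ValueError(f"unknown strategy: {strategy!r}")


print(generation_kwargs("sample"))
```

The result can be unpacked into the call from the example, for instance `model.generate(pixel_values, **generation_kwargs("sample"))[0]`. Sampling gives a different caption on each run, whereas greedy and beam search are deterministic.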

Citations:
[1] https://www.geeksforgeeks.org/image-captioning-using-python/
[2] https://github.com/topics/image-caption-generator
[3] https://github.com/cobanov/image-captioning
[4] https://www.youtube.com/watch?v=okSrioyYnHw
[5] https://data-flair.training/blogs/python-based-project-image-caption-generator-cnn/