@tberends tberends commented Aug 2, 2025

Description

Requested by @SkalskiP in PR: https://github.com/roboflow/notebooks/pull/384

This PR improves the documentation regarding the ordering of content in requests that combine images with text prompts. Following Google's Gemini API best practices, text prompts are now placed after image parts in the contents array when using a single image with text.

Type of change

  • This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

According to the Gemini API documentation on image prompts, when using a single image with text, the recommended approach is to place the text prompt after the image part in the contents array. This ordering has been shown to produce significantly better results in practice.

In our testing of object detection on Process and Instrumentation Diagrams (P&IDs), this reordering markedly improved the accuracy of bounding box positions. The object labels were already accurate; what improved with the new prompt ordering was the spatial precision of the detected elements.
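
For reference, the ordering described above can be sketched as a request body in the Gemini REST shape (`contents` → `parts`, with `inline_data` for the image and `text` for the prompt). This is a minimal illustration only, not the code from the notebooks: the helper name `build_contents`, the placeholder image bytes, and the example prompt are assumptions made for this sketch.

```python
import base64
import json


def build_contents(image_bytes: bytes, mime_type: str, prompt: str) -> list[dict]:
    """Build a single-turn `contents` array with the image part before the text part.

    Hypothetical helper for illustration; it only assembles the payload and
    does not call the API.
    """
    image_part = {
        "inline_data": {
            "mime_type": mime_type,
            # Inline image data is sent base64-encoded.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }
    text_part = {"text": prompt}
    # Single image with text: image part first, text prompt last.
    return [{"parts": [image_part, text_part]}]


# Placeholder bytes stand in for a real image file (assumption for illustration).
contents = build_contents(
    b"\x89PNG placeholder",
    "image/png",
    "Detect all valves and return their bounding boxes.",
)
print(json.dumps({"contents": contents}, indent=2))
```

Swapping the two entries in `parts` gives the text-first ordering that this PR recommends against for single-image prompts.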

Docs

  • Docs updated? What were the changes: updated the tips for prompt engineering

@tberends tberends requested a review from SkalskiP as a code owner August 2, 2025 11:57