fix(vertexai): add missing role attributes when handling images #3347
Conversation
When processing input arguments with images in Vertex AI instrumentation, the role attribute was not being set for prompt spans. This caused inconsistency with other LLM instrumentations, such as OpenAI and Gemini, which properly set role attributes.

Changes:
- Add role="user" attribute in the async set_input_attributes() function
- Add role="user" attribute in the set_input_attributes_sync() function
- Ensures consistency with OpenTelemetry semantic conventions
- Matches the OpenAI instrumentation pattern

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
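A minimal sketch of the fixed attribute emission, assuming a simplified span object and a literal `gen_ai.prompt` prefix (the real code uses `SpanAttributes.LLM_PROMPTS` and the repo's `_set_span_attribute` helper; the names and loop shape here are illustrative, not the repo's exact code):

```python
import json

LLM_PROMPTS = "gen_ai.prompt"  # stand-in for SpanAttributes.LLM_PROMPTS


def _set_span_attribute(span, key, value):
    # Mirror the repo helper's behavior: skip empty values.
    if value is not None and value != "":
        span.set_attribute(key, value)


def set_input_attributes_sync(span, args):
    """Emit role and content attributes for each prompt argument."""
    if not span.is_recording():
        return
    for i, arg in enumerate(args):
        content = json.dumps([{"type": "text", "text": str(arg)}])
        # The fix: set the role attribute alongside the existing content one.
        _set_span_attribute(span, f"{LLM_PROMPTS}.{i}.role", "user")
        _set_span_attribute(span, f"{LLM_PROMPTS}.{i}.content", content)
```

With a recording span, a call like `set_input_attributes_sync(span, ["Hello"])` yields `gen_ai.prompt.0.role = "user"` next to the content attribute, matching what the other instrumentations emit.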
Walkthrough: Adds "role" span attributes for each prompt argument in both async and sync input attribute setters, marking them as "user" alongside existing content attributes. Introduces tests validating role and content behavior across enabled/disabled prompts, recording state, multiple/mixed arguments, and image-content cases.
Sequence Diagram(s):

```mermaid
sequenceDiagram
    autonumber
    actor Caller
    participant Utils as set_input_attributes(_sync)
    participant Gate as should_send_prompts
    participant Span as Span
    Caller->>Utils: set_input_attributes(args, span)
    Utils->>Gate: check prompts enabled
    Gate-->>Utils: True/False
    alt prompts enabled and span.is_recording
        loop for each arg i with processed_content
            Utils->>Span: set_attribute(LLM_PROMPTS.i.role, "user")
            Utils->>Span: set_attribute(LLM_PROMPTS.i.content, processed_content)
        end
    else not enabled or not recording
        note over Utils: No attributes set
    end
```
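The gate in the diagram reduces to a single conjunction before the per-argument loop. A hedged sketch of that control flow (simplified names; `gen_ai.prompt` stands in for `SpanAttributes.LLM_PROMPTS`, and `prompts_enabled` stands in for the `should_send_prompts` check):

```python
import json


def set_prompt_attributes(span, args, prompts_enabled):
    """Sketch of the diagram's flow: gate first, then per-arg role/content."""
    # "not enabled or not recording" branch: emit nothing at all.
    if not (prompts_enabled and span.is_recording()):
        return
    for i, arg in enumerate(args):
        processed_content = json.dumps([{"type": "text", "text": str(arg)}])
        span.set_attribute(f"gen_ai.prompt.{i}.role", "user")
        span.set_attribute(f"gen_ai.prompt.{i}.content", processed_content)
```

The important property, which the new tests exercise, is that the negative branch sets no attributes whatsoever: a disabled prompt gate and a non-recording span behave identically.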
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10–15 minutes
Important
Looks good to me! 👍
Reviewed everything up to 6dc8314 in 36 seconds. Details:
- Reviewed 28 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves, and react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py:217
   Draft comment: Good addition of the role attribute in the async set_input_attributes. Consider extracting the constant 'user' into a shared constant if used elsewhere to reduce duplication.
   Reason this comment was not posted: confidence changes required (0%) <= threshold (50%).
2. packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py:242
   Draft comment: Role attribute addition in the sync version looks correct. You might consider refactoring duplicate role-setting logic into a helper function to improve maintainability.
   Reason this comment was not posted: confidence changes required (0%) <= threshold (50%).
Workflow ID: wflow_BwSOjydSwpaiocMC
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (2)
259-260: Inconsistent key: `.0.user` vs `.0.role` + `.0.content`.

`set_model_input_attributes` still writes `gen_ai.prompt.0.user`, which diverges from the new `.role`/`.content` scheme used elsewhere. Recommend aligning for consistency and downstream consumers. Apply this diff in place:

```diff
-    _set_span_attribute(
-        span, f"{SpanAttributes.LLM_PROMPTS}.0.user", kwargs.get("prompt")
-    )
+    _set_span_attribute(
+        span, f"{SpanAttributes.LLM_PROMPTS}.0.role", "user"
+    )
+    _set_span_attribute(
+        span,
+        f"{SpanAttributes.LLM_PROMPTS}.0.content",
+        json.dumps([{"type": "text", "text": kwargs.get("prompt")}]),
+    )
```
98-106: Potential silent failure: lost exceptions/empty URL from background upload (sync image path).

If `upload_base64_image` raises inside the thread, the exception won't surface here and `url` may remain `None`, producing an invalid `image_url`. Add a guard to skip setting image content when the upload fails. Apply after `run_async(upload_task())`:

```diff
     run_async(upload_task())
+    if not url:
+        logger.warning("upload_base64_image returned no URL; skipping image content.")
+        return None
+
     return {
         "type": "image_url",
         "image_url": {"url": url}
     }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)

`**/*.py` — 📄 CodeRabbit inference engine (CLAUDE.md):
- Store API keys only in environment variables/secure vaults; never hardcode secrets in code
- Use Flake8 for code linting and adhere to its rules

Files:
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
🧬 Code graph analysis (1)

packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (2):
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py: `_set_span_attribute` (20-24)
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py: `SpanAttributes` (64-261)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.10)
🔇 Additional comments (3)

packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (3)

217-221: Good addition: set role before content for prompts (async). Setting `gen_ai.prompt.{i}.role = "user"` prior to `.content` aligns with completions and other instrumentations.

241-245: Good addition: set role before content for prompts (sync). Mirrors the async path; consistent behavior across both code paths.
206-227: No duplicate prompt attributes in VertexAI instrumentation.

I verified the call sites in `__init__.py` where both functions run:
- Async handler (`_handle_request`):
  - `set_model_input_attributes(span, kwargs, llm_model)` at line 183
  - `await set_input_attributes(span, args)` at line 187
- Sync wrapper:
  - `set_model_input_attributes(span, kwargs, llm_model)` at line 293
  - `set_input_attributes_sync(span, args)` at line 297

`set_model_input_attributes` only sets `SpanAttributes.LLM_REQUEST_MODEL`, while both `set_input_attributes` and `set_input_attributes_sync` exclusively set `SpanAttributes.LLM_PROMPTS.*` fields (`role` and `content`). There is no overlap, so no risk of double-writing or conflicting prompt attributes.
Important
Looks good to me! 👍
Reviewed 5ffd36d in 46 seconds. Details:
- Reviewed 514 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves, and react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py:37
   Draft comment: Tests comprehensively cover role attribute handling for various input types (string, contents, prompt, custom role, and image content). The use of patch and side_effect for _is_image_part is robust and aligns with expected behavior.
   Reason this comment was not posted: confidence changes required (0%) <= threshold (50%).
2. packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py:35
   Draft comment: VertexAI tests correctly validate role attribute handling in both async and sync functions with diverse inputs. Note the different function signature (no extra kwargs) compared to Google GenerativeAI and the inclusion of image content tests. Consider adding a trailing newline for stylistic consistency.
   Reason this comment was not posted: confidence changes required (0%) <= threshold (50%).
Workflow ID: wflow_OynnNstCxmZW9VQu
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 4
🧹 Nitpick comments (8)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py (4)
63-66: Use JSON structure equality instead of string equality.

```diff
-        expected_content = json.dumps([{"type": "text", "text": "Hello from contents!"}])
-        assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"] == expected_content
+        expected_content = [{"type": "text", "text": "Hello from contents!"}]
+        assert json.loads(self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"]) == expected_content
```

79-82: Add a content assertion for prompt kwargs for parity.

```diff
         # Verify role attribute is set
         assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
         assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
+        # And content
+        assert f"{SpanAttributes.LLM_PROMPTS}.0.content" in self.span_attributes
+        expected_content = [{"type": "text", "text": "Hello from prompt!"}]
+        assert json.loads(self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"]) == expected_content
```

137-138: Prefer JSON structural compare (avoids ordering brittleness).

```diff
-        expected_content = json.dumps([{"type": "text", "text": "Hello, world!"}])
-        assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"] == expected_content
+        expected_content = [{"type": "text", "text": "Hello, world!"}]
+        assert json.loads(self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"]) == expected_content
```

21-27: Trim trailing whitespace to satisfy Flake8 (W291/W293).

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (4)

91-95: Use json.loads for content equality (avoid brittle string compare).

```diff
-        expected_content = json.dumps([{"type": "text", "text": "Hello, world!"}])
-        assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"] == expected_content
+        expected_content = [{"type": "text", "text": "Hello, world!"}]
+        assert json.loads(self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"]) == expected_content
```

173-190: Make image detection deterministic by patching the helper (mirrors GGAI tests).

```diff
     @patch("opentelemetry.instrumentation.vertexai.span_utils.should_send_prompts")
+    @patch("opentelemetry.instrumentation.vertexai.span_utils._is_image_part")
     @pytest.mark.asyncio
-    async def test_async_role_attribute_for_image_content(self, mock_should_send_prompts):
+    async def test_async_role_attribute_for_image_content(self, mock_is_image_part, mock_should_send_prompts):
@@
-        # Create a mock image-like object (simplified for testing)
+        # Mock image detection
+        def side_effect(item):
+            return hasattr(item, "mime_type") and str(item.mime_type).startswith("image/")
+        mock_is_image_part.side_effect = side_effect
+
+        # Create a mock image-like object (simplified for testing)
         mock_image_part = Mock()
         mock_image_part.mime_type = "image/jpeg"
```

194-213: Do the same deterministic patch for the sync image test.

```diff
     @patch("opentelemetry.instrumentation.vertexai.span_utils.should_send_prompts")
-    def test_sync_role_attribute_for_image_content(self, mock_should_send_prompts):
+    @patch("opentelemetry.instrumentation.vertexai.span_utils._is_image_part")
+    def test_sync_role_attribute_for_image_content(self, mock_is_image_part, mock_should_send_prompts):
@@
-        # Create a mock image-like object (simplified for testing)
+        # Mock image detection
+        def side_effect(item):
+            return hasattr(item, "mime_type") and str(item.mime_type).startswith("image/")
+        mock_is_image_part.side_effect = side_effect
+
+        # Create a mock image-like object (simplified for testing)
         mock_image_part = Mock()
         mock_image_part.mime_type = "image/jpeg"
```

1-213: Optional: reduce duplication via parametrization/fixture helpers. Many tests duplicate setup and assertions across async/sync; consider a shared fixture for span capture and parametrize message cases.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py (1 hunks)
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)

`**/*.py` — 📄 CodeRabbit inference engine (CLAUDE.md):
- Store API keys only in environment variables/secure vaults; never hardcode secrets in code
- Use Flake8 for code linting and adhere to its rules

Files:
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
🧬 Code graph analysis (2)

packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py (2):
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py: `SpanAttributes` (64-261)
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (9): `TestRoleAttributes` (11-212), `setup_method` (14-26), `capture_attribute` (23-24), `test_async_role_attribute_for_string_args` (30-45), `test_async_role_attribute_for_multiple_args` (49-62), `test_sync_role_attribute_for_string_args` (79-94), `test_async_no_role_when_prompts_disabled` (127-135), `test_async_no_role_when_span_not_recording` (150-159), `test_async_role_attribute_for_image_content` (175-192)

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (2):
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py: `SpanAttributes` (64-261)
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py (4): `setup_method` (14-26), `capture_attribute` (23-24), `test_async_role_attribute_for_string_args` (30-46), `test_async_role_attribute_for_image_content` (245-265)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Lint
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (3)

packages/opentelemetry-instrumentation-google-generativeai/tests/test_role_attributes.py (1)

242-289: Good coverage for image handling with explicit _is_image_part patching.

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (2)

28-172: Core role-attribute coverage looks good and aligns with the PR goal.

4-7: Signature check passed across instrumentations. VertexAI's async set_input_attributes(span, args) and sync set_input_attributes_sync(span, args) match the test calls, and Google GenerativeAI's async and sync versions accept (span, args, kwargs, llm_model). _is_image_part is only defined in the GGAI instrumentation.
… GenerativeAI

Add test coverage for role attribute handling in both instrumentation packages:
- VertexAI: Tests for the recent fix that added missing role="user" attributes when processing arguments containing images
- Google GenerativeAI: Tests confirming role attributes work correctly for all content types, including images (no fixes needed)

Tests cover:
- Text-only prompts with role attributes
- Image-containing arguments with role attributes
- Multiple arguments handling
- Custom role attributes (assistant, system)
- Edge cases (disabled prompts, non-recording spans)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
5ffd36d to 5974e0c (Compare)
Important
Looks good to me! 👍
Reviewed 5974e0c in 57 seconds. Details:
- Reviewed 218 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 5 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves, and react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py:11
   Draft comment: Overall, the tests comprehensively cover both sync and async flows for role attributes. Consider parameterizing repeated patterns to reduce redundancy.
   Reason this comment was not posted: confidence changes required (33%) <= threshold (50%).
2. packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py:43
   Draft comment: The test compares JSON strings for the content attribute. Consider deserializing and comparing objects to avoid potential formatting issues.
   Reason this comment was not posted: confidence changes required (33%) <= threshold (50%).
3. packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py:70
   Draft comment: In test_async_role_attribute_with_mixed_content, the input is a nested list of text only. The test name suggests mixed content; consider including a non-text (e.g. image-like) element or renaming the test.
   Reason this comment was not posted: confidence changes required (33%) <= threshold (50%).
4. packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py:180
   Draft comment: The mock image object only sets mime_type. Consider enhancing it with additional attributes if realistic image behavior is needed in the instrumentation.
   Reason this comment was not posted: confidence changes required (33%) <= threshold (50%).
5. packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py:195
   Draft comment: In the sync image content test, the assertion only checks for the existence of the content attribute. Consider verifying the actual formatted content to ensure consistency.
   Reason this comment was not posted: confidence changes required (33%) <= threshold (50%).
Workflow ID: wflow_CK8f2Sdga1D0qFht
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 3
♻️ Duplicate comments (2)

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (2)

14-27: Record attribute call order to assert role-before-content. Without tracking call order, you can't guarantee role is set before content. Apply:

```diff
     def setup_method(self):
         """Setup test fixtures"""
         self.mock_span = Mock()
         self.mock_span.is_recording.return_value = True
         self.mock_span.context.trace_id = "test_trace_id"
         self.mock_span.context.span_id = "test_span_id"
         self.span_attributes = {}
-        # Mock set_attribute to capture the attributes
-        def capture_attribute(key, value):
-            self.span_attributes[key] = value
+        # Mock set_attribute to capture the attributes and their call order
+        self.attr_order = []
+        def capture_attribute(key, value):
+            self.span_attributes[key] = value
+            self.attr_order.append(key)
         self.mock_span.set_attribute = capture_attribute
```

38-46: Assert role precedes content and compare JSON structurally. String equality on JSON dumps is brittle; also verify ordering. Apply:

```diff
-        # Verify content is also set
-        assert f"{SpanAttributes.LLM_PROMPTS}.0.content" in self.span_attributes
-        expected_content = json.dumps([{"type": "text", "text": "Hello, world!"}])
-        assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"] == expected_content
+        # Verify content is set and came after role
+        role_key = f"{SpanAttributes.LLM_PROMPTS}.0.role"
+        content_key = f"{SpanAttributes.LLM_PROMPTS}.0.content"
+        assert content_key in self.span_attributes
+        assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+        expected_content = [{"type": "text", "text": "Hello, world!"}]
+        assert json.loads(self.span_attributes[content_key]) == expected_content
```
🧹 Nitpick comments (9)

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (9)

70-77: Mixed-content: verify content presence, JSON shape, and order. Avoid overfitting to representation; just assert list JSON and role-before-content. Apply:

```diff
         # Verify role attribute is set
         assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
         assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
+        # Verify content set and order
+        role_key = f"{SpanAttributes.LLM_PROMPTS}.0.role"
+        content_key = f"{SpanAttributes.LLM_PROMPTS}.0.content"
+        assert content_key in self.span_attributes
+        assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+        parts = json.loads(self.span_attributes[content_key])
+        assert isinstance(parts, list) and len(parts) >= 1
```

117-124: Sync mixed-content: assert content exists, JSON shape, and ordering. Keeps the test robust across content encodings. Apply:

```diff
         # Verify role attribute is set
         assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
         assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
+        # Verify content set and order
+        role_key = f"{SpanAttributes.LLM_PROMPTS}.0.role"
+        content_key = f"{SpanAttributes.LLM_PROMPTS}.0.content"
+        assert content_key in self.span_attributes
+        assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+        parts = json.loads(self.span_attributes[content_key])
+        assert isinstance(parts, list) and len(parts) >= 1
```

134-136: Scope the "no attributes" assertions to LLM_PROMPTS keys. Other attributes may be set; check only prompt attributes. Apply:

```diff
-        # Verify no attributes are set when prompts are disabled
-        assert len(self.span_attributes) == 0
+        # Verify no LLM_PROMPTS attributes are set when prompts are disabled
+        assert not any(
+            key.startswith(f"{SpanAttributes.LLM_PROMPTS}") for key in self.span_attributes
+        )
```

145-147: Sync disabled: restrict assertion to prompt keys. Same reasoning as the async case. Apply:

```diff
-        # Verify no attributes are set when prompts are disabled
-        assert len(self.span_attributes) == 0
+        # Verify no LLM_PROMPTS attributes are set when prompts are disabled
+        assert not any(
+            key.startswith(f"{SpanAttributes.LLM_PROMPTS}") for key in self.span_attributes
+        )
```

158-160: Span not recording (async): restrict assertion to prompt keys. Avoid false failures due to unrelated attributes. Apply:

```diff
-        # Verify no attributes are set when span is not recording
-        assert len(self.span_attributes) == 0
+        # Verify no LLM_PROMPTS attributes are set when span is not recording
+        assert not any(
+            key.startswith(f"{SpanAttributes.LLM_PROMPTS}") for key in self.span_attributes
+        )
```

170-172: Span not recording (sync): restrict assertion to prompt keys. Mirror the async negative case. Apply:

```diff
-        # Verify no attributes are set when span is not recording
-        assert len(self.span_attributes) == 0
+        # Verify no LLM_PROMPTS attributes are set when span is not recording
+        assert not any(
+            key.startswith(f"{SpanAttributes.LLM_PROMPTS}") for key in self.span_attributes
+        )
```

187-194: Image-content (async): assert order and JSON shape without overfitting. Validate role-before-content and that content is a valid JSON list. Apply:

```diff
         # Verify role attribute is set even when content includes images
         assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
         assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
         # Verify content is also set
         assert f"{SpanAttributes.LLM_PROMPTS}.0.content" in self.span_attributes
+        role_key = f"{SpanAttributes.LLM_PROMPTS}.0.role"
+        content_key = f"{SpanAttributes.LLM_PROMPTS}.0.content"
+        assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+        parts = json.loads(self.span_attributes[content_key])
+        assert isinstance(parts, list)
```

207-213: Image-content (sync): assert order and JSON shape. Keep parity with the async image test. Apply:

```diff
         # Verify role attribute is set even when content includes images
         assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
         assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
         # Verify content is also set
         assert f"{SpanAttributes.LLM_PROMPTS}.0.content" in self.span_attributes
+        role_key = f"{SpanAttributes.LLM_PROMPTS}.0.role"
+        content_key = f"{SpanAttributes.LLM_PROMPTS}.0.content"
+        assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+        parts = json.loads(self.span_attributes[content_key])
+        assert isinstance(parts, list)
```

28-213: Reduce duplication with helper/assertions or parametrization. A small helper (assert_role_and_content(span_attrs, idx, text)) or pytest parametrization can DRY async/sync and multi-arg variants.
Happy to propose a minimal helper and fold tests with pytest.mark.parametrize if you want.
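One possible shape for that helper, under the assumption that the tests capture attributes into a dict and call order into a list as suggested in the setup_method diff above (all names here are hypothetical, not the repo's code):

```python
import json


def assert_role_and_content(span_attrs, attr_order, idx, text, prefix="gen_ai.prompt"):
    """Assert prompt `idx` has role='user', the expected text content,
    and that the role attribute was recorded before the content attribute."""
    role_key = f"{prefix}.{idx}.role"
    content_key = f"{prefix}.{idx}.content"
    assert span_attrs[role_key] == "user"
    assert json.loads(span_attrs[content_key]) == [{"type": "text", "text": text}]
    assert attr_order.index(role_key) < attr_order.index(content_key)
```

Each async/sync test body then collapses to a call like `assert_role_and_content(self.span_attributes, self.attr_order, 0, "Hello, world!")`, and pytest parametrization over the message cases handles the multi-arg variants.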
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)

`**/*.py` — 📄 CodeRabbit inference engine (CLAUDE.md):
- Store API keys only in environment variables/secure vaults; never hardcode secrets in code
- Use Flake8 for code linting and adhere to its rules

Files:
- packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
🧬 Code graph analysis (1)

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (1):
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py: `SpanAttributes` (64-261)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
🔇 Additional comments (1)

packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (1)

1-9: Imports and patch targets look correct. Modules used, async markers, and the patched path to should_send_prompts are appropriate.
```python
        # Verify role attributes are set for both arguments
        assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
        assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"

        assert f"{SpanAttributes.LLM_PROMPTS}.1.role" in self.span_attributes
        assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.1.role"] == "user"
```
🛠️ Refactor suggestion
Also assert contents and per-arg ordering for multi-arg case.
Currently only role is asserted; add content and order checks for each arg.
Apply:
```diff
 # Verify role attributes are set for both arguments
 assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
 assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
 assert f"{SpanAttributes.LLM_PROMPTS}.1.role" in self.span_attributes
 assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.1.role"] == "user"
+
+# Verify contents set and ordering for both arguments
+for i, text in enumerate(["First message", "Second message"]):
+    role_key = f"{SpanAttributes.LLM_PROMPTS}.{i}.role"
+    content_key = f"{SpanAttributes.LLM_PROMPTS}.{i}.content"
+    assert content_key in self.span_attributes
+    assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+    assert json.loads(self.span_attributes[content_key]) == [{"type": "text", "text": text}]
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Verify role attributes are set for both arguments
assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
assert f"{SpanAttributes.LLM_PROMPTS}.1.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.1.role"] == "user"

# Verify contents set and ordering for both arguments
for i, text in enumerate(["First message", "Second message"]):
    role_key = f"{SpanAttributes.LLM_PROMPTS}.{i}.role"
    content_key = f"{SpanAttributes.LLM_PROMPTS}.{i}.content"
    assert content_key in self.span_attributes
    assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
    assert json.loads(self.span_attributes[content_key]) == [{"type": "text", "text": text}]
```
🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
around lines 57-63, the test currently only asserts the role keys for each
prompt index; add assertions to verify each prompt's content exists and matches
the expected per-argument text and that ordering is preserved. Concretely,
assert f"{SpanAttributes.LLM_PROMPTS}.0.content" and
f"{SpanAttributes.LLM_PROMPTS}.1.content" are present in self.span_attributes
and equal the first and second prompt strings used in the test (e.g.,
first_prompt_text and second_prompt_text), and optionally build a list
[self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"],
self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.1.content"]] and assert it
equals [first_prompt_text, second_prompt_text] to ensure ordering.
```python
# Verify role attribute is set
assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"

# Verify content is also set
assert f"{SpanAttributes.LLM_PROMPTS}.0.content" in self.span_attributes
expected_content = json.dumps([{"type": "text", "text": "Hello, world!"}])
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"] == expected_content
```
🛠️ Refactor suggestion
Sync path: assert order and parse JSON structurally.
Mirror the async assertions for consistency.
Apply:
- # Verify content is also set
- assert f"{SpanAttributes.LLM_PROMPTS}.0.content" in self.span_attributes
- expected_content = json.dumps([{"type": "text", "text": "Hello, world!"}])
- assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.content"] == expected_content
+ # Verify content is set and came after role
+ role_key = f"{SpanAttributes.LLM_PROMPTS}.0.role"
+ content_key = f"{SpanAttributes.LLM_PROMPTS}.0.content"
+ assert content_key in self.span_attributes
+ assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+ expected_content = [{"type": "text", "text": "Hello, world!"}]
+ assert json.loads(self.span_attributes[content_key]) == expected_content
🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
around lines 87 to 95, the sync test currently checks the prompt content by
string equality; change it to mirror the async assertions by first asserting the
role attribute exists and equals "user", then asserting the content attribute
exists, parsing the stored JSON with json.loads and comparing the resulting
Python object to the expected list/dict structure (e.g.,
[{"type":"text","text":"Hello, world!"}]) so the check is structural and
order-safe rather than raw string equality.
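The reason structural comparison is preferred here: `json.dumps` output depends on key order and whitespace, so raw string equality can fail even when the data matches. A standalone illustration:

```python
import json

stored = '{"type": "text", "text": "Hello, world!"}'

# Re-serializing the same data with a different key order yields a different string...
reserialized = json.dumps({"text": "Hello, world!", "type": "text"})
assert stored != reserialized

# ...but parsing both sides compares the underlying structure, which matches.
assert json.loads(stored) == json.loads(reserialized)
assert json.loads(stored) == {"type": "text", "text": "Hello, world!"}
```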
```python
# Verify role attributes are set for both arguments
assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"

assert f"{SpanAttributes.LLM_PROMPTS}.1.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.1.role"] == "user"
```
🛠️ Refactor suggestion
Sync multi-arg: add content and ordering checks.
Bring parity with async test and increase signal.
Apply:
# Verify role attributes are set for both arguments
assert f"{SpanAttributes.LLM_PROMPTS}.0.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.0.role"] == "user"
assert f"{SpanAttributes.LLM_PROMPTS}.1.role" in self.span_attributes
assert self.span_attributes[f"{SpanAttributes.LLM_PROMPTS}.1.role"] == "user"
+
+ # Verify contents set and ordering for both arguments
+ for i, text in enumerate(["First message", "Second message"]):
+ role_key = f"{SpanAttributes.LLM_PROMPTS}.{i}.role"
+ content_key = f"{SpanAttributes.LLM_PROMPTS}.{i}.content"
+ assert content_key in self.span_attributes
+ assert self.attr_order.index(role_key) < self.attr_order.index(content_key)
+ assert json.loads(self.span_attributes[content_key]) == [{"type": "text", "text": text}]
🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
around lines 105 to 111, the sync multi-arg test currently only checks that role
attributes exist and equal "user"; update it to match the async test by also
asserting the prompt content and explicit ordering values for both arguments:
add assertions that f"{SpanAttributes.LLM_PROMPTS}.0.content" and
f"{SpanAttributes.LLM_PROMPTS}.1.content" are present and equal to the expected
prompt strings used in the test, and assert that
f"{SpanAttributes.LLM_PROMPTS}.0.order" == 0 and
f"{SpanAttributes.LLM_PROMPTS}.1.order" == 1 (or the same ordering
representation used in the async test) so content and ordering parity are
verified.
Summary

Add missing role="user" attributes in the set_input_attributes functions.

Background

The Vertex AI instrumentation was missing role attributes when handling images or other complex input arguments. This caused inconsistency with other instrumentations such as OpenAI and Gemini, which properly set role attributes alongside content.
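A minimal sketch of the pattern this fix follows — the attribute prefix and span interface below are simplified stand-ins, not the real instrumentation code:

```python
import json

LLM_PROMPTS = "gen_ai.prompt"  # stand-in for SpanAttributes.LLM_PROMPTS

class FakeSpan:
    """Minimal span stub that stores attributes in a dict."""
    def __init__(self):
        self.attributes = {}

    def is_recording(self):
        return True

    def set_attribute(self, key, value):
        self.attributes[key] = value

def set_input_attributes_sync(span, args, should_send_prompts=True):
    """For each prompt argument, set the role first, then the content."""
    if not (should_send_prompts and span.is_recording()):
        return
    for i, arg in enumerate(args):
        content = [{"type": "text", "text": arg}]
        span.set_attribute(f"{LLM_PROMPTS}.{i}.role", "user")  # the added attribute
        span.set_attribute(f"{LLM_PROMPTS}.{i}.content", json.dumps(content))

span = FakeSpan()
set_input_attributes_sync(span, ["First message", "Second message"])
```

The same role-before-content ordering applies to the async variant; only the function signature differs.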
Changes

- set_input_attributes(): Added role="user" attribute before setting content
- set_input_attributes_sync(): Added role="user" attribute before setting content

Test plan
🤖 Generated with Claude Code
Important

Adds missing role="user" attributes to Vertex AI instrumentation for consistency with OpenTelemetry conventions, with comprehensive tests.

- Adds role="user" attribute in set_input_attributes() and set_input_attributes_sync() before setting content.
- Adds test_role_attributes.py to validate role and content attributes for sync/async paths, multiple/mixed inputs, image content, and disabled/non-recording scenarios.

This description was created by for 5974e0c. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit