Describe the bug
When I'm using a ParallelAgent as an AgentTool in the root agent, the image that I load and save as an artifact is not visible to the sub_agents, and not even to the root agent. Is there an example of how this is supposed to work?
To Reproduce
```python
from google.adk.agents import Agent, ParallelAgent
from google.adk.tools.agent_tool import AgentTool
from google.genai.types import Part

from .tools.debug_tools import log_llm_request
from app.core.config import settings
from .prompt import ANALYSIS_PROMPT
from .subagents.content_quality_agent.agent import content_quality_agent
from .subagents.facial_mood_agent.agent import facial_mood_agent
from .subagents.ocr_agent.agent import ocr_agent
from .subagents.scene_agent.agent import scene_agent
from .tools.analysis_submission_tool import submit_final_analysis
from .tools.image_tools import fetch_and_load_image

descriptions = {
    'analyze_image_details': (
        'Analyzes the details of an image, such as scene, text, mood, and '
        'quality, returning a dictionary with all results.'
    ),
    'analysis_agent': (
        'A friendly conversational assistant for the Vibe app that can '
        'analyze images.'
    ),
}

analyze_image_details = ParallelAgent(
    name='analyze_image_details',
    description=descriptions['analyze_image_details'],
    sub_agents=[
        scene_agent,
        ocr_agent,
        facial_mood_agent,
        content_quality_agent,
    ],
)

conversational_agent = Agent(
    model=settings.gemini.flash_model,
    name='ConversationalAgent',
    description=descriptions['analysis_agent'],
    instruction=ANALYSIS_PROMPT,
    tools=[
        fetch_and_load_image,
        AgentTool(agent=analyze_image_details),
        submit_final_analysis,
    ],
)

root_agent = conversational_agent
```
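For reference, here is a minimal sketch of how I would expect the runner to be wired so that the root agent, the AgentTool invocation, and all ParallelAgent sub_agents share one artifact store. The `InMemoryArtifactService`, `InMemorySessionService`, and the app/user/session ids below are placeholders, not my real setup:

```python
from google.adk.artifacts import InMemoryArtifactService
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

# Placeholder wiring: one artifact service instance shared by the whole agent tree.
runner = Runner(
    app_name='vibe',
    agent=root_agent,
    session_service=InMemorySessionService(),
    artifact_service=InMemoryArtifactService(),
)


async def ask(session_id: str, text: str) -> None:
    # A session for ('user-1', session_id) is assumed to have been created beforehand.
    message = types.Content(role='user', parts=[types.Part(text=text)])
    async for event in runner.run_async(
        user_id='user-1', session_id=session_id, new_message=message
    ):
        if event.content and event.content.parts:
            print(event.author, event.content.parts[0].text)
```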
The `fetch_and_load_image` tool:
```python
import logging
import uuid

from google.adk.tools import ToolContext
from google.genai.types import Blob, Part

from app.core.config import settings
from app.infra.services.r2_storage_service import R2StorageService


async def fetch_and_load_image(
    tool_context: ToolContext, source_name: str
) -> str:
    """
    Fetches an image from R2 using the source name (object key)
    and loads its bytes and mime_type into the tool context state.

    Args:
        tool_context: The current invocation context, provided by the ADK.
        source_name: The source name (object key) of the image in R2.

    Returns:
        A success message indicating the image has been loaded into the state.
    """
    object_key = source_name
    logging.info(f'Using source name as object key: {object_key}')
    logging.info(
        f"Attempting to download key: '{object_key}' from bucket: "
        f"'{settings.r2.bucket_name}'"
    )
    try:
        storage_service = R2StorageService()
        image_bytes = storage_service.download_file_as_bytes(object_key)
        extension = object_key.split('.')[-1].lower()
        mime_type = (
            f'image/{extension}'
            if extension in {'jpeg', 'png', 'gif', 'webp'}
            else 'image/jpeg'
        )
        artifact_filename = f'user:downloaded_image_{uuid.uuid4()}.jpg'
        try:
            artifact_part = Part(
                inline_data=Blob(mime_type=mime_type, data=image_bytes)
            )
            # Fix: await the async save_artifact call
            await tool_context.save_artifact(
                filename=artifact_filename, artifact=artifact_part
            )
            tool_context.state['artifact_to_process'] = artifact_filename
            logging.info('[Tool]: Artifact saved and filename stored in state.')
            logging.info(f'[TOOL]: Artifact image size: {len(image_bytes)} bytes')
        except ValueError as e:
            logging.error(f'Error saving artifact: {e}')
        except Exception as e:
            logging.error(f'Unexpected error saving artifact reference: {e}')

        success_message = (
            f"Successfully fetched and loaded image '{object_key}' "
            'into the context state.'
        )
        logging.info(success_message)
        return success_message
    except Exception as e:
        logging.error(f'Failed to fetch and load image from R2: {e}')
        raise
```
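As a sanity check, I would expect the saved artifact to be loadable again from any tool that shares the same context, roughly like this. This is a throwaway debug helper, not part of the app, and it assumes `load_artifact` is awaitable like `save_artifact`:

```python
import logging

from google.adk.tools import ToolContext


async def check_loaded_artifact(tool_context: ToolContext) -> str:
    """Debug helper: try to re-load the artifact saved by fetch_and_load_image."""
    filename = tool_context.state.get('artifact_to_process')
    if not filename:
        return 'No artifact filename found in state.'

    # Assumption: load_artifact mirrors save_artifact and must be awaited.
    part = await tool_context.load_artifact(filename)
    if part is None or part.inline_data is None:
        return f'Artifact {filename!r} could not be loaded.'

    logging.info(
        '[DEBUG] Re-loaded %s (%s, %d bytes)',
        filename,
        part.inline_data.mime_type,
        len(part.inline_data.data),
    )
    return f'Artifact {filename!r} is loadable in this context.'
```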
One of the sub-agents (`scene_agent`):
```python
from google.adk.agents import Agent

from app.core.config import settings
from .model import SceneAnalysis
from .prompt import SCENE_ANALYZER_PROMPT
from ...tools.debug_tools import log_llm_request

description = (
    'Analyzes the scene in an image to understand the environment '
    'and characteristics of the scene.'
)

scene_agent = Agent(
    model=settings.gemini.flash_model,
    name='SceneAgent',
    description=description,
    instruction=SCENE_ANALYZER_PROMPT,
    output_schema=SceneAnalysis,
    output_key='scene_analysis',
    disallow_transfer_to_parent=True,
    disallow_transfer_to_peers=True,
    before_model_callback=[log_llm_request],
)
```
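A workaround I am considering is a before_model_callback on each sub-agent that re-attaches the artifact to the outgoing request. This is an untested sketch; it assumes async callbacks are allowed, that `load_artifact` is awaitable, and that appending a user `Content` to `llm_request.contents` is the right way to inject the image:

```python
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse
from google.genai import types


async def attach_image_artifact(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    """Load the artifact named in state and append it to the LLM request."""
    filename = callback_context.state.get('artifact_to_process')
    if not filename:
        return None  # nothing to attach; let the request go through unchanged

    # Assumption: load_artifact is awaitable, like save_artifact in the tool above.
    image_part = await callback_context.load_artifact(filename)
    if image_part is None:
        return None

    # Assumption: appending a user Content with the image Part makes the
    # image visible to the sub-agent's model call.
    llm_request.contents.append(
        types.Content(role='user', parts=[image_part])
    )
    return None


# Then, on each sub-agent:
# before_model_callback=[attach_image_artifact, log_llm_request]
```

If there is a more idiomatic way to make an artifact saved by the root agent's tool visible to AgentTool sub_agents, that is exactly what I'm looking for.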
Expected behavior
There should be a way to pass artifacts to, or load them inside, the sub_agents.
Desktop (please complete the following information):
- OS: macOS
- Python version (python -V): 3
- ADK version (pip show google-adk): 1.16.0
Model Information:
- Are you using LiteLLM: No
- Which model is being used (e.g. gemini-2.5-flash): gemini-2.5-flash
Additional context
Here is the output of my debug tool:
```
ontent=True partial=None calls=1
2025-10-20 17:22:26 [2025-10-20 15:22:26,170: INFO/ForkPoolWorker-2] [DEBUG] author=VibeConversationalAgent part 0 function_call: fetch_and_load_image
2025-10-20 17:22:26 [2025-10-20 15:22:26,172: INFO/ForkPoolWorker-2] Using source name as object key: places/ef5f2c0a-a63d-4763-b23c-dfe0b8b8ea91/sambalatte_at_molasky_center.jpg
2025-10-20 17:22:26 [2025-10-20 15:22:26,172: INFO/ForkPoolWorker-2] Attempting to download key: 'places/ef5f2c0a-a63d-4763-b23c-dfe0b8b8ea91/sambalatte_at_molasky_center.jpg' from bucket: 'vibe-backend-storage'
2025-10-20 17:22:26 [2025-10-20 15:22:26,701: INFO/ForkPoolWorker-2] File places/ef5f2c0a-a63d-4763-b23c-dfe0b8b8ea91/sambalatte_at_molasky_center.jpg downloaded from R2 as bytes.
2025-10-20 17:22:26 [2025-10-20 15:22:26,717: INFO/ForkPoolWorker-2] [Tool]: Artifact saved and filename stored in state.
2025-10-20 17:22:26 [2025-10-20 15:22:26,717: INFO/ForkPoolWorker-2] [TOOL]: Artifact image size: 117934 bytes
2025-10-20 17:22:26 [2025-10-20 15:22:26,717: INFO/ForkPoolWorker-2] Successfully fetched and loaded image 'places/ef5f2c0a-a63d-4763-b23c-dfe0b8b8ea91/sambalatte_at_molasky_center.jpg' into the context state.
2025-10-20 17:22:26 [2025-10-20 15:22:26,720: INFO/ForkPoolWorker-2] [IA] Ev: author=VibeConversationalAgent final=False content=True partial=None calls=0
2025-10-20 17:22:26 [2025-10-20 15:22:26,720: INFO/ForkPoolWorker-2] [DEBUG] author=VibeConversationalAgent part 0 fn_response: {'result': "Successfully fetched and loaded image 'places/ef5f2c0a-a63d-4763-b23c-dfe0b8b8ea91/sambalatte_at_molasky_center.jpg' into the context state."}
2025-10-20 17:22:26 [2025-10-20 15:22:26,720: INFO/ForkPoolWorker-2] [IA] Captured tool response text
2025-10-20 17:22:26 [2025-10-20 15:22:26,800: INFO/ForkPoolWorker-2] Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-10-20 17:22:26 [2025-10-20 15:22:26,801: INFO/ForkPoolWorker-2] AFC is enabled with max remote calls: 10.
2025-10-20 17:22:30 [2025-10-20 15:22:30,643: INFO/ForkPoolWorker-2] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-10-20 17:22:30 [2025-10-20 15:22:30,646: INFO/ForkPoolWorker-2] Response received from the model.
2025-10-20 17:22:30 [2025-10-20 15:22:30,646: WARNING/ForkPoolWorker-2] Warning: there are non-text parts in the response: ['thought_signature', 'function_call'], returning concatenated text result from text parts. Check the full candidates.content.parts accessor to get the full model response.
2025-10-20 17:22:30 [2025-10-20 15:22:30,647: INFO/ForkPoolWorker-2] [IA] Ev: author=VibeConversationalAgent final=False content=True partial=None calls=1
2025-10-20 17:22:30 [2025-10-20 15:22:30,647: INFO/ForkPoolWorker-2] [DEBUG] author=VibeConversationalAgent part 0 function_call: analyze_image_details
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] ================================================================================
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] LLM REQUEST CONTEXT FOR AGENT: 'SceneAgent'
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] --------------------------------------------------------------------------------
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] System Instruction:
2025-10-20 17:22:30 ---
2025-10-20 17:22:30 You are a highly perceptive AI assistant
2025-10-20 17:22:30 specializing in visual analysis of places and social environments.
2025-10-20 17:22:30 Your goal is to analyze the image provided in the context and describe the scene,
2025-10-20 17:22:30 atmosphere, and social context in detail.
2025-10-20 17:22:30
2025-10-20 17:22:30 **CRITICAL: Describe ONLY what you actually see in the image.
2025-10-20 17:22:30 Do NOT hallucinate or invent details.**
2025-10-20 17:22:30
2025-10-20 17:22:30 **1. PEOPLE & CROWD ANALYSIS** (REQUIRED):
2025-10-20 17:22:30 - **Visible People Count:** Count how many people are clearly visible
2025-10-20 17:22:30 (0, 1-2, 3-10, 11-30, 30+)
2025-10-20 17:22:30 - **People Activities:** What are people doing? (dancing, sitting,
2025-10-20 17:22:30 standing, eating, drinking, working on laptops, talking, posing)
2025-10-20 17:22:30 - **Social Dynamics:** Are people in groups, couples, or alone?
2025-10-20 17:22:30 What's the social energy?
2025-10-20 17:22:30 - **Facial Expressions:** Are people smiling, serious, energetic,
2025-10-20 17:22:30 relaxed?
2025-10-20 17:22:30
2025-10-20 17:22:30 **2. ENVIRONMENT TYPE** (be specific):
2025-10-20 17:22:30 - **Nightlife:** nightclub, dance club, bar, lounge, live music venue
2025-10-20 17:22:30 - **Dining:** restaurant, cafe, food court, bistro, fast food
2025-10-20 17:22:30 - **Work/Study:** coworking space, library, coffee shop (with laptops)
2025-10-20 17:22:30 - **Outdoor:** park, street, plaza, beach, garden
2025-10-20 17:22:30 - **Cultural:** museum, gallery, theater, concert hall
2025-10-20 17:22:30 - **Retail:** store, mall, market
2025-10-20 17:22:30 - **Other:** specify exactly what you see
2025-10-20 17:22:30
2025-10-20 17:22:30 **3. LIGHTING ANALYSIS** (critical for atmosphere):
2025-10-20 17:22:30 - **Type:** natural daylight, bright artificial, dim ambient, neon
2025-10-20 17:22:30 lights, stage lighting, colorful LED, candlelight, spotlights
2025-10-20 17:22:30 - **Quality:** harsh, soft, dramatic, moody, vibrant
2025-10-20 17:22:30 - **Color:** warm tones, cool tones, multicolor, monochrome
2025-10-20 17:22:30
2025-10-20 17:22:30 **4. ATMOSPHERE & ENERGY INDICATORS**:
2025-10-20 17:22:30 - **Sound Environment Clues:** DJ booth, stage with performers,
2025-10-20 17:22:30 large speakers, musical instruments, microphones, sound equipment,
2025-10-20 17:22:30 dance floor, OR quiet setting with no equipment
2025-10-20 17:22:30 - **Energy Level:**
2025-10-20 17:22:30 - HIGH: people dancing, movement, raised hands, energetic poses
2025-10-20 17:22:30 - MEDIUM: people chatting, eating, casual standing/sitting
2025-10-20 17:22:30 - LOW: people working, reading, quiet, minimal interaction
2025-10-20 17:22:30 - **Formality:** casual party, formal event, everyday setting
2025-10-20 17:22:30
2025-10-20 17:22:30 **5. DECOR & STYLE**:
2025-10-20 17:22:30 - Describe furniture, decorations, architectural style
2025-10-20 17:22:30 - Note any distinctive features (murals, art, plants, signage)
2025-10-20 17:22:30
2025-10-20 17:22:30 **6. NOISE LEVEL INFERENCE** (based on visual cues):
2025-10-20 17:22:30 - **LOUD indicators:** DJ equipment, stage, large speakers, dense
2025-10-20 17:22:30 crowd, people dancing, live performance, nightclub setting
2025-10-20 17:22:30 - **MODERATE indicators:** restaurant with groups, bar with
2025-10-20 17:22:30 conversations, moderate crowd
2025-10-20 17:22:30 - **QUIET indicators:** few people, work setting with laptops,
2025-10-20 17:22:30 library, peaceful outdoor space
2025-10-20 17:22:30
2025-10-20 17:22:30 **7. CROWD LEVEL INFERENCE**:
2025-10-20 17:22:30 - **Empty/Solo:** 0-2 people visible
2025-10-20 17:22:30 - **Few People:** 3-10 people, spacious feeling
2025-10-20 17:22:30 - **Moderate Crowd:** 11-30 people, comfortably busy
2025-10-20 17:22:30 - **Packed:** 30+ people, dense, crowded feeling
2025-10-20 17:22:30
2025-10-20 17:22:30 **Analysis Example for Nightclub Photo:**
2025-10-20 17:22:30 "A vibrant nightclub scene with 5+ people visible in fashionable
2025-10-20 17:22:30 club attire (crop tops, mesh, sequins). Stage lighting visible in
2025-10-20 17:22:30 background with colorful LED lights creating a party atmosphere.
2025-10-20 17:22:30 People are posing together in groups, smiling and energetic.
2025-10-20 17:22:30 Environment shows typical nightclub elements including stage lights
2025-10-20 17:22:30 and dark atmospheric lighting with neon accents. High energy social
2025-10-20 17:22:30 scene with nightlife setting. Based on stage lighting and dense
2025-10-20 17:22:30 social grouping, noise level would be LOUD. Crowd level: FEW PEOPLE
2025-10-20 17:22:30 visible in frame but suggests larger venue."
2025-10-20 17:22:30
2025-10-20 17:22:30 **Final Step:**
2025-10-20 17:22:30 After your analysis, you MUST call the `save_analysis_result` tool
2025-10-20 17:22:30 to save your findings to the database.
2025-10-20 17:22:30
2025-10-20 17:22:30
2025-10-20 17:22:30 You are an agent. Your internal name is "SceneAgent".
2025-10-20 17:22:30
2025-10-20 17:22:30 The description about you is "Analyzes the scene in an image to understand the environment and characteristics of the scene."
2025-10-20 17:22:30 ---
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] Message History (Contents):
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] [0] Role: user
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] Part 0 (Text): "{"image_to_analyze": "places/ef5f2c0a-a63d-4763-b23c-dfe0b8b8ea91/sambalatte_at_molasky_center.jpg"}"
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] Available Tools:
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] <No tools>
2025-10-20 17:22:30 [2025-10-20 15:22:30,650: INFO/ForkPoolWorker-2] ================================================================================
2025-10-20 17:22:30 [2025-10-20 15:22:30,733: INFO/ForkPoolWorker-2] Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-10-20 17:22:30 [2025-10-20 15:22:30,733: INFO/ForkPoolWorker-2] AFC is enabled with max remote calls: 10.
2025-10-20 17:22:30 [2025-10-20 15:22:30,822: INFO/ForkPoolWorker-2] Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-10-20 17:22:30 [2025-10-20 15:22:30,822: INFO/ForkPoolWorker-2] AFC is enabled with max remote calls: 10.
2025-10-20 17:22:30 [2025-10-20 15:22:30,911: INFO/ForkPoolWorker-2] Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-10-20 17:22:30 [2025-10-20 15:22:30,911: INFO/ForkPoolWorker-2] AFC is enabled with max remote calls: 10.
2025-10-20 17:22:30 [2025-10-20 15:22:30,992: INFO/ForkPoolWorker-2] Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-10-20 17:22:30 [2025-10-20 15:22:30,993: INFO/ForkPoolWorker-2] AFC is enabled with max remote calls: 10.
2025-10-20 17:22:33 [2025-10-20 15:22:33,806: INFO/ForkPoolWorker-2] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-10-20 17:22:33 [2025-10-20 15:22:33,807: INFO/ForkPoolWorker-2] Response received from the model.
2025-10-20 17:22:37 [2025-10-20 15:22:37,528: INFO/ForkPoolWorker-2] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-10-20 17:22:37 [2025-10-20 15:22:37,529: INFO/ForkPoolWorker-2] Response received from the model.
2025-10-20 17:22:40 [2025-10-20 15:22:40,728: INFO/ForkPoolWorker-2] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-10-20 17:22:40 [2025-10-20 15:22:40,730: INFO/ForkPoolWorker-2] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-10-20 17:22:40 [2025-10-20 15:22:40,733: INFO/ForkPoolWorker-2] Response received from the model.
2025-10-20 17:22:40 [2025-10-20 15:22:40,735: INFO/ForkPoolWorker-2] Response received from the model.
2025-10-20 17:22:40 [2025-10-20 15:22:40,740: INFO/ForkPoolWorker-2] [IA] Ev: author=VibeConversationalAgent final=False content=True partial=None calls=0
2025-10-20 17:22:40 [2025-10-20 15:22:40,742: INFO/ForkPoolWorker-2] [DEBUG] author=VibeConversationalAgent part 0 fn_response: {'result': '{"detected_mood": "Neutral", "confidence": 0.85}'}
2025-10-20 17:22:40 [2025-10-20 15:22:40,742: INFO/ForkPoolWorker-2] [IA] Captured tool response text
2025-10-20 17:22:40 [2025-10-20 15:22:40,823: INFO/ForkPoolWorker-2] Sending out request, model: gemini-2.5-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-10-20 17:22:40 [2025-10-20 15:22:40,823: INFO/ForkPoolWorker-2] AFC is enabled with max remote calls: 10.
2025-10-20 17:22:42 [2025-10-20 15:22:42,597: INFO/ForkPoolWorker-2] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-10-20 17:22:42 [2025-10-20 15:22:42,602: INFO/ForkPoolWorker-2] Response received from the model.
2025-10-20 17:22:42 [2025-10-20 15:22:42,602: WARNING/ForkPoolWorker-2] Warning: there are non-text parts in the response: ['thought_signature', 'function_call'], returning concatenated text result from text parts. Check the full candidates.content.parts accessor to get the full model response.
2025-10-20 17:22:42 [2025-10-20 15:22:42,603: INFO/ForkPoolWorker-2] [IA] Ev: author=VibeConversationalAgent final=False content=True partial=None calls=1
2025-10-20 17:22:42 [2025-10-20 15:22:42,603: INFO/ForkPoolWorker-2] [DEBUG] author=VibeConversationalAgent part 0 function_call: submit_final_analysis
```