
feat: Implement LLM-Powered Adaptive Replay and Auto-Documentation #953


Conversation

@shephinphilip commented Jun 5, 2025

This commit introduces an LLM-powered adaptive replay strategy and an auto-documentation feature.

Key changes include:

  1. LLMAdaptiveStrategy (openadapt/strategies/llm_adaptive_strategy.py):

    • A new replay strategy that inherits from BaseReplayStrategy.
    • Intent abstraction: I use an LLM (via the generate_action_event.j2 prompt) to determine the next action from the recorded actions, the current UI state, and the user's task description (see the first sketch after this list).
    • Semantic matching & adaptation: I implemented a UI consistency check (_is_ui_consistent_for_next_original_action) that decides whether to replay a recorded action directly or to let the LLM adapt it when the UI has changed. The check compares window titles, window dimensions, and screenshot similarity (see the second sketch after this list).
    • Basic error recovery: I overrode the run method to add a post-action check using prompt_is_action_complete. If an action does not complete as expected, the failure is logged and the changed UI state is handled implicitly in the next cycle (see the third sketch after this list).
    • Action history is managed consistently in self.action_events.
  2. Auto-Documentation Script (openadapt/scripts/generate_documentation.py):

    • A new script that takes a recording timestamp.
    • It loads the recording and prepares context (action details, window states, screenshots).
    • It uses the describe_recording.j2 prompt to ask an LLM for a human-readable summary of the recording (see the final sketch after this list).
    • It prints the generated documentation to the console.
  3. Integration & Prompts:

    • The new strategy is dynamically discovered by the system.
    • It leverages existing prompt infrastructure and LLM adapter configurations.
    • Relevant prompts (generate_action_event.j2, describe_recording.j2, is_action_complete.j2, system.j2) are utilized.
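
For concreteness, here is a minimal sketch of how the strategy's decision logic could look. The class, file, and prompt names come from this PR; the _prompt_llm_for_action helper and the processed_action_events accessor are illustrative assumptions, not the PR's exact code.

```python
# Minimal sketch of the core decision loop in
# openadapt/strategies/llm_adaptive_strategy.py. Names marked
# "assumed" are illustrative, not the PR's exact code.
from openadapt import models
from openadapt.strategies.base import BaseReplayStrategy


class LLMAdaptiveStrategy(BaseReplayStrategy):
    """Replay strategy that falls back to an LLM when the UI has drifted."""

    def get_next_action_event(
        self,
        screenshot: models.Screenshot,
        window_event: models.WindowEvent,
    ) -> models.ActionEvent | None:
        # self.action_events holds the actions played so far; assuming a
        # 1:1 mapping with recorded actions, its length indexes the next
        # recorded action to consider.
        idx = len(self.action_events)
        recorded = self.recording.processed_action_events  # assumed accessor
        if idx >= len(recorded):
            return None  # recording exhausted; stop replaying

        if self._is_ui_consistent_for_next_original_action(
            screenshot, window_event, recorded[idx]
        ):
            # UI still matches the recording: replay the action verbatim.
            return recorded[idx]

        # UI has drifted: ask the LLM (generate_action_event.j2) to adapt
        # the recorded intent to the current UI state.
        return self._prompt_llm_for_action(  # assumed helper
            recorded, self.action_events, screenshot, window_event
        )
```

Because the class subclasses BaseReplayStrategy, it should be selectable like the built-in strategies (e.g. via OpenAdapt's usual `python -m openadapt.replay "LLMAdaptiveStrategy"` invocation, assuming the standard entry point).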
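
Here is a sketch of the UI consistency check itself, comparing the three signals named above. The grayscale mean-absolute-difference metric and the 0.9 threshold are assumptions; the actual implementation may use a different similarity measure.

```python
# Sketch of the consistency check (a method of LLMAdaptiveStrategy,
# shown standalone). Metric and threshold are illustrative assumptions.
import numpy as np

from openadapt import models


def _is_ui_consistent_for_next_original_action(
    self,
    screenshot: models.Screenshot,
    window_event: models.WindowEvent,
    recorded_action: models.ActionEvent,
    similarity_threshold: float = 0.9,  # assumed tunable cutoff
) -> bool:
    recorded_window = recorded_action.window_event
    # 1. Window titles must match.
    if window_event.title != recorded_window.title:
        return False
    # 2. Window dimensions must match.
    if (window_event.width, window_event.height) != (
        recorded_window.width,
        recorded_window.height,
    ):
        return False
    # 3. Screenshots must be visually similar; as a simple proxy, use the
    # normalized mean absolute pixel difference on grayscale images.
    current = np.asarray(screenshot.image.convert("L"), dtype=np.float32)
    reference = np.asarray(
        recorded_action.screenshot.image.convert("L"), dtype=np.float32
    )
    if current.shape != reference.shape:
        return False
    similarity = 1.0 - float(np.mean(np.abs(current - reference))) / 255.0
    return similarity >= similarity_threshold
```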
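
The error-recovery override could look roughly as follows. The capture and playback helpers (take_screenshot, get_active_window_event, play_action_event) and the _prompt_is_action_complete wrapper around is_action_complete.j2 are assumptions based on OpenAdapt's existing patterns, not verified code.

```python
# Sketch of the overridden run method (of LLMAdaptiveStrategy) with a
# post-action completion check; helpers marked "assumed" are not verified.
from loguru import logger

from openadapt import models
from openadapt.playback import play_action_event  # assumed module/function


def run(self) -> None:
    while True:
        screenshot = models.Screenshot.take_screenshot()  # assumed
        window_event = models.WindowEvent.get_active_window_event()  # assumed
        action = self.get_next_action_event(screenshot, window_event)
        if action is None:
            break
        self.action_events.append(action)  # keep history consistent
        play_action_event(action)  # assumed signature

        # Post-action verification via the is_action_complete.j2 prompt.
        post_screenshot = models.Screenshot.take_screenshot()  # assumed
        if not self._prompt_is_action_complete(action, post_screenshot):  # assumed helper
            logger.warning(f"action may not have completed: {action=}")
            # No explicit retry: the changed UI state is picked up
            # implicitly by the consistency check on the next iteration.
```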
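
Finally, an outline of what generate_documentation.py does, per item 2 above. The crud lookup, template helper, and adapter call are illustrative assumptions about OpenAdapt's internals, not the script's verbatim contents.

```python
# Hypothetical outline of openadapt/scripts/generate_documentation.py.
# The crud lookup, template helper, and adapter call are assumed.
import sys

from openadapt import utils
from openadapt.db import crud


def generate_documentation(timestamp: float) -> None:
    # 1. Load the recording by its timestamp.
    recording = crud.get_recording(timestamp)  # assumed signature

    # 2. Prepare context: action details, window states, screenshots.
    context = [
        {
            "name": action.name,
            "text": action.text,
            "window_title": action.window_event.title,
        }
        for action in recording.processed_action_events
    ]

    # 3. Render describe_recording.j2 and ask an LLM for a summary.
    prompt = utils.render_template_from_file(  # assumed prompt helper
        "openadapt/prompts/describe_recording.j2",
        task_description=recording.task_description,
        context=context,
    )
    from openadapt.adapters import openai  # assumed adapter module
    documentation = openai.prompt(prompt)

    # 4. Print the generated documentation to the console.
    print(documentation)


if __name__ == "__main__":
    generate_documentation(float(sys.argv[1]))
```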

Steps I Took:

  • Initial planning and codebase exploration.
  • Created LLMAdaptiveStrategy class structure.
  • Implemented LLM-based intent abstraction in get_next_action_event.
  • Added semantic replay matching logic to choose intelligently between replaying original actions and using the LLM for adaptation.
  • Implemented basic error detection in the strategy's run method.
  • Ensured the new strategy integrates with the existing system.
  • Developed the generate_documentation.py script for auto-documentation.

This work fulfills the core requirements of the issue: an intelligent replay system that uses LLMs to generalize, abstract, and execute workflows across varying UI states, plus auto-documentation of recordings. Unit tests were planned as the next step.

@shephinphilip changed the title from "Worked on create an intelligent replay system that uses LLMs to generalize, abstract, and execute workflows across slightly varying UI/app states." to "feat: Implement LLM-Powered Adaptive Replay and Auto-Documentation" on Jun 5, 2025
@shephinphilip deleted the jules_wip_17590803739021845458 branch on June 5, 2025 07:19
@abrichr (Member) commented Jul 6, 2025

@shephinphilip thank you for your interest, and for your patience!

I would love to know how far you got with this, and why you chose to close it.

@shephinphilip (Author) commented:

> @shephinphilip thank you for your interest, and for your patience!
>
> I would love to know how far you got with this, and why you chose to close it.

I created another one: #954
