Description
Browsers like Chrome (I think Firefox might have support now too) can be started with a command line option (`--remote-debugging-port` in Chrome) that opens a debugging port. Through it you get a websocket connection that can be used to control the browser fully.
You can get a list of open tabs, select one of them (for example chatgpt.com or the Gemini website), and fill in forms you find on those pages.
This is very much like Puppeteer, but instead of working in a separate background/headless browser, it works in your real day-to-day browser, in which you're already logged into ChatGPT/Anthropic/Google AI etc.
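To make the "list tabs, pick one" step concrete, here is a rough sketch. It assumes Chrome was started with `--remote-debugging-port=9222` (the port number is just an example) and uses the `/json/list` HTTP endpoint that Chrome exposes on that port. The names `DebugTarget`, `pickTab` and `listTabs` are made up for illustration and are not part of prompt-tower.

```typescript
// Subset of the fields Chrome returns for each entry of /json/list.
interface DebugTarget {
  id: string;
  type: string;
  title: string;
  url: string;
  webSocketDebuggerUrl?: string;
}

// Pick the first open page whose hostname matches, e.g. "chatgpt.com".
// Non-page targets (service workers, extensions, ...) are skipped.
function pickTab(targets: DebugTarget[], hostname: string): DebugTarget | undefined {
  return targets.find((t) => {
    if (t.type !== "page") return false;
    try {
      return new URL(t.url).hostname === hostname;
    } catch {
      return false; // URL could not be parsed, so it cannot match
    }
  });
}

// Fetch the tab list from the debugging port (assumed: 9222 on localhost).
async function listTabs(port = 9222): Promise<DebugTarget[]> {
  const res = await fetch(`http://127.0.0.1:${port}/json/list`);
  if (!res.ok) throw new Error(`debug endpoint returned ${res.status}`);
  return (await res.json()) as DebugTarget[];
}
```

Usage would look something like `pickTab(await listTabs(), "chatgpt.com")`. The `webSocketDebuggerUrl` of the result is the websocket address for controlling that one tab.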
We could use this to add a feature to prompt-tower: a "push" button at the bottom of the page. You generate a prompt, and instead of having to copy/paste it, you just click "push" and it's automagically added to the textarea input of your AI page of choice.
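A minimal sketch of what the "push" could send over the tab's websocket, assuming the Chrome DevTools Protocol `Runtime.evaluate` command. The selector is a placeholder that would have to be found per site, and the sketch assumes the input is a plain `<textarea>`; pages that use a contenteditable element or a framework-controlled input would probably need a different approach. `buildPushCommand` and `pushPrompt` are hypothetical names.

```typescript
// A DevTools Protocol command as sent over the tab's websocket.
interface CdpCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Build a Runtime.evaluate command that writes `prompt` into the element
// matching `selector` (assumed to be a <textarea>) and fires an input event.
function buildPushCommand(id: number, selector: string, prompt: string): CdpCommand {
  // JSON.stringify escapes quotes and newlines, so the selector and prompt
  // are embedded as string literals and cannot break out of the expression.
  const expression = `(() => {
    const el = document.querySelector(${JSON.stringify(selector)});
    if (!el) return false;
    el.value = ${JSON.stringify(prompt)};
    el.dispatchEvent(new Event("input", { bubbles: true }));
    return true;
  })()`;
  return { id, method: "Runtime.evaluate", params: { expression, returnByValue: true } };
}

// Send the command over anything with a send() method, e.g. a WebSocket
// connected to the tab's webSocketDebuggerUrl.
function pushPrompt(socket: { send(data: string): void }, selector: string, prompt: string): void {
  socket.send(JSON.stringify(buildPushCommand(1, selector, prompt)));
}
```

The expression evaluates to `true` or `false` inside the page, so the response could be used to tell the user when the input field was not found, which is probably the first sign that the page's HTML has changed.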
The feature would need to be updated whenever the HTML of a target page changes too much, but I think that would probably be manageable.
What do you think?
(In theory, we could also use the same technique to grab the answer from the LLM and use that for tool use and back-and-forth debugging, but I looked at the TOS for OpenAI and Anthropic and that's unfortunately against their rules. Just pushing to the page doesn't seem to be a problem, though.)