Selectron is an AI web parsing library & CLI designed around two goals:
- Fully automated parser generation – an AI "compiles" (generates) parsers on demand
- Efficient parser execution – parsers are cached, so there are no LLM calls at runtime
- Chrome integration: Connects to Chrome over CDP and receives live DOM and screenshot data from your active tab (see the sketch after this list). Selectron uses minimal dependencies – no browser-use or stagehand, not even Playwright (we prefer direct CDP).
- Fully automated parser generation: An AI agent generates selectors for content described with natural language. Another agent generates code to extract data from selected containers. The final result is a parser.
- CLI application: When you run the Textual CLI, parsed data is saved to a DuckDB database, making it easy to analyze your browsing history or extract structured data from websites. Built-in parsers include:
  - HackerNews
  - (Please contribute more!)
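
For a taste of what "direct CDP" involves, here is a minimal sketch of grabbing the live DOM and a screenshot from a tab. It assumes Chrome was launched with `--remote-debugging-port=9222` and uses the generic `httpx` and `websockets` packages purely for illustration – Selectron's own CDP client may look quite different.

```python
import asyncio
import base64
import json

import httpx       # assumption: illustrative only, not necessarily a Selectron dependency
import websockets  # assumption: illustrative only, not necessarily a Selectron dependency


async def cdp_call(ws, msg_id: int, method: str, params: dict | None = None) -> dict:
    """Send one CDP command and wait for its matching response."""
    await ws.send(json.dumps({"id": msg_id, "method": method, "params": params or {}}))
    while True:
        msg = json.loads(await ws.recv())
        if msg.get("id") == msg_id:
            return msg["result"]


async def grab_tab() -> tuple[str, str, bytes]:
    # Chrome's HTTP endpoint lists debuggable targets.
    tabs = httpx.get("http://localhost:9222/json").json()
    # Picks the first page target; finding the truly active tab takes more work.
    page = next(t for t in tabs if t["type"] == "page")

    # max_size=None because screenshots can exceed the default message limit.
    async with websockets.connect(page["webSocketDebuggerUrl"], max_size=None) as ws:
        # Live DOM, serialized to HTML.
        html = (await cdp_call(ws, 1, "Runtime.evaluate",
                               {"expression": "document.documentElement.outerHTML"}))["result"]["value"]
        # Screenshot arrives base64-encoded.
        shot = base64.b64decode((await cdp_call(ws, 2, "Page.captureScreenshot"))["data"])
        return page["url"], html, shot


url, html, png = asyncio.run(grab_tab())
```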
```sh
# Install in a venv
uv add selectron
uv run selectron

# Or install globally
pipx install selectron
selectron
```
When you run `selectron`, it creates a DuckDB database in your app directory and saves parsed data from a given URL to a table named by the URL slug: `x.com/home` -> `x.com~~2fhome` (Selectron uses a reversible slug system; see the sketch below).
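
The exact encoding is internal to Selectron, but the one published example (`/` -> `~~2f`, i.e. `~~` plus the character's hex code) suggests a percent-encoding-style scheme. A hedged reconstruction, for intuition only:

```python
import re

# Assumption: alphanumerics and dots pass through untouched; everything else is
# escaped as ~~XX (two lowercase hex digits). This matches the one documented
# example but is a guess at the real scheme.
SAFE = re.compile(r"[A-Za-z0-9.]")


def slugify(url: str) -> str:
    return "".join(c if SAFE.fullmatch(c) else f"~~{ord(c):02x}" for c in url)


def unslugify(slug: str) -> str:
    return re.sub(r"~~([0-9a-f]{2})", lambda m: chr(int(m.group(1), 16)), slug)


assert slugify("x.com/home") == "x.com~~2fhome"
assert unslugify("x.com~~2fhome") == "x.com/home"
```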
When you run `selectron` inside this repo, parsers are saved to the `src` directory (if a parser for the URL didn't already exist). When you run `selectron` outside this repo, parsers are saved to the app directory (and will overwrite existing parsers).
```python
import json

from selectron.lib import parse

# ... get html from browser ...
res = parse(url, html)
print(json.dumps(res, indent=2))
```
If a parser is registered for the URL, you'll receive something like this:
```json
[
  {
    "primary_url": "/_its_not_real_/status/1918760851957321857",
    "datetime": "2025-05-03T20:13:30.000Z",
    "id": "1918760851957321857",
    "author": "@_its_not_real_",
    "description": "\"They're made out of meat.\"\n\"Meat?\"\n\"Meat. Humans. They're made entirely out of meat.\"\n\"But that's impossible. What about all the tokens they generate? The text? The code?\"\n\"They do produce tokens, but the tokens aren't their essence. They're merely outputs. The humans themselves",
    "images": [
      { "src": "https://pbs.twimg.com/profile_images/1307877522726682625/t5r3D_-n_x96.jpg" },
      { "src": "https://pbs.twimg.com/profile_images/1800173618652979201/2cDLkS53_bigger.jpg" }
    ]
  }
]
```
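
When the CLI runs, rows like these land in the DuckDB database, so you can query them directly. A minimal sketch, assuming you substitute the database path Selectron actually uses on your platform:

```python
import duckdb

# Assumption: the database filename and app-directory location vary by platform;
# replace this path with wherever your Selectron install keeps its database.
con = duckdb.connect("/path/to/app-dir/selectron.duckdb")

# Table names are URL slugs, so they must be double-quoted in SQL.
rows = con.execute('SELECT * FROM "x.com~~2fhome" LIMIT 5').fetchall()
for row in rows:
    print(row)
```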
The `selectron.chrome` and `selectron.ai` modules are useful but still baking and subject to breaking changes – please pin your minor version.
Generating parsers is easy, because it's mostly automated:
- Clone the repo.
- Run the CLI (`make dev`) and connect to Chrome.
- In Chrome, open the page you want to parse. In the CLI, describe your selection (or use the AI-generated proposal).
- Start AI selection (you can stop at any time to use the currently highlighted selector).
- Start AI parser generation. The parser will be saved to the appropriate location in `/src`.
- Review the parser's results and open a PR (please show what the parser produces).
```sh
make install
make dev
# see Makefile for other commands
# see .env.EXAMPLE for config options
```