
⣏ Selectron ⣹


Selectron is an AI web parsing library & CLI designed around two goals:

  1. Fully automated parser generation – an AI "compiles" (generates) parsers on demand
  2. Efficient parser execution – parsers are cached, so there are no LLM calls at runtime


Demo videos

Save your Twitter feed to DuckDB


Generate a new scraper with AI


How it works

  • Chrome integration: Connects to Chrome over CDP and receives live DOM and screenshot data from your active tab. Selectron uses minimal dependencies – no browser-use or stagehand, not even Playwright (we prefer direct CDP).
  • Fully automated parser generation: An AI agent generates selectors for content described with natural language. Another agent generates code to extract data from selected containers. The final result is a parser.
  • CLI application: When you run the Textual CLI, parsed data is saved to a DuckDB database, making it easy to analyze your browsing history or extract structured data from websites. Built-in parsers include:
    • Twitter
    • LinkedIn
    • HackerNews
    • (Please contribute more!)
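The CDP connection described above starts with Chrome's DevTools HTTP endpoint, which lists debuggable targets as JSON. A minimal stdlib sketch of that first step, assuming Chrome was launched with `--remote-debugging-port=9222` (selectron's actual connection code may differ; `list_tabs` is an illustrative helper, not part of the library):

```python
import json
from urllib.request import urlopen

DEVTOOLS = "http://localhost:9222/json"  # Chrome's DevTools target list

def list_tabs(raw: str) -> list[dict]:
    """Filter the /json response down to page targets (open tabs)."""
    return [t for t in json.loads(raw) if t.get("type") == "page"]

# Live usage (requires a running Chrome with remote debugging enabled):
#   tabs = list_tabs(urlopen(DEVTOOLS).read().decode())
#   ws_url = tabs[0]["webSocketDebuggerUrl"]  # CDP websocket for that tab
```

From there, a CDP client speaks JSON-RPC over that websocket to subscribe to DOM and screenshot events, which is the data selectron consumes.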

Use the CLI

# Install in a venv
uv add selectron
uv run selectron

# Or install globally
pipx install selectron
selectron

When you run selectron, it creates a DuckDB database in your app directory and saves parsed data from a given URL to a table named after the URL slug:

  • x.com/home -> x.com~~2fhome (Selectron uses a reversible slug system)
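The slug scheme isn't specified beyond the example above, but a plausible sketch that reproduces it percent-encodes the URL and swaps `%` for a filesystem-safe `~~` marker. Note `url_to_slug`/`slug_to_url` are hypothetical names for illustration, not selectron's API, and the real scheme may differ:

```python
from urllib.parse import quote, unquote

def url_to_slug(url: str) -> str:
    # Percent-encode everything, then replace '%' with '~~' so the result
    # is safe as a table name; lowercased to match the x.com~~2fhome example.
    return quote(url, safe="").replace("%", "~~").lower()

def slug_to_url(slug: str) -> str:
    # Reverse: restore '%' markers, then percent-decode.
    return unquote(slug.replace("~~", "%"))
```

Because the encoding is injective, the slug round-trips back to the original URL, which is what makes the scheme reversible.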

When you run selectron inside this repo, parsers are saved to the src directory (if a parser for the URL doesn't already exist).

When you run selectron outside this repo, parsers are saved to your app directory (overwriting any existing parser).

Use the library

Parse HTML

import json

from selectron.lib import parse

# ... get the url and html from your browser ...
res = parse(url, html)
print(json.dumps(res, indent=2))

If a parser is registered for the URL, you'll receive something like this:

[
  {
    "primary_url": "/_its_not_real_/status/1918760851957321857",
    "datetime": "2025-05-03T20:13:30.000Z",
    "id": "1918760851957321857",
    "author": "@_its_not_real_",
    "description": "\"They're made out of meat.\"\n\"Meat?\"\n\"Meat. Humans. They're made entirely out of meat.\"\n\"But that's impossible. What about all the tokens they generate? The text? The code?\"\n\"They do produce tokens, but the tokens aren't their essence. They're merely outputs. The humans themselves",
    "images": [{ "src": "https://pbs.twimg.com/profile_images/1307877522726682625/t5r3D_-n_x96.jpg" }, { "src": "https://pbs.twimg.com/profile_images/1800173618652979201/2cDLkS53_bigger.jpg" }]
  }
]
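Results like this are plain lists of dicts, so persisting them yourself is straightforward. A sketch using stdlib sqlite3 as a stand-in for the CLI's DuckDB storage (the table layout here is illustrative; the CLI's actual schema may differ):

```python
import json
import sqlite3

def save_rows(conn: sqlite3.Connection, table: str, rows: list[dict]) -> None:
    # Store each parsed item as a JSON blob keyed by its id.
    # Quoted identifiers let slug-style table names like x.com~~2fhome work.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (id TEXT PRIMARY KEY, data TEXT)')
    conn.executemany(
        f'INSERT OR REPLACE INTO "{table}" VALUES (?, ?)',
        [(row.get("id"), json.dumps(row)) for row in rows],
    )
    conn.commit()
```

Upserting on the id keeps re-parses of the same page from duplicating rows, which matters when you're repeatedly capturing a live feed.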

Other functionality

The selectron.chrome and selectron.ai modules are useful, but still baking, and subject to breaking changes – please pin your minor version.

Contributing

Generating parsers is easy because the process is mostly automated:

  1. Clone the repo
  2. Run the CLI (make dev). Connect to Chrome.
  3. In Chrome, open the page you want to parse. In the CLI, describe your selection (or use the AI-generated proposal).
  4. Start AI selection (you can stop at any time to use the current highlighted selector).
  5. Start AI parser generation. The parser will be saved to the appropriate location in /src.
  6. Review the parser's results and open a PR (please show what the parser produces).

Setup

make install
make dev
# see Makefile for other commands
# see .env.EXAMPLE for config options