Advanced Twitter Automation AI is a modular Python-based framework for automating X (Twitter) at scale. It supports multiple accounts, robust Selenium automation (with optional undetected Chrome + stealth), per‑account proxies and rotation, structured LLM generation/analysis (OpenAI, Azure OpenAI, Gemini), community posting, and per‑account metrics/logs.
- Multi-Account Management: Seamlessly manage and automate actions for multiple Twitter accounts.
- Content Scraping:
  - Scrape tweets based on keywords, user profiles, and news/research sites.
  - Extract tweet content, user information, and engagement metrics.
- Content Publishing:
  - Post new tweets, including text and media.
  - Reply to tweets based on various triggers.
  - Repost (retweet) content from competitor profiles or based on engagement metrics.
- LLM Integration:
  - Utilize OpenAI (GPT models) and Google Gemini for:
    - Generating tweet content and replies.
    - Analyzing tweet threads and sentiment.
    - Summarizing articles for posting.
  - Flexible LLM preference settings at global and per-account levels.
- Engagement Automation:
  - Engage with tweets through likes, replies, and reposts.
  - Analyze competitor activity and engage strategically.
- Configurable Automation:
  - Fine-grained control over automation parameters via JSON configuration files.
  - Per-account overrides for keywords, target profiles, LLM settings, and action behaviors.
- Browser Automation: Uses Selenium for interacting with Twitter, handling dynamic content and complex UI elements.
- Modular Design: Easily extendable with new features and functionalities.
- Logging: Comprehensive logging for monitoring and debugging.
- Community Posting: Switch audience and post into configured communities (by ID or name).
- Stealth Mode (Chrome): Optional undetected-chromedriver + selenium-stealth to reduce fingerprinting.
- Proxies: Per-account proxies, named proxy pools, and rotation strategies (hash/round-robin) with env interpolation.
- LLM Structured Prompts: Strict JSON prompting with few-shots, system prompts, and robust extraction.
- Metrics: Per-account JSON summaries and JSONL event logs for observability.
- Programming Language: Python 3.9+
- Browser Automation: Selenium, WebDriver Manager
- HTTP Requests: Requests
- Data Validation: Pydantic
- LLM Integration: Langchain (for Google GenAI), OpenAI SDK
- Stealth: undetected-chromedriver, selenium-stealth (optional)
- Configuration: JSON, python-dotenv
- Web Interaction: Fake Headers (for mimicking browser headers)
The project is organized as follows:
twitter-automation-ai/
├── config/
│ ├── accounts.json # Configuration for multiple Twitter accounts
│ └── settings.json # Global settings (API keys, automation parameters)
├── src/
│ ├── core/ # Core modules (browser, LLM, config)
│ │ ├── browser_manager.py
│ │ ├── config_loader.py
│ │ └── llm_service.py
│ ├── features/ # Modules for Twitter features (scraper, publisher, etc.)
│ │ ├── scraper.py
│ │ ├── publisher.py
│ │ └── engagement.py
│ ├── utils/ # Utility modules (logger, file handler, etc.)
│ │ ├── logger.py
│ │ ├── file_handler.py
│ │ ├── progress.py
│ │ └── scroller.py
│ ├── data_models.py # Pydantic models for data structures
│ ├── main.py # Main orchestrator script
│ └── __init__.py
├── .env # Environment variables (optional, for API keys)
├── requirements.txt # Python dependencies
├── .gitignore # Specifies intentionally untracked files
├── LICENSE # Project license
├── CODE_OF_CONDUCT.md # Contributor Code of Conduct
├── CONTRIBUTING.md # Guidelines for contributing
└── README.md # This file
- Python 3.9 or higher.
- A modern web browser (e.g., Chrome, Firefox) compatible with Selenium.
Follow these steps to set up and run the project:
Quick-start templates are available in presets/.
- Settings presets: presets/settings/*.json (defaults, Chrome undetected, proxies hash/round-robin)
- Accounts presets: presets/accounts/*.json (growth, brand_safe, replies_first, engagement_light, community_posting)
- How to apply:
  - cp presets/settings/beginner-chrome-undetected.json config/settings.json
  - cp presets/accounts/growth.json config/accounts.json
  - Edit placeholders for API keys, cookie paths, and community IDs.
  - See presets/README.md and data/README.md (dummy cookies/proxies).
git clone https://github.com/ihuzaifashoukat/twitter-automation-ai
cd twitter-automation-ai
It's highly recommended to use a virtual environment:
python -m venv venv
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
Install the required Python packages:
pip install -r requirements.txt
This file manages individual Twitter account configurations. It should be an array of account objects.
- Key Fields per Account:
  - account_id: A unique identifier for the account.
  - is_active: Boolean; set to true to enable automation for this account.
  - cookie_file_path: Path to a JSON file containing cookies for the account (e.g., config/my_account_cookies.json).
  - cookies: Alternatively, an array of cookie objects can be provided directly.
  - proxy (optional): Per-account proxy URL. Examples: http://user:pass@host:port, socks5://host:port.
  - post_to_community (optional): When true, switch the audience to a community before posting.
  - community_id (optional): Community ID used to select the audience (preferred).
  - community_name (optional): Fallback visible name in the audience picker if ID selection fails.
  - Overrides: You can specify per-account overrides for settings such as target_keywords_override, competitor_profiles_override, llm_settings_override, and action_config_override. If an override is not present, the global defaults from config/settings.json are used.
- Example config/accounts.json entry:
  // Minimal example: ONLY add the community fields you need; keep your existing structure as-is.
  // Below shows adding community + (optional) proxy to one of your existing account objects:
  {
    "account_id": "your_existing_account_id",
    "is_active": true,
    "cookie_file_path": "config/your_existing_cookie_file.json",
    // ... all your existing fields remain unchanged ...
    "proxy": "http://127.0.0.1:8888", // optional
    "post_to_community": true, // optional
    "community_id": "1737236915810627584", // preferred when known
    "community_name": "One Piece" // fallback by visible name
  }
  (Refer to the example in the original README section for a more detailed structure if needed, or adapt based on current data_models.py.)
- Obtaining Cookies: Use browser developer tools (e.g., "EditThisCookie" extension) to export cookies for x.com after logging in. Save them as a JSON array of cookie objects if using cookie_file_path.
- Per-Account Proxy: If set, the proxy overrides the global browser_settings.proxy for that account. Chrome uses --proxy-server; Firefox is configured via profile preferences (proxy auth prompts are not handled automatically).
- Community Posting: When post_to_community is true, the publisher clicks the "Choose audience" button in the composer and selects your community using community_id (preferred) or community_name as a fallback, then posts.
- Important: We do not require changing your existing accounts.json structure. Simply add the optional fields (post_to_community, community_id, community_name, and/or proxy) to the appropriate account objects.
- For rewrite-based posting to communities or personal profiles, ensure each account has competitor sources configured via competitor_profiles (or competitor_profiles_override in your current structure). The scraper uses these as input for rewriting and posting.
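The fallback from per-account overrides to global defaults can be pictured with a minimal sketch. It is not the project's actual loader: the helper names effective_action_config and active_accounts are hypothetical, and it assumes an override is a flat dictionary whose keys simply replace the matching defaults.

```python
import copy
from typing import Any, Dict, List


def effective_action_config(account: Dict[str, Any], settings: Dict[str, Any]) -> Dict[str, Any]:
    """Return the global action_config with the account's override applied on top."""
    defaults = settings.get("twitter_automation", {}).get("action_config", {})
    merged = copy.deepcopy(defaults)  # never mutate the global defaults
    # Assumption: override keys replace default keys one-for-one (shallow merge).
    merged.update(account.get("action_config_override") or {})
    return merged


def active_accounts(accounts: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the accounts whose is_active flag is true."""
    return [account for account in accounts if account.get("is_active")]


settings = {
    "twitter_automation": {
        "action_config": {"max_posts_per_run": 3, "enable_liking_tweets": True}
    }
}
accounts = [
    {"account_id": "brand_main", "is_active": True,
     "action_config_override": {"max_posts_per_run": 1}},
    {"account_id": "paused", "is_active": False},
]

for account in active_accounts(accounts):
    print(account["account_id"], effective_action_config(account, settings))
```

An account without action_config_override receives an unchanged copy of the global action_config.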
This file contains global configurations for the application.
- Key Sections:
  - api_keys: Store API keys for LLM services (e.g., openai_api_key, gemini_api_key).
  - twitter_automation:
    - action_config: Default behaviors for automation actions (e.g., max_posts_per_run, min_likes_for_repost).
    - response_interval_seconds: Default delay between actions.
    - media_directory: Path to store downloaded media.
    - analysis_config: Enable/disable relevance filters per pipeline and thresholds.
      - enable_relevance_filter.competitor_reposts (bool), thresholds.competitor_reposts_min (0–1)
      - enable_relevance_filter.likes (bool), thresholds.likes_min (0–1)
    - engagement_decision: If enabled: true, automatically chooses between repost/retweet/quote/like based on relevance and sentiment.
      - use_sentiment: include sentiment in decision
      - thresholds.quote_min|retweet_min|repost_min: relevance cutoffs (0–1)
  - logging: Configuration for the logger.
  - browser_settings: Settings for Selenium WebDriver (e.g., headless mode).
    - type: chrome or firefox. For best anti-detection, use Chrome with use_undetected_chromedriver.
    - use_undetected_chromedriver (Chrome only): When true, uses undetected-chromedriver for stealthier automation.
    - enable_stealth (Chrome only): When true and selenium-stealth is installed, applies additional anti-detection tweaks.
    - user_agent_generation: random or custom with custom_user_agent string.
    - proxy: Global proxy (can be overridden per account).
    - driver_options: Extra Chrome/Firefox CLI options.
    - page_load_timeout_seconds, script_timeout_seconds, window_size.
    - proxy_pools: Named pools for per-account proxies. Use "pool:<name>" in account proxy to select from a pool.
    - proxy_pool_strategy: hash (stable per-account) or round_robin (rotates across runs/accounts).
    - proxy_pool_state_file: File that persists round-robin counters (default data/proxy_pools_state.json).
LLM prompt engineering
- LLM prompts now use stronger, schema-first instructions for structured JSON with optional few-shot examples and hard character limits.
- LLMService.generate_text accepts an optional system_prompt and messages for OpenAI/Azure; Gemini concatenates system+user.
- LLMService.generate_structured adds few_shots, system_prompt, and hard_character_limit to guide safer, parsable outputs.
- Important Note: Content source lists like target_keywords, competitor_profiles, etc., are primarily managed per-account in config/accounts.json. The global action_config in settings.json defines the defaults for how actions run, which can be overridden per account.
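As a rough illustration of how the analysis_config relevance filters might gate an action, the sketch below checks a pipeline's flag and threshold. The function name passes_relevance_filter is hypothetical, and two things are assumptions: that the dotted keys map to nested objects in settings.json, and that a score exactly at the threshold is accepted.

```python
from typing import Any, Dict


def passes_relevance_filter(pipeline: str, score: float, analysis_config: Dict[str, Any]) -> bool:
    """Return True when the action may proceed for the given pipeline.

    pipeline is a name such as "likes" or "competitor_reposts";
    score is a relevance value in the range 0-1.
    """
    enabled = analysis_config.get("enable_relevance_filter", {}).get(pipeline, False)
    if not enabled:
        return True  # filter switched off: every item passes
    # Assumption: a missing threshold behaves like 0.0 (nothing is rejected).
    threshold = analysis_config.get("thresholds", {}).get(f"{pipeline}_min", 0.0)
    return score >= threshold


analysis_config = {
    "enable_relevance_filter": {"likes": True, "competitor_reposts": False},
    "thresholds": {"likes_min": 0.6, "competitor_reposts_min": 0.8},
}

print(passes_relevance_filter("likes", 0.7, analysis_config))
print(passes_relevance_filter("likes", 0.5, analysis_config))
```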
For sensitive data like API keys, you can use a .env file in the project root. python-dotenv is included in requirements.txt to load these variables.
- Create a .env file:
  OPENAI_API_KEY="your_openai_api_key"
  GEMINI_API_KEY="your_gemini_api_key"
  # Add other sensitive variables as needed
  The application is designed to prioritize environment variables for API keys if available.
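The "environment first, settings second" lookup could look roughly like the sketch below. The helper resolve_api_key is hypothetical and is not taken from the project's config loader; load_dotenv is the standard python-dotenv entry point and is skipped if the package is not installed.

```python
import os
from typing import Any, Dict, Optional

try:
    # python-dotenv copies variables from a .env file into os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass  # fall back to whatever is already in the process environment


def resolve_api_key(env_name: str, settings_key: str, settings: Dict[str, Any]) -> Optional[str]:
    """Prefer the environment variable; otherwise use api_keys in settings.json."""
    value = os.environ.get(env_name)
    if value:
        return value
    return settings.get("api_keys", {}).get(settings_key) or None


settings = {"api_keys": {"openai_api_key": "key-from-settings"}}
print(resolve_api_key("OPENAI_API_KEY", "openai_api_key", settings) is not None)
```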
Execute the main orchestrator script from the project root:
python src/main.py
The orchestrator will iterate through active accounts in config/accounts.json and perform actions based on their respective configurations and global settings.
To post into a community instead of your public timeline, set the following on the account object in config/accounts.json:
- post_to_community: true
- Provide at least one of:
  - community_id: preferred (appears in URLs like /i/communities/<id>)
  - community_name: fallback by the visible label in the audience picker
How selection works:
- The app opens the “Choose audience” control, locates the audience menu container (dialog or data-testid="HoverCard"), and attempts to click your community.
- It scrolls the virtualized list to reveal off-screen items and uses JS-click fallbacks to avoid overlay interception.
- After selection, it posts using the chosen audience.
If it fails to select your community:
- Verify the account has joined the community and it appears under “My Communities”.
- Provide a DOM snapshot from the audience menu in a GitHub issue so selectors can be tuned.
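Before a run, the community fields can be sanity-checked with a small helper such as the sketch below. Both function names are hypothetical, and the check only encodes the rule stated above: when post_to_community is true, at least one of community_id or community_name must be present. The URL helper assumes the ID is the path segment that follows /i/communities/.

```python
import re
from typing import Any, Dict, List, Optional


def community_config_problems(account: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with an account's community settings."""
    problems: List[str] = []
    if not account.get("post_to_community"):
        return problems  # community posting is off, nothing to check
    if not account.get("community_id") and not account.get("community_name"):
        problems.append(
            "post_to_community is true but neither community_id nor community_name is set"
        )
    return problems


def community_id_from_url(url: str) -> Optional[str]:
    """Extract <id> from a URL shaped like .../i/communities/<id>."""
    match = re.search(r"/i/communities/([^/?#]+)", url)
    return match.group(1) if match else None


print(community_config_problems({"account_id": "demo", "post_to_community": True}))
print(community_id_from_url("https://x.com/i/communities/1737236915810627584"))
```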
Controls live under twitter_automation.action_config (globally) and per-account action_config_override:
- Replies: enable_keyword_replies, max_replies_per_keyword_run, optional recency reply_only_to_recent_tweets_hours.
- Likes: enable_liking_tweets, max_likes_per_run, like_tweets_from_keywords (defaults to account target_keywords when omitted).
- Retweets (new): enable_keyword_retweets, max_retweets_per_keyword_run.
Relevance filters (optional):
- enable_relevance_filter_keyword_replies, relevance_threshold_keyword_replies
- enable_relevance_filter_likes, relevance_threshold_likes
- ElementClickInterceptedException or “not clickable”:
  - The app automatically scrolls into view, waits for the composer mask to disappear, and uses JS-click or Ctrl+Enter fallbacks.
  - If it persists, consider adding small delays or switching to Chrome (browser_settings.type: "chrome").
- Community not selected:
  - Ensure post_to_community: true and set community_id (best) or community_name.
  - The audience list is virtualized; the app scrolls within the menu to reveal items. If your UI differs, open an issue with a DOM snippet.
- WebDriver downloads blocked or offline:
  - The app prefers local drivers (chromedriver, geckodriver) if found in PATH or via chrome_driver_path/gecko_driver_path. Install via your OS package manager or ensure they’re in PATH.
- Gemini errors (429/500):
  - Use gemini-1.5-flash-latest, reduce calls, or configure OpenAI/Azure.
- Structured analysis and generation use strict schema-first prompts with robust JSON extraction. OpenAI JSON mode is attempted when available.
- Internally supports system prompts and few-shot examples for more controllable outputs.
- Content generation for posts composes final text (with optional hashtags) within 280 chars.
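To make "robust JSON extraction" and the 280-character limit concrete, here is a minimal sketch of one common approach; it is not the code in llm_service.py. The function names are hypothetical. The extractor scans the model reply for the first position where a JSON object decodes cleanly, and the composer appends hashtags only while the text still fits. Note the assumption that plain len() approximates X's character counting, which in reality weights URLs and some characters differently.

```python
import json
from typing import Any, Dict, List, Optional


def extract_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        if isinstance(value, dict):
            return value
    return None


def compose_post(body: str, hashtags: List[str], limit: int = 280) -> str:
    """Trim body to the limit, then add each hashtag that still fits."""
    text = body.strip()
    if len(text) > limit:
        text = text[: limit - 1].rstrip() + "…"
    for tag in hashtags:
        candidate = f"{text} #{tag.lstrip('#')}"
        if len(candidate) <= limit:
            text = candidate
    return text


reply = 'Here is the result: {"text": "Hello", "hashtags": ["ai"]} Thanks!'
parsed = extract_json_object(reply)
print(compose_post(parsed["text"], parsed["hashtags"]))
```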
- Chrome: set browser_settings.type to "chrome" and enable use_undetected_chromedriver. Optional enable_stealth applies extra anti-detection tweaks.
- Firefox: standard Selenium with proxy prefs; auth prompts are not auto-handled.
- Headless: Chrome uses --headless=new for better parity. User-Agent randomized by default.
- Per-account proxy overrides global. Use "pool:<name>" to select from browser_settings.proxy_pools.
- Rotation strategies: hash (stable per-account) or round_robin (uses data/proxy_pools_state.json).
- Env interpolation: ${ENV_VAR} inside proxy strings is expanded at runtime.
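The proxy rules above (per-account override, "pool:<name>" lookup, hash or round_robin rotation, and ${ENV_VAR} expansion) can be sketched as follows. This is an illustration under assumptions, not the project's browser_manager logic: the function names are hypothetical, the hash strategy is shown with SHA-256 of the account ID, and the round-robin counters live in a plain dictionary that a real run would load from and save to the state file.

```python
import hashlib
import os
import re
from typing import Any, Dict, List, Optional

_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str) -> str:
    """Replace ${NAME} with the environment value; unknown names stay as written."""
    return _ENV_PATTERN.sub(lambda m: os.environ.get(m.group(1), m.group(0)), value)


def pick_from_pool(account_id: str, pool_name: str, pool: List[str],
                   strategy: str, counters: Dict[str, int]) -> str:
    """Choose one proxy from a named pool using the configured strategy."""
    if not pool:
        raise ValueError(f"proxy pool {pool_name!r} is empty or undefined")
    if strategy == "round_robin":
        index = counters.get(pool_name, 0)
        counters[pool_name] = index + 1  # caller persists counters between runs
        return pool[index % len(pool)]
    # "hash": hashlib is stable across processes, unlike the built-in hash().
    digest = hashlib.sha256(account_id.encode("utf-8")).hexdigest()
    return pool[int(digest, 16) % len(pool)]


def resolve_proxy(account_id: str, account_proxy: Optional[str],
                  browser_settings: Dict[str, Any], counters: Dict[str, int]) -> Optional[str]:
    """Apply the override, pool selection, and env expansion in that order."""
    proxy = account_proxy or browser_settings.get("proxy")
    if not proxy:
        return None
    if proxy.startswith("pool:"):
        name = proxy[len("pool:"):]
        pool = browser_settings.get("proxy_pools", {}).get(name, [])
        strategy = browser_settings.get("proxy_pool_strategy", "hash")
        proxy = pick_from_pool(account_id, name, pool, strategy, counters)
    return expand_env(proxy)


browser_settings = {
    "proxy": "http://global-proxy:8080",
    "proxy_pools": {"residential": ["http://user:${PROXY_PASS}@p1:8000", "http://p2:8000"]},
    "proxy_pool_strategy": "round_robin",
}
print(resolve_proxy("brand_main", None, browser_settings, {}))
```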
- Point cookie_file_path to a JSON array of cookies for x.com. The app navigates to browser_settings.cookie_domain_url before injecting cookies.
- Example file: data/cookies/dummy_cookies_example.json.
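Cookie files exported from a browser extension often carry fields that Selenium's add_cookie does not expect. The sketch below shows one way such a file could be normalized before injection; it is not the project's loader. The function names are hypothetical, and the export field names (expirationDate, no_restriction) are assumptions about a typical extension format.

```python
import json
from typing import Any, Dict, List

# Keys that Selenium's add_cookie understands.
_ALLOWED_KEYS = {"name", "value", "domain", "path", "secure", "httpOnly", "expiry", "sameSite"}
# Assumed mapping from extension-style values to the WebDriver spellings.
_SAME_SITE = {"strict": "Strict", "lax": "Lax", "none": "None", "no_restriction": "None"}


def normalize_cookie(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Drop unknown fields and convert expiry/sameSite to WebDriver-friendly values."""
    cookie = {key: value for key, value in raw.items() if key in _ALLOWED_KEYS}
    if "expiry" not in cookie and "expirationDate" in raw:
        cookie["expiry"] = int(raw["expirationDate"])  # WebDriver expects whole seconds
    same_site = _SAME_SITE.get(str(raw.get("sameSite", "")).lower())
    if same_site:
        cookie["sameSite"] = same_site
    else:
        cookie.pop("sameSite", None)  # omit values the driver would reject
    return cookie


def load_cookies(path: str) -> List[Dict[str, Any]]:
    """Read a JSON array of cookies and normalize each usable entry."""
    with open(path, encoding="utf-8") as handle:
        raw_cookies = json.load(handle)
    return [normalize_cookie(c) for c in raw_cookies if "name" in c and "value" in c]


exported = {
    "name": "auth_token", "value": "abc", "domain": ".x.com", "path": "/",
    "secure": True, "httpOnly": True, "expirationDate": 1790000000.5,
    "sameSite": "no_restriction", "hostOnly": False, "storeId": "0",
}
print(normalize_cookie(exported))
```

With a live driver, the usual pattern is to open the cookie domain first with driver.get(...) and then call driver.add_cookie(cookie) for each normalized entry.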
- Summary per account: data/metrics/<account_id>.json (counters for posts, replies, retweets, quote_tweets, likes, errors; last run timestamps)
- Structured events per account: logs/accounts/<account_id>.jsonl (JSON lines of each action attempt with metadata)
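The two outputs can be pictured with a small sketch: each attempt is appended as one JSON line, and the per-account summary is re-read, incremented, and rewritten. The function record_event and the event field names (ts, success) are hypothetical; only the file locations and counter names come from the description above.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Any, Dict


def record_event(base_dir: Path, account_id: str, action: str,
                 success: bool, **metadata: Any) -> Dict[str, Any]:
    """Append one event to the JSONL log and bump the matching summary counter."""
    log_path = base_dir / "logs" / "accounts" / f"{account_id}.jsonl"
    summary_path = base_dir / "data" / "metrics" / f"{account_id}.json"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    summary_path.parent.mkdir(parents=True, exist_ok=True)

    event = {"ts": time.time(), "account_id": account_id,
             "action": action, "success": success, **metadata}
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")  # one JSON object per line

    summary: Dict[str, Any] = {}
    if summary_path.exists():
        summary = json.loads(summary_path.read_text(encoding="utf-8"))
    counter = action if success else "errors"  # failed attempts count as errors
    summary[counter] = summary.get(counter, 0) + 1
    summary["last_run"] = event["ts"]
    summary_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return summary


# Demo in a throwaway directory so no real project files are touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    record_event(root, "brand_main", "likes", True, tweet_id="123")
    demo_summary = record_event(root, "brand_main", "replies", False, error="timeout")
    demo_lines = (root / "logs" / "accounts" / "brand_main.jsonl").read_text(
        encoding="utf-8").splitlines()
    print(demo_summary)
```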
See docs/CONFIG_REFERENCE.md for a concise schema of config/settings.json and config/accounts.json, including per-account action_config_override fields and decision thresholds.
- Logging: Detailed logs are output to the console. Configuration is in config/settings.json and managed by src/utils/logger.py.
- Selenium Selectors: Twitter's (X.com) UI is subject to change. XPath and CSS selectors in src/features/scraper.py and src/features/publisher.py may require updates if the site structure changes.
- Error Handling: The project includes basic error handling. Enhancements with more specific exception management and retry mechanisms are potential areas for improvement.
- Extensibility: To add new features:
  - Define necessary data structures in src/data_models.py.
  - Create new feature modules within the src/features/ directory.
  - Integrate the new module into the TwitterOrchestrator in src/main.py.
Contributions are welcome! Please read our CONTRIBUTING.md for guidelines on how to contribute, report bugs, or suggest enhancements.
To ensure a welcoming and inclusive environment, this project adheres to a Code of Conduct. Please review and follow it in all your interactions with the project.
This project is licensed under the MIT License. See the LICENSE file for details.
- GUI or web interface for managing accounts, settings, and monitoring.
- Advanced error handling, including robust retry logic for network issues or UI changes.
- Integration with proxy services for enhanced multi-account management and anonymity.
- More detailed per-account activity logging and analytics.
- Improved AI-driven content analysis and decision-making.