Commit 246a82a

arahangua and nicoloval authored
Pull from dev branch (#27)
* fixed/updated: model search function, removed outdated update-status method
* added: cli command for using all query combinations, fallback categorization for popular services
* update welcome and help content
* updated footer commands
* centered scapo ascii ast
* changed vhs files structure and rerun gifs
* updated tui commands
* removed obsolete gifs
* updated: readme, quickstart on mcp and command usage, more models

---------

Co-authored-by: Nicolo Vallarano <nicolo.vallarano@imtlucca.it>
1 parent ce71411 commit 246a82a

71 files changed: +1110 -158 lines changed

QUICKSTART.md

Lines changed: 39 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -14,6 +14,8 @@ uv run playwright install
1414

1515
### 2. Configure LLM (Choose One)
1616

17+
**Important:** Extraction quality varies by LLM - stronger models find more specific tips!
18+
1719
#### Option A: OpenRouter (Recommended - Free Model!)
1820
```bash
1921
cp .env.example .env
@@ -63,23 +65,29 @@ Extract specific optimization tips for AI services:
6365
scapo scrape discover --update
6466

6567
# Step 2: Extract tips for specific services
66-
scapo scrape targeted --service "Eleven Labs" --limit 20
67-
scapo scrape targeted --service "GitHub Copilot" --limit 20
68+
scapo scrape targeted --service "Eleven Labs" --limit 20 --query-limit 20
69+
scapo scrape targeted --service "GitHub Copilot" --limit 20 --query-limit 20
6870

6971
# Or batch process by category
70-
scapo scrape batch --category video --limit 15
72+
scapo scrape batch --category video --limit 20 --batch-size 3
7173

7274
# Process ALL priority services one by one
73-
scapo scrape all --priority ultra --limit 20 # Process all ultra priority services
74-
scapo scrape all --dry-run # Preview what will be processed
75+
scapo scrape all --limit 20 --query-limit 20 --priority ultra # Process all ultra priority services
76+
scapo scrape all --dry-run # Preview what will be processed
7577
```
7678

7779
### Key Commands:
7880
- `discover --update` - Find services from GitHub Awesome lists
7981
- `targeted --service NAME` - Extract tips for one service
80-
- `batch --category TYPE` - Process multiple services (limited)
82+
- `batch --category TYPE` - Process ALL services in category (in batches)
8183
- `all --priority LEVEL` - Process ALL services one by one
8284

85+
### Important Parameters:
86+
- **--query-limit**: Number of search patterns (5 = quick, 20 = comprehensive)
87+
- **--batch-size**: Services to process in parallel (3 = default balance)
88+
- **--limit**: Posts per search (20+ recommended for best results)
89+
90+
8391
## 📚 Approach 2: Legacy Sources
8492

8593
Use predefined sources from `sources.yaml`:
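
A note on how the flags added in this hunk combine: `--limit` is described as posts per search and `--query-limit` as the number of search patterns, so their product gives a rough upper bound on the posts fetched for one service. The sketch below is only that arithmetic. The function name is hypothetical, and it assumes every pattern returns a full `--limit` worth of posts with no overlap between searches, which the diff does not confirm.

```python
def max_posts_per_service(limit: int, query_limit: int) -> int:
    """Upper bound on posts fetched for one service.

    Assumption (not confirmed by the source): each of the `query_limit`
    search patterns returns up to `limit` posts, with no deduplication.
    """
    if limit < 0 or query_limit < 0:
        raise ValueError("limit and query_limit must be non-negative")
    return limit * query_limit


# Flag combinations that appear in the QUICKSTART.md examples.
for limit, query_limit in [(5, 5), (20, 20), (30, 20)]:
    bound = max_posts_per_service(limit, query_limit)
    print(f"--limit {limit} --query-limit {query_limit}: at most {bound} posts")
```

Under that assumption, moving from the "quick" settings to the recommended ones raises the ceiling from 25 to 400 posts per service, which is worth keeping in mind for API rate limits and LLM cost.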
@@ -109,6 +117,27 @@ scapo models search "copilot" # Search for specific models
109117
cat models/audio/eleven-labs/cost_optimization.md
110118
```
111119

120+
### 5. (Optional) Use with Claude Desktop
121+
122+
Add SCAPO as an MCP server to query your extracted tips (from models/ folder) directly in Claude:
123+
124+
```json
125+
// Add to claude_desktop_config.json
126+
{
127+
"mcpServers": {
128+
"scapo": {
129+
"command": "npx",
130+
"args": ["@scapo/mcp-server"],
131+
"env": {
132+
"SCAPO_MODELS_PATH": "path/to/scapo/models"
133+
}
134+
}
135+
}
136+
}
137+
```
138+
139+
Then ask Claude: "Get best practices for Midjourney" - no Python needed!
140+
112141
## 📊 Understanding the Output
113142

114143
SCAPO creates organized documentation:
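
The `json` fence added in this hunk opens with a `//` comment, which strict JSON parsers reject (Python's `json` module raises an error on it), so pasting the snippet verbatim may not yield a valid config file. A minimal sketch of one alternative, assuming you would rather merge the entry programmatically: the helper name `add_scapo_server` is hypothetical, and the command, package name, and environment variable are copied from the diff as-is rather than verified against a published package.

```python
import json
import tempfile
from pathlib import Path


def add_scapo_server(config_path: Path, models_path: str) -> dict:
    """Merge a "scapo" entry into the mcpServers section of a JSON config.

    Other servers already in the file are preserved; a missing or empty
    file is treated as an empty config.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Values below are copied from the snippet in this diff.
    config.setdefault("mcpServers", {})["scapo"] = {
        "command": "npx",
        "args": ["@scapo/mcp-server"],
        "env": {"SCAPO_MODELS_PATH": models_path},
    }

    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a throwaway file rather than a real Claude Desktop config.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    add_scapo_server(demo_path, "path/to/scapo/models")
    print(demo_path.read_text(encoding="utf-8"))
```

Because the output is produced by `json.dumps`, it contains no comments and keeps any servers that were already configured.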
@@ -126,13 +155,13 @@ models/
126155

127156
```bash
128157
# ❌ Too few posts = no useful tips found
129-
scapo scrape targeted --service "HeyGen" --limit 5 # ~20% success rate
158+
scapo scrape targeted --service "HeyGen" --limit 5 --query-limit 5 # ~20% success rate
130159

131160
# ✅ Sweet spot = reliable extraction
132-
scapo scrape targeted --service "HeyGen" --limit 20 # ~80% success rate
161+
scapo scrape targeted --service "HeyGen" --limit 20 --query-limit 20 # ~80% success rate
133162

134163
# 🎯 Maximum insights = comprehensive coverage
135-
scapo scrape targeted --service "HeyGen" --limit 30 # Finds rare edge cases
164+
scapo scrape targeted --service "HeyGen" --limit 30 --query-limit 20 # Finds rare edge cases
136165
```
137166
**Why it matters:** LLMs need multiple examples to identify patterns. More posts = higher chance of finding specific pricing, bugs, and workarounds.
138167

@@ -148,7 +177,7 @@ LLM_QUALITY_THRESHOLD=0.4 # More tips (less strict)
148177
### "No tips extracted"
149178
```bash
150179
# Solution: Use more posts
151-
scapo scrape targeted --service "Service Name" --limit 25
180+
scapo scrape targeted --service "Service Name" --limit 25 --query-limit 20
152181
```
153182

154183
### "Service not found"

README.md

Lines changed: 52 additions & 11 deletions
@@ -53,15 +53,15 @@ scapo scrape discover --update
5353
Extract optimization tips for specific services
5454

5555
```bash
56-
scapo scrape targeted --service "Eleven Labs" --limit 20
56+
scapo scrape targeted --service "Eleven Labs" --limit 20 --query-limit 20
5757
```
5858
![Scapo Discover](assets/scrape-targeted.gif)
5959

6060

6161
Batch process multiple priority services (Recommended)
6262

6363
```bash
64-
scapo scrape batch --max-services 3 --category audio
64+
scapo scrape batch --category audio --batch-size 3 --limit 20
6565
```
6666
![Scapo Discover](assets/scrape-batch.gif)
6767

@@ -89,6 +89,8 @@ uv run playwright install # Browser automation
8989

9090
### 2. Configure Your LLM Provider
9191

92+
**Note:** Extraction quality depends on your chosen LLM - experiment with different models for best results!
93+
9294
#### Recommended: OpenRouter (Cloud)
9395
```bash
9496
cp .env.example .env
@@ -111,14 +113,14 @@ Get your API key from [openrouter.ai](https://openrouter.ai/)
111113
scapo scrape discover --update
112114

113115
# Step 2: Extract optimization tips for services
114-
scapo scrape targeted --service "HeyGen" --limit 20
115-
scapo scrape targeted --service "Midjourney" --limit 20
116+
scapo scrape targeted --service "HeyGen" --limit 20 --query-limit 20
117+
scapo scrape targeted --service "Midjourney" --limit 20 --query-limit 20
116118

117119
# Or batch process multiple services
118-
scapo scrape batch --category video --limit 15
120+
scapo scrape batch --category video --limit 20 --batch-size 3
119121

120122
# Process ALL priority services one by one (i.e. all services with 'ultra' tag, see targted_search_generator.py)
121-
scapo scrape all --priority ultra --limit 20
123+
scapo scrape all --limit 20 --query-limit 20 --priority ultra
122124
```
123125

124126
#### Option B: Legacy method: using sources.yaml file
@@ -196,13 +198,13 @@ scapo scrape discover --show-all # List all services
196198
scapo scrape targeted \
197199
--service "Eleven Labs" \ # Service name (handles variations, you can put whatever --> if we don't get hit in services.json, then it will be created under 'general' folder)
198200
--limit 20 \ # Posts per search (15-20 recommended)
199-
--max-queries 10 # Number of searches
201+
--query-limit 20 # Query patterns per service (20 = all)
200202

201203
# Batch process
202204
scapo scrape batch \
203205
--category audio \ # Filter by category
204-
--max-services 3 \ # Services to process
205-
--limit 15 # Posts per search
206+
--batch-size 3 \ # Services per batch
207+
--limit 20 # Posts per search
206208

207209

208210
### Legacy Sources Mode
@@ -249,13 +251,52 @@ SCRAPING_DELAY_SECONDS=2 # Be respectful
249251
MAX_POSTS_PER_SCRAPE=100 # Limit per source
250252
```
251253

252-
### Why --limit Matters (More Posts = Better Tips)
254+
### Key Parameters Explained
255+
256+
**--query-limit** (How many search patterns per service)
257+
```bash
258+
--query-limit 5 # Quick scan: 1 pattern per category (cost, optimization, technical, workarounds, bugs)
259+
--query-limit 20 # Full scan: All 4 patterns per category (default, most comprehensive)
260+
```
261+
262+
**--batch-size** (For `batch` command: services processed in parallel)
263+
```bash
264+
--batch-size 1 # Sequential (slowest, least resource intensive)
265+
--batch-size 3 # Default (good balance)
266+
--batch-size 5 # Faster (more resource intensive)
267+
```
268+
269+
**--limit** (Posts per search - More = Better extraction)
253270
```bash
254271
--limit 5 # ❌ Often finds nothing (too few samples)
255272
--limit 15 # ✅ Good baseline (finds common issues)
256273
--limit 25 # 🎯 Will find something (as long as there is active discussion on it)
257274
```
258-
so, hand-wavy breakdown: With 5 posts, extraction success ~20%. With 20+ posts, success jumps to ~80%.
275+
Hand-wavy breakdown: With 5 posts, extraction success ~20%. With 20+ posts, success jumps to ~80%.
276+
277+
## 🤖 MCP Server for Claude Desktop
278+
279+
Query your extracted tips directly in Claude (reads from models/ folder - run scrapers first!):
280+
281+
```json
282+
// Add to %APPDATA%\Claude\claude_desktop_config.json (Windows)
283+
// or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
284+
{
285+
"mcpServers": {
286+
"scapo": {
287+
"command": "npx",
288+
"args": ["@scapo/mcp-server"],
289+
"env": {
290+
"SCAPO_MODELS_PATH": "C:\\path\\to\\scapo\\models" // Your models folder
291+
}
292+
}
293+
}
294+
}
295+
```
296+
297+
Then ask Claude: "Get me best practices for GitHub Copilot" or "What models are good for coding?"
298+
299+
See [mcp/README.md](mcp/README.md) for full setup and available commands.
259300

260301
## 🎨 Interactive TUI
261302

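
The "Key Parameters Explained" text added in this hunk gives two data points for `--query-limit`: 5 means one pattern for each of five categories, and 20 means all four patterns per category. The sketch below reproduces just those two points. The function name is hypothetical, and the behavior for values in between (an even split, rounded down and capped at four) is an assumption for illustration, not something the README states.

```python
CATEGORIES = ("cost", "optimization", "technical", "workarounds", "bugs")
MAX_PATTERNS_PER_CATEGORY = 4  # "All 4 patterns per category" in the README


def patterns_per_category(query_limit: int) -> int:
    """Patterns searched per category for a given --query-limit.

    Assumption: the limit is split evenly across categories, rounded
    down, with at least 1 and at most 4 patterns per category.
    """
    if query_limit < 1:
        raise ValueError("query_limit must be at least 1")
    per_category = query_limit // len(CATEGORIES)
    return max(1, min(MAX_PATTERNS_PER_CATEGORY, per_category))


# The two values documented in the README.
for query_limit in (5, 20):
    n = patterns_per_category(query_limit)
    print(f"--query-limit {query_limit}: {n} pattern(s) x {len(CATEGORIES)} categories")
```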
Lines changed: 18 additions & 3 deletions
@@ -1,10 +1,25 @@
11
# Eleven Labs - Cost Optimization Guide
22

3-
*Last updated: 2025-08-14*
3+
*Last updated: 2025-08-16*
44

55
## Cost & Pricing Information
66

7-
- 60% of credits left (~400,000 credits)
8-
- Subscription renewal failed due to paywall issues
7+
- Free trial limited to 10,000 characters per month
8+
- 60% of credits left (about 400,000 credits)
9+
- $15k saved in ElevenLabs fees
10+
- Free access limited to 15 minutes of voice recording per day
11+
- Last year I was paying +$1000/month for AI voiceovers for only one channel.
12+
- $29/month for unlimited usage on ElevenReader.
913
- $99/month plan
14+
- $29/month for unlimited
15+
- Credits should last until June 5th
16+
- 10,000 free credits per month on the free plan.
17+
18+
## Money-Saving Tips
19+
20+
- I built my own tool, just for me. No subscriptions, no limits, just fast, clean voice generation. Cost me ~ $4/month to run.
21+
- MiniMax have daily credit refresh in TTS not like ElevenLabs where you need to wait 1 month to refresh.
22+
- Use the free plan to get 10,000 credits per month for free.
23+
- So, when I do, I use a temporary email to create a new account so the 10,000 chatacter limit 'resets.'
24+
- When converting text to voice, adding periods between letters (e.g., B.O.D.) can force the model to pronounce acronyms letter by letter, though it may consume more credits.
1025

Lines changed: 2 additions & 2 deletions
@@ -1,13 +1,13 @@
11
{
22
"service": "Eleven Labs",
33
"category": "audio",
4-
"last_updated": "2025-08-14T18:53:47.086694",
4+
"last_updated": "2025-08-16T13:46:28.510586",
55
"extraction_timestamp": null,
66
"data_sources": [
77
"Reddit API",
88
"Community discussions"
99
],
10-
"posts_analyzed": 79,
10+
"posts_analyzed": 338,
1111
"confidence": "medium",
1212
"version": "1.0.0"
1313
}

models/audio/eleven-labs/parameters.json

Lines changed: 7 additions & 4 deletions
@@ -1,6 +1,6 @@
11
{
22
"service": "Eleven Labs",
3-
"last_updated": "2025-08-14T18:53:46.993256",
3+
"last_updated": "2025-08-16T13:46:28.342822",
44
"recommended_settings": {
55
"setting_0": {
66
"description": "voice_name=Mun W"
@@ -16,9 +16,12 @@
1616
}
1717
},
1818
"cost_optimization": {
19-
"tip_0": "60% of credits left (~400,000 credits)",
20-
"tip_1": "Subscription renewal failed due to paywall issues",
21-
"pricing": "$99/month plan"
19+
"tip_0": "Free trial limited to 10,000 characters per month",
20+
"tip_1": "60% of credits left (about 400,000 credits)",
21+
"pricing": "$29/month for unlimited",
22+
"tip_3": "Free access limited to 15 minutes of voice recording per day",
23+
"tip_4": "Credits should last until June 5th",
24+
"tip_5": "10,000 free credits per month on the free plan."
2225
},
2326
"sources": [
2427
"Reddit community",
Lines changed: 23 additions & 4 deletions
@@ -1,16 +1,35 @@
11
# Eleven Labs - Common Pitfalls & Issues
22

3-
*Last updated: 2025-08-14*
3+
*Last updated: 2025-08-16*
44

55
## Technical Issues
66

7-
### ⚠️ Unable to switch back to a Custom LLM after testing with a built-in model (gemini-2.0-flash); interface shows 'Fix the errors to proceed' even though Server URL, Model ID, and API Key are correctly filled.
7+
### ⚠️ Cannot switch back to a Custom LLM after testing with a built-in model (gemini-2.0-flash) on the ElevenLabs Conversational AI dashboard; even after correctly filling out Server URL, Model ID, and API Key, the interface still shows the message: 'Fix the errors to proceed' even though there is no error.
88
**Fix**: Store API keys in environment variables or use a secrets manager.
99

10-
### ⚠️ audio plays back a female voice regardless of which option is selected when using elevenLabs API
10+
### ⚠️ ElevenLabs API always returns a female voice regardless of the selected gender option
11+
12+
### ⚠️ Tasker Action Error: 'HTTP Request' (step 11) Task: 'Text To Speech To File Elevenlabs {"detail":{"status":"invalid_uid","message". "An invalid ID has been received: %voice_id'. Make sure to provide a correct one."}
1113

1214
## Policy & Account Issues
1315

14-
### ⚠️ Account credits wiped (about 400,000 credits) after attempting to renew a $99/month subscription; paywall prevented payment and support ticket received no response.
16+
### ⚠️ Eleven Labs wiped 400,000 credits from a user's account on the $99/month plan; the user had 60% of credits left (about 400,000 credits) and was unable to renew subscription due to paywall issues.
17+
**Note**: Be aware of terms of service regarding account creation.
18+
19+
### ⚠️ Free trial for ElevenLabs is limited to 10,000 characters a month, which is insufficient for scripts that are often ~20-40,000 characters long.
1520
**Note**: Be aware of terms of service regarding account creation.
1621

22+
## Cost & Limits
23+
24+
### 💰 ElevenReader credit system is considered bad by some users, making it off-putting for average consumers.
25+
26+
### 💰 Free access to ElevenLabs is limited to 15 minutes of voice recording per day.
27+
28+
### 💰 Free trial limited to 10,000 characters per month
29+
30+
### 💰 Free access limited to 15 minutes of voice recording per day
31+
32+
### 💰 $29/month for unlimited usage on ElevenReader.
33+
34+
### 💰 $29/month for unlimited
35+

models/audio/eleven-labs/prompting.md

Lines changed: 11 additions & 1 deletion
@@ -1,11 +1,21 @@
11
# Eleven Labs Prompting Guide
22

3-
*Last updated: 2025-08-14*
3+
*Last updated: 2025-08-16*
44

55
## Tips & Techniques
66

7+
- I built my own tool, just for me. No subscriptions, no limits, just fast, clean voice generation. Cost me ~ $4/month to run.
8+
- Use ElevenLabsService(voice_name="Mun W") in Manim Voiceover
9+
- MiniMax have daily credit refresh in TTS not like ElevenLabs where you need to wait 1 month to refresh.
10+
- The ElevenLabs voice agent is the entry point into the whole system, and then it will pass off web development or web design requests over to n8n agents via a webhook in order to actually do the work.
11+
- Use the free plan to get 10,000 credits per month for free.
12+
- So, when I do, I use a temporary email to create a new account so the 10,000 chatacter limit 'resets.'
713
- self.set_speech_service(ElevenLabsService(voice_name="Mun W"))
14+
- MacWhisper 11.10 supports ElevenLabs Scribe for cloud transcription.
815
- from manim_voiceover.services.elevenlabs import ElevenLabsService
16+
- I built my own tool to avoid ElevenLabs fees.
17+
- When converting text to voice, adding periods between letters (e.g., B.O.D.) can force the model to pronounce acronyms letter by letter, though it may consume more credits.
18+
- ElevenLabs Scribe v1 achieves 15.0% WER on 5-10 minute patient-doctor chats, averaging 36 seconds per file.
919

1020
## Recommended Settings
1121

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
{
2+
"service": "Fireflies.ai",
3+
"category": "audio",
4+
"last_updated": "2025-08-16T13:46:29.623761",
5+
"extraction_timestamp": "2025-08-16T13:29:54.297790",
6+
"data_sources": [
7+
"Reddit API",
8+
"Community discussions"
9+
],
10+
"posts_analyzed": 171,
11+
"confidence": "medium",
12+
"version": "1.0.0"
13+
}
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
# Fireflies.ai - Common Pitfalls & Issues
2+
3+
*Last updated: 2025-08-16*
4+
5+
## Technical Issues
6+
7+
### ⚠️ Failed to create a send channel message in Slack. Error from Slack: invalid_thread_ts
8+
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
# Fireflies.ai Prompting Guide
2+
3+
*Last updated: 2025-08-16*
4+
5+
## Tips & Techniques
6+
7+
- Configure Zapier to send transcripts to a channel without duplicate notifications by adjusting thread settings
8+
- Use custom prompts called 'apps' in Fireflies.ai to create reusable ready‑made prompts.
9+
- Use Zapier to send Fireflies.ai transcripts to Slack
10+
11+
## Sources
12+
13+
- Reddit community discussions
14+
- User-reported experiences
