New Components - bright_data #17907
base: master
Walkthrough
This change introduces a new Bright Data app integration with three new actions: "Scrape Website", "Scrape SERP", and "Unlock Website". The app provides comprehensive property definitions and utility methods for interacting with the Bright Data API, including dataset and zone selection, and request helpers. The package version and dependencies are updated accordingly.
Sequence Diagram(s)
sequenceDiagram
participant User
participant BrightDataAction
participant BrightDataApp
participant BrightDataAPI
User->>BrightDataAction: Trigger action (e.g., Scrape Website)
BrightDataAction->>BrightDataApp: Call method (e.g., scrapeWebsite or requestWebsite)
BrightDataApp->>BrightDataAPI: Send HTTP request with parameters
BrightDataAPI-->>BrightDataApp: Return response data
BrightDataApp-->>BrightDataAction: Return processed data
BrightDataAction-->>User: Output result
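The flow in the diagram can be sketched in plain JavaScript. This is a minimal illustration, not the code from the PR: `fakeBrightDataApi`, `createBrightDataApp`, and `runUnlockWebsite` are hypothetical names, the `/request` path is a placeholder, and the HTTP layer is replaced by an injected stub so the sketch runs offline.

```javascript
// Hypothetical stand-in for the Bright Data API. A real integration
// would send an HTTP request here instead of echoing the input.
async function fakeBrightDataApi(config) {
  return { status_code: 200, body: `fetched ${config.data.url}` };
}

// App layer: owns the request helpers. The HTTP client is injected
// (an assumption made for this sketch) so it can be swapped for a stub.
function createBrightDataApp(httpClient) {
  return {
    async requestWebsite({ data }) {
      // Placeholder path; the actual endpoint is not shown in this review.
      return httpClient({ method: "POST", path: "/request", data });
    },
  };
}

// Action layer: calls the app method and returns a summary plus the raw data.
async function runUnlockWebsite(app, { url, zone }) {
  const data = await app.requestWebsite({ data: { url, zone } });
  return { summary: `Unlocked website ${url}`, data };
}
```

Keeping the HTTP call behind a single app-level method means each action only assembles its payload, which matches the User → Action → App → API layering in the diagram.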
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Assessment against linked issues: Out-of-scope changes
No out-of-scope changes detected.
Actionable comments posted: 2
🧹 Nitpick comments (1)
components/bright_data/actions/unlock-website/unlock-website.mjs (1)
51-66: Consider adding error handling for consistency.
Unlike the scrape-serp action, this action lacks error handling for API responses. For consistency across the integration, consider adding similar error handling.
+import { ConfigurationError } from "@pipedream/platform";
+
 export default {
   // ... existing code ...
   async run({ $ }) {
     const data = await this.brightData.requestWebsite({
       $,
       data: {
         url: this.url,
         zone: this.zone,
         format: this.format,
         method: this.method,
         country: this.country,
         data_format: this.dataFormat,
       },
     });

+    if (data.status_code && data.status_code >= 400) {
+      throw new ConfigurationError(data.body || `API request failed with status ${data.status_code}`);
+    }
+
     $.export("$summary", `Unlocked website ${this.url}`);
     return data;
   },
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
components/bright_data/actions/scrape-serp/scrape-serp.mjs (1 hunks)
components/bright_data/actions/scrape-website/scrape-website.mjs (1 hunks)
components/bright_data/actions/unlock-website/unlock-website.mjs (1 hunks)
components/bright_data/bright_data.app.mjs (1 hunks)
components/bright_data/package.json (2 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: when developing pipedream components, do not add built-in node.js modules like `fs` to `package.json...
Learnt from: jcortes
PR: PipedreamHQ/pipedream#14935
File: components/sailpoint/package.json:15-18
Timestamp: 2024-12-12T19:23:09.039Z
Learning: When developing Pipedream components, do not add built-in Node.js modules like `fs` to `package.json` dependencies, as they are native modules provided by the Node.js runtime.
Applied to files:
components/bright_data/package.json
📚 Learning: "dir" props in pipedream components are hidden in the component form and not user-facing, so they do...
Learnt from: js07
PR: PipedreamHQ/pipedream#17375
File: components/zerobounce/actions/get-validation-results-file/get-validation-results-file.mjs:23-27
Timestamp: 2025-07-01T17:07:48.193Z
Learning: "dir" props in Pipedream components are hidden in the component form and not user-facing, so they don't require labels or descriptions for user clarity.
Applied to files:
components/bright_data/package.json
components/bright_data/bright_data.app.mjs
📚 Learning: in the salesloft api integration (components/salesloft/salesloft.app.mjs), the _makerequest method r...
Learnt from: GTFalcao
PR: PipedreamHQ/pipedream#16954
File: components/salesloft/salesloft.app.mjs:14-23
Timestamp: 2025-06-04T17:52:05.780Z
Learning: In the Salesloft API integration (components/salesloft/salesloft.app.mjs), the _makeRequest method returns response.data which directly contains arrays for list endpoints like listPeople, listCadences, listUsers, and listAccounts. The propDefinitions correctly call .map() directly on these responses without needing to destructure a nested data property.
Applied to files:
components/bright_data/bright_data.app.mjs
📚 Learning: the salesloft api list endpoints (listpeople, listcadences, listusers, listaccounts) return arrays d...
Learnt from: GTFalcao
PR: PipedreamHQ/pipedream#16954
File: components/salesloft/salesloft.app.mjs:14-23
Timestamp: 2025-06-04T17:52:05.780Z
Learning: The Salesloft API list endpoints (listPeople, listCadences, listUsers, listAccounts) return arrays directly in the response body, not wrapped in a metadata object with a nested data property. The _makeRequest method correctly returns response.data which contains the arrays that can be mapped over directly in propDefinitions.
Applied to files:
components/bright_data/bright_data.app.mjs
🧬 Code Graph Analysis (3)
components/bright_data/actions/unlock-website/unlock-website.mjs (2)
- components/bright_data/actions/scrape-website/scrape-website.mjs (1): data (28-40)
- components/bright_data/actions/scrape-serp/scrape-serp.mjs (1): data (54-64)
components/bright_data/actions/scrape-serp/scrape-serp.mjs (2)
- components/bright_data/actions/unlock-website/unlock-website.mjs (1): data (52-62)
- components/bright_data/actions/scrape-website/scrape-website.mjs (1): data (28-40)
components/bright_data/actions/scrape-website/scrape-website.mjs (2)
- components/bright_data/actions/unlock-website/unlock-website.mjs (1): data (52-62)
- components/bright_data/actions/scrape-serp/scrape-serp.mjs (1): data (54-64)
🔇 Additional comments (11)
components/bright_data/package.json (2)
3-3: LGTM! Appropriate version bump for new features.
The version increment from 0.0.1 to 0.1.0 correctly reflects the addition of new action functionality.
15-17: LGTM! Correct dependency addition.
The `@pipedream/platform` dependency is properly added to support the `ConfigurationError` class used in the action files.
components/bright_data/actions/scrape-serp/scrape-serp.mjs (2)
1-2: LGTM! Proper imports.
Correct imports for the Bright Data app and `ConfigurationError` from the platform.
19-27: LGTM! Proper zone filtering for SERP.
The zone propDefinition is correctly filtered to only show zones of type "serp", which aligns with the action's purpose.
components/bright_data/actions/unlock-website/unlock-website.mjs (1)
17-25: LGTM! Proper zone filtering for unblocker.
The zone propDefinition is correctly filtered to only show zones of type "unblocker", which aligns with the action's purpose.
components/bright_data/bright_data.app.mjs (5)
1-1: LGTM! Correct import.
Proper import of axios from the Pipedream platform.
6-18: LGTM! Well-structured dataset propDefinition.
The async options loading for datasets is properly implemented with clear label/value mapping.
70-83: LGTM! Clean HTTP request abstraction.
The `_makeRequest` method provides good abstraction with proper authentication header setup and flexible options handling.
96-109: LGTM! Clear API method separation.
The distinction between `scrapeWebsite` (for dataset-based scraping) and `requestWebsite` (for direct URL requests) is well-defined and matches the usage patterns in the action files.
23-26: Double-check Bright Data zone filtering logic
- Location: components/bright_data/bright_data.app.mjs (lines 23–26)
- Current code:
  async options({ type }) {
    const zones = await this.listZones();
    return zones?.filter((zone) => zone.type === type)?.map(({ name }) => name) || [];
  }
- Observations:
  - Optional chaining plus `|| []` guards against nullish/empty `zones`.
  - If `type` is undefined or misspelled, you may end up with an empty result set or inadvertently include zones lacking a `type` property.
  - No similar filtering pattern exists in other components (e.g., Google Cloud simply returns all zone names).
- Recommendations:
  - Validate or default the incoming `type` parameter before filtering.
  - Normalize `zone.type` and `type` (e.g., trim, case-fold) to avoid mismatches.
  - Consider throwing a clear error or warning when `type` is missing or invalid.
components/bright_data/actions/scrape-website/scrape-website.mjs (1)
27-40: LGTM! Appropriate data structure for dataset-based scraping.
The use of `params.dataset_id` and `data.input` array structure aligns with the Bright Data dataset API requirements and differs appropriately from the direct request approach used in other actions.
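Returning to the zone filtering comment on bright_data.app.mjs (lines 23-26): one way the recommendations could be applied is sketched below. `filterZoneNames` and `normalizeType` are hypothetical helpers, not part of the PR, and throwing a plain `Error` is an assumption; a Pipedream component might prefer `ConfigurationError`.

```javascript
// Trim and case-fold so that "SERP " and "serp" compare equal.
function normalizeType(value) {
  return String(value).trim().toLowerCase();
}

// Returns the names of zones whose type matches `type`.
// Throws when `type` is missing, so an undefined filter can never
// match zones that simply lack a `type` property.
function filterZoneNames(zones, type) {
  if (typeof type !== "string" || type.trim() === "") {
    throw new Error("A non-empty zone type is required to filter zones");
  }
  const wanted = normalizeType(type);
  return (zones ?? [])
    .filter((zone) => typeof zone?.type === "string" && normalizeType(zone.type) === wanted)
    .map(({ name }) => name);
}
```

With the original comparison, `zone.type === type` is true when both sides are `undefined`, which is the accidental-inclusion case the observation describes; the explicit check above rules that out.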
Resolves #17768
Resolves #17618
Summary by CodeRabbit
New Features
Chores