Shiv Gupta

Pinterest Keyword-Based Content Scraper with AI Agent & BrightData Automation

Overview

This n8n workflow scrapes Pinterest content for user-provided keywords using BrightData's API and an AI agent backed by Claude Sonnet 4. The agent processes the keywords, starts the scraping job, and monitors its progress; the workflow then formats the extracted data into structured output.

Architecture Components

🧠 AI-Powered Controller

  • Claude Sonnet 4 Model : Processes and interprets the keywords before the scrape starts
  • AI Agent : Acts as the intelligent controller coordinating all scraping steps

📥 Data Input

  • Form Trigger : User-friendly keyword input interface
  • Keywords Field : Required input field for Pinterest search terms

🚀 Scraping Pipeline

  1. Launch Scraping Job : Sends keywords to BrightData API
  2. Status Monitoring : Continuously checks scraping progress
  3. Data Retrieval : Downloads completed scraped content
  4. Data Processing : Formats and structures the raw data
  5. Storage : Saves results to Google Sheets

Workflow Nodes

1. Pinterest Keyword Input

  • Type : Form Trigger
  • Purpose : Entry point for user keyword submission
  • Configuration :
    • Form title: "Pinterest"
    • Required field: "Keywords"

2. Anthropic Chat Model

  • Type : Language Model (Claude Sonnet 4)
  • Model : claude-sonnet-4-20250514
  • Purpose : AI-powered keyword processing and workflow orchestration

3. Keyword-based Scraping Agent

  • Type : AI Agent
  • Purpose : Orchestrates the entire scraping process
  • Instructions :
    • Initiates Pinterest scraping with provided keywords
    • Monitors scraping status until completion
    • Downloads final scraped data
    • Presents raw scraped data as output

4. BrightData Pinterest Scraping

  • Type : HTTP Request Tool
  • Method : POST
  • Endpoint : https://api.brightdata.com/datasets/v3/trigger
  • Parameters :
    • dataset_id : gd_lk0sjs4d21kdr7cnlv
    • include_errors : true
    • type : discover_new
    • discover_by : keyword
    • limit_per_input : 2
  • Purpose : Creates new scraping snapshot based on keywords
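
The trigger call described above can be sketched outside n8n as follows. This is a minimal illustration, not the workflow's own node configuration: it assumes the listed parameters travel as query-string parameters, that the request body is a JSON array with one object per keyword under a "keyword" key, and that the API key is sent as a Bearer token. The function name build_trigger_request is hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

TRIGGER_URL = "https://api.brightdata.com/datasets/v3/trigger"


def build_trigger_request(keywords, api_key, limit_per_input=2):
    """Build (but do not send) the POST request that starts a scraping job."""
    # Parameter names and values are the ones listed for this node.
    query = urlencode({
        "dataset_id": "gd_lk0sjs4d21kdr7cnlv",
        "include_errors": "true",
        "type": "discover_new",
        "discover_by": "keyword",
        "limit_per_input": limit_per_input,
    })
    # Assumption: one input object per keyword, keyed "keyword".
    body = json.dumps([{"keyword": kw} for kw in keywords]).encode("utf-8")
    return Request(
        f"{TRIGGER_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# Build a request for one keyword; nothing is sent over the network here.
request = build_trigger_request(["minimalist kitchen"], "BRIGHT_DATA_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; the sketch stops short of that so it can be run without credentials.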

5. Check Scraping Status

  • Type : HTTP Request Tool
  • Method : GET
  • Endpoint : https://api.brightdata.com/datasets/v3/progress/{snapshot_id}
  • Purpose : Monitors scraping job progress
  • Returns : Status values such as "running" or "ready"

6. Fetch Pinterest Snapshot Data

  • Type : HTTP Request Tool
  • Method : GET
  • Endpoint : https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}
  • Purpose : Downloads completed scraped data
  • Trigger : Executes when status is "ready"
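
In the workflow the AI agent decides when to call the status and snapshot tools. The same check-then-fetch logic can be sketched as a plain polling loop. This is only an illustration under stated assumptions: the progress response is taken to be a JSON object with a "status" key, get_json is a hypothetical callable that performs an authenticated GET and returns parsed JSON, and the 60-second default mirrors the disabled wait node.

```python
import time

BASE_URL = "https://api.brightdata.com/datasets/v3"


def wait_for_snapshot(snapshot_id, get_json, delay_seconds=60,
                      max_checks=30, sleep=time.sleep):
    """Poll the progress endpoint until the status is "ready", then fetch the data."""
    for _ in range(max_checks):
        progress = get_json(f"{BASE_URL}/progress/{snapshot_id}")
        # Assumption: the status is exposed under a "status" key.
        if progress.get("status") == "ready":
            return get_json(f"{BASE_URL}/snapshot/{snapshot_id}")
        # Any other status (for example "running") means: wait and check again.
        sleep(delay_seconds)
    raise TimeoutError(f"Snapshot {snapshot_id} not ready after {max_checks} checks")


# Simulated responses so the sketch runs without network access.
_statuses = iter(["running", "running", "ready"])


def fake_get_json(url):
    if "/progress/" in url:
        return {"status": next(_statuses)}
    return [{"url": "https://www.pinterest.com/pin/1/"}]


pins = wait_for_snapshot("s_demo", fake_get_json, sleep=lambda seconds: None)
print(len(pins))
```

Capping the number of checks keeps a stalled job from polling forever, which the agent-driven version does not guarantee on its own.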

7. Format & Extract Pinterest Content

  • Type : Code Node (JavaScript)
  • Purpose : Parses and structures raw scraped data
  • Extracted Fields :
    • URL
    • Post ID
    • Title
    • Content
    • Date Posted
    • User
    • Likes & Comments
    • Media
    • Image URL
    • Categories
    • Hashtags
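
The workflow's Code node is written in JavaScript and its source is not reproduced here. As a rough illustration of the parse-and-structure step, the Python sketch below maps raw records onto a subset of the fields listed above. The raw key names (url, post_id, user_name, and so on) are assumptions about the scraped payload, and format_pin and parse_records are hypothetical helper names.

```python
import json


def format_pin(raw):
    """Map one raw record onto a flat structure; missing keys become None or []."""
    return {
        "url": raw.get("url"),
        "post_id": raw.get("post_id"),
        "title": raw.get("title"),
        "content": raw.get("content"),
        "date_posted": raw.get("date_posted"),
        "user": raw.get("user_name"),          # assumed raw key
        "image_url": raw.get("image_url"),     # assumed raw key
        "categories": raw.get("categories") or [],
        "hashtags": raw.get("hashtags") or [],
    }


def parse_records(payload):
    """Accept a JSON string, a single record, or a list of records."""
    if isinstance(payload, str):
        payload = json.loads(payload)
    if isinstance(payload, dict):
        payload = [payload]
    return [format_pin(record) for record in payload]


# Example with a hand-written record; real payloads may use different keys.
sample = '[{"url": "https://www.pinterest.com/pin/1/", "title": "Oak shelf"}]'
for pin in parse_records(sample):
    print(pin["url"], pin["title"], pin["hashtags"])
```

Defaulting missing values instead of raising keeps one incomplete pin from aborting the whole batch.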

8. Save Pinterest Data to Google Sheets

  • Type : Google Sheets Node
  • Operation : Append
  • Mapped Columns :
    • Post URL
    • Title
    • Content
    • Image URL
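
The column mapping above can be pictured as turning each formatted pin into one row in column order. The sketch below is an assumption-laden illustration: the dictionary keys (url, title, content, image_url) are hypothetical names for the formatted fields, and to_sheet_row is not part of the workflow.

```python
SHEET_COLUMNS = ["Post URL", "Title", "Content", "Image URL"]

# Hypothetical mapping from sheet column to the key holding its value.
COLUMN_SOURCES = {
    "Post URL": "url",
    "Title": "title",
    "Content": "content",
    "Image URL": "image_url",
}


def to_sheet_row(pin):
    """Build one row in column order; missing values become empty strings."""
    return [str(pin.get(COLUMN_SOURCES[column]) or "") for column in SHEET_COLUMNS]


row = to_sheet_row({"url": "https://www.pinterest.com/pin/1/", "title": "Oak shelf"})
print(dict(zip(SHEET_COLUMNS, row)))
```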

9. Wait for 1 Minute (Disabled)

  • Type : Code Tool
  • Purpose : Adds delay between status checks (currently disabled)
  • Duration : 60 seconds

Setup Requirements

Required Credentials

  1. Anthropic API

    • Credential ID: ANTHROPIC_CREDENTIAL_ID
    • Required for Claude Sonnet 4 access
  2. BrightData API

    • API Key: BRIGHT_DATA_API_KEY
    • Required for Pinterest scraping service
  3. Google Sheets OAuth2

    • Credential ID: GOOGLE_SHEETS_CREDENTIAL_ID
    • Required for data storage

Configuration Placeholders

Replace the following placeholders with actual values:

  • WEBHOOK_ID_PLACEHOLDER : Form trigger webhook ID
  • GOOGLE_SHEET_ID_PLACEHOLDER : Target Google Sheets document ID
  • WORKFLOW_VERSION_ID : n8n workflow version
  • INSTANCE_ID_PLACEHOLDER : n8n instance identifier
  • WORKFLOW_ID_PLACEHOLDER : Unique workflow identifier
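
One way to avoid importing a workflow with a forgotten placeholder is to substitute the values in the exported workflow text and list whatever is still unfilled. The sketch below assumes the placeholders appear literally in the exported JSON; the JSON key names in the example and the function fill_placeholders are hypothetical.

```python
PLACEHOLDERS = [
    "WEBHOOK_ID_PLACEHOLDER",
    "GOOGLE_SHEET_ID_PLACEHOLDER",
    "WORKFLOW_VERSION_ID",
    "INSTANCE_ID_PLACEHOLDER",
    "WORKFLOW_ID_PLACEHOLDER",
]


def fill_placeholders(workflow_text, values):
    """Replace known placeholders and report the ones still present."""
    for name, value in values.items():
        if name not in PLACEHOLDERS:
            raise KeyError(f"Unknown placeholder: {name}")
        workflow_text = workflow_text.replace(name, value)
    remaining = [name for name in PLACEHOLDERS if name in workflow_text]
    return workflow_text, remaining


# Hypothetical fragment of an exported workflow.
exported = '{"webhookId": "WEBHOOK_ID_PLACEHOLDER", "documentId": "GOOGLE_SHEET_ID_PLACEHOLDER"}'
filled, remaining = fill_placeholders(exported, {"WEBHOOK_ID_PLACEHOLDER": "my-webhook-id"})
print(filled)
print("Still unfilled:", remaining)
```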

Data Flow

 User Input (Keywords)
 ↓
 AI Agent Processing (Claude)
 ↓
 BrightData Scraping Job Creation
 ↓
 Status Monitoring Loop
 ↓
 Data Retrieval (when ready)
 ↓
 Content Formatting & Extraction
 ↓
 Google Sheets Storage

Output Data Structure

Each scraped Pinterest pin contains:

  • URL : Direct link to Pinterest pin
  • Post ID : Unique Pinterest identifier
  • Title : Pin title/heading
  • Content : Pin description text
  • Date Posted : Publication timestamp
  • User : Pinterest username
  • Engagement : Likes and comments count
  • Media : Media type information
  • Image URL : Direct image link
  • Categories : Pin categorization tags
  • Hashtags : Associated hashtags
  • Comments : User comments text

Usage Instructions

  1. Initial Setup :

    • Configure all required API credentials
    • Replace placeholder values with actual IDs
    • Create target Google Sheets document
  2. Running the Workflow :

    • Access the form trigger URL
    • Enter desired Pinterest keywords
    • Submit the form to initiate scraping
  3. Monitoring Progress :

    • The AI agent will automatically handle status monitoring
    • No manual intervention required during scraping
  4. Accessing Results :

    • Structured data will be automatically saved to Google Sheets
    • Each run appends new data to existing sheet

Technical Notes

  • Rate Limiting : BrightData API has built-in rate limiting
  • Data Limits : The current configuration (limit_per_input set to 2) returns at most 2 pins per keyword
  • Status Polling : Automatic status checking until completion
  • Error Handling : Includes error capture in scraping requests
  • Async Processing : Supports long-running scraping jobs

Customization Options

  • Adjust Data Limits : Modify limit_per_input parameter
  • Enable Wait Timer : Activate the disabled wait node for longer jobs
  • Custom Data Fields : Modify the formatting code for additional fields
  • Alternative Storage : Replace Google Sheets with other storage options

Sample Google Sheets Template

Create a copy of the sample sheet structure:

 https://docs.google.com/spreadsheets/d/SAMPLE_SHEET_ID/edit

Required columns:

  • Post URL
  • Title
  • Content
  • Image URL

Troubleshooting

  • Authentication Errors : Verify all API credentials are correctly configured
  • Scraping Failures : Check BrightData API status and rate limits
  • Data Formatting Issues : Review the JavaScript formatting code for parsing errors
  • Google Sheets Errors : Ensure proper OAuth2 permissions and sheet access
