tengweiherr/scrapee

Scrapee Workspace

Monorepo for the Scrapee Chrome extension and web app, managed with pnpm workspaces. Scrapee collects your competitors' product info and vouchers on Shopee.

Structure

scrapee/
├── apps/
│   ├── extension/    # Chrome extension
│   └── webapp/       # Web management app
├── package.json      # Root workspace config
└── pnpm-workspace.yaml

Architecture overview

The extension consists of three scripts, each with a different role:

  1. webapp-connector.ts (Content Script - Webapp only)
  • Runs on: localhost:3000 only (webapp domain)
  • Purpose: Bridge between webapp and extension
  • Communication: BroadcastChannel (webapp ↔ extension)
  2. content/index.ts (Content Script - All pages)
  • Runs on: <all_urls> (every webpage)
  • Purpose: Access page DOM and extract HTML
  • Communication: browser.runtime.sendMessage (background ↔ content)
  3. background/index.ts (Service Worker)
  • Runs on: Extension background context (event-driven; the browser may suspend it when idle)
  • Purpose: Orchestrates the scraping workflow
  • Communication: Messages to/from content scripts and webapp-connector
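The bridging role of webapp-connector.ts can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the message names START_SCRAPING, SCRAPE_URL, CHECK_CONTENT_READY, GET_HTML and SCRAPE_COMPLETE come from the flow diagram, while the payload fields (`url`, `html`) and the helpers `toBackgroundMessage` and `createConnector` are hypothetical.

```typescript
// Message names are taken from the README; payload fields are ASSUMED for illustration.
type WebappMessage =
  | { type: "START_SCRAPING"; url: string }
  | { type: "SCRAPE_COMPLETE"; url: string; html: string };

type ExtensionMessage =
  | { type: "SCRAPE_URL"; url: string }
  | { type: "CHECK_CONTENT_READY" }
  | { type: "GET_HTML" };

// Hypothetical helper: map a webapp message to the message forwarded to the background script.
// Returns null for messages the connector should not forward.
function toBackgroundMessage(msg: WebappMessage): ExtensionMessage | null {
  switch (msg.type) {
    case "START_SCRAPING":
      return { type: "SCRAPE_URL", url: msg.url };
    default:
      return null;
  }
}

// Hypothetical factory: the transport is injected so the routing logic
// stays independent of the browser APIs and can be tested on its own.
function createConnector(sendToBackground: (msg: ExtensionMessage) => void) {
  return (msg: WebappMessage): boolean => {
    const outgoing = toBackgroundMessage(msg);
    if (outgoing === null) return false; // nothing to forward
    sendToBackground(outgoing);
    return true;
  };
}
```

In the extension itself, the returned handler would presumably be attached to a BroadcastChannel's `onmessage` listener, and `sendToBackground` would wrap `browser.runtime.sendMessage`. Typing the messages as discriminated unions lets the compiler catch a misspelled message name on either side of the bridge.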

Complete Flow Diagram

flowchart TD
    Start([User clicks Scrape]) --> Webapp["🌐 Webapp<br/>Dashboard.tsx"]
    
    Webapp -->|"📡 BroadcastChannel<br/>START_SCRAPING"| Connector["🔗 webapp-connector.ts<br/>Content Script"]
    
    Connector -->|"📨 SCRAPE_URL"| Background["⚙️ background/index.ts<br/>Service Worker"]
    
    Background -->|"1️⃣ Create tab"| Tab["📑 New Tab<br/>Target URL"]
    Tab -->|"2️⃣ Wait for load"| Loaded["✅ Tab Loaded"]
    Loaded -->|"3️⃣ CHECK_CONTENT_READY"| Content["📄 content/index.ts<br/>Content Script<br/>Target Page"]
    
    Content -->|"🔍 Query DOM"| Check{"Content<br/>Ready?"}
    Check -->|"❌ No"| Loaded
    Check -->|"✅ Yes"| GetHTML["📥 GET_HTML"]
    
    GetHTML -->|"📋 Full HTML"| Process["🔧 Process Data<br/>extractData + cheerio"]
    
    Process -->|"📤 Results"| Connector2["🔗 webapp-connector.ts"]
    Connector2 -->|"📡 SCRAPE_COMPLETE"| Webapp2["🌐 Webapp<br/>Dashboard.tsx"]
    
    Webapp2 -->|"🤖 AI Processing"| AI["🧠 Extract Product Info<br/>& Shop Vouchers"]
    AI -->|"✨ Update UI"| End([Results Displayed])

    style Webapp fill:#4A90E2,color:#fff
    style Webapp2 fill:#4A90E2,color:#fff
    style Connector fill:#F5A623,color:#fff
    style Connector2 fill:#F5A623,color:#fff
    style Background fill:#E94B3C,color:#fff
    style Content fill:#50C878,color:#fff
    style Process fill:#9B59B6,color:#fff
    style AI fill:#3498DB,color:#fff
    style Start fill:#95A5A6,color:#fff
    style End fill:#2ECC71,color:#fff
    style Check fill:#F39C12,color:#fff
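Steps 3 onward in the diagram (check readiness, retry, then fetch the HTML) amount to a polling loop. The sketch below shows one way to write it, under stated assumptions: the `ContentScriptPort` interface, the `waitForHtml` function, and the `maxAttempts` and `delayMs` options are all hypothetical, and the retry cap is an addition, since the diagram itself loops back without a limit.

```typescript
// Hypothetical abstraction over the two messages sent to the content script.
interface ContentScriptPort {
  checkContentReady(): Promise<boolean>; // stands in for CHECK_CONTENT_READY
  getHtml(): Promise<string>;            // stands in for GET_HTML
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the content script reports the page is ready, then request the HTML.
// maxAttempts and delayMs are ASSUMED defaults, not values from the repository.
async function waitForHtml(
  port: ContentScriptPort,
  options: { maxAttempts?: number; delayMs?: number } = {}
): Promise<string> {
  const { maxAttempts = 10, delayMs = 500 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await port.checkContentReady()) {
      return port.getHtml();
    }
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  throw new Error(`Content not ready after ${maxAttempts} attempts`);
}
```

Capping the attempts keeps a page that never finishes rendering from holding the scraping tab open indefinitely; the caller can catch the error and report the failure back to the webapp.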

Setup

# Install all dependencies
pnpm install

Development

# Run extension in dev mode
pnpm dev

# Run webapp in dev mode
pnpm dev:webapp

# Build extension
pnpm build

# Build webapp
pnpm build:webapp

# Lint all packages
pnpm lint

Workspace Commands

You can also run commands in specific packages:

# Extension commands
pnpm --filter extension dev
pnpm --filter extension build

# Webapp commands
pnpm --filter webapp dev
pnpm --filter webapp build

About

[WIP] Scrapes your competitors' product info & vouchers on Shopee.
