InstaScrape — Async Instagram Comment Scraper
Scrape all parent comments from any Instagram Reel with automated login, async speed, real-time progress, and clean exports — no manual cookie copying required.
✨ Features
- ✅ Automated Login: cookie.json persistence with iat + expiry, no manual cookies needed.
- Self-healing Auth: detects expired cookies mid-run, prompts relogin, resumes automatically.
- ⚡ Async Engine: powered by httpx.AsyncClient with requests-per-second throttling.
- Progress Tracking: accurate percent and ETA from Instagram’s comment count.
- Dual Exports: TXT and JSON files saved in timestamped folders.
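The requests-per-second throttling mentioned above can be pictured with a small limiter like the one below. This is only a sketch of the general idea, not InstaScrape's actual code: `RateLimiter`, `fetch_page`, and `run` are hypothetical names, and a stand-in coroutine takes the place of a real httpx.AsyncClient request.

```python
import asyncio
import time


class RateLimiter:
    """Hypothetical helper: lets at most `rps` calls start per second."""

    def __init__(self, rps: float) -> None:
        self._interval = 1.0 / rps      # minimum gap between two starts
        self._lock = asyncio.Lock()     # serialises access to the next slot
        self._next_slot = 0.0           # monotonic time of the next free slot

    async def wait(self) -> None:
        async with self._lock:
            now = time.monotonic()
            delay = self._next_slot - now
            if delay > 0:
                await asyncio.sleep(delay)
                now = time.monotonic()
            self._next_slot = now + self._interval


async def fetch_page(limiter: RateLimiter, page: int) -> int:
    await limiter.wait()
    # Stand-in for a real request made with httpx.AsyncClient.
    return page


async def run(rps: float, pages: int):
    # Create the limiter inside the running loop (safe on Python 3.9 too).
    limiter = RateLimiter(rps)
    start = time.monotonic()
    results = await asyncio.gather(
        *(fetch_page(limiter, p) for p in range(pages))
    )
    return results, time.monotonic() - start
```

Calling `asyncio.run(run(rps=5, pages=10))` would spread the ten stand-in requests over roughly two seconds instead of firing them all at once.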
Requirements
- Python 3.9+
- Dependencies: install with pip install -r requirements.txt
Installation
git clone https://github.com/kaifcodec/InstaScrape
cd InstaScrape
pip install -r requirements.txt
▶️ Usage
python3 main.py
- Enter the Instagram Reel URL (e.g., https://www.instagram.com/reel/SHORTCODE/).
- Set the maximum requests per second (5-7 recommended) and adjust it for stability.
- On first run, provide username/password; cookie.json is created and reused until expiry.
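As a rough illustration of "reused until expiry", a validity check on cookie.json could look like the sketch below. It assumes that iat and expiry are stored as top-level Unix timestamps in seconds. The README does not spell out the file layout, so treat the layout, the function name `cookie_is_valid`, and the `skew` safety margin as assumptions.

```python
import json
import time
from pathlib import Path
from typing import Optional


def cookie_is_valid(path: str = "cookie.json",
                    now: Optional[float] = None,
                    skew: float = 60.0) -> bool:
    """Return True if the cookie file exists and has not (nearly) expired.

    Assumes top-level "iat" and "expiry" fields holding Unix timestamps.
    """
    now = time.time() if now is None else now
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        iat = float(data["iat"])
        expiry = float(data["expiry"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, or unexpected layout: force a relogin.
        return False
    # Treat the cookie as expired slightly early to avoid mid-request expiry.
    return iat <= expiry and now < expiry - skew
```

When a check like this returns False, the tool would ask for credentials again and write a fresh cookie.json.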
Output
- TXT: download_comments/txt/reel_comments_YYYYMMDD_HHMMSS.txt
- JSON: download_comments/json/reel_comments_YYYYMMDD_HHMMSS.json
Example structure:
{
  "generated_at": 1700000000,
  "count": 123,
  "comments": [
    { "username": "user1", "text": "Nice!", "created_at": 1699999000 }
  ]
}
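To make the export layout concrete, here is a minimal sketch that writes both files using the paths and JSON fields shown above. The function name `export_comments` and the one-line-per-comment TXT format ("username: text") are assumptions for illustration; the real TXT layout may differ.

```python
import json
import time
from pathlib import Path
from typing import Optional


def export_comments(comments: list,
                    base_dir: str = "download_comments",
                    now: Optional[float] = None):
    """Write comments to timestamped TXT and JSON files and return both paths."""
    now = time.time() if now is None else now
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    base = Path(base_dir)
    txt_path = base / "txt" / f"reel_comments_{stamp}.txt"
    json_path = base / "json" / f"reel_comments_{stamp}.json"
    for path in (txt_path, json_path):
        path.parent.mkdir(parents=True, exist_ok=True)

    payload = {
        "generated_at": int(now),
        "count": len(comments),
        "comments": comments,
    }
    json_path.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )

    # Assumed TXT layout: one "username: text" line per comment.
    lines = [f"{c['username']}: {c['text']}" for c in comments]
    txt_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return txt_path, json_path
```

Passing `ensure_ascii=False` keeps emoji and non-Latin comment text readable in the JSON file.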
How it Works
- Cookie Lifecycle: cookie.json stores iat and expiry; validated on startup & during requests.
- Error Resilience: retries transient errors and refreshes cookies on 401/redirect-to-login.
- Progress Accuracy: uses Instagram’s comment count to calculate percent & ETA.
- Async Efficiency: httpx.AsyncClient with HTTP/2, keep-alive, and RPS limiter.
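The percent and ETA calculation described above comes down to simple arithmetic on the reported comment count. The sketch below shows one plausible way to do it. `progress` is a hypothetical helper, and the choice to cap the percentage at 100 (in case more comments arrive than the reported count) is an assumption.

```python
import time
from typing import Optional, Tuple


def progress(fetched: int,
             total: int,
             started_at: float,
             now: Optional[float] = None) -> Tuple[float, Optional[float]]:
    """Return (percent_done, eta_seconds); eta is None when it cannot be estimated."""
    now = time.monotonic() if now is None else now
    if total <= 0:
        return 0.0, None
    percent = min(100.0, 100.0 * fetched / total)
    elapsed = now - started_at
    if fetched <= 0 or elapsed <= 0:
        return percent, None
    rate = fetched / elapsed                      # comments per second so far
    eta = max(0.0, (total - fetched) / rate)      # seconds left at that rate
    return percent, eta
```

For example, with 50 of 200 comments fetched after 10 seconds, the average rate is 5 comments per second, so this gives 25 percent done and about 30 seconds remaining.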
Repository URL: https://github.com/kaifcodec/InstaScrape.git