IdleAI — Distributed Compute for AI Workloads
Revised Business Plan — February 2026
Tagline: "Earn money from your idle computer. Help democratize AI."
Founder: Solo technical founder, bootstrapped.
Executive Summary
IdleAI is a distributed compute platform that turns idle consumer hardware into an AI processing network. Unlike competitors fixated on LLM inference (which requires expensive GPUs), IdleAI recognizes that most AI workloads are not LLM inference — and many run perfectly well on ordinary CPUs.
The platform operates two contributor tiers:
- Tier 1 ("Everyone"): Any modern computer contributes CPU cycles for embedding generation, speech-to-text, OCR, reranking, and other lightweight AI tasks. Install and forget.
- Tier 2 ("GPU Owners"): Discrete GPU or Apple Silicon 16GB+ users serve LLM inference for small-to-medium models.
This multi-workload approach solves the fundamental problem that killed earlier distributed AI attempts: there aren't enough GPUs in consumer hands to build a reliable LLM inference network, but there are 1.5 billion PCs that can do everything else.
Target: $50K MRR within 18 months. Bootstrapped to profitability.
The Core Insight
The AI infrastructure market treats "AI compute" as synonymous with "GPU compute." This is wrong.
The breakdown of a typical AI application's compute spend:
| Workload | % of Total Compute Cost | GPU Required? |
|---|---|---|
| Embedding generation | 15-25% | No |
| Speech-to-text | 10-15% | No (CPU is fine) |
| Document processing/OCR | 5-10% | No |
| Reranking & retrieval | 5-10% | No |
| Text preprocessing | 3-5% | No |
| Image classification | 5-8% | Helps, not required |
| LLM inference | 35-50% | Yes |
50-65% of AI compute spend does NOT require a GPU. That's the market IdleAI attacks first.
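The 50-65% figure follows directly from the table: every row except LLM inference can run without a GPU, so the non-GPU share is simply the complement of the LLM row. A quick check:

```python
# Sanity-check the workload table: the non-GPU share of compute spend
# is the complement of LLM inference's share (the only hard-GPU row).
llm_share = (0.35, 0.50)

non_gpu_low = 1.0 - llm_share[1]   # 50%
non_gpu_high = 1.0 - llm_share[0]  # 65%

print(f"Non-GPU share: {non_gpu_low:.0%}-{non_gpu_high:.0%}")
```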
TIER 1: "Everyone" — CPU Contributors
Addressable Contributor Base
- ~1.5B active PCs worldwide
- ~500M meet minimum specs (4+ cores, 8GB+ RAM, broadband)
- Realistic early target: 10K-100K contributors (comparable to Folding@home's sustained base)
- SETI@home drew over 5M registered participants at its height — proof that the install-and-forget model works
Workload 1: Embedding Generation (text → vectors)
What it is: Converting text into numerical vectors for search, recommendation, and RAG systems. Every AI app with a knowledge base needs this.
Market demand:
- OpenAI charges $0.13 per 1M tokens for text-embedding-3-large, $0.02/1M for text-embedding-3-small
- Cohere charges $0.10/1M tokens for embed-v3
- Google charges $0.025/1M tokens (Gecko)
- Market size: Estimated $800M-1.2B/yr for embedding APIs (growing 40%+ YoY as RAG adoption explodes)
- Every company building RAG, semantic search, or recommendation systems needs continuous embedding generation
CPU performance:
- Model: all-MiniLM-L6-v2 (22M params) or BGE-small (33M params)
- Typical laptop (8GB RAM, 4-core i5/M1): ~200-400 embeddings/sec (short texts, 128 tokens)
- Typical desktop (16GB RAM, 8-core): ~500-1000 embeddings/sec
- For comparison: A single A100 GPU does ~5,000-10,000/sec
- Key insight: 10 desktops ≈ 1 GPU for embeddings. CPUs are legitimately competitive here because embedding models are tiny.
Pricing:
- Cloud: $0.02-0.13 per 1M tokens
- IdleAI price: $0.005-0.03 per 1M tokens (50-75% cheaper)
- Contributor payout: 60% of revenue
Unit economics per device (desktop, 8 idle hours/day):
- Throughput: ~500 embeddings/sec × 28,800 sec = ~14.4M embeddings/day
- At ~100 tokens/embedding = 1.44B tokens/day
- Revenue at $0.01/1M tokens = ~$14.40/day at full utilization
- Contributor earns: ~$8.64/day at full utilization
- Electricity cost for contributor: ~$0.02/day (incremental CPU load ~15W)
- ⚠️ Reality check: full utilization won't happen. A single desktop can embed far more text per day than most customers generate, so embeddings are demand-constrained (the same problem as image classification and reranking below). At realistic 1-5% utilization, contributors earn roughly $0.09-0.43/day. This only works if: (a) higher-quality models command prices above $0.05/1M tokens (e.g., E5-large), (b) the work is bundled with other tasks, or (c) contributors are motivated by altruism/gamification rather than profit (the SETI@home model)
Revised approach: Focus on larger, higher-quality embedding models (E5-large, BGE-large, 335M params) where:
- Cloud pricing is higher ($0.10-0.20/1M tokens)
- CPU still handles them (50-150 embeddings/sec on desktop)
- Gross revenue per device: $0.20-3.00/day at realistic (1-5%) utilization — still modest, but several times the small-model rate
Verdict: ⭐⭐⭐ High volume, but razor-thin margins. Best as part of a bundle, not standalone.
Workload 2: Speech-to-Text (Whisper)
What it is: Transcribing audio to text. Massive demand from podcasters, meeting tools, accessibility, call centers.
Market demand:
- OpenAI Whisper API: $0.006/minute
- Google Speech-to-Text: $0.016-0.048/minute (depending on model)
- AWS Transcribe: $0.024/minute
- Deepgram: $0.0043-0.0145/minute
- Market size: ~$5-8B/yr for speech-to-text (includes enterprise, call centers, media)
- Growing rapidly with meeting transcription (Otter, Fireflies), podcast tools, accessibility mandates
CPU performance:
- Whisper-small (244M params): Processes 1 minute of audio in ~30-60 seconds on a modern 8-core CPU (real-time or near-real-time)
- Whisper-medium (769M params): ~2-4 minutes per minute of audio on CPU — too slow for real-time, fine for batch
- Whisper-large (1.5B params): ~5-10 minutes per minute on CPU — batch only
- Whisper-tiny (39M params): 5-10x real-time on CPU — fast but lower quality
- Key insight: Whisper-small on CPU is competitive for batch transcription. Not real-time, but most transcription is batch (upload file → get transcript later).
Pricing:
- Cloud: $0.006-0.048/minute
- IdleAI price: $0.003-0.008/minute (undercut OpenAI's $0.006 at the low end; 50-80% below Google/AWS at the high end)
- Contributor payout: 60% of revenue
Unit economics per device (desktop, 8 hours idle):
- Whisper-small: ~480 minutes of audio processed per 8-hour shift (1:1 real-time)
- Revenue at $0.004/minute = $1.92/day
- Contributor earns: $1.15/day = $34.50/month
- Electricity cost: ~$0.05/day (CPU fully loaded ~45W incremental)
- ✅ Contributor is profitable! This is by far the best CPU workload.
Why it works: Audio transcription has a much higher $/compute ratio than text embeddings. Processing 1 minute of audio takes real work and commands real money.
Verdict: ⭐⭐⭐⭐⭐ The killer workload for Tier 1. High margin, CPU-competitive, massive demand, batch-tolerant.
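The bullet math above generalizes to a small per-device calculator (the throughput, price, and wattage figures are the plan's own assumptions, not measurements):

```python
def daily_economics(units_per_day: float, price_per_unit: float,
                    payout_share: float = 0.60,
                    electricity_per_day: float = 0.0) -> dict:
    """Gross revenue, contributor payout, and net for one device-day."""
    gross = units_per_day * price_per_unit
    payout = gross * payout_share
    return {
        "gross": round(gross, 2),
        "payout": round(payout, 2),
        "net_to_contributor": round(payout - electricity_per_day, 2),
    }

# Whisper-small on a desktop: 480 audio-minutes per 8-hour idle shift
# at $0.004/minute, ~45W incremental load (~$0.05/day electricity).
whisper = daily_economics(480, 0.004, electricity_per_day=0.05)
print(whisper)  # {'gross': 1.92, 'payout': 1.15, 'net_to_contributor': 1.1}
```

The same function applies to any batch workload by swapping in its units (pages, images, queries) and price.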
Workload 3: Image Classification / Object Detection (Small Models)
What it is: Classifying images, detecting objects, content moderation, visual search preprocessing.
Market demand:
- Google Vision API: $1.50-3.50 per 1,000 images (label detection)
- AWS Rekognition: $1.00-1.20 per 1,000 images
- Azure Computer Vision: $1.00-1.50 per 1,000 images
- Market: ~$2-4B/yr for image analysis APIs
- Use cases: content moderation (huge), product categorization, medical image triage, security
CPU performance:
- MobileNet v2 (3.4M params): ~50-100 images/sec on modern CPU
- EfficientNet-B0 (5.3M params): ~20-40 images/sec
- YOLO-tiny (object detection): ~5-15 FPS on CPU
- ResNet-50 (25M params): ~10-20 images/sec
- Sufficient for batch processing. Not for real-time video.
Pricing:
- Cloud: $1.00-3.50 per 1,000 images
- IdleAI: $0.30-0.80 per 1,000 images (60-75% cheaper)
- Contributor payout: 60%
Unit economics per device (desktop, 8 hours):
- MobileNet: ~50 imgs/sec × 28,800 sec = 1.44M images/day
- Revenue at $0.50/1K images = $720/day ← This can't be right
- Reality check: Cloud prices are high because they include model development, fine-tuning, and complex multi-label classification with high accuracy. A raw MobileNet pass is not equivalent to Google Vision API. Realistic equivalent pricing for basic classification: $0.05-0.10/1K images
- Revised revenue: $72-144/day per device ← Still too high. The issue is demand won't saturate a device.
- Actual constraint: demand, not supply. A single desktop can process millions of images/day. You'd need massive customer volume to keep even one machine busy at competitive prices.
Verdict: ⭐⭐⭐ Good margins per image, but hard to generate enough demand to keep contributors busy. Best as an occasional workload mixed in.
Workload 4: RAG Retrieval and Reranking
What it is: After vector search returns candidate documents, a reranking model scores them for relevance. Critical for RAG quality.
Market demand:
- Cohere Rerank: $2.00 per 1,000 searches
- Jina Reranker: $0.50-1.00 per 1,000 queries
- Market: Nascent but growing fast — every RAG pipeline needs this. Estimated $200-500M/yr and doubling annually.
CPU performance:
- Cross-encoder reranking models (e.g., ms-marco-MiniLM-L-6-v2): ~50-100 query-document pairs/sec on CPU
- A typical rerank call: 1 query × 20 documents = 20 pairs → ~0.2-0.4 seconds on CPU
- CPU is perfectly adequate for reranking. These are small BERT-class models.
Pricing:
- Cloud: $0.50-2.00 per 1K searches
- IdleAI: $0.20-0.50 per 1K searches
- Contributor payout: 60%
Unit economics per device (desktop, 8 hours):
- ~5 rerank queries/sec (20 documents each, at ~100 pairs/sec batch-optimized)
- 5 × 28,800 = ~144K queries/day
- Revenue at $0.30/1K = ~$43/day at full utilization ← again, demand-constrained, not supply
- Realistic utilization (1-5%): $0.40-2.20/day per device
- Same issue as image classification: supply will vastly exceed demand initially
Verdict: ⭐⭐⭐ Good margins, CPU-native, but demand-constrained per device. Best bundled.
Workload 5: Text Preprocessing and Tokenization
What it is: Cleaning, chunking, tokenizing text for downstream AI processing.
Market demand:
- Typically bundled into other services, not sold standalone
- Some demand in data pipeline services
- Very low $/compute — this is not a viable standalone revenue workload
Verdict: ⭐ Too cheap to meter. Not worth the orchestration overhead. Skip as a paid workload; include as "free value-add" when bundling with other services.
Workload 6: Synthetic Data Generation (Small Models)
What it is: Using small language models (Phi-3, TinyLlama, etc.) to generate training data, augment datasets.
Market demand:
- Emerging market: Scale AI, Snorkel, Gretel charge $5-50/hr for data generation pipelines
- Market: ~$1-2B/yr for synthetic data (Gartner predicts 60% of AI training data will be synthetic by 2026)
CPU performance:
- Phi-3-mini (3.8B): ~2-5 tokens/sec on CPU with 16GB RAM — painfully slow
- TinyLlama (1.1B): ~10-20 tokens/sec on CPU — usable for batch
- Marginal on CPU. This workload really belongs in Tier 2 for any model above 1B params.
Verdict: ⭐⭐ Only viable with sub-1B models on CPU. Move to Tier 2 for anything useful.
Workload 7: OCR and Document Processing
What it is: Extracting text from images/PDFs, structured data extraction from documents.
Market demand:
- AWS Textract: $1.50 per 1,000 pages (basic), $15-50 per 1,000 pages (tables/forms)
- Google Document AI: $1.50-65 per 1,000 pages depending on processor
- Azure Form Recognizer: $1.50-50 per 1,000 pages
- Market: ~$3-5B/yr for document processing (insurance, legal, finance, healthcare)
CPU performance:
- Tesseract OCR: ~1-3 pages/sec on modern CPU (basic text extraction)
- PaddleOCR: ~2-5 pages/sec
- DocTR (deep learning OCR): ~0.5-2 pages/sec on CPU
- Layout analysis + table extraction: ~0.2-0.5 pages/sec
- CPU is the standard for OCR. GPUs help but aren't required.
Pricing:
- Cloud: $1.50-15.00 per 1,000 pages
- IdleAI: $0.50-5.00 per 1,000 pages (60-70% cheaper)
- Contributor payout: 60%
Unit economics per device (desktop, 8 hours):
- ~2 pages/sec × 28,800 sec = 57,600 pages/day
- Revenue at $1.00/1K pages = $57.60/day
- Contributor earns: $34.56/day = $1,037/month at full utilization
- Realistic utilization (10-20%): $100-200/month
- Electricity: ~$0.10/day (CPU fully loaded, ~100W incremental)
- ✅ Highly profitable for contributors
Verdict: ⭐⭐⭐⭐⭐ Excellent workload. CPU-native, high margins, massive enterprise demand, batch-tolerant.
TIER 2: "GPU Owners" — LLM Inference
Addressable Contributor Base
| Segment | Devices | VRAM/RAM | Models They Can Run |
|---|---|---|---|
| NVIDIA 8GB VRAM (RTX 3060, 4060, etc.) | ~70M | 8-12GB | 7-8B models (Llama 3 8B, Mistral 7B, Phi-3) |
| NVIDIA 16GB+ (RTX 3090, 4080, 4090) | ~15M | 16-24GB | Up to 30B models, 70B quantized |
| Apple Silicon 16GB+ | ~50-80M | 16-24GB unified | 7-13B models comfortably, 30B quantized |
| Apple Silicon 32GB+ | ~15-25M | 32GB+ unified | Up to 70B quantized |
| Crypto miners (idle) | ~5-10M | 8-16GB typically | 7-8B models |
| Total addressable | ~150-200M | — | — |
Realistic early contributors: 1K-10K GPU nodes (need to prove payout before scaling)
LLM Inference Workloads
Market demand:
- OpenAI API revenue: ~$5-10B/yr (2025)
- Anthropic, Google, Mistral combined: ~$3-5B/yr
- Total LLM API market: ~$15-25B/yr and growing 50-100% annually
- Together.ai, Fireworks, Groq, etc. selling inference at 50-80% discount to OpenAI
Performance on consumer hardware:
| Model | Hardware | Tokens/sec | Latency (first token) |
|---|---|---|---|
| Llama 3 8B Q4 | RTX 4060 8GB | 40-60 tok/s | 200-400ms |
| Llama 3 8B Q4 | M2 16GB | 25-40 tok/s | 300-600ms |
| Llama 3 70B Q4 | RTX 4090 24GB | 15-25 tok/s | 1-3s |
| Llama 3 70B Q4 | M3 Max 64GB | 15-20 tok/s | 2-4s |
| Mistral 7B Q4 | RTX 3060 12GB | 35-50 tok/s | 200-500ms |
| Phi-3 mini 3.8B | RTX 3060 | 60-90 tok/s | 100-200ms |
Cloud comparison: A100 serves Llama 3 8B at ~200+ tok/s per concurrent user. Consumer hardware is 3-10x slower per device, but effectively free compute (contributor provides the hardware).
Pricing:
- OpenAI GPT-4o-mini: $0.15/$0.60 per 1M tokens (input/output)
- Together.ai Llama 3 8B: $0.10/$0.10 per 1M tokens
- IdleAI Llama 3 8B: $0.03-0.06 per 1M tokens (70-80% cheaper than Together)
- IdleAI Llama 3 70B: $0.20-0.40 per 1M tokens (70% cheaper than cloud)
Unit economics per device (RTX 4060, 8 hours idle):
- Llama 3 8B at 50 tok/s = 1.44M tokens/8hr shift
- Revenue at $0.05/1M tokens (blended) = $0.07/day ← Terrible
- Reality check: At current open-model pricing ($0.10/1M tokens on Together.ai), there's almost no margin for a distributed network.
The LLM pricing problem: Open model inference is commoditized to near-zero. Together.ai charges $0.10/1M tokens for Llama 3 8B. To undercut them AND pay contributors, you'd need volume in the billions of tokens/day.
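To make "billions of tokens/day" concrete: using the plan's own figures (a hypothetical $0.05/1M blended price, 40% platform take, and the projected ~$4,600/mo month-12 cost base), the LLM volume needed just to cover infrastructure is:

```python
price_per_1m_tokens = 0.05   # hypothetical blended IdleAI Llama 3 8B price ($)
platform_take = 0.40         # platform keeps 40% of GMV
monthly_costs = 4_600        # projected month-12 cost base ($)

platform_rev_per_1m = price_per_1m_tokens * platform_take  # $0.02 per 1M tokens
tokens_needed = monthly_costs / platform_rev_per_1m        # in millions of tokens
print(f"{tokens_needed / 1_000:.0f}B tokens/month "
      f"(~{tokens_needed / 30_000:.1f}B/day) just to break even on LLM alone")
```

Roughly 230B tokens/month — far beyond any realistic early-stage demand, which is why LLM inference cannot be the revenue driver.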
Revised LLM approach: Target use cases where latency tolerance is high and price sensitivity is extreme:
- Batch processing (not real-time chat)
- Dev/test environments (developers testing against Llama before deploying to prod)
- Fine-tuned model hosting (run YOUR fine-tuned model — cloud providers charge 2-5x for custom models)
- Privacy-sensitive inference (data never leaves the contributor's machine — customer sends encrypted prompt)
Revised unit economics (batch LLM, RTX 4060, 8 hours):
- Higher throughput in batch mode: ~80 tok/s sustained
- 2.3M tokens/day
- Revenue at $0.15/1M tokens (custom/batch premium): $0.35/day
- Contributor earns: $0.21/day = $6.30/month
- Electricity: ~$0.15/day (150W GPU × 8hrs × $0.12/kWh)
- Net contributor profit: ~$1.80/month ← Still marginal
The honest answer on LLM inference: At 2026 pricing, distributed LLM inference is a loss leader or community feature, not a profit center. It's useful for:
- Marketing ("run Llama on our network!")
- Handling overflow from CPU workloads
- Custom/fine-tuned model hosting (where cloud alternatives are expensive)
- Privacy-focused inference
Verdict: ⭐⭐ Important for marketing and product positioning, but margins are near-zero. Not the revenue driver.
Revenue Model & Mix
Realistic Revenue Mix (Year 1-2)
| Workload | % of Revenue | Why |
|---|---|---|
| Speech-to-text (Whisper) | 35% | Best CPU economics, huge demand |
| OCR / Document processing | 30% | High margins, enterprise buyers |
| Embedding generation | 10% | High volume, low margin |
| Reranking / retrieval | 10% | Growing demand, CPU-native |
| Image classification | 5% | Niche but profitable |
| LLM inference | 10% | Loss leader, marketing value |
Key takeaway: 90% of revenue comes from CPU workloads. LLM inference is the shiny marketing story; CPU workloads are the actual business.
Pricing Structure
For customers (API buyers):
- Pay-as-you-go API pricing (50-75% cheaper than cloud incumbents)
- Volume discounts at $1K/mo, $5K/mo, $10K/mo tiers
- SLA tiers: Best-effort (cheapest), Guaranteed (higher price, redundant processing)
For contributors (compute sellers):
- 60% revenue share on all workloads
- Weekly payouts via Stripe/PayPal (minimum $5)
- Dashboard showing earnings, uptime, jobs completed
- Bonus multipliers for reliability (99%+ uptime) and speed
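A payout computation consistent with the structure above might look like this (the 10% bonus value and the carry-over behavior below the $5 floor are illustrative assumptions, not committed policy):

```python
def weekly_payout(gross_revenue: float, uptime: float,
                  payout_share: float = 0.60,
                  minimum: float = 5.00) -> float:
    """Contributor's weekly payout: 60% share plus a reliability bonus,
    carried over to the next week if under the $5 minimum."""
    # Hypothetical bonus: +10% for 99%+ uptime (exact tiers TBD)
    multiplier = 1.10 if uptime >= 0.99 else 1.00
    amount = gross_revenue * payout_share * multiplier
    return round(amount, 2) if amount >= minimum else 0.0  # 0 = carried over

print(weekly_payout(20.00, uptime=0.995))  # 13.2
print(weekly_payout(5.00, uptime=0.90))    # 0.0 (below minimum, carried over)
```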
Unit Economics Summary
Per-Device Monthly Economics (8 hrs idle/day, 20% utilization)
| Device Type | Primary Workload | Gross Revenue | Contributor Payout (60%) | Electricity Cost | Net to Contributor |
|---|---|---|---|---|---|
| Laptop (8GB, CPU) | Whisper + embeddings | $8-15/mo | $5-9/mo | $3-5/mo | $0-4/mo |
| Desktop (16GB, CPU) | Whisper + OCR | $20-40/mo | $12-24/mo | $5-8/mo | $4-16/mo |
| Desktop + RTX 4060 | Whisper + OCR + LLM | $25-45/mo | $15-27/mo | $8-12/mo | $3-15/mo |
| Mac M2 16GB | Whisper + LLM | $15-30/mo | $9-18/mo | $2-4/mo | $5-14/mo |
| Desktop + RTX 4090 | All workloads | $35-60/mo | $21-36/mo | $12-18/mo | $3-18/mo |
Honest assessment: Most Tier 1 contributors will earn $0-15/month — barely covering electricity. The pitch needs to be:
- "Beer money" — not "quit your job" income
- Altruism angle — "help democratize AI" (like SETI@home, most people didn't care about earnings)
- Gamification — leaderboards, badges, contribution streaks
- The real earners are people with multiple machines or always-on desktops
Platform Economics
| Metric | Month 6 | Month 12 | Month 18 |
|---|---|---|---|
| Contributors | 2,000 | 15,000 | 50,000 |
| Active (8+ hrs/day) | 800 | 6,000 | 20,000 |
| API customers | 20 | 100 | 300 |
| Monthly GMV | $5K | $50K | $250K |
| Platform revenue (40%) | $2K | $20K | $100K |
| Infrastructure costs | $3K | $8K | $20K |
| Net | -$1K | $12K | $80K |
Product & App Experience
Tier 1: Install and Forget
Onboarding (< 2 minutes):
- Download app (macOS, Windows, Linux)
- Create account (email or Google)
- App auto-benchmarks hardware (30 seconds)
- Set preferences: "Run when idle" / "Run always" / "Run on schedule"
- Set resource limits: "Use up to 50% CPU" / "Max 4GB RAM"
- Done. App sits in system tray.
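The preference screen above maps naturally to a small local settings object; this shape is illustrative (field names are assumptions, not a published spec):

```python
import json

# Illustrative defaults for the agent's local preferences file.
DEFAULT_PREFS = {
    "run_mode": "when_idle",   # "when_idle" | "always" | "schedule"
    "schedule": None,          # e.g. {"start": "22:00", "end": "07:00"}
    "max_cpu_percent": 50,     # "Use up to 50% CPU"
    "max_ram_gb": 4,           # "Max 4GB RAM"
    "workloads": "auto",       # agent decides; no user configuration
}

print(json.dumps(DEFAULT_PREFS, indent=2))
```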
Ongoing experience:
- System tray icon shows status (idle/working/earning)
- Weekly earning summary notification
- Monthly payout
- No model selection, no configuration, no technical knowledge needed
- Auto-updates models and workload types silently
Technical architecture:
- Lightweight agent (~50MB install)
- Pulls work units from central coordinator
- Executes in sandboxed container (security)
- Returns results, gets credit
- Coordinator handles load balancing, quality verification, redundancy
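The agent's core loop is simple in outline: pull a work unit, execute it in a sandbox, return the result with a hash the coordinator can compare across redundant runs. A minimal sketch (the coordinator interface and work-unit fields are hypothetical; a real agent would dispatch to ONNX Runtime or llama.cpp):

```python
import hashlib
import queue

def run_sandboxed(work_unit: dict) -> str:
    """Stand-in for sandboxed model execution (ONNX Runtime, etc.)."""
    return f"result-for-{work_unit['id']}"

def agent_loop(coordinator: queue.Queue, results: list) -> None:
    """Pull work units until the queue is empty; report result + hash."""
    while True:
        try:
            unit = coordinator.get_nowait()
        except queue.Empty:
            return
        output = run_sandboxed(unit)
        # Hashing lets the coordinator compare redundant executions cheaply
        digest = hashlib.sha256(output.encode()).hexdigest()
        results.append({"id": unit["id"], "output": output, "hash": digest})

coordinator = queue.Queue()
for i in range(3):
    coordinator.put({"id": i, "task": "embedding"})
results = []
agent_loop(coordinator, results)
print([r["id"] for r in results])  # [0, 1, 2]
```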
Tier 2: Light Configuration
Additional onboarding for GPU users:
- App detects GPU/Apple Silicon automatically
- Shows: "You qualify for premium workloads! Enable GPU acceleration?"
- Optional: Choose which models to download (defaults to recommended)
- First model download: 4-8GB, takes 5-10 minutes
- Done. GPU work is prioritized when available.
Ongoing experience:
- Same as Tier 1 plus GPU utilization stats
- Can opt into specific models or leave on auto
- Higher earnings dashboard
- Model management: delete/add models as desired
Why Multi-Workload Beats Pure LLM
| Factor | LLM-Only Platform | IdleAI Multi-Workload |
|---|---|---|
| Contributor base | ~50-150M (GPU owners only) | ~500M+ (any modern PC) |
| Contributor earnings | $0-6/month (razor thin) | $5-40/month (viable on CPU workloads) |
| Revenue per API call | $0.00001-0.0001 | $0.001-0.01 (Whisper/OCR much higher) |
| Utilization | Low (LLM demand is spiky) | High (diverse workloads fill gaps) |
| Competition | Together.ai, Fireworks, Groq (well-funded) | Few competitors in distributed CPU AI |
| Network effect | Need massive GPU fleet for reliability | Even small network handles batch CPU work |
| Cold start | Hard (need many GPUs before useful) | Easy (a few hundred CPUs serve real customers) |
The fatal flaw of LLM-only distributed platforms:
- Cloud LLM inference is plummeting in price (80% drop in 18 months)
- A single H100 replaces hundreds of consumer GPUs
- Latency requirements for chat mean you can't distribute across home internet
- Consumer GPUs are only idle 8-12 hrs/day — unreliable for SLA-bound customers
Why CPU workloads work better:
- Batch-tolerant (nobody needs an embedding in <100ms)
- CPUs are always available (don't conflict with gaming)
- No VRAM limitations
- Workloads are embarrassingly parallel (split across thousands of machines trivially)
- Enterprise customers are used to paying real money for OCR/transcription
Go-to-Market Strategy
Phase 1: Supply-Side (Months 1-3)
Goal: 1,000 contributors
- Launch on Hacker News, Reddit r/passive_income, r/beermoney
- "SETI@home for AI" narrative — nostalgia + novelty
- Open-source the contributor agent (trust + contributions)
- Benchmark tool: "See how much your computer could earn" (viral calculator)
- Discord community for contributors
Phase 2: Demand-Side (Months 3-6)
Goal: 20 paying API customers
- Target indie developers and small startups
- API compatible with OpenAI/Cohere endpoints (drop-in replacement)
- Free tier: 10K API calls/month
- Content marketing: "We cut our embedding costs by 70%" case studies
- Anchor customers: Approach podcasters/YouTubers for batch transcription (clear value prop, clear savings)
Phase 3: Scale (Months 6-18)
Goal: $50K+ MRR
- Enterprise pilots (SOC2 compliance, dedicated contributor pools)
- Geographic distribution as a feature ("process data in 50+ countries")
- Specialized workload partnerships (legal OCR, medical transcription)
- Referral program: contributors earn bonus for inviting others
Marketing Messages
For contributors:
- "Earn money while you sleep. Your computer works, you get paid."
- "Join 10,000 people powering the future of AI — from their living rooms."
- "Your idle laptop could earn $5-20/month. Here's how."
For customers:
- "AI APIs at 70% less. Same accuracy, no vendor lock-in."
- "Transcribe 10,000 hours of audio for the price of 3,000."
- "The world's most distributed AI compute network."
Technical Architecture (Solo Founder Scope)
MVP Components (Month 1-2)
- Coordinator Service — Go or Rust, deployed on a single $20/mo VPS
- Work queue (Redis)
- Result verification (run same job on 2 contributors, compare)
- API gateway for customers
- Contributor management
- Contributor Agent — Python + Electron or Tauri wrapper
- Model runtime (ONNX Runtime for CPU, llama.cpp for GPU)
- Sandboxed execution
- Auto-update mechanism
- System tray app
- API Layer — OpenAI-compatible endpoints
- /v1/embeddings
- /v1/audio/transcriptions
- /v1/images/classify (custom)
- /v1/documents/ocr (custom)
- /v1/chat/completions (for LLM)
- Dashboard — Simple React app
- Contributor: earnings, stats, settings
- Customer: usage, billing, API keys
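Because the endpoints listed above mirror OpenAI's, a customer's request body stays identical — only the base URL changes. Building such a request with the standard library (the `api.idleai.example` host and `bge-small-en` model name are placeholders):

```python
import json
import urllib.request

# Same payload an OpenAI /v1/embeddings call would send.
payload = {
    "model": "bge-small-en",  # IdleAI-served model (name illustrative)
    "input": ["distributed compute", "idle hardware"],
}
req = urllib.request.Request(
    "https://api.idleai.example/v1/embeddings",  # placeholder host
    data=json.dumps(payload).encode(),
    headers={"Authorization": "Bearer $IDLEAI_KEY",
             "Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted here since the
# host is a placeholder.
print(req.get_method(), req.full_url)
```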
Quality Assurance
- Redundant execution: Each job runs on 2+ contributors; results compared
- Spot checks: 5% of jobs also run on a trusted server for ground truth
- Reputation system: Contributors build trust score over time; high-reputation contributors get single-execution jobs (more efficient)
- Cryptographic verification: Hash inputs/outputs to prevent tampering
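The redundant-execution check reduces to majority-voting over hashes of independently produced results; a sketch (the two-match threshold is illustrative):

```python
import hashlib
from collections import Counter

def digest(result: bytes) -> str:
    return hashlib.sha256(result).hexdigest()

def verify(results):
    """Majority-vote over redundant executions of one job.

    `results` maps contributor id -> raw output bytes. Returns the agreed
    result (or None) and the contributors whose output disagreed —
    candidates for a reputation penalty.
    """
    hashes = {c: digest(r) for c, r in results.items()}
    winner, count = Counter(hashes.values()).most_common(1)[0]
    if count < 2:  # need at least two matching results to accept
        return None, list(results)
    agreed = next(r for c, r in results.items() if hashes[c] == winner)
    outliers = [c for c, h in hashes.items() if h != winner]
    return agreed, outliers

ok, bad = verify({"alice": b"42", "bob": b"42", "mallory": b"41"})
print(ok, bad)  # b'42' ['mallory']
```

Exact-byte comparison assumes deterministic outputs; workloads with floating-point variance (embeddings, LLM sampling) would need tolerance-based comparison instead of hashing.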
What to Build First
- Whisper transcription (highest margin, clearest value prop)
- Embedding generation (highest volume, easiest to implement)
- OCR (high margin, enterprise appeal)
- LLM inference (last — hardest, lowest margin, but needed for marketing story)
Financial Projections (Bootstrapped)
Costs (Monthly)
| Item | Month 1 | Month 6 | Month 12 |
|---|---|---|---|
| Infrastructure (VPS, Redis, etc.) | $100 | $500 | $2,000 |
| Verification compute (spot checks) | $50 | $300 | $1,500 |
| Payment processing | $0 | $200 | $1,000 |
| Domain, email, misc | $50 | $50 | $100 |
| Total | $200 | $1,050 | $4,600 |
No salary — founder lives off savings/other income. No office. No employees.
Revenue
| Month | Contributors | Customers | GMV | Platform Revenue (40%) |
|---|---|---|---|---|
| 3 | 500 | 5 | $1,000 | $400 |
| 6 | 2,000 | 20 | $5,000 | $2,000 |
| 9 | 8,000 | 50 | $20,000 | $8,000 |
| 12 | 15,000 | 100 | $50,000 | $20,000 |
| 18 | 50,000 | 300 | $250,000 | $100,000 |
Break-even: Month 5-6 ($1K MRR)
Ramen profitable: Month 8-9 ($5K MRR)
Real business: Month 14-16 (~$50K MRR)
Risks & Mitigations
| Risk | Severity | Mitigation |
|---|---|---|
| Cloud prices drop further, eliminating margin | High | Focus on workloads where our edge isn't price but distribution (privacy, geographic) |
| Contributors churn when earnings are low | High | Gamification, community, altruism angle; be honest about earnings from day 1 |
| Quality/reliability concerns from customers | High | Redundant execution, reputation system, SLA with refunds |
| Security (malicious contributors) | High | Sandboxing, result verification, encrypted data in transit |
| Single point of failure (coordinator) | Medium | Multi-region coordinator, contributor-side caching for resilience |
| Incumbents copy the model | Medium | Network effects + community moat; they'd cannibalize their own cloud revenue |
| Regulatory (data processing across jurisdictions) | Medium | Allow customers to restrict to specific countries; GDPR compliance mode |
Competitive Landscape
| Company | Model | Weakness vs IdleAI |
|---|---|---|
| Together.ai | Cloud GPU inference | Not distributed, higher cost basis |
| Vast.ai | GPU marketplace | GPU only, technical users only |
| Akash Network | Decentralized cloud (crypto) | Complex, crypto-native, not consumer-friendly |
| Render Network | Distributed GPU rendering | GPU only, rendering focus |
| Bittensor | Crypto-incentivized AI | Complex tokenomics, crypto-native |
| Salad.cloud | Distributed GPU cloud | GPU only, gaming rigs |
IdleAI's unique position: The only platform that makes ANY computer useful for AI workloads, not just GPU machines. The SETI@home of AI.
Key Metrics to Track
- Contributor metrics: Active contributors, uptime %, jobs completed, churn rate
- Customer metrics: API calls/day, revenue/customer, churn, NPS
- Platform metrics: Job completion time, quality score, cost per job
- Economics: GMV, take rate, contributor payout ratio, infrastructure cost %
Summary: What Makes This Work
- CPU workloads are the real business. LLM inference is marketing; Whisper, OCR, and embeddings are revenue.
- SETI@home proved the model. Millions of people will donate compute for a good cause; paying them is even better.
- 50-75% cheaper than cloud is possible because contributor hardware is a sunk cost with near-zero marginal cost.
- Batch tolerance is the key. We don't compete on latency; we compete on price for workloads where "done in 5 minutes" beats "done in 5 seconds at 10x the price."
- Solo founder can build this. Coordinator + agent + API + dashboard. The hard part isn't the tech — it's the two-sided marketplace.
The one thing that must be true: Enough customers willing to trade latency for 50-75% cost savings on AI workloads. If that's true, everything else follows.
Plan version: 2.0 — February 2026
Author: [Founder Name]
Status: Pre-launch