
IdleAI — Distributed Compute for AI Workloads

February 15, 2026
Updated Mar 21, 2026
Tags: business-plan, distributed-compute, AI, startup

IdleAI: Distributed AI Compute Platform

Revised Business Plan — February 2026

Tagline: "Earn money from your idle computer. Help democratize AI."

Founder: Solo technical founder, bootstrapped.


Executive Summary

IdleAI is a distributed compute platform that turns idle consumer hardware into an AI processing network. Unlike competitors fixated on LLM inference (which requires expensive GPUs), IdleAI recognizes that most AI workloads are not LLM inference — and many run perfectly well on ordinary CPUs.

The platform operates two contributor tiers:

  • Tier 1 ("Everyone"): Any modern computer contributes CPU cycles for embedding generation, speech-to-text, OCR, reranking, and other lightweight AI tasks. Install and forget.
  • Tier 2 ("GPU Owners"): Discrete GPU or Apple Silicon 16GB+ users serve LLM inference for small-to-medium models.

This multi-workload approach solves the fundamental problem that killed earlier distributed AI attempts: there aren't enough GPUs in consumer hands to build a reliable LLM inference network, but there are 1.5 billion PCs that can do everything else.

Target: $50K MRR within 18 months. Bootstrapped to profitability.


The Core Insight

The AI infrastructure market treats "AI compute" as synonymous with "GPU compute." This is wrong.

The breakdown of a typical AI application's compute spend:

| Workload | % of Total Compute Cost | GPU Required? |
|---|---|---|
| Embedding generation | 15-25% | No |
| Speech-to-text | 10-15% | No (CPU is fine) |
| Document processing/OCR | 5-10% | No |
| Reranking & retrieval | 5-10% | No |
| Text preprocessing | 3-5% | No |
| Image classification | 5-8% | Helps, not required |
| LLM inference | 35-50% | Yes |

50-65% of AI compute spend does NOT require a GPU. That's the market IdleAI attacks first.


TIER 1: "Everyone" — CPU Contributors

Addressable Contributor Base

  • ~1.5B active PCs worldwide
  • ~500M meet minimum specs (4+ cores, 8GB+ RAM, broadband)
  • Realistic early target: 10K-100K contributors (comparable to Folding@home's sustained base)
  • SETI@home attracted over 5 million registered participants over its lifetime, proof that the install-and-forget model works at scale

Workload 1: Embedding Generation (text → vectors)

What it is: Converting text into numerical vectors for search, recommendation, and RAG systems. Every AI app with a knowledge base needs this.

Market demand:

  • OpenAI charges $0.13 per 1M tokens for text-embedding-3-large, $0.02/1M for text-embedding-3-small
  • Cohere charges $0.10/1M tokens for embed-v3
  • Google charges $0.025/1M tokens (Gecko)
  • Market size: Estimated $800M-1.2B/yr for embedding APIs (growing 40%+ YoY as RAG adoption explodes)
  • Every company building RAG, semantic search, or recommendation systems needs continuous embedding generation

CPU performance:

  • Model: all-MiniLM-L6-v2 (22M params) or BGE-small (33M params)
  • Typical laptop (8GB RAM, 4-core i5/M1): ~200-400 embeddings/sec (short texts, 128 tokens)
  • Typical desktop (16GB RAM, 8-core): ~500-1000 embeddings/sec
  • For comparison: A single A100 GPU does ~5,000-10,000/sec
  • Key insight: 10 desktops ≈ 1 GPU for embeddings. CPUs are legitimately competitive here because embedding models are tiny.

Pricing:

  • Cloud: $0.02-0.13 per 1M tokens
  • IdleAI price: $0.005-0.03 per 1M tokens (50-75% cheaper)
  • Contributor payout: 60% of revenue

Unit economics per device (desktop, 8 idle hours/day):

  • Throughput: ~500 embeddings/sec × 28,800 sec = ~14.4M embeddings/day
  • At ~100 tokens/embedding = ~1.44B tokens/day
  • Revenue at $0.01/1M tokens = $14.40/day at full utilization ← but that is supply-side capacity, not demand; a single desktop can embed far more text than most customers need
  • Realistic utilization (1-5%, the same demand constraint as image classification and reranking below): $0.14-0.72/day gross; contributor earns (60%): $0.09-0.43/day
  • Electricity cost for contributor: ~$0.02/day (incremental CPU load ~15W) = $0.60/month
  • ⚠️ Reality check: even in the good case, embeddings net a contributor a few dollars a month at most. This only works if: (a) prices stay above ~$0.05/1M tokens (use higher-quality models like E5-large), (b) the work is bundled with other tasks, or (c) contributors don't care about profit (altruism/gamification model like SETI@home)

Revised approach: Focus on larger, higher-quality embedding models (E5-large, BGE-large, 335M params) where:

  • Cloud pricing is higher ($0.10-0.20/1M tokens)
  • CPU still handles them (50-150 embeddings/sec on desktop)
  • Revenue per device: $0.05-0.15/day — still marginal but approaching break-even
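These per-device figures are easy to sanity-check with a small calculator. The throughput, token-length, and price numbers below are the plan's own assumptions; the utilization parameter is the hypothetical knob that dominates the outcome.

```python
def daily_gross(items_per_sec, tokens_per_item, price_per_m_tokens,
                idle_hours=8, utilization=1.0):
    """Gross $/day for a token-priced workload on one device."""
    seconds = idle_hours * 3600
    tokens = items_per_sec * seconds * tokens_per_item * utilization
    return tokens / 1e6 * price_per_m_tokens

# MiniLM-class embeddings: 500/sec, ~100 tokens each, $0.01 per 1M tokens
full = daily_gross(500, 100, 0.01)                    # full utilization
low = daily_gross(500, 100, 0.01, utilization=0.02)   # 2% utilization

# E5/BGE-large class: slower (~100/sec) but priced higher ($0.15 per 1M)
large = daily_gross(100, 100, 0.15, utilization=0.02)
```

Whatever the exact utilization turns out to be, the takeaway matches the verdict in this section: price per token, not raw throughput, decides whether a contributor clears electricity costs.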

Verdict: ⭐⭐⭐ High volume, but razor-thin margins. Best as part of a bundle, not standalone.


Workload 2: Speech-to-Text (Whisper)

What it is: Transcribing audio to text. Massive demand from podcasters, meeting tools, accessibility, call centers.

Market demand:

  • OpenAI Whisper API: $0.006/minute
  • Google Speech-to-Text: $0.016-0.048/minute (depending on model)
  • AWS Transcribe: $0.024/minute
  • Deepgram: $0.0043-0.0145/minute
  • Market size: ~$5-8B/yr for speech-to-text (includes enterprise, call centers, media)
  • Growing rapidly with meeting transcription (Otter, Fireflies), podcast tools, accessibility mandates

CPU performance:

  • Whisper-small (244M params): Processes 1 minute of audio in ~30-60 seconds on a modern 8-core CPU (real-time or up to 2x faster)
  • Whisper-medium (769M params): ~2-4 minutes per minute of audio on CPU — too slow for real-time, fine for batch
  • Whisper-large (1.5B params): ~5-10 minutes per minute on CPU — batch only
  • Whisper-tiny (39M params): 5-10x real-time on CPU — fast but lower quality
  • Key insight: Whisper-small on CPU is competitive for batch transcription. Not real-time, but most transcription is batch (upload file → get transcript later).

Pricing:

  • Cloud: $0.006-0.048/minute
  • IdleAI price: $0.003-0.008/minute (undercut OpenAI's $0.006 by 30-50% at the low end; even the top of the range undercuts the Google/AWS tiers)
  • Contributor payout: 60% of revenue

Unit economics per device (desktop, 8 hours idle):

  • Whisper-small: ~480 minutes of audio processed per 8-hour shift (1:1 real-time)
  • Revenue at $0.004/minute = $1.92/day
  • Contributor earns: $1.15/day = $34.50/month
  • Electricity cost: ~$0.05/day (CPU fully loaded ~45W incremental)
  • Contributor is profitable! This is by far the best CPU workload.
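A quick sketch of the contributor-side math, using the plan's numbers (1:1 real-time Whisper-small, $0.004/min, 60% payout, ~45W incremental draw at an assumed $0.12/kWh):

```python
def whisper_contributor_day(price_per_min=0.004, rtf=1.0, idle_hours=8,
                            payout_share=0.60, watts=45, kwh_price=0.12):
    """Per-day gross, payout, and electricity for batch transcription.

    rtf = processing time / audio time (1.0 means real-time).
    """
    minutes_of_audio = idle_hours * 60 / rtf
    gross = minutes_of_audio * price_per_min
    payout = gross * payout_share
    electricity = watts / 1000 * idle_hours * kwh_price
    return gross, payout, electricity

gross, payout, electricity = whisper_contributor_day()
# Payout comfortably exceeds electricity, unlike the embedding case.
```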

Why it works: Audio transcription has a much higher $/compute ratio than text embeddings. Processing 1 minute of audio takes real work and commands real money.

Verdict: ⭐⭐⭐⭐⭐ The killer workload for Tier 1. High margin, CPU-competitive, massive demand, batch-tolerant.


Workload 3: Image Classification / Object Detection (Small Models)

What it is: Classifying images, detecting objects, content moderation, visual search preprocessing.

Market demand:

  • Google Vision API: $1.50-3.50 per 1,000 images (label detection)
  • AWS Rekognition: $1.00-1.20 per 1,000 images
  • Azure Computer Vision: $1.00-1.50 per 1,000 images
  • Market: ~$2-4B/yr for image analysis APIs
  • Use cases: content moderation (huge), product categorization, medical image triage, security

CPU performance:

  • MobileNet v2 (3.4M params): ~50-100 images/sec on modern CPU
  • EfficientNet-B0 (5.3M params): ~20-40 images/sec
  • YOLO-tiny (object detection): ~5-15 FPS on CPU
  • ResNet-50 (25M params): ~10-20 images/sec
  • Sufficient for batch processing. Not for real-time video.

Pricing:

  • Cloud: $1.00-3.50 per 1,000 images
  • IdleAI: $0.30-0.80 per 1,000 images (60-75% cheaper)
  • Contributor payout: 60%

Unit economics per device (desktop, 8 hours):

  • MobileNet: ~50 imgs/sec × 28,800 sec = 1.44M images/day
  • Revenue at $0.50/1K images = $720/day ← This can't be right
  • Reality check: Cloud prices are high because they include model development, fine-tuning, and complex multi-label classification with high accuracy. A raw MobileNet pass is not equivalent to Google Vision API. Realistic equivalent pricing for basic classification: $0.05-0.10/1K images
  • Revised revenue: $72-144/day per device ← Still too high. The issue is demand won't saturate a device.
  • Actual constraint: demand, not supply. A single desktop can process millions of images/day. You'd need massive customer volume to keep even one machine busy at competitive prices.
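The demand constraint is worth quantifying. Under the plan's throughput assumptions, here is how few devices even a fairly large customer would keep busy (the 10M-images/day customer is hypothetical, for illustration):

```python
import math

def devices_saturated(demand_per_day, items_per_sec, idle_hours=8):
    """How many contributor devices a given daily demand can keep busy."""
    capacity_per_device = items_per_sec * idle_hours * 3600
    return math.ceil(demand_per_day / capacity_per_device)

# A hypothetical customer classifying 10M images/day, against
# MobileNet-class throughput (~50 imgs/sec), saturates only a handful
# of desktops.
busy = devices_saturated(10_000_000, 50)
```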

Verdict: ⭐⭐⭐ Good margins per image, but hard to generate enough demand to keep contributors busy. Best as an occasional workload mixed in.


Workload 4: RAG Retrieval and Reranking

What it is: After vector search returns candidate documents, a reranking model scores them for relevance. Critical for RAG quality.

Market demand:

  • Cohere Rerank: $2.00 per 1,000 searches
  • Jina Reranker: $0.50-1.00 per 1,000 queries
  • Market: Nascent but growing fast — every RAG pipeline needs this. Estimated $200-500M/yr and doubling annually.

CPU performance:

  • Cross-encoder reranking models (e.g., ms-marco-MiniLM-L-6-v2): ~50-100 query-document pairs/sec on CPU
  • A typical rerank call: 1 query × 20 documents = 20 pairs → ~0.2-0.4 seconds on CPU
  • CPU is perfectly adequate for reranking. These are small BERT-class models.

Pricing:

  • Cloud: $0.50-2.00 per 1K searches
  • IdleAI: $0.20-0.50 per 1K searches
  • Contributor payout: 60%

Unit economics per device (desktop, 8 hours):

  • Throughput: ~100 pairs/sec ÷ 20 docs per query = ~5 rerank queries/sec (batch-optimized)
  • 5 × 28,800 = ~144K queries/day
  • Revenue at $0.30/1K = ~$43/day at full utilization ← again demand-constrained, not supply
  • Realistic utilization (1-5%): $0.43-2.16/day per device
  • Same issue as image classification: supply will vastly exceed demand initially

Verdict: ⭐⭐⭐ Good margins, CPU-native, but demand-constrained per device. Best bundled.


Workload 5: Text Preprocessing and Tokenization

What it is: Cleaning, chunking, tokenizing text for downstream AI processing.

Market demand:

  • Typically bundled into other services, not sold standalone
  • Some demand in data pipeline services
  • Very low $/compute — this is not a viable standalone revenue workload

Verdict: ⭐ Too cheap to meter. Not worth the orchestration overhead. Skip as a paid workload; include as "free value-add" when bundling with other services.


Workload 6: Synthetic Data Generation (Small Models)

What it is: Using small language models (Phi-3, TinyLlama, etc.) to generate training data, augment datasets.

Market demand:

  • Emerging market: Scale AI, Snorkel, Gretel charge $5-50/hr for data generation pipelines
  • Market: ~$1-2B/yr for synthetic data (Gartner predicted that 60% of the data used for AI development would be synthetically generated by 2024)

CPU performance:

  • Phi-3-mini (3.8B): ~2-5 tokens/sec on CPU with 16GB RAM — painfully slow
  • TinyLlama (1.1B): ~10-20 tokens/sec on CPU — usable for batch
  • Marginal on CPU. This workload really belongs in Tier 2 for any model above 1B params.

Verdict: ⭐⭐ Only viable with sub-1B models on CPU. Move to Tier 2 for anything useful.


Workload 7: OCR and Document Processing

What it is: Extracting text from images/PDFs, structured data extraction from documents.

Market demand:

  • AWS Textract: $1.50 per 1,000 pages (basic), $15 per 1,000 pages (forms/tables)
  • Google Document AI: $1.50-65 per 1,000 pages depending on processor
  • Azure Form Recognizer: $1.50-50 per 1,000 pages
  • Market: ~$3-5B/yr for document processing (insurance, legal, finance, healthcare)

CPU performance:

  • Tesseract OCR: ~1-3 pages/sec on modern CPU (basic text extraction)
  • PaddleOCR: ~2-5 pages/sec
  • DocTR (deep learning OCR): ~0.5-2 pages/sec on CPU
  • Layout analysis + table extraction: ~0.2-0.5 pages/sec
  • CPU is the standard for OCR. GPUs help but aren't required.

Pricing:

  • Cloud: $1.50-15.00 per 1,000 pages
  • IdleAI: $0.50-5.00 per 1,000 pages (60-70% cheaper)
  • Contributor payout: 60%

Unit economics per device (desktop, 8 hours):

  • ~2 pages/sec × 28,800 sec = 57,600 pages/day
  • Revenue at $1.00/1K pages = $57.60/day
  • Contributor earns: $34.56/day = $1,037/month at full utilization
  • Realistic utilization (10-20%): $100-200/month
  • Electricity: ~$0.05/day (full CPU load, comparable to Whisper transcription)
  • Highly profitable for contributors

Verdict: ⭐⭐⭐⭐⭐ Excellent workload. CPU-native, high margins, massive enterprise demand, batch-tolerant.


TIER 2: "GPU Owners" — LLM Inference

Addressable Contributor Base

| Segment | Devices | VRAM/RAM | Models They Can Run |
|---|---|---|---|
| NVIDIA 8GB VRAM (RTX 3060, 4060, etc.) | ~70M | 8-12GB | 7-8B models (Llama 3 8B, Mistral 7B, Phi-3) |
| NVIDIA 16GB+ (RTX 3090, 4080, 4090) | ~15M | 16-24GB | Up to 30B models, 70B quantized |
| Apple Silicon 16GB+ | ~50-80M | 16-24GB unified | 7-13B models comfortably, 30B quantized |
| Apple Silicon 32GB+ | ~15-25M | 32GB+ unified | Up to 70B quantized |
| Crypto miners (idle) | ~5-10M | 8-16GB typically | 7-8B models |
| Total addressable | ~150-200M devices | | |

Realistic early contributors: 1K-10K GPU nodes (need to prove payout before scaling)

LLM Inference Workloads

Market demand:

  • OpenAI API revenue: ~$5-10B/yr (2025)
  • Anthropic, Google, Mistral combined: ~$3-5B/yr
  • Total LLM API market: ~$15-25B/yr and growing 50-100% annually
  • Together.ai, Fireworks, Groq, etc. selling inference at 50-80% discount to OpenAI

Performance on consumer hardware:

| Model | Hardware | Tokens/sec | Latency (first token) |
|---|---|---|---|
| Llama 3 8B Q4 | RTX 4060 8GB | 40-60 tok/s | 200-400ms |
| Llama 3 8B Q4 | M2 16GB | 25-40 tok/s | 300-600ms |
| Llama 3 70B Q4 | RTX 4090 24GB (partial CPU offload; Q4 weights run ~40GB) | 2-8 tok/s | 1-3s |
| Llama 3 70B Q4 | M3 Max 64GB | 15-20 tok/s | 2-4s |
| Mistral 7B Q4 | RTX 3060 12GB | 35-50 tok/s | 200-500ms |
| Phi-3 mini 3.8B | RTX 3060 | 60-90 tok/s | 100-200ms |

Cloud comparison: A100 serves Llama 3 8B at ~200+ tok/s per concurrent user. Consumer hardware is 3-10x slower per device, but effectively free compute (contributor provides the hardware).

Pricing:

  • OpenAI GPT-4o-mini: $0.15/$0.60 per 1M tokens (input/output)
  • Together.ai Llama 3 8B: $0.10/$0.10 per 1M tokens
  • IdleAI Llama 3 8B: $0.03-0.06 per 1M tokens (70-80% cheaper than Together)
  • IdleAI Llama 3 70B: $0.20-0.40 per 1M tokens (70% cheaper than cloud)

Unit economics per device (RTX 4060, 8 hours idle):

  • Llama 3 8B at 50 tok/s = 1.44M tokens/8hr shift
  • Revenue at $0.05/1M tokens (blended) = $0.07/day ← Terrible
  • Reality check: At current open-model pricing ($0.10/1M tokens on Together.ai), there's almost no margin for a distributed network.

The LLM pricing problem: Open model inference is commoditized to near-zero. Together.ai charges $0.10/1M tokens for Llama 3 8B. To undercut them AND pay contributors, you'd need volume in the billions of tokens/day.

Revised LLM approach: Target use cases where latency tolerance is high and price sensitivity is extreme:

  • Batch processing (not real-time chat)
  • Dev/test environments (developers testing against Llama before deploying to prod)
  • Fine-tuned model hosting (run YOUR fine-tuned model — cloud providers charge 2-5x for custom models)
  • Privacy-sensitive inference (prompts encrypted in transit and routed only to vetted or customer-designated contributor pools, rather than a big cloud provider)

Revised unit economics (batch LLM, RTX 4060, 8 hours):

  • Higher throughput in batch mode: ~80 tok/s sustained
  • 2.3M tokens/day
  • Revenue at $0.15/1M tokens (custom/batch premium): $0.35/day
  • Contributor earns: $0.21/day = $6.30/month
  • Electricity: ~$0.15/day (150W GPU × 8hrs × $0.12/kWh)
  • Net contributor profit: ~$1.80/month ← Still marginal
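Plugging the plan's batch-LLM assumptions into the same per-device arithmetic (80 tok/s sustained, $0.15/1M tokens, 60% payout, 150W at $0.12/kWh) reproduces the marginal result:

```python
def gpu_llm_month(tok_per_sec=80, price_per_m=0.15, idle_hours=8,
                  payout_share=0.60, watts=150, kwh_price=0.12, days=30):
    """Monthly contributor net for batch LLM inference on one GPU."""
    tokens_per_day = tok_per_sec * idle_hours * 3600
    gross_day = tokens_per_day / 1e6 * price_per_m
    payout_month = gross_day * payout_share * days
    electricity_month = watts / 1000 * idle_hours * kwh_price * days
    return payout_month - electricity_month

net = gpu_llm_month()  # a couple of dollars a month, as the plan concludes
```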

The honest answer on LLM inference: At 2026 pricing, distributed LLM inference is a loss leader or community feature, not a profit center. It's useful for:

  1. Marketing ("run Llama on our network!")
  2. Handling overflow from CPU workloads
  3. Custom/fine-tuned model hosting (where cloud alternatives are expensive)
  4. Privacy-focused inference

Verdict: ⭐⭐ Important for marketing and product positioning, but margins are near-zero. Not the revenue driver.


Revenue Model & Mix

Realistic Revenue Mix (Year 1-2)

| Workload | % of Revenue | Why |
|---|---|---|
| Speech-to-text (Whisper) | 35% | Best CPU economics, huge demand |
| OCR / Document processing | 30% | High margins, enterprise buyers |
| Embedding generation | 10% | High volume, low margin |
| Reranking / retrieval | 10% | Growing demand, CPU-native |
| Image classification | 5% | Niche but profitable |
| LLM inference | 10% | Loss leader, marketing value |

Key takeaway: 90% of revenue comes from CPU workloads. LLM inference is the shiny marketing story; CPU workloads are the actual business.

Pricing Structure

For customers (API buyers):

  • Pay-as-you-go API pricing (50-75% cheaper than cloud incumbents)
  • Volume discounts at $1K/mo, $5K/mo, $10K/mo tiers
  • SLA tiers: Best-effort (cheapest), Guaranteed (higher price, redundant processing)

For contributors (compute sellers):

  • 60% revenue share on all workloads
  • Weekly payouts via Stripe/PayPal (minimum $5)
  • Dashboard showing earnings, uptime, jobs completed
  • Bonus multipliers for reliability (99%+ uptime) and speed

Unit Economics Summary

Per-Device Monthly Economics (8 hrs idle/day, 20% utilization)

| Device Type | Primary Workload | Gross Revenue | Contributor Payout (60%) | Electricity Cost | Net to Contributor |
|---|---|---|---|---|---|
| Laptop (8GB, CPU) | Whisper + embeddings | $8-15/mo | $5-9/mo | $3-5/mo | $0-4/mo |
| Desktop (16GB, CPU) | Whisper + OCR | $20-40/mo | $12-24/mo | $5-8/mo | $4-16/mo |
| Desktop + RTX 4060 | Whisper + OCR + LLM | $25-45/mo | $15-27/mo | $8-12/mo | $3-15/mo |
| Mac M2 16GB | Whisper + LLM | $15-30/mo | $9-18/mo | $2-4/mo | $5-14/mo |
| Desktop + RTX 4090 | All workloads | $35-60/mo | $21-36/mo | $12-18/mo | $3-18/mo |

Honest assessment: Most Tier 1 contributors will earn $0-15/month — barely covering electricity. The pitch needs to be:

  1. "Beer money" — not "quit your job" income
  2. Altruism angle — "help democratize AI" (like SETI@home, most people didn't care about earnings)
  3. Gamification — leaderboards, badges, contribution streaks
  4. The real earners are people with multiple machines or always-on desktops

Platform Economics

| Metric | Month 6 | Month 12 | Month 18 |
|---|---|---|---|
| Contributors | 2,000 | 15,000 | 50,000 |
| Active (8+ hrs/day) | 800 | 6,000 | 20,000 |
| API customers | 20 | 100 | 300 |
| Monthly GMV | $5K | $50K | $250K |
| Platform revenue (40%) | $2K | $20K | $100K |
| Infrastructure costs | $3K | $8K | $20K |
| Net | -$1K | $12K | $80K |
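The net line is just GMV × take rate minus infrastructure. A small helper makes the relationship explicit and lets you stress-test other GMV paths:

```python
def platform_net(gmv, infra_cost, take_rate=0.40):
    """Platform revenue and net at a given monthly GMV and cost base."""
    revenue = gmv * take_rate
    return revenue, revenue - infra_cost

# The three projection columns:
m6 = platform_net(5_000, 3_000)      # slightly negative
m12 = platform_net(50_000, 8_000)    # solidly positive
m18 = platform_net(250_000, 20_000)  # real margin
```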

Product & App Experience

Tier 1: Install and Forget

Onboarding (< 2 minutes):

  1. Download app (macOS, Windows, Linux)
  2. Create account (email or Google)
  3. App auto-benchmarks hardware (30 seconds)
  4. Set preferences: "Run when idle" / "Run always" / "Run on schedule"
  5. Set resource limits: "Use up to 50% CPU" / "Max 4GB RAM"
  6. Done. App sits in system tray.

Ongoing experience:

  • System tray icon shows status (idle/working/earning)
  • Weekly earning summary notification
  • Monthly payout
  • No model selection, no configuration, no technical knowledge needed
  • Auto-updates models and workload types silently

Technical architecture:

  • Lightweight agent (~50MB install)
  • Pulls work units from central coordinator
  • Executes in sandboxed container (security)
  • Returns results, gets credit
  • Coordinator handles load balancing, quality verification, redundancy
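A minimal sketch of the agent's pull-execute-report loop. Everything here is a hypothetical shape, not a specified protocol: `WorkUnit`, `pull`, and `push` stand in for the coordinator API, and the sandboxed execution step is stubbed out.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class WorkUnit:
    job_id: str
    workload: str   # e.g. "embeddings", "transcription", "ocr"
    payload: bytes

def run_sandboxed(unit: WorkUnit) -> bytes:
    # Stub: a real agent would dispatch to ONNX Runtime or llama.cpp
    # inside a sandboxed container. Here we just digest the input.
    return hashlib.sha256(unit.payload).digest()

def process_one(pull, push) -> bool:
    """Pull one work unit, execute it, report the result.

    Returns False when the coordinator has no work (caller backs off).
    """
    unit = pull()
    if unit is None:
        return False
    push(unit.job_id, run_sandboxed(unit))
    return True
```

The real agent would wrap `process_one` in an idle-detection loop and honor the user's CPU/RAM limits before pulling work.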

Tier 2: Light Configuration

Additional onboarding for GPU users:

  1. App detects GPU/Apple Silicon automatically
  2. Shows: "You qualify for premium workloads! Enable GPU acceleration?"
  3. Optional: Choose which models to download (defaults to recommended)
  4. First model download: 4-8GB, takes 5-10 minutes
  5. Done. GPU work is prioritized when available.

Ongoing experience:

  • Same as Tier 1 plus GPU utilization stats
  • Can opt into specific models or leave on auto
  • Higher earnings dashboard
  • Model management: delete/add models as desired

Why Multi-Workload Beats Pure LLM

| Factor | LLM-Only Platform | IdleAI Multi-Workload |
|---|---|---|
| Contributor base | ~50-150M (GPU owners only) | ~500M+ (any modern PC) |
| Contributor earnings | $0-6/month (razor thin) | $5-40/month (viable on CPU workloads) |
| Revenue per API call | $0.00001-0.0001 | $0.001-0.01 (Whisper/OCR much higher) |
| Utilization | Low (LLM demand is spiky) | High (diverse workloads fill gaps) |
| Competition | Together.ai, Fireworks, Groq (well-funded) | Few competitors in distributed CPU AI |
| Network effect | Need massive GPU fleet for reliability | Even small network handles batch CPU work |
| Cold start | Hard (need many GPUs before useful) | Easy (a few hundred CPUs serve real customers) |

The fatal flaw of LLM-only distributed platforms:

  • Cloud LLM inference is plummeting in price (80% drop in 18 months)
  • A single H100 replaces hundreds of consumer GPUs
  • Latency requirements for chat mean you can't distribute across home internet
  • Consumer GPUs are only idle 8-12 hrs/day — unreliable for SLA-bound customers

Why CPU workloads work better:

  • Batch-tolerant (nobody needs an embedding in <100ms)
  • CPUs are always available (don't conflict with gaming)
  • No VRAM limitations
  • Workloads are embarrassingly parallel (split across thousands of machines trivially)
  • Enterprise customers are used to paying real money for OCR/transcription

Go-to-Market Strategy

Phase 1: Supply-Side (Months 1-3)

Goal: 1,000 contributors

  • Launch on Hacker News, Reddit r/passive_income, r/beermoney
  • "SETI@home for AI" narrative — nostalgia + novelty
  • Open-source the contributor agent (trust + contributions)
  • Benchmark tool: "See how much your computer could earn" (viral calculator)
  • Discord community for contributors

Phase 2: Demand-Side (Months 3-6)

Goal: 20 paying API customers

  • Target indie developers and small startups
  • API compatible with OpenAI/Cohere endpoints (drop-in replacement)
  • Free tier: 10K API calls/month
  • Content marketing: "We cut our embedding costs by 70%" case studies
  • Anchor customers: Approach podcasters/YouTubers for batch transcription (clear value prop, clear savings)

Phase 3: Scale (Months 6-18)

Goal: $50K+ MRR

  • Enterprise pilots (SOC2 compliance, dedicated contributor pools)
  • Geographic distribution as a feature ("process data in 50+ countries")
  • Specialized workload partnerships (legal OCR, medical transcription)
  • Referral program: contributors earn bonus for inviting others

Marketing Messages

For contributors:

  • "Earn money while you sleep. Your computer works, you get paid."
  • "Join 10,000 people powering the future of AI — from their living rooms."
  • "Your idle laptop could earn $5-20/month. Here's how."

For customers:

  • "AI APIs at 70% less. Same accuracy, no vendor lock-in."
  • "Transcribe 10,000 hours of audio for the price of 3,000."
  • "The world's most distributed AI compute network."

Technical Architecture (Solo Founder Scope)

MVP Components (Month 1-2)

  1. Coordinator Service — Go or Rust, deployed on a single $20/mo VPS

    • Work queue (Redis)
    • Result verification (run same job on 2 contributors, compare)
    • API gateway for customers
    • Contributor management
  2. Contributor Agent — Python + Electron or Tauri wrapper

    • Model runtime (ONNX Runtime for CPU, llama.cpp for GPU)
    • Sandboxed execution
    • Auto-update mechanism
    • System tray app
  3. API Layer — OpenAI-compatible endpoints

    • /v1/embeddings
    • /v1/audio/transcriptions
    • /v1/images/classify (custom)
    • /v1/documents/ocr (custom)
    • /v1/chat/completions (for LLM)
  4. Dashboard — Simple React app

    • Contributor: earnings, stats, settings
    • Customer: usage, billing, API keys
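Because the endpoints mirror OpenAI's, customers should only need to change the base URL and API key. A sketch of building such a request with the standard library (the base URL is a placeholder, and the wire format copies OpenAI's `/v1/embeddings` shape):

```python
import json
import urllib.request

BASE_URL = "https://api.idleai.example/v1"  # placeholder, not a real host

def embeddings_request(api_key: str, texts: list,
                       model: str = "bge-small-en") -> urllib.request.Request:
    """Build (not send) an OpenAI-compatible /v1/embeddings request."""
    body = json.dumps({"model": model, "input": texts}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```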

Quality Assurance

  • Redundant execution: Each job runs on 2+ contributors; results compared
  • Spot checks: 5% of jobs also run on a trusted server for ground truth
  • Reputation system: Contributors build trust score over time; high-reputation contributors get single-execution jobs (more efficient)
  • Cryptographic verification: Hash inputs/outputs to prevent tampering
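The redundant-execution check reduces to canonicalizing and hashing each result, then requiring agreement. A sketch, with a hypothetical bounded reputation update (the 0.05/0.20 step sizes are illustrative, not from the plan):

```python
import hashlib
import json

def canonical_digest(result) -> str:
    # Canonical JSON (sorted keys) so formatting differences between
    # contributors don't cause false mismatches.
    return hashlib.sha256(
        json.dumps(result, sort_keys=True).encode()
    ).hexdigest()

def results_agree(results) -> bool:
    """True when every contributor produced an identical canonical result."""
    return len({canonical_digest(r) for r in results}) == 1

def updated_reputation(score: float, agreed: bool) -> float:
    # Illustrative: slow gain on agreement, fast loss on disagreement,
    # clamped to [0, 1]. High scores unlock single-execution jobs.
    return min(1.0, score + 0.05) if agreed else max(0.0, score - 0.20)
```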

What to Build First

  1. Whisper transcription (highest margin, clearest value prop)
  2. Embedding generation (highest volume, easiest to implement)
  3. OCR (high margin, enterprise appeal)
  4. LLM inference (last — hardest, lowest margin, but needed for marketing story)

Financial Projections (Bootstrapped)

Costs (Monthly)

| Item | Month 1 | Month 6 | Month 12 |
|---|---|---|---|
| Infrastructure (VPS, Redis, etc.) | $100 | $500 | $2,000 |
| Verification compute (spot checks) | $50 | $300 | $1,500 |
| Payment processing | $0 | $200 | $1,000 |
| Domain, email, misc | $50 | $50 | $100 |
| Total | $200 | $1,050 | $4,600 |

No salary — founder lives off savings/other income. No office. No employees.

Revenue

| Month | Contributors | Customers | GMV | Platform Revenue (40%) |
|---|---|---|---|---|
| 3 | 500 | 5 | $1,000 | $400 |
| 6 | 2,000 | 20 | $5,000 | $2,000 |
| 9 | 8,000 | 50 | $20,000 | $8,000 |
| 12 | 15,000 | 100 | $50,000 | $20,000 |
| 18 | 50,000 | 300 | $250,000 | $100,000 |

  • Break-even: Month 5-6 ($1K MRR)
  • Ramen profitable: Month 8-9 ($5K MRR)
  • Real business: Month 14-16 (~$50K MRR)


Risks & Mitigations

| Risk | Severity | Mitigation |
|---|---|---|
| Cloud prices drop further, eliminating margin | High | Focus on workloads where our edge isn't price but distribution (privacy, geographic) |
| Contributors churn when earnings are low | High | Gamification, community, altruism angle; be honest about earnings from day 1 |
| Quality/reliability concerns from customers | High | Redundant execution, reputation system, SLA with refunds |
| Security (malicious contributors) | High | Sandboxing, result verification, encrypted data in transit |
| Single point of failure (coordinator) | Medium | Multi-region coordinator, contributor-side caching for resilience |
| Incumbents copy the model | Medium | Network effects + community moat; they'd cannibalize their own cloud revenue |
| Regulatory (data processing across jurisdictions) | Medium | Allow customers to restrict to specific countries; GDPR compliance mode |

Competitive Landscape

| Company | Model | Weakness vs IdleAI |
|---|---|---|
| Together.ai | Cloud GPU inference | Not distributed, higher cost basis |
| Vast.ai | GPU marketplace | GPU only, technical users only |
| Akash Network | Decentralized cloud (crypto) | Complex, crypto-native, not consumer-friendly |
| Render Network | Distributed GPU rendering | GPU only, rendering focus |
| Bittensor | Crypto-incentivized AI | Complex tokenomics, crypto-native |
| Salad.cloud | Distributed GPU cloud | GPU only, gaming rigs |

IdleAI's unique position: The only platform that makes ANY computer useful for AI workloads, not just GPU machines. The SETI@home of AI.


Key Metrics to Track

  1. Contributor metrics: Active contributors, uptime %, jobs completed, churn rate
  2. Customer metrics: API calls/day, revenue/customer, churn, NPS
  3. Platform metrics: Job completion time, quality score, cost per job
  4. Economics: GMV, take rate, contributor payout ratio, infrastructure cost %

Summary: What Makes This Work

  1. CPU workloads are the real business. LLM inference is marketing; Whisper, OCR, and embeddings are revenue.
  2. SETI@home proved the model. Millions of people will donate compute for a good cause; paying them is even better.
  3. 50-75% cheaper than cloud is possible because contributor hardware is a sunk cost with near-zero marginal cost.
  4. Batch tolerance is the key. We don't compete on latency; we compete on price for workloads where "done in 5 minutes" beats "done in 5 seconds at 10x the price."
  5. Solo founder can build this. Coordinator + agent + API + dashboard. The hard part isn't the tech — it's the two-sided marketplace.

The one thing that must be true: Enough customers willing to trade latency for 50-75% cost savings on AI workloads. If that's true, everything else follows.


Plan version: 2.0 — February 2026
Author: [Founder Name]
Status: Pre-launch