Tools: I Built a Free Synthetic Data Generator - Here's How (React + Tailwind)

Source: Dev.to

We've all been there: you need 10,000 realistic user records to test your app, a batch of fake healthcare data for a demo, or transaction logs to stress-test a dashboard. So you either:

- Write a janky script that generates "User_1, User_2, User_3..."
- Spend 30 minutes configuring a CLI tool
- Use a SaaS tool that limits you to 100 rows on the free tier

I got tired of this, so I built DataForge, a free, browser-based synthetic data generator that creates realistic fake data instantly. No signup, no server, no limits.

## What It Does

DataForge generates realistic fake data across 9 data types in two categories:

## 📊 General Data

- Users: Names, emails, phones, DOB, company, job title
- Addresses: Street, city, state, ZIP, coordinates
- Transactions: Amounts, merchants, categories, status

## 🏥 Healthcare / HIPAA-Safe

- Patients: MRN, blood type, allergies, conditions, insurance
- Medical Records: ICD-10 codes, vitals, visit types, clinical notes
- Prescriptions: Real medications, dosages, DEA numbers, NDC codes
- Lab Results: 26 real lab tests with reference ranges and flags
- Insurance Claims: Charged/allowed/paid amounts, claim status
- Healthcare Providers: NPI numbers, specialties, credentials

All data is 100% synthetic: no real patient data, no HIPAA risk.

## Key Features

## ⚡ Generate Up to 50,000 Records Instantly

Everything runs client-side. No API calls, no server. Your data never leaves your browser.

## 🎬 Custom Scenarios

- 🧓 Elderly Patient Cohort (ages 65+)
- 👶 Pediatric Cohort (ages 0-17)
- 🚨 Critical Lab Values
- 💰 High-Value Transactions ($5K+)
- 🕵️ Fraud Patterns
- ❌ Denied Claims Batch
- 🗑️ Dirty/Messy Data (with nulls and errors)

This is where it gets powerful. Instead of random data, you can define rules:

- Set null rates per field (0-80%)
- Define value ranges (age 65-95, amount $10K+)
- Force specific values (status = "Denied")
- Add custom value pools
- Control duplicate rates and error injection

## 📤 Export Formats

- JSON: Standard structured data
- CSV: For spreadsheets and databases
- SQL: Ready-to-run INSERT statements
- HL7 FHIR: Healthcare interoperability standard

## 🌱 Reproducible with Seeds

Set a seed value and get the exact same data every time. Perfect for consistent test suites.

## How I Built It

## Tech Stack

- React 18 + TypeScript
- Tailwind CSS: Dark HD interface
- Vite: Fast builds
- Custom seeded PRNG: No external faker library needed

## The Seeded Random Number Generator

Instead of using `Math.random()` (which isn't seedable), I built a small custom PRNG. The version below is a Lehmer-style (Park-Miller) multiplicative congruential generator:

```typescript
// Lehmer (Park-Miller) generator: state' = state * 16807 mod (2^31 - 1)
class SeededRandom {
  private seed: number;

  constructor(seed: number) {
    // Keep the state in 1..2147483646; a state of 0 would stay 0 forever
    const s = Math.floor(Math.abs(seed)) % 2147483647;
    this.seed = s > 0 ? s : 1;
  }

  // Returns a float strictly between 0 and 1
  next(): number {
    this.seed = (this.seed * 16807) % 2147483647;
    return this.seed / 2147483647;
  }

  // Returns an integer in [min, max], both ends inclusive
  nextInt(min: number, max: number): number {
    return Math.floor(this.next() * (max - min + 1)) + min;
  }

  // Returns one element of a non-empty array
  pick<T>(array: T[]): T {
    return array[this.nextInt(0, array.length - 1)];
  }
}
```

---

If you work in Health IT, you know the pain:

- ❌ You can't use real patient data for testing (HIPAA)
- ❌ Epic/Cerner sandboxes have limited test patients
- ❌ Synthea is powerful but requires Java + CLI setup
- ❌ Most online generators don't understand healthcare data

DataForge fills this gap:

- ✅ FHIR-native export: Generate valid FHIR Bundles
- ✅ Real ICD-10 & CPT codes: Not random strings
- ✅ Clinical scenarios: Elderly cohorts, critical labs, denied claims
- ✅ Runs in the browser: Share the URL with your QA team
- ✅ 50K records: Enough for load testing

---

## What's Next

I'm planning to add:

- Custom schema builder (define your own data types)
- API endpoint mode (use as a mock API)
- Relationships between tables (foreign keys)
- More healthcare standards (HL7v2 messages, C-CDA)
- Localization (non-US names, addresses, phone formats)

If this tool saves you time, drop a ⭐ on the repo or leave a comment. I'd love to hear how you're using it!
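
To make the rule and seed ideas from the Custom Scenarios and Reproducible with Seeds sections a bit more concrete, here is a minimal sketch of how per-field rules (null rate, value range, forced value) could be driven by a seeded generator. It is not DataForge's actual code: the `FieldRule` shape, the `generateRows` function, and the `elderlyDenied` field names are all hypothetical, and the generator is just the multiplicative one from the post, repeated so the block runs on its own. It assumes a positive integer seed below 2147483647.

```typescript
// Hypothetical rule shape; DataForge's real configuration format is not shown in the post.
interface FieldRule {
  nullRate?: number;         // probability (0 to 1) that the field comes out null
  range?: [number, number];  // inclusive integer range for generated values
  forced?: string | number;  // fixed value that overrides generation entirely
}

// Same multiplicative generator as in the post, repeated for self-containment.
class SeededRandom {
  private seed: number;
  constructor(seed: number) {
    this.seed = seed; // assumed: integer in 1..2147483646
  }
  next(): number {
    this.seed = (this.seed * 16807) % 2147483647;
    return this.seed / 2147483647;
  }
  nextInt(min: number, max: number): number {
    return Math.floor(this.next() * (max - min + 1)) + min;
  }
}

type Row = Record<string, string | number | null>;

// Builds `count` rows, applying each field's rule in a fixed order so that
// the same seed always consumes random numbers in the same sequence.
function generateRows(
  rules: Record<string, FieldRule>,
  count: number,
  seed: number
): Row[] {
  const rng = new SeededRandom(seed);
  const rows: Row[] = [];
  for (let i = 0; i < count; i++) {
    const row: Row = {};
    for (const field of Object.keys(rules)) {
      const rule = rules[field];
      if (rule.forced !== undefined) {
        row[field] = rule.forced; // forced values win over everything else
      } else if (rule.nullRate !== undefined && rng.next() < rule.nullRate) {
        row[field] = null; // inject a null at the configured rate
      } else {
        const range: [number, number] =
          rule.range !== undefined ? rule.range : [0, 100];
        row[field] = rng.nextInt(range[0], range[1]);
      }
    }
    rows.push(row);
  }
  return rows;
}

// Illustrative scenario: elderly patients with denied claims and a sparse copay field.
const elderlyDenied: Record<string, FieldRule> = {
  age: { range: [65, 95] },
  status: { forced: "Denied" },
  copay: { range: [0, 500], nullRate: 0.3 },
};

const first = generateRows(elderlyDenied, 5, 42);
const second = generateRows(elderlyDenied, 5, 42);
console.log(JSON.stringify(first) === JSON.stringify(second)); // true
```

Because every random draw comes from the one seeded generator, and fields are visited in a fixed order, two runs with the same seed and rules produce identical rows. That is the property a test suite relies on when it pins a seed.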