⚠️ EDUCATIONAL USE ONLY: All patient data is 100% SYNTHETIC. NEVER use for actual clinical decisions or patient care.
Dataset Marketplace | 100% Synthetic | HIPAA-Free by Design

The Data Infrastructure Healthcare Education Is Missing

Real patient data is locked behind HIPAA, IRBs, and data use agreements. We generate realistic synthetic patient records so data scientists, ML engineers, and healthcare students can work with production-quality clinical data instantly.

What We Are

A dataset marketplace.
We sell data.

PatientDatasets.com is a storefront for downloadable synthetic patient records. The data happens to be healthcare data, but the buyer pool is anyone who needs realistic patient data -- from a data science student in India to a Kaggle competitor to a PhD candidate at MIT.

Primary Market

Data Scientists & ML Engineers

Feature engineering, model training, portfolio projects, pipeline testing, algorithm benchmarking. Healthcare AI is projected to be a $45B market by 2026 -- and all of it needs training data. We provide ML-ready files in CSV, JSON, FHIR R4, and Parquet.

Python | pandas | scikit-learn | Jupyter

Secondary Market

Healthcare Students

Medical coding (CPC/CCS/CCA), billing, RCM, HIM, nursing, pharmacy, and informatics students who need realistic patient records for practice and certification prep. Paired with discipline-specific workbooks and answer keys.

ICD-10 | CPT/HCPCS | Claims | EOBs

Tertiary Market

Administrators & Vendors

Healthcare administrators, EHR vendors, AI startups, clinical research teams, and public health analysts who need IRB-free, HIPAA-free synthetic patient data on demand for testing, demos, and development.

FHIR R4 | HL7 | C-CDA | Custom

What We're Not

Clear boundaries, clear product.

We get asked this often enough that it's worth spelling out explicitly.

Not a coding school

We don't teach medical coding. We sell the patient records that coding students practice on. The data is the product.

Not a billing course

We don't run a billing curriculum. We provide the synthetic claims, EOBs, and remittance data that billing students use for hands-on practice.

How the Data Is Made

Built by The Generator. 100% synthetic.

Every patient record is AI-generated from scratch -- not sampled, not anonymized, not derived from real patients. The result is clinically realistic data that is HIPAA-free by design.

Dedicated Infrastructure

The Generator runs on a Spark DSX NAS with 11TB of storage. Records are generated continuously and validated through a multi-agent QA pipeline before release.

Zero Real Patient Data

Every name, diagnosis, lab result, and billing record is fictional. HIPAA's Privacy Rule, Security Rule, and Breach Notification Rule do not apply. No IRB. No DUA. No BAA. Buy and download.

Clinically Realistic

Records pass clinical plausibility checks: diagnoses map to appropriate procedures, lab values fall within realistic ranges, medication dosages match conditions, and billing codes align with documentation.
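As a rough illustration of what checks like these can look like, here is a minimal sketch in Python. It is not The Generator's actual QA pipeline. The `Record` layout, the field names, the bounds in `LAB_BOUNDS`, and the mapping in `EXPECTED_PROCEDURES` are all assumptions made up for this example, and the numbers are illustrative only, not clinical reference ranges.

```python
from dataclasses import dataclass, field

# Hypothetical plausibility bounds (illustrative only, NOT clinical reference ranges).
LAB_BOUNDS = {
    "glucose_mg_dl": (20.0, 1000.0),
    "hba1c_percent": (3.0, 20.0),
}

# Hypothetical mapping from a diagnosis code to procedure codes treated as
# consistent with it. A real rule set would be far larger.
EXPECTED_PROCEDURES = {
    "E11.9": {"83036", "82947"},
}


@dataclass
class Record:
    """Assumed minimal record shape for this sketch."""
    diagnoses: list[str]
    procedures: list[str]
    labs: dict[str, float] = field(default_factory=dict)


def plausibility_issues(record: Record) -> list[str]:
    """Return human-readable problems; an empty list means no rule fired."""
    issues = []

    # Lab values must fall inside the configured bounds (unknown labs are skipped).
    for name, value in record.labs.items():
        bounds = LAB_BOUNDS.get(name)
        if bounds is None:
            continue
        low, high = bounds
        if not (low <= value <= high):
            issues.append(f"lab {name}={value} outside [{low}, {high}]")

    # Each mapped diagnosis should be accompanied by at least one expected procedure.
    for dx in record.diagnoses:
        expected = EXPECTED_PROCEDURES.get(dx)
        if expected is not None and not expected & set(record.procedures):
            issues.append(f"diagnosis {dx} has no matching procedure")

    return issues
```

Returning a list of issues rather than a single pass/fail flag makes it easier to see which rule rejected a record when tuning the rules.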

87 fields per record

Every patient record includes complete clinical and financial data across these categories:

Demographics
ICD-10-CM diagnoses
ICD-10-PCS procedures
CPT / HCPCS codes
Medications (NDC/RxNorm)
Lab results (LOINC)
Clinical notes
Encounter history
Charge capture
Claim submissions
Remittance / EOB
Denials & appeals

Three Product Lines

Data. Workbooks. Instructor resources.

The datasets are the foundation. Everything else builds on top of them.

Core Product

Synthetic Patient Datasets

Downloadable bundles of complete patient records with clinical and financial data. Available in CSV, JSON, FHIR R4, and Parquet. ML-ready out of the box.

CSV | JSON | FHIR R4 | Parquet
$9 – $149
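For readers who plan to work with the FHIR R4 files, the sketch below shows one way to flatten `Patient` resources from a FHIR Bundle into plain rows using only the Python standard library. The bundle here is a tiny hand-written example, not an excerpt from a PatientDatasets.com download. The output column names and the `patient_rows` helper are assumptions for illustration, and the actual files may be organized differently.

```python
import json

# A tiny hand-written FHIR R4 Bundle for illustration; ids and names are made up.
BUNDLE_JSON = """
{
  "resourceType": "Bundle",
  "type": "collection",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "example-1",
                  "gender": "female", "birthDate": "1980-04-12",
                  "name": [{"family": "Sample", "given": ["Ada"]}]}},
    {"resource": {"resourceType": "Observation", "id": "obs-1",
                  "status": "final", "code": {"text": "example"}}}
  ]
}
"""


def patient_rows(bundle: dict) -> list[dict]:
    """Collect one flat dict per Patient resource found in the bundle."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Patient":
            continue  # skip Observations, Encounters, and other resource types
        # Patient.name is a list of HumanName objects; use the first if present.
        name = (resource.get("name") or [{}])[0]
        rows.append({
            "id": resource.get("id"),
            "family": name.get("family"),
            "given": " ".join(name.get("given", [])),
            "gender": resource.get("gender"),
            "birth_date": resource.get("birthDate"),
        })
    return rows


rows = patient_rows(json.loads(BUNDLE_JSON))
```

Rows in this shape can be passed to `csv.DictWriter` or to a DataFrame constructor for further analysis.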

Education

Student Workbooks

11 discipline-specific workbooks with homework assignments, practice exercises, case studies, and answer keys. Data science, medical coding, billing, RCM, HIM, nursing, pharmacy, and more.

Data Science | Coding | Billing | +8 more
$39 – $49

For Instructors

Instructor Resources

Answer keys, auto-grading scripts, rubric templates, syllabus templates, and semester planning guides. Professor-only materials sold separately from student workbooks.

Answer Keys | Auto-Graders | Rubrics
$99 – $1,499

Get in Touch

Questions? We're here.

Whether you need a custom dataset, have a question about our workbooks, or want to discuss enterprise pricing -- drop us a line.

support@patientdatasets.com