The integration team had spent four months building against the EHR vendor's sandbox. The sandbox was STU3. The documentation was STU3. The example payloads were STU3. When they finally got production access — after contracting was done, the security review was complete, and the go-live date was on the calendar — they discovered that production ran FHIR R4. The Patient resource had changed enough that their parser threw errors on names and addresses. The MedicationRequest resource had been restructured in ways their requester extraction code couldn't handle. Their Observation panel-result aggregation silently dropped half the lab results because the related element it was traversing no longer existed. Their launch date slipped by six weeks.

That scenario plays out regularly enough that it has become a known hazard in healthcare interoperability work. Sandbox environments lag behind production. Vendors migrate at different paces. Documentation doesn't always reflect reality. And FHIR versions are different enough that assuming compatibility between STU3 and R4 is a mistake that shows up in production — not in testing, not in code review, not in any stage where it is cheap to fix.

It is 11 PM on a Sunday. Go-live is in nine hours. Your senior engineer is on a call with the EHR vendor's support team, walking through JSON payloads. The vendor's sandbox was STU3. Production is R4. You didn't know until access was provisioned forty-eight hours ago. Your requester path is wrong. Your Observation panel logic is wrong. Your Coverage grouping structure is wrong. You have until 8 AM to fix three resource parsers in a codebase you've been building for four months. This is not a hypothetical. This is Sunday night for real integration teams, more often than anyone in healthcare IT publicly admits.

This article is a comprehensive technical guide to every consequential difference between FHIR STU3 and FHIR R4 — the specific resource-level changes, the federal regulatory context that makes R4 non-negotiable for US healthcare, the testing strategies that catch version mismatches before production, and the synthetic data you need to exercise every code path for both versions. It is long because the subject is genuinely complex, and the teams who get burned are the ones who thought they could skim.

A Brief History of FHIR Versions: From DSTU1 to R5

HL7 International published the first FHIR draft specification in 2012, and the standard has evolved through several major versions since. Understanding that history matters because healthcare systems don't upgrade on synchronized schedules — you will encounter all of these versions in the wild, and knowing the lineage helps you understand why specific design decisions were made.

DSTU1 (Draft Standard for Trial Use 1, 2014) was an early public draft, primarily used by early adopters and researchers to explore the concept. Very few production systems were built on DSTU1, and essentially none survive in active use today. If you encounter a DSTU1 system, you are dealing with legacy infrastructure that is a decade overdue for replacement.

DSTU2 (2015) was the version that first achieved meaningful adoption. The Argonaut Project — a coalition of EHR vendors including Epic, Cerner, athenahealth, and others — built their initial SMART on FHIR implementations on DSTU2. You will occasionally encounter DSTU2 endpoints at older enterprise EHR installations. The Argonaut profiles based on DSTU2 are still technically available but represent a dead end for new development.

STU3 (Standard for Trial Use 3, released March 2017) was a substantial maturation of the standard. It introduced cleaner resource definitions, better extension mechanisms, and the conformance resource framework. Many EHR vendors did significant implementation work against STU3, and it became the de facto standard for the 2018-2020 period. STU3 is still running in production at a significant number of health systems. Informal estimates put the share of US hospital FHIR endpoints still serving STU3 somewhere between 25% and 35% as of 2026, particularly at smaller or community hospitals with slower upgrade cycles; treat that range as a rough indication rather than a measured figure.

R4 (Release 4, released January 2019) was the first release designated as "normative" for its core components. In FHIR terminology, normative means that the specification has been through a rigorous ballot and review process, and the core elements will not change in backward-incompatible ways without a new major release. R4 is the version mandated by US federal regulation, required by the ONC's 21st Century Cures Act final rule and the CMS Interoperability and Patient Access final rule. It is the current standard for all new healthcare application development in the US.

R4B (2022) was a minor "point release" that addressed specific issues in a handful of resources, primarily the Medication resources, Citation, and several clinical reasoning resources. R4B is backward compatible with R4 in almost all respects. Most development teams do not need to treat R4B as a separate target; if you build against R4, you will handle R4B correctly for any resources that weren't affected by the R4B changes. The affected resources are largely niche; only teams specifically working with citation management or the affected medication resources need to pay close attention to R4B differences.

R5 (released March 2023) is the most recent major release. It introduces significant changes to several resources and adds new capabilities around subscriptions, clinical reasoning, and cross-version comparisons. R5 is not yet required by any federal mandate, and production R5 implementations were sparse as of early 2026. You should be aware of R5 if you are designing a new platform that will have a multi-year lifespan, but for the vast majority of integration projects, R4 is the correct target and R5 is a planning consideration, not an implementation requirement.

For new integrations in 2026, the answer is unambiguous: build against R4. R4 is what the federal mandates require. It's what modern EHR vendors expose in their production APIs. It's what the US Core profiles are built on. The only valid reason to build against STU3 in 2026 is maintaining a legacy integration that cannot be migrated — and even then, you should be planning the migration.

The Federal Mandate Landscape: Why R4 Is Not Optional

Two major federal regulatory actions, both finalized in 2020 and with compliance timelines that have now largely passed, make FHIR R4 the mandatory standard for US healthcare interoperability. Understanding both rules in detail is important not just for compliance but for understanding why the entire healthcare vendor ecosystem has converged on R4 and will continue to do so.

21st Century Cures Act and the ONC Information Blocking Rule

The 21st Century Cures Act, signed into law in December 2016, included sweeping health IT provisions designed to accelerate interoperability and prohibit practices that restrict patient data access. The Office of the National Coordinator for Health Information Technology (ONC) finalized the implementing regulation — 45 CFR Parts 170 and 171 — in May 2020.

The ONC rule has two critical components for FHIR integration developers. First, it mandates that certified health IT — meaning EHR systems used by providers who participate in Medicare and Medicaid — must support standardized API access. Specifically, it requires implementation of the HL7 FHIR standard release 4 (FHIR R4) for patient access APIs, and it references the US Core Implementation Guide (US Core IG) as the required profile set.

Second, the information blocking provisions establish that entities covered by the rule — healthcare providers, health IT developers, and health information networks — may not engage in practices that interfere with the access, exchange, or use of electronic health information. The practical effect is that EHR vendors cannot charge unreasonable fees to provide API access, cannot impose technical barriers that prevent access to data that patients and authorized third parties have rights to, and cannot require proprietary formats when standardized formats are available.

The ONC rule's API certification criteria specifically reference FHIR R4 and US Core. An EHR that wants to maintain its ONC certification — which is effectively required to participate in Medicare and Medicaid programs — must expose production patient data via a FHIR R4 API that conforms to the US Core profiles. This is why every major EHR vendor has R4 production APIs: regulatory compliance, not interoperability idealism.

CMS Interoperability and Patient Access Final Rule (CMS-9115-F)

The CMS Interoperability and Patient Access Final Rule, finalized simultaneously with the ONC rule in May 2020, addresses a different sector: health insurance plans. It requires Medicare Advantage organizations, Medicaid managed care plans, CHIP managed care entities, and qualified health plan issuers on the federal exchange to implement patient access APIs.

The CMS rule requires FHIR R4 and the HL7 SMART on FHIR authorization framework, and for claims and encounter data it points implementers to the CARIN Consumer Directed Payer Data Exchange Implementation Guide (the CARIN IG for Blue Button). The compliance deadlines for most payer categories were July 2021. Payers that failed to comply face enforcement action from CMS.

For integration developers, this means: if you are building anything that touches a payer's patient data API — a patient portal, a care gap analytics tool, a prior authorization workflow, a member-facing mobile application — you will be consuming FHIR R4 from that payer's API. There is no STU3 payer endpoint that meets the CMS rule requirements.

The Provider Directory and Drug Formulary Requirements

Beyond the patient access provisions, the CMS rule also requires covered payers to publish provider directory APIs and drug formulary APIs using FHIR R4. The relevant guides are the Da Vinci PDex Plan Net Implementation Guide for provider directories and the Da Vinci PDex US Drug Formulary Implementation Guide for formularies. If you are building applications that consume payer provider directory data or formulary data, you need to consume FHIR R4.

The regulatory reality is this: any software that touches US healthcare interoperability in a meaningful way is now touching FHIR R4. The federal mandates have created a binary: R4-compliant, or non-compliant. STU3 is not a compliant target for any new regulated use case.

US Core Profiles: The Layer That Lives on Top of R4

Understanding FHIR R4 in isolation is not enough for US healthcare integration. You need to understand US Core — the implementation guide published by HL7 that defines how US healthcare organizations must implement FHIR R4. US Core is what the ONC rule actually requires, and it substantially constrains how the base FHIR R4 resources are used.

US Core profiles do several things to base FHIR R4 resources. They mark certain optional elements as required — a Patient resource without a US Core profile can omit the patient's name, but a US Core-compliant Patient resource must include it. They constrain value sets — where FHIR R4 allows a range of code systems for race and ethnicity, US Core specifies exactly which codes to use. They define search parameters that servers must support. They add US-specific extensions for data elements like race, ethnicity, and tribal affiliation that aren't part of the base FHIR specification.

The version history of US Core matters here. The earliest US Core releases (1.x and 2.0) were built on FHIR STU3, but every release from 3.0 onward targets FHIR R4, and the ONC rule names US Core 3.1.1. In practice, when a system claims "US Core compliance" today, it is an R4 system. When an EHR vendor tells you their system is US Core 3.1.1 compliant, they are telling you it serves FHIR R4 in the specific constrained form that the ONC rule requires.

Resource-by-Resource: What Changed Between STU3 and R4

The following section covers the specific breaking changes for each major resource type. This is where most integration failures originate — developers who understand the high-level version difference but haven't drilled into the specific element-level changes that will break their parsers in production.

Critical Resource Changes: STU3 vs R4 — Summary

| Resource | STU3 | R4 |
| --- | --- | --- |
| Patient | generalPractitioner accepts Practitioner or Organization references (careProvider is the older DSTU2 name); contact relationship codes use the system http://hl7.org/fhir/v2/0131; address use has no "billing" code | generalPractitioner also accepts PractitionerRole; contact relationship codes move to http://terminology.hl7.org/CodeSystem/v2-0131; address use adds "billing" |
| MedicationRequest | requester is a BackboneElement with agent and onBehalfOf sub-elements | requester is a direct Reference; doNotPerform added; reportedBoolean / reportedReference added |
| Observation | related BackboneElement links observations with typed relationships | related removed entirely; replaced by hasMember and derivedFrom direct references; silent data loss if not handled |
| Condition | clinicalStatus and verificationStatus are plain code values | clinicalStatus and verificationStatus are CodeableConcepts bound to specific value sets |
| Coverage | grouping BackboneElement contains plan, group, subgroup, class, subclass in a flat structure | grouping replaced by class array; costToBeneficiary added for structured cost sharing |
| DiagnosticReport | image BackboneElement for associated images; performer is a BackboneElement with role and actor | image renamed to media; imagingStudy reference restructured; performer simplified to direct Reference; resultsInterpreter added |
| Encounter | class codes use the system http://hl7.org/fhir/v3/ActCode; hospitalization BackboneElement structure differs | class codes use the system http://terminology.hl7.org/CodeSystem/v3-ActCode; status codes carry over |
| AllergyIntolerance | clinicalStatus and verificationStatus are plain code values (the single combined status field is the DSTU2 shape) | clinicalStatus and verificationStatus become CodeableConcepts; clinicalStatus must be absent when verificationStatus is entered-in-error |

Patient Resource: The Details That Break Parsers

The Patient resource is the one integration teams most frequently get wrong because it looks similar between versions — the same basic fields, the same basic structure — but the devil is in the specific element paths and code system bindings that changed.

The Patient change that catches the most teams involves the primary care provider reference. The element was called careProvider in DSTU2 and was renamed to generalPractitioner in STU3; R4 keeps that name and expands the reference types it accepts to include PractitionerRole. Code written against an older payload that navigates careProvider on an R4 Patient resource will find nothing. The element doesn't exist in R4, and most FHIR parsers won't throw an error on a missing optional element. The lookup will silently return null.
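A tolerant reader can guard against that silent null. The following is a minimal sketch, assuming the Patient payload has already been parsed into a Python dict; the function name general_practitioner_refs is hypothetical. It prefers generalPractitioner and falls back to careProvider for older payloads.

```python
from typing import Any


def general_practitioner_refs(patient: dict[str, Any]) -> list[str]:
    """Return the primary care provider references from a Patient payload.

    Prefers `generalPractitioner` and falls back to the older `careProvider`
    element, so a version mismatch does not silently yield an empty result.
    """
    entries = patient.get("generalPractitioner") or patient.get("careProvider") or []
    return [e["reference"] for e in entries if isinstance(e, dict) and "reference" in e]


newer = {"resourceType": "Patient", "generalPractitioner": [{"reference": "Practitioner/123"}]}
older = {"resourceType": "Patient", "careProvider": [{"reference": "Practitioner/123"}]}
print(general_practitioner_refs(newer))
print(general_practitioner_refs(older))
```

Logging which branch was taken is a cheap way to learn that an endpoint is serving a different version than expected.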

The contact element's relationship field changed its coding. In both versions the codes come from HL7 v2 table 0131, but the code system URI changed: STU3 identifies it as http://hl7.org/fhir/v2/0131, while R4 uses http://terminology.hl7.org/CodeSystem/v2-0131. Code that maps contact relationship codes to display strings by matching on both system and code will miss every R4 entry. The code "C" (emergency contact) looks identical in both payloads; only the system string around it differs.

The communication element's language field is another place where bindings matter. Both versions expect BCP 47 language tags, the IETF standard format like "en-US" or "es", but the strength of the binding to the common-languages value set differs between STU3 and R4, so servers vary in which tags they emit. Code that accepts any code string and presents it to users may not break, but code that validates language codes against a fixed expected list will fail on tags outside that list.

The address structure itself didn't change substantially, but the required value set for the use code did: R4 adds "billing" to the STU3 set of "home," "work," "temp," and "old." Code that validates address use against the STU3 list will reject R4 addresses marked as billing.

MedicationRequest: The Requester Restructure

In STU3, the prescribing provider on a MedicationRequest was represented as a BackboneElement rather than as a plain Reference.

The path to the prescribing provider was MedicationRequest.requester.agent — you had to navigate through the BackboneElement to reach the Reference. The onBehalfOf element at the same level expressed the supervising organization. This two-level structure was designed to support delegation scenarios.

In R4, the BackboneElement was eliminated. requester is now a direct Reference. The path is simply MedicationRequest.requester. The onBehalfOf concept for organizations was dropped from this resource (though it remains in other contexts). Any code that navigates requester.agent will fail against R4 — the agent sub-element does not exist.

R4 also added doNotPerform (a boolean that inverts the meaning of the request — "do not give this medication"), reportedBoolean and reportedReference (indicating whether the medication was reported by a patient versus documented by the ordering clinician), and a restructured dosageInstruction that uses the new Dosage datatype more explicitly. These are additive changes and won't break STU3-built parsers, but they represent data your parsers aren't capturing.
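The two requester shapes can be handled in one place. This is a minimal sketch, assuming payloads parsed into Python dicts; requester_reference is a hypothetical helper name, and the sample references are invented for illustration.

```python
from typing import Any, Optional


def requester_reference(med_request: dict[str, Any]) -> Optional[str]:
    """Return the prescriber reference from an STU3 or R4 MedicationRequest."""
    requester = med_request.get("requester")
    if not isinstance(requester, dict):
        return None
    # STU3 shape: requester is a BackboneElement that wraps the Reference in `agent`.
    agent = requester.get("agent")
    if isinstance(agent, dict):
        return agent.get("reference")
    # R4 shape: requester is itself the Reference.
    return requester.get("reference")


stu3_request = {
    "resourceType": "MedicationRequest",
    "requester": {
        "agent": {"reference": "Practitioner/f007"},
        "onBehalfOf": {"reference": "Organization/f002"},
    },
}
r4_request = {
    "resourceType": "MedicationRequest",
    "requester": {"reference": "Practitioner/f007"},
}
print(requester_reference(stu3_request))
print(requester_reference(r4_request))
```

Checking for agent first matters: an STU3 requester has no reference key of its own, so reading requester.reference directly would return nothing on STU3 data.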

Observation: The Silent Data Loss Problem

The removal of the related BackboneElement from Observation deserves special attention because it is the change most likely to cause silent data loss — bugs that don't crash, don't throw exceptions, and don't generate log entries, but quietly drop data.

In STU3, a panel result — a comprehensive metabolic panel, a lipid panel, a CBC — was represented as a parent Observation with child component Observations linked via the related element. Each related entry had a type (typically "has-member") and a target reference to the child Observation resource. Code that assembled panel results iterated over Observation.related where type equals "has-member" and followed those references.

In R4, the related element was removed entirely. Panel relationships are now expressed using direct reference elements: hasMember (replacing "has-member" typed related entries) and derivedFrom (replacing "derived-from" typed related entries). The data is still there — but it is at a completely different path.

STU3 code that iterates Observation.related will find an empty array on R4 data, not an error. The parent Observation will parse and validate correctly. The panel result will appear to have no component observations. Downstream code that expects panel components will quietly produce incomplete results — missing sodium, potassium, creatinine values on a BMP, or missing LDL and HDL on a lipid panel. This is the kind of bug that makes it through QA because the test system has different data than production, and only surfaces when a real patient result comes through with panel data.
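One way to avoid the silent drop is to read both paths and merge the results. The sketch below assumes Observation payloads parsed into Python dicts; panel_member_refs is a hypothetical name and the sample references are invented.

```python
from typing import Any


def panel_member_refs(observation: dict[str, Any]) -> list[str]:
    """Collect panel member references from either the R4 or the STU3 path."""
    # R4: hasMember is a list of direct References.
    members = [m["reference"] for m in observation.get("hasMember", []) if "reference" in m]
    # STU3: related entries carry a type and a target Reference.
    for rel in observation.get("related", []):
        if rel.get("type") == "has-member":
            target = rel.get("target", {})
            if "reference" in target:
                members.append(target["reference"])
    return members


stu3_panel = {
    "resourceType": "Observation",
    "related": [
        {"type": "has-member", "target": {"reference": "Observation/sodium"}},
        {"type": "derived-from", "target": {"reference": "Observation/raw"}},
    ],
}
r4_panel = {
    "resourceType": "Observation",
    "hasMember": [{"reference": "Observation/sodium"}, {"reference": "Observation/potassium"}],
}
print(panel_member_refs(stu3_panel))
print(panel_member_refs(r4_panel))
```

A useful companion check in tests is to assert that a known panel code never yields an empty member list, which turns the silent failure into a loud one.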

Condition: The Required Clinical Status

The Condition resource changed in R4 in a way that causes parsing and validation failures, not just silent data loss. In STU3, clinicalStatus was a plain code value. In R4, clinicalStatus is a CodeableConcept that must draw from the ConditionClinicalStatusCodes value set, with codes like "active," "recurrence," "relapse," "inactive," "remission," and "resolved," and validators generally expect it to be present on problem-list entries whose verificationStatus is not entered-in-error.

Code that creates Condition resources for active problems and omits clinicalStatus will draw R4 validation warnings or errors, depending on the validator and profile in use. Code that passes a plain string to clinicalStatus (which worked in STU3) will fail R4 parsing. The verificationStatus field changed the same way, from a plain code to a CodeableConcept bound to the ConditionVerificationStatusCodes value set.

The practical implication for integration developers is that any pipeline that creates Condition resources — from ADT feeds, from problem list synchronization, from clinical document imports — must be updated to populate these required CodeableConcept structures using the correct R4 value sets.
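A pipeline that emits Condition resources can normalize the status at one choke point. This is a minimal sketch: the helper name to_r4_clinical_status is hypothetical, and the code system URI shown is the one published for ConditionClinicalStatusCodes in R4, which is worth confirming against the validator you deploy.

```python
from typing import Any, Union

# Assumed R4 code system URI for condition clinical status codes.
CLINICAL_SYSTEM = "http://terminology.hl7.org/CodeSystem/condition-clinical"
ALLOWED_CODES = {"active", "recurrence", "relapse", "inactive", "remission", "resolved"}


def to_r4_clinical_status(value: Union[str, dict[str, Any]]) -> dict[str, Any]:
    """Turn an STU3-style plain code into an R4 CodeableConcept.

    A value that is already a CodeableConcept (a dict) is returned unchanged.
    """
    if isinstance(value, dict):
        return value
    if value not in ALLOWED_CODES:
        raise ValueError(f"unknown clinicalStatus code: {value!r}")
    return {"coding": [{"system": CLINICAL_SYSTEM, "code": value}]}


print(to_r4_clinical_status("active"))
```

Raising on an unknown code, instead of passing it through, keeps bad data from reaching an R4 server where the failure would be harder to trace.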

AllergyIntolerance: Separated Status Fields

The single status field that combined clinical status (is the allergy active?) and verification status (has it been confirmed?) belongs to DSTU2. STU3 had already split it into clinicalStatus ("active," "inactive," "resolved") and verificationStatus ("unconfirmed," "confirmed," "refuted," "entered-in-error"), both as plain code values.

In R4, both elements became CodeableConcepts, so code that reads them as strings will break. The R4 AllergyIntolerance invariant requires that if verificationStatus is "entered-in-error," the clinicalStatus element should not be present. This sounds like a minor structural change, but it means any code that reads or writes plain status strings, or a single status field, on AllergyIntolerance will be reading or writing the wrong thing against an R4 endpoint.
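The entered-in-error rule is easy to check before a resource leaves your system. The sketch below is an illustration only, assuming payloads parsed into Python dicts; allergy_status_problems is a hypothetical name, and the check covers just the two points discussed here, not full R4 validation.

```python
from typing import Any


def _codes(value: Any) -> set[str]:
    """Collect codes from a CodeableConcept dict or from a plain code string."""
    if isinstance(value, str):
        return {value}
    if isinstance(value, dict):
        return {c["code"] for c in value.get("coding", []) if "code" in c}
    return set()


def allergy_status_problems(allergy: dict[str, Any]) -> list[str]:
    """Return human-readable problems with the status elements of an AllergyIntolerance."""
    problems = []
    if "status" in allergy:
        problems.append("legacy single status field present")
    entered_in_error = "entered-in-error" in _codes(allergy.get("verificationStatus"))
    if entered_in_error and "clinicalStatus" in allergy:
        problems.append("clinicalStatus must be absent when verificationStatus is entered-in-error")
    return problems


bad = {
    "resourceType": "AllergyIntolerance",
    "clinicalStatus": {"coding": [{"code": "active"}]},
    "verificationStatus": {"coding": [{"code": "entered-in-error"}]},
}
print(allergy_status_problems(bad))
```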

Coverage: The Grouping Restructure

The Coverage resource underwent one of the most substantial restructurings between STU3 and R4. In STU3, plan and group details were stored in a flat grouping BackboneElement with fields like grouping.group, grouping.groupDisplay, grouping.plan, grouping.planDisplay, grouping.class, and grouping.subClass.

In R4, the grouping element was replaced by a repeating class array where each entry has a type, value, and optional name. The types in the class array use a CodeSystem to express what the entry represents (group, plan, class, subplan, etc.). This is a more flexible structure but requires a completely different traversal pattern.

Coverage also gained the costToBeneficiary element in R4, which represents cost-sharing arrangements (copays, coinsurance) in a structured way that didn't exist in STU3. For RCM applications that use Coverage data for benefits verification, this is a significant improvement — but only accessible on R4 endpoints.
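The grouping-to-class change can be bridged with a small mapping. This is a sketch under stated assumptions: the STU3 field names and the R4 class type codes follow the published resource definitions, the coverage-class code system URI should be confirmed against your server, and both function names are hypothetical.

```python
from typing import Any, Optional

# Assumed R4 code system for Coverage.class.type.
COVERAGE_CLASS_SYSTEM = "http://terminology.hl7.org/CodeSystem/coverage-class"

# STU3 grouping field -> (R4 class type code, STU3 display field)
FIELD_MAP = {
    "group": ("group", "groupDisplay"),
    "subGroup": ("subgroup", "subGroupDisplay"),
    "plan": ("plan", "planDisplay"),
    "subPlan": ("subplan", "subPlanDisplay"),
    "class": ("class", "classDisplay"),
    "subClass": ("subclass", "subClassDisplay"),
}


def grouping_to_class(grouping: dict[str, Any]) -> list[dict[str, Any]]:
    """Convert an STU3 Coverage.grouping element into an R4-style class array."""
    classes = []
    for field, (type_code, display_field) in FIELD_MAP.items():
        value = grouping.get(field)
        if value is None:
            continue
        entry = {
            "type": {"coding": [{"system": COVERAGE_CLASS_SYSTEM, "code": type_code}]},
            "value": value,
        }
        if grouping.get(display_field):
            entry["name"] = grouping[display_field]
        classes.append(entry)
    return classes


def class_value(classes: list[dict[str, Any]], type_code: str) -> Optional[str]:
    """Look up the value of the first class entry with the given type code."""
    for entry in classes:
        codes = {c.get("code") for c in entry.get("type", {}).get("coding", [])}
        if type_code in codes:
            return entry.get("value")
    return None


stu3_grouping = {"group": "CB135", "groupDisplay": "Corporate Baker's Inc.", "plan": "B37FC"}
r4_classes = grouping_to_class(stu3_grouping)
print(class_value(r4_classes, "plan"))
```

Normalizing to the R4 shape internally means downstream code only needs the class_value traversal, whichever version the endpoint serves.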

Encounter: Class Code System Changes

The Encounter resource's class field is a Coding in both versions, and the codes themselves are the same (AMB for ambulatory, IMP for inpatient, EMER for emergency). What changed is the code system URI: STU3 payloads use http://hl7.org/fhir/v3/ActCode, while R4 payloads use http://terminology.hl7.org/CodeSystem/v3-ActCode. Code that accepts any code system for Encounter class will not produce errors against R4 data, but code that validates against the expected STU3 CodeSystem URI will fail.

The Encounter status codes ("planned," "in-progress," "finished," and the rest) carry over from STU3 to R4. The binding is required, so non-conforming codes are invalid in either version.
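A class reader that tolerates either code system URI avoids the validation failure described above. This is a minimal sketch, assuming Encounter.class is a single Coding as in STU3 and R4; the STU3-era URI http://hl7.org/fhir/v3/ActCode is taken from the STU3 terminology pages and should be checked against real payloads, and encounter_class_code is a hypothetical name.

```python
from typing import Any, Optional

# Both URIs identify the v3 ActCode system; which one appears depends on the FHIR version.
ACT_CODE_SYSTEMS = {
    "http://hl7.org/fhir/v3/ActCode",  # assumed STU3-era URI
    "http://terminology.hl7.org/CodeSystem/v3-ActCode",  # R4 URI
}


def encounter_class_code(encounter: dict[str, Any]) -> Optional[str]:
    """Return the ActCode class code (for example AMB or IMP), or None if absent or foreign."""
    cls = encounter.get("class")
    if not isinstance(cls, dict):
        return None
    if cls.get("system") in ACT_CODE_SYSTEMS:
        return cls.get("code")
    return None


r4_encounter = {
    "resourceType": "Encounter",
    "class": {"system": "http://terminology.hl7.org/CodeSystem/v3-ActCode", "code": "AMB"},
}
print(encounter_class_code(r4_encounter))
```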

DiagnosticReport: The Image/Media Rename

The image BackboneElement in the STU3 DiagnosticReport was renamed to media in R4. The structure is similar, but the element name change means any code that navigates DiagnosticReport.image will find nothing in R4. For radiology integrations that reference associated images in reports, this is a breaking change that will silently drop image references.

The performer element in DiagnosticReport was also restructured. STU3 had a BackboneElement with role and actor. R4 simplified this to a direct Reference for the performer and added a separate resultsInterpreter element for the interpreting physician. Code that navigates performer.actor will fail against R4.
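Both DiagnosticReport differences can be absorbed by small accessor functions. The sketch below assumes payloads parsed into Python dicts; the function names and sample references are hypothetical.

```python
from typing import Any


def report_media_links(report: dict[str, Any]) -> list[str]:
    """Return linked image references from `media` (R4) or `image` (STU3)."""
    entries = report.get("media") or report.get("image") or []
    return [e["link"]["reference"] for e in entries if "reference" in e.get("link", {})]


def report_performer_refs(report: dict[str, Any]) -> list[str]:
    """Return performer references from either performer shape."""
    refs = []
    for performer in report.get("performer", []):
        # STU3 wraps the Reference in `actor`; in R4 the entry is the Reference itself.
        ref = performer.get("actor", performer)
        if "reference" in ref:
            refs.append(ref["reference"])
    return refs


stu3_report = {
    "resourceType": "DiagnosticReport",
    "performer": [{"actor": {"reference": "Practitioner/1"}}],
    "image": [{"comment": "AP view", "link": {"reference": "Media/1"}}],
}
r4_report = {
    "resourceType": "DiagnosticReport",
    "performer": [{"reference": "Practitioner/1"}],
    "media": [{"link": {"reference": "Media/1"}}],
}
print(report_media_links(stu3_report), report_performer_refs(stu3_report))
print(report_media_links(r4_report), report_performer_refs(r4_report))
```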

DocumentReference: Status Code Changes

The DocumentReference status element in STU3 used the DocumentReferenceStatus value set with codes "current," "superseded," and "entered-in-error." R4 uses the same codes but tightened the binding. More importantly, the docStatus field (which was optional in STU3) now has cleaner semantics in R4 using the CompositionStatus codes — "preliminary," "final," "amended," "entered-in-error." Applications that relied on STU3's looser binding for status codes need to update their code mappings.

Serialization-Level Breaking Changes

Beyond resource-specific changes, there are serialization-level differences between STU3 and R4 that can cause subtle parsing failures even for resources that appear structurally similar.

Date Format Handling

FHIR has always required date values to be in ISO 8601 format, but R4 is stricter about which partial date formats are acceptable in which contexts. In STU3, some implementations accepted date strings like "2023" (year only) or "2023-04" (year and month) in fields that should have contained full dates. R4 validators are stricter, and R4-conformant parsers may reject or sanitize partial dates that STU3 parsers accepted. If your integration pipeline relies on accepting partial dates as valid, test this specifically against your R4 data.

Reference Format and Versioning

FHIR references can be either relative (just the resource type and ID, like Patient/12345) or absolute (a full URL). In R4, the specification is clearer about when references should be versioned — including a resource version in the reference like Patient/12345/_history/3. Some R4 implementations include versioned references in contexts where STU3 implementations did not. Code that parses reference strings by splitting on "/" needs to handle the additional _history/{version} suffix correctly.
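A reference parser that expects the optional version suffix might look like the following. It is a sketch, not a complete implementation: it handles relative and absolute literal references only, not contained ("#id") or urn:uuid references, and the names ParsedReference and parse_reference are hypothetical.

```python
from typing import NamedTuple, Optional


class ParsedReference(NamedTuple):
    base: Optional[str]
    resource_type: str
    resource_id: str
    version: Optional[str]


def parse_reference(ref: str) -> ParsedReference:
    """Split a literal FHIR reference into base URL, type, id, and optional version."""
    parts = ref.rstrip("/").split("/")
    version = None
    # A versioned reference ends in .../_history/{version}.
    if len(parts) >= 4 and parts[-2] == "_history":
        version = parts[-1]
        parts = parts[:-2]
    if len(parts) < 2:
        raise ValueError(f"not a literal reference: {ref!r}")
    base = "/".join(parts[:-2]) or None
    return ParsedReference(base, parts[-2], parts[-1], version)


print(parse_reference("Patient/12345"))
print(parse_reference("Patient/12345/_history/3"))
print(parse_reference("https://ehr.example.org/fhir/Patient/12345/_history/3"))
```

Working from the end of the path, not the start, is what lets the same function handle both relative and absolute forms.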

Extension URL Namespace Changes

Several extensions that were defined within the base FHIR specification in STU3 were promoted to core elements in R4, while their STU3 extension URLs are no longer recognized by R4 validators. The most common example is the US Core race and ethnicity extensions — the STU3 extension URLs under http://hl7.org/fhir/StructureDefinition/ were replaced by US Core-specific URLs under http://hl7.org/fhir/us/core/StructureDefinition/. Code that reads race and ethnicity from Patient resources using STU3 extension URLs will read nothing from US Core R4 Patient resources.
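Extension lookups are keyed on exact URL strings, which is why a URL change reads as missing data. The sketch below looks up the US Core race extension by its US Core URL and reads its text sub-extension; the sub-extension names ("ombCategory", "text") follow the US Core definition as I understand it, and the function names are hypothetical.

```python
from typing import Any, Optional

US_CORE_RACE = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"


def find_extension(element: dict[str, Any], url: str) -> Optional[dict[str, Any]]:
    """Return the first extension on the element with exactly the given URL."""
    for ext in element.get("extension", []):
        if ext.get("url") == url:
            return ext
    return None


def race_text(patient: dict[str, Any]) -> Optional[str]:
    """Return the free-text race description from a US Core Patient, if present."""
    race = find_extension(patient, US_CORE_RACE)
    if race is None:
        return None
    text = find_extension(race, "text")
    return text.get("valueString") if text else None


patient = {
    "resourceType": "Patient",
    "extension": [
        {
            "url": US_CORE_RACE,
            "extension": [
                {"url": "ombCategory", "valueCoding": {"code": "2106-3", "display": "White"}},
                {"url": "text", "valueString": "White"},
            ],
        }
    ],
}
print(race_text(patient))
```

Keeping extension URLs in named constants, as here, makes a version migration a one-line change instead of a search through parsing code.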

Contained Resources

Contained resources — resources embedded inline within a parent resource rather than referenced by ID — have stricter rules in R4. R4 requires that contained resources be referenced at least once from their parent resource; STU3 allowed "orphaned" contained resources. R4 also disallows contained resources within already-contained resources. If your integration creates resources with contained resources, validate this logic specifically against R4 rules.
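The orphan rule can be checked mechanically before sending a resource. This is a simplified sketch: it treats a contained resource as referenced if any "#id" reference appears anywhere in the parent, including inside other contained resources, and it does not cover the case of a contained resource that refers back to its container. The function names are hypothetical.

```python
from typing import Any


def _local_refs(node: Any, found: set[str]) -> None:
    """Recursively collect ids targeted by local ('#id') references."""
    if isinstance(node, dict):
        ref = node.get("reference")
        if isinstance(ref, str) and ref.startswith("#"):
            found.add(ref[1:])
        for value in node.values():
            _local_refs(value, found)
    elif isinstance(node, list):
        for item in node:
            _local_refs(item, found)


def orphaned_contained_ids(resource: dict[str, Any]) -> list[str]:
    """Return ids of contained resources that nothing in the resource points to."""
    found: set[str] = set()
    _local_refs(resource, found)
    ids = [c.get("id") for c in resource.get("contained", [])]
    return [i for i in ids if i is not None and i not in found]


request = {
    "resourceType": "MedicationRequest",
    "contained": [
        {"resourceType": "Medication", "id": "med1"},
        {"resourceType": "Organization", "id": "org1"},
    ],
    "medicationReference": {"reference": "#med1"},
}
print(orphaned_contained_ids(request))
```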

The Sandbox Trap: Why Version Mismatches Happen

The scenario that opens this article (sandbox on STU3, production on R4) happens for understandable reasons, and knowing the mechanism helps you protect against it.

EHR vendors and health systems maintain separate sandbox environments for developer testing precisely because production systems can't be exposed to arbitrary API calls from unknown developers. Those sandbox environments are maintained separately from production. When a vendor upgrades their production FHIR implementation from STU3 to R4 — which involves substantial engineering work across the EHR codebase — they don't always upgrade the sandbox simultaneously. Sandbox upgrades are expensive, lower priority than production, and sometimes require separate credentialing and provisioning workflows.

The lag can be significant. In documented cases from 2021-2023, some major EHR vendors ran sandboxes that were 18 months behind their production FHIR version. Developers who built against the sandbox in good faith had production surprises waiting for them.

There is no fully reliable protection against this, but the risk can be reduced.

The pattern that experienced healthcare integration teams use: they maintain two separate test data sets — one in the version the sandbox serves (for connectivity and authentication testing), and one in the version production serves (for parser and logic testing). The sandbox tests prove you can reach the endpoint and authenticate. The synthetic data tests prove your code handles the actual payload structure correctly. Conflating these two test concerns is how six-week delays happen.

SMART on FHIR: Auth Differences Between Versions

SMART on FHIR is the authorization framework layered on top of FHIR APIs — it handles OAuth 2.0 flows, launch contexts, and scopes for healthcare applications. The SMART specification has evolved alongside FHIR, and the version in use affects how applications authenticate and what access they can request.

SMART v1 (the version associated with early STU3 deployments) uses a straightforward OAuth 2.0 authorization code flow with healthcare-specific scopes. SMART scopes in v1 use a format like patient/Patient.read or user/Observation.read — resource-level granularity.
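Scope strings are regular enough to classify with a pattern. The sketch below recognizes the v1 resource-level format described above; the v2 pattern (a suffix built from the letters c, r, u, d, s, optionally followed by a query) is included as an assumption drawn from the SMART App Launch v2 specification. The function name scope_style is hypothetical.

```python
import re
from typing import Optional

# v1 style: patient/Patient.read, user/Observation.write, patient/*.*
V1_SCOPE = re.compile(r"^(patient|user|system)/([A-Za-z]+|\*)\.(read|write|\*)$")
# Assumed v2 style: patient/Observation.rs, optionally with ?param=value
V2_SCOPE = re.compile(r"^(patient|user|system)/([A-Za-z]+|\*)\.(?=[cruds])c?r?u?d?s?(\?.+)?$")


def scope_style(scope: str) -> Optional[str]:
    """Return 'v1' or 'v2' for a resource scope, or None for other scopes."""
    if V1_SCOPE.match(scope):
        return "v1"
    if V2_SCOPE.match(scope):
        return "v2"
    return None


for s in ["patient/Patient.read", "user/Observation.read", "patient/Observation.rs", "openid"]:
    print(s, scope_style(s))
```

Scopes such as openid or launch/patient are not resource scopes, so the function deliberately returns None for them.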

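SMART v2 (formally published in 2021 and increasingly required by R4 implementations) introduced several significant changes, including finer-grained scopes that replace .read and .write with a suffix built from the letters c, r, u, d, and s (for example patient/Observation.rs), mandatory PKCE for app launch, and asymmetric (key-based) client authentication.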

For integration developers, the SMART version question is separate from the FHIR version question — a server can serve FHIR R4 with SMART v1 authorization, or FHIR STU3 with SMART v2 (though this combination is unusual). However, EHR implementations that comply with the ONC rule typically implement both FHIR R4 and SMART v2, so they tend to appear together in practice.

When you query a server's CapabilityStatement, look at the rest.security section for the SMART capability URLs. These will tell you which SMART version is in use and what authentication flows are supported. Code that implements only SMART v1 flows will fail against servers that require SMART v2 backend service authentication.
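Reading the security section can be automated as part of a pre-integration check. The sketch below pulls OAuth endpoint URLs out of rest.security; it assumes the server publishes them through the oauth-uris extension used by SMART, whose URL is written here from memory of the SMART specification and should be confirmed. The function name smart_endpoints and the sample URLs are hypothetical.

```python
from typing import Any

# Assumed URL of the SMART extension that carries authorize/token endpoints.
OAUTH_URIS = "http://fhir-registry.smarthealthit.org/StructureDefinition/oauth-uris"


def smart_endpoints(capability: dict[str, Any]) -> dict[str, str]:
    """Return a mapping such as {'authorize': ..., 'token': ...} from rest.security."""
    endpoints: dict[str, str] = {}
    for rest in capability.get("rest", []):
        security = rest.get("security", {})
        for ext in security.get("extension", []):
            if ext.get("url") != OAUTH_URIS:
                continue
            for sub in ext.get("extension", []):
                if "url" in sub and "valueUri" in sub:
                    endpoints[sub["url"]] = sub["valueUri"]
    return endpoints


capability = {
    "resourceType": "CapabilityStatement",
    "rest": [
        {
            "mode": "server",
            "security": {
                "extension": [
                    {
                        "url": OAUTH_URIS,
                        "extension": [
                            {"url": "authorize", "valueUri": "https://ehr.example.org/auth/authorize"},
                            {"url": "token", "valueUri": "https://ehr.example.org/auth/token"},
                        ],
                    }
                ]
            },
        }
    ],
}
print(smart_endpoints(capability))
```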

Bulk FHIR ($export): Version Differences

Bulk FHIR — the $export operation — allows requesting large amounts of FHIR data in batch, rather than record-by-record. It is the mechanism used for population-level analytics, care gap identification, and large-scale data exchange between health systems and payers. The Bulk FHIR specification evolved substantially between what was informally supported in the STU3 era and the formally specified version associated with R4.

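The Bulk FHIR Implementation Guide (Bulk Data Access IG) was first published as a trial-use standard in 2019, revised as version 2.0 in 2021, and is closely tied to the SMART Backend Services Authorization framework. Key characteristics of the R4-era bulk export include an asynchronous kick-off request (Prefer: respond-async), newline-delimited JSON (NDJSON) output files, and the _type and _since parameters for narrowing what is exported.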

If you are building a population health analytics pipeline that uses bulk FHIR export, you are building against the R4 bulk specification. The STU3-era informal bulk implementations were inconsistent across vendors and are not a viable baseline for production analytics work.
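For orientation, the kick-off request for a patient-level export can be assembled as below. This sketch only builds the URL and headers and sends nothing; the Accept and Prefer headers and the _type and _since parameters follow the Bulk Data Access IG as I understand it, while the base URL and the function name build_export_request are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def build_export_request(
    base_url: str,
    resource_types: Optional[list[str]] = None,
    since: Optional[str] = None,
) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for a Patient-level $export kick-off request."""
    params = {}
    if resource_types:
        params["_type"] = ",".join(resource_types)
    if since:
        params["_since"] = since
    url = f"{base_url.rstrip('/')}/Patient/$export"
    if params:
        # Keep commas and colons readable instead of percent-encoding them.
        url += "?" + urlencode(params, safe=",:")
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # the export runs asynchronously
    }
    return url, headers


url, headers = build_export_request(
    "https://ehr.example.org/fhir/",
    resource_types=["Patient", "Observation"],
    since="2024-01-01T00:00:00Z",
)
print(url)
print(headers)
```

A real client would also attach a bearer token obtained through SMART Backend Services and then poll the status URL returned by the server.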

CDS Hooks and FHIR Version Requirements

CDS Hooks is a separate HL7 standard that allows clinical decision support services to be embedded in EHR workflows — triggering automated recommendations at points like when a patient encounter is opened, when an order is placed, or when a medication is prescribed. CDS Hooks services receive FHIR resources as context and can return "cards" with recommendations that appear in the EHR interface.

CDS Hooks 2.0, the current version, specifies that FHIR context objects should be FHIR R4 resources. CDS Hooks 1.0 (the earlier version) did not specify a FHIR version for context objects, leaving it to EHR implementations. The practical effect is that CDS Hooks services deployed against R4-era EHRs will receive R4 context objects — R4 Patient resources, R4 Encounter resources, R4 MedicationRequest resources. Services that parse context objects using STU3 resource structure will misparse R4 context and may produce incorrect or absent recommendations.

The prefetch mechanism in CDS Hooks — which allows services to declare what FHIR data they need, and have the EHR retrieve it before the hook fires — uses FHIR search parameters. R4 search parameters changed for some resources. CDS Hooks services that declare prefetch using STU3 search parameter syntax may not work correctly against R4 EHRs that validate prefetch query syntax.

Conformance Resources: CapabilityStatement, StructureDefinition, ValueSet

FHIR defines a set of resources for expressing system capabilities and data constraints. These conformance resources are important for integration architects and tooling developers, and they changed between STU3 and R4 in ways that affect how you discover what an endpoint supports.

CapabilityStatement (formerly Conformance)

The resource that describes what a FHIR server supports was called Conformance in DSTU2. STU3 renamed it to CapabilityStatement, and R4 keeps that name. In every release it is served from the same URL ([base]/metadata). Some element definitions changed between STU3 and R4, however.

The CapabilityStatement expresses supported interactions, search parameters, and operation definitions per resource type. Its implementationGuide element is where servers can declare which implementation guides they conform to; this is how you find out that a server is US Core 3.1.1 compliant versus US Core 7.0 compliant. Code that expects the older Conformance resource type, or that hard-codes STU3 element paths, can fail against R4 CapabilityStatements where those paths differ.
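As a rough sketch of reading those declarations, the function below pulls the FHIR version, the declared implementation guides, and the per-resource interactions out of a [base]/metadata response that has already been parsed into a dict. The function name and the sample payload are illustrative assumptions, and the implementation guide URL in the sample is a placeholder for whatever the server actually returns.

```python
def summarize_capabilities(metadata):
    """Summarize a parsed [base]/metadata response (a plain dict)."""
    resources = {}
    for rest in metadata.get("rest", []):
        if rest.get("mode") != "server":
            continue  # ignore client-mode declarations
        for res in rest.get("resource", []):
            codes = [i["code"] for i in res.get("interaction", []) if "code" in i]
            resources[res["type"]] = sorted(codes)
    return {
        "resourceType": metadata.get("resourceType"),
        "fhirVersion": metadata.get("fhirVersion"),
        "implementationGuides": list(metadata.get("implementationGuide", [])),
        "resources": resources,
    }


# Illustrative payload, trimmed to the fields the function reads.
SAMPLE = {
    "resourceType": "CapabilityStatement",
    "fhirVersion": "4.0.1",
    "implementationGuide": [
        "http://hl7.org/fhir/us/core/ImplementationGuide/hl7.fhir.us.core"
    ],
    "rest": [
        {
            "mode": "server",
            "resource": [
                {
                    "type": "Patient",
                    "interaction": [{"code": "search-type"}, {"code": "read"}],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    print(summarize_capabilities(SAMPLE))
```

Logging this summary at connection time gives you a record of what the endpoint claimed to support when something later fails to parse.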

StructureDefinition

StructureDefinitions define the formal structure of FHIR resources and profiles. Each one can carry a differential, which lists only what a profile changes, and a snapshot, which is the complete resolved structure; validation tooling typically works from the snapshot. Base StructureDefinitions have canonical URLs of the form http://hl7.org/fhir/StructureDefinition/{ResourceType}, and US Core profiles are at http://hl7.org/fhir/us/core/StructureDefinition/{ProfileName}. The base canonical URL does not encode the FHIR release, so tooling that loads both STU3 and R4 definitions should key on the StructureDefinition's fhirVersion element (or on the package it came from) rather than on the URL alone.

ValueSet and CodeSystem

FHIR separates how code systems and value sets are managed: the CodeSystem resource defines codes, and the ValueSet resource defines a set of codes by selecting them from one or more code systems. This split dates from STU3 (DSTU2 let a single ValueSet both define and group codes), and R4 keeps it. Tooling that loads or validates terminologies therefore needs both resource types: a ValueSet alone tells you which codes are selected, not what they mean.
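The division of labour between the two resources can be sketched with a deliberately small example. The code system URL, its codes, and the expand_simple helper below are hypothetical. The helper handles only enumerated includes and whole-system includes against a flat CodeSystem; filters, excludes, and hierarchies would normally be delegated to a terminology server.

```python
SYSTEM_URL = "http://example.org/fhir/CodeSystem/triage-level"  # hypothetical

CODE_SYSTEM = {
    "resourceType": "CodeSystem",
    "url": SYSTEM_URL,
    "content": "complete",
    "concept": [{"code": "red"}, {"code": "amber"}, {"code": "green"}],
}

# Selects two specific codes from the code system.
URGENT = {
    "resourceType": "ValueSet",
    "compose": {
        "include": [
            {"system": SYSTEM_URL, "concept": [{"code": "red"}, {"code": "amber"}]}
        ]
    },
}

# Selects every code the code system defines.
ALL_LEVELS = {
    "resourceType": "ValueSet",
    "compose": {"include": [{"system": SYSTEM_URL}]},
}


def expand_simple(value_set, code_systems):
    """Return the (system, code) pairs a simple ValueSet selects."""
    members = set()
    for include in value_set.get("compose", {}).get("include", []):
        if include.get("filter"):
            raise NotImplementedError("filters need a terminology server")
        system = include.get("system")
        concepts = include.get("concept")
        if concepts is None:
            # No explicit list: the include covers the whole code system.
            concepts = code_systems[system].get("concept", [])
        members.update((system, c["code"]) for c in concepts)
    return members


if __name__ == "__main__":
    systems = {SYSTEM_URL: CODE_SYSTEM}
    print(sorted(expand_simple(URGENT, systems)))
    print(sorted(expand_simple(ALL_LEVELS, systems)))
```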

Building a Version-Agnostic FHIR Client

For platforms that must support both STU3 and R4 — multi-EHR integration platforms, legacy migration tools, international deployments — building version-agnostic client code is more sustainable than maintaining parallel codebases. The abstraction strategy that works best involves several layers.

The first layer is version detection. Before any resource fetch, query the [base]/metadata endpoint and read the fhirVersion field of the CapabilityStatement it returns. Store the detected version and route all subsequent resource operations through version-specific handlers. Run this detection once per session and cache the result; re-detecting on every call is unnecessary overhead.
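A minimal sketch of that detection step, using only the standard library, might look like the following. The function names, the in-memory cache, and the injectable fetch parameter are assumptions made for illustration; a real client would also need authentication, retries, and cache invalidation.

```python
import json
from urllib.request import Request, urlopen

# Maps the major.minor prefix of fhirVersion to a release name.
_RELEASES = {"1.0": "DSTU2", "3.0": "STU3", "4.0": "R4", "4.3": "R4B", "5.0": "R5"}

_cache = {}  # base URL -> detected release, kept for the life of the process


def release_from_metadata(metadata):
    """Translate a metadata response's fhirVersion into a release name."""
    version = metadata.get("fhirVersion", "")
    major_minor = ".".join(version.split(".")[:2])
    if major_minor not in _RELEASES:
        raise ValueError(f"unrecognized fhirVersion: {version!r}")
    return _RELEASES[major_minor]


def _http_fetch(url):
    """Fetch and decode a JSON document (no auth; illustration only)."""
    request = Request(url, headers={"Accept": "application/fhir+json"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


def detect_release(base_url, fetch=None):
    """Detect the release behind base_url once, then serve it from the cache."""
    if base_url not in _cache:
        fetch = fetch or _http_fetch
        metadata = fetch(base_url.rstrip("/") + "/metadata")
        _cache[base_url] = release_from_metadata(metadata)
    return _cache[base_url]
```

Passing fetch as a parameter keeps the detection logic testable against canned metadata fixtures without any network access. Raising on an unrecognized version, rather than guessing, keeps an unexpected release from being routed silently to the wrong parser.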

The second layer is version-specific parsers. For each resource type where STU3 and R4 differ materially — Patient, MedicationRequest, Observation, Condition, AllergyIntolerance, Coverage, Encounter — implement separate parser methods for each version. Name them explicitly: parsePatientSTU3() and parsePatientR4(). Route to the appropriate parser based on the detected version. This approach is more verbose than a single parser that tries to handle both, but it is much easier to maintain and test.
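The routing idea can be sketched for one element, the MedicationRequest requester, which is nested under requester.agent in STU3 and is a direct reference in R4. The function names use Python's snake_case in place of the camelCase names suggested above, and the dispatch table is an illustrative assumption rather than a prescribed design.

```python
def parse_requester_stu3(resource):
    """STU3: the prescriber reference sits under requester.agent."""
    requester = resource.get("requester") or {}
    agent = requester.get("agent") or {}
    return agent.get("reference")


def parse_requester_r4(resource):
    """R4: requester is itself the reference."""
    requester = resource.get("requester") or {}
    return requester.get("reference")


# One explicit entry per supported release; no shared fallback parser.
REQUESTER_PARSERS = {
    "STU3": parse_requester_stu3,
    "R4": parse_requester_r4,
}


def parse_requester(resource, release):
    """Route to the parser for the detected release, or fail loudly."""
    try:
        parser = REQUESTER_PARSERS[release]
    except KeyError:
        raise ValueError(f"no MedicationRequest parser for release {release!r}")
    return parser(resource)


STU3_SAMPLE = {
    "resourceType": "MedicationRequest",
    "requester": {"agent": {"reference": "Practitioner/1"}},
}
R4_SAMPLE = {
    "resourceType": "MedicationRequest",
    "requester": {"reference": "Practitioner/1"},
}

if __name__ == "__main__":
    print(parse_requester(STU3_SAMPLE, "STU3"))
    print(parse_requester(R4_SAMPLE, "R4"))
```

Note that feeding either sample to the wrong parser returns None without raising, which is the failure mode explicit routing is meant to prevent.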

The third layer is a version-neutral domain model. Both STU3 and R4 parsers should produce the same internal domain objects. Your application should work with a MedicationRequest domain object that has a prescriber field regardless of whether that came from requester.agent (STU3) or requester (R4). The version translation happens in the parser layer; the application logic layer should be version-unaware.
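Coverage gives a compact illustration of a version-neutral domain object, because STU3 carries group and plan identifiers in a flat grouping element while R4 carries them as typed entries in a class array. The CoverageInfo dataclass and both converter functions below are hypothetical names for this sketch, and only two of the possible class types are mapped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoverageInfo:
    """Version-neutral view of the coverage fields the application uses."""
    group: Optional[str]
    plan: Optional[str]


def coverage_from_stu3(resource):
    """STU3: identifiers are plain strings under the grouping element."""
    grouping = resource.get("grouping") or {}
    return CoverageInfo(group=grouping.get("group"), plan=grouping.get("plan"))


def coverage_from_r4(resource):
    """R4: each class entry has a coded type and a string value."""
    values = {}
    for entry in resource.get("class", []):
        for coding in (entry.get("type") or {}).get("coding", []):
            # Keep the first value seen for each class type code.
            values.setdefault(coding.get("code"), entry.get("value"))
    return CoverageInfo(group=values.get("group"), plan=values.get("plan"))


CLASS_SYSTEM = "http://terminology.hl7.org/CodeSystem/coverage-class"

STU3_COVERAGE = {
    "resourceType": "Coverage",
    "grouping": {"group": "G-100", "plan": "P-200"},
}
R4_COVERAGE = {
    "resourceType": "Coverage",
    "class": [
        {"type": {"coding": [{"system": CLASS_SYSTEM, "code": "group"}]}, "value": "G-100"},
        {"type": {"coding": [{"system": CLASS_SYSTEM, "code": "plan"}]}, "value": "P-200"},
    ],
}

if __name__ == "__main__":
    print(coverage_from_stu3(STU3_COVERAGE) == coverage_from_r4(R4_COVERAGE))
```

Because both converters return the same frozen dataclass, downstream code can compare or hash the results without knowing which release produced them.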

The fourth layer is version-specific serializers for any operations that write FHIR resources back to endpoints. Your application logic creates a domain MedicationRequest object; the serializer converts it to either STU3 or R4 JSON depending on the target endpoint's detected version.
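The write path can be sketched the same way. The MedicationOrder dataclass and to_fhir function below are hypothetical, the output contains only a handful of elements, and it has not been validated against any profile; the point is only that the requester shape is chosen by the target release and nothing else in the application needs to know.

```python
from dataclasses import dataclass


@dataclass
class MedicationOrder:
    """Hypothetical domain object produced by application logic."""
    patient_ref: str
    medication_text: str
    prescriber_ref: str


def to_fhir(order, release):
    """Serialize a domain order to a minimal MedicationRequest dict."""
    resource = {
        "resourceType": "MedicationRequest",
        "status": "active",
        "intent": "order",
        "subject": {"reference": order.patient_ref},
        "medicationCodeableConcept": {"text": order.medication_text},
    }
    if release == "STU3":
        # STU3 nests the prescriber under requester.agent.
        resource["requester"] = {"agent": {"reference": order.prescriber_ref}}
    elif release == "R4":
        # R4 flattens requester to a plain reference.
        resource["requester"] = {"reference": order.prescriber_ref}
    else:
        raise ValueError(f"no MedicationRequest serializer for release {release!r}")
    return resource


if __name__ == "__main__":
    order = MedicationOrder("Patient/1", "amoxicillin 500 mg capsule", "Practitioner/9")
    print(to_fhir(order, "STU3")["requester"])
    print(to_fhir(order, "R4")["requester"])
```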

Testing Strategy: What Data You Need for Each Version

Testing a FHIR integration properly requires test data for every resource type your integration touches, in each version you claim to support. Here is what a comprehensive FHIR integration test suite needs:

For R4 Testing

You need Patient resources that include generalPractitioner references (not the DSTU2-era careProvider), BCP-47 language codes in the communication element, and the US Core race and ethnicity extensions using R4 extension URLs.

You need MedicationRequest resources with the flattened requester reference (not the nested BackboneElement), doNotPerform values in both true and false states, and reportedBoolean set in some records to test the reported medication flow.

You need Observation resources that use hasMember references for panel results (not related), and you need panel parent records alongside their member child records so you can test panel assembly logic end-to-end.

You need Condition resources with clinicalStatus populated as a CodeableConcept (STU3 used a plain code), including conditions in active, inactive, resolved, and remission states, plus records with no clinicalStatus at all, to test validation logic that rejects conditions missing a required status.

You need Coverage resources with the R4 class array structure (not the STU3 grouping BackboneElement), including plan, group, and subplan class entries, to test benefits verification workflows.
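Two of the R4 fixture requirements above, panel assembly and clinicalStatus handling, lend themselves to a short sketch of the logic those fixtures are meant to exercise. Both function names are hypothetical, and the panel helper assumes members are referenced with relative references of the form Observation/{id} within the same fixture set.

```python
CLINICAL_SYSTEM = "http://terminology.hl7.org/CodeSystem/condition-clinical"


def assemble_panels(observations):
    """Map each panel id to its resolved member ids and unresolved references."""
    by_ref = {f"Observation/{obs['id']}": obs for obs in observations}
    panels = {}
    for obs in observations:
        refs = [member.get("reference") for member in obs.get("hasMember", [])]
        if not refs:
            continue  # not a panel parent
        panels[obs["id"]] = {
            "members": [by_ref[ref]["id"] for ref in refs if ref in by_ref],
            "missing": [ref for ref in refs if ref not in by_ref],
        }
    return panels


def clinical_status_r4(condition):
    """Return the clinical status code from an R4 Condition, or None."""
    status = condition.get("clinicalStatus")
    if not isinstance(status, dict):
        return None  # absent, or an STU3-style plain code string
    for coding in status.get("coding", []):
        if coding.get("system") == CLINICAL_SYSTEM:
            return coding.get("code")
    return None


if __name__ == "__main__":
    fixtures = [
        {"resourceType": "Observation", "id": "bmp",
         "hasMember": [{"reference": "Observation/na"},
                       {"reference": "Observation/k"}]},
        {"resourceType": "Observation", "id": "na"},
        {"resourceType": "Observation", "id": "k"},
    ]
    print(assemble_panels(fixtures))
    print(clinical_status_r4({
        "clinicalStatus": {"coding": [{"system": CLINICAL_SYSTEM, "code": "active"}]}
    }))
```

Reporting unresolved references separately, instead of dropping them, lets a test assert that a fixture set is complete before it asserts anything about parsing.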

For STU3 Testing

You need the same resource types but with STU3-specific structures: the nested BackboneElement requester.agent on MedicationRequest, the related BackboneElement on Observation for panel members, the plain code clinicalStatus on Condition, and the flat grouping BackboneElement on Coverage. Patient already uses generalPractitioner in STU3; careProvider is the DSTU2 name and only matters if you also support DSTU2.

For Version Detection Testing

You need CapabilityStatement resources for both STU3 and R4, with differing fhirVersion values (3.0.x for STU3, 4.0.x for R4), to test that your version detection logic correctly identifies which version it is talking to. If DSTU2 endpoints are in scope, add a DSTU2 fixture whose resource type is "Conformance".

The most important test you can run is the silent data loss test. Deliberately send an R4 Observation panel through your STU3 parser and assert on what comes out: the resource will parse without errors, but the panel members will be missing. The test passes only if your suite detects that loss. Most FHIR integration test suites test for parse success, not for data completeness.
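A minimal sketch of such a test follows. The STU3 parser here is a stand-in for whatever your codebase uses, and looks_like_dropped_panel is a hypothetical completeness check; the important part is that the test asserts on missing data, not on the absence of exceptions.

```python
def stu3_panel_members(observation):
    """STU3-style parser: reads panel members from the related element."""
    return [
        rel["target"]["reference"]
        for rel in observation.get("related", [])
        if rel.get("type") == "has-member" and "target" in rel
    ]


def looks_like_dropped_panel(observation, parsed_members):
    """True when the raw resource has R4 members that the parser did not return."""
    return not parsed_members and bool(observation.get("hasMember"))


# An R4 panel parent: members live in hasMember, and related does not exist.
R4_PANEL = {
    "resourceType": "Observation",
    "id": "bmp",
    "status": "final",
    "code": {"text": "Basic metabolic panel"},
    "hasMember": [
        {"reference": "Observation/na"},
        {"reference": "Observation/k"},
    ],
}


def test_stu3_parser_silently_drops_r4_panel_members():
    members = stu3_panel_members(R4_PANEL)
    # The parse "succeeds": no exception, just an empty result.
    assert members == []
    # The completeness check is what actually exposes the loss.
    assert looks_like_dropped_panel(R4_PANEL, members)


if __name__ == "__main__":
    test_stu3_parser_silently_drops_r4_panel_members()
    print("silent data loss detected as expected")
```

The same pattern applies in the other direction: run an STU3 fixture through the R4 parser and assert that the completeness check fires there too.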

Real-World Migration Scenarios That Break

Beyond individual resource-level changes, the following real-world integration scenarios are where STU3-to-R4 migrations most commonly produce failures:

EHR Vendor API Upgrades

When a health system upgrades their EHR to a new major version, the FHIR API often upgrades with it. Epic's Spring 2021 release moved the primary patient-facing API to R4. Oracle Health (formerly Cerner) Millennium's production API moved to R4 in phases across 2020-2022. Integrations built before those upgrade dates against STU3 production APIs needed to be updated — but not all were caught in time, because API version upgrades aren't always communicated to integration partners with sufficient lead time.

Payer Data Exchange After CMS Compliance Deadlines

Health plans that complied with the CMS Interoperability rule by the July 2021 deadline deployed new R4 APIs. Third-party applications that had been consuming those plans' prior data feeds — which might have been DSTU2 or STU3 — suddenly faced R4 payloads. Price transparency applications, care coordination tools, and member-facing apps all required updates.

Patient Portal Third-Party App Integrations

Patient portals that support SMART on FHIR allow patients to connect third-party health applications. As portals upgraded to R4, apps built on STU3 resource structures began failing. The apps would authenticate successfully, request patient data, and then misparse the R4 resources they received — presenting incomplete health histories, missing medications, or truncated problem lists to patients.

The R5 Horizon: Planning Ahead Without Overbuilding

R5, published in March 2023, is worth understanding even though it is not yet required for production implementations. The key R5 changes that will matter when it becomes a mandate target:

The practical recommendation for 2026: build against R4 as your current standard. Design your abstraction layers (the version-specific parsers and serializers described above) to make adding R5 support a localized change rather than a system-wide refactor. Don't build R5 support speculatively, but don't build R4 implementations in ways that will make R5 migration unnecessarily painful.

FHIR R4-Conformant Synthetic Test Records — Ready Now

PatientDatasets.com provides validated FHIR R4 synthetic patient records with correct resource structure across Patient, Observation, MedicationRequest, Condition, Coverage, Encounter, AllergyIntolerance, DiagnosticReport, and more. Test your version detection logic, your R4 parsers, and your migration tools against realistic data — no sandbox access required, no PHI, commercially licensed. STU3 format data also available for version-comparison testing.


The Practical Checklist for 2026

To close, here is a concrete checklist for integration teams navigating the STU3/R4 landscape:

The integration team from the opening scenario eventually shipped — six weeks late, with a robust version-detection layer they built during the emergency remediation. Their test suite now validates against both STU3 and R4 synthetic data before any production cutover. The painful lesson became a better engineering practice. The goal is to get that practice without the pain — and that starts with understanding every specific difference documented above, before the first production call goes out.