Data & Architecture

Analysis-Ready Data

The concept that satellite imagery should arrive ready for science, not ready for more preprocessing

DAT-004

Analysis-Ready Data (ARD) is satellite imagery that has been processed to a standard where it can be used directly for analysis without additional preprocessing. This means the image has been geometrically corrected (pixels are in the right geographic locations), radiometrically calibrated (pixel values represent meaningful physical quantities like surface reflectance rather than arbitrary digital numbers), atmospherically corrected (the effects of atmospheric scattering and absorption have been estimated and removed), and often cloud-masked (cloud-contaminated pixels are flagged). ARD is the difference between receiving raw ingredients and receiving a prepared, measured, recipe-ready mise en place.
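
To make "meaningful physical quantities" concrete, the minimal sketch below converts a stored integer digital number into unitless reflectance. The function name is hypothetical, and the default scale and offset are assumptions for illustration: they mirror the quantification value (10000) and additive offset (-1000) published in the metadata of recent Sentinel-2 Level-2A products. In practice these values should be read from each product's metadata rather than hard-coded.

```python
from typing import Optional


def dn_to_reflectance(dn: int, scale: float = 10000.0, offset: float = -1000.0,
                      nodata: int = 0) -> Optional[float]:
    """Convert a stored digital number (DN) to unitless reflectance.

    Returns None for the no-data value so that a missing pixel is not
    silently turned into a physical-looking number.
    """
    if dn == nodata:
        return None
    return (dn + offset) / scale


print(dn_to_reflectance(1500))  # a DN of 1500 maps to a reflectance of 0.05
print(dn_to_reflectance(0))     # the no-data value maps to None
```

The point of the example is the contrast: 1500 means nothing on its own, while 0.05 says the surface reflected about 5% of the incoming light in that band.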

Why It Matters

The vast majority of time spent working with satellite data is not spent doing analysis. It is spent getting the data ready for analysis.

A remote sensing analyst who wants to study vegetation health across a region over five years using Sentinel-2 imagery faces this sequence before any actual science begins: download the scenes, check cloud cover, apply atmospheric correction to convert top-of-atmosphere reflectance to surface reflectance, apply geometric correction to align pixels to a coordinate reference system, resample to a common grid, apply cloud and shadow masks, verify radiometric consistency across scenes, and handle any data gaps. For a single scene, this takes minutes to hours depending on tooling. For a time series across a large area, it can take days to weeks.
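
A back-of-envelope calculation shows how per-scene effort compounds over a time series. The five-day revisit interval matches the nominal two-satellite Sentinel-2 constellation; the tile count and the minutes per scene are illustrative assumptions, not measurements.

```python
def preprocessing_hours(years: float, revisit_days: float, tiles: int,
                        minutes_per_scene: float) -> float:
    """Estimate total preprocessing time for a time series, in hours."""
    scenes_per_tile = int(years * 365 / revisit_days)
    total_scenes = scenes_per_tile * tiles
    return total_scenes * minutes_per_scene / 60.0


# Assumed study: 5 years, 4 tiles, 20 minutes of preprocessing per scene.
hours = preprocessing_hours(years=5, revisit_days=5, tiles=4, minutes_per_scene=20)
print(f"{hours:.0f} hours, about {hours / 24:.0f} days of continuous processing")
```

Under these assumptions, 1,460 scenes at 20 minutes each come to roughly 487 hours, or about 20 days of uninterrupted processing.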

This preprocessing burden is the single largest barrier to wider use of earth observation data. Not the cost of the data — much of it is free through programs like Copernicus. Not the complexity of the analysis — vegetation indices are straightforward arithmetic on spectral bands. The barrier is the gap between what the data provider delivers and what the analyst needs.
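
As an illustration of how simple the analysis step is once reflectance values are available, NDVI is a normalized difference of the near-infrared and red bands. The sketch below works on single pixel values to stay dependency-free; real workflows would typically apply the same arithmetic to whole raster arrays. The sample reflectance values are made up for illustration.

```python
from typing import Optional


def ndvi(nir: float, red: float) -> Optional[float]:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    if denom == 0:
        return None  # undefined when both bands are zero
    return (nir - red) / denom


print(f"{ndvi(0.45, 0.05):.2f}")  # strong NIR, weak red: typical of dense vegetation
print(f"{ndvi(0.20, 0.18):.2f}")  # similar NIR and red: typical of bare or sparse cover
```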

ARD closes this gap. When data is delivered as ARD, the analyst receives surface reflectance values in a known projection with quality flags already applied. They can start computing NDVI immediately. They can compare scenes from different dates because the radiometry is consistent. And where products from different sensors have also been harmonized to a common standard, they can composite across sensors.
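
A small sketch of what quality flags make possible: a per-pixel composite that uses only the observations flagged as clear. The function name, the sample NDVI values, and the choice of a median are assumptions for illustration; plain Python lists stand in for a raster time series.

```python
from statistics import median
from typing import List, Optional


def clear_sky_composite(values: List[float], clear: List[bool]) -> Optional[float]:
    """Median of one pixel's values across dates, using only clear observations."""
    usable = [v for v, ok in zip(values, clear) if ok]
    return median(usable) if usable else None


# NDVI for a single pixel on four dates; the second date is cloud-flagged.
ndvi_series = [0.62, 0.05, 0.66, 0.64]
clear_flags = [True, False, True, True]
print(clear_sky_composite(ndvi_series, clear_flags))  # 0.64
```

Without the flag, the cloud-contaminated value of 0.05 would drag the composite down; with it, the analyst never has to detect the cloud themselves.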

The Committee on Earth Observation Satellites (CEOS) formalized ARD requirements in its CEOS Analysis Ready Data for Land (CARD4L) framework, defining minimum specifications for what constitutes analysis-ready optical and radar data. These specifications cover geometric accuracy, radiometric consistency, atmospheric correction quality, and metadata completeness.
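
The metadata completeness requirement lends itself to an automated check. The sketch below is only an illustration of the idea: the field names are hypothetical and are not taken from the CEOS specifications, which define the actual requirements.

```python
from typing import Dict, List, Sequence

# Illustrative field names only; the real requirements live in the CEOS
# product family specifications and are not reproduced here.
REQUIRED_FIELDS = (
    "crs",
    "geometric_accuracy_m",
    "measurement",             # e.g. "surface_reflectance"
    "atmospheric_correction",
    "cloud_mask_algorithm",
)


def missing_metadata(metadata: Dict[str, object],
                     required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent or empty in a product's metadata."""
    return [name for name in required if metadata.get(name) in (None, "")]


product = {
    "crs": "EPSG:32633",
    "measurement": "surface_reflectance",
    "atmospheric_correction": "Sen2Cor",
}
print(missing_metadata(product))
# ['geometric_accuracy_m', 'cloud_mask_algorithm']
```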

Fabric Connection

Fabric's primary function is producing ARD — and then going beyond it into cross-sensor harmonization.

When Fabric processes a fire detection pattern, it takes raw Sentinel-2 scenes, applies atmospheric correction to derive surface reflectance, computes fire-related spectral indices (Sentinel-2 carries shortwave-infrared bands but no thermal band), applies cloud masking, reprojects to a common grid, and produces a harmonized, analysis-ready product with full provenance tracking. The output is not just ARD in the CEOS sense; it is harmonized ARD that can be directly compared with Landsat-derived products, SAR observations, and historical baselines.

Fabric's processing time for this workflow — approximately 16 seconds for what would take 3-6 hours manually in a GIS — demonstrates the practical value of automating the ARD pipeline. The time saved is not a convenience. It is the difference between analysis that happens and analysis that was too expensive to attempt.

Every step in Fabric's ARD pipeline carries provenance: which atmospheric correction model was used, which cloud mask algorithm was applied, which geometric correction parameters were chosen, and what the resulting quality metrics are. This provenance is not metadata appended after the fact — it is generated as part of the processing chain, implementing Observational Grammar's requirement that every claim traces back to its physical basis.
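
One way to picture provenance that is generated inside the processing chain rather than appended afterwards is a step runner that cannot produce a result without also producing its record. This is a minimal sketch under stated assumptions, not Fabric's implementation: `Product`, `run_step`, and the two stand-in operations are hypothetical names, and the operations are toy arithmetic rather than real corrections.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Product:
    data: Any
    provenance: List[Dict[str, Any]] = field(default_factory=list)


def run_step(product: Product, name: str, func: Callable[..., Any],
             **params: Any) -> Product:
    """Apply one processing step and record how it was done in the same call."""
    result = func(product.data, **params)
    record = {"step": name, "params": params}
    return Product(result, product.provenance + [record])


# Stand-in operations on a list of reflectance values (not real corrections).
def subtract_haze(values, haze):
    return [round(v - haze, 4) for v in values]


def mask_bright(values, threshold):
    return [v if v < threshold else None for v in values]


raw = Product([0.12, 0.95, 0.08])
out = run_step(raw, "atmospheric_correction", subtract_haze, haze=0.02)
out = run_step(out, "cloud_mask", mask_bright, threshold=0.5)

print(out.data)
for rec in out.provenance:
    print(rec)
```

Because each step returns a new `Product`, the input keeps its own (shorter) provenance, and every output can be traced back through the exact parameters that produced it.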

Philosophical Thread

The epistemology of preprocessing. ARD is not just a technical convenience — it is an epistemological stance. The decision to deliver surface reflectance rather than top-of-atmosphere radiance is a claim about what the analyst needs: not what the sensor measured, but what the surface reflected. The atmospheric correction is an interpretation, not a fact — it depends on models with their own uncertainties.
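
The consequence of that uncertainty can be made tangible with first-order error propagation through NDVI. The sketch below assumes the errors in the two bands are independent, and the reflectance values and their uncertainties are made up for illustration; real uncertainties depend on the correction model and scene conditions.

```python
import math
from typing import Tuple


def ndvi_with_uncertainty(nir: float, red: float,
                          sigma_nir: float, sigma_red: float) -> Tuple[float, float]:
    """Return (NDVI, 1-sigma uncertainty) by first-order error propagation.

    Assumes independent band errors and a nonzero nir + red.
    """
    total = nir + red
    value = (nir - red) / total
    # Partial derivatives of NDVI: d/d_nir = 2*red/total^2, d/d_red = -2*nir/total^2
    sigma = 2.0 / total**2 * math.sqrt((red * sigma_nir) ** 2 + (nir * sigma_red) ** 2)
    return value, sigma


value, sigma = ndvi_with_uncertainty(0.40, 0.10, 0.02, 0.01)
print(f"NDVI = {value:.2f} +/- {sigma:.3f}")
```

With these assumed inputs, modest reflectance uncertainties of 0.01 to 0.02 translate into an NDVI uncertainty of roughly 0.036, which is large enough to matter when comparing scenes.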

This connects to Observational Grammar's core principle that every observation is a claim made by a specific sensor under specific conditions. ARD makes those conditions explicit through provenance and quality flags. A system that delivers ARD without quality information is claiming more certainty than the physics justifies.

References

CEOS (2023). "CARD4L: CEOS Analysis Ready Data for Land." Committee on Earth Observation Satellites. https://ceos.org/ard/
Dwyer, J.L., et al. (2018). "Analysis Ready Data: Enabling Analysis of the Landsat Archive." Remote Sensing, 10(9), 1363. https://doi.org/10.3390/rs10091363
Main-Knorn, M., et al. (2017). "Sen2Cor for Sentinel-2." Proc. SPIE 10427, Image and Signal Processing for Remote Sensing XXIII. https://doi.org/10.1117/12.2278218
Vermote, E., et al. (2016). "Preliminary Analysis of the Performance of the Landsat 8/OLI Land Surface Reflectance Product." Remote Sensing of Environment, 185, 46-56. https://doi.org/10.1016/j.rse.2016.04.008
Frantz, D. (2019). "FORCE — Landsat + Sentinel-2 Analysis Ready Data and Beyond." Remote Sensing, 11(9), 1124. https://doi.org/10.3390/rs11091124