Training data for spatial AI

Training signals that match reality—not just benchmarks.

~60 min

Observational Grammar OG-001
Observational Grammar (OG) is the idea that sensors (satellites, radar, spectrometers, thermal cameras) can form a language of evidence about reality that operates independently of human bias, market incentives, or bureaucratic approval chains. Just as grammar gives structure to language, OG gives structure to what instruments can claim about the physical world. It is M33's foundational concept: build systems that let reality set the table, then let markets and decisions work within those constraints, rather than the other way around.

Analysis-Ready Data DAT-004
Analysis-Ready Data (ARD) is satellite imagery that has been processed to a standard where it can be used directly for analysis without additional preprocessing. This means the image has been geometrically corrected (pixels are in the right geographic locations), radiometrically calibrated (pixel values represent meaningful physical quantities like surface reflectance rather than arbitrary digital numbers), atmospherically corrected (the atmosphere's distortion has been removed), and often cloud-masked (unusable pixels are flagged). ARD is the difference between receiving raw ingredients and receiving a prepared, measured, recipe-ready mise en place.

Data Provenance SEC-001
Data provenance is the complete, verifiable record of where a piece of data came from, every transformation it underwent, and who or what performed those transformations. In satellite imagery and remote sensing, provenance is not a nice-to-have audit trail; it is the difference between evidence and hearsay.

Why Geospatial Intelligence Resists General-Purpose AI EO-001
General-purpose machine learning treats location as just another feature in a table. But spatial data has properties that systematically violate the assumptions underlying most AI architectures: non-stationarity, autocorrelation, heterogeneous observation networks, and sensor-specific physics.
The most effective geospatial AI systems are not the largest or most general. They are the ones that encode domain knowledge into their structure. This has implications for how intelligence layers over Earth observation data should be designed.
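
To make the autocorrelation point concrete, here is a minimal sketch of global Moran's I, a standard statistic for spatial autocorrelation, computed on a small regular grid. The function name `morans_i`, the choice of binary rook (4-neighbour) weights, and the toy grids are assumptions made for illustration; they do not come from the article, and a real workflow would more likely use an established spatial statistics library.

```python
from typing import List


def morans_i(grid: List[List[float]]) -> float:
    """Global Moran's I on a regular grid.

    Assumption: binary rook weights, i.e. w_ij = 1 when cells i and j
    share an edge and 0 otherwise.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols

    # Deviations of each cell from the overall mean.
    mean = sum(sum(row) for row in grid) / n
    dev = [[value - mean for value in row] for row in grid]

    # Denominator: total squared deviation.
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        raise ValueError("Moran's I is undefined for a constant field")

    # Numerator: products of deviations over all directed neighbour pairs.
    cross = 0.0
    weight_sum = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1

    return (n / weight_sum) * (cross / denom)


# Toy fields: a smooth west-to-east gradient and a checkerboard.
gradient = [[float(c) for c in range(4)] for _ in range(4)]
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]

print("gradient:", morans_i(gradient))
print("checkerboard:", morans_i(checkerboard))
```

The smooth gradient gives a clearly positive value (neighbouring cells resemble each other), while the checkerboard gives -1 (every neighbour differs). Real Earth observation fields tend to look like the first case, which is why a random train/test split over nearby pixels can leak information between the two sets.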
