9 Remote Sensing in the Design of Sampling Frames
9.1 Outline
Sampling frames are the architectural blueprint of agricultural statistics — the invisible scaffolding that determines which farmers are counted, which fields are measured, and ultimately how nations understand their food systems. For decades, National Statistical Offices (NSOs) have often relied on frames that resembled faded blueprints: agricultural censuses frozen in time, area grids blind to seasonal changes, and farmer registries missing entire segments of the population. As cultivation patterns shift and climate pressures reshape landscapes, these static frames have become increasingly detached from reality. The result has been biased estimates: unregistered smallholders may account for a significant share of production yet remain statistically invisible, while mismatches between registry parcels and actual fields undermine the accuracy of agricultural statistics.
These shortcomings stem from three structural challenges:
Temporal decay – Agricultural landscapes evolve more quickly than statistical cycles. When sampling frames rely on censuses updated every 5–10 years, they rapidly become obsolete as farmers change crops, fields are subdivided, or land use shifts due to climate or market pressures. Carletto et al. [1] highlight how the lack of frequent frame updates erodes the precision and relevance of agricultural surveys even within a single multi‑year survey cycle.
Spatial blindness – Traditional area frames often include large tracts of non‑agricultural land because they lack current spatial information. This leads to inefficiencies, as enumerators spend time and resources visiting locations where no relevant agricultural activity takes place.
Exclusion bias – Farmer registries and list frames frequently omit smallholders, tenants, or producers in informal land‑use arrangements. This results in systematic under‑representation of groups that may contribute substantially to agricultural production but remain statistically invisible.
These challenges propagate into agricultural statistics, affecting production estimates, food security assessments, and evidence‑based policymaking.
9.2 EO as a Partner to Statistical Rigor
Earth Observation (EO) has emerged not as a replacement for statistical rigor but as a powerful complement to traditional sampling theory. When William Cochran published the third edition of Sampling Techniques in 1977 [2], he articulated principles that satellite technology can now operationalize at scale. With Sentinel‑1’s all‑weather radar imaging and Sentinel‑2’s 10‑meter resolution and frequent revisit times, NSOs can build sampling frames that are both statistically robust and dynamically updated.
EO provides the means to:
Update frames frequently: Annual or seasonal cropland masks derived from Sentinel imagery directly address temporal decay by aligning sampling frames with actual land use.
Enable spatial stratification: Up‑to‑date land‑cover maps such as ESA’s WorldCover support the exclusion of non‑agricultural areas and the creation of homogeneous strata, reducing fieldwork costs and improving precision.
Enrich list frames: When registries exist, EO-derived parcel boundaries can validate and supplement farmer records, ensuring more complete coverage.
Crucially, EO-based survey design creates reciprocal benefits: the georeferenced in‑situ observations collected for statistical estimation can also be used as training and validation data for EO‑based crop classification models. This synergy allows survey data and EO models to improve together over successive agricultural seasons.
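The spatial stratification step listed above can be illustrated with a minimal sketch. It assumes the area frame is already cut into segments and that a binary cropland mask (1 = cropland, 0 = other) is available for each segment. The function names, the stratum labels, and the cut points of 5% and 40% cropland are hypothetical choices for illustration, not values from this chapter.

```python
from collections import defaultdict

def cropland_fraction(mask_block):
    """Share of pixels flagged as cropland (1) in a 2-D block of 0/1 values."""
    pixels = [p for row in mask_block for p in row]
    return sum(pixels) / len(pixels)

def stratify_segments(segments, bounds=(0.05, 0.40)):
    """Assign each segment to a stratum from its cropland fraction.

    segments: dict mapping a segment id to its 2-D cropland mask block.
    bounds:   hypothetical cut points; below bounds[0] the segment is
              treated as non-agricultural and left out of the frame.
    """
    low, high = bounds
    strata = defaultdict(list)
    for seg_id, block in segments.items():
        frac = cropland_fraction(block)
        if frac < low:
            strata["excluded"].append(seg_id)
        elif frac < high:
            strata["sparse_cropland"].append(seg_id)
        else:
            strata["dense_cropland"].append(seg_id)
    return dict(strata)

# Toy 2 x 2 masks standing in for segments cut from a cropland map.
segments = {
    "A": [[0, 0], [0, 0]],   # no cropland
    "B": [[1, 0], [0, 0]],   # one pixel in four is cropland
    "C": [[1, 1], [1, 0]],   # three pixels in four are cropland
}
strata = stratify_segments(segments)
```

In practice a survey designer might keep low-cropland segments in a stratum with a small sampling rate rather than drop them, so that errors in the mask do not translate into undercoverage.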
9.3 Matching In-situ Survey Data to Remote Sensing Analysis Needs
Data collected by NSOs could become the main source of training data for EO applications in agricultural statistics. However, while national surveys often use GPS technology, satellite imagery remains largely unused by NSOs. To change this status quo, the Global Strategy to Improve Agricultural and Rural Statistics compiled various use cases in its Handbook on Remote Sensing for Agricultural Statistics, highlighting the missing links between EO-driven surveys and the most common NSO surveys [3]. Experience shows that the sampling design, response design, and quality control of NSO surveys must follow well-documented requirements if EO data are to yield statistically sound results.
EO-related quality assurance of the in situ datasets developed in the FAO EOSTAT and ESA Sen4Stat projects includes two main components: (a) evaluation of the survey design and (b) in situ data assessment using EO data. The quality assessment measures how well a statistical survey (i.e. its sampling and response design) is suited to leveraging satellite imagery in support of agricultural statistics. Many NSOs design their in situ protocols with a focus on aggregation at higher administrative tiers, often overlooking their potential application in EO contexts. Table 1 presents eleven criteria for NSOs to enable the combined use of in situ data for traditional surveys and EO applications.
[Table 1: eleven criteria, grouped into criteria related to the sampling design and criteria related to the response design.]
9.4 EO-based Quality Control of In-situ Surveys
An integral aspect of the statistical survey process is its quality control procedure. Nationwide surveys require heavy logistics involving hundreds of enumerators dispersed across the country. In many countries, digital encoding devices have integrated GPS receivers and communication support, making near real-time quality checks feasible. In Senegal, the Direction de l’Analyse, de la Prévision et des Statistiques Agricoles (DAPSA) employs near real-time quality control to oversee national data collection, enabling the field campaign to incorporate repetition requests and corrections as needed.
Achieving the quality required for EO use places demands on the training of enumerators and presents additional challenges for controllers. Besides being useful for aggregated surveys, in situ data must pass EO-based quality control checks. Such a protocol is even more critical when combining datasets from distinct surveys, which requires strict harmonisation. Experience from FAO EOSTAT and Sen4Stat with list frame and area frame survey datasets led us to establish an EO-based quality control protocol. This protocol relies on existing maps and satellite time series processing as independent sources for data quality control. Table 2 outlines the criteria applied for EO-based data quality control.
9.5 Technical Suitability of In-situ Data
The first three criteria of the EO-based quality control concern the technical suitability of in situ data for EO applications. Typically, surveyors record GPS coordinates for household localization, crop observation placement, or field area measurement. Requirements for geospatial analysis include ensuring the topological soundness of spatial features, which involves verifying polygon closure, identifying duplicate points, and resolving polygon overlaps (Figure 9.1).
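The three topology checks named above (polygon closure, duplicate points, polygon overlaps) can be sketched in plain Python. This is only an illustration. It assumes that each plot is a list of (x, y) vertices in a projected coordinate system and that the polygons are simple, and all function names are hypothetical. A production workflow would more likely rely on a geometry library.

```python
def is_closed(ring):
    """A ring is closed when it has at least 4 vertices and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def duplicate_vertices(ring):
    """Return vertices that occur more than once, ignoring the closing vertex."""
    body = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else ring
    seen, dupes = set(), []
    for pt in body:
        if pt in seen and pt not in dupes:
            dupes.append(pt)
        seen.add(pt)
    return dupes

def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def _edges_cross(p1, p2, q1, q2):
    """True when segments p1-p2 and q1-q2 cross at an interior point."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)

def _contains(ring, pt):
    """Ray-casting point-in-polygon test (points on the boundary are ambiguous)."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def polygons_overlap(ring_a, ring_b):
    """True when two closed rings cross each other or one contains the other."""
    edges_a = list(zip(ring_a, ring_a[1:]))
    edges_b = list(zip(ring_b, ring_b[1:]))
    if any(_edges_cross(a1, a2, b1, b2)
           for a1, a2 in edges_a for b1, b2 in edges_b):
        return True
    return _contains(ring_b, ring_a[0]) or _contains(ring_a, ring_b[0])

# Made-up plots in metres.
field_1 = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
field_2 = [(2, 2), (6, 2), (6, 6), (2, 6), (2, 2)]            # overlaps field_1
field_3 = [(10, 10), (12, 10), (12, 12), (10, 12)]            # ring left open
field_4 = [(20, 20), (22, 20), (22, 20), (22, 22), (20, 20)]  # repeated vertex
field_5 = [(10, 10), (12, 10), (12, 12), (10, 12), (10, 10)]  # far from field_1
inner = [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]              # inside field_1
```

Plots that only touch along a shared boundary are not reported as overlapping by this sketch, which is usually the desired behaviour for adjacent fields.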
9.6 Measuring Spatial Precision of Field Plot Boundaries
The last three criteria of the EO-based quality control rely on satellite imagery analysis. Checking the spatial precision of field plot boundaries requires visual interpretation of high spatial resolution imagery acquired within the survey timeframe. This type of control usually involves overlaying the recorded polygons on frequently updated imagery, such as monthly cloud-free surface reflectance base maps. Figure 9.2 shows one good-quality sample that aligns well with a cultivated field and a second sample whose area covers multiple fields and some trees. Ideally, high-resolution images are used to visually check samples and plot boundaries, and lower-resolution images are then used for classification. For example, 4.8-meter Planet monthly reflectance maps can serve for sample quality control and 10-meter Sentinel-2 images for classification.
From an EO perspective, assessing sample quality requires satellite time series, such as those from Sentinel-2, covering the growing season. Open-source platforms such as the Sen4Stat and Sen2Agri toolboxes [4] can process all Sentinel-2 images acquired during the season. One quantitative indicator of sample purity is the NDVI standard deviation (Figure 9.2, right plot), computed from the values of cloud-free satellite observations.
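As a rough illustration of the purity indicator, the sketch below computes NDVI = (NIR − Red) / (NIR + Red) for each pixel inside a plot and takes the standard deviation of those values. It assumes the standard deviation is taken across the plot's pixels on a single cloud-free date. The reflectance values are invented, and the 0.1 threshold is an illustrative assumption rather than a value from the protocol.

```python
from statistics import mean, pstdev

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns None when the sum is zero."""
    total = nir + red
    return (nir - red) / total if total else None

def plot_ndvi_stats(pixels):
    """Mean and population standard deviation of NDVI over a plot's pixels.

    pixels: iterable of (nir, red) surface reflectance pairs, cloud-free only.
    """
    values = [v for v in (ndvi(n, r) for n, r in pixels) if v is not None]
    return mean(values), pstdev(values)

def is_pure(pixels, max_std=0.1):
    """Flag a sample as homogeneous; max_std is an illustrative threshold."""
    _, std = plot_ndvi_stats(pixels)
    return std <= max_std

# Invented reflectances: one uniform crop plot, one plot mixing crop and bare soil.
homogeneous = [(0.45, 0.05), (0.44, 0.06), (0.46, 0.05), (0.45, 0.06)]
mixed = [(0.45, 0.05), (0.20, 0.15), (0.50, 0.04), (0.18, 0.16)]
```

A plot that straddles several fields or includes trees, like the second sample in Figure 9.2, would be expected to show a higher standard deviation than a plot covering a single crop.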
9.7 Using NDVI Temporal Profiles to Assess Crop Phenology
The last EO-based requirement is the most difficult to address. We assume that crops of the same type, grown in the same agro-climatic zone, have similar planting and growing cycles. Analysing NDVI profiles across all crop samples strengthens crop label confidence. Atypical growth patterns (e.g., varied planting/cycle length or lack of growth) indicate potential mislabeling or mislocated samples. These samples need more scrutiny before they can be deemed usable.
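One way to screen for atypical growth patterns is to compare each sample's NDVI profile with the median profile of its crop type. The sketch below is an assumption-laden illustration. It uses root-mean-square difference as the distance measure and an arbitrary tolerance of 0.15, neither of which is prescribed by the protocol, and the profiles are invented values on a common date grid.

```python
from statistics import median

def median_profile(profiles):
    """Date-by-date median of equally sampled NDVI profiles."""
    return [median(values) for values in zip(*profiles)]

def rmse(profile, reference):
    """Root-mean-square difference between two profiles of equal length."""
    squared = sum((p - r) ** 2 for p, r in zip(profile, reference))
    return (squared / len(profile)) ** 0.5

def flag_atypical(samples, max_rmse=0.15):
    """Return ids of samples whose profile departs from the crop's median profile.

    samples:  dict mapping sample id to NDVI values on a common date grid,
              all for one crop type in one agro-climatic zone.
    max_rmse: illustrative tolerance, not a value from the text.
    """
    reference = median_profile(list(samples.values()))
    return [sid for sid, prof in samples.items() if rmse(prof, reference) > max_rmse]

# Invented profiles for samples labelled with the same crop.
same_crop = {
    "s1": [0.20, 0.35, 0.60, 0.75, 0.55, 0.30],
    "s2": [0.22, 0.33, 0.62, 0.78, 0.50, 0.28],
    "s3": [0.18, 0.38, 0.58, 0.72, 0.57, 0.32],
    "s4": [0.20, 0.21, 0.22, 0.20, 0.21, 0.20],  # no visible growth
}
flagged = flag_atypical(same_crop)
```

Flagged samples are candidates for closer inspection, not automatic rejection, since a genuine late planting would also depart from the median profile.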
Consider Ethiopia’s diverse crop cycles in Figure 9.3. Temporal profiles for barley, fava beans, and teff show outliers indicating marginal sample quality. The differing NDVI profiles for maize samples reveal that a significant portion underwent a double cropping cycle within the observation period. Because the first crop cycle delays planting, the sample population must be divided into groups using clustering methods. The wheat cropping cycle is more complex still, as sowing dates and varietal differences affect cycle length. EO quality control therefore aims to reject samples that are unsuitable for model calibration and output validation. Subsequent EO-derived results, such as crop type maps, area estimates, and yield forecasts, depend critically on this quality control process.
knitr::include_graphics("./images//th_design_frames/QA_plot_Ethiopia.png")
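The division of maize samples by cropping cycle can be sketched with a small clustering example. The text does not name a clustering method, so plain k-means with two clusters is used here as an assumption. The profiles are invented, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_profile(profiles):
    """Date-by-date mean of a list of profiles."""
    return [sum(vals) / len(vals) for vals in zip(*profiles)]

def two_means(samples, iterations=10):
    """Split NDVI profiles into two groups with a plain k-means (k = 2).

    Centres start at the two profiles that are farthest apart, which keeps
    this toy example deterministic.
    """
    ids = list(samples)
    pairs = [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]]
    first, second = max(pairs, key=lambda p: sq_dist(samples[p[0]], samples[p[1]]))
    centres = [samples[first], samples[second]]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for sid in ids:
            d0, d1 = (sq_dist(samples[sid], c) for c in centres)
            groups[0 if d0 <= d1 else 1].append(sid)
        if not groups[0] or not groups[1]:
            break
        centres = [mean_profile([samples[s] for s in g]) for g in groups]
    return groups

# Invented maize profiles: m1 and m2 peak early, m3 and m4 show a short
# first cycle followed by a later main peak.
maize = {
    "m1": [0.2, 0.5, 0.8, 0.6, 0.3, 0.2, 0.2, 0.2],
    "m2": [0.2, 0.6, 0.8, 0.5, 0.3, 0.2, 0.2, 0.2],
    "m3": [0.2, 0.6, 0.3, 0.2, 0.5, 0.8, 0.6, 0.3],
    "m4": [0.3, 0.7, 0.3, 0.2, 0.6, 0.8, 0.5, 0.3],
}
groups = two_means(maize)
```

Each resulting group can then be screened for outliers separately, so that a legitimate double cropping pattern is not mistaken for a mislabelled sample.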
9.8 Summary
In this chapter, we provide recommendations for National Statistical Offices (NSOs) to design in situ data collection campaigns that benefit both conventional statistics and EO-based assessments. Following these guidelines should improve the accuracy of EO-based land use classification.