Overview

This page is the written documentation for our method. A video lecture presented by Professor Jon Wakefield of the University of Washington is also available, along with the lecture slides.

Small Area Estimation (SAE)

Small area estimation (SAE) refers to the process of producing estimates of quantities of interest, such as prevalence rates, for specific geographic areas, even when data for those areas are sparse or unavailable. The surveyPrev R package and the sae4health R Shiny app are designed to generate such estimates using household survey data.

These data are typically collected through complex sampling designs (e.g., stratified two-stage cluster sampling), and the design must be accounted for during modeling. The sae4health app integrates methods from survey sampling and SAE to produce reliable subnational estimates and quantify the associated uncertainty.
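To illustrate why the sampling design matters, the following minimal sketch compares an unweighted prevalence with a design-weighted (Hajek-type) prevalence. It is written in Python for illustration only and is not the surveyPrev implementation; the function name `weighted_prevalence` and the toy data are assumptions made for this example.

```python
def weighted_prevalence(outcomes, weights):
    """Weight-normalized (Hajek-type) estimate of a proportion.

    outcomes: 0/1 indicators, one per sampled individual.
    weights:  survey design weights (hypothetical values in the example below).
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Hypothetical toy data: the last three respondents come from a stratum that
# was sampled at a lower rate, so each of them carries a larger weight.
outcomes = [1, 1, 0, 0, 0, 1]
weights = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_prevalence(outcomes, weights)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy example the two estimates differ because the under-sampled group has a lower prevalence; ignoring the weights would overstate the overall rate.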

Mapped outputs offer intuitive visualizations of geographic variation. When paired with uncertainty intervals, these estimates help identify areas most in need of targeted intervention.


Administrative Levels of Analysis

We follow a common convention for defining administrative levels:

  • Admin-0: National
  • Admin-1: First subnational level (e.g., regions or provinces)
  • Admin-2: Second subnational level (e.g., districts or municipalities)

For example, in Nigeria:

  • Admin-0 is the national level.
  • Admin-1 comprises the 36 states plus the Federal Capital Territory (Abuja).
  • Admin-2 comprises the 774 Local Government Areas (LGAs) nested within the Admin-1 areas.

Administrative structures differ across countries, but they generally follow this nested hierarchy.
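The nested hierarchy can be represented as a simple tree. The sketch below is a Python illustration under stated assumptions: the `AdminArea` class and its methods are hypothetical names introduced here, and only a tiny excerpt of Nigeria's areas is included, not the full set of states and LGAs.

```python
from dataclasses import dataclass, field


@dataclass
class AdminArea:
    """One node in a nested administrative hierarchy (hypothetical helper)."""
    name: str
    level: int  # 0 = Admin-0, 1 = Admin-1, 2 = Admin-2
    children: "list[AdminArea]" = field(default_factory=list)

    def add(self, name):
        """Attach a child area one administrative level below this one."""
        child = AdminArea(name, self.level + 1)
        self.children.append(child)
        return child

    def count_at(self, level):
        """Count the areas at the given level within this subtree."""
        if self.level == level:
            return 1
        return sum(child.count_at(level) for child in self.children)


# Illustrative excerpt only: two Admin-1 areas and three Admin-2 areas.
nigeria = AdminArea("Nigeria", 0)
lagos = nigeria.add("Lagos")
lagos.add("Ikeja")
lagos.add("Epe")
fct = nigeria.add("Federal Capital Territory")
fct.add("Abuja Municipal")

print(nigeria.count_at(1), nigeria.count_at(2))
```

Because every Admin-2 area has exactly one Admin-1 parent, estimates at the finer level can always be grouped back to the coarser level.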


Survey Design Considerations

Most Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS) follow a stratified two-stage cluster sampling design:

  1. Stratification is typically based on a cross of urban/rural status and Admin-1 regions.
  2. In the first stage, clusters (enumeration areas) are sampled within each stratum with probability proportional to size (PPS).
  3. In the second stage, a fixed number of households (e.g., 25) are randomly selected within each cluster.
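The three steps above can be sketched as a small simulation. This is a simplified Python illustration, not the procedure used by any particular survey: it assumes systematic PPS selection in the first stage and simple random sampling of households in the second, and the sampling frame, function names, and cluster sizes are all hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n_clusters, rng):
    """First stage: pick cluster indices with probability proportional to size.

    Uses systematic PPS. The sketch assumes every cluster is smaller than the
    sampling interval, so no cluster can be selected twice.
    """
    total = sum(sizes)
    step = total / n_clusters
    if max(sizes) >= step:
        raise ValueError("a cluster is too large for distinct systematic PPS")
    start = rng.random() * step  # random start in [0, step)
    cumulative = list(accumulate(sizes))
    selected, idx = [], 0
    for k in range(n_clusters):
        point = start + k * step
        # Advance to the cluster whose cumulative size range contains the point.
        while idx < len(sizes) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected


def sample_households(cluster_size, n_households, rng):
    """Second stage: simple random sample of household indices in a cluster."""
    k = min(n_households, cluster_size)
    return sorted(rng.sample(range(cluster_size), k))


def two_stage_sample(frame, clusters_per_stratum, households_per_cluster=25, seed=0):
    """Draw a stratified two-stage sample from a frame of cluster sizes."""
    rng = random.Random(seed)
    sample = {}
    for stratum, sizes in frame.items():
        chosen = pps_systematic(sizes, clusters_per_stratum, rng)
        sample[stratum] = {
            c: sample_households(sizes[c], households_per_cluster, rng)
            for c in chosen
        }
    return sample


# Hypothetical frame: strata are Admin-1 region crossed with urban/rural status;
# each number is the household count of one enumeration area.
frame = {
    ("Region A", "urban"): [120, 80, 200, 150, 90, 110],
    ("Region A", "rural"): [60, 75, 40, 95, 55, 70],
}
sample = two_stage_sample(frame, clusters_per_stratum=2)
for stratum, clusters in sample.items():
    print(stratum, {c: len(h) for c, h in clusters.items()})
```

With only a few clusters drawn per stratum, many Admin-2 areas end up with very few sampled clusters or none at all, which is the situation SAE methods are designed to handle.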

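Surveys are typically powered to produce estimates at the Admin-1 level. At the finer Admin-2 level, direct estimates are often unstable because few clusters, and sometimes none, are sampled in each area. SAE methods make it possible to derive stable estimates at this level.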


Modeling Approaches

The surveyPrev package and Shiny app provide three SAE modeling options, in line with the standard classification from Rao and Molina (2015) [1].


  [1] J.N.K. Rao and I. Molina. Small Area Estimation, Second Edition. John Wiley, New York, 2015.