# Example PDFs for Week 5

These are the anchor examples used in the PDF-processing session. The aim is to keep returning to a small number of real KNBS PDFs and compare three views of each document: the human-readable report, the extracted text, and the structured evidence that could be derived from it.

## 1. CPI March 2026

**File:** `example-pdfs/Kenya-Consumer-Price-Indices-and-Inflation-Rates-March-2026.pdf`

**Use:** clean/simple demo PDF for the PyMuPDF walkthrough.

Suggested question: *What was Kenya's annual inflation rate in March 2026?*

This is the best PDF for the live coding demo because it is short, has a usable text layer, and includes a clear table where the extracted text is mostly readable.
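
The "usable text layer" check can be sketched roughly as follows, assuming PyMuPDF is installed (imported as `pymupdf` in recent releases, `fitz` in older ones). The helper names `extract_page_texts` and `has_text_layer` and the 50-character threshold are illustrative assumptions, not part of the session materials.

```python
def extract_page_texts(pdf_path):
    """Return the plain text of each page, in page order."""
    # Imported lazily so the pure helper below works without PyMuPDF installed.
    try:
        import pymupdf
    except ImportError:
        import fitz as pymupdf  # older PyMuPDF releases expose this name

    with pymupdf.open(pdf_path) as doc:
        return [page.get_text("text") for page in doc]


def has_text_layer(page_texts, min_chars=50):
    """Heuristic: True if any page yields at least min_chars non-whitespace characters.

    The threshold is an arbitrary assumption; scanned PDFs without OCR
    typically yield little or no text.
    """
    return any(len("".join(text.split())) >= min_chars for text in page_texts)
```

Calling `extract_page_texts` on the CPI file and printing the first page's text should be enough to judge whether the table comes through in a readable form.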

## 2. 2025 Facts and Figures

**File:** `example-pdfs/2025-Facts-and-Figures.pdf`

**Use:** table-heavy report.

This is useful for showing that even when text extraction works reasonably well, a better ingestion pipeline should preserve table captions, row/column structure, units, and page references.
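
One way to picture what "preserve" means here is a small record type that keeps a table together with its context. This is a sketch only: the class name `ExtractedTable` and its fields are hypothetical, and the values in the example are placeholders, not figures from the report.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractedTable:
    """One table lifted from a PDF, with the context needed to cite it."""
    source_file: str
    page_number: int                      # 1-based page where the table appears
    caption: str
    columns: List[str]                    # column headers, left to right
    rows: List[List[str]] = field(default_factory=list)
    units: Optional[str] = None           # e.g. a unit note printed with the table

    def cell(self, row_index, column_name):
        """Look up a cell by row position and column header."""
        return self.rows[row_index][self.columns.index(column_name)]

    def citation(self):
        """Short source reference: file, page, and caption."""
        return f"{self.source_file}, p. {self.page_number}: {self.caption}"


# Placeholder values only; nothing below is taken from the actual report.
example = ExtractedTable(
    source_file="example-pdfs/2025-Facts-and-Figures.pdf",
    page_number=1,
    caption="Placeholder caption",
    columns=["Indicator", "Year A", "Year B"],
    rows=[["Placeholder indicator", "value A", "value B"]],
    units="placeholder units",
)
print(example.citation())
print(example.cell(0, "Year B"))
```

Keeping the caption, units, and page number alongside the cells means an answer drawn from the table can still be traced back to where it came from.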

## 3. ICT indicators factsheet based on the 2023/24 Kenya Housing Survey

**File:** `example-pdfs/Fact-Sheet-Key-Indicators-of-ICT-Based-on-203-24-Kenya-Housing-Survey.pdf`

**Use:** visual factsheet / complex layout example.

This is separate from the 2025 Facts and Figures report. It is useful because many values survive extraction, while the visual grouping, the chart structure, and the relationship between labels and values are partly lost.
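
Some of that grouping can be partly recovered from word positions rather than plain text. The sketch below assumes the word tuples returned by PyMuPDF's `page.get_text("words")`, which have the form `(x0, y0, x1, y1, word, block_no, line_no, word_no)`. The helper `group_words_by_block` is hypothetical, and there is no guarantee that PyMuPDF's blocks match the factsheet's visual groups.

```python
from collections import defaultdict


def group_words_by_block(words):
    """Join words that share a block number, in line and word order.

    `words` is a list of tuples shaped like PyMuPDF's "words" output:
    (x0, y0, x1, y1, word, block_no, line_no, word_no).
    Returns a dict mapping block_no to the block's text.
    """
    blocks = defaultdict(list)
    for x0, y0, x1, y1, text, block_no, line_no, word_no in words:
        blocks[block_no].append((line_no, word_no, text))
    return {
        block_no: " ".join(text for _, _, text in sorted(items))
        for block_no, items in sorted(blocks.items())
    }
```

Comparing the grouped blocks with the page image is a quick way to see where a label and its value ended up in different blocks.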

## Later possible additions

If available later, the following would still be useful:

- **2025 Economic Survey** - a large, production-style report, and an example of a report page with multiple attachments.
- **Full 2023/24 Kenya Housing Survey Basic Report** - longer survey report for comparing extractor behaviour across document families.
