Bridging the connectivity gap with offline RAG: a literature review


No published system combines offline RAG, quantized small language models, multilingual embeddings, and educational tutoring for low-connectivity environments. AXAM stands at this convergence.
2.6 billion people still offline (ITU Facts & Figures 2024)
89% of children in Sub-Saharan Africa cannot read by age 10 (World Bank, Africa's Pulse 2024)
47 RAG chatbots for education published in 2024-25 (Applied Sciences survey)
140+ languages supported by Gemma 3 4B (Gemma Team, 2025)

1. RAG systems for education rely almost exclusively on cloud infrastructure

Retrieval-Augmented Generation (RAG) has become the dominant paradigm for educational AI. A 2025 survey in Applied Sciences identified 47 published RAG chatbots for education, yet the vast majority depend on cloud-hosted GPT-3.5/4 or Gemini. A parallel systematic review in Computers & Education: Artificial Intelligence confirms that RAG reduces hallucination and enables dynamic content updates, while highlighting computational cost reduction as the critical unsolved challenge.

LPITutor (Liu et al., 2025): GPT-3.5 + vector DB shows RAG tutors perform "far better than conventional static tutors".
Hevia et al. (2025, arXiv:2510.06255): Closest offline RAG architecture for biology, but uses smaller models and calls for "quantized larger models" — precisely what AXAM delivers: Gemma 3 4B Q4_K_M.
Gap identified: no published system combines offline-first design, educational RAG tutoring, quantized SLMs, and deployment targeting low-connectivity Global South environments.
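The offline-first pipeline this gap calls for can be sketched in a few lines. Everything below is illustrative rather than AXAM's actual code: the bag-of-characters `embed()` is a stand-in for a real encoder such as BGE-M3, and the returned prompt marks the point where a local quantized model (e.g. a Gemma GGUF served by llama.cpp) would generate the answer.

```python
# Minimal offline RAG loop: every step runs locally, with no network calls.
import math

def embed(text: str) -> list[float]:
    """Toy bag-of-characters embedding (stub for a real sentence encoder)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by cosine similarity to the query; return the top k."""
    q = embed(query)
    scored = sorted(chunks,
                    key=lambda c: -sum(a * b for a, b in zip(q, embed(c))))
    return scored[:k]

def answer(query: str, chunks: list[str]) -> str:
    """Build a grounded prompt; a local LLM call would replace the return."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

chunks = ["Photosynthesis converts light into chemical energy.",
          "The mitochondrion is the powerhouse of the cell."]
prompt = answer("How do plants store light energy?", chunks)
```

The point of the sketch is architectural: nothing in the loop requires connectivity, so retrieval quality hinges entirely on the local embedder and index.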

2. Quantized LLMs on consumer hardware have reached production viability

Quantization methods such as GPTQ and AWQ, distributed in formats like GGUF, have shown that 4-bit quantization preserves model quality while enabling CPU-only deployment. Kurt (2025, arXiv:2601.14277) found that Q4_K_M offers a strong quality-compression trade-off: on an AMD Ryzen 7 CPU, Q4_K_M GGUF achieved 47.9 tokens/second, an 18× improvement over FP16 with more than 90% RAM reduction.

Gemma 3 Technical Report (2025): The 4B-IT model is competitive with Gemma 2 27B-IT thanks to distillation and an efficient interleaved local/global attention design. It supports 140+ languages and was trained on 4T tokens.

Qin et al. (2024) provide direct justification: "with a given finetuning and inference budget, it is beneficial to increase parameters while decreasing precision"; that is, a larger quantized model outperforms a smaller full-precision one.
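A back-of-envelope check makes the memory argument concrete. The 4.85 bits-per-weight figure for Q4_K_M is an assumption (a commonly reported llama.cpp value, not taken from the sources above), and this counts weights only; the >90% total-RAM reduction reported in the benchmark also reflects runtime buffers and cache handling.

```python
# Weight-memory estimate for a 4B-parameter model: FP16 vs Q4_K_M.
PARAMS = 4e9  # parameter count of the target model

def weight_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Gigabytes needed to store the weights alone."""
    return params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB

fp16_gb = weight_gb(16.0)          # 8.0 GB at full half-precision
q4km_gb = weight_gb(4.85)          # ~2.4 GB at assumed Q4_K_M density
reduction = 1 - q4km_gb / fp16_gb  # ~70% saving on weights alone
```

Even on weights alone, the quantized model fits comfortably in the RAM of a modest laptop, which is the whole premise of CPU-side deployment.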

3. Multilingual retrieval works for African languages but remains challenging

AXAM uses BAAI/bge-m3 embeddings (100+ languages, 8,192 token window) to query Swahili/Kinyarwanda/French against English MIT OCW transcripts. BGE-M3 (Chen et al., ACL 2024) achieves SOTA on MIRACL and MKQA. The AfriMTEB benchmark (Uemura et al., EACL 2026) covers 59 African languages including Swahili and Kinyarwanda — directly validating evaluation axes for AXAM.

Cross-lingual retrieval challenge: Hits@20 drops 30–50 points in cross-lingual vs same-language retrieval. However, Chirkova & Nikoulina (2024) found that multilingual RAG significantly outperforms monolingual RAG for low-resource languages.
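Hits@k, the metric behind that 30-50 point drop, is straightforward to compute; the ids below are a toy illustration, not benchmark data.

```python
# Hits@k: fraction of queries for which at least one relevant document
# appears among the top-k retrieved results.
def hits_at_k(ranked: list[list[str]], relevant: list[set[str]], k: int) -> float:
    hits = sum(1 for docs, rel in zip(ranked, relevant)
               if any(d in rel for d in docs[:k]))
    return hits / len(ranked)

# Two queries: the first finds a relevant doc in its top 3, the second does not.
ranked = [["d1", "d7", "d3"], ["d9", "d2", "d5"]]
relevant = [{"d3"}, {"d4"}]
score = hits_at_k(ranked, relevant, k=3)  # 0.5
```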

4. Sub-Saharan Africa's education crisis demands offline AI solutions

$54 government spending per student in SSA, versus $8,500 in high-income countries
27% internet access in SSA (GSMA 2024)
92% of OER content in English (UNESCO GEM Report 2023)

Existing offline platforms like Kolibri (3+ million learners) and UNICEF Learning Passport prove offline delivery works, but lack generative AI tutoring. UNESCO (2024) explicitly endorses small language models as "a cheaper, greener route into AI" to bridge the digital divide. AXAM directly answers this call.

5. AXAM fills a unique gap: comparative mapping

Criteria compared: educational RAG · offline-first · quantized SLM · multilingual · low-connectivity target

Yu et al. (ACM SIGCSE TS), 2025
Hevia et al. (arXiv), 2025 (partial coverage)
Eirena & Shah (JLIS), 2025 (partial on two criteria)
EdgeRAG (arXiv), 2024 (N/A on one criterion)
Kolibri (Learning Equality), 2024
AXAM (proposed), 2026 (addresses all five criteria)

Five specific gaps converge in AXAM's design: no offline educational RAG system at scale; none targets Sub-Saharan African education; no system uses MIT OCW as a RAG knowledge base for 120K+ chunks; no cross-lingual educational RAG evaluated for African queries; no empirical evaluation of RAG quality in low-connectivity developing contexts.

6. RAG evaluation has converged on multi-metric, claim-level frameworks

Today's gold standard combines RAGAS (faithfulness, answer relevancy, context precision) with RAGChecker, which achieves the highest correlation with human judgments via claim-level entailment. The ARES methodology adds statistically grounded evaluation from roughly 150 human annotations. For AXAM, cross-lingual retrieval metrics (Precision@k, MRR, NDCG@k) and generation faithfulness are central.
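The retrieval metrics named above can be sketched in plain Python (binary relevance assumed; the document ids at the bottom are illustrative):

```python
# Precision@k, MRR, and NDCG@k over ranked document ids.
# "relevant" holds the gold document ids for each query.
import math

def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Share of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def mrr(all_ranked: list[list[str]], all_relevant: list[set[str]]) -> float:
    """Mean reciprocal rank of the first relevant hit per query."""
    total = 0.0
    for ranked, relevant in zip(all_ranked, all_relevant):
        for i, d in enumerate(ranked, start=1):
            if d in relevant:
                total += 1 / i
                break
    return total / len(all_ranked)

def ndcg_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Discounted cumulative gain at k, normalized by the ideal ranking."""
    dcg = sum(1 / math.log2(i + 1)
              for i, d in enumerate(ranked[:k], start=1) if d in relevant)
    ideal = sum(1 / math.log2(i + 1)
                for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

ranked = ["d2", "d5", "d1"]
relevant = {"d1", "d2"}
```

Claim-level frameworks like RAGChecker layer entailment checks on top of exactly this kind of ranked-retrieval scoring.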

AfriQA and AfriMTEB-Lite enable direct evaluation of Swahili and Kinyarwanda retrieval. AXAM's hybrid evaluation combines LLM-as-a-Judge scoring with human-in-the-loop review of pedagogical correctness.

7. AXAM occupies an unprecedented research position

The literature confirms that each component — RAG for education, quantized SLMs, multilingual embeddings, offline deployment — has matured. Yet no prior work simultaneously combines them for low-connectivity environments. AXAM's contribution is not any single technical innovation but the deliberate integration of proven technologies to confront the assumption that AI-powered education requires reliable internet.

"The technical infrastructure for offline educational AI has matured to production viability — quantization, small language models, multilingual embeddings, and efficient vector stores each have robust research foundations — but no one has assembled these components for the populations who need them most."

Conclusion: What AXAM demonstrates

AXAM answers a fundamental research question: can an offline RAG pipeline achieve retrieval accuracy and answer quality comparable to cloud-based tools? Based on the reviewed evidence (quantized models preserving >95% of downstream performance, multilingual RAG outperforming monolingual RAG for low-resource languages, and vector-based RAG providing effective tutoring) the answer is both plausible and urgent to test. With 2.6 billion people offline, 89% of Sub-Saharan African children unable to read by age 10, and institutional consensus from UNESCO and the World Bank calling for offline AI, AXAM offers a replicable blueprint for equity-focused EdTech.

A literature synthesis of 80+ papers across RAG, model quantization, multilingual embeddings, EdTech for the Global South, and evaluation frameworks.
