Int'l Journal of Data Science and Analytics · 2026 · 22:143

Cracking the Code: Enhancing Development Finance Understanding with Artificial Intelligence

An NLP pipeline on 5 million OECD CRS project narratives reveals a different view of development finance.

CERDI — Université Clermont Auvergne, CNRS, IRD · pierre.beaucoral@uca.fr

At a glance

$30–40B / yr
Climate-finance reporting gap
Between donor self-declarations (Rio markers) and narrative-based classification, 2010–2022.
406 topics
Hidden thematic clusters
Recovered from 5M OECD CRS project narratives (1973–2022) — vs. 234 official purpose codes.
48%
Projects that don't fit cleanly
Share of outlier projects that the sector-code system cannot cluster coherently.

Abstract

Official climate-finance reporting depends on donors self-declaring which of their projects count as "climate" via the Rio markers — a binary flag known to be prone to over-reporting. Testing this at scale has been hard: the OECD Creditor Reporting System (CRS) contains ~5 million project descriptions spanning 50 years and dozens of languages.

This study applies a modern NLP pipeline — multilingual sentence-BERT embeddings, UMAP dimensionality reduction, HDBSCAN clustering, and LLM-assisted labelling — to classify every project from its narrative rather than from donor-declared tags. The method recovers 406 fine-grained thematic clusters and, once aggregated to climate-related topics, produces an estimate of climate finance that is $30–40B/year lower than the Rio-marker figures imply. The gap is economically and politically significant: it is roughly one-third of the $100B/year Paris Agreement commitment.

Beyond climate, the resulting dataset enables granular analyses that sector codes cannot support — from great-ape conservation and LGBTQ rights to the dyadic structure of Syrian humanitarian flows — and provides a reusable, multilingual framework for aid-transparency research.

Why this matters

Every climate-COP reopens the same question: are rich-country pledges actually being delivered? The honest answer is surprisingly hard to produce, because the accounting system we use — OECD Rio markers — is a self-declaration regime. Donors decide which of their projects count as climate-relevant, and those decisions are not independently verified.

A project narrative — what the aid agency says the project does — is a richer, harder-to-game signal than a binary flag. This paper treats the narrative as the primary data, lets a clustering algorithm discover the latent topic structure of global aid, and then compares the resulting topic-based aggregates with the official ones. The divergence is the reporting gap.

Research questions.
  1. Can unsupervised topic modelling produce a more granular classification of development projects than sector codes or Rio markers?
  2. How much does narrative-based climate-finance accounting diverge from Rio-marker-based accounting?
  3. What hidden patterns, niches, and inconsistencies does the richer classification expose?

Method

1. Data

OECD Creditor Reporting System, ~5 million project records, 1973–2022. The text input is the concatenation of project title, short description, and long description. Narratives are multilingual; no manual translation is performed.
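The concatenation step can be sketched as follows. This is a minimal illustration on toy records — the actual CRS column names and any additional cleaning in the replication code may differ:

```python
import pandas as pd

# Toy CRS-style records; real CRS column names may differ from these.
crs = pd.DataFrame({
    "project_title": ["Solar mini-grids", "Appui aux PME"],
    "short_description": ["Rural electrification", None],
    "long_description": [
        "Install 40 solar mini-grids in off-grid villages.",
        "Renforcement des petites et moyennes entreprises.",
    ],
})

# Concatenate title + short + long description into one narrative string,
# skipping missing fields rather than emitting the literal "None".
text_cols = ["project_title", "short_description", "long_description"]
crs["narrative"] = crs[text_cols].apply(
    lambda row: " ".join(s for s in row if isinstance(s, str)), axis=1
)

print(crs["narrative"].tolist())
```

Note that the French record is left untranslated — the multilingual embedding model in the next step handles language variation directly.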

2. Embeddings

paraphrase-multilingual-MiniLM-L12-v2 (sentence-transformers) maps each narrative to a 384-dimensional dense vector. Multilingual by design — English, French, German, Dutch, Spanish, etc.

3. Dimensionality reduction

UMAP to 12 dimensions, preserving local and global structure while sidestepping curse-of-dimensionality issues that degrade Euclidean-based clustering.

4. Clustering

HDBSCAN (min. cluster size = 500) — density-based, hierarchical, optimal-k free, tolerant of the noise/"generic project" problem. Outliers are explicitly identified rather than force-assigned, then optionally re-absorbed via a second nearest-cluster pass.

5. Labelling

Class-based TF-IDF (c-TF-IDF) extracts the most discriminative tokens per cluster; the top-10 tokens and 5 representative narratives are passed to Zephyr-7B-β (fine-tuned Mistral-7B) to generate human-readable cluster labels.

6. Validation

Clustering quality tracked with Silhouette, Davies–Bouldin, and Calinski–Harabasz scores. Outlier reduction trades coherence for coverage and is reported transparently.
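All three metrics are available in scikit-learn. A minimal sketch on synthetic clusters (KMeans stands in here purely to produce labels; in the paper the labels come from the HDBSCAN step):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)

# Toy stand-in for the reduced embeddings and their cluster assignments.
X, _ = make_blobs(n_samples=1500, n_features=12, centers=5, random_state=0)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# Higher silhouette / Calinski-Harabasz and lower Davies-Bouldin indicate
# more coherent, better-separated clusters.
sil = silhouette_score(X, labels)
dbi = davies_bouldin_score(X, labels)
ch = calinski_harabasz_score(X, labels)
print(round(sil, 3), round(dbi, 3), round(ch, 1))
```

Re-absorbing outliers (step 4) typically worsens these scores while raising coverage, which is the coherence-for-coverage trade-off the paper reports.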

Reproducibility. Full pipeline (Python), fixed seeds, and a step-by-step README are available in the replication repository. The paper can be re-run end-to-end on a standard laptop with a GPU — no HPC cluster required.

Explore the data

Three live, interactive views of the 406-topic landscape. Hover, zoom, filter.

Headline results

  • Climate-finance reporting gap: $30–40B/year. Aggregated narrative-based climate flows are systematically lower than Rio-marker totals throughout the 2010–2022 period. The gap widens post-Paris Agreement, consistent with over-labelling incentives after the $100B pledge.
  • 406 vs. 234. The narrative-based taxonomy recovers 1.7× more distinct topics than the OECD purpose-code system, with finer granularity in thematic areas (e.g. "Government & Civil Society — general" breaks into ~30 sub-clusters).
  • Fashion cycles. Topics such as microfinance, coffee-farmer support, and Amazon Indigenous conservation display clear rise-and-fall dynamics that binary sector tags cannot capture.
  • Niche visibility. Small but policy-relevant flows — great-ape conservation, demining, LGBTQ rights, reproductive health — become individually trackable for the first time.

For policy readers

OECD DAC & statistical units

Rio-marker figures should be cross-checked against narrative-based estimates before publication. NLP methods of this kind offer an independent, scalable check on how development flows are classified.

Climate negotiators

Paris-Agreement compliance assessments currently lean on self-reported numbers. An independent, narrative-based estimator puts a harder floor on the "delivered climate finance" claim.

Multilateral & bilateral donors

The 406-topic map surfaces internal inconsistencies — identical project types classified differently across agencies. Useful for portfolio review and accountability reporting.

NGOs, journalists, watchdogs

The interactive explorer and open replication make it possible to audit a donor's climate claims, track a niche flow (e.g. demining, gender-based violence response), or investigate dyadic patterns without custom tooling.

What this paper does not claim

  • Not a causal identification. The paper documents a measurement gap; it does not identify the causal mechanism (strategic over-labelling vs. reporting inertia vs. genuine ambiguity). That's the natural next paper.
  • Narrative quality varies. Reporting habits differ across agencies and over time; the multilingual embedding absorbs much of this, but not all. Results are robust to outlier handling; see Sec. 4.3 of the paper.
  • Lower bound. Because HDBSCAN sets some ambiguous projects aside as outliers, the climate-finance estimate is best read as a lower bound. The truth lies between my narrative-based figure and the Rio-marker figure.

Replication code

Full pipeline on GitHub

Python scripts, requirements file, and step-by-step README. Runs end-to-end on a standard laptop with a GPU — no cluster required.

Open repository ↗

Issues, pull requests, and forks are welcome — especially translations of the labelling prompts and extensions to non-OECD sources (World Bank, AIIB, AfDB).

How to cite

Beaucoral, P. (2026). Cracking the code: enhancing development finance understanding with artificial intelligence. International Journal of Data Science and Analytics, 22, 143. https://doi.org/10.1007/s41060-026-01086-w

BibTeX
@article{Beaucoral2026,
	abstract = {Analysing development projects is crucial for understanding donors' aid strategies, recipients' priorities, and for assessing development finance capacity to address development issues through on-the-ground actions. In this area, the Organisation for Economic Co-operation and Development's (OECD) Creditor Reporting System (CRS) dataset is a reference data source. This dataset provides a vast collection of project narratives from various sectors (approximately 5 million projects). While the OECD CRS provides a rich source of information on development strategies, it falls short in informing project purposes due to its reporting process, which is based on donors' self-declared main objectives and predefined industrial sectors. This research aims to employ a novel and reproducible approach for practitioners and researchers in social sciences that combines machine learning (ML) techniques, specifically natural language processing (NLP), with an innovative Python topic modelling technique called BERTopic, to categorise (cluster) and label development projects based on their narrative descriptions. By revealing existing yet hidden topics within development finance, this application of artificial intelligence enables a better understanding of donor priorities and overall development funding, and provides methods to analyse public and private project narratives.},
	author = {Beaucoral, Pierre},
	date = {2026-04-19},
	doi = {10.1007/s41060-026-01086-w},
	id = {Beaucoral2026},
	issn = {2364-4168},
	journal = {International Journal of Data Science and Analytics},
	number = {1},
	pages = {143},
	title = {Cracking the code: enhancing development finance understanding with artificial intelligence},
	url = {https://doi.org/10.1007/s41060-026-01086-w},
	volume = {22},
	year = {2026},
}

Contact & collaboration

If you work on climate-finance accountability, ODA measurement, aid-transparency NLP, or development-economics methodology — I'd like to hear from you. I'm a PhD candidate at CERDI (Université Clermont Auvergne, CNRS, IRD) working on development and environmental economics; this paper is part of a broader agenda on measuring public-policy flows from text.

pierre.beaucoral@uca.fr
pierrebeaucoral.github.io
LinkedIn · GitHub · Google Scholar