
Analysing development projects is crucial for understanding donors’ aid strategies and recipients’ priorities, and for assessing the capacity of development finance to address development issues through on-the-ground actions. In this area, the Organisation for Economic Co-operation and Development’s (OECD) Creditor Reporting System (CRS) dataset is a reference data source. It provides a vast collection of project narratives from various sectors (approximately 5 million projects). While the OECD CRS is a rich source of information on development strategies, it falls short in conveying project purposes because its reporting process relies on donors’ self-declared main objectives and predefined industrial sectors. This research proposes a novel and reproducible approach for practitioners and researchers in the social sciences that combines machine learning (ML) techniques, specifically natural language processing (NLP), with BERTopic, an innovative Python topic modelling technique, to categorise (cluster) and label development projects based on their narrative descriptions. By revealing existing yet hidden topics within development finance, this application of artificial intelligence enables a better understanding of donor priorities and overall development funding, and provides methods to analyse public and private project narratives.