Making evidence actually usable – Lindsey Moore

45 min

•Feb 4, 2026

Summary

Lindsay Moore, CEO of Developmetrics, discusses how AI and large language models can transform global development by making decades of buried evaluation data usable and actionable. Rather than solving an evidence shortage, development faces an evidence usability problem—millions of reports sit unread in databases. Moore shares five key lessons learned from analyzing USAID's 60 years of evaluations using fine-tuned AI models.

Insights

Development sector has an evidence usability crisis, not shortage—60% of World Bank reports are never downloaded, yet trillions of dollars fund evaluations that remain inaccessible
Generic large language models trained on internet data embed Western, male-dominated perspectives; domain-specific, hand-labeled models trained on local expertise are essential for accurate development insights
Effective AI for development requires 4+ years of expert hand-labeling and taxonomy development before deployment—efficiency gains only materialize after rigorous foundational work
Five evidence-based lessons from 60 years of USAID data: bring delivery closer to households, practice changes behavior (not training), design for scale not pilots, co-create with communities, and strengthen middle-layer implementers
Future progress depends on shared evidence infrastructure across donors and organizations, not more dashboards—integration into real workflows and human decision-making processes remains the unsolved challenge

Trends

Shift from evidence generation to evidence usability as the critical bottleneck in international development
Domain-specific fine-tuned language models replacing generic AI tools for technical decision-making in development
Institutional memory preservation through AI as organizations face staff turnover and funding discontinuity
Locally-led development and co-creation frameworks gaining evidence-based validation over top-down consultation models
Shared evidence infrastructure and interoperability between donor organizations emerging as next frontier
Ethical AI certification frameworks being developed specifically for development sector applications
Middle-layer implementers (teachers, nurses, agronomists) recognized as critical leverage point over high-level policy makers
Behavioral change through practice and peer reinforcement outperforming traditional training and capacity-building workshops
Power-sharing and community monitoring replacing external oversight and control mechanisms in aid delivery
Integration of local data sources and community testimonials to counterbalance historical Western bias in development evaluations

Topics

AI for International Development
Evidence Usability in Global Development
Large Language Models for Development
Fine-Tuned Domain-Specific AI Models
USAID Evaluations and Institutional Knowledge
Development Evidence Infrastructure
Locally-Led Development
Community Co-Creation in Aid Programs
Behavioral Change vs. Training Programs
Delivery at Household Level
Pilot Programs and Scaling Challenges
Middle-Layer Implementation Support
Ethical AI in Development
Knowledge Management in NGOs
Power Dynamics in International Aid

Companies

Developmetrics

Lindsay Moore's women-owned technology social business building fine-tuned AI models to make development evidence usable

USAID

Primary client for Developmetrics; spent billions on evaluations over 60 years; recently dismantled, prompting knowle...

World Bank

Referenced for statistic that 60% of their reports are never downloaded, illustrating evidence usability problem

UNICEF

Contributing to Global AI Commons ethical AI certification framework for development sector applications

World Food Program

Potential partner in shared evidence infrastructure initiative across donor organizations

German Government

Early adopter of AI in development sector alongside USAID, FCDO, and NORAD

FCDO

UK Foreign, Commonwealth & Development Office; funding research on integrating evidence into real workflows

NORAD

Norwegian development agency moving forward with AI implementation in development sector

Red Cross

Mentioned as potential participant in shared evidence infrastructure across organizations

Global AI Commons

Creating ethical AI certification framework for development sector with UNICEF and other stakeholders

People

Lindsay Moore

CEO and founder of Developmetrics; former economist at USAID for 11+ years; expert on AI for development evidence

Dan Bannick

Host of In Pursuit of Development podcast; political scientist from University of Oslo interviewing Lindsay Moore

Pedro Concesario

Human Development Office official tracking downloads and citations of Human Development Report

Johanna Dolson

Colleague of Dan Bannick who co-authored piece on witchcraft and political decision-making in Malawi

Quotes

"We don't have an evidence shortage, we have an evidence usability problem."

Lindsay Moore•Early in episode

"Every model is an opinion. You know, it has an underlying data training set, which is mostly all the Internet data, which is a Western male perspective."

Lindsay Moore•Mid-episode discussion on bias

"AI is just like basic research methods. You have a research question, then you have to think of what's the data that you're going to answer this question?"

Lindsay Moore•Methodology discussion

"The only way I know to actually get real meaning is bringing people to do this hard work of interpreting it. Otherwise, you're relying on an interpretation which you can't verify."

Lindsay Moore•On hand-labeling requirement

"Just because you build it doesn't mean they will come. Right. Just because you write it doesn't mean they're going to read it."

Lindsay Moore•On evidence integration challenge

Full Transcript

You are listening to In Pursuit of Development with Dan Bannick.

Much of the conversation on artificial intelligence these days is dominated by what we fear, such as deepfakes, disinformation and the erosion of trust. But there's also another story that deserves more attention. AI as a force multiplier for development. AI that helps us learn faster, make better informed decisions, reduce waste in time, funding and effort, detect emerging risks earlier, and target interventions more precisely where they will have the greatest impact.

This is particularly important because global development does not necessarily face an evidence shortage problem. Rather, it often faces an evidence usability challenge. For decades, donors, governments, and civil society organizations have produced mountains of evaluations and so-called lessons learned. However, that knowledge has often ended up scattered across unsearchable databases, buried in PDF files, or lost when staff and organizational priorities change. Although it has now become fashionable to highlight the importance of evidence-based decision-making, in reality, decisions continue to be made under pressure with incomplete information and very human shortcuts. So what would it look like to treat all that accumulated experience as a living resource? What if we could connect patterns across thousands of projects while still keeping context, nuance, and local meaning intact?

My guest is Lindsay Moore, CEO and founder of Developmetrics, a women-owned technology social business committed to driving social impact and promoting equity worldwide. Lindsay brings the perspective of someone who, before she founded Developmetrics, spent years inside the system with more than a decade as an economist at USAID. One example of how AI and predictive analytics can support evidence-based decisions comes from Lindsay's recent work on land use policy.
She and her team used a human-centered and fine-tuned language model which was trained on tens of thousands of expert-coded excerpts from USAID agricultural evaluations. The model can quickly identify and classify real interventions from water management to property rights. That helps policymakers cut through information overload without losing context. Thus, with a clear taxonomy and carefully labeled evidence, AI can turn decades of buried documentation into usable insight that is faster and cheaper than manual review. Another example is how Developmetrics used a domain-trained model to preserve learning from USAID after its closure. They turned decades of evaluations into structured and searchable evidence. What emerges are clearer patterns in what tends to work across settings. It is AI as institutional memory, strengthening judgment rather than substituting for it. In our conversation, Lindsay and I explore the practical and ethical questions of building AI for development. How can we best make development knowledge usable? Not just as a tool that sounds intelligent, but a system that reads evidence carefully, organizes it transparently, and helps practitioners find what is most relevant. And then there are the questions about who defines the categories, whose perspectives are missing, and how we can build AI that supports better decisions without reproducing old power imbalances in new digital form. The response to Season 6 has been fantastic, with thousands of listeners tuning in from around the world. The show has also been a regular presence in the top 100 science charts on Apple Podcasts in the United States and across several other countries. To help me reach even more listeners, please follow and subscribe to the show on Apple Podcasts or your preferred podcast platform. And if you enjoy the episode, I'd be very grateful if you could share it on social media and help spread the word. 
This is Dan Bannick, and you are listening to In Pursuit of Development.

Lindsay, wonderful to see you. And thank you so much for accepting my invitation at short notice. Welcome to the show.

Yeah, thank you so much for having me.

So, Lindsay, you know, there's so much talk these days about all the negative stuff with AI. We are worried about how disinformation, misinformation, fake news, all of these aspects that AI is facilitating, are going to weaken democracy. And I'm a political scientist. I've been writing about it. But there's also the other side of AI that I feel sometimes doesn't get enough attention, which is the positive side, how AI can be a force multiplier. It can actually be a force for good. And so I want to start there, Lindsay. If you were to describe AI for global development, I've read in your work that it hasn't really been used that much yet. But tell us about all the possibilities that AI offers in this pursuit of development.

Yeah, sure. I love that premise. Thank you. Yeah, I mean, the possibilities are endless. And you're right, AI in international development has not... I mean, if you look at the philanthropy sector, I think they were much quicker to make some big investments in AI, many of which backfired, actually, in a lot of ways. Whereas we see the development sector really thought about it for a long time and honestly is still thinking about it. USAID was really the first to move forward with this kind of thinking. And now we're seeing kind of the German government, FCDO, NORAD moving forward. But the development sector really has to think a lot before they jump into these technologies, because it's not always about efficiency and saving money as it is in other sectors. It's really about saving lives. And so I think there's been a lot more thinking than acting, but we're starting to see some movement. And so what does that movement really mean? You know, what potential could it hold?
The thing I'm most excited about is just that, you know, there's been so much incredible knowledge created in the development sector, so many reports. I mean, when I worked at USAID, I was an economist at USAID for over 11 years in the Foreign Service. You know, I would spend days in the field talking to farmers about, you know, their irrigation ditches and they would be expecting, you know, that conversation to yield some results. And I would write it in a report and then no one would ever read it. Not even my mother would read these reports. And so there's all this knowledge out there that hasn't been connected. And now with AI, we can just light up all that knowledge by connecting it together. So I think the past is really focused on RCTs, evidence, how do we build it? And that work has been great, but it hasn't been used. The World Bank says that, I think they say about 60% of their reports are not even ever downloaded. So we don't have an evidence shortage, we have an evidence usability problem. And so now we have the ability with AI to turn this evidence into something that's really usable. And I think that's what's so, so exciting because we have less money than ever in the sector. So we have to use it more effectively. And I think it provides that opportunity.

Well, a lot of what you're saying actually applies to many of our fields. I mean, I'm thinking about academia. We produce articles and books and you keep wondering, apart from the citation index, who actually reads. I was talking to Pedro Concesario from the Human Development Office, and he was telling me about how his office tracks the number of times the Human Development Report is downloaded and cited. So everybody wants to, of course, make sure that whatever evidence they produce, whatever lessons are learned, are used effectively. But in our fields, there's a lot of evidence that is being generated all the time.
But I suppose sometimes we are bombarded with too much evidence. We can't tackle this information overload, right? I'm thinking that is where these large language models, the LLMs, right? They help out in trying to synthesize, make sense. And I'm thinking it could be in relation to health. It could be in relation to education or social cash transfers or just decision-making. I mean, I think that is, for me, at least from a political science perspective, if I was a decision-maker and I had all of this evidence somehow systematized, perhaps I would be better able to make more informed decisions. Is that how you think also?

I mean, that's exactly right. And that's where there's both the opportunity and the challenge and where we get a lot of the pushback from organizations. It's not just any AI, right? You can't take GPT and slap it on data from women in Ghana and ask how do we better empower these women in Ghana, right? Because every model is an opinion. You know, it has an underlying data training set, which is mostly all the Internet data, which is a Western male perspective. And the algorithms are also mostly Western male produced. Right. And so, yes, it can do that. It definitely can do that. But the how is really, really important. And I think this is where I really hope to see the development sector kind of put their flag in the ground and say we're going to create our own models and we're going to work with the local community. So when we ask these women, you know, what does empowerment mean to you? It's not the narrative of Western male, but it's the narrative of the women in those communities. And same with the development sector. You know, at USAID, we fine-tuned very carefully what does resilience mean? You know, we spent months saying, what does resilience mean? Is this resilience? Hand-labeling that data and then using that to teach the algorithm. And, you know, even at USAID, different offices had different opinions.
So, for example, one office thought that agriculture, you know, a fairly obvious definition, included livestock and another office didn't. So if you say what agriculture projects have we done, they would have different answers. And so every definition has to be thought through and who is, you know, who is behind that definition. And so that's where I think the opportunity is, to actually really bring in these perspectives, bring in this incredible technical expertise that exists out there. And also, I mean, it's a huge opportunity to include all the technical experts who have done so much thinking on every single perspective and really bring that in, bring these voices into the conversation. So yes, if we can do that, then it would be an incredible opportunity. And I think that's where we get the debate often in between those two poles.

So it's not just about the large data sets, our ability to efficiently scour through these data sets, but to make them actually more accurate. It's to reduce these misinterpretations, but that requires training. That requires humans coding these data sets, if I understand correctly. And that is time consuming, right? So to strive for efficiency, you have to actually put in the hard yards first.

You do. I mean, I always tell people that AI is just like basic research methods. You have a research question, then you have to think of what's the data that's going to answer this question? What's the framework that we're using? AI is no different. And that's why I keep telling people, if you're a technical expert, you're not going to lose your job if you can bring in your frameworks, your rigor, your thinking, your data sets, because that's how it should be treated, like academic, you know, models. If you just want to ask a broad question, you know, what projects have we done in Kenya? Sure, you don't need to understand what's a project, what's Kenya. So Copilot, any of those tools, are very good for generalized information.
I mean, these models have been trained on trillions of parameters. More and more parameters keep coming in so that they can get these incredible answers that sound so good. And it has its place. But if we're going to ask technical questions, we need technical thinking behind it. So it's like giving yourself, you know, 200 Ph.D. students to help you understand the evidence faster and you can still guide that thinking.

That sounds ideal. I would love to do that. But the hand labeling is time consuming.

That's true. And I don't see a way around it, honestly.

I'm also curious about what you just said. I mean, think about development that has been done in our parts of the world, the aid industry for the last 50, 60 years. A lot of the reports and evaluations that have been produced. There may have been certain very Western perspectives.

There has been.

You mentioned the male perspective, the global north perspective. It could be an arrogant, patronizing perspective. You know, a lot of advice and preaching, which may also have implications for evaluations, the way in which these were done. How do you correct for that, Lindsay? How do you introduce a more female global south perspective in all of this?

I love that question. And I wish more people were asking that exact question because I don't hear it really honestly being asked. But it's true. I mean, evaluations are biased. That doesn't mean we shouldn't understand them. We should still analyze them. And what we saw with USAID data is that one evaluation would have these lessons and recommendations, and then the follow-on would have the same lessons and recommendations. We see that as a pattern. So the same kind of project throughout 60 years, we see these same lessons and recommendations. So, yes, there's bias in there for sure. But there is also some incredibly useful information.
And because we have such a huge amount of qualitative data, you can almost analyze it like quantitative. You have so many excerpts that over time you can look at these patterns and understand kind of what it's saying, but you have to read it in the context with which it's in. And that's the problem with these large language models. Often the citations are wrong and you can't really understand what is the context. So if you are looking at evaluations, you can go into it thinking, yes, okay, but this obviously has some bias. What other data sets can we add? Or if we're doing a project... There was one project we started with USAID in Papua New Guinea. Unfortunately, we weren't able to finish it because, well, of course, USAID doesn't exist anymore, sadly. But, you know, we used the algorithm that we at Developmetrics built for USAID. And we asked, how do we, you know, help work on gender-based violence in Papua New Guinea? And, of course, it came out with the answer based on all of USAID evaluations. But then we brought in testimonials from the local communities there with women, just kind of transcripts, conversations. And then the recommended outcome changed completely. It said that we should work on sorcery accusation. So sorcery, being accused of being a witch, was the number one cause of gender-based violence for this community. And that would have never been picked up in, you know, a different algorithm. So we have to source local data, which again is back to basic research methods that we need to bring in. And there should be more donor support, or any donor support, any support for this type of making sure that we don't whitewash the narrative out there with just a few voices.

A colleague of mine, Johanna Dolson, and I, together with an African colleague, we recently wrote a piece on the role of witchcraft and political decision making. And much of the data was from Malawi.
And the argument is something that a lot of Western donors are not very comfortable with, you see, because we can, from a Western perspective, emphasize good governance, transparency, sharing of information, accountability. But in reality, in many societies in Asia, in Africa, even in some parts of our world here in the North, there is suspicion. You know, people are distrustful. And because people are afraid of being bewitched, they don't wish to be transparent. They don't want to share any form of information. And this is something that plays out rather openly. But our impression, at least in that study, was that a lot of Western actors are very uncomfortable, don't want to talk about it.

But to return to USAID, Lindsay, so one of my favorite recurring themes this season has been the dismantling of USAID. And I've had a lot of people, including the former chief economist, talking about it. And I know that you worked in USAID. Before we talk about what your company does and how you use AI, how was it for you to be an economist in Bangladesh? I mean, what was it that was working and what wasn't? I read somewhere you wrote that sometimes you had to change or make decisions at very, you know, short notice. And one of the things I think that motivated you to work in this field is that there just wasn't enough information to make these decisions correctly and quickly. So help us understand how was it in Bangladesh or in many of the other countries that you were working in?

Yeah, I mean, Bangladesh was my first post. I came right out of LSE and was a country economist for Bangladesh. So I had a perception of what I would be doing. And then, of course, when you start a job, I think you get disabused of those perceptions often quite quickly. And yeah, you know, sometimes you just get a rush of money coming in.
And I think one thing that maybe is missing in a lot of the narrative on the dismantling of USAID, and what I learned the hard way, is that, yes, the intentions of USAID were to help people around the world solve some of the most intractable challenges. But more than that, it was the soft arm of the U.S. government. You know, it was the soft power. And the carrot works very well sometimes, right? So sometimes, you know, a government would get its aid, certain money taken away as a tool by the U.S. government. And that would end up on my lap to program in a country like Bangladesh, or the Dominican Republic where I worked, or other countries. And so I don't know, that's a little bit outside the AI conversation. But just to say that aid has a soft power component that I think people in aid don't want to talk about because that's not why they join. They join because they want to help people. But it is part of it. And so, yes, when you have to program money quickly, you realize how this whole narrative of evidence-based decision making gets thrown out the window so, so quickly because you just have to get the money out the door.

It could be December, right? It's Christmas around the corner and a new financial year starting, you have to get the money out.

Yeah. And when I was first, you know, posted in Bangladesh, I expected that, you know, there were people higher up than me that understood exactly what was happening and were checking what I was doing. And, you know, it was, but as you get higher and higher, you see, you have less and less time and it gets less. And it's not because people don't want to use the evidence or they're not intelligent. It's just because it's humanly impossible to go through everything.

So what do you then do? What kind of metrics do you use to make these decisions? Or what did you find that worked best for you?
Well, you know, I think a lot of people find that the findings of our algorithms, actually, they already know it. So I think trusted people who have worked in the field for a long time, they have their own kind of internal understanding from talking to people, from experience. They generally rely on intuition, and often we find that AI validates that same intuition. So they're never like, oh, wow, you know, we should bring delivery closer to households. What an incredible realization for them. They're like, yes, obviously I did this project and I realized it. But the problem is that you don't often know where those people are. They leave. And so you don't have kind of a centralized institutional knowledge where those people exist. So, you know, sometimes I would just guess, you know, just to be totally honest, and you read horror stories in aid of people who guessed on microfinance. And then, you know, farmers ended up killing themselves in India or, you know, so it's scary. And that is why I left USAID to start Developmetrics to try to solve that.

One of your early clients was USAID. And what I find fascinating reading in this piece, and I'll put it in the show notes, which was published recently in Stanford Social Innovation Review: When USAID shut down, its lessons nearly vanished. AI helped recover them. That was the title. Now, it turns out that USAID had been spending billions, was it 30 billion in impact evaluations over the last, I don't know, how many decades, right?

Not just impact, less impact, honestly. There were fewer impact, just overall evaluations, midterm and endline, but also some impact.

So, I don't know, thousands or billions or millions of pages of PDF files, a lot of them, right?

And digitized. So, you had typewriter ones from the 60s. I mean, USAID did an incredible job of digitizing what they could find.

So what was it that you were asked to do a few years ago? This is before USAID was dismantled.
Was it to sort of systematize this evidence or to make this more up and running for the current use and using the latest technology?

Yeah, we started in 2021, just exactly helping to solve exactly that question I was asking: what should we do? Well, you know, honestly, we started with: what do we do? So USAID didn't even know what are all the outcomes that we've tried to achieve? What are all the interventions that we've used to achieve those? Just having that taxonomy, to use a really boring word that I've become incredibly passionate about. But taxonomy, you know, what do we do and how is it defined? That's kind of the foundation which became the knowledge graph of their algorithm, which is, you know, everything we've done. First, you have to understand what you've done in order to then populate it with the evidence and then look at the results. So that's really where we started, just figuring out what is it that they've done? You know, our first question was, what's every digital intervention we've ever done in Africa, sponsored by the Africa Bureau? They just wanted to know what have they done. And so we used large language models, you know, very nascent back then, to actually go through it all and look at everything that they've done. And from that, they were eventually able to use that evidence to respond to congressional taskers. So, you know, Congress is always messaging you saying, what did you do with identity verification in Malawi? And then you have to say, oh, shoot, you have to call the Malawi mission. You have to look through documents manually. So instead of doing that whole exercise, they were able to reduce that time and save, I think, thousands and thousands of hours of staff time. And so that's how we started.

And all of this was because AI was at that time getting a lot of attention and people somewhere up in the system thought, this is about time we used AI to systematize the evidence? Or was it just accidental?
They didn't know. I mean, large language models weren't even a household name by then. And I mean, GPT, you know, was definitely operating, but they hadn't released their chat yet. So actually, USAID did not allow large language models, they made a ban against it completely. So we started before. Actually, we went under investigation by the General Counsel of USAID.

Really?

Three or four years later, because we had been using it for so long that when they came out and finally made a policy against it, we had already been implementing it for so long, which then from that USAID agreed to sponsor our FedRAMP certification, which is a certification which allows you to go through all the evidence using AI as a technology kind of certification, because they saw the value of money and time saved. But at the beginning, no, they didn't understand it. Artificial intelligence was still kind of a crazy idea. I had a PowerPoint with like a big tornado graphic that was like algorithm. And then, you know, your evidence, I would peddle that around trying to explain it. And people, I think, you know, people thought I was crazy, but they saw how quickly they could get to the evidence. And so that's in the end what convinced them.

But help my listeners understand, Lindsay, the difference between a large language model, LLM, and a development evidence LLM. Is that what you call it? DELLM? The development evidence large language. So how do you go about constructing the DELLM?

Yeah, so basically what we do is we separate understanding from generation. So we create a specialist model that reads and structures evidence, and then a generative model writes it, so it's grounded in what was already retrieved. So you can always kind of think of it as like a librarian and a policy drafter. So the librarian finds the best passages and labels them based on this taxonomy that you've created with the, you know, with the specialist,
and then the drafter writes a synthesis using only what the librarian kind of already surfaced. So, I mean, from a methodological perspective, we fine-tune an encoder to do a high precision reading. So it extracts and it tags enriched excerpts.

So did you go through all of these millions and millions of pages of text?

Yeah.

And that took years.

Oh, yeah, about four years. Yeah. Of hand labeling text with experts from so many different universities. We started with USAID, but then every time we worked with an organization, we would label more. And we're still doing that. I mean, we're still, you know, when you use kind of these generalized LLMs, you get good sounding answers, but they're hollow. You know, they have this kind of missing. So the only way you can do that is by hand labeling and really tuning it to what people think. So every time we get an answer that someone doesn't like, we say, OK, why? OK, it's not understanding this term. OK, who knows this term? Let's label the data. And every time we have to label less and less data as our archive gets bigger and bigger.

I would be really worried to take on such an ambitious task.

Well, we did it before AI was popular, you know, and this was how basic research was done. You know, people using Dedoose or NVivo going through labeling and labeling text. I mean, so we weren't in the whole AI rush that happened. And still to this date, the only way I know to actually get real meaning is bringing people to do this hard work of interpreting it. Otherwise, you're relying on an interpretation which you can't verify.

And who would have thought four years ago that we'd be sitting here talking about the demise of USAID and all of these lessons that were learned. So, I mean, in hindsight, that was a very clever decision on whoever's part to initiate this. So let's cut to the chase and tell our listeners the main conclusions you arrived at in this exercise, Lindsay.
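
Editorial aside: the "librarian and policy drafter" split described earlier in this exchange can be sketched roughly as below. This is a toy illustration, not the Developmetrics implementation. The taxonomy, the excerpts, and the function names `librarian` and `drafter` are all invented for the example, keyword matching stands in for the fine-tuned encoder, and a string template stands in for the generative model. The only point it tries to show is that the drafter sees nothing except what the librarian has already tagged.

```python
from dataclasses import dataclass

# Hypothetical taxonomy: intervention label -> trigger phrases.
# In the approach described, labels come from expert hand-labeling and a
# fine-tuned encoder; keyword matching here is only a stand-in.
TAXONOMY = {
    "water_management": ["irrigation", "water management"],
    "property_rights": ["land title", "property rights"],
}


@dataclass
class Excerpt:
    source: str          # made-up document identifier
    text: str            # made-up excerpt text
    labels: tuple = ()   # taxonomy labels assigned by the librarian


def librarian(excerpts, taxonomy):
    """Tag each excerpt with taxonomy labels; drop excerpts with no label."""
    tagged = []
    for ex in excerpts:
        lowered = ex.text.lower()
        labels = tuple(sorted(
            label for label, phrases in taxonomy.items()
            if any(phrase in lowered for phrase in phrases)
        ))
        if labels:
            tagged.append(Excerpt(ex.source, ex.text, labels))
    return tagged


def drafter(question_label, tagged):
    """Write a synthesis using only excerpts the librarian surfaced."""
    relevant = [ex for ex in tagged if question_label in ex.labels]
    if not relevant:
        return f"No retrieved evidence for '{question_label}'."
    lines = [f"Evidence on {question_label} ({len(relevant)} excerpts):"]
    lines += [f'- "{ex.text}" [{ex.source}]' for ex in relevant]
    return "\n".join(lines)


# Invented sample corpus, for illustration only.
corpus = [
    Excerpt("eval-001", "Irrigation ditches were maintained by farmer groups."),
    Excerpt("eval-002", "Land title registration reduced boundary disputes."),
    Excerpt("eval-003", "The workshop was attended by 40 officials."),
]

tagged = librarian(corpus, TAXONOMY)
print(drafter("water_management", tagged))
```

Because every line of the synthesis carries its source identifier, a reader can trace each claim back to an excerpt, which is the verifiability concern raised in the conversation about wrong citations in general-purpose models.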
I read about five overarching lessons learned, and I'm sure my listeners would be very intrigued. So could we briefly discuss some of these lessons? Yeah, I mean, the first one was just bringing delivery closer to households. We found that programs really perform best when the decisions, the follow-ups, and the problem solving happen where people actually live. So, you know, the farm, the school, the clinic, and not in an air-conditioned meeting room 100 miles away. It sounds kind of obvious, but that's how development works. I mean, I spent most of my time in country in an air-conditioned embassy conference room. Right. And this is an important point. It's not always happening from Washington or Oslo. It's also from the capital cities, within a fortress of an embassy building. Exactly. And, you know, there was always the suggestion: go out into the field, go meet people. But then there was never the budget for it. Especially in Bangladesh when I was there, there were hartals, it became very politically unstable, and we could only get around in armored vehicles. So how are you going to go into the field? How are you going to go to schools when you're in this type of situation? And so all the decisions were really made from these air-conditioned office buildings or armored vehicles. There was always this tension around actually getting out and talking to people. Even though there was that overall desire and encouragement, it rarely happened. So that was really key. You know, sometimes you know these things, but actually implementing them is much harder. So that was the first: bring delivery closer to households. I found the second one also super interesting: practice changes practice.
And what I was thinking about there was that there's been this polarized debate, Lindsey, as I'm sure you know, in many parts of the world where a lot of workshops and training sessions are advocated by Northern donors for capacity building, for training purposes, et cetera. But it often turns out to be more interesting for participants to get a per diem, you know, the daily allowances. And a lot of the lessons are forgotten. So I'm super excited that you identified this also to be a lesson. It came up again and again. And, you know, I experienced it myself as a USAID employee. Whenever there was a problem, it was, you know, train them, train people. That was always the answer. And in a way, it's a very hopeful outlook: oh, the problem is just that they don't understand, we just need to tell them. So, constant training. And one of the most frequent metrics that we saw for judging the success of a project was the number of people trained. But we all know, we've all sat through some of these trainings, and you mostly just look forward to the coffee breaks. Often the most valuable part of these trainings is meeting the people, going back to the first lesson, rather than the actual PowerPoint lecture that is being given on stage. So we found that behavior change really only happens when skills are practiced in the real world and reinforced by peers, with this more hands-on learning, and that really a lot of these trainings, and the money going to them, are wasted. And, you know, I want to add something to this, something that I've been telling some of the aid agencies in recent consultations with them: the danger that when you invite people for training, you're actually taking them away from their offices, from whatever they were doing. In Malawi, a country I know rather well, you sometimes have an entirely empty office because everybody is attending some workshop.
Now, there's another justification for this, because at least from the bureaucratic perspective, they say, we get such a low salary, this is our way of topping up. So the jury, I think, is still out. I'm really intrigued. I'd like to know more about this. We can continue the conversation as to when these trainings actually work, and when it is just more performative, just filling in the blanks and writing down how many people showed up. Yeah. I mean, and getting to travel, right? Sometimes our trainings were held in Bangkok. So that was great. You know, who wouldn't want to leave Dhaka to go to Bangkok? So, yeah. Okay. So then the third one: design for scale, not for pilots. Mm-hmm. Yeah, and you know, I love pilots. Everyone always talks about pilotitis, and pilots have a bad rap these days. I don't think pilots in themselves are a bad thing in any way. It's really the intention behind them. If they're neat, fundable, small enough to photograph, that's kind of what people go for, just to have a glossy write-up on them, when really they need to be designed for scale. Meaning they need to have owners who are going to be there after the pilot. They have to have budgets, or an idea of how a budget is going to occur. And they have to have operational efficiency, just like anything else, right? Just regular logic needs to be built in. Often it's not the pilot that's the problem. It's the thinking behind it, within such a small timeframe, that's the problem. I found this rather intriguing because I'm thinking of solutions where we don't always have the evidence beforehand. I mean, we can't be sure this is going to work. So I suppose that is the reason for having smaller pilots: if it works, then you could scale them up once the evidence has been collected. But you feel that some of these pilots just disappear, that they never really materialize.
Yeah, exactly. Okay. And then fourth, co-creation beats consultation. Consultation is one of those buzzwords in development. We have to consult, and it is important, right? It shows respect. It's about dignity. It's not about somebody coming and preaching and being arrogant. So consultation is important, but consultation can also be a bit one-sided, in that you ask people and then you go home and do whatever you were going to do in the first place. Exactly. Stakeholder consultation, right? That's the key word. And it's a box you have to check. Actually, we had to check it: yes, checked, stakeholders mapped and consulted, done, now let's move on. Whereas what works is really sharing authority, sharing the actual writing, and partnering in a meaningful way, rather than just this consultation. And, you know, that's something that USAID really tried to move towards with locally led development and with this idea of co-creation. That was really starting to materialize, and I think there were a lot of good examples of how it was starting to show up. So I did see that shift occurring, but you have to give up power. And going back to the theme of power, that's not something that everyone always wants to do. Yeah, I was thinking about the importance of personalities here. I mean, there are some people who are know-it-alls, who come there thinking they have the solutions and may be highly disrespectful, or they may consult without really listening. I've had a lot of people on the show saying, you know, our voices are silent, we are not consulted, or when we are asked for something, people just ignore it, even though they wanted our views in the first place. So I suppose the role of personalities matters. I mean, if you and I like each other, we could co-create something. Personal chemistry, I'm sure, matters for co-creation, don't you think? It does.
I think personal chemistry matters a lot, but I also think the ability to let go matters a lot. You mean just not insisting on your own beliefs? Well, for example, letting communities monitor their own projects. Why does it always have to be an outside person coming in with their clipboard to monitor? Can communities monitor and hold each other responsible? That's what we saw as a lesson, that it worked much better than having an outsider. So, giving up control. A lot of these practices are in place because, at the end of the day, it's U.S. taxpayer money, and it needs to be protected to make sure it isn't used in a way that we wouldn't want. So there's a balance to be struck there. But I think the balance at the time was shifted way more towards the controls, having to control everything, and not necessarily really co-creating in a way that shifts power to communities. And then the final lesson: strengthen the middle layer. Tell us a bit about that. Not just the highest officials or the low-level officials, but the middle? Yeah, I mean, a theme that keeps coming up is that people love to make big headlines. But it's often the quieter, smaller work that doesn't get so much notice, this middle layer of teachers and nurses and agronomists and cooperative leaders, these really passionate people who are responsible for daily implementation.
And that's kind of where the policy actually meets the public. If you ignore that layer, we find that reforms collapse. You can think about it from a government perspective too; we see this a lot in governments. Governments change, well, maybe not in this current case, but normally governments change, and the layer of people who are actually doing the work stays. So you have some sense that, okay, these reforms, these policies are still going to be upheld, because those middle-layer people are still there, supervising things. And so they need to be supported. That's really the core. Everything else is just kind of theory on top of that. I remember when I was a young researcher, I would often think that going to the top official, the principal secretary or the head bureaucrat, would be the most important thing. It turned out, and it still turns out, that it's important to talk to them to get permission to consult and talk to their juniors, but they know very little. They sit at the higher level. So it's just a waste of time, really. Well, not always, but most often. So it is the middle layer. Did you find anything on corruption, on misuse of funds? Because that is something that aid agencies are always really worried about. In the evaluations we've talked about, that's not really there. That is a fascinating question. That would be in a different database, in the audits that the OIG, well, when it was functioning, would do. That's where that type of information would really be revealed. So that's a fascinating follow-up to think about: look at that data source and look at corruption. But I think, if you look at it, it happens a lot less. Yes, it happens for sure. I saw it myself. But it happens a lot less, and at a lot lower levels, than... Whenever it happens, it gets a lot of sensational coverage. Exactly. Exactly. Okay.
So now you've done this evidence scanning and you've made these categories, these conclusions. How do you think all of this is going to help the US now that USAID doesn't exist? Who's taking all of these lessons forward? And this goes to my final set of issues or questions for you, Lindsey. What is it that organizations should be doing more of? But let's start with the Americans. And going back to where we started, the loss of soft power. I can't tell you how frustrated I am that if you had so much soft power and you're giving it up, as the U.S. has, it's just mind-boggling. So now you have all of this lovely, very detailed knowledge about all the evaluations that were funded so very generously by the U.S. government. How can this be used? Is it going to be philanthropies? Are there smaller organizations that could benefit? Who's actually using this information? Well, I mean, that's the exact question, right? And I think that's the next frontier of knowledge in general. Yes, we know it. We've created the knowledge. I've written another report. I'm like, great. You know, thanks for turning it into a podcast, maybe that will help other people think about it. But it's just another report, right, that's going into another database. So how do we actually use evidence? That's what we're studying. As you mentioned, we got funded by FCDO to study this exact thing: how can we actually use it? And I believe AI can do it. It might not be the chatbot or the dashboard, but how can we actually integrate this into real workflows? That's what needs to be studied next. We don't need another RCT methodology or another, you know, whatever. We need to actually figure this out in practice. Just because you build it doesn't mean they will come. Right. Just because you write it doesn't mean they're going to read it. And just because you build a beautiful dashboard and even a chatbot doesn't mean it's going to work.
There's still a human, a very, very human element in this that needs to be explored further. So that's our core research now, because I really don't have the answer for that. Technology and AI are proceeding at such a fast pace, Lindsey. Every day there's something new coming up. If you were to, in conclusion, reflect on the way forward using AI for global development. You know, I'm a big fan of all these apps that could be used to detect anemia in villages. There's something called Ruby, or Snakebite, you know, how to treat that. I love these real-world uses, or even for education, the Kwame apps or the Kwanda apps. All of these are super, super relevant. But if you were to zoom out and think about AI and global development in the next six months, in the next year, where do you think we're going to make progress, and where do you think we should be focusing much more attention? Yeah, I think the biggest opportunity is investing in shared evidence infrastructure, so we stop paying repeatedly to relearn the same thing. So what could that shared infrastructure look like? Well, we're starting, because we're working on building the various large language models for different organizations who are starting to implement them. As we build them their own LLMs that look at their own databases, how can we then connect them? How can we say, okay, UNICEF, World Food Programme, German government, FCDO, how can you share? Red Cross, yes, it's great to look at your own data, and then you also need to bring in the local perspective. But if you're all programming in the same region because there's a famine, shouldn't you share data? And that is a huge challenge. Is that possible? Would people be willing to share uncomfortable knowledge? I think we start with the comfortable. That's good enough. I mean, first you have to understand your own data.
Then you have to bring in others and bring in the local perspective. So I think it is iterative. And honestly, sure, fine, keep your sensitive data locked up. You can have a public and a private side. You don't need to share everything. And the reality is it's not always going to be shared. But there's so much evidence that isn't damning, that is shareable, that we need to extract. And the reality is, yes, there's all this conversation about AI and how fast it's moving, but everybody still just has an unstructured database where they have reports piled up without the metadata. So first they have to understand their own data, and then we have to share it. And then we also need ethics. We are working with Global AI Commons, which is creating an ethical AI certification. So, for example, UNICEF is on the board, and they're contributing by looking at, when you're using AI with youth, how are you bringing in the youth perspective? Because let's be honest, the youth perspective is not represented in these data sets. So it's creating a framework, with the biggest thinkers in this sector, on how we evaluate, how we think about it, and having the shared understanding and the shared infrastructure. I mean, it's a huge project, but I think it's the way forward. Well, you're doing fascinating work. This was such a pleasure speaking with you today, Lindsey. Thank you very much for coming on my show. Thanks so much, Dan. Really appreciate it. Thank you for listening to In Pursuit of Development with Professor Dan Banik from the University of Oslo. Please email your questions, comments and suggestions