Summary
Larry Lessig analyzes Allegation 4 against Francesca Gino, examining claims of data manipulation in a 15-year-old study on honesty pledges. The episode argues the allegation violates Harvard's own statute of limitations policy, lacks foundational evidence that File A was the final dataset, and is contradicted by payment receipts Gino discovered showing File B actually matches all study participants.
Insights
- Harvard's hearing committee failed to establish that File A was the final data file before analysis, undermining the entire fraud claim that Gino modified data between File A and File B
- Payment receipts discovered by Gino prove File A was incomplete (missing participants), while File B matches all receipts—suggesting File B is the actual final dataset, not evidence of manipulation
- The 15-year-old allegation violates Harvard's own six-year statute of limitations policy and the 2024 modification requiring reliance on fabricated data in subsequent research, not mere citation
- The hearing committee's written response claimed Gino 'did not provide receipts' when those receipts were actually in the record, demonstrating fundamental failures in evidence review
- The secondary allegation about payment timing relies solely on an early draft with a copy-pasted typo, contradicted by lab manager testimony, data structure, and logical experimental design
Trends
- Institutional due process failures in academic misconduct proceedings despite extensive review processes
- Incomplete digital records from 15+ years ago creating unfair prosecution conditions for respondents
- Burden of proof misapplication in tenure revocation cases versus stated clear-and-convincing-evidence standards
- Psychological inertia in multi-stage investigations causing committees to defer to earlier findings rather than conduct independent review
- Gaps between stated institutional policies (statute of limitations) and actual enforcement practices
- Challenges in data provenance and version control in academic research from pre-cloud era
- Risk of reputational damage persisting despite exonerating evidence due to public perception lag
Topics
- Academic Misconduct Allegations and Due Process
- Tenure Revocation Procedures at Universities
- Data Integrity and Version Control in Research
- Statute of Limitations in Academic Fraud Cases
- Clear and Convincing Evidence Standard in Institutional Proceedings
- Evidence Completeness in Historical Research Investigations
- Lab Manager and Research Assistant Data Handling Practices
- Payment Records and Participant Receipts as Forensic Evidence
- IRB Submission and Documentation Requirements
- Experimental Design Logic and Validity
- Institutional Inertia and Appellate Review Standards
- Digital Records Preservation and Email Retention Policies
Companies
Harvard University
Revoked Francesca Gino's tenure in first such action in 390-year history; subject of investigation for procedural fai...
Harvard Business School
Conducted investigation into Gino's research and recommended tenure revocation based on four allegations
University of North Carolina
Institution where Gino conducted the 15-year-old study at issue; declined to provide archived email records
People
Larry Lessig
Harvard Law School professor hosting podcast; defends Gino's innocence and critiques university's procedural failures
Francesca Gino
Former Harvard Business School professor whose tenure was revoked; subject of four academic fraud allegations analyze...
Bill Ackman
Activist investor who publicly supported Gino and provided financial backing for her legal defense against Harvard
Jennifer Fink
Lab manager at UNC who conducted data collection and cleaning for the study; provided three file versions to investig...
Amanda Knox
Referenced as parallel case of wrongful conviction surviving despite exoneration; illustrates reputational damage per...
Bruce Ackerman
Lessig's mentor; co-authored 'The Stakeholder Society' on economic policy ideas referenced in episode
Quotes
"I am absolutely convinced that she is absolutely innocent. Not that she's partly innocent, not that she made some lesser offense, but that she committed no offense at all."
Larry Lessig•Early in episode
"The evidence, this evidence, does not meet the standard the university has established for convicting somebody of academic fraud. Believe what you want about whether she's guilty. The university is plainly guilty for failing to live up to its own standard."
Larry Lessig•Mid-episode
"This naive, childlike belief in process, in legal process, is just astonishing to me. The unwillingness to stand back and be critical is a weakness, not a virtue."
Larry Lessig•Discussion of institutional inertia
"File A is a file. It's produced after earlier versions of the file, but it doesn't announce itself as the final version of the file."
Ava•Data file analysis section
"The only file in this record with all of the participants in it is the file that was posted online, file B."
Larry Lessig•Conclusion of data analysis
Full Transcript
This is Larry Lessig. Welcome back to the podcast, The Law Such As It Is. This is episode five of season four. We are continuing the review of the charges against former Harvard Business School professor Francesca Gino. The first three episodes of this season laid out the procedural history of a case that began with allegations by Data Colada against Francesca and ended with her having her tenure removed by Harvard University, making her, and oh, what a surprise, it's a she, the first person in the history of Harvard ever to have her tenure revoked. There were four allegations of academic fraud brought against Francesca. The fourth episode in this season laid out the evidence behind allegation number two. In this episode, we're going to consider allegation number four. As with allegation number two, and, spoiler alert, allegation number three and allegation number one, this allegation, too, is astonishing in its weakness. It, too, does not establish, certainly with clear and convincing evidence, that Francesca committed academic fraud. Now, before we jump in, I do want to pause on an ambiguity that this way of framing the question might raise, for some at least. I said it doesn't establish with, quote, clear and convincing evidence that Francesca committed academic fraud. Why the qualification? Why, quote, with clear and convincing evidence? Well, I've said throughout this season that based on my review of the evidence adduced against Francesca and the evidence that she and her experts provided in her defense and my knowing her for more than a decade, I am absolutely convinced that she is absolutely innocent. Not that she's partly innocent, not that she made some lesser offense, but that she committed no offense at all. So it's my belief, in simple layman's terms, she is innocent. But whether she was actually innocent or not was not the question the hearing committee that revoked her tenure was supposed to address. 
The question the hearing committee was supposed to address, the committee that revoked her tenure, was to be whether the business school had established with clear and convincing evidence that she was guilty of academic fraud. This is a point that's often lost on people who are blessed not to be lawyers. So with apologies to the less blessed, let me unpack a little bit what that qualification actually means. If you drive your car and you negligently sideswipe another car and that other person sues you, demanding you pay for the damage you've caused, the question the fact finder, typically a judge, sometimes a jury would answer, is this. Is it more likely than not that you, the defendant, drove negligently? That question is asking a simple probability question. Is the fact finder 51% confident or more that you were negligent? If it is, you are guilty. If it is not, you are free. By contrast, if you are accused of a crime, and God forbid you are ever accused of a crime in the American criminal justice system, because that system is just a total disaster, but if you are and you do what literally 98% of federal criminal defendants don't do, you decide to go to trial, then the question the jury at that trial will have to address is whether there is proof beyond a reasonable doubt that you are guilty. The law doesn't translate that standard directly into probabilities. It's more like a confidence that you must have. So if you're a juror on a criminal trial, you should feel 90 to 95 percent confident that the person is guilty. That's an extremely high level of confidence. But it is the level that we as a society have decided is appropriate to avoid the horrendous outcome that the innocent would be convicted wrongly, though we can be 100% certain there are plenty of innocent people who are convicted wrongly. The standard the hearing committee was supposed to apply in deciding whether to terminate Francesca's academic career stands between these two standards. 
The standard was clear and convincing evidence, which the Supreme Court has explained as, quote, a firm belief or a conviction of guilt. And as scholars and courts have translated it, they've said it's a 70 to 80 percent level of confidence. Some have said even higher, 75 to 85 percent level of confidence. I talked a bit about this in the third episode. Now, if you're like me or someone like me and you believe that Francesca is actually innocent, the tragedy of this story is no matter what happens now, a significant proportion of people will think that she's guilty. Compare the story of Amanda Knox, the American student in Italy who was accused of murdering a close friend, but then years later, four years later, was found not guilty. Many still believe she is guilty, even though a man was convicted of the crime based on DNA evidence that linked him directly to the crime scene, while there was no DNA evidence tying Knox to the crime scene, not to mention the fact that she had zero motive to murder one of her closest friends. Still, though, Knox must live in a world where many think her guilty, because after all, that's what the courts originally concluded. And indeed, that's perhaps the most common reaction I've gotten as I've become public about defending Francesca. How can you be so sure, Lessig, when there was such an extensive process undertaken to determine whether she was guilty? The business school ran an extensive process. The university ran an extensive process. Literally millions of dollars have been spent to determine whether Francesca was guilty, and they concluded she was. When someone says something like that to me, I just want to say, grow the fuck up. Actually, I'm sorry, kids, that's not an appropriate word to use unless it is in an appropriate context, which this plainly is. This naive, childlike belief in process, in legal process, is just astonishing to me. The unwillingness to stand back and be critical is a weakness, not a virtue. 
The failure to recognize the inertia that bureaucracies unleash is an obliviousness that I cannot believe intelligent people entertain. Again, the Amanda Knox case is a sad parallel, a kind of inverse of Francesca's case. There, an American was swept up into a Byzantine Italian legal process that could not find an obvious truth despite years of judicial process. She sat in jail for four years. In this case, there's an Italian swept up into an American Byzantine legal process that refused to acknowledge the flaws, the obvious flaws at the very beginning of this case, and see how those flaws would tilt the whole process against its possibility of determining any truth. As the months of litigation continued, as the costs of that litigation mounted at each stage, if you know anything about the psychology of people in such a process, you know that these people were thinking to themselves, geez, the conclusions below just have to be right, or I need to be absolutely certain that the conclusions below are not right if I'm going to reverse them now. If the committee, the hearing committee, was actually an appellate court, a court whose job it was to review the fact-finding of a lower court, there might be some excuse for that perspective. Because ordinarily, an appellate court reviewing the fact-finding of a jury is not allowed to decide whether it believes the jury was right or wrong. It's supposed to decide whether it thinks that the decision of the jury was clearly erroneous. And only if it was clearly erroneous do they have the right to reverse the finding of a jury. Indeed, I'm sure that's the hardest part about being an appellate judge, that they have to review these cases that they are certain were wrongly decided, but they're not allowed under the rules of appellate procedure to reverse because the mistake was not clearly erroneous. But in this case, the hearing committee was not reviewing the fact-finding of the business school under a rule of deference. 
Indeed, the hearing committee expressly said, quote, the hearing committee intends to review and consider the report of the Harvard Business School Investigation Committee as one part of the evidentiary record for this matter. As discussed below, the hearing committee will also consider any response provided by Professor Gino to that report. In addition, the hearing committee will conduct hearings and make findings of fact as required by the rules governing Third Statute proceedings. The committee's job, as the committee itself believed, was to review the evidence de novo, meaning literally review the evidence anew and decide whether that evidence established under a clear and convincing standard that Francesca was guilty. And I get it. There's lots in this case that's hard. There are plenty of conflicting facts. There's a lot of hypotheticals about what could have happened and lots of daydreaming by the most cynical that maybe Francesca is some sort of evil genius and that she did everything she did in order to both cheat and to cover up her cheating. There's tons of speculation. But here is the easiest fact in this case. You cannot look at this evidence and draw the conclusion that there is clear and convincing evidence that she committed academic fraud. Again, I think she's absolutely innocent, but the point is, the critical point is, the reason why what the hearing committee did was wrong is, whatever the uncertainty, there is no uncertainty about this. The evidence, this evidence, does not meet the standard the university has established for convicting somebody of academic fraud. Believe what you want about whether she's guilty. The university is plainly guilty for failing to live up to its own standard. It has, for the first time in its 390-year history, removed the tenure of a faculty member by misapplying the standard established by the university to protect the tenure of faculty members. 
Now, that's not to say this is the first time Harvard has forced a faculty member out. There are lots of cases where somebody was accused of wrongdoing, and the university succeeded in forcing them to step aside. Some of those cases I know pretty well, and with the ones I do know, I'm glad the university has succeeded in getting the faculty member to step aside. But this faculty member, Francesca, when accused of a crime, said, hell no, I did not do it, and I'm asking you to apply your standard to determine whether I can keep my job despite the slanders against me. She fought back. And it is my firm belief that if the university had applied its standard properly, she would have won. Again, that's not to say she would have exonerated herself, at least in the eyes of a distracted public. The charge against her will always stain her reputation, however wrongfully, and I believe absolutely wrongfully, it was raised against her. But this incident also stains Harvard's reputation. And I should think, at least among Harvard faculty who work with data, it should create a certain chill. Because are you absolutely confident that in the processing of your data by the RAs or lab assistants who actually handled your data, there were not mistakes made? Mistakes that might suggest that you had fraudulently represented your results? Because if you are not absolutely certain that such mistakes were not made, then I would strongly recommend you stop citing those papers and pray that within the next six years, no one raises any questions about them. That's the reality that follows from the prosecution of Francesca Gino. That reality is just absurd. Okay, now look, I get it. There are a lot of people who will cheer the fact that a tenured professor at Harvard, meaning me, has charged Harvard with a great wrong, because there are many people who hate my university. But as I said at the start of the season, I am not among the people who hate Harvard University. 
I love this university. And I'm among the people who would die to defend the academic freedom that universities like Harvard permit. That freedom is this podcast. I'm criticizing something I love because it has acted wrongly. And in its wrong, it has done enormous and unjust harm to a decent and brilliant scholar. It is the greatest honor of my career that I get to work at an institution like this. It is an even greater honor that I live within a culture that permits me to criticize openly and freely an institution that I love. Because to criticize is not to condemn, to criticize is the opening move in an offer to repair. Okay, so one final point before we turn to the substance of allegation number four. In the time between the last episode and this episode, Wall Street activist investor Bill Ackman revealed that he has been supporting Francesca in her fight to clear her name. In a post on X, which has received millions of views, Mr. Ackman described why he had concluded that the charges against Francesca were false, and he indicated that he would support her in her effort to defend her name and reputation. So there's obviously a backstory here. The brief version is this. When Francesca began to run out of money to pay her lawyers to defend herself, I think she'd spent over a million dollars at that point and was drawing on her retirement and her children's education fund, she realized and her friends realized that she needed outside support if she was going to continue. I then reached out on her behalf to a number of people who Francesca knew who might be in a position to offer that support. Bill Ackman was one of those people. He responded and wanted to hear the story directly from her. So Mr. Ackman and I and one other person from his team listened to Francesca describe her story. Bill Ackman asked her questions, probing questions. 
He then took a significant amount of time to work with his staff to fill in the details so that they could come to believe that his initial reaction, his initial instinct, was correct. Those instincts were the same as his conclusion, that she had been wrongly convicted. And so for the last couple of years, he has been the critical behind-the-scenes financial support that has made it possible for her to defend herself against Harvard. But for his support, she would be personally bankrupt, without a job, and with no chance of restoring any part of her reputation. Now, seeing the story about Bill Ackman supporting Francesca develop has been a little surreal for me. I try hard to avoid things like this on social media, but friends sent me a couple of choice tidbits. There was outrage out there among Trump supporters who were criticizing Bill Ackman, himself a supporter of Donald Trump, for his backing a project supported by somebody like me, an opponent of Donald Trump. What was his problem, they asked. How could he support something that a critic of the president was also supporting? I don't know Bill Ackman personally. I have enormous respect for the particular genius that would allow him to be so successful. I was also incredibly impressed when he came out as a strong supporter of birthright funds, which would be a government-funded investment account for every child, giving them a lump sum that is invested immediately in broad, low-cost equity funds held in tax-free accounts until their retirement. That's close to, maybe better than, an idea my mentor Bruce Ackerman has championed with Anne Alstott in their book, The Stakeholder Society. Bruce and Anne are also not Trump supporters, proving again that ideas can be thicker than politics. There are many things that Bill Ackman has said that I support. There are many things he has said that I don't support, but so what? 
I consider it a strength that even someone with whom I disagree could see the case as I do, that is, that the case against Francesca is fatally flawed. And I can't express enough my gratitude that he has been willing to give her the chance to prove it. Okay, finally, let's get back to the promise of this episode, allegation number four. Once again, I'm joined by an interlocutor. Like last time, the words have been written by a friend who is expert in data analysis. Like last time, for complicated reasons that continue still, that person can't talk freely about the case. But this time, they've recorded the podcast with me, and I've used fancy AI from ElevenLabs to replace their voice with an AI's voice. I'm going to call this AI Ava. Ava and I will go back and forth discussing this allegation, just like I did with Ron Suskind and allegation number two. The reason for this style is the reality that humans are pretty good at understanding conversations, and they get distracted listening to monologues, especially monologues with beautifully mellifluous voices of someone like me. Seriously, how many times did you check your Instagram during the last rant I just unleashed? See? See my point? Okay, so that's the introduction. Let's turn now to the episode. So welcome, Ava. Tell us a little bit about allegation number four. Hi, Professor. Thanks for having me. Happy to help unpack this story to make the weakness with Allegation 4 clear. This allegation involved a study that Francesca conducted with four other researchers over 15 years ago while she was at the University of North Carolina. Fifteen years ago? That's right. Fifteen years ago. I don't understand. Doesn't Harvard have a policy that says that allegations of academic misconduct more than six years old, quote, may not be investigated? Yeah, that's a rhetorical question, right? I mean, you covered that. Was it in episode two? It was. So, yes, Harvard has a policy consistent with regulations from the U.S. 
Department of Health and Human Services Office of Research Integrity, which recognizes a six-year limitation period on allegations of misconduct due to, quote, the problems that may occur in investigating older allegations and the potential unfairness to the respondent in defending against them, end quote. But Harvard believed there is an exception to that policy if the alleged fraudulently created data continued to be, quote, used in some particular way. Used in some particular way? What does that mean? In episode three, at the transcript beginning at page 27, you explained the policy. The TL;DR is this. The statute of limitations was originally a bit ambiguous, and some people thought merely citing an earlier work was enough to expose the author to subsequent investigation. That way of reading the rule, however, would make the protection of the rule meaningless. Merely posting a list of your works, like a CV or on a website, would mean that all your works were continually subject to investigation. I don't agree with that reading, but to remove any doubt, in 2024 the rule was explicitly modified. Modified how, he asks, trying to sound like this too is a genuine question. Modified to indicate that the class of cases they're excepting from the, quote, no investigation rule is when the earlier research is being relied upon in subsequent research. The precise language of the modification is, quote, citation to the portion or portions of the research record alleged to have been fabricated for the potential benefit of the respondent, end quote. Portions of the research record seems clearly to signal something more than a mere citation, right? Right. It seems to comport with what makes sense, that if you rely on the allegedly fabricated data in subsequent work, you lose the benefit of the limitation. But not if you simply cite an article alleged to have fabricated data behind it. So how precisely did Francesca cite the papers here? The short answer is simple and sweet. 
She never cited the research to support or rely on the findings subject to the allegations. She simply cited the papers. Okay, so to underline this point, she never cited the work in a way that should trigger the exception to the rule barring investigation. So that the rule barring investigation should have barred the investigation of this allegation, right? Exactly. Okay, again, so with the law professor's obsession about process, please recognize how incredibly important this process limitation is. If you believe in what the law calls statutes of limitations, if you believe in the justice of a statute of limitation, which is not about allowing guilty people to go free, but about allowing innocent people the freedom to live their lives without fear that they will be wrongfully accused of crimes allegedly committed, in this case, 15 years ago, then please recognize how wrong it is to allow the prosecution here of this 15-year-old charge. Most of the records surrounding the facts in this case just do not exist. Practically no one would have any recollection about what actually happened on the days allegedly constituting the crime. Email records don't exist. Intermediate copies of files don't exist. The IRB determination does not exist. Bottom line, this is a radically incomplete record. And thus, on the basis of this radically incomplete record, it is wholly improper for the university to charge someone with a crime. Even if the evidence were absolute and overwhelming, it would be wrong to prosecute this crime. That is what due process means. But in this case, the evidence is not absolute and overwhelming. It is not even persuasive. And yet Harvard prosecuted this alleged crime against Francesca. It is just wrong. Okay. Yeah. So that's another rant, and I'm sorry, let's get back to the actual allegation here. Absolutely. So tell us what happened. 
Francesca and her co-authors wanted to do a study to evaluate the effect of a pledge of honesty on someone's actual honesty. And a pledge of honesty? Yeah, you know, like a document you sign that says, I promise to be honest. Do people really sign such documents? Of course. For example, when a student takes an exam, often they sign an honor pledge promising that they won't cheat, or when you promise to tell the truth before you're deposed, or when you submit your taxes. There are examples of pledges of honesty all over the place. Okay, so how did they intend to measure whether a pledge of honesty actually affected actual honesty? The way it was to work was this. And remember, this is 15 years ago, so it was all in real space, in real rooms, in real buildings at the university. Francesca there was at the University of North Carolina. And most astonishingly, it was all done on paper. These were not study participants giving answers on the computer screen. These were subjects filling out answers to real questions in real rooms on paper. So the real space in particular was two rooms. In one room, the subjects were given a bunch of math problems to solve under time pressure. Math problems? Math problems. Love it. The whole exercise was framed around completing math problems and being paid for their performance on those problems. Participants were going to be paid a higher amount than what people usually received as a payment in a regular study because, they were told, they would be taxed on their earnings. And so the question was whether they over-reported their performance on the math problems they solved in the first room when filling out a form that looked like a tax form in the second room. Here's the important point. After they solved the math problems, they scored their own answers. Why would you let them score their own math problems? It was all a setup so that they would have the opportunity to cheat. 
Ah, so they do the math problems in room one, score their answers. They then go to room two and report the number they got right on this tax form. Right. And on some of the tax forms, there was a pledge of honesty. Sounds almost like Monty Hall. Okay, so what precisely happened then in this room two? In room two, there were basically three different treatments. Treatments? What is this, a spa now? No, treatments is the term in social science for the different processes that different subjects receive in an experiment. The whole purpose of the research is to compare how these different quote-unquote treatments or processes affect the outcome. By comparing these differences, researchers hope to make inferences about the relationship between treatment and outcome, between cause and effect. Okay, so what were the treatments these students were subjected to? Subjected to is a little harsh, but okay. In all three treatments, the students were going to report how many of the math problems they got correct, and they did so on a form that looks like a tax form. With one treatment, before the students reported, they signed a pledge of honesty saying, and I quote, I declare that I will carefully examine this return and that to the best of my knowledge and belief it is correct and complete. End of quote. That language is similar to what you affirm on a real tax form. Okay, so in this treatment, they sign a document that basically says I'm going to be honest. Then they answer the questions about how many answers they got right in room one. And the study then tracks whether they, in fact, were honest. Right. How do they know whether they're being honest? Well, the participants didn't know this. But though they didn't sign the sheets on which they worked out the answers to the math problems, there was a unique identifier for each participant on those sheets. 
The RAs conducting the research in room one collected the answers and could thus tell the number of math problems that each participant actually got correct. But what's the incentive to lie? Do they get paid money based on how many they get right or get wrong? Yes, they get paid more the more they get right. Okay, so in treatment one, the subject signs a form that says, I'm going to be honest, and then reports the number of math problems they got right. The researcher had a trick to determine whether they were exaggerating, and if so, by how much. Okay, so what's treatment number two? In treatment two, they first report the number of answers they got correct, and then they sign the tax form at the bottom. That form at the bottom includes a pledge of honesty. Okay, so after they have written down the number of questions they got right, they then pledge to be honest. It's kind of hard to imagine how that pledge could affect the number they had already reported, at least assuming that time travel is not possible. Well, there are actually plenty of forms like this. On the U.S. tax return, for example, you first complete your tax return and then you sign at the bottom. So someone clearly thinks it works to have people promise honesty at the end of a process. Whoever designed the IRS forms must think that. But yeah, it's a little weird. From first principles, it's not obvious why this would work. Okay, so then what was treatment number three? Treatment number three is the control case. There is no signature space on the tax forms, neither before they complete the form nor after. The participants are simply asked how many math problems they got correct. Okay, so after they've collected all these data and analyzed it, what did the authors conclude? They concluded that, in fact, signing the form saying that you were going to be honest, that is, signing at the top of the form, had a statistically significant effect on cheating. 
When they signed first, participants were, in fact, more honest than when they signed after or did not provide a signature at all. Okay, so then what was Harvard's allegation about this research? Well, the allegation is divided into two parts. For now, let's stick with the part related to the data used in the study. And what was the allegation about that data? It basically came down to the difference between two files. The first was the data set Francesca analyzed and posted online on OSF. She shared the write-up of the results from that data set on July 18, 2010 with her co-authors. Call that File B. A second file is a different data file that was sent to Francesca by her lab manager on July 16, 2010. Call that File A. HBS's data consultant, Maidstone, found differences between these two files. The differences were the foundation for the claim that Francesca had modified the data fraudulently. So what do you mean by differences here? Well, the values in some cells are different. Specifically, there are 73 differences plus three new rows of data. And so why would that difference be significant? The suggestion is that the differences in the file, all but one of which strengthen the conclusions of the paper, must be differences that were introduced by Francesca. And if she introduced those differences to strengthen the conclusions of her paper, well, that's fraud. Okay, that's agreed. But let's be sure we understand the context for these data files. So who produced these files? As you described in episodes two and three and a bit in four, the production of the data, which the academics then evaluate, is a process that is conducted by others, by graduate students, undergraduates, and sometimes people hired as lab managers or research assistants, RAs. These are people who find the academic work interesting or maybe just need a job. Some may want to become academics themselves. They help conduct the experiments and prepare the data to be analyzed. 
And again, to repeat a little bit what we said before or what was said earlier in earlier episodes, what do you mean by prepare the data? The data has to be, researchers call it cleaned. In this case, first it had to be typed into Excel because it originated on paper forms. Then within the data files, if there were any inconsistencies, they had to be flagged or corrected or somehow dealt with. Then the data needed to be put into a format that would allow for statistical analyses. And again, to beat a dead horse, who is doing this cleaning process? Basically, anybody except the professors running the ultimate analysis. I mean, the professors could do it, but they always want RAs or students to do it for them to make better use of their own time. Plus, the people who actually ran the experiment would know best who to exclude from the data set. They would see who did not follow instructions or was clearly not paying attention. But in this case, who specifically conducted that data work? In this case, the data work was done by Francesca's lab manager at UNC, Jennifer Fink, and one or more RAs who helped her run the study. In its investigation, HBS contacted Jennifer. She shared the three files that she still had. Each of them was attached to emails she had sent to Francesca back in July 2010. Each of them had the same name but had been saved at different times. The last one to be saved for this study is the one we're calling File A. HBS concluded that File A was the final file given to Francesca before she performed the analysis for the paper. Okay, so the lab manager was running the lab at UNC, and she ran this experiment. She and some of the RAs perhaps gathered the data, transcribed it, cleaned it, prepared it, and gave it to Francesca. Right. And the allegation here is that there is a difference between the data they gave to Francesca finally, supposedly file A, and the data that Francesca used as the basis for the analysis, file B. 
In the language that you've used in the course of this podcast, we can call those differences the anomalies. The question is, who produced those anomalies? The hearing committee believed that the time between the last time file A was saved and the time file B was created makes it difficult to imagine that those anomalies were produced by anyone other than Francesca. So they concluded she must have committed fraud in producing File B. Okay, but that assumption only makes sense if File A actually is the final work product of these research assistants or the lab manager, right? Right. The assumption of the hearing committee is that File A represents the final file that Jennifer and the research assistants worked on, that File B was not worked on by Jennifer and the research assistants. File B has anomalies within it, the differences between it and file A. The hearing committee concluded Francesca produced those differences. Okay, so that's clear. So file A and file B are different. The differences, except in one case... though, to harp on a point I made in an earlier episode and will make even more strongly in the next two episodes, not all the changes actually strengthen the conclusions of the paper. But the differences in general strengthen the conclusions here. Like the HBS Investigation Committee, the hearing committee believed, therefore, that there was motive and opportunity and that, therefore, Francesca was guilty. That's what they concluded. Right. We don't conclude that. I don't conclude that. I don't believe she's guilty at all. I continue to believe, as she insists, that she is innocent. But it doesn't look good, right? I mean, there are differences between file A and file B. With one exception, those differences do strengthen the conclusions of the paper. Francesca was the one working with file B, so it certainly sounds like she must have made those changes. 
Yeah, it looks like that, except for one important fact about the context, which you were just hinting at. We do not know that file A is actually the final file, as prepared by Jennifer and the research assistants and given to Francesca to analyze. Okay, so I'll play along. What do you mean by that, Ava? Well, file A is a file. It's produced after earlier versions of the file, but it doesn't announce itself as the final version of the file. How could it? We've all seen files with names like final or final final or final revised, where someone thought a file was final, but then ended up making additional modifications. Anyway, there's not even a file name like that here. The actual name of the file was, quote, tax study. And it's not like there is an official archive in which each version of this file was stored. We're talking about some email attachments. What we know for sure is this is one version in the life of this data, a life that began when the data was transcribed from paper forms that no longer exist. But we don't know whether there were other modifications made to this file before it was finally given to Francesca. Finally? Right, finally. There's evidence she got a version of File A on July 16th, but we don't know whether there were other versions of file A that she got later than July 16th. Okay, so what you're saying is that file A is a file, but there's nothing to indicate that it's the final version of the file that was given to Francesca for her to evaluate. Exactly. Jennifer did not testify that file A was the final file. The investigation committee did not ask Jennifer whether she was certain that file A matched the raw data, nor did Jennifer say it matched the raw data. And HBS's own data consultant said it could not be sure either, as it did not have access to the raw data set or the complete email records for the time. It was HBS that concluded that file A was the final data file given to Francesca to analyze. 
Did anybody testify that file A was the final data file? No, nobody testified that file A was the final file. There was no actual evidence from witnesses that file A was the final file given to Francesca before it was analyzed by her to produce the paper. Did Harvard check Francesca's email? Not surprisingly, she doesn't have her email from 15 years ago from a different university and a different computer. In June 2022, Francesca tried to obtain her email records from UNC so she could reconstruct the research process related to the study. UNC informed her that since she was no longer employed, she wouldn't have access. And so she was going to go talk to the dean at UNC to make the case. But UNC IT told her that it was pointless because their email was only retained for five years anyway. Okay, so Jennifer provided three versions of this file we've called File A. But is there any indication that there were no other files? No, and that's the point. There is certainly no ledger of what files previously existed. A dozen years after, people began looking and no other files were found. But that really doesn't prove that there weren't other files at the time. And here's where time is so incredibly important. The evidence of files shared on email a dozen years ago is likely radically incomplete. Francesca testified that much of her work back then, continuing through the pandemic, was happening through exchange of data on USB thumb drives at in-person meetings, not across email. In July 2010, Francesca was still at UNC, ready for a move to Boston later that month. There in person, it would have been natural for her to continue to meet with Jennifer in person and exchange files using USB keys. But of course, no one has 15-year-old thumb drives. So there's at least a presumption, right, that file A is the final file, because no other file later than file A but before file B has been found. 
But is there anything in the content of file A to suggest that it actually wasn't the final version of the file? Anything to rebut this presumption that it was the final file? Absolutely. In the email with an earlier version of File A, Jennifer indicated that there were problems with the data. As she wrote, quote, the people are serious dum-dums on this study. Serious, all caps, in the original. They seem to be having some serious issues calculating the money, or if they got the amounts right, they were written and scribbled in very strange ways on the form, end quote. When was that email written? It was written after the earliest version of File A that we have was produced, but before File A was produced. And did the final version of file A correct the problems that Jennifer's email was referring to? It did not. File A still needed to have that work done. We just don't have a version of file A with that work done. Or put differently, we don't have a version of file A where the work that Jennifer had identified needed to be done was actually done. Okay, so that strongly suggests there was another file. Bingo. Jennifer is indicating that she has work that she needs to do. File A does not include that work. All indications are that she was a great lab manager, so having flagged work that needed to be done, surely she would have done it. We just don't have the file that demonstrates that she did the work or that shows data after she did the work. Which would mean we don't have the final version of file A, correct? Correct. Okay, so is there other evidence to suggest that file A is not the final file? Yeah, sorry to bury the lead. Here's the most conclusive evidence that file A is not the final file. It doesn't even include all the participants in the study. Wait, it's incomplete? It doesn't include all of the participants in the study? How do we know that? 
Well, because it turns out a dozen years after the fact, Francesca still had paper receipts for the payments made to these participants. It makes sense that she would have those receipts, at least initially after her move to Boston. Her HBS job started on July 1, 2010. She moved to Boston in late July. She needed the physical receipts so that HBS could reimburse her for the study she had conducted. That's why she would have brought them to Cambridge. But late in the summer of 2023, she discovered she had never thrown them away. When she unpacked some boxes in her garage, boxes delivered by HBS from her office after they put her on unpaid leave, she found a box with the receipts. Those receipts demonstrate conclusively that there were more participants who completed the study than were reported in file A. Well, wait. So what you're saying is that she has paper receipts demonstrating that there were more people who completed the survey than are reported in file A, meaning obviously file A is not the final data file for this study. Are those people included in file B? Yes. By examining the paper receipts for the study, Francesca determined that file A does not match the paper receipts, while file B does. This suggests that the committee did not have the final file that Francesca was working from. They therefore did not have the predicate for saying she changed anything. Slow down. So what do you mean by matching here? When subjects were paid for their participation in studies at UNC, the RA running the study recorded the payment on a payment record with subject name, address, ID number, and amount paid. From the subject name, it is possible to determine gender, either because the name is common or by identifying the person via online searches, for example, on LinkedIn. Meanwhile, the data files can be used to calculate the amount paid based on participants' answers, and the data files indicate the gender of each participant. So there are two fields to match: gender and payment amount. 
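As a rough illustration of the matching just described, here is a minimal Python sketch. It assumes each receipt and each data-file row has been reduced to a (gender, amount) pair, and it checks whether every row in a data file is accounted for by some receipt. The function names and the toy records are hypothetical, invented for illustration; they are not the actual study data, and this is not necessarily how the matching in the record was performed.

```python
from collections import Counter

def payment_profile(records):
    """Count how many records share each (gender, amount) pair."""
    return Counter((r["gender"], r["amount"]) for r in records)

def unmatched_rows(data_rows, receipts):
    """Return the (gender, amount) pairs that appear in the data file
    more often than in the receipts. Extra receipts, for example from
    other studies run in the same lab, are ignored."""
    return payment_profile(data_rows) - payment_profile(receipts)

# Hypothetical toy records, NOT the real receipts or data files.
receipts = [
    {"gender": "F", "amount": 7},
    {"gender": "M", "amount": 7},
    {"gender": "F", "amount": 16},
    {"gender": "F", "amount": 12},  # receipt from an unrelated study
]
candidate_1 = [  # lists one more $7 payment than the receipts show
    {"gender": "F", "amount": 7},
    {"gender": "M", "amount": 7},
    {"gender": "M", "amount": 7},
    {"gender": "F", "amount": 16},
]
candidate_2 = [  # every row has a corresponding receipt
    {"gender": "F", "amount": 7},
    {"gender": "M", "amount": 7},
    {"gender": "F", "amount": 16},
]

print(unmatched_rows(candidate_1, receipts))
print(unmatched_rows(candidate_2, receipts))
```

Counting pairs rather than matching names is a deliberate simplification: the data files, as described, carry gender and the answers that determine payment, not participant names, so the comparison can only be made on those two shared fields.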
The receipts Francesca found have records for 310 participants. The study at issue only had 101 participants. The 209 additional participants relate to other studies running in the same UNC lab in the same period. By coding participant gender from names on payment records and matching payment amounts and gender with what's in the data files, you can see that all the payments listed in File B are accounted for in the receipt records. By contrast, File A showed discrepancies. For example, File A listed more participants at certain payment amounts than were actually paid according to receipts. Specifically, according to File A, three participants should have received $7, but the subject receipts indicate that only two subjects were paid $7. Similarly, file A says 14 subjects were paid $16, but the payment receipts say only 11 were paid that amount. By contrast, file B matches the paid subject numbers, and that strongly supports the conclusion that file B is actually the final data that was given to Francesca after correction of the errors in file A by the people who corrected data, RAs in general, or Jennifer in particular. Okay, so Francesca's lawyers must have presented this obviously exonerating evidence to the hearing committee, right? I mean, how did the hearing committee address this evidence? This is one of the many examples of the hearing committee revealing that they didn't understand what was in their record. In response to her lawyers pointing to these receipts, the hearing committee wrote, and this is a quote, Professor Gino claims to have reviewed the original paper receipts completed by study participants and verified that the later data on which her analysis relied are accurate. She did not, however, provide those receipts or explain how they account for the analysis data set. And the citation there is A650. That sounds bad, but it's flatly contradicted by the actual record, 
because those receipts were, in fact, in the record at RX 626A, A1708-10, and RX 626B. Maybe we can post all that to the website, blurring the names. Yeah, we'll do that for sure. But this is really, truly astonishing. I mean, I met and worked a bit with Francesca's lawyers during the tenure revocation hearing. They were superstar lawyers, incredibly competent, and they conveyed their competence in every filing and every minute of their oral advocacy. So imagine the chutzpah of a committee that is told by the lawyers, look, there are receipts, and yet doesn't recognize that if the lawyers say there are receipts, there are freaking receipts. The lawyers are not going to lie about something like that. They're not going to say that there are receipts when there are not receipts. So you would think that the hearing committee would think to itself, geez, have we missed that there are, in fact, receipts? Can someone go and find the receipts in the record? But that's not what they did. Instead, they just assumed, oh, here yet again, Francesca must be lying. But no, Francesca was not lying. Her lawyers were not lying. She was honestly reporting that she had found the receipts and that the receipts did show that there were people who participated in the study who were not reflected in file A, which means, again, file A was not the final file given to Francesca for her and her colleagues to evaluate. Yeah, that's right. I wish I had been in the Third Statute hearing room or in the room where someone, I guess, lawyers, drafted the remark that Francesca didn't provide the receipts because she plainly did, and those receipts plainly showed that the whole predicate to the committee's conclusion that she changed the values just disappears. The only file that's in the record that includes all the participants is file B. Okay, so let's pull this together. The whole of this charge against Francesca with respect to the data is the difference between file A and file B. 
But for that charge to be a valid charge, the committee had to establish that file A was, in fact, the final file as completed by her research lab manager, Jennifer, and that it was the file from which Francesca did her work. But in fact, there's plenty of evidence that it wasn't the final file. Most conclusively, the paper receipts demonstrated that that file did not include all the people who actually took the survey. The only way to conclude that she changed the data is to provide the baseline from which the data was changed. Without that, there is no foundation for the claim that Francesca modified the original data. The only file with all the data is the file that was actually posted on the OSF site. That's right. The whole foundation for a charge of academic fraud is that she changed the data in a way that strengthened her conclusion. But the foundation for that claim does not exist. It's not surprising that it doesn't exist. Everything happened 15 years ago. But that's just another reason why it's improper to prosecute a claim of fraud that's 15 years old. It's not just that the charge is not timely. It's also that the evidence is so deeply flawed by the incompleteness of the record that there's no fair basis on which one could conclude that she's guilty. Again, the only file in this record with all of the participants in it is the file that was posted online, file B. Okay, but this story just gets worse because then there's another allegation the hearing committee took up with respect to this paper. Tell us about the other part of this allegation, which we could call allegation 4A. Yes, this part is even crazier. The essence of the charge is that Francesca initially described and conducted one experiment, but when a co-author later pointed out that that experiment didn't make sense, Francesca changed the description to describe a different experiment. Okay, so you're going to have to unpack this a bit. What exactly does that mean? 
Well, remember, Francesca was initially working with two other academics in designing this study. They joined forces with two other researchers who had field data that would make the paper stronger. When the two newcomers read the details of the July 2010 study in 2011, one of them pointed out a potential issue. Francesca made edits to respond to the issue. And so what was the issue? The original write-up described a study as requiring the subjects to do math problems in room one, being paid in room one, and then in the second room, making a pledge about honesty or not, and being paid again based on the results in room two. And so wait, why would they be paid in the first room if the purpose of the study was to measure whether they are honest in reporting their results, and that's done in the second room? And why pay them twice? Well, exactly. It would make no sense. If the payment occurred in both rooms, that would imply that the dependent variable, the thing they were trying to measure here, whether the subject lied, was completed before the independent variable, whether or when they made a pledge. But that's backwards. It could only make sense to alter the independent variable and then see how it affects the dependent variable. Yet the original draft of the paper seems to suggest that was the plan. It wasn't, and it wasn't what actually happened. So when one co-author noticed the weirdness in the description and raised it to Francesca, Francesca fixed it by conforming the description to what actually happened. Conformed it how? By changing the description of the study, explaining that payments were made in the second room only. But Harvard saw this change and charged that this change was deceptive, that in fact the study had been run in the flawed, brain-dead way described in the earlier draft and that Francesca covered up that blunder by revising the draft rather than discarding the flawed data. That cover-up, they alleged, was academic misconduct. 
Okay, but what evidence is there that the study was not conducted as the original version of the paper described? With one exception that I guess we'll take up in a minute, all the rest of the evidence that there is in this case. First, and again, simple logic: the design would be brain dead. No one, certainly not these academics, would make such a stupid design choice. Second, the lab manager, Jennifer, was actually in the room, or rooms, where it happened. She did not say that there was a payment in the first room. In fact, she observed that if there had been a payment in the first room, then in some cases they would have had to take money back from participants in the second room. She said she didn't remember taking back any money at all. Third, the receipts. Again, if they had paid participants in both rooms, some would have had to have some of their payments reversed based on what happened in the second room, but none of the receipts indicate anyone gave any money back. And if they had paid participants in both rooms, there would have to be two receipts per person or two sections of one receipt showing the first payment and later the second. The receipts have nothing like that either. Fourth, the structure of the data. If there were two payments, there would need to be two variables in the data set to record that, maybe two rows for each participant or one row with two numbers, in two columns or separated by a comma. Well, there's nothing like that either. There was one row, consistent with the obvious design that payment be made once, after the treatment. Fifth, HBS's second data expert acknowledged the weirdness in this charge, as he said, quote, the part that I find myself wondering about, end quote, on which, quote, I think there is the most room for disagreement, end quote. I can see why he was wondering, but I can't see why he thought there was any room for disagreement. 
So bottom line, we have a weird initial draft of the paper that described a brain-dead design for the experiment. But beyond that draft, there is no direct evidence to support the claim that this was, in fact, how the experiment was conducted. And there's clear indirect evidence that it was not done like that, both the logic and the structure of the data that was actually collected. So then what does Francesca say actually happened here? Well, first, like any normal person, she has no actual recollection of what was going on with these changes 15 years ago. But second, she has said that if there were changes, they were changes to conform the paper to what actually happened in the room. Period. Okay, so what's the exception? What evidence was there that the study was conducted as the original version of the paper described it? The hearing committee pointed to language used on an IRB form for this study. Okay, what's an IRB? That's the Institutional Review Board. It's basically the system to assure that research involving human subjects treats those human subjects ethically. Harvard pointed to language on that form that suggested the study was actually planned in the brain-dead way we discussed. Is Harvard's claim that the researchers twice described the study as involving payments in both rooms, so that's two times the evidence suggesting that that's in fact what happened? That's their theory. But actually, it's clear that the language on the IRB form was just copy-pasted into the write-up of the results of the study, which then became an early draft of the paper. It's clear? That's exciting. Something in this case is clear. Why is it clear that it was just copy-pasted? The study is described in bullet points in both documents. But there's an identical typo in both documents. They both say, quote, participants are welcomed to the lab, asked the read the consent form for the study, and sign it, end quote. That's the typo: asked the read rather than asked to read. 
That's why I say that it's clear that these are not two documents independently created, each describing payments in both rooms. This is one document that was created, then copied into another, replicating the original typo in both places. There's nothing surprising about that. Researchers copy and paste across documents all the time. Many researchers feel some annoyance at having to write IRB proposals for studies that don't seem to present any kind of risk to participants, but by copying and pasting out of an IRB proposal, at least something useful comes from that process. And since the study is expected to be consistent between the proposal and what is actually done, ordinarily this would be fine. Did the description of payment in both rooms survive on the IRB submission? We don't have the submission, so we don't know what it said. The document that Harvard was pointing to is not from UNC's IRB. It was simply a document found on Francesca's computer. That document was the form that would have been submitted to the IRB. We have no evidence whether, in fact, it ever was. And so why don't we have that evidence? Well, it wasn't in the evidence. Francesca reached out to the IRB office and asked them if they had the final statements. The UNC IRB office said they did not. Okay, so basically, once again, we have an incomplete record about what actually was submitted to the IRB, and so no clear foundation to conclude, based on what was submitted to the IRB, that the experiment was as this original draft of the IRB form suggests. Certainly no reason to believe that this form was independently created, and therefore suggesting that there were two documents describing this brain-dead design. Instead, the text was copy-pasted between the documents, and so we're left with still just one document describing a brain-dead design and no additional evidence to support the finding that, in fact, that's how the study was conducted. 
Yeah, I would say that what was, or more accurately, what might have been submitted to the IRB doesn't change the way we look at whether the initial description actually described what actually happened in the room. Okay, so just to repeat and sum it up because it really is quite extraordinary. Whatever was described originally, there is first, no evidence from anybody who was actually in the room that there were payments done in both rooms. Second, no evidence in the forensic evidence, nothing in the receipts, nothing in the data files that suggest that payments were made in both rooms. And third, there is no godly reason why that's the way an experiment like this would have been designed, because that design would have no way to measure what the study was intended to measure. Causation flows in one direction. If a pledge of honesty is to have any effect, it's only going to have an effect in the future. So when Harvard found the change in the description, it could either have believed, number one, that the change was intended to conform the paper to what actually happened, or number two, that the change was meant to cover up what actually happened. But here's an example where there's not just no clear and convincing evidence of Francesca's guilt, but there's clear and convincing evidence of Francesca's innocence. You cannot, as a matter of law, conclude that the paper was covering up anything when number one, there was no evidence beyond the initial draft of anything to be covered up, and number two, that the thing that would be covered up makes no freaking sense as a design for an experiment. As between the two possibilities, number one, that the change was meant to conform the paper to the experiment, or number two, that the change was meant to hide a brain-dead design for the experiment, only the most biased fact finder could conclude number two over number one. 
And certainly no fair fact finder could conclude with clear and convincing evidence that these talented researchers, later joined by two other researchers, would design such a stupid study and then attempt to cover up their own design stupidity. To the contrary, there was a typo in an earlier draft of a paper describing the research project. That typo was corrected. It should be chilling to any academic that the evolution of a draft paper 15 years ago would be foundation for the prosecution to remove someone's tenure. Okay, that's allegation number four. Ava, thanks so much for participating in this conversation. I look forward to talking more about the remaining two allegations. Thanks. That was episode five of season four of the podcast, The Law, Such As It Is. We will consider in the next two episodes two more allegations against Francesca, and then there will be at least one final episode pulling it all together and suggesting where we go from here. You can find these podcasts wherever you find podcasts. There's a website, theginocase.info, that has source material behind the podcast. There's also a Substack you can find if you search for the Gino case with my name attached to it. I hope you will follow and share with people who might be interested to follow these facts. I'm grateful to my friends that they would help me unpack this story. I'm even more grateful that they will help me unpack the allegations in the next two episodes. Thanks again for listening. Stay tuned, I hope not too long from now, for the next episode. This is Larry Lessig.