GrantInq - Shiny Contest Submission

GrantInq

Authors: Finn Luebber

Abstract: The majority of research funding is distributed via grants. Researchers write applications, and most of them are not funded, leading to a huge waste of resources. But does this process find the best applications? How expensive are different variations? And what role does fairness, or a lack thereof, play in selecting only excellent proposals?
In this app, users can create their own funding scenarios: They set the organizational parameters of the process and decide how the selection of applications is designed. They also decide how well reviewers can distinguish between good and bad ideas, and whether they are biased against a subgroup of applicants.
Finally, users can simulate data and compare the outcomes of their scenario with those of other scenarios in terms of quality, costs, and bias.

Full Description:

Motivation for the app

The main purpose of the app is to put numbers to arguments about grant funding and to simulate how these numbers translate into outcomes in terms of quality, fairness, and costs of grant application scenarios. Arguments about how to distribute money for research become emotional quite quickly, and it is often unclear where the disagreement arises. Using the simulation feature of the app, the generated data can serve as a foundation for arguing about the pros and cons of particular grant funding scenarios. The main outcomes are the quality of the funded proposals, the overall costs, and the equality between groups, i.e., whether members of different groups (defined by the user, for instance men and women) have the same chance of receiving funding.

For example: Peer review of grant applications is standard procedure, but also a very expensive one once the time reviewers spend on proposals is factored in. So, let’s say Alice wants to devise a new strategy involving fewer reviewers (or even zero, in case of a lottery), which might save society a lot of money. However, Bob argues that the quality of funded proposals might drop due to a less accurate review process. Next, Carol remarks that reviewers are sometimes biased against certain groups, so fewer reviews might increase overall fairness. Then Dan chimes in, saying that reviewers often show conservatism bias: They tend to favor known avenues of research over ground-breaking ones, so fewer reviewers could lead to more ground-breaking research. What does that mean for the proposed new grant application scenario – would it be successful?

It is hard to tell, since the arguments are qualitative at best. To decide whose argument is correct, and to what extent, it is important to know what to expect from the proposed scenario in terms of outcomes. If Alice can say how much money would be saved, Bob has an idea of how much the quality might suffer, Carol can point out how biased the processes are, and Dan can tell how much more likely a ground-breaking project becomes (and how much better it is than usual proposals), all within a common framework, the argument becomes more solution-oriented, because it uncovers the source of the disagreement and how to move forward:

If all agree on how the simulation operates and on the individual parameter values, the simulation might show that the proposed new scenario leads to better outcomes overall, and the discussion ends (at least in principle; certain human characteristics might lead to slight complications). In the more likely case that the proposed scenario is better in one aspect but worse in another, this turns into a discussion about how to weigh these aspects: Is it worth trading a bit of quality for money that could flow back into the system somewhere else? This might not lead to agreement, but it becomes clear that this is now a discussion about values and political decisions, not something that can be settled with a better theory or more data.

If they agree about how the simulation operates but disagree about parameter values, this can lead to new research ideas about how to estimate these parameters empirically. If they disagree about how the simulation operates, this can lead to an improved model of the process, at which point the argument goes back to step 1 with an improved theory. Overall, an app like this strengthens the theoretical-empirical cycle and lowers the barrier to entry for people who do not do this kind of modeling and programming on a daily basis.

How the app works

Details about the app’s inner workings can be found [here](https://github.com/finn-luebber/GrantInq/blob/main/Supplementary_Information_Nature_Human_Behavior_2023_06_21.pdf). Since that is quite a long document, here is a brief walk-through of how a user might use the app:

In the main box at the top, users can first get acquainted with the app by viewing the “Quick Guide Through App” on the left, which offers a short introduction to the concepts. Below that, users can choose and explore example scenarios modeled after established funding schemes. In general, the inputs and outputs have tooltips attached where users can find more information. Right next to it, the organizational details of the process can be adjusted: how much money there is to distribute, how much each winner should receive, how many applications there are initially, how many stages there are at which applications can be rejected, and how many samples should be simulated. One step further to the right, the distribution of idea qualities can be changed; some people might argue that qualities are normally distributed, while others might expect some skew. The next part concerns the groups applying for funding: There can be a maximum of two, and users can define their names, the proportions of the groups, and whether they are equally likely to apply for a grant. The review parameters on the right determine how the review works. The inputs change depending on the “Competition Mode”, and the sliders for the number of accepted applications are responsive to each other: Whatever is selected in one stage becomes the maximum of the next (see the sketch below). If an early stage uses a normative process (which features a cut-off above which all applications are accepted, similar to publishing practice in academic journals), the slider in subsequent lottery or competitive stages changes depending on how many applications actually proceed, as this number is not deterministic when a quality cut-off is used.
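To give a flavor of how such coupled inputs can be implemented in Shiny, here is a minimal sketch (not the app’s actual code; the input IDs and limits are made up):

```r
library(shiny)

ui <- fluidPage(
  sliderInput("accept_stage1", "Accepted after stage 1",
              min = 1, max = 100, value = 50),
  sliderInput("accept_stage2", "Accepted after stage 2",
              min = 1, max = 100, value = 25)
)

server <- function(input, output, session) {
  # Whatever is selected in stage 1 becomes the maximum of stage 2
  observeEvent(input$accept_stage1, {
    updateSliderInput(session, "accept_stage2",
                      max   = input$accept_stage1,
                      value = min(input$accept_stage2, input$accept_stage1))
  })
}

shinyApp(ui, server)
```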

Below that, users find five boxes with a quick summary of the outcomes of their modeled process, as well as the opportunity to download the generated data. The bottom part features three different tabs: The tab “Save and Compare” offers the possibility to save a designed process, model a different one, save it as well, and compare the main results of both scenarios side by side. The tab “Costs” includes the parameters necessary to calculate the costs of the process, as well as output plots showing the costs and work hours in different splits; a back-of-the-envelope version of such a calculation is sketched below.
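The app’s actual cost parameters and formula are described in the supplementary document; the following is only a hypothetical illustration of the idea, with all names and numbers made up:

```r
# Hypothetical per-stage review costs: applications entering each stage,
# reviewers assigned per application, hours per review, and an hourly rate
review_costs <- function(apps_per_stage, reviewers_per_app,
                         hours_per_review, hourly_rate) {
  hours <- apps_per_stage * reviewers_per_app * hours_per_review
  data.frame(stage      = seq_along(apps_per_stage),
             work_hours = hours,
             cost       = hours * hourly_rate)
}

# Example: a two-stage process with a quick triage and a thorough final review
review_costs(apps_per_stage    = c(400, 80),
             reviewers_per_app = 3,
             hours_per_review  = c(2, 8),
             hourly_rate       = 60)
```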

The most important aspect is the way bias in the selection of grant applications is modeled in the “Diversity & Quality” tab: As stated earlier, bias in the grant allocation process can creep in not only as a disadvantage for a specific group of applicants, but also along the range of idea quality. Conservatism bias is an example that is discussed quite often in this context: It describes the observation that truly ground-breaking projects often deviate from their “normal-science” counterparts in a profound manner. Due to these differences, such applications receive very spread-out reviews in which some reviewers evaluate them very negatively, which gets them kicked out of the competition. In the app, these kinds of biases can be modeled very precisely:

The review process is modeled such that a single reviewer rating is the sum of the true quality score and an error term. The error term is drawn from a distribution with a given mean, standard deviation, and skew. Each of these three parameters can be made to depend on idea quality in a linear, quadratic, or cubic manner (under “Advanced settings” of bias in the Diversity & Quality tab). The user selects the desired degree of the polynomial and can then draw a custom curve specifying exactly what the dependency should look like for each of the three error distribution parameters. Finally, the parameters are combined for every step of idea quality and an error term is drawn. The outcomes of this process are shown on the right-hand side next to the curve-fitting procedure: the theoretical distribution of the error term by idea quality, and the rated qualities plotted against the true qualities.
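In code, this model can be sketched as follows. The snippet assumes a skew-normal error distribution (the app’s exact distribution is specified in the supplementary document) and uses the `sn` package; all function names are illustrative:

```r
library(sn)  # provides rsn(): random draws from a skew-normal distribution

# rating = true quality + error, where the error's location, scale, and skew
# can each depend on idea quality (e.g., linearly, quadratically, or cubically)
rate_once <- function(quality,
                      mean_fun = function(q) 0,
                      sd_fun   = function(q) 1,
                      skew_fun = function(q) 0) {
  err <- rsn(n     = length(quality),
             xi    = mean_fun(quality),   # location
             omega = sd_fun(quality),     # scale, must stay positive
             alpha = skew_fun(quality))   # skewness
  quality + as.numeric(err)
}

set.seed(1)
q     <- runif(1000, min = 0, max = 10)  # true idea qualities
rated <- rate_once(q)                    # unbiased, equally noisy reviews
plot(q, rated, xlab = "True quality", ylab = "Rated quality")
```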

In practical terms, one could model a one-stage review process in which the mean of observed idea quality drops dramatically beyond a given level by using a quadratic function, while at the same time the standard deviation increases linearly with idea quality, producing even more spread-out reviews for ground-breaking projects. This could, on top, differ between males and females, thus creating the aforementioned disadvantages for subgroups.
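Continuing the sketch above, such a scenario might look like this (all thresholds and coefficients are invented for illustration):

```r
# Conservatism-bias scenario: the mean error drops quadratically once idea
# quality exceeds a threshold, and the review noise grows linearly with
# quality, so ground-breaking ideas are penalized and rated less reliably
biased <- rate_once(
  q,
  mean_fun = function(q) -0.3 * pmax(q - 7, 0)^2,  # penalty kicks in above q = 7
  sd_fun   = function(q) 0.5 + 0.2 * q             # more spread for bolder ideas
)
plot(q, biased, xlab = "True quality", ylab = "Rated quality (biased)")
```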

For users who do not want to get involved that deeply, there are two alternative sets of options for adjusting biases: predefined biases, which require just one button click per stage, and a middle-ground option where radio buttons can be used to adjust the parameters (only the mean and standard deviation of the error scores) in a categorical fashion.

This process yields the observed rating of one reviewer for a single grant application at one stage of the process; the same process applies to multiple reviewers and applications. Finally, one score per application is calculated (the average, minimum, or maximum value), and applications are ranked by that score to determine which will advance to the next stage (or receive the grant money in the last stage); a sketch of one such stage follows below. If there are multiple stages, users can change the bias settings per stage. This makes it possible to model different review qualities per stage, for instance a short, inaccurate review at the beginning and a more thorough one in the final stage. Below the bias settings, there are three output plots summarizing the data generated from the general and bias settings: individual applications by quality (Y axis) and stage, mean quality of applications by stage, and group proportions by stage.
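A single stage of this kind can be sketched as follows, reusing `rate_once()` from above (again purely illustrative, not the app’s code):

```r
# One stage: n_reviewers rate each application, one score per application is
# formed (mean, min, or max), and the top n_accept applications advance
run_stage <- function(quality, n_reviewers, n_accept,
                      aggregate = mean, ...) {
  ratings <- replicate(n_reviewers, rate_once(quality, ...))  # apps x reviewers
  score   <- apply(ratings, 1, aggregate)
  advance <- rank(-score, ties.method = "random") <= n_accept
  list(score = score, advance = advance)
}

# Two-stage example with different review settings per stage
stage1 <- run_stage(q, n_reviewers = 3, n_accept = 80)   # quick, noisy triage
q2     <- q[stage1$advance]
stage2 <- run_stage(q2, n_reviewers = 5, n_accept = 20,  # thorough final stage
                    sd_fun = function(q) 0.3)            # more accurate ratings
```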

Altogether, the individual parameter options enable a detailed simulation of grant application processes, which facilitates discussions about how the flow of money in research should be organized, and therefore also about the direction in which science is heading.


Shiny app: GrantInq
Repo: [GrantInq/app.R at main · finn-luebber/GrantInq · GitHub](https://github.com/finn-luebber/GrantInq/blob/main/app.R)
