Predicting Major Donor Prospects Using Machine Learning

Greg Lee, Aishwarya Vaishali Sathyamurthi

and Mark Hobbs

Jodrey School of Computer Science, Acadia University, Wolfville, Canada

Keywords:

Fundraising Institutions, Major Donors, Machine Learning.

Abstract:

An important concern for many fundraising institutions is major gift fundraising. Major gifts are large gifts

(typically $10,000+) and donors who give these gifts are called major donors. Depending upon the institution

type, major gifts can constitute 80% of donation dollars. Thus, being able to predict who will give a major gift

is crucial for fundraising institutions. We sought the most useful major donor prospect model by experimenting

with 11 shallow and deep learning algorithms. A useful model discovers major donor prospects (i.e., false

positives) without generating a similar number of false negatives, helping to preserve accuracy. The study also

examined the impact of using different types of data, such as donation data exclusively, on the model’s utility.

Notably, an LSTM-GRU model achieved a 92.2% accuracy rate with 110 false positive prospects and 40 false

negatives for a religious fundraising institution. This model could assist major donor ofﬁcers in identifying

potential major donors. Similarly, for an education fundraising institution, an extra trees classiﬁer was able to

generate a major donor model with 92.5% accuracy, 71 false positives and 40 false negatives. False positives

are prospects for fundraising institutions, providing major gift ofﬁcers potential major donors.

1 INTRODUCTION

Fundraising institutions rely on major gifts for a sig-

niﬁcant portion of their budget. University founda-

tions in particular receive about 80% of their donation

dollars from major gifts (Gift’s, 2021). The thresh-

old for a major gift varies by fundraising institution,

but typically ranges from $10,000 to $50,000. While

these are typical minimum thresholds, major gifts can

be in the range of millions of dollars.

Major donors are donors who have either given a

gift that meets the fundraising institution’s major giv-

ing threshold or who have an ofﬁcial pledge to do so

with the fundraising institution. Since these donations

can be 500x to 50000x larger than the average dona-

tion to a fundraising institution, fundraising institu-

tions spend much more time with each potential ma-

jor donor than non-major donors in order to increase

the likelihood of a gift. Thus, having a precise list of

likely major donors is critical for a fundraising insti-

tution’s success. The aim of the research presented in

this paper is to predict potential major donors. Ma-

jor donors are important because their gifts make up

a large chunk of the organisations overall fundrais-

ing revenue. It is crucial to prioritize the relationships

with them. Major donors are more inclined to give to

https://orcid.org/0009-0007-7107-1569

fundraising institutions that have a dedicated steward-

ship strategy to cultivate their relationships (Market-

ing, 2021).

An important distinction in the search for major

donors is the need for prospects, which amount to

false positives in the output of the model. These

prospects would be correctly classiﬁed as negative ex-

amples, since they have not yet given a major gift, but

a model that correctly classiﬁes all negative examples

is of no use to a fundraising institution or its major

gift ofﬁcers. On the other hand, false negatives are in

no way desirable as they amount to a model classi-

fying major donors as non major donors, which they

cannot be, since they have already given a major gift.

Thus, what is sought from a major gift model is an im-

perfect model (i.e., a model with some accuracy loss)

where more/most of the errors are false positives. We

thus created a metric called the False Positive Nega-

tive (FPN) ratio which we use in our empirical evalu-

ation to choose the best learners for various fundrais-

ing institutions. It is calculated as FP/FN. This is

similar to cost-sensitive learning, but more directly

models the relationship between (desirable) false pos-

itives and (undesirable) false negatives.

462

Lee, G., Sathyamurthi, A. and Hobbs, M.

Predicting Major Donor Prospects Using Machine Learning.

DOI: 10.5220/0012422700003636

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 16th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2024) - Volume 2, pages 462-470

ISBN: 978-989-758-680-4; ISSN: 2184-433X

2 RELATED RESEARCH

Most of the past research done predicting donor giv-

ing behavior makes use of linear techniques. Con-

nolly and Blanchette (1986) (Michael S. Connolly,

1986) used discriminant analysis, and Gerlinda Mel-

chiori (1988) (Melchiori, 1988) used classiﬁcation

analysis to predict donor behavior, both of which are

types of linear regression. These techniques are in-

appropriate when the object is to predict rare events

(such as giving over $10,000) or when the dependent

variable has an upper or lower bound and there are a

large number of individuals at the bound (as with giv-

ing, where there are numerous individuals with zero

giving).

Brittingham and Pezzullo note that certain current

characteristics of alumni were found to be predictors

for major gift giving in some studies, but not oth-

ers (Brittingham and Pezzullo, 1990). Income, age,

number of degrees from the institution, emotional at-

tachment to the school, participation in alumni events,

and participation in and donation to other voluntary

and religious groups were found to be predictors.

Wesley and Christopher (1992) used logit analy-

sis in 1992 to predict the individuals who would give

higher (e.g., $100,000) or lower ($1,000) donations

based on the data from the alumni database as well as

the geo-demographic information (Winship and Lin-

dahl., 1992). Their result showed that 92% of the

dollars could be collected with 36.5% prospects se-

lected in the annual fund model. Later with their up-

graded model (1994) (Lindahl and Winship, 1994),

a slightly better performance was achieved for major

gift prediction. In this research, the test results using

deep learning models showed accurate results when

using large data sets for certain fundraising institu-

tions, compared to some shallow learning models as

described in empirical studies.

3 PROBLEM FORMULATION

The business problem at hand is to generate a ranked

list of constituents who have never given a major gift

as prospects, so that MGOs

can focus their time and

effort on them. To do so, we solve the problem of de-

termining which machine learning algorithm can best

learn to distinguish between major donors and non-

Fundraising institutions employ major gift ofﬁcers

(MGOs) to seek out and ‘convert’ major donor prospects.

These MGOs can spend years developing a relationship

with potential major donors and thus the decision con-

cerning with whom to begin a relationship is an important

one (Gift’s, 2021).

major donors and then use that algorithm to predict

future major donors.

The process of securing a major gift generally

takes over a year, and involves several touch points

from the MGO. Typically, an MGO meets in person

with a major gift candidate on several occasions be-

fore a gift can be secured. This differs from non-

major gifts where there is generally just one touch -

an email, phone, or direct mail solicitation. Fundrais-

ing institutions must be aware of their cost per dol-

lar raised, so when an MGO spends fundraising in-

stitution money and time on a prospect, the prospect

must have the potential to give a large gift. Thus,

it is imperative that the model ordering the major

donor prospects be accurate, since so much time (and

money) will be spent with each prospect.

The data used in the experiments is provided

anonymously by Anonymous, an Anonymous-based

company whose objective is to help non-proﬁt orga-

nizations raise more money by focusing on turning

one-time donors into lifetime supporters. Anonymous

works with organizations such as universities and dis-

ease related fundraising institutions. They create per-

sonalized emails and develop donor proﬁles based on

their interaction with the software. This approach

generates a huge amount of data, which is provided to

machine learning algorithms to help achieve the ob-

jective of this research.

The major donor data generated by Anonymous

is based on constituent interaction with fundraising

institutions. For our experiments, we collected data

from 8 fundraising institutions as shown in the Ta-

ble 1.

Table 1: Data sets from 8 fundraising institutions from 3

verticals (disease, education, religious).

Representation Type of FI’s

AlzF Alzheimer’s FI

CF Cancer FI

EF-1 Educational FI 2

EF-2 Educational FI 2

EF-3 Educational FI 3

EF-4 Educational FI 4

RF-1 Religious FI 1

RF-2 Religious FI 2

These data sets have far fewer major donors than

non-major donors as seen in Table 2. This means

the major donor data is heavily skewed towards non-

major donors and must be balanced before training a

model (Lee et al., ). Note that fundraising institutions

EF-1, EF-2, EF-4 and RF-1 had signiﬁcantly more

major donors than the other 4 fundraising institutions

and we focus our attention on these. We examine the

Predicting Major Donor Prospects Using Machine Learning

463

results for AlzF, CF, EF-3 and RF-2 in Experiment 5.

Table 2: Number of samples of each type in each data set.

Major Donors Non-Major Donors

AlzF 46 90859

CF 82 52123

EF-1 2080 104677

EF-2 4393 76155

EF-3 658 54121

EF-4 3226 211519

RF-1 1856 64843

RF-2 309 101459

Fundraising institutions gather data on their con-

stituents for tax purposes. This information includes

the constituent’s address, as well as the donation

amount and date. As data analysis and machine learn-

ing technologies become more common, fundraising

institutions have recognized the value of data and be-

gun to collect more data to help differentiate between

constituents. This data can be broken down roughly

into the following categories:

3.1 Demographic Data

Demographic data include age, gender, income, and

job title, but most fundraising institutions do not keep

track of these values for many constituents. Instead,

address information can be used to infer some of this

information, and the method of request is recorded to

determine which solicitation methods and modes of

communication are acceptable to a constituent.

3.2 Donation Data

Donation data is recorded by fundraising institutions

to track revenue. Donation dates and amounts can

aid machine learning algorithms, but they do not di-

rectly provide trend data to them. As a result of these

two simple features, we create new donation features

such as number of donations by phone appeal, small-

est gift, and variance.

3.3 Educational Data

University foundations beneﬁt from a more in-depth

understanding of their constituents’ activities while

they were students. These foundations use club mem-

berships, degree numbers, and graduation dates to de-

termine what materials to send to their alumni, and

machine learning can use these features as well.

3.4 Behavioural Data

The interactions of constituents with a given fundrais-

ing institution are frequently not recorded by the

fundraising institution, but they can be an indicator of

future giving. Whether a constituent attends events,

volunteers, opens emails, or watches fundraising in-

stitution videos, computers can learn how much afﬁn-

ity a constituent has for a fundraising institution.

4 DATA PREPARATION

We feed the data in the form of comma separated

value (CSV) ﬁles to machine learning models whose

dimension along the X-axis is the number of con-

stituents and the dimension along the Y-axis is the

number of features.

There are two different datasets fed to the machine

learning model, major donors (data which has ma-

jor donations made by constituents) and non-major

donors (data where no major donations are made by

constituents). The non-major donors data have more

samples (negatives) compared to major donor data

(positives) which makes the data unbalanced.

As the data in Table 2 are unbalanced, we balance

the data of major donors and non-major donors and

then split into train (70%) and test (30%) to feed to the

machine learning algorithm and calculate the accu-

racy and FPN ratio. We oversample the major donors

dataset and balance using the following approach.

We have oversampled the minority class us-

ing synthetic minority oversampling technique

(SMOTE). We deﬁne a SMOTE instance with default

parameters that will balance the minority class and

then ﬁt and apply it in one step to create a trans-

formed version of the dataset. Once transformed, we

summarize the class distribution of the new trans-

formed dataset, which would be balanced through the

creation of new synthetic examples in the minority

class.

4.1 Dealing with Missing Values in

Dataset

Most statistical modeling is unable to handle missing

values and may produce unpredictable results. In this

research, all the null values are replaced with zero be-

cause with neural networks, it is safe to input miss-

ing values as zero, with the condition that zero is not

already a meaningful value. The network will learn

from exposure to the data that the value zero means

missing data and will start ignoring the value.

ICAART 2024 - 16th International Conference on Agents and Artiﬁcial Intelligence

464

4.2 Handling Categorical Data

Categorical data is also known as nominal and ordi-

nal data. Some features, such as “title” (e.g., “Ms”)

are nominal and thus need to be transformed for most

machine learning algorithms. We use one-hot encod-

ing and create a new feature for every value of each

nominal feature, with exactly one of these newly cre-

ated features having value 1 for each parent feature,

and the rest of the values being 0.

4.3 Handling Giveaway Features

Giveaway features for major giving include maximum

donation, average donation, intercept, slope, total do-

nations and standard deviation of gifts, since values of

these that are larger than the major giving threshold

for a fundraising institution can immediately reveal to

a model who the major donors are, and thus create

a perfect, yet useless, model, since no false positives

will occur and thus prospects are found. We remove

giveaway features from the data in order to build use-

ful models.

5 THEORY AND APPROACH

We used 11 machine learning algorithms in order to

try to accurately model major giving, so that we can

feed a model a constituent and get an accurate idea

of whether that constituent is likely to give a major

gift. In this section, we brieﬂy describe the algorithms

used that produced usable results, and the setup of our

empirical work.

We compared various machine learning and deep

learning techniques and evaluated the mean accuracy

for each of them by a stratiﬁed K-fold cross-validation

to prevent overﬁtting. In this basic approach, K-fold

CV, the training set is split into k smaller sets:

1. The model is trained using the K-1 folds as train-

ing data.

2. The last fold is used to compute the model perfor-

mance.

5.1 Gaussian Naive Bayes

Naive Bayes classiﬁers are a group of supervised ma-

chine learning classiﬁcation algorithms based on the

Bayes theorem. It is a simple classiﬁcation technique,

but has high functionality (Majumder, 2021). Com-

plex classiﬁcation problems can also be implemented

by using Naive Bayes Classiﬁer. When working with

continuous data, an assumption often taken is that the

continuous values associated with each class are dis-

tributed according to a normal (or Gaussian) distribu-

tion. The likelihood of the features is assumed to be:

P(x

| y) =

2πσ

exp

−

− µ

)

2σ

(1)

Sometimes assume variance is independent of Y

(i.e., σi), or independent of Xi (i.e., σk) or both (i.e.,

σ(z))

5.2 Decision Trees

Decision trees are fundamentally recursive, the algo-

rithm learns through repetition (Khan, 2021). The al-

gorithms attempts different splits and determines the

split that achieves the correct classiﬁcation as many

times as possible. The root node is selected based on

the attribute selection measure (ASM) and is repeated

until there is a leaf node (cannot split anymore). ASM

is a technique used in data mining processes for data

reduction. The two main ASM techniques are Gini In-

dex and Information Gain (ID3). The ID3 algorithm

builds decision trees using a top-down greedy search

approach through the space of possible branches with

no backtracking. The steps in ID3 algorithm are as

follows:

1. It begins with the original set S as the root node.

2. On each iteration of the algorithm, it iterates

through the very unused attribute of the set S and

calculates Entropy(H) and Information Gain (IG)

of this attribute.

3. It then selects the attribute which has the smallest

entropy or largest information gain.

4. The set S is then split by the selected attribute to

produce a subset of the data.

5. The algorithm continues to recur on each subset,

considering only attributes never selected before.

5.3 Random Forest Classiﬁer

A random forest is a collection of decision trees

whose results are aggregated into one ﬁnal result.

They limit overﬁtting without substantially increasing

error due to bias. It is also one of the most used algo-

rithms, because of its simplicity and diversity (it can

be used for both classiﬁcation and regression tasks).

A random forest has nearly the same hyperparameters

as a decision tree or a bagging classiﬁer. With random

forest, we can also deal with regression tasks by us-

ing the algorithm’s regressor. Random forest adds ad-

ditional randomness to the model, while growing the

Predicting Major Donor Prospects Using Machine Learning

465

trees. Instead of searching for the most important fea-

ture while splitting a node, it searches for the best fea-

ture among a random subset of features. We can also

make trees more random by additionally using ran-

dom thresholds for each feature rather than searching

for the best possible thresholds (like a normal deci-

sion tree does).

5.4 Extra Trees Classiﬁer

An extra trees classiﬁer also known as extremely ran-

domized trees is an ensemble machine learning algo-

rithm that combines predictions from many decision

trees. It is related to random forests. It uses a simpler

algorithm to construct the decision trees than random

forests do to use as members of the ensemble.

5.5 Adaboost Classiﬁer

Adaboost or Adaptive Boosting is one of ensem-

ble boosting classiﬁer proposed by Yoav Freund and

Robert Schapire in 1996 (Navlani, 2018). It is a

meta-estimator that begins by ﬁtting a classiﬁer on the

original dataset and then ﬁts additional copies of the

classiﬁer on the same dataset but where the weights

of incorrectly classiﬁed instances are adjusted such

that subsequent classiﬁers focus more on difﬁcult

cases. The basic concept behind Adaboost is to set the

weights of classiﬁers and training the data sample in

each iteration such that it ensures the accurate predic-

tions of unusual observations. Any machine learning

algorithm can be used as a base classiﬁer if it accepts

weights on individual training examples. Adaboost

should meet two conditions:

1. The classiﬁer should be trained interactively on

various weighed training examples.

2. In each iteration, it tries to provide an excellent ﬁt

for these examples by minimizing training error.

5.6 Convolutional Neural Networks

Convolutional neural networks (CNNs) are a spe-

cialized type of neural network model designed for

working with two-dimensional image data, although

they can be used with one-dimensional and three-

dimensional data.

CNNs have the same functionality irrespective of

their dimensionality. The only difference is the struc-

ture of the input data and how the ﬁlter, also known as

convolutional kernel or feature detector, moves across

the data. Each layer of CNNs (Figure 1) conduct dif-

ferent tasks.

Convolutional Layer: It has two key parameters.

One is the kernel size, and the other is the number of

Figure 1: Convolutional neural networks(Saha, 2018).

ﬁlters. The layer ﬁrst divides the input into ﬁxed-size

patches that are the same size in all ﬁlters.

Pooling Layer: It reduces the size of a feature

map. Once the feature maps from previous convo-

lutional layer enter a pooling layer, the network di-

vides the input feature map into a ﬁxed number of re-

gions (determined by the pool size) and summarises

the value in each region into a single maximum or av-

erage value.

Dense Layer: It is also known as fully connected

layer and is the last part of the network. Follow-

ing completion of all convolutional-pooling computa-

tions, the network arranges the values of ﬁnal feature

maps in a row.

5.7 Recurrent Neural Networks (RNNs)

Standard neural networks do not have memory to

store what they learn. RNNs have a unique archi-

tecture that enables data to persist and models short

term dependencies. So, RNN are neural networks that

are designed for the effective handling of sequential

data but are also useful for non-sequential data (Seker,

2020).

5.8 Gated Recurrent Unit (GRU)

A Gated Recurrent Unit is a type of Recurrent Neu-

ral Network that addresses the issue of long term de-

pendencies which can lead to vanishing gradients. To

solve the vanishing gradient problem of a standard

RNN, GRU uses an update gate and reset gate. GRUs

store “memory” from the previous time point to help

inform the network for future predictions (Kostadi-

nov, 2017).

5.9 Recurrent Neural Networks

Standard neural networks such as feed forward neu-

ral networks do not have memory to store what they

learn. For every iteration, the network starts fresh as

it does not remember the data in the previous itera-

tion while processing the current set of data, which

ICAART 2024 - 16th International Conference on Agents and Artiﬁcial Intelligence

466

is a disadvantage when identifying correlations and

data patterns. This is where recurrent neural networks

(RNNs) come into picture. RNNs have a unique

architecture that enables data to persist and models

short term dependencies. So, RNN are neural net-

works that are designed for the effective handling of

sequential data but are also useful for non-sequential

data (Seker, 2020).

5.10 Gated Recurrent Unit

Gated Recurrent Unit is a type of Recurrent Neural

Network that addresses the issue of long term depen-

dencies which can lead to vanishing gradients larger

vanilla RNN networks experience. To solve the van-

ishing gradient problem of a standard RNN, GRU

uses update gate and reset gate. These are two vectors

which decide what information should be passed to

the output. GRUs address this issue by storing “mem-

ory” from the previous time point to help inform the

network for future predictions (Kostadinov, 2017).

5.11 Long Short-Term Memory

Networks

Long Short-Term Memory networks (LSTMs) is a

type of RNNs, which are capable of learning long-

term dependencies and they work effectively on a

large variety of problems. LSTMs remember infor-

mation for a long period of time and are designed

explicitly to solve long-term problems. LSTMs have

similar structure though the internals have different

components when compared to a single tanh (activa-

tion) layer in RNN. The four layers in the architecture

interact with each other. The cell state C allows infor-

mation to ﬂow through the entire LSTM unchanged,

which enables the LSTM to remember context for a

long period of time (See Figure 2). The horizon-

tal line has several inputs and outputs which is con-

trolled by gates that allows information to be added

to or removed from the cell state. The sigmoid lay-

ers output numbers between 0 and 1, describing how

much should be let through from each component. An

LSTM has three of these gates to control the cell state:

Forget gate, input gate and output gate.

5.12 Bi-Directional Long Short-Term

Memory Networks

A Bi-directional LSTM, or BDLSTM, is a sequence

processing model that consists of two LSTMs: one

taking the input in a forward direction, and the other

in a backwards direction. BDLSTMs effectively in-

Figure 2: Long Short-term Memory(blog, 2015).

Figure 3: Bi-directional long short-term memory (Alhamid,

2021).

crease the amount of information available to the net-

work, improving the context available to the algo-

rithm. BDLSTM adds one more LSTM layer, which

reverses the direction of information ﬂow. It means

that the input sequence ﬂows backward in the addi-

tional LSTM layer. Then it combines the outputs from

both LSTM layers in several ways, such as average,

sum, multiplication, or concatenation (Figure 3).

6 EMPIRICAL EVALUATION

The initial experiments were carried out on balanced

datasets using random forest classiﬁers, Adaboost,

extra trees classiﬁers, Gaussian na

ıve-Bayes, decision

trees and logistic regression to see how well shallow

machine learning models can predict potential major

donors, Non-major donors (negative cases) outnum-

ber major donors (positive cases) for all fundraising

institutions as seen in Table 2. Data was balanced

in all experiments in order to not bias the model to-

wards negative cases. Testing data was kept separate

from training data and all results shown are on testing

data. We used accuracy and the False Positive Neg-

ative (FPN) ratio as metrics to evaluate our models.

To be included in the presented results, models must

have had an FPN of at least 1 (at least as many false

positives as false negatives) and an accuracy above

80%. When more than one model met these criteria

for a fundraising institution, the model with the higher

Predicting Major Donor Prospects Using Machine Learning

467

FPN was presented. There were some exceptions to

this rule where there were no FPNs >= 1 or no accu-

racies above 80% among the models that were used.

6.1 Experiment 1: Predicting Prospects

Using Shallow Models

The goal of this experiment was to demonstrate shal-

low learning models’ ability to predict major donor

prospects for fundraising institutions (FIs). We exper-

imented with 6 different ML models for 4 fundraising

institutions in order to accurately predict future ma-

jor donor prospects. Based on the accuracy and FPN

values, the best performing models are shown in Ta-

ble 3. For each of the fundraising institutions with a

signiﬁcant number of major donors, there is a shallow

learner that is able to provide an accurate model (of

at least 82%) with at least as many FPs as FNs (an

FPN > 1). While AdaBoost generally provides the

most accurate models with FPN ratios above 1, de-

cision trees and extra tree classiﬁers sometimes have

similar performance.

Table 3: Best shallow learners for Experiment 1.

Learner FI Accuracy STD FP FN FPN

ExtraTrees EF-1 92.52 0.117 71 40 1.78

AdaBoost EF-2 91.83 0.045 87 86 1.01

AdaBoost EF-4 88.94 0.110 44 31 1.42

AdaBoost RF-1 81.95 0.135 134 67 2.00

6.2 Experiment 2: Predicting Prospects

Using Deep Learning

This experiment’s objective is to improve the FPN

while maintaining similar accuracies to Experiment 1

using deep learning techniques. Table 4 shows results

for 4 fundraising institutions (FIs) using deep learn-

ing algorithms. For EF-1 and EF-2, there is a drop in

accuracy, although EF-2 has a 65% rise in FPN. For

RF-1, there is an increase in both the accuracy of the

learner and the FPN, showing that deep learning helps

for that particular fundraising institutions.

Table 4: Best deep learners for Experiment 2.

Learner FI Accuracy STD FP FN FPN

BDLSTM-GRU-TDL EF-1 87.27 0.014 113 76 1.49

RNN EF-2 80.80 0.031 254 153 1.66

GRU EF-4 91.00 0.008 50 48 1.04

LSTM-GRU RF-1 92.19 0.012 110 40 2.75

6.3 Experiment 3: Predicting Prospects

Using Only Donation Data

This experiment’s objective is to input only donation

data to deep learning models and remove the behav-

ioral data, educational data, demographic data and

giveaway features to observe the effect for 4 fundrais-

ing institutions (FIs) to predict future major donors.

The data we used for this experiment is the dona-

tion data used in Experiment 6.1 (Table 1). Table 5

shows the best deep learners using only donation data.

LSTM-GRU is the most useful learner in this exper-

iment, showing an improvement or holding steady

over Experiment 2 in terms of both accuracy and FPN

ratio. We include LSTM-GRU for RF-1 here to show

that while BDLSTM-CNN was better in terms of ac-

curacy and FPN ratio, the difference is minimal and

that LSTM-GRU is generally the best choice of algo-

rithm when using only donation data.

Table 5: Best deep learners for Experiment 3.

Learner FI Accuracy STD FP FN FPN

LSTM-GRU EF-1 85.65 0.016 143 70 2.04

LSTM-GRU EF-2 85.75 0.008 216 86 2.51

LSTM-GRU EF-4 94.00 0.016 123 111 1.10

BDLSTM-CNN RF-1 94.34 0.019 32 31 1.03

LSTM-GRU RF-1 94.25 0.017 31 33 0.94

6.4 Experiment 4: Predicting Prospects

Using Only Donation and

Behavioural Data

This experiments objective is to input only donation

and behavioral data to deep learning models and re-

move educational data, demographic data and give-

away features to observe the effect for 4 fundraising

institutions (FIs) to predict future major donors. The

data we used for this experiment is the same dona-

tion and behavioural data used in experiment 6.1 (Ta-

ble 1). Table 6 shows the results. Accuracies drop

compared to Experiment 3, which is surprising given

that behavioural data is included here and is not in-

cluded in Experiment 3. False positive numbers in-

creasing is the explanation for this accuracy drop, and

if a fundraising institution is seeking more prospects

and willing to sacriﬁce some accuracy, using both do-

nation and behavioural data may be the best decision.

Note that we again include LSTM-GRU here for RF-

1 to show it’s FPN of 8 with an accuracy comparable

to RNN.

ICAART 2024 - 16th International Conference on Agents and Artiﬁcial Intelligence

468

Table 6: Best deep learners for Experiment 4.

Learner FI Accuracy STD FP FN FPN

RNN EF-1 77.23 0.096 205 133 1.54

LSTM-GRU EF-2 80.18 0.133 252 168 1.5

GRU EF-4 88.02 0.027 116 102 1.13

RNN RF-1 90.21 0.034 84 25 3.36

LSTM-GRU RF-1 89.49 0.029 104 13 8

6.5 Experiment 5: Predicting Prospects

for Smaller Fundraising Institutions

While fundraising institutions AlzF, CF, EF-3 and RF-

2 have only 46, 82, 658, and 309 major donors each,

they do represent much of the charitable sector that

does not have a large amount of data on major donors.

We present the best learners for these fundraising in-

stitutions (FIs) in Table 7. Note that all results are

using all data available. While the results are more

difﬁcult to trust given the smaller number of false pos-

itives and false negatives, they do follow the same pat-

tern as those in Experiments 1-4 and should provide

smaller fundraising institutions with some conﬁdence

that these methods will help them ﬁnd major donor

prospects.

Table 7: Best learners for smaller fundraising institutions.

Learner FI Accuracy STD FP FN FPN

GRU AlzF 88.00 0.015 6 2 3

Random Forest CF 94.11 0.162 2 1 2

RNN EF-3 85.56 0.016 33 21 1.57

BDLSTM-CNN RF-2 96.54 0.057 9 7 1.29

7 DISCUSSION AND FUTURE

WORK

1. Based on Experiments 1-5, shallow ML models

such as AdaBoost and Random Forest and deep

learning models, such as LSTM-GRU, BDLSTM-

GRU-TDL, BDLSTM-CNN, GRU, and RNN can

be used to accurately predict major donors for a

fundraising institution, with an FPN ratio above 1.

We summarize the best learners for each fundrais-

ing institution in Table 8.

2. The LSTM-GRU algorithm is the most consis-

tent across all fundraising institutions, in terms

of accuracy and FPN. While they are not the best

model for all data sets, the models produced are

among the best models.

3. Gaussian na

ıve-Bayes, Logistic Regression and

basic Decision Trees rarely produced models of

the same quality as other shallow models (such as

Table 8: Best ML model for each fundraising institution,

using accuracy and FPN.

Models Data set ML Model AccuracyFPN

AlzF All features GRU 88.00 3

CF All features Random Forest 94.11 2

EF-1 All features Extra trees 92.52 1.78

EF-2 Donation LSTM-GRU 85.75 2.51

EF-3 All features RNN 85.56 1.57

EF-4 Donation BDLSTM 94.00 1.10

-CNN

RF-1 Donation + RNN 90.21 3.36

Behavioural

RF-2 All features BDLSTM 96.54 1.29

-GRU-TDL

random forests, extra trees or AdaBoost) or deep

models.

4. For most models, eliminating data did increase

both false positives and false negatives (and thus

decrease accuracy), but as was seen in Table 5,

LSTM-GRU models actually improve or stay the

same with less data. This shows that perhaps de-

mographic, education, and behavioural data can

be noisy and that donation data provides a clearer

signal to LSTM-GRU models.

The current research can be advanced by taking

the following ways:

1. Using wealth indicators which are publicly avail-

able data points about donors that provide insights

into their income and wealth status. Wealth in-

dicators can tell which of the prospects are ﬁ-

nancially capable of making a major gift and the

likely size of that gift.

2. It will be interesting to explore other models such

as univariate chi-square methods for features se-

lection. The SMOTE upsampling method that

perturbs some of the features during upsampling

could be implemented and compared with the cur-

rent results.

3. Deep ANNs could be used to develop a regres-

sion model for predicting how much money major

donor constituents would actually contribute.

REFERENCES

Alhamid, M. (2021). Bidirectional lstm. https://

towardsdatascience.com/tagged/bidirectional-lstm.

blog, C. (2015). Understanding lstm networks.

https://colah.github.io/posts/2015-08-Understanding-

LSTMs/.

Brittingham, B. E. and Pezzullo, T. R. (1990). Fund

raising in higher education / barbara e. brittingham

Predicting Major Donor Prospects Using Machine Learning

469

and thomas r. pezzullo. https://catalogue.nla.gov.au/

Record/5520361.

Gift’s, M. (2021). Donor search. https://www.donorsearch.

net/major-gifts-guide/.

Khan, S. (2021). Data science explained: Deci-

sion trees. https://www.godatadrive.com/blog/data-

science-for-business-leaders-decision-trees.

Kostadinov, S. (2017). Understanding gru networks.

https://towardsdatascience.com/understanding-gru-

networks-2ef37df6c9be.

Lee, G., Raghavan, A. K. V., and Hobbs, M. Machine learn-

ing the donor journey. In Goutte, C. and Zhu, X., ed-

itors, Advances in Artiﬁcial Intelligence - 33rd Cana-

dian Conference on Artiﬁcial Intelligence, Canadian

AI 2020, Ottawa, ON, Canada, May 13-15, 2020, Pro-

ceedings.

Lindahl, W. and Winship, C. (1994). A logit model with

interactions for predicting major gifts doonors. Re-

search in Higher Education, 35(6):729–743.

Majumder, P. (2021). Gaussian naive bayes. https://iq.

opengenus.org/gaussian-naive-bayes/.

Marketing, N. (2021). Major donor fundraising: Effec-

tive strategies for 2022. https://www.donorsearch.net/

major-donor-fundraising/.

Melchiori, G. S. (1988). Alumni research: An introduc-

tion. https://onlinelibrary.wiley.com/doi/abs/10.1002/

ir.37019886003.

Michael S. Connolly, R. B. (1986). Understanding and pre-

dicting alumni giving behavior. https://onlinelibrary.

wiley.com/doi/abs/10.1002/ir.37019865107.

Navlani, A. (2018). Adaboost classiﬁer in python.

https://www.datacamp.com/community/tutorials/

adaboost-classiﬁer-python.

Saha, S. (2018). A comprehensive guide to convolutional

neural networks. https://towardsdatascience.com/a-

comprehensive-guide-to-convolutional-neural-

networks-the-eli5-way-3bd2b1164a53.

Seker, E. (2020). Recurrent neural networks and lstm

explained. https://purnasaigudikandula.medium.

com/recurrent-neural-networks-and-lstm-explained-

7f51c7f6bbb9.

Winship, C. and Lindahl., W. E. (1992.). Predictive mod-

els for annual fundraising and major gift fundraising.

Nonproﬁt Management and Leadership.

ICAART 2024 - 16th International Conference on Agents and Artiﬁcial Intelligence

470