diff --git a/idea.txt b/idea.txt
index 1d83eea..cfe61ef 100644
--- a/idea.txt
+++ b/idea.txt
@@ -106,8 +106,30 @@ These datasets are readily accessible and can be used to develop the extensive s
 https://www.mdpi.com/2078-2489/14/1/31
 https://www.kaggle.com/code/pcharambira/predicting-blood-donations
 https://gallery.azure.ai/Experiment/Predict-Blood-Donation-Likelihood-using-Real-Dataset
+https://www.kaggle.com/datasets/whenamancodes/blood-transfusion-dataset
+https://archive.ics.uci.edu/dataset/176/blood+transfusion+service+center
+
+ Given is the variable name, variable type, the measurement unit and a brief description. The "Blood Transfusion Service Center" is a classification problem. The order of this listing corresponds to the order of numerals along the rows of the database.
+
+R (Recency - months since last donation),
+F (Frequency - total number of donation),
+M (Monetary - total blood donated in c.c.),
+T (Time - months since first donation), and
+a binary variable representing whether he/she donated blood in March 2007 (1 stand for donating blood; 0 stands for not donating blood).
+
+
+Table 1 shows the descriptive statistics of the data. We selected 500 data at random as the training set, and the rest 248 as the testing set.
+
+Table 1. Descriptive statistics of the data
+
+Variable                                    Data Type     Measurement  Description  min   max    mean     std
+Recency                                     quantitative  Months       Input        0.03  74.4   9.74     8.07
+Frequency                                   quantitative  Times        Input        1     50     5.51     5.84
+Monetary                                    quantitative  c.c. blood   Input        250   12500  1378.68  1459.83
+Time                                        quantitative  Months       Input        2.27  98.3   34.42    24.32
+Whether he/she donated blood in March 2007  binary        1=yes 0=no   Output       0     1      1 (24%)  0 (76%)
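
Note on the split described in the added text: the passage says 500 records were drawn at random for training and the remaining 248 used for testing, which implies 748 rows in total. The following is a minimal sketch of one way such a split could be reproduced, not the procedure the dataset authors actually used. The function name `split_indices` and the fixed seed are assumptions made here for illustration; the source does not state a seed or a sampling method.

```python
import random

N_ROWS = 748   # 500 + 248, the two set sizes given in the text
N_TRAIN = 500


def split_indices(n_rows=N_ROWS, n_train=N_TRAIN, seed=0):
    """Return (train, test) row indices drawn at random without replacement.

    The seed is a hypothetical choice so the split is repeatable; the
    source does not say how the random selection was made.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # First n_train shuffled indices form the training set, the rest the test set.
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_indices()
print(len(train_idx), len(test_idx))
```

The index lists can then be used to select rows from whichever copy of the data is loaded (for example, one of the dataset URLs listed in the diff).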