June 1, 2022
Dissertation Title: Case Study of Using Hybrid Model Machine Learning Techniques in Educational Data Mining to Improve the Classification Accuracies
When: Tuesday, June 21, 2022, 2:30 PM to 4:00 PM
Where: In person, Simrall 228 (Conference room)
Remote access: https://msstate.webex.com/msstate/j.php?MTID=mf85bbfe14972d5f1f7bd23a9f6bc1d11
Candidate: Sujan Poudyal
Degree: Doctor of Philosophy, Electrical and Computer Engineering
Committee members:
Dr. Jean Mohammadi-Aragh
(Major Professor)
Dr. John Ball
(Committee Member)
Dr. Ryan Green
(Committee Member)
Dr. Umar Iqbal
(Committee Member)
Abstract:
The growth of instructional technology, e-learning resources, and online courses is producing large volumes of data, which has created opportunities for data mining and learning analytics in the educational domain. Educators can analyze this data to extract information that benefits both instructors and students. Educational Data Mining (EDM) extracts hidden information from data in the educational domain. In data mining, a hybrid method is a combination of multiple machine learning techniques. In this dissertation, we explore the novel use of hybrid machine learning techniques in EDM through three educational case studies. Specifically, we focus on three important aspects of learning: students’ attention behaviors, academic performance, and academic integrity.
First, given the importance of students’ attention, particularly in a large lecture class where students have computers in front of them, we collected on-task and off-task data to analyze students’ attention behavior. We combined two feature extraction techniques, Principal Component Analysis and Linear Discriminant Analysis, before applying linear and kernel Support Vector Machines (SVMs) to improve the accuracy of classifying students’ attention patterns. We also studied the relationship between attention and learning by calculating Pearson’s correlation coefficient and the associated p-value.
We then turned to academic performance, which is important for ensuring a quality education. We concatenated two different 2D Convolutional Neural Network (CNN) models into a single model that predicts students’ academic performance as pass or fail. Our hybrid CNN model outperformed traditional single-model baselines such as K-Nearest Neighbor (KNN), Naïve Bayes, Decision Tree, and Logistic Regression (LR) in classification accuracy.
Lastly, we considered the role of machine learning in maintaining academic integrity in online learning. In this work, we initially used traditional machine learning algorithms such as KNN, LR, random forest, and linear SVM to predict cheaters in an online examination. We then used a 1D CNN architecture to extract features from our cheater dataset and applied the same machine learning models to the extracted features to detect cheaters. This hybrid model outperformed both the traditional machine learning models and the CNN model used alone in classification accuracy.
The three studies illustrate applications of machine learning in EDM. In each study, we first used a single model as the classification technique. Classification accuracy is important in EDM because educational decisions are made based on a model’s results. To increase accuracy, we then employed a hybrid method, which blends multiple techniques, and succeeded in improving the accuracy. Thus, this dissertation shows that hybrid models can be used in EDM to improve classification accuracy.
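For readers unfamiliar with the correlation analysis mentioned in the first case study, the following is a minimal sketch of how Pearson’s correlation coefficient can be computed. It is an illustration only, not the candidate’s code: the function name pearson_r and the sample values (fraction of time on task, quiz score) are hypothetical and are not taken from the dissertation.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustration data, NOT from the dissertation:
# fraction of lecture time spent on task, and a quiz score for each student.
on_task_fraction = [0.90, 0.75, 0.60, 0.85, 0.40]
quiz_score = [88, 79, 70, 91, 55]

r = pearson_r(on_task_fraction, quiz_score)
print(f"Pearson r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship and a value near 0 indicates a weak one. The sketch omits the p-value; a library routine such as scipy.stats.pearsonr returns both the coefficient and a p-value.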