
Investigating the Effects of Dimensionality Reduction and Implementation of a Novel Feature Engineering Framework Towards Low Dimensional Medical Datasets / Muhammad Amirul Fahmiin bin Abdullah

By: Muhammad Amirul Fahmiin bin Abdullah [author.]
Contributor(s): Dr. Lim Tiong Hoo [supervisor.] | Dr. Kenneth Siok Kiam Yeo [supervisor.] | Universiti Teknologi Brunei Faculty of Engineering
Material type: Text

Publication details: Bandar Seri Begawan : Universiti Teknologi Brunei, ©2020.
Description: 117 pages : coloured illustrations, charts, tables ; 30 cm
Subject(s): Project Report Universiti Teknologi Brunei | Thesis Writing | Project Report, Academic | Medical informatics -- Data processing | Feature extraction (Computer science) | Dimensionality reduction (Statistics) | Medical records -- Data analysis
Other classification: RTDS 343 | UTB 120 REPORT THESIS & DISSERTATION, RTDS 343

Holdings
Item type	Current library	Call number	Copy number	Status	Notes	Date due	Barcode
Reports, Thesis & Dissertation Students	Universiti Teknologi Brunei Library - at level 2	UTB 120 REPORT THESIS & DISSERTATION, RTDS 343	1	Not for loan	Reg. No._UTB [RTDS 343]		850389

A thesis submitted to the Universiti Teknologi Brunei in the fulfillment of the requirements for the degree of Master of Science (MSc) in Electrical and Electronic Engineering.

Abstract

In machine learning applications for Electronic Patient Records (EPR), reliable prediction models are mostly trained on high-dimensional datasets, while low-dimensional datasets are dismissed for their lack of features. However, health institutions in less developed and developing countries have little large-scale digitised data, so Artificial Intelligence approaches have to rely on the low-dimensional datasets that are available, which results in sub-par predictive models. This research aims to improve the reliability and accuracy of machine learning models trained on medical datasets, to benefit health institutions that have only low-dimensional datasets.

To realise this, a framework is constructed that combines feature preprocessing with selection of the classification algorithm that provides the best overall performance boost.

This research begins by identifying the datasets, dimensionality reduction methods and classification algorithms whose evaluation metrics are to be tested as a form of performance benchmarking. In the first set of experiments, the dimensionality reduction methods Sequential Feature Selection (SFWS and SBS), Recursive Feature Elimination (RFE) and Principal Component Analysis (PCA) were used in a variety of combinations with Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Forest (RF) algorithms. The outcome shows that the dimensionality reduction methods were able to identify the best subset of features from the original dataset. However, they produced negligible (less than 5% increase) or no performance improvements in the machine learning models.
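
As an illustration of one of the methods named above, the following is a minimal sketch of sequential backward selection (SBS). It is not the thesis's implementation: the function names, the toy data and the nearest-centroid classifier (used here in place of the ANN, SVM and RF models) are all assumptions made to keep the example small and dependency-free.

```python
from statistics import mean


def nearest_centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier that sees only `features`.

    Hypothetical helper: a stand-in scorer, not one of the thesis's models.
    """
    rows_by_class = {}
    for x, y in train:
        rows_by_class.setdefault(y, []).append([x[f] for f in features])
    # One centroid per class: the column-wise mean of that class's rows.
    centroids = {
        y: [mean(col) for col in zip(*rows)] for y, rows in rows_by_class.items()
    }

    def predict(x):
        v = [x[f] for f in features]
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(v, centroids[y])),
        )

    return mean([1.0 if predict(x) == y else 0.0 for x, y in test])


def sequential_backward_selection(train, valid, n_keep):
    """Start from all features and greedily drop one at a time.

    At each step, the feature whose removal gives the highest validation
    accuracy is dropped, until only `n_keep` features remain.
    """
    features = list(range(len(train[0][0])))
    while len(features) > n_keep:
        candidates = [[f for f in features if f != drop] for drop in features]
        features = max(
            candidates,
            key=lambda subset: nearest_centroid_accuracy(train, valid, subset),
        )
    return features


# Made-up toy data: feature 0 separates the classes, feature 1 is
# large-scale noise, feature 2 carries almost no signal.
train = [
    ([0.1, 5.0, 1.0], 0), ([0.2, -4.0, 1.1], 0), ([0.0, 3.0, 0.9], 0),
    ([0.9, -5.0, 1.0], 1), ([1.0, 4.0, 1.2], 1), ([1.1, -3.0, 0.8], 1),
]
valid = [
    ([0.15, -6.0, 1.0], 0), ([0.05, -2.0, 1.0], 0),
    ([0.95, 6.0, 1.0], 1), ([1.05, 2.0, 1.0], 1),
]

selected = sequential_backward_selection(train, valid, n_keep=1)
print(selected)
```

On this toy data the noisy feature is dropped first and the uninformative one second, leaving only feature 0. A real study would score each candidate subset with cross-validation rather than a single validation split.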

In the second part, the research introduces feature engineering (FE) into the framework as a means of constructing additional data instances to improve the quality of the datasets. This resulted in an increase in dataset size relative to the original three sets after the addition of the engineered features. A set of experiments similar to those in the first part was then run, keeping all other variables and hyperparameters constant except for the train-test input data. Comprehensive analysis of the results shows consistent increases in accuracy and precision when implementing the FE-RFE-ANN approach, with an average increase of 2.89% in accuracy and 2.88% in precision across all datasets.
