Machine learning in personalised medicine

Author: Waleed Al-Dabbas

Al-Dabbas, Waleed, 2019, Machine learning in personalised medicine, Flinders University, College of Science and Engineering

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact with the details.


Inadequate sleep is a common issue in Australia. Up to 45% of Australian adults report a lack of sleep on a regular basis (Adams et al., 2017). This is significant given the ramifications of inadequate sleep, which include obesity, type 2 diabetes and heart disease (Adams et al., 2017). A major contributor to lack of sleep is sleep disorders, most commonly sleep apnea and insomnia (Adams et al., 2017). However, a large portion of clinically significant sleep disorders go undiagnosed (El-Sayed, 2012). Given the high prevalence of these disorders and the lack of screening for them, a single accessible and reliable tool for identifying various sleep problems is needed (Senthilvel, 2011).

A sleep health questionnaire was designed by a group of health professionals to tackle this issue. However, because the questionnaire consists of over 100 questions, it takes users a long time to complete, which may lead to inaccurate responses. The project therefore aims to add machine learning tools to the questionnaire to make it more personalised and adaptable. It would also be highly desirable for the machine learning platform to support online learning, so that it can contribute to a further understanding of sleep disorders. This experiment is a stepping stone towards that objective; its main aim is to develop a method that replicates the existing question logic using machine learning tools.

To do this, fully itemised questionnaires were used to model and test various machine learning tools. The tests involved different methods of storing data in models, various ways of representing target data attributes, and a variety of machine learning estimators, namely neural networks, decision trees, random forests, support vector machines, k-nearest neighbours and naïve Bayes. All models were tested using 10-fold cross-validation, and accuracy scores were computed accordingly.
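The evaluation procedure described above (fit an estimator, score it with 10-fold cross-validation, report accuracy) can be illustrated with a minimal sketch. This is not the thesis code: the keywords suggest the project used scikit-learn, whereas this example is a dependency-free Python illustration of Gaussian naïve Bayes and k-fold cross-validation. The questionnaire data is not reproduced here, so the example uses synthetic numeric data, and all function names (`fit_gaussian_nb`, `predict_gaussian_nb`, `k_fold_accuracy`, `make_synthetic_data`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small floor avoids division by zero
        model[label] = (len(members) / len(rows), stats)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for a single row."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(rows, labels, k=10, seed=0):
    """Shuffle once, split into k folds, and return the accuracy on each held-out fold."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit_gaussian_nb([rows[i] for i in train], [labels[i] for i in train])
        hits = sum(predict_gaussian_nb(model, rows[i]) == labels[i] for i in held_out)
        scores.append(hits / len(held_out))
    return scores


def make_synthetic_data(n=200, seed=1):
    """Assumed stand-in data: two classes with well-separated feature means."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        label = i % 2
        centre = 2.0 if label == 0 else 4.0
        rows.append([rng.gauss(centre, 0.7) for _ in range(3)])
        labels.append(label)
    return rows, labels


rows, labels = make_synthetic_data()
fold_scores = k_fold_accuracy(rows, labels, k=10)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Comparing several estimators, as the thesis does, would amount to calling the same cross-validation routine once per model and comparing the mean accuracies. Real questionnaire responses would also need to be encoded numerically before a Gaussian model could be applied to them.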

The results showed that Gaussian naïve Bayes models achieved the highest accuracy, with neural network models the second most accurate. Although further research is required, the outcomes of this project are promising and can be further developed and improved.

Keywords: Machine Learning, Neural Networks, Dataframe, Scikitlearn

Subject: Engineering thesis

Thesis type: Masters
Completed: 2019
School: College of Science and Engineering
Supervisor: Karen Reynolds