How well a neural network system could interpret the number of contributors to a DNA Profile

Author: Partho Protim Ghosh

Ghosh, Partho Protim, 2020 How well a neural network system could interpret the number of contributors to a DNA Profile, Flinders University, College of Science and Engineering

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact with the details.


Machine learning (ML) approaches and artificial neural networks (ANNs) are widely used to improve efficiency in many fields, including economics, biology, finance, image processing and gaming. They reduce the manual effort needed where complex interpretation and analysis are required. ANNs and ML are relatively new approaches in the field of DNA analysis, and the technology opens up many prospects there. The goal of this research is to create a system, using extensions of ANNs such as Recurrent Neural Networks (RNNs) or Convolutional Neural Networks (CNNs), that can interpret the number of contributors to a DNA profile. The research can be divided into the following components: (1) simulate realistic DNA profiles with probabilities that accord with the population frequency database (provided by FSSA, named Australia_Caucasian.csv); (2) simulate DNA profiles for single contributors and mix the single profiles to generate mixed DNA profiles for different numbers of contributors (a mixture can have 2, 3, 4 or 5 contributors); (3) divide the profiles into 21 smaller parts (each part represents a locus, and a DNA profile consists of 21 loci); (4) prepare the mixed profiles and single-contributor profiles for training the neural networks; (5) train the CNNs with the mixed profiles as input and the individual profiles as output for all 21 loci, and evaluate the performance.
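
The simulation and mixing steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the allele frequencies are made-up values for two loci (they are not taken from Australia_Caucasian.csv), and the helper names `simulate_profile` and `mix_profiles` are hypothetical.

```python
import random

# Hypothetical allele frequencies for two loci. The values are illustrative
# only and are NOT taken from Australia_Caucasian.csv.
ALLELE_FREQS = {
    "D3S1358": {"14": 0.10, "15": 0.30, "16": 0.25, "17": 0.20, "18": 0.15},
    "vWA": {"14": 0.10, "16": 0.20, "17": 0.30, "18": 0.25, "19": 0.15},
}


def simulate_profile(freqs, rng):
    """Draw two alleles per locus, each weighted by its population frequency."""
    profile = {}
    for locus, table in freqs.items():
        alleles = list(table)
        weights = list(table.values())
        profile[locus] = tuple(rng.choices(alleles, weights=weights, k=2))
    return profile


def mix_profiles(profiles):
    """Combine single-contributor profiles into per-locus allele counts."""
    mixture = {}
    for profile in profiles:
        for locus, pair in profile.items():
            counts = mixture.setdefault(locus, {})
            for allele in pair:
                counts[allele] = counts.get(allele, 0) + 1
    return mixture


# Simulate three contributors and mix them (assumed example size).
rng = random.Random(0)
contributors = [simulate_profile(ALLELE_FREQS, rng) for _ in range(3)]
mixture = mix_profiles(contributors)

for locus, counts in mixture.items():
    print(locus, dict(sorted(counts.items())))
```

In this sketch each locus of the mixture is kept separately, which mirrors the per-locus split described in the abstract. The true number of contributors (here 3) would serve as a known label for the simulated mixture, while the mixed and individual profiles would form the training input and output.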

Keywords: DNA, DNA profiling, Artificial Neural Network (ANN), Convolutional Neural Network (CNN), Forensics, Bioinformatics

Subject: Bioinformatics thesis

Thesis type: Masters
Completed: 2020
School: College of Science and Engineering
Supervisor: Professor David M W Powers