Prediction of the retention time of natural product metabolites using transfer learning strategies

Κατσάρα, Βασιλική

dc.contributor.advisor	Matsoukas, Minos-Timotheos
dc.contributor.author	Κατσάρα, Βασιλική
dc.date.accessioned	2024-10-22T06:07:39Z
dc.date.available	2024-10-22T06:07:39Z
dc.date.issued	2024-10-11
dc.identifier.uri	https://polynoe.lib.uniwa.gr/xmlui/handle/11400/7837
dc.identifier.uri	http://dx.doi.org/10.26265/polynoe-7669
dc.description	The research for this thesis was conducted at CEU San Pablo University in Madrid, Spain.	el
dc.description.abstract	Retention time (RT) prediction in chromatography plays an important role in numerous analytical applications, including drug discovery and environmental monitoring. This study aims to enhance RT prediction accuracy by employing deep learning techniques, particularly focusing on transfer learning to adapt models trained on synthetic compounds acquired by High-Pressure Liquid Chromatography–Mass Spectrometry (HPLC-MS) to predict RTs for natural products in different chromatographic methods. We utilized the extensive METLIN Small Molecule Retention Time (SMRT) dataset, comprising over 80,000 synthetic compounds, to train a deep neural network (DNN). This model was then fine-tuned on smaller datasets of natural products from the RepoRT database using a two-stage transfer learning approach. Initially, the DNN’s upper layers were frozen to retain knowledge about high-level features while training on the new data. Subsequently, all layers were unfrozen for further training with a reduced learning rate, ensuring both general and unique patterns were captured. Hyperparameter optimization was conducted using Optuna, leveraging a 5-fold nested cross-validation to ensure robust performance. The evaluation metrics were the Mean Absolute Error (MAE), the Median Absolute Error (MedAE), and the Mean Absolute Percentage Error (MAPE). Transfer learning was then compared with DNNs newly trained directly on the RepoRT database; the comparison showed that the strategy was successful according to the MAE and MedAE metrics, although not according to the MAPE. After removing the outliers, transfer learning performed better on the cleaned data according to all the metrics. In the future, it will be necessary to refine this strategy to improve its performance, either by testing it on the same datasets or by incorporating additional data.	el
dc.format.extent	66	el
dc.language.iso	en	el
dc.publisher	University of West Attica	el
dc.rights	Attribution-NonCommercial-ShareAlike 4.0 International	*
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Metabolomics	el
dc.subject	Retention time	el
dc.subject	Molecular fingerprints	el
dc.subject	Molecular descriptors	el
dc.subject	Transfer learning	el
dc.subject	Machine learning	el
dc.title	Prediction of the retention time of natural product metabolites using transfer learning strategies	el
dc.title.alternative	Πρόβλεψη του χρόνου κατακράτησης μεταβολιτών φυσικών προϊόντων με χρήση στρατηγικών μεταφοράς μάθησης	el
dc.type	Diploma thesis	el
dc.contributor.committee	Athanasiadis, Emmanouil
dc.contributor.committee	Kostopoulos, Spiros
dc.contributor.faculty	School of Engineering	el
dc.contributor.department	Department of Biomedical Engineering	el

Files in this item

Name: Katsara_19388040.pdf
Size: 2.524Mb
Type: PDF
Description: Diploma thesis

This item appears in the following collections

Diploma theses
Diploma theses of the Department of Biomedical Engineering

Except where otherwise noted, this item is distributed under the following license:
Attribution-NonCommercial-ShareAlike 4.0 International