Distributed ledger technologies (DLT) and machine learning

Σορτ, Ανδρέας Ρόναλντ

Τεχνολογίες κατανεμημένου καθολικού (DLT) και μηχανική μάθηση (machine learning)

Doctoral thesis

Author

Σορτ, Ανδρέας Ρόναλντ

Date

2024-09-16

Advisor

Leligou, Helen C. (Nelly)

Thesis-v2_2-signed.pdf (2.653Mb)

Keywords

Blockchain ; Federated learning ; Machine learning ; PLC ; SCADA ; Smart contracts ; Ethereum ; Audit trail ; Security ; DLT

Abstract

This PhD thesis explores the benefits of combining technologies from three areas: a) Federated Learning, b) Distributed Ledger Technologies (DLTs) such as blockchains, and c) industrial applications. The ever-increasing use of Artificial Intelligence applications has made it apparent that the quality of the training datasets affects the performance of the models. To this end, Federated Learning aims to engage multiple entities in the learning process, each contributing locally maintained data without having to share the actual datasets. In real use cases, the rewards that users receive for contributing to the learning process should depend on the characteristics and quality of their contribution. With this in mind, the thesis first designs and implements a Federated Learning process that can operate on a blockchain network. It then focuses on ways to strengthen user engagement by offering “fair” rewards, proportional to the model improvement (in terms of accuracy) that each user provides. Furthermore, to enable objective judgement of the quality of a contribution, special smart contracts have been designed that operate on blockchain technologies. More precisely, a verification algorithm has been developed that evaluates users’ contributions by measuring the resulting accuracy of the global model on a verification dataset. The algorithm is designed to run inside a smart contract and to record performance (in the form of a point system) on a distributed ledger.
The thesis then investigates solutions that can empower Programmable Logic Controllers (PLCs) by enabling them to query a blockchain infrastructure for commands and setpoints. The blockchain assumes a dual role in this context: it serves as an immutable audit trail database and as a trusted source for critical commands and setpoints. In contrast to the conventional paradigm of controlling PLCs through Human Machine Interface (HMI) devices, this novel approach does not require write access at the PLC level, which minimizes the attack surface of the PLC and helps protect against the known and zero-day vulnerabilities often exploited in cyberwarfare, as in the case of the notorious Stuxnet worm. The blockchain network also serves as an audit trail database for user actions, which is useful for applications that must log user operations, for example under Good Manufacturing Practices (GMP) or for other compliance reasons. Any attempt to maliciously circumvent the logging operation would not affect the operation of a critical process. Real-world applications and use cases are explored to demonstrate the tangible benefits of this approach in industrial settings. Additionally, a prototype implementation is developed to examine feasibility and collect performance indicators.

Abstract

This doctoral thesis investigates the benefits that arise from combining technologies from the following areas: a) federated learning, b) distributed ledger technologies (DLTs) such as blockchain networks, and c) industrial applications. The ever-increasing use of artificial intelligence applications has made it evident that the quality of the large datasets used to train machine learning models affects the performance of the final models. To this end, federated learning aims to involve multiple entities that contribute to the learning process with local data, without requiring the actual data to be shared. In real use cases, the rewards that users receive for contributing to the learning process should depend on the quality of their contribution. With the importance of training data in mind, the thesis designed and then implemented a federated learning mechanism that is coordinated by a blockchain network. It then focuses on ways to strengthen user engagement by offering "fair" rewards, proportional to the actual model improvement (in terms of accuracy) that users provide. In addition, to enable objective judgement of the quality of a contribution, special smart contracts were designed that operate on blockchain technologies. More specifically, a verification algorithm has been developed that evaluates the performance of users' contributions by comparing the accuracy of the global model against a verification dataset. The algorithm is designed to run inside a smart contract and to record performance (in the form of a point system) on a distributed ledger.
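The point-based verification step can be illustrated with a minimal Python sketch. This is not the thesis implementation: the names `accuracy_bp` and `PointsLedger`, the use of basis points, and the rule of one point per basis point of accuracy gained are all assumptions made for illustration, and an in-memory object stands in for the smart contract and ledger.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence, Tuple

Sample = Tuple[int, int]  # (input, expected label); toy data type for this sketch


def accuracy_bp(predict: Callable[[int], int], verification_set: Sequence[Sample]) -> int:
    """Accuracy in basis points (0..10000), using integer arithmetic only."""
    if not verification_set:
        raise ValueError("verification set must not be empty")
    correct = sum(1 for x, label in verification_set if predict(x) == label)
    return correct * 10_000 // len(verification_set)


@dataclass
class PointsLedger:
    """Hypothetical in-memory stand-in for the on-chain point record."""
    best_accuracy_bp: int
    points: Dict[str, int] = field(default_factory=dict)

    def record_contribution(self, contributor: str, new_accuracy_bp: int) -> int:
        # Assumed rule: one point per basis point gained over the best
        # accuracy seen so far; contributions that do not improve earn zero.
        gain = max(0, new_accuracy_bp - self.best_accuracy_bp)
        self.points[contributor] = self.points.get(contributor, 0) + gain
        self.best_accuracy_bp = max(self.best_accuracy_bp, new_accuracy_bp)
        return gain


if __name__ == "__main__":
    # Toy verification set: the label is the parity of the input.
    verification_set = [(x, x % 2) for x in range(10)]
    ledger = PointsLedger(accuracy_bp(lambda x: 0, verification_set))
    improved = accuracy_bp(lambda x: x % 2 if x < 8 else 0, verification_set)
    print(ledger.record_contribution("alice", improved), ledger.points)
```

Integer basis points are used here because smart contract platforms such as Ethereum typically work with integer arithmetic; the actual scoring scale used in the thesis is not stated in the abstract.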
The thesis also developed solutions for improving the security of industrial programmable logic controllers (PLCs) by enabling them to receive commands and parameters from a blockchain network. In this case the network assumes two roles: it serves as an immutable record of control actions (audit trail) and as a trusted source for critical commands and parameters. In contrast to the conventional control of PLCs from human-machine interface (HMI) devices, this new approach does not require write access at the PLC level, thereby minimizing the attack surface of the PLC and helping to protect against known vulnerabilities and new (zero-day) vulnerabilities that frequently appear in cyberattacks. The blockchain network is also used as a database that supports traceability of user actions, which is particularly useful for applications that mandate the logging of user actions, for example as prescribed by Good Manufacturing Practices (GMP).
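The pull-based control pattern described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the thesis prototype: `AuditLedger` is a hypothetical in-memory, hash-chained stand-in for the blockchain, `PollingController` stands in for PLC logic, and the range check on incoming setpoints is an added assumption.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    index: int
    user: str
    tag: str
    value: float
    prev_hash: str
    hash: str


class AuditLedger:
    """Hypothetical append-only, hash-chained stand-in for the blockchain."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(index, user, tag, value, prev_hash):
        payload = json.dumps([index, user, tag, value, prev_hash])
        return hashlib.sha256(payload.encode()).hexdigest()

    def submit(self, user, tag, value):
        """Record a user command; every command becomes part of the audit trail."""
        prev_hash = self._entries[-1].hash if self._entries else self.GENESIS
        index = len(self._entries)
        digest = self._digest(index, user, tag, value, prev_hash)
        entry = Entry(index, user, tag, value, prev_hash, digest)
        self._entries.append(entry)
        return entry

    def latest(self, tag):
        """Return the most recent entry for a tag, or None if there is none."""
        for entry in reversed(self._entries):
            if entry.tag == tag:
                return entry
        return None

    def verify(self):
        """Recompute the hash chain; False means an entry was altered."""
        prev = self.GENESIS
        for i, e in enumerate(self._entries):
            expected = self._digest(e.index, e.user, e.tag, e.value, e.prev_hash)
            if e.index != i or e.prev_hash != prev or e.hash != expected:
                return False
            prev = e.hash
        return True


class PollingController:
    """Stand-in for a PLC that pulls setpoints and offers no write interface."""

    def __init__(self, ledger, tag, low, high, initial):
        self._ledger, self._tag = ledger, tag
        self._low, self._high = low, high
        self._setpoint = initial
        self._last_index = -1

    @property
    def setpoint(self):
        return self._setpoint

    def poll(self):
        entry = self._ledger.latest(self._tag)
        if entry is not None and entry.index > self._last_index:
            self._last_index = entry.index
            # Assumed safety rule: ignore setpoints outside the allowed range.
            if self._low <= entry.value <= self._high:
                self._setpoint = entry.value
        return self._setpoint
```

In this sketch the controller only ever reads from the ledger, so the sole way to change a setpoint is to submit a logged command, which mirrors the dual role (audit trail and trusted command source) that the abstract describes.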