Investigation and application of artificial intelligence algorithms for complexity metrics based classification of semantic web ontologies

dc.contributor.author: Koech, Gideon Kiprotich
dc.contributor.supervisor: Fonou-Dombeu, J. V., Dr.
dc.date.accessioned: 2022-11-07T04:55:00Z
dc.date.available: 2022-11-07T04:55:00Z
dc.date.issued: 2019-11
dc.description: M. Tech. (Department of Information Technology, Faculty of Applied and Computer Sciences), Vaal University of Technology.
dc.description.abstract: The increasing demand for knowledge representation and exchange on the semantic web has resulted in an increase in both the number and size of ontologies. This growth has made ontologies more complex and, in turn, more difficult to select, reuse and maintain. Several ontology evaluation and ranking tools have been proposed recently. Such tools provide a metrics suite that evaluates the content of an ontology by analysing its schema and instances. The availability of ontology metric suites may enable classification techniques to place ontologies in various categories or classes. Machine learning algorithms, most of which are based on statistical methods for classifying data, are well suited to the classification of ontologies. In this study, popular machine learning algorithms including K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Naïve Bayes, Linear Regression and Logistic Regression were used to classify ontologies based on their complexity metrics. A total of 200 biomedical ontologies were downloaded from the BioPortal repository. Ontology metrics were then generated using the OntoMetrics tool, an online ontology evaluation platform. These metrics constituted the dataset used in the implementation of the machine learning algorithms. The results obtained were evaluated with performance evaluation techniques, namely precision, recall, F-measure and Receiver Operating Characteristic (ROC) curves. The overall accuracy scores for the K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Naïve Bayes, Logistic Regression and Linear Regression algorithms were 66.67%, 65%, 98%, 99.29%, 74%, 64.67% and 57%, respectively. By these scores, the Decision Trees and Random Forest algorithms performed best, which can be attributed to their ability to handle multiclass classification.
dc.identifier.uri: http://hdl.handle.net/10352/533
dc.language.iso: en
dc.publisher: Vaal University of Technology
dc.subject: Artificial intelligence algorithms
dc.subject: Machine learning algorithms
dc.subject: Ontology
dc.subject: National Center for Biomedical Ontologies (NCBO)
dc.subject.lcsh: Dissertations, Academic -- South Africa
dc.subject.lcsh: Semantic web
dc.subject.lcsh: Ontologies (Information retrieval)
dc.subject.lcsh: Artificial intelligence
dc.title: Investigation and application of artificial intelligence algorithms for complexity metrics based classification of semantic web ontologies
dc.type: Thesis
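The abstract above reports precision, recall, F-measure and overall accuracy for a multiclass problem. As a minimal sketch of how those scores are commonly computed from true and predicted labels (this is not the thesis's code, and the class names "low", "medium" and "high" and the sample labels are hypothetical, since the record does not state the classes used):

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return per-class precision, recall and F-measure, plus overall accuracy."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            tp[actual] += 1          # correct prediction for this class
        else:
            fp[predicted] += 1       # predicted class gained a false positive
            fn[actual] += 1          # actual class gained a false negative

    report = {}
    for label in labels:
        n_predicted = tp[label] + fp[label]
        n_actual = tp[label] + fn[label]
        precision = tp[label] / n_predicted if n_predicted else 0.0
        recall = tp[label] / n_actual if n_actual else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return report, accuracy


# Hypothetical complexity classes and predictions, for illustration only.
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "low"]

report, accuracy = classification_report(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
print("accuracy:", accuracy)
```

Per-class scores matter here because a single accuracy figure can hide a class that the classifier rarely predicts correctly.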
Files
Original bundle
Name: Gideon_Koech Thesis.pdf
Size: 2.1 MB
Format: Adobe Portable Document Format