Integrating Okapi BM25 and Jaccard Algorithms in Thesis Search Engine
DOI: https://doi.org/10.69478/JITC2024v6n2a11
Keywords: Search Engine, Document Management, Okapi BM25, Jaccard algorithm
Abstract
The integration of technology into society has brought transformative changes, enhancing efficiency and accessibility across diverse domains. In academia, documents play a pivotal role, particularly in versatile soft-copy formats. This article introduces a Thesis Search Engine developed in response to challenges faced by the existing thesis document repository at the College of Computer Studies. Focused on Information Technology theses, the tool uses the Okapi BM25 and Jaccard algorithms to systematically organize and manage documents. The study's objectives are to design the search engine modules, to integrate these algorithms to uncover trends and similarities in content, and to provide insights into the evolution of research themes. The evaluation assesses data relevancy, irrelevancy, and computational efficiency. Findings show that Okapi BM25 ranks documents efficiently within a user-friendly design. Jaccard is versatile, but its tendency to return all similar documents warrants consideration. Computational efficiency tests favor Okapi BM25, establishing its effectiveness in delivering prompt and relevant search results. The study's insights contribute to optimizing the search engine and offer considerations for future developments in Information Technology research, emphasizing the importance of aligning algorithms with research goals and user expectations.
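To make the contrast between the two algorithms concrete, the following is a minimal Python sketch of Okapi BM25 scoring and Jaccard similarity over tokenized titles. It is an illustration only, not the authors' implementation: the whitespace tokenizer, the function names, the sample titles, the parameter defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard similarity |A intersect B| / |A union B| over two token lists."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are commonly used defaults, assumed here rather than taken
    from the article.
    """
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (the "+ 1" keeps it non-negative); an assumed variant.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical thesis titles used only for this example.
titles = [
    "thesis search engine using okapi bm25",
    "document management system for theses",
    "mobile attendance monitoring application",
]
docs = [tokenize(t) for t in titles]
query = tokenize("thesis search engine")

bm25 = bm25_scores(query, docs)
jac = [jaccard(query, d) for d in docs]

# Rank document indices by BM25 score, highest first.
ranking = sorted(range(len(docs)), key=lambda i: bm25[i], reverse=True)
```

In this sketch BM25 produces a graded score that can be sorted into a ranking, while Jaccard yields a set-overlap ratio per document. A system that returns every document with a nonzero Jaccard value would surface all overlapping documents regardless of how strongly they match, which is one plausible reading of the behavior the abstract describes.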
License
Copyright (c) 2024 Fema Rose Bronda-Ecraela, Remia L. Doctora
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.