Integrating Okapi BM25 and Jaccard Algorithms in Thesis Search Engine

Authors

  • Fema Rose Bronda-Ecraela, College of Computer Studies, University of Antique, Sibalom, Antique, Philippines
  • Remia L. Doctora, College of Arts and Sciences, Computer Department, Iloilo Science and Technology University, Iloilo City, Philippines

DOI:

https://doi.org/10.69478/JITC2024v6n2a11

Keywords:

Search Engine, Document Management, Okapi BM25, Jaccard algorithm

Abstract

The integration of technology into society has brought transformative changes, improving efficiency and accessibility across diverse domains. In academia, documents play a pivotal role, particularly in versatile soft-copy formats. This article introduces a Thesis Search Engine developed in response to challenges faced by the existing thesis document repository at the College of Computer Studies. Focused on Information Technology theses, the tool uses the Okapi BM25 and Jaccard algorithms to systematically organize and manage documents. The study's objectives include designing the search engine modules, integrating these algorithms to uncover trends and similarities in content, and providing insights into the evolution of research themes. Evaluations assess data relevancy, irrelevancy, and computational efficiency. Findings show that Okapi BM25 ranks documents efficiently within a user-friendly design. Jaccard is versatile, but its tendency to return all similar documents warrants consideration. Computational efficiency tests favor Okapi BM25, establishing its effectiveness in delivering prompt and relevant search results. The study's insights contribute to optimizing the search engine and offer considerations for future developments in Information Technology research, emphasizing the importance of aligning algorithms with research goals and user expectations.
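
To make the two algorithms named in the abstract concrete, the following is a minimal Python sketch of Okapi BM25 scoring and Jaccard similarity over whitespace-tokenized titles. It is not the authors' implementation: the function names (`tokenize`, `bm25_scores`, `jaccard`), the sample titles, the parameter values `k1=1.5` and `b=0.75`, and the "+1" smoothed IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace only.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query.

    k1 and b are common default values, assumed here rather than
    taken from the article.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF (the "+1" variant keeps the value positive).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            )
            score += idf * norm
        scores.append(score)
    return scores


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Made-up thesis titles, used only to exercise the functions.
titles = [
    "web based thesis archive system",
    "mobile game development thesis",
    "image processing for rice disease detection",
]
scores = bm25_scores("thesis archive", titles)
ranked = sorted(zip(scores, titles), reverse=True)
```

In this sketch BM25 yields a graded ranking, whereas Jaccard assigns a nonzero score to any document that shares at least one token with the query. Without a similarity threshold, a Jaccard-based search would therefore return every overlapping document, which is one plausible reading of the behavior the abstract describes.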

Published

2024-04-30

How to Cite

Integrating Okapi BM25 and Jaccard Algorithms in Thesis Search Engine. (2024). Journal of Innovative Technology Convergence, 6(1). https://doi.org/10.69478/JITC2024v6n2a11
