From LEXICAL to Predictive Search: Evolution or Involution?

Ricardo Baeza-Yates

doi:10.17821/srels/2026/v63i1/172020

Authors

Ricardo Baeza-Yates
KTH Royal Institute of Technology, Sweden

DOI:

https://doi.org/10.17821/srels/2026/v63i1/172020

Keywords:

AI, Information Retrieval, Large Language Models, RAG, Semantic Search

Abstract

We explore the paradigm shift in information retrieval from classic semantic search to the emerging dominance of neural information retrieval (Neural IR) and Generative AI. Based on the Sarada Ranganathan Endowment Lecture I gave in November 2025, our analysis questions whether this transition represents true technological evolution or an “involution”, that is, a regression in which structural logic and verifiable truth are sacrificed for statistical prediction. We start by detailing the architecture of classic semantic search, emphasising its reliance on explicit knowledge resources such as ontologies and linguistic rules to “understand” the user’s truly semantic intent. This is contrasted with the “pseudosemantic search” of modern neural models, which utilise vector embeddings and Retrieval-Augmented Generation (RAG) to mimic understanding. Significant attention is given to the societal and ethical risks of this shift, including the “illusion of understanding” in Large Language Models (LLMs), the dangers of anthropomorphising AI, and the deepening digital divide caused by language inequality. We conclude by advocating for a hybrid future that reintegrates the logic, reasoning, common sense, and explicit knowledge of semantic systems into the powerful predictive capabilities of generative AI.

Published

2026-02-28

How to Cite

Baeza-Yates, R. (2026). From LEXICAL to Predictive Search: Evolution or Involution? Journal of Information and Knowledge, 63(1), 01–07. https://doi.org/10.17821/srels/2026/v63i1/172020

Issue

Volume 63, Issue 1, February 2026

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The copyright of all articles published in the Journal of Information and Knowledge is held by the publisher. The Sarada Ranganathan Endowment for Library Science (SRELS), as publisher, requires its authors to transfer copyright prior to publication. This permits SRELS to reproduce, publish, distribute and archive the article in print and electronic form, and to defend against any improper use of the article.

