From Lexical to Predictive Search: Evolution or Involution?

Authors

  • WASP Professor, KTH Royal Institute of Technology, Sweden

DOI:

https://doi.org/10.17821/srels/2026/v63i1/172020

Keywords:

AI, Information Retrieval, Large Language Models, RAG, Semantic Search

Abstract

We explore the paradigm shift in information retrieval from classic semantic search to the emerging dominance of neural information retrieval (Neural IR) and generative AI. Based on the author’s Sarada Ranganathan Endowment Lecture of November 2025, our analysis questions whether this transition represents true technological evolution or an “involution”: a regression in which structural logic and verifiable truth are sacrificed for statistical prediction. We start by detailing the architecture of classic semantic search, emphasising its reliance on explicit knowledge resources, such as ontologies and linguistic rules, to truly “understand” the user’s semantic intent. We contrast this with the “pseudosemantic search” of modern neural models, which use vector embeddings and Retrieval-Augmented Generation (RAG) to mimic understanding. We give significant attention to the societal and ethical risks of this shift, including the “illusion of understanding” in Large Language Models (LLMs), the dangers of anthropomorphising AI, and the deepening digital divide caused by language inequality. We conclude by advocating for a hybrid future that reintegrates the logic, reasoning, common sense, and explicit knowledge of semantic systems into the powerful predictive capabilities of generative AI.

Published

2026-03-23

How to Cite

Baeza-Yates, R. (2026). From Lexical to Predictive Search: Evolution or Involution? Journal of Information and Knowledge, 63(1), 01–07. https://doi.org/10.17821/srels/2026/v63i1/172020
