From January to March 2026, I taught INFO624: Intelligent Search and Language Models at Drexel CCI—a course that sits at the intersection of classical information retrieval (IR) and modern AI-driven language models.
This offering marked a deliberate shift from previous iterations of INFO624. While earlier versions focused on traditional IR systems, this course expanded to explore how language models are reshaping the way we search, rank, and interact with information. In many ways, the guiding question became: what does it mean to bring language models into information retrieval?
The course was delivered in a cross-listed format, with a mix of in-person and asynchronous students. Preparing and teaching it required not just updating materials, but continuously adapting to a rapidly evolving technical landscape—one where best practices can shift within months.
Despite losing two instructional days (MLK Day and a late-January snowstorm), the course covered eight weeks of material spanning both foundational and emerging topics.
Each topic could easily warrant a full course on its own, but the goal here was breadth with meaningful depth—enough to ground students before they explored ideas in their projects.
The course enrolled 20 students, who could choose to work individually or in groups. Projects took one of two forms: (1) an IR/AI-focused literature review or (2) the design and evaluation of a working system. In total, 12 projects were submitted, reflecting a wide range of interests across modern information retrieval and language model integration.
Systems
Omkar, Manjiri, and Priti developed a multi-source search system that retrieves, synthesizes, and self-evaluates information from web, academic, and local data to generate comprehensive, cited answers.
• https://github.com/Priti0427/Intelligent-Search-agent
Mokshad and Ishant built a search engine over arXiv papers that combines BM25 with BERT-based retrieval, while providing transparent explanations for ranking decisions.
• https://github.com/Mokshu3242/arXiv-Paper-Search-System
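Hybrid systems like this one typically fuse a lexical BM25 score with a semantic similarity score before ranking. As a minimal, dependency-free sketch of that fusion idea (the dense similarities below are stand-ins for BERT embedding scores; this is not the students' actual pipeline):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the tokenized query with BM25 (Okapi variant)."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def hybrid_rank(lexical, dense, alpha=0.5):
    """Min-max normalize each score list, blend them, and return doc indices best-first."""
    def norm(xs):
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]
    fused = [alpha * x + (1 - alpha) * y for x, y in zip(norm(lexical), norm(dense))]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)

docs = [["neural", "retrieval", "with", "bert"],
        ["classical", "bm25", "ranking"],
        ["cooking", "pasta", "recipes"]]
query = ["bert", "retrieval"]
lex = bm25_scores(query, docs)
sem = [0.9, 0.4, 0.1]  # stand-in for cosine similarities from a dense encoder
print(hybrid_rank(lex, sem))  # → [0, 1, 2]
```

Score fusion like this (rather than, say, reciprocal rank fusion) is just one of several reasonable ways to combine the two retrievers.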
Ian built a two-stage recipe search engine on the Food.com corpus (~230K recipes, 1.1M reviews), integrating BM25 retrieval, rule-based query alignment, and neural embeddings derived from review-based quality signals.
• https://github.com/iauger/recipe-search-engine
Chinomso designed a system for question answering over PDFs that incorporates document structure (sections and hierarchy) into both retrieval and grounded generation.
• https://github.com/MishaelTech/explanable_structured_rag_pdf
Charles implemented a transparent full-text search engine over newly released JFK assassination documents, enabling precise and citable exploration of primary historical sources.
Robert and Ayush created a system that combines chapter-level character summaries with semantic retrieval to support exploration and querying of long-form narrative texts.
Jake developed a prototype system using FAISS, augmented with salience and recency signals, to retrieve narrative memories for consistent storytelling in AI-driven environments.
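The salience-and-recency idea here echoes memory retrieval schemes where a semantic match score is blended with how important and how fresh a memory is. A FAISS-free toy sketch of that scoring (the function names, equal weights, and decay rate are my own illustrative choices, not Jake's):

```python
import math

def memory_score(sim, salience, age_hours, decay=0.995, w=(1.0, 1.0, 1.0)):
    """Blend semantic similarity, stored salience, and exponential recency decay."""
    recency = decay ** age_hours
    return w[0] * sim + w[1] * salience + w[2] * recency

def retrieve(query_vec, memories, top_k=2):
    """memories: list of (embedding, salience, age_hours, text). Rank by blended score."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    scored = [(memory_score(cos(query_vec, v), sal, age), text)
              for v, sal, age, text in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

memories = [([1.0, 0.0], 0.9, 0, "fresh, salient, on-topic"),
            ([0.0, 1.0], 0.1, 1000, "stale, minor, off-topic")]
print(retrieve([1.0, 0.0], memories, top_k=1))
```

In a real system the brute-force cosine loop would be replaced by a FAISS index over the memory embeddings, with salience and recency applied as a re-ranking step on the returned candidates.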
Mason built a RAG-based search engine for personal finance, retrieving and summarizing trusted financial documents to answer user questions in natural language.
• https://github.com/riccimason99/Financial-Planning-Search-Engine
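The core RAG pattern behind a system like this is: retrieve trusted passages, then hand the model a prompt that pins its answer to those sources. A minimal sketch of that grounding step (the prompt wording and function names are illustrative, not taken from Mason's repo):

```python
def build_prompt(question, passages):
    """Assemble a grounded prompt: numbered sources, then the question, asking for cited answers."""
    ctx = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return ("Answer using only the sources below; cite them as [n].\n\n"
            f"{ctx}\n\nQuestion: {question}\nAnswer:")

def rag_answer(question, corpus, retrieve, generate, k=3):
    """Retrieve top-k passages from the corpus, then pass the grounded prompt to a generator."""
    passages = retrieve(question, corpus, k)
    return generate(build_prompt(question, passages))
```

Here `retrieve` and `generate` are pluggable callables, so the same skeleton works whether retrieval is BM25, dense embeddings, or a hybrid, and whatever model does the generation.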
Literature Reviews
Sriram, Sourav, Khushi, and Lohitha conducted a survey of retrieval-augmented generation (RAG) methods for academic use, focusing on hybrid retrieval, self-reflection, and challenges such as faithfulness and evaluation.
Muhammad analyzed the evolution of neural information retrieval, tracing the progression from early embeddings to modern transformer-based dense retrieval and identifying remaining challenges.
Sriram examined personalization in search, exploring how systems balance relevance with novelty and diversity under ambiguous or evolving user intent.
Grace compared thesauri, knowledge graphs, and latent semantic analysis as methods for incorporating semantic relationships into retrieval systems.
Overall, INFO624 highlighted just how quickly information retrieval and language models are converging—both in research and in practice. Paradigms that once felt separate are now deeply intertwined, with modern systems blending classical ranking methods and neural representations into hybrid approaches.
The range of student projects reflects this shift clearly: systems emphasized not only performance, but also transparency, evaluation, and real-world usability. Just as importantly, many projects grappled with emerging challenges such as faithfulness, explainability, and the limits of current models.
For me, teaching this course reinforced an important reality: working in this space requires constant adaptation. The tools, techniques, and expectations are evolving rapidly, and education must evolve with them. If anything, this iteration of INFO624 felt less like a static course and more like a snapshot of a moving target—one that students are now well-equipped to continue exploring.