AI-Powered Search isn’t just a book; it’s a treasure map for navigating the complex and ever-evolving landscape of search technology.
Khalifeh AlJadda
Director of Data Science, Google

What Others Are Saying

AI-Powered Search

Develop the kind of self-learning search platform that understands the nuances of natural language queries and user preferences to improve relevance with each new user interaction. 

The book contains more than 200 executable code listings (plus many more in the Jupyter notebooks) to implement the following:

  • Basic relevance and vector operations 
  • User signals collection and processing 
  • Knowledge graph extraction 
  • Semantic knowledge graphs 
  • Learning misspellings, alternate labels, and related terms from user signals and documents
  • Query intent classification
  • Query-sense disambiguation 
  • Semantic query parsing with text tagging/entity extraction 
  • Semantic functions on parsed query trees 
  • Semantic search using knowledge graphs and SPLADE-like sparse query expansion techniques
  • Signals boosting models (basic, time decays, combining various signal types, handling noisy/spam signals, index-time vs. query-time signals boosting)
  • Personalized search (collaborative filtering for recommendations, matrix factorization for learning latent user and item features, learning personalization profiles leveraging content embeddings combined with signals, contextual clustering for personalization guardrails, and ranking and reranking techniques to integrate personalization carefully within search results)
  • Building and deploying an LTR model end-to-end (working with training data, training, testing, deploying)
  • Generating click models from user signals 
  • Automating LTR end-to-end using judgements from learned click models 
  • Overcoming bias in click models (Simplified Dynamic Bayesian Networks for position bias, beta priors / distributions for confidence bias, Gaussian / Expected Improvement algorithm for presentation bias)
  • A/B testing for relevance
  • Active Learning for results diversity 
  • Semantic search with dense vector embeddings 
  • Semantic autocomplete with dense vector embeddings 
  • Fine-tuning an LLM for question answering (training, validation sets, etc.)
  • Generating silver set training data from signals (vs. golden sets) 
  • Extractive question answering (retriever / reader pattern) 
  • ANN search (HNSW) 
  • RAG: results summarization and citations / abstractive question answering
  • Synthetic search data / training data generation using generative models (vs. signals)
  • Cross encoders
  • Multimodal search (text-to-image, image-to-image, text + image-to-image queries)
  • Quantization techniques
  • Hybrid Search (Reciprocal Rank Fusion, vector rerank, etc.)
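
As a small taste of the last topic in the list, here is a minimal sketch of Reciprocal Rank Fusion for hybrid search. It is not taken from the book's listings: the function name reciprocal_rank_fusion, the smoothing constant k=60 (a commonly used default), and the sample document IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword query and a vector query.
keyword_results = ["doc_a", "doc_b", "doc_c"]
vector_results = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_results, vector_results])
print(fused)
```

Documents that rank well in both lists rise to the top, and no score normalization across the two retrieval methods is needed, which is the main reason this fusion approach is popular for combining lexical and vector results.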

Expansive Code Examples

Easy to Use. No Coding Required.

1. Install Docker (one time)

2. Pull the code

git clone https://github.com/treygrainger/ai-powered-search.git

3. Start the Docker container with live Jupyter notebooks

cd ai-powered-search/

docker compose up

Now explore and run the book’s hundreds of AI-Powered Search code examples live, in your web browser.

Fully Supported Search Engines and Vector Databases:

Vespa

Concepts also Apply to:

Turbopuffer


About The Authors

Trey Grainger

Trey is one of the leading voices in search and data science. His firm Searchkernel focuses on next-generation AI search. This includes developing semantic search, personalization and recommendation systems, and building self-learning search platforms leveraging content and behavior-based reflected intelligence. An author and speaker, he is the co-author of Solr in Action. He previously served as CTO of Presearch and as Chief Algorithms Officer and SVP Engineering at Lucidworks.

Doug Turnbull

Doug is Principal Engineer at Daydream and was formerly Principal Engineer at Reddit, Staff Relevance Engineer at Spotify, and Chief Technical Officer at OpenSource Connections. He is the co-author of the book Relevant Search, and contributed chapters 10-12 on “Learning to rank for generalizable search relevance,” “Automated learning to rank with click models,” and “Overcoming bias in learned relevance models.”

Max Irwin

Max is the Founder of Max.io, focused on scaling production AI models, and former Managing Consultant at OpenSource Connections, a leading search relevance consultancy. Max contributed chapters 13-14 and much of 15 on “Semantic Search with Dense Vectors,” “Question answering with a fine-tuned large language model,” and “Foundation models and emerging search paradigms.”
