@eshorten300 — using GPT-4 as a reranker on top of search results
Using GPT-4 to rerank search results 🥸
I want to share two nuggets from my initial experiment:
- Adding an example of what the output should look like helps the model follow the instructions.
- Manually tweaking the role description to improve ranking quality wasn't as effective as expected.
That's interesting because it challenges the assumption that prompt engineering a model's persona is the highest-leverage knob. The first finding — showing the model a concrete output example rather than just describing what you want — is something that holds broadly across reranking, classification, and extraction tasks. Few-shot beats role-tuning, which tracks with how these models are actually trained.
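To make the first finding concrete, here is a minimal sketch of what "showing the model an output example" can look like in a reranking prompt. This is illustrative, not the prompt from the original experiment; the function name and format are assumptions.

```python
# Hypothetical reranking prompt builder. The key detail is the concrete
# "Example output" line at the end, which demonstrates the expected
# format instead of only describing it.

def build_rerank_prompt(query: str, passages: list[str]) -> str:
    # Number the passages so the model can answer with indices.
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages))
    return (
        "Rank the passages below by relevance to the query.\n"
        f"Query: {query}\n"
        f"Passages:\n{numbered}\n"
        "Answer with a comma-separated list of passage indices, "
        "most relevant first.\n"
        "Example output: 2, 0, 1\n"
    )
```

The example line doubles as a format constraint: downstream parsing gets much easier when the model has seen exactly one valid answer shape.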
The reranking use case itself is worth thinking about. A traditional search index gives you BM25 or embedding similarity, which are good at surface-level relevance but blind to intent and nuance. A reranker pass with a capable LLM can incorporate both — but you're adding latency and cost, so it only makes sense if your retrieval baseline leaves real relevance on the table. The fact that she's running experiments and documenting what worked is exactly the right approach before committing to this in production.
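A sketch of the rerank pass itself, assuming retrieval has already returned candidate passages and the LLM replies with an index list in the format above. The parsing helper is a hypothetical illustration of one practical wrinkle: model output is text, so you need a tolerant parser that survives malformed or incomplete answers.

```python
# Minimal rerank step over an LLM's text response. Names and the
# index-list response format are assumptions for illustration.

def parse_ranking(response: str, n: int) -> list[int]:
    # Tolerantly parse e.g. "2, 0, 1"; drop out-of-range and duplicate
    # indices rather than failing on a slightly malformed answer.
    seen: set[int] = set()
    order: list[int] = []
    for tok in response.replace(",", " ").split():
        if tok.isdigit():
            i = int(tok)
            if 0 <= i < n and i not in seen:
                seen.add(i)
                order.append(i)
    # Append any candidates the model omitted so nothing is lost.
    order += [i for i in range(n) if i not in seen]
    return order

def rerank(passages: list[str], llm_response: str) -> list[str]:
    return [passages[i] for i in parse_ranking(llm_response, len(passages))]
```

The fallback behavior matters in production: if the model truncates its answer, the pipeline degrades to the original retrieval order for the missing items instead of dropping results.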
