Bing Enhances Search with LLMs and SLMs Optimized by TensorRT-LLM
4 days ago
Bing is enhancing its search technology by transitioning to Large Language Models (LLMs) and Small Language Models (SLMs). LLMs are costly and slow to serve at scale, whereas SLMs deliver roughly 100x the throughput of LLMs. NVIDIA TensorRT-LLM further optimizes SLM inference, lowering latency and cutting operational costs by 57%. The result is faster, more accurate search results without sacrificing quality. Bing says it remains committed to advancing search technology and improving the user experience.
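
To make the two headline figures concrete, here is a minimal Python sketch of the arithmetic behind a 100x throughput gain and a 57% cost reduction. The baseline throughput and cost values are hypothetical, chosen only for illustration, and do not come from Bing; the helper names `scaled_throughput` and `reduced_cost` are likewise made up for this example. The two figures are treated as separate comparisons (SLM versus LLM, and optimized versus unoptimized serving).

```python
def scaled_throughput(baseline_qps: float, speedup: float) -> float:
    """Return throughput after a multiplicative speedup."""
    return baseline_qps * speedup


def reduced_cost(baseline_cost: float, reduction_pct: float) -> float:
    """Return the cost that remains after a percentage reduction."""
    return baseline_cost * (1 - reduction_pct / 100)


# Hypothetical baselines (assumptions, not figures reported by Bing).
llm_qps = 2.0          # queries per second served by an LLM
monthly_cost = 1000.0  # serving cost before optimization, arbitrary units

# 100x throughput: the SLM serves 100 times as many queries per second.
slm_qps = scaled_throughput(llm_qps, 100)

# 57% cost reduction: 43% of the original cost remains.
optimized_cost = reduced_cost(monthly_cost, 57)

print(f"SLM throughput: {slm_qps:.1f} queries/s")
print(f"Cost after optimization: {optimized_cost:.2f}")
```

Note that a 57% reduction leaves 43% of the original cost, which is equivalent to serving about 2.3 times as many queries for the same spend.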