Bing Enhances Search with LLM and SLM Models Optimized by TensorRT-LLM

December 18, 2024 at 5:34:30 AM

TL;DR Bing is enhancing its search technology by adopting Large Language Models (LLMs) and Small Language Models (SLMs). LLMs can be costly and slow, while SLMs provide roughly a 100x throughput improvement over LLMs. Nvidia TensorRT-LLM optimizes SLM inference, cutting 95th percentile latency from 4.76 to 3.03 seconds per batch and reducing operational costs by a reported 57%. The result is faster search without sacrificing result quality. Bing remains committed to advancing search technology and improving user experience.

Bing is advancing its search technology by integrating Large Language Models (LLMs) and Small Language Models (SLMs). Increasingly complex search queries call for models that are both more capable and more efficient. LLMs are powerful, but they can be costly and slow. SLMs, by contrast, provide approximately 100x the throughput of LLMs, which lets Bing process and interpret search queries more precisely.

Optimizing with TensorRT-LLM

To tackle the latency and cost challenges associated with larger models, Bing has incorporated Nvidia TensorRT-LLM into its workflow to optimize SLM inference performance. The optimization is most visible in the Deep Search product, which uses SLMs to select the best web results. That process involves understanding user intent and checking the relevance of results, balancing speed against quality. TensorRT-LLM reduces model inference time, improving the user experience without compromising result quality.

Before optimization, the original Transformer model had a 95th percentile latency of 4.76 seconds per batch and a throughput of 4.2 queries per second. After implementing TensorRT-LLM, 95th percentile latency dropped to 3.03 seconds per batch (about 36% lower) and throughput rose to 6.6 queries per second (about 57% higher), resulting in a reported 57% reduction in operational costs.
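
The relative changes implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the four numbers quoted above; the helper name `percent_change` is illustrative, and the 57% cost reduction is taken from the source as reported rather than derived here.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the article (per batch and per second, respectively).
P95_LATENCY_BEFORE_S = 4.76
P95_LATENCY_AFTER_S = 3.03
QPS_BEFORE = 4.2
QPS_AFTER = 6.6

latency_change = percent_change(P95_LATENCY_BEFORE_S, P95_LATENCY_AFTER_S)
throughput_change = percent_change(QPS_BEFORE, QPS_AFTER)

print(f"p95 latency change: {latency_change:+.1f}%")    # -36.3%
print(f"throughput change:  {throughput_change:+.1f}%")  # +57.1%
```

On these numbers, 95th percentile latency falls by roughly 36% and throughput rises by roughly 57%.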

Optimization Technique

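The SmoothQuant technique, introduced in the research paper of the same name, allows inference using INT8 for both activations and weights while maintaining accuracy. It works by rescaling each input channel so that quantization difficulty shifts from the activations, which tend to contain outliers, to the weights, which are easier to quantize. TensorRT-LLM includes scripts for preprocessing model weights to use this method effectively.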
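
To illustrate the idea behind SmoothQuant, here is a minimal pure-Python sketch of its per-channel smoothing step. It is not TensorRT-LLM's preprocessing code. The function names, the toy matrices, and the choice of alpha = 0.5 are assumptions made for illustration. Each input channel j gets a scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha); activations are divided by s_j and the matching weight rows are multiplied by it, so the layer output is unchanged while activation outliers shrink.

```python
def smooth_scales(X, W, alpha=0.5):
    """Per-input-channel scales: max|X_j|**alpha / max|W_j|**(1 - alpha).

    X is [tokens][in_channels], W is [in_channels][out_channels].
    Assumes every channel has a non-zero max in both X and W.
    """
    scales = []
    for j in range(len(W)):
        act_max = max(abs(row[j]) for row in X)
        w_max = max(abs(v) for v in W[j])
        scales.append(act_max ** alpha / w_max ** (1 - alpha))
    return scales


def apply_smoothing(X, W, scales):
    """Divide activation channels by the scales, multiply weight rows by them."""
    X_s = [[x / s for x, s in zip(row, scales)] for row in X]
    W_s = [[w * s for w in row] for row, s in zip(W, scales)]
    return X_s, W_s


def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def max_abs(M):
    return max(abs(v) for row in M for v in row)


# Toy data (hypothetical): the second activation channel holds outliers.
X = [[0.5, 40.0], [-1.0, 60.0]]
W = [[0.2, -0.4], [0.01, 0.03]]

X_s, W_s = apply_smoothing(X, W, smooth_scales(X, W))
out_ref = matmul(X, W)
out_smooth = matmul(X_s, W_s)

print("activation range before:", max_abs(X))
print("activation range after: ", max_abs(X_s))
print("outputs match:", all(
    abs(a - b) < 1e-9
    for ra, rb in zip(out_ref, out_smooth)
    for a, b in zip(ra, rb)
))
```

After smoothing, the activation range is much narrower, so an INT8 grid wastes less resolution on outliers, while the product of the smoothed matrices equals the original product up to floating-point error. The actual INT8 rounding step is omitted here.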

Benefits for Users

The transition to SLMs and the TensorRT-LLM integration offer several advantages:

  • Faster Search Results: Users experience quicker response times.
  • Improved Accuracy: Enhanced SLM capabilities provide more accurate and contextualized results.
  • Cost Efficiency: Reduced operational costs enable continued investment in innovations.

Looking Ahead

Bing is committed to refining its search technology and enhancing user experience through the ongoing development of LLMs and SLMs, along with TensorRT-LLM integration. Further advancements are anticipated.
