Performance Trade-offs of Optimizing Small Language Models for E-Commerce

By: export.arxiv.org Posted On: October 28, 2025

Authors: Josip Tomo Licardo and 1 other author

Abstract: Large Language Models (LLMs) offer state-of-the-art performance in natural language understanding and generation tasks. However, deploying leading commercial models for specialized tasks, such as e-commerce, is often hindered by high computational costs, latency, and operational expenses. This paper investigates the viability of smaller, open-weight models as a resource-efficient alternative. We present a methodology for optimizing a one-billion-parameter Llama 3.2 model for multilingual e-commerce intent recognition. The model was fine-tuned using Quantized Low-Rank Adaptation (QLoRA) on a synthetically generated dataset designed to mimic real-world user queries. Subsequently, we applied post-training quantization techniques, creating GPU-optimized (GPTQ) and CPU-optimized (GGUF) versions. Our results demonstrate that the specialized 1B model achieves 99% accuracy, matching the performance of the significantly larger GPT-4.1 model. A detailed performance analysis revealed critical, hardware-dependent trade-offs: while 4-bit GPTQ reduced VRAM usage by 41%, it paradoxically slowed inference by 82% on an older GPU architecture (NVIDIA T4) due to dequantization overhead. Conversely, GGUF formats on a CPU achieved a speedup of up to 18x in inference throughput and a reduction of over 90% in RAM consumption compared to the FP16 baseline. We conclude that small, properly optimized open-weight models are not just a viable alternative but a more suitable one for domain-specific applications, offering state-of-the-art accuracy at a fraction of the computational cost.
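
To put the quantization figures in context, the following is a minimal back-of-the-envelope sketch, not the authors' measurement code. It estimates the memory needed for model weights alone at a given bit width. The function names and the nominal parameter count of exactly one billion are assumptions made for illustration; the abstract says only "one-billion-parameter".

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def percent_reduction(baseline: float, optimized: float) -> float:
    """Relative reduction of `optimized` versus `baseline`, in percent."""
    return (baseline - optimized) / baseline * 100


# Assumption: a nominal 1e9 parameters, used here only for round numbers.
NOMINAL_PARAMS = 1e9

fp16_gb = weight_memory_gb(NOMINAL_PARAMS, 16)   # 16-bit floating-point weights
int4_gb = weight_memory_gb(NOMINAL_PARAMS, 4)    # 4-bit quantized weights
reduction = percent_reduction(fp16_gb, int4_gb)

print(f"FP16 weights:  {fp16_gb:.2f} GB")
print(f"4-bit weights: {int4_gb:.2f} GB")
print(f"Weight-only reduction: {reduction:.0f}%")
```

Under these assumptions the weight-only reduction is 75%, which is an idealized upper bound. The abstract reports a 41% reduction in total VRAM usage for 4-bit GPTQ; it does not break that figure down, but total usage presumably also includes memory that weight quantization does not shrink, such as activations and runtime buffers.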
