
    Researchers from the University of Washington Introduce Fiddler: A Resource-Efficient Inference Engine for LLMs with CPU-GPU Orchestration

    By CryptoExpert, February 27, 2024

    Mixture-of-experts (MoE) models route each input to a small subset of specialized sub-networks, called experts, within a larger model, so only part of the model is active for any given token. However, a major challenge in adopting MoE models is deploying them in environments with limited computational resources. The models are often too large to fit in the memory of a standard GPU, which restricts their use in low-resource settings. This limitation makes it hard for researchers and developers to use MoE models for complex computational tasks without access to high-end hardware.
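
    To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is a generic illustration, not code from Fiddler or any particular model: the function names, the choice of 8 experts with k=2, and the example scores are all assumptions made for this example.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and normalize their weights.

    router_logits: one score per expert (hypothetical values here).
    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

# Hypothetical router scores for one token over 8 experts.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
selected = route_top_k(logits, k=2)
for expert, weight in selected:
    print(f"expert {expert}: weight {weight:.3f}")
```

    Only the selected experts run for this token, which is why most expert weights sit idle at any moment and become candidates for offloading.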

    Existing methods for deploying MoE models in constrained environments typically keep part of the model in CPU memory and offload it from the GPU. While this approach works around GPU memory limits, it introduces significant latency because data transfers between the CPU and GPU are slow. Another line of work exploits activation sparsity, but it relies on ReLU producing exact zeros. State-of-the-art MoE models often use other activation functions, such as SiLU, whose outputs are small but rarely exactly zero, so these sparsity-exploiting strategies cannot be applied directly. Pruning channels whose values are not close enough to zero could hurt the model's performance, so leveraging sparsity would require a more sophisticated approach.
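
    The difference between the two activation functions can be seen in a few lines of Python. This is a small sketch using the standard definitions of ReLU and SiLU; the input values are made up for illustration.

```python
import math

def relu(x):
    """ReLU: negative inputs become exactly zero."""
    return max(0.0, x)

def silu(x):
    """SiLU (x * sigmoid(x)): negative inputs become small but non-zero."""
    return x / (1.0 + math.exp(-x))

# Hypothetical pre-activation values for six channels.
pre_activations = [-3.0, -1.0, -0.1, 0.0, 0.5, 2.0]

relu_out = [relu(x) for x in pre_activations]
silu_out = [silu(x) for x in pre_activations]

relu_zeros = sum(1 for y in relu_out if y == 0.0)
silu_zeros = sum(1 for y in silu_out if y == 0.0)

print("ReLU:", [round(y, 4) for y in relu_out], "exact zeros:", relu_zeros)
print("SiLU:", [round(y, 4) for y in silu_out], "exact zeros:", silu_zeros)
```

    With ReLU, every non-positive input can be skipped safely. With SiLU, skipping the negative channels changes the result, which is the pruning risk described above.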

    A team of researchers from the University of Washington has introduced Fiddler, an inference engine designed to optimize the deployment of MoE models by orchestrating CPU and GPU resources. When an expert's weights reside in CPU memory, Fiddler executes that expert layer on the CPU instead of copying the weights to the GPU, so only the much smaller activations cross the CPU-GPU link. This reduces the latency associated with moving data between CPU and GPU and makes it more feasible to deploy large MoE models in resource-constrained environments.

    Fiddler distinguishes itself by using the CPU's compute capacity for expert layers, not just its memory, while minimizing the volume of data transferred between the CPU and GPU. This drastically cuts CPU-GPU communication latency and lets the system run large MoE models, such as Mixtral-8x7B with over 90GB of parameters, efficiently on a single GPU with limited memory.
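
    A rough back-of-the-envelope model shows why moving activations is so much cheaper than moving weights. The sketch below is not Fiddler's implementation. The layer sizes are Mixtral-8x7B's published dimensions (hidden size 4096, feed-forward size 14336, three weight matrices per expert), which the article does not state, and the link bandwidth and per-token compute times are purely hypothetical.

```python
# Assumed model dimensions (Mixtral-8x7B config, not given in the article).
HIDDEN_SIZE = 4096
FFN_SIZE = 14336
BYTES_PER_VALUE = 2  # 16-bit weights and activations

def expert_weight_bytes():
    """Bytes needed to copy one expert (three weight matrices) to the GPU."""
    return 3 * HIDDEN_SIZE * FFN_SIZE * BYTES_PER_VALUE

def activation_bytes(num_tokens):
    """Bytes to send inputs to the CPU and bring outputs back to the GPU."""
    return 2 * num_tokens * HIDDEN_SIZE * BYTES_PER_VALUE

def choose_device(num_tokens, link_bytes_per_s, cpu_s_per_token, gpu_s_per_token):
    """Compare two ways to run an expert whose weights live in CPU memory.

    cpu path: move activations, compute on the CPU.
    gpu path: move the expert's weights, compute on the GPU.
    Returns (device, cpu_path_seconds, gpu_path_seconds).
    """
    cpu_path = activation_bytes(num_tokens) / link_bytes_per_s + num_tokens * cpu_s_per_token
    gpu_path = expert_weight_bytes() / link_bytes_per_s + num_tokens * gpu_s_per_token
    device = "cpu" if cpu_path <= gpu_path else "gpu"
    return device, cpu_path, gpu_path

# Hypothetical hardware numbers, for illustration only.
LINK = 16e9          # bytes per second over the CPU-GPU link
CPU_PER_TOKEN = 2e-3  # seconds of CPU compute per token for one expert
GPU_PER_TOKEN = 1e-5  # seconds of GPU compute per token for one expert

print("weights per expert (MiB):", expert_weight_bytes() / 2**20)
print("1 token  :", choose_device(1, LINK, CPU_PER_TOKEN, GPU_PER_TOKEN))
print("512 tokens:", choose_device(512, LINK, CPU_PER_TOKEN, GPU_PER_TOKEN))
```

    Under these assumed numbers, a single decoded token favors the CPU path because a few kilobytes of activations replace a weight copy of roughly 336 MiB. A large batch of tokens can tip the balance the other way, since the one-time weight copy is then shared across many tokens.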

    Fiddler's reported results show an order-of-magnitude improvement over existing offloading methods. Performance is measured as the number of tokens generated per second. In tests, Fiddler ran the uncompressed Mixtral-8x7B model on a single 24GB GPU and generated over three tokens per second. Throughput improves with longer output lengths for the same input length, because the latency of the prefill stage is amortized over more generated tokens. On average, Fiddler is 8.2 to 10.1 times faster than the offloading approach of Eliseev and Mazur and 19.4 to 22.5 times faster than DeepSpeed-MII, depending on the environment.
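
    The amortization effect follows directly from how tokens per second is computed. The sketch below assumes a simple model in which total time is one prefill pass plus a fixed cost per generated token; the latency values are hypothetical and are not measurements from the paper.

```python
def tokens_per_second(num_output_tokens, prefill_s, decode_s_per_token):
    """End-to-end throughput: generated tokens divided by total time.

    Total time is modeled as one prefill pass plus a constant decode
    cost per output token (a simplifying assumption).
    """
    total_s = prefill_s + num_output_tokens * decode_s_per_token
    return num_output_tokens / total_s

# Hypothetical latencies, chosen only to illustrate the trend.
PREFILL_S = 10.0
DECODE_S = 0.3

for n in (64, 128, 256, 512):
    rate = tokens_per_second(n, PREFILL_S, DECODE_S)
    print(f"{n:4d} output tokens -> {rate:.2f} tokens/s")
```

    As the output grows, the fixed prefill cost is spread over more tokens and the rate approaches the decode-only limit of one token per decode step.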

    In conclusion, Fiddler is a significant step toward efficient inference of MoE models in environments with limited computational resources. By using both the CPU and the GPU for model inference, Fiddler overcomes the main bottleneck of traditional offloading methods and makes advanced MoE models more accessible. This could help democratize large-scale AI models, paving the way for broader applications and research in artificial intelligence.

    Check out the Paper and Github. All credit for this research goes to the researchers of this project.

    Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.
