
    Researchers from the University of Washington Introduce Fiddler: A Resource-Efficient Inference Engine for LLMs with CPU-GPU Orchestration

    By CryptoExpert, February 27, 2024

    Mixture-of-experts (MoE) models scale language models by routing each input to a small subset of specialized sub-networks, called experts, rather than activating the entire model. A major challenge in adopting them, however, is deployment in environments with limited computational resources. The full set of expert weights often exceeds the memory of a standard GPU, which restricts the use of these models in low-resource settings and puts them out of reach for researchers and developers without access to high-end hardware.

    Existing methods for deploying MoE models in constrained environments typically offload part of the model’s weights to CPU memory and copy them to the GPU when they are needed. This relieves GPU memory pressure but introduces significant latency, because data transfers between the CPU and GPU are slow. Other approaches exploit activation sparsity, but state-of-the-art MoE models often use activation functions other than ReLU, such as SiLU, whose outputs are rarely exactly zero, so sparsity-exploiting strategies cannot be applied directly. Pruning channels that are not close enough to zero could degrade the model’s output quality, so leveraging sparsity requires a more sophisticated approach.
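
To see why ReLU-based sparsity tricks do not carry over directly, the following minimal sketch counts how often ReLU and SiLU produce an exactly zero output on random inputs. The Gaussian inputs, sample size, and seed are illustrative assumptions, not values from the Fiddler paper.

```python
import math
import random


def relu(x: float) -> float:
    return max(0.0, x)


def silu(x: float) -> float:
    # SiLU(x) = x * sigmoid(x); the result is zero only when x is exactly zero
    return x / (1.0 + math.exp(-x))


def zero_fraction(fn, inputs):
    """Fraction of outputs that are exactly zero."""
    outputs = [fn(x) for x in inputs]
    return sum(1 for y in outputs if y == 0.0) / len(outputs)


random.seed(0)  # arbitrary seed, used only for repeatability
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]

relu_zeros = zero_fraction(relu, samples)
silu_zeros = zero_fraction(silu, samples)
print(f"ReLU exact zeros: {relu_zeros:.1%}")
print(f"SiLU exact zeros: {silu_zeros:.1%}")
```

ReLU zeroes out roughly half of these inputs, so the corresponding channels can be skipped safely. SiLU maps negative inputs to small nonzero values, so skipping a channel always discards some signal.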

    A team of researchers from the University of Washington has introduced Fiddler, an inference engine designed to optimize the deployment of MoE models by orchestrating CPU and GPU resources. When an expert’s weights reside in CPU memory, Fiddler executes that expert layer on the CPU instead of copying the weights to the GPU, which reduces the data transfer overhead and the latency that comes with it. This addresses the limitations of existing methods and makes it more feasible to deploy large MoE models in resource-constrained environments.
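
The trade-off can be sketched as a simple cost comparison for one expert whose weights sit in CPU memory. This is not Fiddler's actual code: the function name `plan_offloaded_expert`, its parameters, and every number in the example call are hypothetical and chosen only to illustrate the reasoning.

```python
def plan_offloaded_expert(weight_bytes, activation_bytes, link_bytes_per_s,
                          gpu_compute_s, cpu_compute_s):
    """Pick where to run an expert whose weights are in CPU memory.

    Returns (device, estimated_seconds) under a deliberately simple model.
    """
    # Option A: copy the expert weights to the GPU, then compute there.
    move_weights_s = weight_bytes / link_bytes_per_s + gpu_compute_s
    # Option B: copy the activation to the CPU, compute there,
    # then copy the result back to the GPU.
    move_activation_s = 2 * activation_bytes / link_bytes_per_s + cpu_compute_s
    if move_activation_s <= move_weights_s:
        return "cpu", move_activation_s
    return "gpu", move_weights_s


# Hypothetical numbers for illustration only.
device, seconds = plan_offloaded_expert(
    weight_bytes=350e6,       # one expert's weights
    activation_bytes=8e3,     # one token's activation
    link_bytes_per_s=16e9,    # CPU-GPU link bandwidth
    gpu_compute_s=0.001,
    cpu_compute_s=0.005,
)
print(device, f"{seconds * 1e3:.2f} ms")
```

Under these assumed numbers the CPU is slower at the arithmetic itself, yet running the expert there still wins because the weight transfer dominates the alternative.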

    Fiddler distinguishes itself by using the CPU’s computational capabilities for expert layer processing while minimizing the volume of data transferred between the CPU and GPU. Only the activations, which are far smaller than the expert weights, need to cross the CPU-GPU link. This sharply reduces CPU-GPU communication latency and enables the system to run large MoE models, such as Mixtral-8x7B with over 90GB of parameters, efficiently on a single GPU with limited memory.
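
A back-of-the-envelope calculation shows how lopsided the two transfer sizes are. The dimensions below come from the publicly released Mixtral-8x7B configuration; the 16-bit storage format is an assumption, and the script is only an illustrative sketch.

```python
# Mixtral-8x7B dimensions from its published configuration
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 14336
NUM_LAYERS = 32
EXPERTS_PER_LAYER = 8
BYTES_PER_VALUE = 2  # assumption: 16-bit weights and activations

# Each expert is a feed-forward block with three weight matrices
params_per_expert = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE
expert_weight_bytes = params_per_expert * BYTES_PER_VALUE

# One token's activation entering an expert
activation_bytes = HIDDEN_SIZE * BYTES_PER_VALUE

# All experts across all layers
all_expert_bytes = expert_weight_bytes * EXPERTS_PER_LAYER * NUM_LAYERS

print(f"one expert's weights: {expert_weight_bytes / 1e6:.0f} MB")
print(f"one token's activation: {activation_bytes / 1024:.0f} KB")
print(f"weights / activation: {expert_weight_bytes // activation_bytes}x")
print(f"all expert weights: {all_expert_bytes / 1e9:.1f} GB")
```

With these assumptions, a single expert weighs in at about 352 MB while one token's activation is 8 KB, a ratio of roughly 43,000 to 1. The expert weights alone total about 90 GB, consistent with the model size quoted above.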


    Fiddler’s effectiveness is reflected in its performance, measured as the number of tokens generated per second, which shows an order-of-magnitude improvement over traditional offloading methods. In tests, Fiddler ran the uncompressed Mixtral-8x7B model at over three tokens per second on a single 24GB GPU. For a fixed input length, throughput improves with longer outputs because the latency of the prefill stage is amortized over more generated tokens. On average, Fiddler is 8.2 to 10.1 times faster than the offloading method of Eliseev and Mazur and 19.4 to 22.5 times faster than DeepSpeed-MII, depending on the environment.
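
The amortization effect follows from simple arithmetic: the prefill cost is paid once, while the decode cost is paid per generated token. The sketch below uses hypothetical latencies, not measurements from the paper.

```python
def tokens_per_second(output_tokens: int, prefill_s: float,
                      decode_s_per_token: float) -> float:
    """End-to-end generation rate for one request."""
    total_s = prefill_s + output_tokens * decode_s_per_token
    return output_tokens / total_s


PREFILL_S = 10.0   # hypothetical one-time prefill latency
DECODE_S = 0.3     # hypothetical latency per generated token

output_lengths = (16, 64, 256, 1024)
rates = [tokens_per_second(n, PREFILL_S, DECODE_S) for n in output_lengths]
for n, rate in zip(output_lengths, rates):
    print(f"{n:5d} output tokens: {rate:.2f} tokens/s")
```

As the output grows, the rate climbs toward the decode-only limit of `1 / DECODE_S` tokens per second without ever reaching it.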

    In conclusion, Fiddler is a significant step toward efficient inference of MoE models in environments with limited computational resources. By using both the CPU and GPU for model inference, Fiddler overcomes the data transfer bottleneck of traditional offloading methods and makes advanced MoE models more accessible. This could broaden access to large-scale AI models and open the way for wider applications and research in artificial intelligence.

    Check out the paper and GitHub repository. All credit for this research goes to the researchers of this project.


    Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.
