    TEAL Introduces Training-Free Activation Sparsity to Boost LLM Efficiency

    By CryptoExpert, September 1, 2024
    Zach Anderson
    Sep 01, 2024 08:34

    TEAL offers a training-free approach to activation sparsity, significantly enhancing the efficiency of large language models (LLMs) with minimal degradation.





    TEAL (Training-Free Activation Sparsity in LLMs) is a method for improving the efficiency of large language models (LLMs) without additional training. According to together.ai, it applies magnitude pruning to hidden states throughout the model, achieving 40-50% activation sparsity with minimal degradation. Because zeroed activations mean the corresponding weight channels need not be transferred to on-chip memory, TEAL targets the memory-bound nature of LLM inference and delivers 1.53-1.8x wall-clock speedups in single-batch decoding.
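As a rough illustration of magnitude pruning on a hidden state, the sketch below zeroes every entry whose absolute value falls under a cutoff chosen to hit a target sparsity. The function names `magnitude_threshold` and `sparsify` are hypothetical, the hidden state is synthetic Gaussian noise, and picking the cutoff from an empirical quantile is an assumption made for this example rather than a description of TEAL's actual calibration procedure.

```python
import random

def magnitude_threshold(values, sparsity):
    """Return a cutoff t so that roughly `sparsity` of |values| lie below t."""
    mags = sorted(abs(v) for v in values)
    k = int(sparsity * len(mags))
    # If every entry should be pruned, use an infinite cutoff.
    return mags[k] if k < len(mags) else float("inf")

def sparsify(values, threshold):
    """Zero out entries whose magnitude is below the threshold."""
    return [v if abs(v) >= threshold else 0.0 for v in values]

# Synthetic stand-in for a hidden state (assumption: zero-mean Gaussian).
random.seed(0)
hidden = [random.gauss(0.0, 1.0) for _ in range(1000)]

t = magnitude_threshold(hidden, 0.5)
sparse_hidden = sparsify(hidden, t)
achieved = sum(1 for v in sparse_hidden if v == 0.0) / len(sparse_hidden)
print(f"threshold={t:.3f}, achieved sparsity={achieved:.1%}")
```

The large-magnitude entries pass through unchanged, so only the low-magnitude part of the hidden state is discarded.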

    Background

    LLMs are known for their massive size, which makes inference slow, primarily because of the limited speed at which parameters can be transferred from device memory to registers. Various techniques such as quantization, weight sparsity, and speculative decoding have been developed to tackle this ‘memory wall’. Activation sparsity, which exploits zero values in hidden states, is a less explored method: during decoding, the weight channels that would be multiplied by zero activations never need to be transferred.
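To see why zero activations save memory traffic, consider a matrix-vector product y = Wx stored column by column: any column whose matching entry of x is zero contributes nothing and never has to be read. The sketch below is a plain-Python illustration under that assumption, with hypothetical function names; it is not the GPU kernel used by TEAL.

```python
def dense_matvec(W, x):
    """Reference y = W x, with W given as a list of rows."""
    return [sum(w * x_j for w, x_j in zip(row, x)) for row in W]

def sparse_matvec(W_cols, x):
    """Compute y = W x with W stored as columns; skip columns where x[j] == 0.

    Returns the result and the number of weight columns actually read.
    """
    n_rows = len(W_cols[0])
    y = [0.0] * n_rows
    cols_read = 0
    for j, x_j in enumerate(x):
        if x_j == 0.0:
            continue  # this weight column would be multiplied by zero
        cols_read += 1
        col = W_cols[j]
        for i in range(n_rows):
            y[i] += col[i] * x_j
    return y, cols_read

W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
W_cols = [list(col) for col in zip(*W)]  # column-major copy of W
x = [1.0, 0.0, 2.0]                      # one of three activations is zero

y, cols_read = sparse_matvec(W_cols, x)
print(y, cols_read)
```

Here only two of the three weight columns are touched, yet the result matches the dense product.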

    Older models like OPT-175B show high activation sparsity, enabling methods like DejaVu to achieve significant speedups. However, newer models like LLaMA have moved to SwiGLU variants, whose activations are rarely exactly zero, making it harder to apply such methods. Recent research has attempted to ‘recover’ models that exhibit activation sparsity, but these approaches require extensive retraining on massive datasets.

    Motivating Study: Distributional Properties of Activations in LLMs

    Research has shown that hidden states in LLMs exhibit outliers and are zero-centered with similar distributional shapes across layers. Specifically, states before MLP and Attention Blocks are Gaussian-shaped, while intermediate states are Laplacian-shaped. This suggests that many low-magnitude activations can be pruned with negligible model degradation, a concept also observed in other studies like CATS.
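If a hidden state really followed an idealized zero-mean Gaussian or Laplacian distribution, the magnitude cutoff for a target sparsity p would have a closed form: solve P(|x| < t) = p for t. The sketch below works this out with Python's standard library. The function names are hypothetical, and the idealized distributions are an assumption for illustration; real activations only approximate these shapes, so a threshold would in practice be estimated from observed activations.

```python
import math
from statistics import NormalDist

def gaussian_threshold(sigma, sparsity):
    """Cutoff t with P(|x| < t) = sparsity for x ~ Normal(0, sigma^2).

    P(|x| < t) = 2 * Phi(t / sigma) - 1, so t = sigma * Phi^{-1}((1 + p) / 2).
    """
    return sigma * NormalDist().inv_cdf((1.0 + sparsity) / 2.0)

def laplace_threshold(scale, sparsity):
    """Cutoff t with P(|x| < t) = sparsity for x ~ Laplace(0, scale).

    P(|x| < t) = 1 - exp(-t / scale), so t = -scale * ln(1 - p).
    """
    return -scale * math.log(1.0 - sparsity)

for p in (0.25, 0.40, 0.50):
    print(f"sparsity {p:.0%}: "
          f"gaussian t={gaussian_threshold(1.0, p):.3f}, "
          f"laplace t={laplace_threshold(1.0, p):.3f}")
```

Both formulas assume a target sparsity strictly between 0 and 1.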


    TEAL

    TEAL sparsifies every tensor in the model, achieving near-zero degradation at 25% sparsity and minimal degradation at 40% sparsity. At 50% sparsity, Llama-3 variants show slightly more degradation than the older Llama-2 and Mistral variants. TEAL outperforms CATS by sparsifying every tensor and by sparsifying each one based on its input, which yields lower error.

    Hardware-Aware Speed-up

    To benchmark real-world speedups, TEAL was integrated with GPT-Fast, achieving wall-clock speedups of up to 1.53x at 40% sparsity and 1.8x at 50% sparsity. Although the kernel is already faster than cuBLAS at 0% sparsity, there is still room for further optimization.
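For a sense of scale, the reported figures can be compared against a simple upper bound. Assuming decoding time is proportional only to the fraction of weights still transferred (a simplification made for this sketch, not a claim from the benchmark), a sparsity of s would give an ideal speedup of 1 / (1 - s). The helper name `ideal_speedup` is hypothetical.

```python
def ideal_speedup(sparsity):
    """Upper bound if runtime scaled purely with the weights still transferred."""
    return 1.0 / (1.0 - sparsity)

# Speedups reported in the article for 40% and 50% sparsity.
reported = {0.40: 1.53, 0.50: 1.8}

for s, observed in reported.items():
    ideal = ideal_speedup(s)
    print(f"sparsity {s:.0%}: ideal {ideal:.2f}x, "
          f"reported {observed:.2f}x, {observed / ideal:.0%} of ideal")
```

Under this simplified model the bounds are about 1.67x and 2x, so the reported speedups recover roughly nine tenths of what the skipped memory traffic could offer.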

    Compatibility with Quantization

    TEAL is also compatible with quantization, another technique for efficient LLM inference. Combining activation sparsity with quantization unlocks new regimes for reducing the memory transferred to GPU registers, allowing for higher inference speed-ups.

    Applications

    TEAL’s most immediate application is accelerating inference in resource-constrained edge settings, particularly in single-batch scenarios. It also aids inference providers like Together AI, which hosts over 100 open-source models across a large fleet of GPUs, by serving models more efficiently.
