    Google AI Proposes TransformerFAM: A Novel Transformer Architecture that Leverages a Feedback Loop to Enable the Neural Network to Attend to Its Latent Representations

    By CryptoExpert · April 18, 2024


    Transformers have revolutionized deep learning, yet their quadratic attention complexity limits their ability to process arbitrarily long inputs. Despite their effectiveness, they suffer from drawbacks such as forgetting information beyond the attention window and struggling with long-context processing. Attempts to address this include sliding window attention and sparse or linear approximations, but these often fall short at large scales. Drawing inspiration from neuroscience, particularly the link between attention and working memory, a proposed solution is to let the Transformer block attend to its own latent representations via a feedback loop, potentially giving rise to working memory in Transformers.

    Google researchers have developed TransformerFAM, a Transformer architecture that employs a feedback loop to let self-attention operate over the network’s own latent representations, facilitating the emergence of working memory. The design improves Transformer performance on long-context tasks across model sizes (1B, 8B, and 24B) without adding any weights, so it integrates seamlessly with pre-trained models and allows existing checkpoints to be reused. TransformerFAM can maintain past information indefinitely, a promising property for LLMs that must handle very long input sequences. Fine-tuning TransformerFAM with LoRA for 50k steps significantly improves performance across the 1B, 8B, and 24B Flan-PaLM LLMs.
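
    Because TransformerFAM introduces no new weights, adapting an existing checkpoint amounts to standard parameter-efficient fine-tuning. The sketch below shows a generic LoRA setup with the Hugging Face peft library; GPT-2 stands in for Flan-PaLM (which is not publicly released), and the rank, scaling, and target modules are illustrative rather than the paper’s settings.

```python
# Hedged sketch: generic LoRA fine-tuning setup with Hugging Face peft.
# GPT-2 is a stand-in for Flan-PaLM; hyperparameters are illustrative only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # low-rank adapter dimension
    lora_alpha=16,              # adapter scaling factor
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the adapters train; the
                                    # pre-trained checkpoint is untouched
```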

    Prior attempts to incorporate feedback mechanisms in Transformers mainly focused on passing output activations from top layers to lower or intermediate ones, overlooking the representational gap between them. Some work compressed information blockwise, using recurrent cross-attention between blocks or feedback from upper layers to carry past information into subsequent blocks, but none ensured that context could propagate indefinitely. To overcome the quadratic complexity of attention in context length, approaches such as sparse attention and linear approximations have been explored, while MLP-Mixer and State Space Models offer alternatives to attention-based Transformers altogether. TransformerFAM draws inspiration from Global Workspace Theory, aiming for a unified attention mechanism that processes various data types.

    Two primary approaches are commonly employed to handle long-context inputs: increasing computational resources or using Sliding Window Attention (SWA). SWA, introduced by Big Bird, partitions the input into blocks and caches information block by block, a strategy termed Block Sliding Window Attention (BSWA). Unlike standard SWA, BSWA attends to all information within its ring buffer without masking out past keys and values, and it exposes two hyperparameters, block size and memory segment, that control the size and scope of the attended information. While BSWA offers linear complexity, compared to the quadratic complexity of standard Transformers, its receptive field is limited, which necessitates further innovation to handle long-context dependencies effectively.
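
    To make the blockwise caching concrete, here is a minimal single-head sketch of BSWA in PyTorch, assuming illustrative block-size and memory-segment values; causal masking within the current block is omitted for brevity, and none of the names come from the paper’s code.

```python
# Minimal single-head sketch of Block Sliding Window Attention (BSWA).
import torch
import torch.nn.functional as F

def bswa(q, k, v, block_size=4, memory_segment=2):
    """q, k, v: (seq_len, dim). Each block attends to itself plus the cached
    keys/values of the previous `memory_segment` blocks (the ring buffer)."""
    seq_len, dim = q.shape
    out = torch.zeros_like(q)
    cache_k, cache_v = [], []  # ring buffer of past blocks
    for start in range(0, seq_len, block_size):
        end = min(start + block_size, seq_len)
        kb = torch.cat(cache_k + [k[start:end]], dim=0)
        vb = torch.cat(cache_v + [v[start:end]], dim=0)
        attn = F.softmax(q[start:end] @ kb.T / dim ** 0.5, dim=-1)
        out[start:end] = attn @ vb
        cache_k.append(k[start:end]); cache_v.append(v[start:end])
        cache_k, cache_v = cache_k[-memory_segment:], cache_v[-memory_segment:]
    return out

x = torch.randn(16, 8)
print(bswa(x, x, x).shape)  # torch.Size([16, 8]) -- cost is linear in length
```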


    Feedback Attention Memory (FAM) is developed in response to this challenge, building on BSWA’s blockwise structure. FAM integrates feedback activations, dubbed virtual activations, into each block, enabling global contextual information to propagate dynamically across blocks. The architecture satisfies key requirements such as integrated attention, block-wise updates, information compression, and global contextual storage. Incorporating FAM enriches representations and propagates comprehensive contextual information, surpassing the limitations of BSWA. Despite initial concerns that the feedback mechanism could be inefficient, the vectorized, map-based self-attention within blocks keeps training efficient, with minimal impact on memory consumption and training speed relative to TransformerBSWA.
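
    The sketch below extends the BSWA loop above with a crude version of the FAM update: a few “virtual activation” vectors are appended to each block’s keys and values, then refreshed by attending to the current block and their own previous state, carrying global context forward. It is a conceptual illustration under those assumptions, not the authors’ implementation, which integrates FAM inside full Transformer layers.

```python
# Conceptual extension of the BSWA sketch with a crude FAM feedback update.
import torch
import torch.nn.functional as F

def fam_bswa(q, k, v, fam, block_size=4, memory_segment=2):
    """fam: (fam_len, dim) virtual activations shared across blocks."""
    seq_len, dim = q.shape
    out = torch.zeros_like(q)
    cache_k, cache_v = [], []
    for start in range(0, seq_len, block_size):
        end = min(start + block_size, seq_len)
        # Block self-attention over: ring buffer + current block + FAM.
        kb = torch.cat(cache_k + [k[start:end], fam], dim=0)
        vb = torch.cat(cache_v + [v[start:end], fam], dim=0)
        attn = F.softmax(q[start:end] @ kb.T / dim ** 0.5, dim=-1)
        out[start:end] = attn @ vb
        # FAM update: compress the current block plus the previous FAM state,
        # so later blocks can attend to this global summary.
        fam_k = torch.cat([k[start:end], fam], dim=0)
        fam_v = torch.cat([v[start:end], fam], dim=0)
        fam = F.softmax(fam @ fam_k.T / dim ** 0.5, dim=-1) @ fam_v
        cache_k.append(k[start:end]); cache_v.append(v[start:end])
        cache_k, cache_v = cache_k[-memory_segment:], cache_v[-memory_segment:]
    return out, fam

x = torch.randn(16, 8)
y, memory = fam_bswa(x, x, x, fam=torch.zeros(2, 8))
print(y.shape, memory.shape)  # torch.Size([16, 8]) torch.Size([2, 8])
```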

    In the movie “Memento,” the protagonist’s struggle with anterograde amnesia parallels the current limitations of LLMs. While LLMs possess vast long-term memory capabilities, their short-term memory is restricted by the attention window. TransformerFAM offers a path toward addressing this anterograde amnesia in LLMs, leveraging attention-based working memory inspired by neuroscience. The study hints at a route to resolving the memory challenge in deep learning, a crucial precursor to tackling broader issues such as reasoning.

    Check out the Paper. All credit for this research goes to the researchers of this project.


    Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.
