    LlamaFactory: A Unified Machine Learning Framework that Integrates a Suite of Cutting-Edge Efficient Training Methods, Allowing Users to Customize the Fine-Tuning of 100+ LLMs Flexibly

By CryptoExpert · March 25, 2024 · 4 Mins Read


Large language models (LLMs) have revolutionized natural language processing (NLP) by achieving remarkable performance across tasks such as text generation, translation, sentiment analysis, and question answering. Efficient fine-tuning is crucial for adapting LLMs to downstream tasks: it lets practitioners leverage the model’s pre-trained knowledge while requiring far less labeled data and compute than training from scratch. However, implementing these methods across different models takes non-trivial effort.

Fine-tuning many parameters with limited resources is the main challenge of adapting LLMs to downstream tasks. A popular solution is efficient fine-tuning, which reduces the training cost of LLMs when adapting them to various tasks. Many methods for efficiently fine-tuning LLMs have been developed, but they still lack a systematic framework that adapts and unifies them across different LLMs and provides a friendly interface for user customization.
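
To ground the idea, LoRA is one of the most widely used efficient fine-tuning methods: it freezes the pre-trained weights and trains small low-rank matrices injected into the attention layers. A minimal sketch using the Hugging Face peft library (an illustration of the technique, not LlamaFactory’s own API; the model name is just an example):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a base model whose weights stay frozen (model name is illustrative).
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16
)

# Inject small trainable low-rank matrices into the attention projections.
config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling applied to the update
    target_modules=["q_proj", "v_proj"],  # layers that receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)

# Typically well under 1% of the parameters end up trainable.
model.print_trainable_parameters()
```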

Researchers from the School of Computer Science and Engineering at Beihang University and the School of Software and Microelectronics at Peking University present LLAMAFACTORY, a framework that democratizes the fine-tuning of LLMs. It unifies various efficient fine-tuning methods through scalable modules, enabling the fine-tuning of hundreds of LLMs with minimal resources and high throughput. It also streamlines commonly used training approaches, including generative pre-training, supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and direct preference optimization (DPO). Users can customize and fine-tune their LLMs through command-line or web interfaces with minimal or no coding effort.
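
Of the training approaches listed above, DPO is the simplest to state concretely: it trains the policy to prefer the chosen response over the rejected one by a margin measured against a frozen reference model. A sketch of the core objective in PyTorch (this is the formula from the DPO paper, not LlamaFactory’s internal code):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss over sequence log-probabilities."""
    # Implicit reward: how much more likely the policy makes a response
    # than the frozen reference model does (scaled by beta).
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the chosen response's reward above the rejected one's.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with a batch of two sequence log-probabilities:
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -10.0]), torch.tensor([-13.5, -10.5]))
```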

LLAMAFACTORY consists of three main modules: Model Loader, Data Worker, and Trainer. On top of these sits LLAMABOARD, a friendly visual interface to the modules that lets users configure and launch individual LLM fine-tuning runs without writing code and monitor training status on the fly.
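
As a loose illustration of what a codeless launcher like LLAMABOARD does (a hypothetical Gradio form, not LLAMABOARD’s actual implementation; every name below is made up for the sketch), a web UI essentially collects hyperparameters and hands them to a training routine:

```python
import gradio as gr

def launch_run(model_name, method, lr):
    # Hypothetical stub: a real UI would pass these settings to the
    # Trainer module and stream back logs and loss curves.
    return f"Started {method} fine-tuning of {model_name} at lr={lr}"

with gr.Blocks() as ui:
    model_name = gr.Textbox(label="Model", value="meta-llama/Llama-2-7b-hf")
    method = gr.Dropdown(["lora", "qlora", "full"], label="Fine-tuning method")
    lr = gr.Slider(1e-6, 1e-3, value=5e-5, label="Learning rate")
    status = gr.Textbox(label="Status")
    gr.Button("Start").click(launch_run, [model_name, method, lr], status)

ui.launch()
```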


Model Loader: The Model Loader prepares a variety of architectures for fine-tuning and supports more than 100 LLMs. It consists of four components: Model Initialization, Model Patching, Model Quantization, and Adapter Attaching.

Data Worker: The Data Worker processes data from different tasks through a well-designed pipeline that supports more than 50 datasets.

Trainer: The Trainer unifies the efficient fine-tuning methods to adapt these models to different tasks and datasets, offering four training approaches.
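
How the three modules fit together can be sketched end to end. The snippet below is an illustrative pipeline built on Hugging Face libraries, not LlamaFactory’s actual code: a QLoRA-style 4-bit load with adapter attaching (Model Loader), a dataset mapped into a unified prompt/response format (Data Worker), and a plain supervised fine-tuning run (Trainer). The model and dataset names are placeholders.

```python
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

name = "meta-llama/Llama-2-7b-hf"  # placeholder model name

# Model Loader: initialize, quantize to 4-bit (QLoRA-style), attach adapter.
quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                           bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(name, quantization_config=quant)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Data Worker: normalize instruction data into one tokenized text field.
tok = AutoTokenizer.from_pretrained(name)
tok.pad_token = tok.eos_token

def to_features(ex):
    text = f"### Instruction:\n{ex['instruction']}\n### Response:\n{ex['output']}"
    return tok(text, truncation=True, max_length=512)

data = load_dataset("tatsu-lab/alpaca", split="train").map(to_features)

# Trainer: a standard supervised fine-tuning (SFT) run.
Trainer(model=model,
        args=TrainingArguments(output_dir="out", per_device_train_batch_size=4),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False)).train()
```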

In terms of training efficiency, QLoRA consistently has the lowest memory footprint because the pre-trained weights are represented in lower precision, while LoRA exhibits higher throughput thanks to Unsloth’s optimized LoRA layers. GaLore achieves lower perplexity (PPL) on large models, while LoRA has the advantage on smaller ones. For the evaluation on downstream tasks, the scores averaged over ROUGE-1, ROUGE-2, and ROUGE-L were reported for each LLM and each dataset. LoRA and QLoRA perform best in most cases, except for the Llama2-7B and ChatGLM3-6B models on the CNN/DM and AdGen datasets. Also, the Mistral-7B model performs better on English datasets, while the Qwen1.5-7B model achieves higher scores on the Chinese dataset.
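
For readers unfamiliar with the metrics: PPL is the exponential of the average cross-entropy loss, and ROUGE measures n-gram overlap between generated and reference text. A short sketch of how such scores are computed with the Hugging Face evaluate library (illustrative inputs, not the paper’s evaluation harness):

```python
import evaluate

rouge = evaluate.load("rouge")  # downloads the ROUGE metric script

predictions = ["the cat sat on the mat"]       # model outputs
references = ["a cat was sitting on the mat"]  # gold summaries

# compute() returns rouge1 / rouge2 / rougeL F-scores; the paper reports
# their average per model and per dataset.
scores = rouge.compute(predictions=predictions, references=references)
print(sum(scores[k] for k in ("rouge1", "rouge2", "rougeL")) / 3)
```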

In conclusion, the researchers propose LLAMAFACTORY, a unified framework for the efficient fine-tuning of LLMs. Its modular design minimizes the dependencies between models, datasets, and training methods, providing an integrated approach to fine-tuning more than 100 LLMs with a diverse range of efficient fine-tuning techniques. The flexible web UI LLAMABOARD enables customized fine-tuning and evaluation of LLMs without coding effort. The authors also empirically validate the efficiency and effectiveness of their framework on language-modeling and text-generation tasks.

    Demo

Video: https://www.youtube.com/embed/W29FgeZEpus

Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

Asjad is an intern consultant at Marktechpost. He is pursuing a B.Tech in mechanical engineering at the Indian Institute of Technology, Kharagpur. Asjad is a machine learning and deep learning enthusiast who is always researching the applications of machine learning in healthcare.
