A conceptual and hands-on introduction to tuning and evaluating large language models (LLMs) using RLHF.

Dear Chris, 

We're thrilled to announce a new course on Reinforcement Learning from Human Feedback (RLHF) built in collaboration with Google Cloud. 

Large language models (LLMs) are trained on human-generated text, but additional methods are needed to align them with human values and preferences. RLHF is a key method for aligning LLMs to make them more helpful, honest, and safe. 

    A glimpse into the RLHF course

    In this course, you will gain a conceptual understanding of the RLHF training process, and then practice applying RLHF to tune an LLM. You will: 

    • Explore the two datasets (“preference” and “prompt”) that are used in RLHF training.
    • Use the open-source Google Cloud Pipeline Components Library to fine-tune the Llama 2 model with RLHF (a minimal sketch follows this list).
    • Assess the tuned LLM against the original base model by comparing loss curves and using the “Side-by-Side (SxS)” method.
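
    For a feel of the hands-on portion, here is a minimal sketch of compiling and submitting an RLHF tuning pipeline with the Google Cloud Pipeline Components Library on Vertex AI Pipelines. The import path, parameter names, and values shown are assumptions about the library's preview LLM components, not an excerpt from the course notebooks:

        # Minimal sketch: RLHF tuning with Google Cloud Pipeline Components on Vertex AI.
        # Parameter names and values below are illustrative assumptions, not course code.
        from kfp import compiler
        from google.cloud import aiplatform

        # Assumption: the RLHF pipeline is exposed in the library's preview LLM module.
        from google_cloud_pipeline_components.preview.llm import rlhf_pipeline

        # Compile the pipeline definition to a local YAML file.
        compiler.Compiler().compile(
            pipeline_func=rlhf_pipeline,
            package_path="rlhf_pipeline.yaml",
        )

        # Hypothetical project, bucket, and dataset paths; replace with your own.
        aiplatform.init(project="my-project", location="us-central1")

        job = aiplatform.PipelineJob(
            display_name="rlhf-llama2-tuning",
            template_path="rlhf_pipeline.yaml",
            pipeline_root="gs://my-bucket/pipeline-root",
            parameter_values={
                # "prompt" dataset: unlabeled prompts sampled during the RL step.
                "prompt_dataset": "gs://my-bucket/data/prompts.jsonl",
                # "preference" dataset: human-ranked response pairs for the reward model.
                "preference_dataset": "gs://my-bucket/data/preferences.jsonl",
                # Base model to tune and assumed training knobs.
                "large_model_reference": "llama-2-7b",
                "reward_model_train_steps": 1000,
                "reinforcement_learning_train_steps": 500,
                # KL penalty that keeps the tuned policy close to the base model.
                "kl_coeff": 0.1,
                "instruction": "Summarize the text in under 50 words.",
            },
        )
        job.submit()

    Once the run completes, the tuned model can be assessed against the original base model by inspecting the reward-model and reinforcement-learning loss curves and by running a Side-by-Side (SxS) comparison on held-out prompts, as the course walks through.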

    Join instructor Nikita Namjoshi, Developer Advocate for Generative AI at Google Cloud, to learn this exciting technique you can use in building your own applications.

      Enroll now!

      Keep learning,

      The DeepLearning.AI team


      Copyright © 2023 deeplearning.ai, All rights reserved.
      You are receiving this because you opted in to receive emails from deeplearning.ai.

      DeepLearning.AI, 195 Page Mill Road, Suite 115, Palo Alto, CA 94306, United States

      Manage preferences