Using Large Language Models (LLMs) for education poses numerous challenges. In particular, LLMs are often fine-tuned and instructed for question-answering tasks, yet directly providing an answer to a prompt does not encourage students to think critically or to discover information for themselves. In this work, we fine-tune LLMs for Socratic interactions, in which an LLM guides students towards discovering answers to their own questions rather than supplying a direct answer. We investigate diverse datasets containing various educational materials and Socratic dialogues, and show how LLMs can achieve such behavior through Direct Preference Optimization (DPO). Furthermore, we employ advanced models, such as GPT-4o, to evaluate our fine-tuned models. Our results indicate that DPO can be used effectively to fine-tune LLMs for Socratic dialogue, improving their educational utility.
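This record does not include code, but as a rough illustration of the DPO step described in the abstract, a minimal sketch using Hugging Face's TRL library might look like the following. The library choice, base model name, and training data are assumptions for illustration, not the paper's actual setup; the key idea is that each preference pair contrasts a Socratic guiding reply ("chosen") with a direct answer ("rejected").

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Hypothetical small base model; the paper's model is not specified here.
model_name = "Qwen/Qwen2-0.5B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure a padding token exists

# Toy preference pair: the "chosen" reply guides the student Socratically,
# while the "rejected" reply hands over the answer directly.
train_dataset = Dataset.from_list([
    {
        "prompt": "Why does ice float on water?",
        "chosen": "Good question! What happens to water's density as it freezes?",
        "rejected": "Ice floats because it is less dense than liquid water.",
    },
])

args = DPOConfig(
    output_dir="socratic-dpo",
    beta=0.1,                      # strength of the preference constraint
    per_device_train_batch_size=1,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,    # named `tokenizer=` in older TRL releases
)
trainer.train()
```

In this setup DPO increases the likelihood of the guiding reply relative to the direct one without needing an explicit reward model, which is what makes it a natural fit for steering a model toward Socratic behavior.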
EULER: Fine Tuning a Large Language Model for Socratic Interactions
AIxEDU 2024, 2nd International Workshop on Artificial Intelligence Systems in Education, 25-28 November 2024, Bolzano, Italy
Type: Conference
City: Bolzano
Date: 2024-11-25
Department: Data Science
Eurecom Ref: 7962
Copyright: CEUR
Permalink: https://www.eurecom.fr/publication/7962