This work, proposed and published as part of a thesis at RIT, introduces a unified architecture built on BERT. It enables the training of a variety of Knowledge Tracing (KT) models through a generalized BERT-based framework, simplifying the process of training different KT models for diverse scenarios. A copy of this work is available in the ProQuest library under the same title.
This research presents a comparative study of the two pivotal phases in the development of large language models (LLMs): pretraining and fine-tuning, both conducted entirely in a local environment. We aim to shed light on the feasibility of each approach for building a task-specific LLM (in our case, one that generates SQL-like text) while keeping scalability and efficiency in consideration. Our study centers on LLMs inspired by the GPT and Llama 2 architectures and delves into the intricacies of pretraining, where an LLM acquires its linguistic knowledge, and of fine-tuning, where an LLM extends that knowledge based on the new data it is exposed to. The pretraining phase serves as the foundation for our comparative analysis of the fine-tuning process. We employ the QLoRA method, showing how it efficiently refines a pretrained Falcon 7B model within the constraints of a local environment. This work underscores the complementary interplay between pretraining and fine-tuning, providing practical insights for the machine learning community.
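To make the QLoRA setup concrete, the following is a minimal sketch of 4-bit quantized fine-tuning of Falcon 7B using the Hugging Face transformers, peft, and bitsandbytes libraries. The model identifier tiiuae/falcon-7b and the hyperparameters shown (rank, alpha, dropout) are illustrative assumptions, not the exact configuration used in the study.

```python
# Illustrative QLoRA setup for Falcon 7B (hyperparameters are assumptions, not the study's exact values).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "tiiuae/falcon-7b"

# Load the base model in 4-bit NF4 precision so it fits on a single local GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for k-bit training and attach low-rank adapters
# to Falcon's fused attention projection; only the adapters are updated.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows the small fraction of weights being tuned
```

The adapted model can then be trained on text-to-SQL pairs with a standard causal language modeling objective (for example via transformers' Trainer), since only the adapter weights require gradients while the 4-bit base weights stay frozen.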