Create a Virtual Professor using OpenAI’s Custom GPT
At the beginning of the COVID pandemic in 2020, we began online class sessions over Zoom. I began recording all my lectures to accommodate students who missed class due to illness or to care for a loved one.
I continue this practice today since the video recordings serve as a resource for students. I recently realized they would also be a great source of information for a ChatBot. Could it serve as a 24/7 resource for students to ask questions about the lecture and the course content? I think so!
Use Cases
Provide students with a 24/7 resource for the course that can answer their natural language questions. (This also reduces the load on the professor.)
Provide a tool for students that can automatically generate study notes on the content.
Provide a tool for students to explore the topic as their interest grows.
Solution: OpenAI Custom GPT
I had a few requirements for this idea of a “Virtual Professor” to serve students of this specific course. Below are the requirements and a description of how OpenAI’s Custom GPT service satisfies those requirements. Custom GPTs “…are custom versions of ChatGPT that users can tailor for specific tasks or topics by combining instructions, knowledge, and capabilities. They can be as simple or as complex as needed, addressing anything from language learning to technical support.” For my use case, I will tailor ChatGPT to provide answers specific to my course on the foundations of secure software development. The answers will be based on my lectures so that students can use it to reemphasize topics, create outlines, and self-generate quizzes.
Requirements:
The ChatBot needs to use the transcripts of my lectures as the primary source of information. However, it can be supplemented by the ChatGPT model. The free ChatGPT model may not be 100% accurate. However, for the purpose of the educational chatbot, I wanted students to take advantage of the expanded explanations grounded by the lecture content. We could discuss discrepancies as they came up (I could also tune the GPT chatbot).
The ChatBot needs to be accessible to my students for free. Fortunately, although a subscription is required to create a Custom GPT (my cost as the professor at $20 per month), my students can utilize the Custom GPT I create for free.
The ChatBot needs to have mature capabilities. I considered using Llama 3 and creating my own chatbot service, but why should I put that energy into it at this stage of prototyping? ChatGPT can create outlines, quizzes, and summaries, and it can also search the web. In addition, adoption among consumers is already high. Therefore, there is no need for me to reinvent the wheel.
I needed a means to restrict access to only my students. The OpenAI custom GPT service allows me to specify that only users with the link can access the custom GPT I created. This is sufficient at this stage.
Verbatim phrases from the lectures need to be cleaned up before being presented to the user. I want students to be able to query the ChatBot but not get full access to the transcript. I did not clean the transcripts, so they may contain transcription errors, and my lecture ramblings may not have been eloquent. It is still possible to ask the ChatBot to output a source “verbatim.” However, I want to make sure the custom GPT cleans up the transcript text before presenting it. With this custom GPT, I am able to include a prompt that cleans up verbatim phrases before they are presented to the user.
General Architecture
Extract the text transcripts for each YouTube video lecture for the entire course semester.
Upload every text transcript to ChatGPT to create a custom GPT.
Tune the GPT and publish it as a custom GPT for the course that can be used by students to create outlines, summaries, and practice quiz questions. Students could also pivot their research and exploration to related topics via ChatGPT.
YouTube Video Transcripts
I post my videos as unlisted videos on YouTube to ensure they are not immediately discoverable by the public. The links to these unlisted videos are provided to students currently enrolled in the class. Since my videos are managed by the YouTube Studio application, it is easy to obtain a transcript of my videos. For any video in your channel content, open up the video Details. From there, click on the “Subtitles” button.
A window pops up that shows your video’s text transcript in a scrollable text box on the left. Simply click on the “three-dot” menu and select “Download Subtitles” to save the transcript to a text file, “transcript.txt”. Rename and edit the transcript file as you need to.
Repeat this process until you have all the transcripts you want to include in your Custom GPT chatbot. Below is the list of transcript files that I am using to tailor this custom GPT.
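If the downloaded subtitle files still contain cue timestamps, a small script can reduce them to plain lecture text before upload. The following is only a minimal sketch, not part of the original workflow: it assumes the file uses SBV, SRT, or WebVTT style timestamp lines, and the names subtitles_to_text and clean_file are hypothetical helpers introduced here for illustration.

```python
import re
from pathlib import Path

# Matches cue timestamp lines such as (assumed formats):
#   0:00:01.000,0:00:04.000          (SBV style)
#   00:00:01,000 --> 00:00:04,000    (SRT style)
#   00:00:01.000 --> 00:00:04.000    (WebVTT style)
TIMESTAMP = re.compile(
    r"^\d{1,2}:\d{2}:\d{2}[.,]\d{3}\s*(,|-->)\s*\d{1,2}:\d{2}:\d{2}[.,]\d{3}"
)


def subtitles_to_text(raw: str) -> str:
    """Drop timestamps, cue numbers, and blank lines; join the rest."""
    kept = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Caveat: a caption consisting only of digits is treated as an
        # SRT cue number and dropped.
        if line == "WEBVTT" or line.isdigit() or TIMESTAMP.match(line):
            continue
        kept.append(line)
    return " ".join(kept)


def clean_file(src: str, dst: str) -> None:
    """Hypothetical helper: read one subtitle file, write plain text."""
    raw = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(subtitles_to_text(raw), encoding="utf-8")
```

For example, clean_file("transcript.txt", "lecture01.txt") would write a timestamp-free copy; the output file name is a placeholder. This does not fix grammar or transcription errors, which are left to the custom GPT's instructions.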
Create a Custom GPT
You need to be a subscriber to create a Custom GPT. I have the “Plus” subscription, which is $20/month. The detailed instructions for creating a custom GPT are here: https://help.openai.com/en/articles/8554397-creating-a-gpt. It is an extremely simple process, and there are lots of YouTube videos on it. The following is a summary of the configurations I applied to create the custom GPT.
First, click on “Explore GPTs” in the left sidebar.
Then click on the “+Create” button at the top-right to start the creation of a custom GPT. The GPTs you create will show up in the “My GPTs” tab, which we will see later.
Usually, you are presented with the “Create” mode, where you can create a GPT interactively with prompts to the GPT Builder. These prompts configure the custom GPT. To the right of the screen is the area where you can test the configuration incrementally.
I prefer the “Configure” tab, which is clearer about what information you need to input.
The “Name” and “Description” fields are self-explanatory. If you plan to make your custom GPT available to the public, spend some time developing good content for those fields.
The “Instructions” field contains the prompts to ground and tailor the GPT.
The “Conversation starters” create quick access buttons to run pre-determined prompts.
The “Knowledge” section is where you upload the files you want your custom GPT to reference. I will upload my video lecture transcripts here.
The “Capabilities” section indicates the additional capabilities of this custom GPT. For my purposes, I do not need image generation for the course. However, I do wish the GPT to seek additional information on the Web as necessary.
The “Actions” section allows you to integrate additional functionality like calling APIs. We do not need that functionality at this time.
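Before typing values into the “Configure” tab, it can help to draft them in one place. The sketch below is only a local planning aid and does not call any OpenAI API; the class GPTConfigPlan, its field names, and the example values are all hypothetical and simply mirror the sections described above.

```python
from dataclasses import dataclass, field


@dataclass
class GPTConfigPlan:
    """Local checklist mirroring the Configure tab (hypothetical names)."""
    name: str
    description: str
    instructions: str
    conversation_starters: list[str] = field(default_factory=list)
    knowledge_files: list[str] = field(default_factory=list)
    web_search: bool = True          # wanted for this course
    image_generation: bool = False   # not needed for this course
    actions: list[str] = field(default_factory=list)  # none at this time

    def missing_fields(self) -> list[str]:
        """Return the sections that are still empty."""
        missing = []
        if not self.name.strip():
            missing.append("Name")
        if not self.description.strip():
            missing.append("Description")
        if not self.instructions.strip():
            missing.append("Instructions")
        if not self.knowledge_files:
            missing.append("Knowledge")
        return missing


# Example values are placeholders, not the actual configuration.
plan = GPTConfigPlan(
    name="ISA 320 Virtual Professor",
    description="Answers questions based on the ISA 320 lectures.",
    instructions="(paste the instruction text here)",
    knowledge_files=["lecture01.txt"],
)
print(plan.missing_fields())
```

Keeping the draft in a file like this also makes it easier to track changes as the instructions are tuned over time.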
Below is a video showing all the configurations I set for this custom GPT and a demonstration of a few prompts. Watch the video in full screen mode and pause as needed to see the configuration and watch the demonstration.
The instructions are the most important section (aside from the Knowledge files I uploaded), and I want to comment on them below. Each instruction is listed first, followed by my rationale:
Extract answers based on the lectures from the uploaded Knowledge files and supplement them with relevant information.
This is intended to confine answers to the content of the lecture files I uploaded. At the same time, I allow the GPT to supplement it with information from the web. However, I have found that you sometimes need to preface a prompt with, “Based on the lectures in the files uploaded, …” to ensure it uses the lecture notes as the primary reference.
You are a patient professor and an expert in secure software development for desktop, web, and cloud applications.
I wanted detailed explanations written from an academic perspective by a professional in the fields relevant to the course.
An output with source code should be in the Python programming language.
The custom GPT sometimes supplements responses with code. Occasionally, I saw it produce code in languages other than Python. Since our program standardizes on Python, this instruction tailors the GPT to generate code examples primarily in Python.
If any information about a date or when something is due for ISA 320 is requested, respond with, "This content is from the Spring 2024 semester; therefore, the dates are not relevant."
During my lectures, I also discussed due dates. Since those due dates are not relevant to students taking this course after Spring 2024, this prompt is intended to prevent the mentioning of dates.
If the content does not relate to the content in the uploaded files, cybersecurity, or software development, respond with, "This is beyond the scope of this ISA 320 chatbot."
This custom GPT should not provide answers on topics other than those in the lecture files or the course topics.
If a verbatim output from the uploaded Knowledge files is requested, respond with a detailed grammatically correct version with proper punctuation of the output.
I realized during testing that a query for “verbatim” content from the uploaded files does return content directly from the files. Since the transcripts were not “cleaned” and ensured to be grammatically correct (I ramble during my lectures sometimes…who doesn’t :)), the responses were not ideal or read out of context. Therefore, I added this prompt to clean up the response before presenting it to the user.
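The instructions above can be kept in a small script so that they are versioned and pasted into the “Instructions” field as one block. This is a sketch under assumptions: the names INSTRUCTIONS, build_instructions, and ground_prompt are hypothetical, and joining the instructions with newlines is just one reasonable layout. The ground_prompt helper applies the “Based on the lectures in the files uploaded, …” preface mentioned earlier.

```python
# Instruction text copied from the list above, one entry per instruction.
INSTRUCTIONS = [
    'Extract answers based on the lectures from the uploaded Knowledge '
    'files and supplement them with relevant information.',
    'You are a patient professor and an expert in secure software '
    'development for desktop, web, and cloud applications.',
    'An output with source code should be in the Python programming '
    'language.',
    'If any information about a date or when something is due for ISA 320 '
    'is requested, respond with, "This content is from the Spring 2024 '
    'semester; therefore, the dates are not relevant."',
    'If the content does not relate to the content in the uploaded files, '
    'cybersecurity, or software development, respond with, "This is beyond '
    'the scope of this ISA 320 chatbot."',
    'If a verbatim output from the uploaded Knowledge files is requested, '
    'respond with a detailed grammatically correct version with proper '
    'punctuation of the output.',
]

GROUNDING_PREFIX = "Based on the lectures in the files uploaded, "


def build_instructions(lines=INSTRUCTIONS) -> str:
    """Join the instructions, one per line, for the Instructions field."""
    return "\n".join(lines)


def ground_prompt(question: str) -> str:
    """Preface a student question so the lectures stay the primary source."""
    question = question.strip()
    if question.lower().startswith(GROUNDING_PREFIX.strip().lower()):
        return question  # already prefaced; leave unchanged
    return GROUNDING_PREFIX + question


if __name__ == "__main__":
    print(build_instructions())
    print(ground_prompt("What is input validation?"))
```

Students would type the prefaced question directly into the chat; the helper only illustrates the wording and is not required by the custom GPT.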
These are all the prompts I used at this point of prototyping and testing. I will continue to tune the prompts over the next several weeks. The official resources for prompt development are here:
Deployment
You have a few options for deploying the custom GPT when you click on the “Share” button.
Keep it Private to yourself
Only people with the link can access it
Public access via the GPT Store
At the moment, I am selecting “Only me” during this testing phase. However, once the course starts, I will change it to “Anyone with the link” and give the link to current students only. Below is how the custom GPT is presented. I could add an icon, etc. but this is good for now.
Final Thoughts
Many educators have struggled with using LLM technology to support their students and enhance learning. My perspective is that we need to evolve and develop ways to integrate it as soon as possible. This technology is just too good at providing information in response to natural language queries. The goal is to ensure the information is authoritative and accurate. There are real issues relating to information retention and cheating. Therefore, I am considering these steps:
Make the custom GPT unavailable during test time. This custom GPT will be available throughout the course so students can use it to study but not answer questions on a test.
Student assessments will need to be modified to include more detailed reasoning based on specific facts provided. I have found that the answers provided by LLMs on specific scenarios are too general and generic.
Include more oral presentations so that students can demonstrate synthesis and application of the topics.
The bottom line is that AI/LLM technology is here, and we need to master it before it becomes our master.