See the code that made this possible: eidech/historyinterviewer: Python automation of AI-Generated History Interview Workflow (github.com)
The idea for this project first arose during the CoSN "AI and Its Impact on K12 Education" workshop at Central Connecticut State University, which I attended with the other tech integrators from Trumbull. The keynote speaker, Dan Fitzpatrick, showcased some of the strategies he used to incorporate AI and give his students novel experiences.
One such strategy that intrigued me was a virtual interview that he did with students so they could ask questions of King Henry VIII. He described a workflow involving ChatGPT and Midjourney. Once I tracked down a Twitter thread in which he explained his full workflow, I began planning in earnest.
His workflow made clear that the process was done outside of class: he manually typed his prompt into ChatGPT to produce a response, then used PlayHT to generate audio, then used D-iD to generate a video from the audio. As he said in his Twitter post, he was able to generate these videos in 10 minutes. I wanted to bring a historical figure into class for a live interview, and that kind of experience depends on students asking questions and receiving answers in a timely fashion. I decided that I could automate the process and bring down the generation time considerably.
I identified the APIs that I could use and made a workflow similar to Dan's that could produce videos for an in-class interview with President Franklin Delano Roosevelt, to be used in an Honors US History class at Trumbull High School.
To start, I used OpenAI's DALL-E to create an image of Franklin Delano Roosevelt that D-iD could use in generating videos. I also created a prompt for my script that ChatGPT could use to produce answers to student questions:
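from openai import OpenAI

def call_chatgpt(question):
    # OpenAI() reads the API key from the OPENAI_API_KEY environment variable
    openaiclient = OpenAI()
    response = openaiclient.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[
            # The system prompt sets the persona and keeps answers short
            {"role": "system", "content": "You are the president Franklin Delano Roosevelt in a high school history class. You are answering student questions about your life. Try to answer in three sentences or less"},
            # The student's question is passed through unchanged
            {"role": "user", "content": question}
        ]
    )
    # Return only the text of the first completion
    return response.choices[0].message.content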
So far, so good. My script would use my closely guarded OpenAI API key to retrieve a response from ChatGPT. Next, I trained a PlayHT voice using a public-domain recording of Franklin Delano Roosevelt. This trained voice was available through PlayHT's API using the Python module pyht. I created a call_playht function that took my ChatGPT response, generated audio of FDR speaking it, and downloaded the result.
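The post does not show call_playht itself, so the following is only a minimal sketch of the general shape such a function could take. It assumes the text-to-speech client hands back audio as a stream of byte chunks; the `synthesize` parameter, the default file name, and the chunk-skipping behavior are all hypothetical stand-ins, not the author's code or the pyht API.

```python
from pathlib import Path
from typing import Callable, Iterable


def call_playht(text: str,
                synthesize: Callable[[str], Iterable[bytes]],
                out_path: str = "fdr_answer.mp3") -> Path:
    """Turn text into an audio file and return the file's path.

    `synthesize` is a stand-in (assumption) for whatever text-to-speech
    call is used; it must accept text and yield raw audio bytes in chunks.
    """
    destination = Path(out_path)
    with destination.open("wb") as audio_file:
        for chunk in synthesize(text):
            # Skip empty chunks so only real audio data is written
            if chunk:
                audio_file.write(chunk)
    return destination
```

In a real run, `synthesize` would wrap the call to the trained FDR voice. For a dry run without any API key, a fake such as `lambda text: [text.encode()]` is enough to exercise the file handling.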
Then, I hit my first snag. D-iD's API requires the user to provide links to both the picture and the audio file rather than the files themselves. This was not a problem in testing, since I could just upload a file to my Bluehost website and link to it directly; however, I was going to need a way to host and access these files dynamically.
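To make the URL requirement concrete, here is a sketch of the kind of request involved. It assumes a `talks` endpoint at `https://api.d-id.com/talks`, a `source_url` field for the image, an audio-type `script` carrying an `audio_url`, and Basic authentication; these details should be checked against the current D-iD documentation. The function names are hypothetical, and the code only prepares the request without sending it.

```python
import json
import urllib.request

DID_TALKS_URL = "https://api.d-id.com/talks"  # assumed endpoint


def build_talk_payload(image_url: str, audio_url: str) -> dict:
    """Build the JSON body: both media files are passed by URL, not uploaded."""
    for name, url in (("image_url", image_url), ("audio_url", audio_url)):
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"{name} must be a publicly reachable URL: {url!r}")
    return {
        "source_url": image_url,
        "script": {"type": "audio", "audio_url": audio_url},
    }


def build_talk_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request."""
    return urllib.request.Request(
        DID_TALKS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it. The check in `build_talk_payload` is the point of the sketch: a local file path is rejected up front, which is exactly why the files need to be hosted somewhere reachable first.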
Enter my (relatively) new toy, the ZimaBoard 832. When I purchased Arlo security cameras for my home, I wanted to store videos locally while still being able to access them away from home, so I purchased this device to run a WireGuard server that lets me VPN into my home network. It turns out the ZimaBoard is also a fairly capable web server. I was already running Apache2 as a reverse proxy to expose CasaOS as a virtual host, so I decided to run a Flask server (also behind Apache2) and create a simple REST API for uploading, accessing, and deleting the files my script needs. With my endpoints ready to go, I was now set to send API requests to D-iD to create videos on the fly.
And, thus, I had my finished product. I brought the script to history class, and we asked FDR many questions: how his struggle with polio affected his Presidency, how his wife Eleanor helped him navigate the Great Depression, what his favorite song was, and why he didn't end up packing the Supreme Court (among other topics).
This project still has a way to go before I'm satisfied with it. I'm looking for a reasonably priced GPU with CUDA cores to hook up to the PCIe port on my ZimaBoard so I can start generating my own videos using SadTalker. I also want to create a web interface for the History Interviewer.
Have a question about this project? Interested in contributing? Feel free to email me at christopher.eide@gmail.com!