With the introduction of GPT-o1 and Google's NotebookLM, we hosted a session where we tried out different experiments with these new capabilities.

We began by exploring voice. As ChatGPT transitioned from a solely text-based tool to a multimodal LLM, we started by talking to the model and paying attention to its tone, its answers, and the affective difference between typing at a bot and talking to one. To push the idea a little further, we had two bots argue with each other about the correct material for a sweater, giving them different personalities and different argumentation styles to match their allegiance to either wool or blended fabrics (a setup sketched in code at the end of this section).

We then turned to how Google uses the podcast format in NotebookLM to synthesize and present diverse sources of information. We had it explain research papers on haptics and VR, then had it describe Newton's 'Principia Mathematica' to us, and were pleasantly surprised by both its accessibility and the accuracy of its information.

Finally, as a way to understand how these models summarize large swaths of complicated information, we looked at GPT-o1's ability to write complex code accurately and to explain the reasoning behind how it arrived at its output. But to call attention to the specific kinds of problems LLMs can and cannot solve, and how they arrive at their answers, we had it try to play Wordle, at which it failed, likely because these models read text as multi-character tokens rather than individual letters, which makes letter-position reasoning surprisingly hard. Somehow Wordle was harder than Python.
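To see why Wordle is such a strange fit for an LLM, it helps to look at what the game actually demands. The snippet below is our own illustration, not code from the session: a minimal sketch of Wordle's scoring rule (the function name `score_guess` and the example words are ours). The logic is trivial for a short program but requires exactly the per-letter bookkeeping that token-based models struggle with.

```python
from collections import Counter

def score_guess(guess: str, answer: str) -> str:
    """Wordle-style feedback: G = green, Y = yellow, _ = gray."""
    feedback = ["_"] * len(answer)
    # Answer letters not consumed by an exact match; only these
    # are eligible to turn a guessed letter yellow.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"  # right letter, right position
    for i, g in enumerate(guess):
        if feedback[i] == "_" and remaining[g] > 0:
            feedback[i] = "Y"  # right letter, wrong position
            remaining[g] -= 1  # each answer letter matches at most once
    return "".join(feedback)

print(score_guess("crane", "cacao"))  # G_Y__
```

Every guess forces this kind of letter-by-letter constraint tracking across several turns, and a model that never "sees" individual letters has to reconstruct all of it indirectly.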
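And for anyone who wants to recreate the sweater debate from earlier in the session, here is a minimal sketch of the two-bot loop, assuming the OpenAI Python SDK. The personas, model name, and turn count are illustrative stand-ins, not the exact configuration we used.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative personas; ours were more elaborate.
PERSONAS = {
    "wool": ("You are a stubborn traditionalist knitter. Argue that a proper "
             "sweater must be pure wool. Rebut your opponent in two sentences."),
    "blend": ("You are a pragmatic textile engineer. Argue that blended fabrics "
              "make better sweaters. Rebut your opponent in two sentences."),
}

def debate(opening: str, turns: int = 4) -> None:
    """Alternate the two personas, feeding each the other's last reply."""
    last_reply = opening
    for turn in range(turns):
        side = "wool" if turn % 2 == 0 else "blend"
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[
                {"role": "system", "content": PERSONAS[side]},
                {"role": "user", "content": last_reply},
            ],
        )
        last_reply = response.choices[0].message.content
        print(f"[{side}] {last_reply}\n")

debate("Wool or blends: what should a sweater be made of?")
```

One design choice worth noting: each bot only sees its opponent's last reply, which keeps the exchange punchy; passing the full transcript instead would produce more coherent, if longer-winded, arguments.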
As LLM interfaces expand and we find new uses for them, it's crucial to know how these podcasts are actually generated, and why a chatbot talks back to you the way it does. This session was an attempt to pull back the curtain on tools that genuinely accelerate information processing, and even the solving of hard coding problems, but that also fail in ways that surprise us.