Thank you for bringing some knowledgeable sanity to this important issue. The way this is packaged in the media is indeed a problem, and I am troubled by the way this noise has been drowning out the signal. Indeed, what is the signal here? It does exist: the reality remains that ChatGPT will pass many of our exams - as many of our colleagues have verified. The point the media gets wrong is that this is not about the algorithm, but about the exam! This is about the proxy measures we have used in education that won't work anymore. This is about the need to move to more human-centred modes of testing, and this is about the problem that those do not scale.
The resources we posted at the Sentient Syllabus Project (http://sentientsyllabus.org) have as their first principle: An AI cannot pass this course. This is aspirational, but also a "survival strategy". We just can't afford to allow the AI a passing grade – if we are not better than the AI, we can no longer bring value to the system. But how to be better? Part of the answer is to stand on the AI's shoulders; another part is to make the value explicit. There's more to be said than will fit into this margin; I've written about it in the posts right here on Substack - just click on the profile.
Even so, a colleague of mine said: taking the AI performance as the failing grade? No way. Our students won't be able to do that.
That's where the real challenge is.
I just subscribed here, looking forward to learning more about your take on this.
You might be amused by my own little piece: https://open.substack.com/pub/salvatoreattardo/p/chatgpt4-failed-my-pragmatics-exam?r=1jtkvf&utm_campaign=post&utm_medium=web
Thanks for these articles. It seems progress is being made, but we should be careful not to overestimate it. I guess it is sometimes OK for a researcher to be hyped about their research, as long as the public is not deceived into thinking the job is done.
One thing I am curious about is whether we could gather data from doctors interacting with patients via a text-and-image-only chat system, where the doctor can ask for images that the patient can take.
Then, if we fine-tune multimodal models on this data, I guess they should be able to at least give you a little guidance on what to do next. Maybe a doctor can check the answers afterwards, but this way one doctor could see 10x as many patients.
At least North American healthcare really needs something like that. I mean, we don't want these bots to prescribe treatment, but they could gather the needed data, and then the doctor's job would be much easier.