Didn't expect the 'GPT-3 moment' analogy for video models, but it makes so much sense now that you say it. The way Veo 3 is described, it really does feel like computer vision is finally catching up in the zero-shot reasoning game; guess my social media feed is about to get even more confusing.
It's definitely a bit of a stretch, but I liked the idea of calling it that! It will certainly take more time, as all the traditional modalities are hard to bake into a single model, but the first signs are there. Exciting times ahead indeed!