OpenAI calculated and ruthless, small vision models and the demise of Stability AI

Hello!

Here are three things I found interesting in the world of AI over the last week.

OpenAI appoints former NSA director to board - blog post

Edward Snowden's articulation in the article sums it up the best for me - "They've gone full mask-off.... This is a willful, calculated betrayal of the rights of every person on Earth".

Personally, I appreciate their candour. Kind of. It's a little bit like hallucinations in Large Language Models: if you know that the AI will make stuff up and not know the difference between fact and fiction, it's easy to work around the limitation. It's when you aren't aware that bad things happen. If you know that OpenAI is spyware deeply integrated with the US intelligence apparatus then it's not that hard to figure out when to use it and when not to.

I do think it highlights how ruthless OpenAI will be in pursuing AGI - it is easy to rationalize any strategy as justified when you are intent on creating a digital god.

Vision models are getting smaller and smaller

A vision model typically means an LLM that can take both text and images as input, a capability that is common in all the big models like GPT-4o, Gemini and Claude. A common use case is to take a photo of the food in your cupboard and ask for recipe ideas with those ingredients.

Both Meta and Microsoft released weights for new models which are small enough to run on consumer hardware and perform quite well. Apple are also publishing a lot of work on small models, including vision. There is a massive amount of research and money going into small on-device AI, and technical people have a lot of options to experiment with early-stage tools. I expect we'll see some consumer-friendly applications on mobile and desktop over the next six months, particularly with the rollout of Apple Intelligence.

This gap between what technical and non-technical folks have access to is one of the reasons I'm so interested in things like teaching people to code a little bit, or to get comfortable enough in the terminal to install a project from GitHub. Case in point: apart from the Meta blog post, the following links are fairly technical (I couldn't find good press coverage, sorry), but it only takes a little bit of learning to be able to make sense of them.

Microsoft Florence-2 model card | Meta Chameleon announcement | Apple models

Stability AI release Stable Diffusion 3 - blog post

Stability AI are one of the nearly dead canaries in the coal mine of the AI bubble - out of cash (they burned at least 120m), chasing investment or acquisition, struggling to find any kind of revenue.

The most notable thing about this release is the licensing changes from previous models - it is only free for non-commercial use. Commercial use costs $20 / month for small organisations (less than 1M of revenue, funding and monthly active users) and also has a limit of 6000 images per month. Anything larger is custom pricing with their enterprise sales team.

Yeah, nah. I don't like their chances.

I think this highlights how hard it will be for specialist AI labs (Mistral, ElevenLabs, Stability etc.) to survive. Building foundational models and selling access to them is a tough game. Giving them away open source (like Stability AI did) will only work for companies like Meta, Microsoft, and Google, who can lower their average cost of research while they make most of their money using the AI, and only a little bit selling access to it.