AI that analyzes is old, AI that creates is new
All the biggest tech companies have been crowing about AI for years. AI that isolates and parses your speech for dictation and voice assistants, and can distinguish between voices for personalized results. AI that pieces together recorded sounds to “talk” to you. AI that isolates parts of images so you can easily edit them. AI that identifies objects and people to power your searches. AI that lets you select the text in any image.
Apple does all this stuff. It’s so important to the company that it builds a Neural Engine into all its chips, specialized hardware that accelerates machine learning tasks like these. Apple’s even working on the biggest AI challenge of all, self-driving cars.
But generative AI is something else. It’s a newer class of AI that creates something entirely new from little more than a short text prompt. Yes, training the models takes a ton of time and a mountain of data, but the resulting models that users run are comparatively small and can seemingly make an infinite amount of new stuff. The AI that can find all the potatoes in your photo library is a totally different thing from one that can draw a potato from scratch in a wide variety of artistic styles.
ChatGPT, Bard, and Bing
The headline-making generative AI tech right now is ChatGPT from OpenAI. The advanced chatbot, and tools built upon it, are already being used in the business world to generate articles, emails, templates, and more, with some controversy. Students are using it to write entire papers from a small prompt, and the results are good enough that there’s a race to develop good tools for teachers to identify ChatGPT-written assignments.
Because it was trained with a ton of web data that, while dated, is still relevant for many things, it can almost be like a search engine you converse with. This freaked out Google so much that it announced its own rival conversational AI product, Bard, which isn’t quite ready for the world to try out yet but is coming soon. A public demo provided wrong information about the James Webb Space Telescope, so Google clearly has work to do. Microsoft also announced a new conversational search feature you can start using right now in Bing and the Edge browser. It’s built on the same OpenAI technology behind ChatGPT, with some enhancements and modifications.
These are more than just toys or curiosities. These are real tools that people are using to do real work and to power creative projects. It’s all early days, and sometimes feels like it’s not ready for the world at large, but the pace of improvement and innovation is staggering: the AI models are doubling in complexity and sophistication every six months.
Stable Diffusion, Midjourney, DALL-E
And it’s not just the written word. We all had a good laugh making silly prompts with DALL-E 2 last year, but with further training and enhancements, these generative AI art tools have become good for a lot more than just making images of anime-style cats scuba diving with fishbowls over their heads. Midjourney and Stable Diffusion have gotten so good they’re creating art that could easily grace the cover of a magazine, and they can turn out dozens of images in a few minutes.
These tools can do much more than just make completely new images in a wide range of styles. They can alter input images. The App Store is already awash with avatar- and profile-making apps that use this software to take a few photos of your face and modify them in stunning ways, changing physical features or adding sunglasses that look completely real, without anyone being the wiser. Last year it was a gimmick, but the technology is developing so rapidly that it’s already a tool.
Adobe has already improved a lot of its apps with AI-powered image generation tools for one-click photo restoration and vastly improved object deletion. But the company plans to add significant generative AI to its toolset soon, allowing you to literally insert images into existing photos and artwork that look like they fit right in.
A narrow window to act
And where is Apple in all this? The company has positioned itself as a technology leader, especially in the creative space. But with the exception of a few blog posts on its ML research site and some relatively low-effort optimizations to libraries for Apple silicon, Apple seems to be sitting this one out. I mean, I think Divam Gupta’s DiffusionBee is super cool, but it’s a little independent third-party app that hasn’t been updated in quite a while and is already behind the state of the art in AI image generation.
This technology is going to be completely transformative. Don’t believe me? Check out OpenAI’s research into generating music. It creates new music in a variety of styles, including some singing, completely out of nowhere. Microsoft’s VALL-E can generate shockingly realistic voices that sound very close to a real person, using just a tiny snippet of that person’s voice as input. It can even mimic various emotional states.
Many of these projects, and dozens more, are still in the research stage. It’s not hard to find some flaws with any of them. But the journey from research to the real world will be quick, and the flaws will get vanishingly hard to find.
Apple certainly has the tools to build its own generative AI chatbot. Every new Mac and iPhone has a Neural Engine that’s capable of up to 15.8 trillion operations per second, as well as powerful Core ML and machine learning APIs. But we haven’t seen any movement from inside Cupertino. Accuracy and speed are of paramount importance with AI chatbots (Google’s stock and credibility tumbled this week after an error in its Bard chatbot), so it’s possible that Apple is doing work behind the scenes with Siri and, in true Apple fashion, won’t release anything until it’s perfected.
But even with a wealth of tools at its disposal, the question remains: Is Apple even paying attention? If it isn’t intently watching the AI space, Apple might not realize how fast it’s evolving.
It took a year for generative AI to go from a “silly online research project toy” to “dueling announcements from Microsoft and Google.” In two more years, these tools will be ten times better and there will be a whole lot more of them. You’ll have a hard time telling what is real or completely AI-generated out of thin air. If you have big ideas but limited artistic skills, generative AI will make it a lot easier to realize your dreams.
With Siri, Apple was at the forefront of bringing an AI voice assistant to the masses. As that technology evolved, Apple fell way behind, and now Siri is often viewed as a disappointment that can’t compare with Google Assistant or Alexa. When it comes to generative AI, Apple doesn’t even have the first-mover advantage it had with Siri. Tech companies big and small are already shipping powerful tools.
Without action, Apple will simply wind up making some of the hardware upon which our generative-AI-driven future will run. Without realizing the power of this new technology in its own software and services, Apple will let everyone else define the state of the art for what could be the most important shift in computing in decades. Perhaps the company is okay with that, but as hardware sales flatten and the software and services side of Apple’s business grows, it really can’t afford not to be a leader in the generative AI revolution.
Of course, Apple is one of the most secretive companies in tech, especially when it comes to software. Apple could have big teams working hard to bring generative AI features to iMovie, Final Cut Pro, Logic Pro, Photos, Mail, Messages, and the whole iWork suite. All of these could be completely transformed by powerful generative AI tools. We know Apple bought at least one generative AI company, AI Music, about a year ago. It wouldn’t be unreasonable to see at least a “generate an original instant soundtrack for your video” tool in Apple’s products this year.
We might not hear anything at all about generative AI out of Apple, and then at WWDC, BAM! World-class generative AI all over Apple’s products! I hope that’s the case, because if Apple’s late to the game on such a transformational technology, it’s going to doom its software to lagging behind its competitors for years to come.
Author: Jason Cross, Senior Editor
I have written about technology for my entire professional life, over 25 years. I enjoy learning about how complicated technology works and explaining it in a way anyone can understand.