Very few people who talk about AI have built products with it.

I have. It’s really hard.

I’m going to write a few articles about the biggest f**kups of my AI product-building life.

This first installment was prompted by the news a few months ago that OpenAI acquired Jony Ive’s IO for $6.5bn. IO is a wearable generative AI engine that analyzes what it sees through a small lens (vs. an LLM, which analyzes large amounts of language, not images).

It reminded me of a product I tried to launch in Mexico City about 10 years ago. It didn’t go well. It was a hardcore disaster. Literally.

This first episode of “Sh*t My AI Built” is called “Mexican Mall P*rn”.

I was leading product strategy for a joint venture between Google and PwC. For one of our product launches we partnered with a major cable provider in Mexico to beta-test an AI-powered, voice-assisted sales tool.

Basically, it was like an AI earbud for sales reps—designed to help them close deals in real time.

Here’s the problem we were trying to solve: In big malls across Mexico City, the cable provider set up kiosks with giant TVs playing soccer matches. When shoppers crowded around to watch, sales reps would approach them to see if they were interested in buying a new cable package.

Conversion rates were terrible. Because nobody goes to the mall thinking, “You know what I need? Cable.”

So we tried to build a smarter system.

We built an audio-assisted AI tool that connected to a mic and an iPad. The AI would “listen” to the sales reps’ conversations with shoppers. So, if a sales rep asked “What do you like to watch?” and the shopper said “Game of Thrones,” the iPad would play a Game of Thrones trailer. If they liked telenovelas, same thing. It was immersive. Seamless. It didn’t feel like a sales pitch.

In the sanctuary of our quiet NYC studio, it worked beautifully. 

But, in the mall it was a hardcore disaster.

Malls are loud. Soccer crowds are louder. And the word that kept rising above all the noise?

FUUUUUUUUUCCCKKK. Missed penalty kick? “FUCK!” Lose the lead in extra time? “FUCK!” “FUCK, FUCK, FUCK!”

Actually, it’s worse than that. The Mexican-Spanish equivalent of f**k is much raunchier than American-missionary. It translates as “F*** your mother!!”, “D**k!”, “Your wife’s cheating on you!”, or “Don’t suck it.”

Our AI picked it all up. Took it as a content preference. And diligently began serving trailers for XXX porn. To Dads. While their kids were buying t-shirts in The Gap. (If we were lucky).

All because our AI did exactly what we told it to do. It listened carefully. And gave people what it thought they wanted.

One of the challenges of building AI tools with audio and vision is that the AI doesn’t know when it’s being spoken to or shown something. With ChatGPT, we type what we want when we want it. There’s a clear opening and closing of the prompt window. With audio and vision, it’s far less clear when the AI should listen and who it should listen to.

User experience design with AI is hugely important. (I sense that it’s also being neglected. And that this is contributing to a more painful trough of despair. More on that in another article.)

Building products is messy because people are messy. People don’t behave the way we anticipate. We don’t know what guardrails need to be factored into the design until we break stuff. What we program technology to do does not factor in the full, in-the-wild experience of real life. It gets messier with AI, which is supposed to enhance human engagement, but sometimes doesn’t know who to listen to and just plays porn.

I am sure that Jony Ive and Sam Altman have thought about the messiness of a generative AI engine that processes video rather than language. But, if not, I’ll see you at the mall.

“SH*T MY AI BUILT” is a new talk that I’m rolling out this September in NY, Chicago, and Miami. For speaking inquiries email [email protected]
