A note from the AI front lines
AI rightfully refuses to relinquish its place in our cultural conversation, so I thought I'd share a few interesting things we've noticed at Pixee while building an AI product security engineer. I hope others might find validation or understanding, or, even better, feel invited to discuss how embarrassingly wrong I am.
The cloud flagship models are a moving target, and thus require regular maintenance to use correctly. "Isn't gpt-4 the same as when you first tried it?" No, it's just not. I'm sure some of its core is the same, but the vendors I talk to are tuning their offerings in response to quality degradation. Much of the community is watching Llama with great interest for this and other reasons, but dreading the slow inference speed they'll get in customer environments.
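One way to make that maintenance concrete is to treat drift like any other regression: keep a small set of prompts with known-good answers and re-run them on a schedule. A minimal sketch, with the model call stubbed out and all names illustrative (in practice you'd call a dated model snapshot rather than a floating alias):

```python
# Sketch: detect quality drift by re-running a fixed prompt set and
# comparing answers against stored baselines. call_model is a stand-in
# for a real API call to a pinned model version.

BASELINE = {
    "Classify: SQL string built by concatenation": "injection",
    "Classify: password logged at INFO level": "sensitive-data-exposure",
}

def call_model(prompt: str) -> str:
    # Stub: a real version would hit a pinned (dated) model snapshot.
    return BASELINE[prompt]  # pretend the model still agrees today

def drift_report(baseline: dict) -> float:
    """Return the fraction of baseline prompts whose answer changed."""
    changed = sum(1 for prompt, expected in baseline.items()
                  if call_model(prompt) != expected)
    return changed / len(baseline)

# A nonzero result is the signal to go do that "regular maintenance."
print(drift_report(BASELINE))
```

Nothing fancy, but a rising drift number is what turns "the model feels worse" into something you can act on.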
Vendors are constantly convincing themselves they can "throw something together quick" using AI, but they learn quickly that it takes significant, intricate engineering work to reach production quality and sustained performance. There's no doubt that AI is magical, but there is no shortcut to the magic; it requires significant investment, just like any other type of product development. Said another way, if you have a sizable problem to tackle and you're throwing a small team at it, my advice is to live with the problem or buy a solution.
Our AI architectures are constantly evolving, and feeling unmoored from well-established patterns makes for an unsettling journey. LangChain and Semantic Kernel are just two very different ways of going about the same problem. Prompt chains and agentic patterns are different too, and hard choices are constantly being made. Performance in solving the customer problem is the only metric worth worrying about, and constant, dizzying evaluation of new tools and patterns is the new norm. Having great benchmarks (e.g., tests) is the only way to make these changes with confidence.
People tend to fall into two buckets: AI haters ("you can't trust it," "it can't do the job") and the cautiously optimistic ("I'll see a demo"). In our industry, there are very few full-bore AI enthusiasts. I'm an optimist and an enthusiast, but to win over the haters, the performance has to speak for itself: in revenue, in mindshare, in visibility in buyer circles, and so on.
Fine-tuned models can achieve high performance, but they're expensive to build and operationalize. That's a tough technology to bet on before you've got product-market fit (PMF), when you're constantly reinventing and re-architecting; the journey will be painful for a lot of people. Once you've got some semblance of PMF, you can look at SLMs (small language models), fine-tuning, and the like, especially if you've got synthetic data generation really nailed down.
More notes another time!