In recent “this is 100% my jam” news, this past week Andreessen Horowitz published a summary [link] of how the economics are looking for the recent wave of ‘AI startups’ compared to more traditional software-as-a-service (SaaS) businesses. In summary, AI companies have:
- Lower gross margins, due to heavy cloud infrastructure usage and ongoing human support;
- Scaling challenges, due to the thorny problem of edge cases;
- Weaker defensive moats, due to the commoditization of AI models and challenges with data network effects.
As someone who does this for a living, this all rings very true to me. We can put ourselves in the shoes of a bog-standard “AI company”, Fraudacity. Fraudacity serves online retailers big and small. We’ve got ourselves some fancy deep learning model straight out of MIT, and if you plug your payment processor into Fraudacity’s Fraudtomatron system we’ll find fraudsters and destroy them with extreme prejudice.

Let’s take a journey.
Fraudacity’s service-as-a-service model
While Fraudacity is marketing itself as “software as a service”, in reality we’re offering “service as a service”. The fundamental hook of the vast majority of “AI companies” out there is to take something valuable and offer to do it for you, with the heavy lifting being done by AI. This is most obvious with companies that claim to automate necessary, repetitive and technical tasks - i.e., companies like Fraudacity. It’s no less true for companies that offer genuinely new capabilities that were too manual to envision before - e.g., automated sentiment analysis of customer support calls. It’s all fundamentally performing the same role, which is to simulate an army of humans doing human things.
The moat problem for Fraudacity is much clearer when you conceptualize us as service-as-a-service. In order to secure new business, Fraudacity solved for a lightweight setup process that can accept inputs from the software customers already have and return outputs back to it. The computers are doing the hard work, but in terms of how it fits into customers’ workflows, it’s as if they box up some paperwork, send it to us, and we send back a file marked “Fraud”. Fraudacity’s competitors offer the same easy setup process we do, so if someone comes along with better pricing they can swipe our customers easily.
This dynamic will put pricing pressure on AI companies like us from the demand side. The supply side imposes cost pressure too, as behind the scenes it turns out that Fraudacity is really offering a hybrid software/labor service.
The ghost in the machine is a person
Setting up and running Fraudtomatron is far from trivial. While “machine learning” may suggest that machines train themselves, in fact setting up even a simple predictive model is heavy on human labor. Our data scientists need to run through long checklists like the following (which is very much *not* exhaustive):
- Do I have reliable outcome data on confirmed fraudsters?
- Do I have reliable input data that I can use to predict?
- Will future fraud risks reliably have that same input data?
- Has past input data (used for training) changed since it was first generated?
- Will unforeseeable future events make some input data unavailable?
- Will people start running different types of fraud?
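Some of the items on this checklist can at least be mechanized as pre-flight data checks. A minimal sketch in Python - the field names (`confirmed_fraud`, `ip_address`, etc.) and thresholds are hypothetical illustrations, not anything Fraudacity actually ships:

```python
from datetime import date

# Hypothetical input fields the model depends on.
REQUIRED_INPUTS = ["ip_address", "order_total", "billing_zip"]

def pre_training_checks(rows, today=date(2020, 6, 1)):
    """Return a list of human-readable problems; an empty list means 'go'."""
    problems = []
    # Do I have reliable outcome data?
    labeled = [r for r in rows if r.get("confirmed_fraud") is not None]
    if len(labeled) < 0.9 * len(rows):
        problems.append("outcome label missing on >10% of rows")
    if labeled and len({r["confirmed_fraud"] for r in labeled}) < 2:
        problems.append("labels are all one class - nothing to learn")
    # Do I have reliable input data?
    for field in REQUIRED_INPUTS:
        missing = sum(1 for r in rows if r.get(field) is None)
        if missing > 0.05 * len(rows):
            problems.append(f"input '{field}' missing on >5% of rows")
    # Has the training data gone stale since it was generated?
    stale = [r for r in rows if (today - r["order_date"]).days > 2 * 365]
    if len(stale) > 0.5 * len(rows):
        problems.append("most training data is over two years old")
    return problems

rows = [
    {"confirmed_fraud": False, "ip_address": "1.2.3.4",
     "order_total": 25.0, "billing_zip": "02139",
     "order_date": date(2020, 1, 15)},
    {"confirmed_fraud": None, "ip_address": None,
     "order_total": 12.0, "billing_zip": "68508",
     "order_date": date(2016, 3, 2)},
]
print(pre_training_checks(rows))
```

Note that the checks only catch the mechanical failures; the judgment calls (will fraudsters change tactics?) stay with the humans.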
These are difficult questions to answer, and they can’t easily be taken out of human hands. Reliably and robustly predicting fraud can benefit from fancy models, but it absolutely requires getting familiar with the underlying process being modeled. Otherwise it’s trivially easy to make huge mistakes, where the model looks good on the dataset you used to train it but fails entirely when put out into the world.
Trust me, I’ve done it a lot myself.
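The “looks good in training, fails in the world” trap is easy to demonstrate. A toy sketch on synthetic data (the drift scenario is made up for illustration, not Fraudacity’s actual model): a naive random train/test split mixes old and new eras, so a stale rule still scores well, while evaluating on the most recent months exposes the failure:

```python
import random

random.seed(0)

# Synthetic transactions: fraud comes from proxy IPs until month 10,
# when scammers switch to residential IPs (concept drift).
DRIFT_MONTH = 10
data = []
for month in range(12):
    for _ in range(200):
        is_fraud = random.random() < 0.1
        ip_type = "proxy" if (is_fraud and month < DRIFT_MONTH) else "residential"
        data.append({"month": month, "ip_type": ip_type, "fraud": is_fraud})

def model_flags(txn):
    # The rule a model would learn from historical data: proxy IP => fraud.
    return txn["ip_type"] == "proxy"

def fraud_recall(rows):
    """Fraction of actual fraud cases the model catches."""
    frauds = [t for t in rows if t["fraud"]]
    return sum(model_flags(t) for t in frauds) / len(frauds)

shuffled = data[:]
random.shuffle(shuffled)
random_holdout = shuffled[: len(shuffled) // 5]               # naive 20% random split
time_holdout = [t for t in data if t["month"] >= DRIFT_MONTH]  # most recent months only

print(f"recall on random holdout: {fraud_recall(random_holdout):.2f}")
print(f"recall on recent data:    {fraud_recall(time_holdout):.2f}")
```

The random holdout makes the stale rule look healthy; the time-ordered holdout shows it catching nothing after the drift.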
Setting up a new customer imposes a lot of human cost, in both original model construction and ongoing maintenance. Up front, it requires heavy investment in creating high-quality datasets for model training and in data pipelines that make sure the same data is available once the model goes into production. All of this cost and effort is borne by the Fraudacity data science team - we’re the experts and we control the model, which means we need to figure out what fraud looks like at our new customer.
Furthermore, no scenario is truly one-size-fits-all, and many customers will have different needs. The fraud risks at a tiny customer selling home-made vegan ice cream are commensurately tiny - but their risk tolerance is much lower, as a single bad fraud case could put our fearless entrepreneurs out of a home. Other companies may face different types of fraud - e.g., an online chemical supplier with many “straw buyers” attempting to buy small (legal) quantities of precursor chemicals for drugs or explosives. And then there are edge cases such as a data breach, which floods some clients with fake orders that look legitimate because the customer data used is real. These take time and keep getting harder and harder to solve.
As models become more mature…each new edge case becomes more and more costly to address, while delivering value to fewer and fewer relevant customers.
The maintenance cost is a huge pain as well, as all of this can change at any time. We built some neat risk features that incorporate ZIP-code-level data from the Census, which is wicked cool. Well, unfortunately, next year when the administration pulls all of its census data offline, our model will break and our customers are going to call us up hopping mad. Fraudtomatron has always used IP address to identify risk, but one day all the Russian scammers switch their proxies from Lagos to Lincoln, Nebraska, and we clear a bunch of dodgy transactions. Keeping on top of all of this conscientiously requires not just human time but human attention, as the issues that can pop up are staggering in both number and variety.
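Part of that vigilance can be automated with simple input monitoring. A sketch (the feature, geo labels, and alert threshold are all hypothetical) that pages a human when the distribution of a categorical input, like IP geolocation, shifts sharply between two time windows:

```python
from collections import Counter

ALERT_THRESHOLD = 0.25  # hypothetical: page a human above this much shift

def category_shares(values):
    """Map each category to its fraction of the sample."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

def max_share_shift(baseline, current):
    """Largest absolute change in any category's share between two samples."""
    base, cur = category_shares(baseline), category_shares(current)
    return max(abs(cur.get(c, 0.0) - base.get(c, 0.0))
               for c in set(base) | set(cur))

# Geo lookups for the IPs behind last month's vs. today's risky traffic.
last_month = ["Lagos"] * 80 + ["Lincoln, NE"] * 20
today = ["Lagos"] * 10 + ["Lincoln, NE"] * 90

shift = max_share_shift(last_month, today)
if shift > ALERT_THRESHOLD:
    print(f"drift alert: a feature's distribution moved by {shift:.0%}")
```

The monitor doesn’t fix anything by itself - it just converts “the model silently went stale” into “a human gets told to look”, which is exactly the scarce resource being spent.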
The Fraudacity of Hope
As you can probably see by now, the economics of running Fraudacity are much more challenging than traditional software or SaaS companies. There’s much more pressure on cost from the hybrid human/software component (not to mention monster server bills), and much more pressure on pricing from the essentially commoditized nature of the service we’re providing. It’s easy for us to be outcompeted or for us to just blow it if we scale our customer base beyond our ability to pay attention.
It’s going to be interesting to see how this plays out in the future. As many of these companies get more mature, I would expect to see many of them transition into actual consulting firms where they sell the labor and their customers bear the cost of operating complex models. To make a profit without that turn to consulting will probably require deep attentional focus on scaling, both to run their models with low computational costs and to squeeze every bit of efficiency out of the scarce resource of human attention.
As these companies start to shake out, die, or prosper, I expect the winners to be not those with the fanciest tech but those that focus most carefully on using humans’ time and attention.