Real-estate lending, investment and advisory firm Greystone Labs is Skyline AI’s newest partner. The two announced a data-sharing agreement in July that will

What is vertical AI?

In a recent talk at AI By the Bay, I laid out a four-factor definition of what I consider to be a vertical AI startup.

1. Full stack products

Provide a full-stack, fully integrated solution to the end customer’s problem, from the interface that solves for the need all the way down the stack to the functionality, models, and data that power the interface. This ecosystem is much more defensible over time than proprietary data or models alone. Designing the right product interface requires subject matter expertise, and owning the interface allows you to instrument it and gather proprietary data. You can then build models that drive high-value functionality, creating a virtuous cycle between the interface and the data. You control the ‘data value chain’ and gain pricing power and defensibility over time.

Example: Blue River builds agriculture equipment that reduces chemical use and saves costs. They ‘personalize’ the treatment of each individual plant, applying herbicides only to the weeds and not to the crop or soil. They use computer vision to identify each plant, machine learning to decide how to treat it, and robotics to take the corresponding precise action. Blue River is defensible because it’s incredibly hard to replicate such a complex full-stack product, from gathering the training data for the various models, to incorporating the models alongside robotics into the machines, to integrating these machines into existing farm equipment and distribution channels.

2. Subject matter expertise

Product and sales at vertical AI startups benefit from bringing in key leaders from the industry early in the life of the business. Building full-stack products requires deep subject matter expertise. Selling these products requires trust, respect, and relationships within the industry. Teams that manage to combine subject matter and technical expertise are able to model the domain richly and drive the kind of innovation that comes from thinking outside the box by understanding what the box is. Teams that come with a domain-first approach tend to get stuck inside the box, and teams that come with a tech-first approach tend to get stuck out in left field. There is also a major issue with team evolution: if you’re unable to set the joint domain-tech DNA early, then one side dominates, and it becomes a real challenge to bring in world-class people from the other side, as they will never have the same level of authority and respect within the company.

Example: the Zymergen leadership team is a great mix of strong capabilities targeted at industrial biology: commercial (CEO Joshua Hoffman), scientific (CSO Zach Serber), and data (CTO Aaron Kimball). The harder it is to assemble the mixed team and set the company’s joint DNA early on, the more defensible the business.

3. Proprietary data

The technology market is hypercompetitive. As soon as you demonstrate good results, many people will copy you almost instantly if they can. Defensible AI businesses are built on proprietary data that is difficult to replicate. This happens in two phases: bootstrapping and compounding. In the bootstrapping phase, you build a unique set of training data by aggregating publicly available data and enriching it in some challenging way, running simulations to generate synthetic data, or doing BD deals to gather a set of internal company data. Once you have bootstrapped, you build a ‘data flywheel’ into your products, so that you capture totally unique data over time from how your product is used. That data capture is designed precisely to serve the needs of your models, which are designed to serve the needs of the product functionality, which is designed to meet the needs of the customer. This data value chain ensures that the customer’s motivation is aligned with your motivation to compound the value of your proprietary dataset.

Example: Merlon Intelligence gathers training data from compliance analyst interactions with a financial crimes investigation dashboard. Gathering the data requires a full-stack product where the interface is designed and instrumented to gather data that feeds into the models. It’s a learning-to-rank setup: learning to rank for risk, just as the Facebook newsfeed learns to rank for engagement. Banks take on a great deal of operational risk when deploying new financial crimes compliance software, so it’s a challenge to penetrate the market. The harder it is to gather your data, and the more it’s intertwined with the product and go-to-market strategy, the more defensible the business.
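To make the learning-to-rank idea concrete, here is a minimal sketch of how logged analyst decisions could become training pairs for a risk ranker. It is not a description of Merlon’s actual system: the two features (a normalized transaction amount and a high-risk-jurisdiction flag), the escalate/dismiss labels, and the case names are all hypothetical, and the model is a plain linear scorer trained with a pairwise logistic loss.

```python
import math
import random


def train_pairwise_ranker(pairs, n_features, epochs=200, lr=0.1, seed=0):
    """Fit linear weights so each preferred item outscores its counterpart.

    pairs: list of (preferred_features, other_features) tuples.
    Minimizes the pairwise logistic loss log(1 + exp(-margin)), where
    margin = score(preferred) - score(other).
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for preferred, other in pairs:
            diff = [p - o for p, o in zip(preferred, other)]
            margin = sum(w * d for w, d in zip(weights, diff))
            # Negative gradient of the loss w.r.t. the margin: sigmoid(-margin).
            # The margin is clamped only to keep exp() from overflowing.
            step = 1.0 / (1.0 + math.exp(min(margin, 50.0)))
            weights = [w + lr * step * d for w, d in zip(weights, diff)]
    return weights


def score(weights, features):
    """Linear risk score for one case."""
    return sum(w * x for w, x in zip(weights, features))


def rank_by_risk(weights, cases):
    """Return case ids ordered from highest to lowest predicted risk."""
    return sorted(cases, key=lambda cid: score(weights, cases[cid]), reverse=True)


# Hypothetical interaction log: [normalized_amount, high_risk_jurisdiction].
escalated = [[0.9, 1.0], [0.7, 1.0], [0.8, 1.0]]  # analyst escalated these
dismissed = [[0.2, 0.0], [0.3, 0.0], [0.1, 1.0]]  # analyst dismissed these

# Every escalated case is treated as riskier than every dismissed case.
pairs = [(e, d) for e in escalated for d in dismissed]
weights = train_pairwise_ranker(pairs, n_features=2)

# Rank a queue of new, unseen (and equally hypothetical) cases.
new_cases = {
    "case_a": [0.95, 1.0],
    "case_b": [0.15, 0.0],
    "case_c": [0.50, 0.0],
}
ranking = rank_by_risk(weights, new_cases)
print(ranking)
```

The point of the sketch is the data flow rather than the model: the training pairs exist only because the interface records what analysts do with each case, which is why the dataset is hard to obtain without owning the product.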

4. AI delivers core value

Amazon, Netflix, and Facebook all use AI to drive very large percentage lifts in revenue and engagement. That’s valid and awesome, but AI is not the core value of their products: Amazon is an ecommerce store, Netflix is a video entertainment company, and Facebook is a social media company. Back when we first started Data Collective, we called this scenario the ‘data sidecar’, like those really cool old motorcycles with an attached sidecar. AI is not the core value, but an attachment that optimizes the core value. By contrast, vertical AI solutions are about AI unlocking entirely new opportunities rather than just optimizing existing ones.

Example: Opendoor’s entire business model for making a more liquid market in real estate is predicated on the notion that they can use models to price a home so accurately that they can make an offer immediately. The more AI delivers the product’s core value by unlocking a totally new opportunity, through rich domain modeling within the vertical and models built on top of proprietary data gathered via the product itself, the more defensible the business.

Read my full post on vertical AI