This is part one of a three-part AI-Core Insights series.
Foundation models have been rapidly adopted in AI-based products, from image generation (DALL-E, Midjourney, and Stable Diffusion) to language (BERT, GPT-x, FLAN). The introduction of GPT-4 has further expanded the potential for multi-modal applications. Amid this progress, a lively discussion has emerged within the research community about the merits of open-source versus closed-source models. As an AI PM at Microsoft for Startups, I have had the privilege of collaborating with a select group of AI-focused startups as part of the AI Grant partnership announced last year. I have been astounded by the rapid pace of decision-making and innovation that characterizes the adoption of foundation models.
In this series of articles, I aim to share insights from AI startup trailblazers that may prove valuable for your own product development. In this first installment, we will explore how startups are navigating the decision of which models to utilize.
Building with foundation models
So, what is a foundation model? In the last few years, we have seen large AI models trained on vast corpora of data, often with self-supervised learning, that can power a wide variety of downstream tasks. An example is GPT-3, a large language model that can summarize text on almost any topic, generating its output probabilistically.
With foundation models, I have often seen a model selection problem emerge between open-source and closed-source models. Why is model source relevant here? As in the software world, a model can be open-source (such as Stable Diffusion) or closed-source (like DALL-E). When choosing between models, several startups have weighed the trade-off on this parameter. This conversation is rooted in topics like responsible AI and empowering more research. People smarter than me are refining the paradigm of choice every day. As that continues, my question for startup builders is this: As a user, is the choice between an open-source and a closed-source model the real question for you? Or is it the suitability of a model for your use case?
As I observe these startups, I consistently see that model allegiance follows quality and fit rather than source. For example, the open-source release of Stable Diffusion gave rise to a sizable number of such startups last year.
Making the right choice for you
As a startup, how do you choose the best model to use? The output style of different models is a key consideration. As I heard from one of the startups building an object prototype, the images generated by DALL-E (closed-source) looked artistic, while those generated by Midjourney (closed-source) appeared animated. The images generated by Stable Diffusion (open-source) were realistic, and hence suited the business use case of prototype creation better than the other two. For another user creating an NFT, DALL-E might be a better choice.
Taking a step away from image models and towards language models, I have seen GPT-3 and Codex (both closed-source) serve as startup powerhouses. Our previously featured startup Trelent based their docstring generation product on Codex. These choices, over the potential alternatives of CodeGen or GPT-J, highlighted how model quality made these models a better fit for the startup use case. In parallel, GPT-3 continues to power innovations like this and this as it gets improved, inspiring further research in open and closed source communities.
An ever-changing AI landscape
In addition to the question of the right foundation model, startups are thinking about using foundation models as input to each other, to further refine the outputs. A few examples of this:
- Startups are leveraging GPT-3 to create prompt options for a user using their Stable-Diffusion-based text-to-image app. Generators like this can generate prompt ideas, making it easier for your users to brainstorm with AI and get creative results like in this example.
- Implementations like this bring Cognitive Search and GPT-3.5 together to power a conversational AI experience over your own data. (Both Cognitive Search and GPT-3.5 are closed-source.)
- LLM chainers like LangChain are helping make AI responsible by bringing self-critique chains that improve the quality of responses.
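To make the chaining idea more concrete, here is a minimal Python sketch of a language model feeding prompts to an image model, with one self-critique pass in between. It is an illustration under assumptions, not how any of these startups or LangChain implement it: `fake_llm` and `fake_image_model` are hypothetical stand-ins for real model calls, and every function name and prompt wording below is made up for the example.

```python
from typing import Callable

# Hypothetical type aliases: any function that maps text to text (a language
# model) or text to image bytes (an image model) can be plugged in.
TextModel = Callable[[str], str]
ImageModel = Callable[[str], bytes]


def suggest_prompts(llm: TextModel, idea: str, n: int = 3) -> list[str]:
    """Ask the language model for up to n image prompts, one per line."""
    request = f"Write {n} detailed image prompts, one per line, for: {idea}"
    lines = [line.strip() for line in llm(request).splitlines()]
    return [line for line in lines if line][:n]


def critique_and_revise(llm: TextModel, draft: str) -> str:
    """One self-critique pass: the model reviews the draft, then rewrites it."""
    critique = llm(f"List problems with this image prompt:\n{draft}")
    revised = llm(
        "Rewrite the prompt to fix these problems.\n"
        f"Prompt: {draft}\n"
        f"Problems: {critique}"
    )
    return revised.strip()


def generate_images(
    llm: TextModel, image_model: ImageModel, idea: str, n: int = 3
) -> list[tuple[str, bytes]]:
    """Chain the models: suggest prompts, refine each one, then render it."""
    prompts = [critique_and_revise(llm, p) for p in suggest_prompts(llm, idea, n)]
    return [(p, image_model(p)) for p in prompts]


# Stand-in models (assumptions) so the sketch runs without API keys.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Write"):
        return "a chair, studio photo\na chair, line drawing\na chair, watercolor"
    if prompt.startswith("List problems"):
        return "background is unspecified"
    # Rewrite request: the second line carries the original draft.
    draft = prompt.splitlines()[1].removeprefix("Prompt: ")
    return draft + ", plain background"


def fake_image_model(prompt: str) -> bytes:
    # A real image model would return image data; this just echoes the prompt.
    return prompt.encode("utf-8")


for refined_prompt, image in generate_images(fake_llm, fake_image_model, "a chair", n=2):
    print(refined_prompt, len(image))
```

Because each model is passed in as a plain function, swapping an open-source model for a closed-source one (or the reverse) changes only the function you hand to `generate_images`, not the chain itself.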
Whether startups should be moving towards open-source or closed-source models is a question only they can answer. I would, instead, shift the conversation and ask this: upon internal benchmarking, which specific model do you see as the best performer for your use case? For the broader question of where this is heading, the landscape is still changing. I cannot wait to see where we converge as an industry – the possibilities are exciting.
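If you have not set up internal benchmarking yet, a small harness can be enough to start. The Python sketch below is one possible shape for it, under stated assumptions: each candidate model is wrapped as a function from prompt to text, the test cases are prompt and reference-answer pairs you write yourself, and `exact_match` is a deliberately crude scorer that you would replace with a metric suited to your use case. The model names and cases shown are hypothetical.

```python
from statistics import mean
from typing import Callable

Model = Callable[[str], str]          # prompt -> model output
Scorer = Callable[[str, str], float]  # (output, reference) -> score in [0, 1]


def exact_match(output: str, reference: str) -> float:
    """Crude scorer: 1.0 if the texts match ignoring case and outer whitespace."""
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0


def benchmark(
    models: dict[str, Model],
    cases: list[tuple[str, str]],
    scorer: Scorer = exact_match,
) -> dict[str, float]:
    """Return the mean score of each model over all cases (cases must be non-empty)."""
    return {
        name: mean(scorer(model(prompt), reference) for prompt, reference in cases)
        for name, model in models.items()
    }


def best_model(scores: dict[str, float]) -> str:
    """Name of the highest-scoring model (the first one listed wins a tie)."""
    return max(scores, key=scores.get)


# Hypothetical cases and stand-in models, only to show the harness running.
cases = [("2+2", "4"), ("capital of France", "Paris")]
models = {
    "model_a": lambda p: {"2+2": "4"}.get(p, ""),
    "model_b": lambda p: {"2+2": "4", "capital of France": "paris"}.get(p, ""),
}

scores = benchmark(models, cases)
print(scores, best_model(scores))
```

The harness treats open-source and closed-source models identically, which is the point: the comparison is on your cases and your metric, not on the source.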
For more tips on leveraging AI for your startup and to start building on industry-leading AI infrastructure, sign up today for Microsoft for Startups Founders Hub.