
Thinking Machines Labs Lifts the Curtain: Multimodal AI for “Collaborative General Intelligence”

  • Writer: Niv Nissenson
  • Jul 16
  • 5 min read

Just weeks after shattering records with a $2 billion seed round, the largest ever by a wide margin, Thinking Machines Labs, the secretive San Francisco startup founded by ex-OpenAI CTO Mira Murati, is finally starting to reveal what it's building. Our assessment that the company is building its own ChatGPT looks largely on point, although its ambitions appear to reach beyond that.


In a post on X (formerly Twitter), Murati laid out the vision:

“We’re building multimodal AI that works with how you naturally interact with the world — through conversation, through sight, through the messy way we collaborate.” (full tweet below)


The first product, expected in the next couple of months (a strikingly fast timeline), will include a significant open-source component designed to support researchers and startups building custom models. Murati also promised to share the company's best science to help the broader AI community understand frontier systems. It's unclear how multimodal this first product will actually be.


What is “multimodal AI”?

Simply put, multimodal AI systems can process and combine multiple types of input (text, images, and in some cases video or audio) to understand or generate responses. For example, OpenAI's GPT-4V (vision) can look at an image and answer questions about it, while Google DeepMind's Gemini is designed to handle both language and image reasoning.
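
In practice, "multimodal" simply means sending more than one kind of input in a single request. As a minimal sketch of what that looks like today, here is an example call using OpenAI's Python SDK (v1.x) with a vision-capable model; the model name, prompt, and image URL are illustrative placeholders, and this is not Thinking Machines' unreleased product or API:

```python
# Minimal sketch of a multimodal (text + image) request, assuming the
# OpenAI Python SDK v1.x and a vision-capable model such as gpt-4o.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                # text and image are combined in one message
                {"type": "text", "text": "What does this chart show, and what trend stands out?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
)

# The model reasons over both inputs and answers in text
print(response.choices[0].message.content)
```

The table below compares how the major players approach this kind of multimodality.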


| Feature / Focus | Thinking Machines Labs (planned) | GPT-4V (OpenAI) | Google Gemini |
| --- | --- | --- | --- |
| Core approach to multimodality | Building native multimodal systems that integrate conversation, sight, and collaborative tasks in a unified architecture | Primarily a text model with image understanding added on top (vision is an “add-on”) | Native multimodal model, but largely still outputs text via tightly controlled pipelines |
| Openness & customization | Planning significant open-source components, with tools to help researchers and startups build custom models | Fully closed source; no user-level customization of underlying models | Closed model; limited external customization, mostly via API tuning |
| Collaboration / agentic interactivity | Aims to advance “collaborative general intelligence”: systems that work alongside humans in messy, natural workflows | Designed as a conversational assistant with limited autonomous collaborative planning | More integrated multi-turn reasoning, but still user-as-prompt-master rather than equal collaborator |

Thinking Machines Labs is less than a year old, yet it’s already making waves, largely thanks to its pedigree: the startup was founded by ex-OpenAI CTO Mira Murati and is stacked with former OpenAI engineers and researchers. Their own team chart (see below) reads like an OpenAI alumni roster, underscoring just how much this ambitious new player draws from the original builders of ChatGPT.


TheMarketAI.com Take: It's unclear whether Thinking Machines' “first product” will truly be a fully multimodal system; a breakthrough of that kind typically takes longer than a few months to build. More likely, we'll see them unveil their own foundational LLM, laying the groundwork for the richer multimodal collaboration they promise.

Meanwhile, on prediction markets like Polymarket, Google still leads the pack for “Best AI model by end of 2025,” with OpenAI trailing at 26% and xAI at 15%. We’ll be watching closely to see if Thinking Machines earns a spot on the scoreboard in the coming months.






Thinking Machines founding team (source: https://thinkingmachines.ai):

| Name | Bio |
| --- | --- |
| Alex Gartrell | Former Leader of Server Operating Systems at Meta, expert in Linux kernel, networking, and containerization. |
| Alexander Kirillov | Co-creator of Advanced Voice Mode at OpenAI and Segment Anything Model (SAM) at Meta AI, previously multimodal post-training lead at OpenAI. |
| Andrew Gu | Previously working on PyTorch and Llama pretraining efficiency. |
| Andrew Tulloch (Chief Architect) | ML systems research and engineering, previously at OpenAI and Meta. |
| Barret Zoph (CTO) | Formerly VP of Research (post-training) at OpenAI. Co-creator of ChatGPT. |
| Brydon Eastman | Formerly post-training research at OpenAI, specializing in human and synthetic data, model alignment and RL. |
| Chih-Kuan Yeh | Previously building data for Google Gemini and Mistral AI. |
| Christian Gibson | Formerly infrastructure engineer at OpenAI, focused on supercomputers used in training frontier models. |
| Devendra Chaplot | Founding team member & Head of Multimodal Research at Mistral AI, co-creator of Mixtral and Pixtral. Expert in VLMs, RL, & Robotics. |
| Horace He | Interested in making both researchers and GPUs happy, formerly working on PyTorch compilers at Meta, co-creator of FlexAttention/gpt-fast/torch.compile. |
| Ian O'Connell | Infrastructure engineering, previously OpenAI, Netflix, Stripe. |
| Jacob Menick | ML researcher, led GPT-4o-mini at OpenAI, previously contributed to the creation of ChatGPT and deep generative models at DeepMind. |
| Joel Parish | Security generalist, helped ship and scale the first versions of ChatGPT at OpenAI. |
| John Schulman (Chief Scientist) | Pioneer of deep reinforcement learning and creator of PPO, cofounder of OpenAI, co-led ChatGPT and the OpenAI post-training team. |
| Jonathan Lachman | Operations executive, former head of special projects at OpenAI and White House national security budget director. |
| Joshua Gross | Built product and research infrastructure at OpenAI, shaping ChatGPT's learning systems and GPU fleet; previously on product infra at Meta. |
| Kevin Button | Security engineer focused on infrastructure and data security, formerly at OpenAI. |
| Kurt Shuster | Reasoning at Google DeepMind, full-stack pre-training and inference at Character.AI, and fundamental dialogue research at Meta AI. |
| Kyle Luther | ML researcher, previously at OpenAI. |
| Lia Guy | Previously at OpenAI and DeepMind, working on model architecture research. |
| Lilian Weng | Formerly VP of Research (safety) at OpenAI. Author of Lil'Log. |
| Luke Carlson | Former ML Engineer in Apple's Machine Learning Research group, designed ML frameworks for model orchestration, speech generation, private federated learning, and image diffusion. |
| Luke Metz | Research scientist and engineer, previously at OpenAI and Google Brain. Co-creator of ChatGPT. |
| Mario Saltarelli | Former IT and Security leader at OpenAI. |
| Mark Jen | Generalist, most recently infra @ Meta. |
| Mianna Chen | Previously at OpenAI and Google DeepMind. Led advanced voice mode, 4o, 4o-mini, o1-preview, and o1-mini launches. |
| Mira Murati (CEO) | Former CTO of OpenAI, led OpenAI's research, product and safety. |
| Myle Ott | AI researcher, founding team at Character.AI, early LLM lead at Meta, creator of FSDP and fairseq. |
| Naman Goyal | Previously distributed training and scaling at FAIR and GenAI @ Meta, most recently Llama pretraining. |
| Nikki Sommer | Formerly VP HRBP at OpenAI and Director, HRBP at Twitter. |
| Noah Shpak | ML Engineer, loves making data go vroom while GPUs go Brrr. |
| Pia Santos | Executive Operations Leader, previously at OpenAI. |
| Randall Lin | Previously babysitting ChatGPT at OpenAI and co-tech leading 'the Twitter algorithm' at X. |
| Rowan Zellers | Formerly at OpenAI, working on realtime multimodal post-training. |
| Ruiqi Zhong | Passionate about human+AI collaboration, previously PhD at UC Berkeley, working on scalable oversight and explainability. |
| Sam Schoenholz | Led the reliable scaling team and GPT-4o optimization at OpenAI. Previously worked at the intersection between Statistical Physics & ML at Google Brain. |
| Sam Shleifer | Research engineer specializing in inference, previously at Character.AI, Google DeepMind, FAIR, HuggingFace. |
| Saurabh Garg | Researcher, formerly working on all things multimodal at Mistral AI. Deep into the magic of pretraining data and loving every byte of it! |
| Shaojie Bai | Avid ML researcher working to make audio-visual models better and faster, previously at Meta. |
| Stephen Roller | Previously full-stack pre-training at DeepMind, Character.AI, and Meta AI. |
| Yifu Wang | Passionate about novel ways of overlapping/fusing GPU compute and communication. Formerly PyTorch @ Meta. |
| Yinghai Lu | ML system engineer, formerly led various inference efforts at OpenAI and Meta. |

