Women Who AI
Posts
Let's Talk About the New OpenAI Releases

Let's Talk About the New OpenAI Releases

Leandra T
April 21, 2025

Cutting through the noise in AI

Welcome to the Women Who AI Newsletter, your weekly update on what actually matters in AI when you’re focused on building and scaling startups.

Was this forwarded to you? We send out this newsletter every Monday morning. Click here to subscribe and join our community of founders shaping the future of AI.

Deep Dive: New OpenAI Releases

Last week OpenAI released new AI models and an open-source coding assistant. Let’s break it down.

GPT-4.1

GPT-4.1 was released last week, but you can’t access it in the interface yet. It was released API-only. Despite the name, this model is newer than GPT-4.5, which is being deprecated.

There’s a hack if you want to try GPT-4.1 now; visit the OpenAI Playground. This is an interface for developers to try out API settings before adding them to their code, but you can also use it to test new models without any setup.

OpenAI Sandbox Interface

There are three AI models in the 4.1 family: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. These models outperform their previous GPT-4o models in coding abilities, accurately following instructions, and processing much larger amounts of information at once (up to 1 million tokens, equivalent to about 750,000 words, or roughly 8-10 full-length novels). That’s enough to let you process every email, document, and presentation from a typical startup's first year of operation in a single request. The models feature updated knowledge up to June 2024 and are designed to be not only more powerful but also more cost-effective and responsive than their predecessors.

The family includes options for different needs: GPT-4.1 for maximum capability, GPT-4.1 mini as a balance of performance and speed (it matches or exceeds GPT-4o while being nearly twice as fast and 83% less expensive), and GPT-4.1 nano as their fastest and cheapest model ever for tasks that require quick responses.

For founders who have built products on the 4o API, you can switch to 4-1 to both save money and improve performance in one go.

Reasoning Models: OpenAI o3 and o4-mini

OpenAI recently introduced two powerful "reasoning" models called o3 and o4-mini. These models are available in the ChatGPT interface today. “Reasoning” means that these models actually pause to think before answering your questions.

These models are more “agent” than “chatbot”. They determine which tools to use and how to combine them to solve your specific problem. The tools they have access to include web search, Python analysis, visual reasoning, and image generation, which they use and combine to solve problems, typically delivering results in under a minute.

Which Model, For What?

OpenAI o3 ($10/million input tokens, $40/million output tokens) is ideal when you need deep analysis of complex business problems. It excels at tasks requiring multi-faceted thinking - like evaluating potential market strategies, analyzing competitive landscapes, or interpreting visual data from presentations and reports.
OpenAI o4-mini ($1.10/million input tokens, $4.40/million output tokens) offers impressive reasoning at a significantly lower cost, making it perfect for frequent use or when you need to analyze large volumes of business data. Though smaller than o3, it still performs remarkably well on analytical tasks and non-technical challenges.

There are still challenges. Recent reporting from TechCrunch reveals that OpenAI's new reasoning models actually hallucinate more frequently than their predecessors. According to internal testing, o3 hallucinated on 33% of questions about people (double the rate of previous models), while o4-mini performed even worse at 48%. Researchers at Transluce found that o3 sometimes fabricates actions it claims to have taken, and OpenAI acknowledges in their technical report that "more research is needed" to understand why hallucinations increase as reasoning capabilities scale up.

Codex

On the same day as the reasoning models, OpenAI released Codex CLI—a practical coding tool that runs directly in your terminal. It's a lightweight agent that can read your code, make changes, and run commands. I’ve been using it and, while changes can take a long time, it’s been particularly useful as an educational resource when trying to understand a new codebase.

Try it out by following the setup instructions.

If you’re new to using the terminal, Terminus is a game from MIT that can serve as a gentle introduction.

Source: https://github.com/openai/codex

To spur adoption, OpenAI plans to distribute $1 million in API grants to eligible software development projects, awarded in $25,000 blocks of API credits. Apply here.

Hackathons

Ready to build that product you've been dreaming about? Check out these upcoming hackathons!

If you'd like to find a Women Who AI team for any event, reply to this email, and we'll connect everyone interested.

AI Build Jam | SF | Sun, April 27 | Event Link

r/AI_Agents 100k Hackathon! | Virtual | Wed, May 14 | Event Link

Plus, not a hackathon, but check out Redefine Possible – Women Pioneering the Future of AI if you’re in NYC this Thursday, April 24 - Event Link

Job Opportunity

Location: San Francisco, CA | Role: Full-Time | Stage: Profitable Startup | Focus: AI x Legal Tech

About Formally

Formally is building the AI-powered legal stack of the future—starting with immigration. We recently became the first AI platform adopted by AmLaw 100 firms for immigration work, and we’ve just achieved profitability with an exceptionally lean, mission-driven team.

We’re now seeking a Senior Founding Engineer to play a pivotal role in shaping our technical vision, leading execution, and developing transformative tools that make legal work more efficient, accessible, and scalable.

Tech Stack

Frontend: Next.js
Backend: Node.js + Express
Database: PostgreSQL
Infrastructure: Google Cloud Platform

You Might Be a Fit If…

5+ years of professional software engineering experience, with experience at early-stage startups
Proven track record of shipping production-grade code in fast-moving environments
Deep experience with modern full-stack development (React/Next.js, Node.js, REST APIs)
Strong architectural thinking and ability to design scalable systems
Bonus: familiarity with GCP, experience in legal tech or B2B SaaS, interest in immigration or legal tech

Apply at https://form.typeform.com/to/d9RKSjz6

Reply With Questions

We want this newsletter to address the real challenges you're facing. Is there a specific AI development you'd like explained? Jargon we included but didn't properly explain? A business problem you're trying to solve with AI? Reply directly to this email with your questions, and we'll tackle them in next week's edition.

If you found value in today's newsletter, please consider forwarding it to other women in your network who are building, or thinking about building, in the AI space. The more we grow this community, the stronger our collective impact becomes.

Here's to building the future of AI, together.

Lea & Daniela