OpenAI, which has at least 800 million weekly users, recently released its Teen Safety Blueprint, a framework designed to protect young users from the potential harms of AI chatbots. In the months leading up to the announcement, the company faced wrongful death lawsuits, investigations by state attorneys general, and some evidence that ChatGPT was affecting a subset of users in deeply troubling ways. According to OpenAI’s own statistics from October 2025, about 560,000 users each week showed signs consistent with psychosis or mania, more than 1.2 million discussed suicide, and a similar number exhibited heightened emotional attachment to the chatbot. OpenAI deserves credit for being one of the only major companies to publish numbers at this level of detail, which helps researchers and other stakeholders grasp the scale of potential harms. My sense is that similar patterns exist on competing platforms, even if they have not disclosed comparable data.

Stakeholder Concerns Are Growing

In June 2025, the American Psychological Association (APA) issued a health advisory stating that AI systems used by adolescents should be designed to protect their mental and emotional health. Furthermore, the advisory asserted that these systems should not encourage excessive use or function as a replacement for human relationships. A second advisory from the APA in November 2025 urged families and clinicians not to treat chatbots or wellness apps as substitutes for qualified mental health care. It also warned against allowing unhealthy dependencies to form between users and AI systems, and called for specific safeguards for children, teens, and other vulnerable populations. Together, these advisories signal a clear consensus among mental health experts: while these AI tools can be beneficial in certain ways, they must not be allowed to take on the role of therapist, best friend, or crisis counselor for young people.

The situation is forcing the industry to confront long‑standing issues such as anthropomorphization, sycophancy, hallucinations and factual errors, bias and gaps in training data, data privacy risks, algorithmic reinforcement of compulsive use, and the possibility of grooming or predatory contact routed through AI intermediaries. OpenAI’s blueprint will influence legislators, regulators, and competitors who are thinking about baseline protections for minors around the world, but it must be analyzed to determine where its protections are sufficient, where they fall short in real‑world use, and what additional technical, policy, and educational measures are still needed to keep youth safe as the technology evolves.

Finally, I am increasingly being asked to give talks and trainings on AI safety and security, which underscores just how urgently people are looking for grounded guidance in this space. It is therefore essential that we help decisionmakers, educators, and families cut through the marketing language, see clearly what is actually changing under the hood, and identify concrete questions they should ask of AI providers and policymakers as these AI safety features continue to be rolled out.

Components of the Teen Safety Blueprint

OpenAI’s Teen Safety Blueprint, announced on November 6, 2025, bundles several major changes into a single framework.

Behavioral Age Estimation

Instead of asking all users to upload their ID, OpenAI is rolling out an invisible age prediction system that uses behavioral signals such as writing style, topic choices, activity timing, and account metadata to estimate whether a user is likely under 18. When the system cannot confidently determine that someone is an adult, it defaults them into the teen experience.
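To make that design concrete, here is a minimal sketch of what “default to the teen experience when uncertain” logic might look like. Everything here is illustrative: OpenAI has not published its actual model, feature set, or thresholds.

```python
# Hypothetical age-estimation gate. The signals, weights, and threshold are
# placeholders; a real system would use a trained classifier.
from dataclasses import dataclass

@dataclass
class BehavioralSignals:
    writing_style_score: float    # stylometric features mapped to [0, 1]
    topic_profile_score: float    # topic choices skewing adult vs. teen
    activity_timing_score: float  # e.g., activity during school hours
    account_age_days: int

ADULT_THRESHOLD = 0.90  # assumed: require high confidence to grant adult mode

def estimate_adult_probability(s: BehavioralSignals) -> float:
    """Stand-in for a trained model; hand-weighted for illustration."""
    base = (0.5 * s.writing_style_score
            + 0.3 * s.topic_profile_score
            + 0.2 * s.activity_timing_score)
    # Very new accounts provide weaker evidence in either direction.
    confidence_scale = min(s.account_age_days / 30, 1.0)
    return base * confidence_scale

def select_experience(s: BehavioralSignals) -> str:
    """The key design choice: uncertainty resolves to the protective default."""
    if estimate_adult_probability(s) >= ADULT_THRESHOLD:
        return "adult"
    return "teen"  # default whenever adulthood cannot be confidently established
```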

Age‑Appropriate Protections

For users classified as teens, OpenAI provides an experience that will:

Refuse to depict suicide or self‑harm in any context, including fictional and creative scenarios.

Block graphic or immersive sexual and violent content, including any form of sexual or violent role‑play.

Decline to describe, encourage, or instruct on dangerous viral challenges or help minors obtain dangerous or illegal substances.

Avoid reinforcing harmful body ideals, such as rating appearance, comparing bodies, or coaching extreme dieting.

Prevent adult‑to‑teen conversations that could be predatory, and avoid flirtatious or romantic dialogue even when a teen explicitly asks for it.

Notably, these rules apply even when the teen frames the request as “fiction” or “creative writing,” which is a significant departure from OpenAI’s adult policies and a direct response to evidence that some young users have been able to obtain graphic or method‑specific self‑harm content by labeling it as part of a story or role‑play.
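One way to picture these rules is as a category-level policy where the “fiction” frame is deliberately inert. The category names and matching logic below are my own simplification; the production system presumably uses trained classifiers rather than a lookup table.

```python
# Illustrative encoding of the teen-mode content rules described above.
TEEN_BLOCKED_CATEGORIES = {
    "self_harm_depiction",        # refused even in fictional/creative framing
    "graphic_sexual_content",     # includes sexual role-play
    "graphic_violent_content",    # includes violent role-play
    "dangerous_viral_challenges",
    "dangerous_substance_access",
    "harmful_body_ideals",        # appearance rating, extreme-dieting coaching
    "flirtatious_or_romantic_dialogue",
}

def teen_mode_allows(category: str, framed_as_fiction: bool = False) -> bool:
    # The fiction flag is deliberately ignored: per the blueprint, a
    # "creative writing" frame does not unlock blocked categories for teens.
    return category not in TEEN_BLOCKED_CATEGORIES
```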

Addressing the Problem of Sycophancy

The new teen experience also targets sycophancy, where the AI model’s responses become overly agreeable, reassuring, and encouraging in ways that foster a deeper emotional attachment. OpenAI states that its age‑gated experience for ChatGPT is instructed not to play the role of intimate friend or soulmate, to avoid intense romantic or confessional role‑play, and to redirect conversations about self‑harm or severe distress to crisis resources rather than improvising “therapy.” This work follows years of warnings from external researchers and from competitors like Anthropic, which began systematic research into sycophancy and its mitigation in 2022 and 2023. It also follows an April 2025 incident in which an over‑tuned “friendly” update briefly made ChatGPT markedly more sycophantic; the update was quickly rolled back after immediate uproar from external users and internal employees.

Parental controls

In September 2025, OpenAI launched a parental controls system that allows parents and guardians to link their accounts to a teen’s account via an email invitation. Once supervision is set up, parents can do the following (a rough configuration sketch follows the list):

Set “quiet hours” when the teen cannot access ChatGPT

Disable specific features such as memory, chat history, voice mode, and image generation

Opt out of having their teen’s chats used to train or improve models

Receive alerts when automated systems and human reviewers believe a teen may be at risk of self‑harm
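What might these linked-account settings look like under the hood? Here is a rough sketch of such a supervision schema as a data structure. The field names, defaults, and quiet-hours logic are my own assumptions, not OpenAI’s actual implementation.

```python
# Illustrative supervision settings for a linked teen account. All field
# names and defaults are hypothetical.
from dataclasses import dataclass
from datetime import time

@dataclass
class TeenSupervisionSettings:
    linked_parent_email: str
    quiet_hours: tuple[time, time] | None = (time(22, 0), time(6, 0))
    memory_enabled: bool = False
    voice_mode_enabled: bool = False
    image_generation_enabled: bool = False
    exclude_chats_from_training: bool = True  # parental opt-out of model training
    self_harm_alerts_enabled: bool = True     # alerts to linked parents

def is_within_quiet_hours(settings: TeenSupervisionSettings, now: time) -> bool:
    """Quiet hours may wrap past midnight (e.g., 22:00-06:00), so handle
    both orderings."""
    if settings.quiet_hours is None:
        return False
    start, end = settings.quiet_hours
    if start <= end:
        return start <= now < end
    return now >= start or now < end
```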

I am often asked whether parental monitoring tools on major social media and gaming platforms let adults see the actual content of their child’s messages. The answer is no. They do not. That limit is intentional. Everyone – including young users – should have the right to privacy, freedom of expression, participation, and access to diverse online opportunities, and I appreciate that platforms respect these rights. Parental monitoring solutions, then, provide high‑level signals rather than full chat transcripts, and so parents must follow up regularly with their own questions, conversations, and checks whenever an alert raises concern.

Crisis Detection and Response

Finally, OpenAI has also introduced more intensive procedures for users in an apparent crisis. When teen chats contain signs of imminent self‑harm, moderators review them and, if thresholds are met, send alerts to linked parents by email, text, and push notification. If parents cannot be reached and systems indicate a credible risk of imminent harm, OpenAI states it may contact emergency services. Across all ages, ChatGPT is now more likely to surface local crisis lines when users mention suicidal thoughts.
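A simplified sketch of what such a tiered escalation policy could look like follows. The risk levels, the ordering of channels, and the action names are my assumptions; per OpenAI’s description, the real pipeline combines automated classifiers with human review before any parent or emergency contact occurs.

```python
# Hypothetical tiered crisis-escalation policy. Levels and actions are
# illustrative, not OpenAI's actual thresholds.
from enum import Enum

class RiskLevel(Enum):
    NONE = 0
    ELEVATED = 1   # user mentions suicidal thoughts; surface crisis lines
    IMMINENT = 2   # signs of imminent self-harm confirmed by human review

def escalate(risk: RiskLevel, parent_reachable: bool) -> list[str]:
    actions: list[str] = []
    if risk is RiskLevel.NONE:
        return actions
    # Across all ages: point users toward local crisis resources first.
    actions.append("show_local_crisis_lines")
    if risk is RiskLevel.IMMINENT:
        if parent_reachable:
            # Linked parents are alerted across multiple channels.
            actions += ["alert_parent_email", "alert_parent_sms",
                        "alert_parent_push"]
        else:
            # Last resort when credible imminent harm is indicated and
            # parents cannot be reached.
            actions.append("contact_emergency_services")
    return actions
```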

Other Industry Efforts on Youth and AI Safety

OpenAI is not alone in revisiting its approach to teens. Other platforms are experimenting with different combinations of design, policy, and verification changes, and these are worth analyzing to determine where improvements might be made across the board.

Character.AI

Character.AI has been at the center of multiple lawsuits, including those brought by the families of Sewell Setzer III and Juliana Peralta. In late October and November 2025, under mounting legal and regulatory pressure, Character.AI announced that minors would no longer be allowed to use open‑ended chat with AI characters. Instead, teens are now redirected to a Stories mode, a structured, choice‑based experience. Basically, the user chooses the character(s), genre, and premise, and then makes choices – perhaps akin to the Choose Your Own Adventure books we read as kids (or like Bear Grylls’ You vs. Wild interactive series on Netflix – which is really fun!). Essentially, Character.AI has concluded that the safest option for minors is to remove the conversational feedback loop entirely rather than trying to sanitize it. This was the nuclear option, and perhaps they had no choice given the legal fallout they faced.

Snapchat’s My AI

You may know that Snapchat’s My AI feature is prominently placed at the top of one’s chat list – which makes it feel just like another friend in that list of other (human) friends, instead of a separate tool. Some argue that this blurs the line between an AI assistant and an AI social companion, while others say that such placement allows for easy and quick access. Snap has added controls allowing parents to decide whether their teens can access My AI, updated its safety documentation for educators and families, and provides directions for deleting My AI data.

Google Gemini

Gemini has not yet introduced a branded, teen‑only conversational experience. Instead, it relies on content policies layered on top of the main models and broader platform‑level protections for minors in Google accounts. These include SafeSearch, stricter defaults in supervised accounts, and keyword or topic filtering for sensitive areas like self‑harm, sexuality, and substance use. In September 2025, Common Sense Media labeled it “high risk” for kids and teens based on some outputs it found during testing. In its November 2025 report, the organization stated that Gemini, ChatGPT, Claude, and Meta AI were all unsafe for teen mental health support.

Meta

In August 2025, Meta made changes to how its AI characters interact with minors. This included training chatbots not to engage teen users on “self‑harm, suicide, disordered eating, or potentially inappropriate romantic conversations,” and instead to route them toward expert resources. Additionally, Meta restricted teens’ access to a smaller set of AI characters focused on education and creativity, and removed access to highly sexualized, user‑generated AI personas. They have recently announced that these are interim steps and that a more comprehensive, long‑term update for teen safety involving their AI assistants and supervision tools will roll out in the next few months. Most recently, Meta has stated that they will give parents the ability to turn off one-on-one chats with AI characters entirely (or block specific chatbots), as well as set time limits.

Best Practices to Build Upon the Blueprint

Putting all of this into perspective, I do believe that OpenAI’s Teen Safety Blueprint serves as a solid baseline for the industry. What it covers should be treated as non‑negotiable: the minimum standard for AI safety initiatives by platforms. However, I’d like to recommend a few other considerations for OpenAI and others that can further improve the chances that youth use AI positively and productively, rather than disruptively or in unhealthy ways.

Age Assurance

I’ve testified in front of state governments on device-level age verification, and believe that age assurance is not only inevitable but also essential. In this context, privacy and security can be prioritized to the utmost through protections like secure enclaves, privacy-preserving age signals, and zero-knowledge proofs that verify one’s age bracket without exposing personal data to apps or platforms. Companies may decide that their AI tool is available for use by anyone of any age for information retrieval (e.g., research for a term paper). However, they also may want to ensure that a user is of a specific age before allowing voice mode and incorporating knowledge from historical chats, given that hearing a humanlike voice and feeling remembered contribute more to social and emotional bonding than simply reading text on a screen. These are the features implicated in the most tragic outcomes, where vulnerable teens developed intense attachments to chatbots that encouraged or failed to interrupt suicidal ideation.
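To make the privacy-preserving idea concrete, here is a minimal sketch in which a platform verifies a signed age-bracket attestation without ever seeing a birthdate or ID. For brevity it uses a shared-key HMAC as a stand-in for real cryptography; an actual deployment would rely on device secure enclaves and zero-knowledge proofs, and every name here is hypothetical.

```python
# Sketch: the platform learns only an age bracket, never the underlying
# identity document or birthdate. HMAC stands in for real attestation crypto.
import hashlib
import hmac
import json

ISSUER_KEY = b"demo-key-held-by-trusted-age-verifier"  # illustrative only

def issue_bracket_token(bracket: str) -> dict:
    """Run by a trusted verifier (e.g., at the device/OS level). Only the
    bracket ('under_13', '13_17', '18_plus') leaves that context."""
    payload = json.dumps({"age_bracket": bracket})
    sig = hmac.new(ISSUER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def platform_check(token: dict, required: str = "18_plus") -> bool:
    """Run by the AI platform: verifies the signature and learns the bracket,
    and nothing else."""
    expected = hmac.new(ISSUER_KEY, token["payload"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False
    return json.loads(token["payload"])["age_bracket"] == required

# Example: gate voice mode and persistent memory behind the adult bracket.
token = issue_bracket_token("13_17")
print(platform_check(token))  # False: teen bracket cannot unlock adult features
```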

Do We Need Hard Age Verification?

OpenAI currently relies on behavioral age estimation as its primary filter, reserving hard verification for adults who were misclassified as minors or for compliance with local legal mandates. This approach avoids creating a massive honeypot of government IDs and ensures that users without IDs – often the most marginalized – are not locked out of the tool entirely. I share the concerns raised by some peers, including friend and colleague Sonia Livingstone, who argues that indiscriminate age gates can disenfranchise young people and infringe on their digital rights. However, we must draw a distinction between AI as a utility and AI as a companion. While age estimation may be sufficient for a text-based homework assistant, features that simulate deep emotional intimacy carry a risk of dependency and manipulation, and as such may require a higher standard for access. I do not want to sound too alarmist here, and I do not currently know the best solution. I do know, though, that we truly need to spend more time measuring the benefits and drawbacks of certain features. This is even more critical if their adoption and use becomes ubiquitous (which I expect), given their potential impact on the social and emotional development of young people.

Direct-to-Youth Interventions, Since Parents Don’t Use Supervision Tools

Second, it is clear that parents and guardians rarely use the parental monitoring tools that platforms provide. For example, data disclosed by Snapchat and Discord to the U.S. Senate Judiciary Committee in 2024 showed that less than 1% of under‑18 users had a parent using the platform’s own monitoring tools. This underscores the need for always-on, automatically enabled solutions like OpenAI’s restricted experience for youth. There should be no opt-in; this must be the default for every teen user, regardless of parental engagement.

This also means we cannot rely on parents to consistently and promptly respond to platform alerts. Some parents never set up parent–child account links in the first place. Others ignore notifications or receive them at inopportune moments when they cannot act on them. Still others lack the technical literacy or the desire to navigate unfamiliar dashboards within the relevant app or platform. Regardless of the reason, the burden of real-time safety should not fall solely on what parents do or do not do. Instead, AI systems must provide direct, autonomous interventions that operate independently of parental oversight. These interventions can include crisis resource referrals (such as connecting troubled users to helplines), proactive de-escalation and redirection, and reminders to step away and take a break.

Better Detection Systems for Dependency and Compulsive Use

Third, AI systems must be trained to recognize early warning signs of users spiraling downward. These can include rumination patterns, catastrophic thinking, escalating emotional intensity, or repeated returns to harmful topics. While these systems increasingly provide “screentime” or usage metrics to parents, we need better measures to detect signs of dependency and compulsive use (e.g., increasing conversation frequency, lengthening interaction sessions, and deepening emotional disclosure), as sketched below. We also need more research illuminating the relationship between these predictors and problematic outcomes to better inform mental health professionals and other interventionists.
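Here is one way those three signals could be tracked as trends over time (a sketch only). The disclosure score, the window sizes, and the ratio threshold are placeholders, not validated clinical measures.

```python
# Hypothetical trend detector for dependency signals: rising frequency,
# lengthening sessions, and deepening emotional disclosure.
from dataclasses import dataclass
from statistics import mean

@dataclass
class WeeklyUsage:
    sessions: int
    avg_session_minutes: float
    disclosure_score: float  # assumed 0-1 output of an affect classifier

def rising(values: list[float], min_ratio: float = 1.25) -> bool:
    """True if the recent half of the series averages min_ratio times the
    earlier half."""
    if len(values) < 4:
        return False  # not enough history to call a trend
    mid = len(values) // 2
    earlier, recent = mean(values[:mid]), mean(values[mid:])
    return earlier > 0 and recent / earlier >= min_ratio

def dependency_flags(history: list[WeeklyUsage]) -> list[str]:
    flags = []
    if rising([float(w.sessions) for w in history]):
        flags.append("increasing_conversation_frequency")
    if rising([w.avg_session_minutes for w in history]):
        flags.append("lengthening_sessions")
    if rising([w.disclosure_score for w in history]):
        flags.append("deepening_emotional_disclosure")
    return flags  # two or more flags might prompt a gentle check-in or break
```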

Unintended Consequences of Feature Restrictions

Finally, more research is necessary to determine whether parental restrictions on specific features – such as disabling voice mode, image generation, or long-term memory – actually improve youth outcomes, or whether they simply create frustration and incentivize circumvention. Specifically, does restricting features on mainstream platforms push vulnerable teens toward smaller, less-moderated AI tools that lack any guardrails at all? If so, we risk trading a known risk (a teen using voice mode on ChatGPT with safety filters) for an unknown and potentially worse risk (a teen using an unmonitored chatbot with no safety infrastructure). Without data on this potential displacement, we cannot confidently claim that feature restrictions make teens safer. They may simply make them less present on the AI platforms that are making the strongest efforts to protect their user base.
