Office Hours: April 1, 2026 #58

andrewmchang · 2026-03-31T16:46:49Z

andrewmchang
Mar 31, 2026
Maintainer

Our next office hours are happening Wednesday, April 1, 2026 at 16:30–17:00 UTC!

Learn how T&S peers are using AI for safety workflows, ask for support in implementing open safety models, and connect directly with RMC model partners. We'll also discuss how you can contribute to ongoing projects that the RMC is working on!

Proposed Agenda

Introduction

Recap of what the ROOST Model Community is and the purpose of office hours
Personal intros
RMC Update: Child Safety Policies & Datasets from OpenAI and Everyone.AI
RMC Update: Project Sprint on using open safety models for T&S

Q&A / Show and Tell

What questions do you have about the RMC? What resources would be most beneficial?
What progress have you made in testing, using, or evaluating different open safety models?
What use cases are you most interested in these models for? What would you like to see more of?

Admin:

Contributing to upcoming RMC projects

What's Next:

Introduce yourself in GitHub Discussions
Start a new discussion, whether it's an implementation question, research findings, or model-specific feedback

cassidyjames · 2026-04-01T17:38:43Z

cassidyjames
Apr 1, 2026
Maintainer

Notes

Attendees/intros

ROOSTers (Andrew, Cassidy, Vinay)
Nik: testing CoPE and how it works on content (e.g. content you'd find in specific subreddits) to see how well it performs; interested in golden data sets
Travis (Matrix)
Ariel: T&S practitioner using OpenAI safety model, operates a youth AI safety startup; received a grant for a child safety AI eval tool
Trezy: building Catridge, a video game metadata management platform on atproto; also working on a labeling app for atproto to be used in conjunction w/automated moderation tools like Coop/Osprey
Nathan

Community Q&A

Eval tool from Ariel

Proposed in summer 2024 w/Safe Online (nonprofit associated w/UNICEF)
- Evals owned by technical teams, but not necessarily experts in safety
- Proposed creating an automatic eval tool focused on youth safety, specifically child sexual exploitation and abuse (CSEA) and youth mental health
- Building risk taxonomy across those risk areas, and an automated evaluation tool to grade outputs of LLMs and benchmark AI systems
- Automated piece uses AI to create scenarios, simulate users, judge outputs
- Interested in hearing about youth AI use cases anyone is working on

Questions/comments? Youth AI use cases?

AI implementations like Attie (Bluesky's AI-powered feed generator): the feed generator uses an LLM/AI to create bespoke algorithmic timelines; would be interesting to see if/how the outputs of an algorithmic timeline can be evaluated to avoid social media pitfalls

Nik's use of CoPE

Building golden data sets (from Reddit, crawling for hate speech) and seeing a lot of variation in the outputs; would like to find ways to reduce it. For example, given a policy on hate speech, tiny variations (like the last paragraph ending with a comma versus a period) could measurably change the outputs.

Running the same prompt 30 times, do you get the same outputs?
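
One way to put a number on that question is to repeat the same call and report how often the runs agree with the most common label. The sketch below is only an illustration: `classify` is a hypothetical stand-in for whatever function sends a policy and a piece of content to a model and returns a label (it is not CoPE's actual API), and the default of 30 runs simply mirrors the question above.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable


def agreement_rate(
    classify: Callable[[str, str], str],  # hypothetical: (policy, content) -> label
    policy: str,
    content: str,
    runs: int = 30,
) -> tuple[str, float, Counter]:
    """Call `classify` `runs` times on identical input.

    Returns the most common label, the share of runs that produced it,
    and the full label counts.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    labels = Counter(classify(policy, content) for _ in range(runs))
    top_label, top_count = labels.most_common(1)[0]
    return top_label, top_count / runs, labels
```

Running this once per policy variant (for example, the same policy ending in a comma and ending in a period) and comparing the rates and top labels would show whether a formatting change shifts the outputs more than ordinary run-to-run noise does.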

Suggestion: try a "jury of models" approach; it's more expensive, but can give more consistent results.
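
A minimal sketch of the jury idea, assuming each juror is a function that takes a policy and a piece of content and returns a label. The `Judge` type and `jury_verdict` name are hypothetical and made up for illustration; plurality voting with ties left unresolved is one possible aggregation rule, not something specified in the discussion.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Sequence

# Hypothetical juror signature: (policy, content) -> label
Judge = Callable[[str, str], str]


def jury_verdict(
    judges: Sequence[Judge], policy: str, content: str
) -> tuple[str | None, Counter]:
    """Ask every judge once and return the plurality label plus the vote counts.

    Returns None as the label when the top two labels tie, so the caller
    can route the item to human review instead of guessing.
    """
    if not judges:
        raise ValueError("at least one judge is required")
    votes = Counter(judge(policy, content) for judge in judges)
    ranked = votes.most_common(2)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None, votes
    return ranked[0][0], votes
```

Each judge is called once per item, so the cost grows linearly with the size of the jury, which is the expense noted above. Using an odd number of judges makes ties less likely for two-label policies.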
