This is the last installment in Anthralytic’s three-part series on AI governance. In Part 1: Mapping the Landscape, I unpacked the major players and fault lines shaping global rules, from accelerationists to doomers and everyone in between. Part 2: What Global Development Can Teach Us drew lessons (and cautionary tales) from international development about building systems that are adaptive, participatory, and real. Now in Part 3, I’m focusing on the toolkit: the concrete mechanisms that can help translate principles into practice before it’s too late.
All of the data we have collectively generated (our research, posts, and conversations) has paved the road that AI companies now drive on. They used that data to build these powerful models, and now they’re sending products back down the same road and charging us to use them. I believe these systems hold great promise. But we have every right to demand that they serve the public, not just profit, and that they not cause harm we can’t undo. AI will not legislate itself.
These tools aren’t new—but applying them to AI, at this speed and scale, is. From licensing regimes to model audits to third-party oversight, this post offers a plainspoken primer on the governance infrastructure we urgently need—and the tradeoffs we’ll have to navigate to get there.
I’m going to start with some humility. I’m not an AI governance expert. Much of what follows may be familiar and oversimplified to folks in policy, safety research, or regulatory tech. Every tool or approach here could fill a dissertation or career. But for most people—even those in government, civil society, and philanthropy—these ideas are still surprisingly niche.
What we need, and what I haven’t seen much of, is exactly that kind of primer for laypeople: a map, not a manual.
This part focuses on practical tools anchored in core governance principles like Proportionality, Participation, and Accountability, with supporting principles such as Enforceability, Traceability, Precaution, Resilience, and Interoperability to help make them real. For each tool, I cover why it matters, what it is, the principle behind it, and the key challenge we need to push lawmakers to address.
1. Participatory and Democratic Processes
Why it matters: Decisions about AI shouldn’t be made only by tech companies or government agencies. The public, especially the people most affected, should have a voice in how these tools are designed and governed. The companies that built these powerful frontier models did so with our collective data; they couldn’t have trained them without it. We deserve a seat at the table.
What it is: Structured ways to involve communities in AI policymaking, like citizens’ assemblies, participatory budgeting, and open consultations. Some of these efforts are already underway in places like the EU and Latin America. In the United States... not so much. What are we waiting for?
Principle: Participation. When people are genuinely involved in decision-making, policies are more legitimate and grounded in real-world needs.
Key challenge: Making public input meaningful—not just symbolic—by tying it to actual policy decisions and giving communities feedback on how their input was used.
2. Model Audits
Why it matters: Models can reproduce bias, hide risks, or simply malfunction. Audits shine a light on hidden behaviors, build trust, and catch failures before they cause harm. In finance, public companies (including banks) undergo mandatory external audits by PCAOB-registered firms; in food safety, government agencies conduct inspections. AI needs a comparable oversight framework.
What it is: Structured evaluations of a model along five essential dimensions that research and industry practice have converged on to keep AI aligned with human interests and values: bias, robustness, fairness, safety, and explainability. Labs like Anthropic and OpenAI routinely run internal audits, but independent external testing is virtually nonexistent. (A sketch of what one simple audit check might look like appears at the end of this section.)
Principles: Accountability and Enforceability. Audits enforce responsibility, ensuring creators stand by their claims and address downstream harms and externalities. Penalties for noncompliance, mandatory remediation plans, and public reporting turn audit findings into real-world change.
Key challenges:
Conflict of interest: When companies choose and pay their own auditors, the auditors have an incentive to downplay or overlook problems. Genuine oversight requires external audits, like those mandated in finance, and that kind of independent scrutiny remains almost entirely absent in AI today.
Expertise gap: Auditors need both AI and domain-specific knowledge to catch nuanced risks.
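To make this concrete, here is a minimal sketch of the kind of automated check an external auditor might run on a model’s outputs, in this case a demographic parity test for bias in approval decisions. The function name, the toy data, and the 0.1 threshold are illustrative assumptions for this sketch, not any lab’s or regulator’s actual audit protocol.

```python
# Illustrative only: a toy bias check an external auditor might run on a
# model's decisions. Names, data, and threshold are assumptions for this
# sketch, not any lab's or regulator's actual audit protocol.

def demographic_parity_gap(predictions, groups):
    """Return the largest gap in positive-decision rates across groups,
    plus the per-group rates."""
    counts = {}  # group -> (positives, total)
    for pred, group in zip(predictions, groups):
        positives, total = counts.get(group, (0, 0))
        counts[group] = (positives + (1 if pred == 1 else 0), total + 1)
    rates = {g: positives / total for g, (positives, total) in counts.items()}
    return max(rates.values()) - min(rates.values()), rates


if __name__ == "__main__":
    # Hypothetical loan-approval outputs from a model under audit.
    preds = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

    gap, rates = demographic_parity_gap(preds, groups)
    print("Approval rates by group:", rates)
    print(f"Demographic parity gap: {gap:.2f}")

    # An audit report would flag gaps above an agreed threshold (0.1 is illustrative).
    print("FLAG for review" if gap > 0.1 else "Within threshold")
```

A real audit would combine many checks like this across all five dimensions, run on data the auditor (not the company) selects.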
3. Licensing Regimes
Why it matters: Without clear rules, powerful AI systems could end up in sensitive places—like healthcare or infrastructure—without anyone checking if they’re safe. Licensing helps put guardrails in place before that happens.
What it is: A formal licensing system—think the FDA for drugs, driver’s licenses for cars, and food-service permits for restaurants—requiring AI developers to demonstrate safety and accountability before unleashing powerful models into critical applications.
Principle: Proportionality. The amount of oversight should match the level of risk: powerful AI models that could cause serious harm face stricter rules, while lower-risk tools can be developed and used with fewer barriers. Licensing is the mechanism that makes that match explicit.
Key challenge: Licensing can slow technical progress and favor big incumbents, making it harder for open-source developers to compete.
4. Standards Bodies
Why it matters: Voluntary guidelines often lack teeth or consistency. A unified standards body can harmonize norms globally.
What it is: Technical standards for things like documentation, transparency, energy use, and model disclosures, typically developed by groups like ISO, IEEE, and NIST. Some experts are calling for a global body, similar to the Codex Alimentarius in food safety, that could set shared rules for AI across countries. Codex helps countries coordinate on food standards so that people can trust what’s safe to eat. Something similar for AI could reduce confusion, close gaps, and build global trust. (A sketch of what a machine-readable model disclosure might look like appears at the end of this section.)
Principle: Interoperability. When countries use the same standards, AI systems can plug and play across borders, regulators can coordinate instead of duplicating work, and companies can’t dodge tough rules by cherry-picking the weakest jurisdiction.
Key challenge: Standards-setting is often a slow, bureaucratic process—especially when dozens of countries and companies are involved. This opens the door for powerful actors to dominate the conversation and shape weak or self-serving rules that benefit them instead of the public.
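To illustrate, here is a minimal sketch of what a machine-readable model disclosure might look like if a standards body required one. The field names and values are assumptions for this sketch, not any published ISO, IEEE, or NIST standard.

```python
# Illustrative only: a machine-readable model disclosure of the kind a
# standards body might require. Field names and values are assumptions for
# this sketch, not any published ISO, IEEE, or NIST standard.

REQUIRED_FIELDS = {
    "model_name", "developer", "intended_uses", "known_limitations",
    "training_data_summary", "evaluation_results", "energy_use_kwh",
}

disclosure = {
    "model_name": "example-model",  # hypothetical
    "developer": "Example Lab",     # hypothetical
    "intended_uses": ["document summarization", "customer-support drafting"],
    "known_limitations": ["not for medical, legal, or financial advice"],
    "training_data_summary": "licensed text corpora plus filtered public web data",
    "evaluation_results": {"toxicity_rate": 0.002, "bias_audit": "passed"},
    "energy_use_kwh": 120_000,  # estimated training energy
}

# A regulator (or anyone else) could check completeness automatically.
missing = REQUIRED_FIELDS - set(disclosure)
print("Disclosure complete" if not missing else f"Missing fields: {sorted(missing)}")
```

The point of a shared schema is interoperability: the same disclosure could be read by regulators in any country and compared across models automatically.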
5. Regulatory Sandboxes
Why it matters: Many AI systems are too new or fast-moving to regulate effectively through static rules. Sandboxes are meant to be contained environments where experimentation doesn’t spill over and cause harm, much like the sandboxes kids play in.
What it is: Time-bound, supervised environments where companies can trial AI systems under close oversight and temporary regulatory flexibility. Countries like the UK and Singapore have piloted sandboxes, but many still lack clear entry criteria, goals, and exit plans.
Principle: Precaution. Sandboxes allow innovation while identifying problems early—before AI is widely deployed.
Key challenge: Making sandboxes meaningful, with clear goals, evaluation metrics, deadlines, and next steps for broader accountability.
6. Incident Reporting and Traceability
Why it matters: If something goes wrong with an AI system—say it gives dangerous advice or misclassifies someone—there’s often no clear way to report it, trace what happened, or fix the underlying issue. Without a feedback loop, mistakes are repeated across labs and models and baked into new systems.
What it is: Mandatory reporting systems where developers disclose dangerous, misleading, or unexpected model behavior, paired with traceability mechanisms (borrowed from food safety) that document a model’s training data, design choices, and decision pathways. (A sketch of what such a report and record might contain appears at the end of this section.)
Principle: Traceability. Good records let us understand what went wrong and who’s responsible.
Key challenge: Getting companies to report problems honestly and consistently, and building the infrastructure to track how AI systems were built. Without legislation, big tech firms will not voluntarily take this on.
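As a rough illustration, here is what a structured incident report paired with a traceability record could look like. The schema is a sketch under assumed field names; a real reporting regime would define these by regulation.

```python
# Illustrative only: a minimal schema for an AI incident report paired with a
# traceability record. Field names are assumptions for this sketch; a real
# reporting regime would define them by regulation.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class TraceabilityRecord:
    """Provenance that lets investigators reconstruct how the model was built."""
    model_name: str
    model_version: str
    training_data_sources: list
    key_design_choices: list


@dataclass
class IncidentReport:
    """One report of dangerous, misleading, or unexpected model behavior."""
    incident_id: str
    reported_at: str
    severity: str          # e.g. "low" / "medium" / "high"
    affected_domain: str   # e.g. "healthcare", "lending"
    description: str
    trace: TraceabilityRecord
    remediation_plan: str = "pending"


if __name__ == "__main__":
    report = IncidentReport(
        incident_id="INC-0001",
        reported_at=datetime.now(timezone.utc).isoformat(),
        severity="high",
        affected_domain="healthcare",
        description="Model recommended an unsafe drug dosage in a pilot deployment.",
        trace=TraceabilityRecord(
            model_name="example-model",  # hypothetical
            model_version="2.3.1",
            training_data_sources=["licensed clinical corpus", "public web text"],
            key_design_choices=["instruction fine-tuning", "safety filter v4"],
        ),
    )
    # A shared registry could collect reports like this in machine-readable form.
    print(json.dumps(asdict(report), indent=2))
```

Because such a record is machine-readable, a shared registry could aggregate reports across labs and surface repeated failure patterns, exactly the feedback loop this section argues is missing.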
7. Third-Party Oversight
Why it matters: We can't rely on companies to police themselves, especially when speed and profits come before safety. Independent oversight ensures someone is watching who doesn’t have a stake in the outcome.
What it is: Institutions or networks—like a Bureau of AI Safety or international watchdogs—tasked with monitoring, investigating, and, when necessary, enforcing rules around AI. These bodies could also support whistleblowers, review audits, and flag emerging risks.
Principles: Accountability and Resilience. External watchdogs help prevent abuse and make systems more robust when things break down.
Key challenge: Building institutions with enough authority, expertise, and independence to do this work well—and making sure governments and companies respect their findings.
Call to Action
We’re at an inflection point. The stakes are too high, and the potential impacts too profound, to remain passive. To solidify AI governance, we must tackle the challenges noted above. None of them are easy, but if we want AI to serve the public, not just private power, these are the problems we must take on together, and we must call on our lawmakers to act.
Drawing inspiration from the EU’s Policy and Investment Recommendations for Trustworthy AI, it’s time for coordinated, cross-sector action. The decisions we make today—about audit standards, licensing gates, and democratic processes—will shape whether AI amplifies human potential or concentrates unchecked power.
Let’s act with urgency, transparency, and solidarity. At the very least, we should challenge lawmakers to convene working groups across government, industry, and civil society and begin tackling these challenges now.
The future we build depends on it.
Anthralytic is a strategy and evaluation consultancy helping mission-driven organizations navigate complexity, measure what matters, and adapt in real time. We combine human-centered design with AI-enabled tools to support better decisions, stronger accountability, and more responsive systems. If you’re working at the intersection of social change and emerging tech, let’s connect.