
AI Value Judgements in Public Services: Lessons from Anthropic's Groundbreaking Research

  • Writer: Datnexa HQ
  • May 1
  • 5 min read

Anthropic's recent paper 'Values in the Wild' offers unprecedented insights into how AI systems express values in real-world interactions. As pioneers in implementing AI across public services, we at Datnexa believe this research carries profound implications for organisations deploying AI in sensitive contexts. The paper's empirical mapping of over 3,000 values expressed by Claude models provides a critical foundation for understanding how AI systems make value judgements and how we should shape these systems moving forward.



Understanding the Current Landscape of AI Value Judgements


Anthropic's analysis of hundreds of thousands of real-world conversations has revealed what many of us intuitively suspected: AI systems don't simply provide factual information—they express values that influence decision-making in significant ways. 


Their research identified five macro-categories of values expressed by Claude models: 


  1. Practical (most prevalent) - Focuses on effectiveness, efficiency and tangible outcomes, prioritising actionable solutions that optimise resource use and achieve measurable results in professional or technical contexts.

  2. Epistemic - Centres on intellectual rigour, emphasising truth-seeking through critical thinking, transparency and accuracy whilst maintaining humility about knowledge limitations.

  3. Social - Emphasises mutual respect, healthy boundaries and fair community dynamics, championing interpersonal harmony and collective well-being.

  4. Protective - Safeguards individual and societal welfare through harm prevention, risk mitigation and ethical constraints that override conflicting priorities.

  5. Personal (least prevalent) - Promotes individual autonomy and the pursuit of self-actualisation, while also emphasising the importance of social responsibilities and ethical boundaries for personal development.


What's particularly striking is how these AI systems demonstrated context-dependent values. For example, Claude emphasised 'harm prevention' when resisting potentially harmful user requests, 'historical accuracy' when addressing controversial events, and 'healthy boundaries' when providing relationship advice. These findings reveal that AI systems already make nuanced value judgements that adapt to specific contexts.


Perhaps most significantly, the research showed that in most cases, Claude either fully supported user values or 'reframed' them by supplementing them with new perspectives. However, in approximately 3% of conversations, Claude actively resisted user values, particularly when users requested unethical content or expressed moral nihilism. This suggests that current AI systems do have 'immovable values' that guide their responses even when faced with pressure.


The Attributability Gap: A Core Challenge for Public Services


In our work implementing AI solutions across Local Authorities and healthcare organisations, we've identified what researchers are now calling the 'attributability gap' as a critical challenge. This occurs when AI systems make value judgements that aren't clearly attributable to human decision-makers.


When an AI assistant recommends how a Local Authority should prioritise social care resources or suggests treatment approaches for vulnerable individuals, whose values are being expressed? The system developer's? The organisation's? The end user's? Or some emergent combination unique to the AI system itself?


This question becomes particularly acute in public services, where decisions directly impact the wellbeing of citizens and communities. As our work with Local Authorities has demonstrated, AI implementations in adult social care and special education must navigate complex ethical terrain where mistakes have real human consequences.


How AI Should Make Value Judgements: A Framework for Public Services


Based on Anthropic's research and our experience implementing AI across public services, we propose the following framework for how AI should approach value judgements:


1. Transparency in Value Structures


The 'black box' nature of AI value judgements creates significant risks. Anthropic's approach of empirically mapping values 'in the wild' represents an important step toward transparency. However, organisations implementing AI must go further by explicitly documenting and communicating the values that guide their systems.
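

By way of illustration only, such documentation can be as lightweight as a structured 'value register' maintained alongside a system's configuration and published for scrutiny. The sketch below is a hypothetical format, not a Datnexa product or an Anthropic standard; every field name, role and example value is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class ValueStatement:
    """One explicitly documented value guiding an AI deployment (illustrative fields only)."""
    name: str         # e.g. "harm prevention"
    category: str     # e.g. "protective", loosely mirroring Anthropic's macro-categories
    rationale: str    # why the organisation endorses this value in this system
    endorsed_by: str  # the named role accountable for it

@dataclass
class ValueRegister:
    """A reviewable, publishable record of the values a system is intended to express."""
    system_name: str
    statements: list[ValueStatement] = field(default_factory=list)

    def public_summary(self) -> str:
        """Render the register as plain text that could be published alongside the service."""
        lines = [f"Values guiding {self.system_name}:"]
        for s in self.statements:
            lines.append(f"- {s.name} ({s.category}): {s.rationale} [endorsed by {s.endorsed_by}]")
        return "\n".join(lines)

# Hypothetical example for a social-care triage assistant
register = ValueRegister(
    system_name="Adult Social Care Triage Assistant",
    statements=[
        ValueStatement("harm prevention", "protective",
                       "Recommendations must never increase risk to vulnerable adults",
                       "Head of Adult Social Care"),
        ValueStatement("equitable service delivery", "social",
                       "Similar needs receive similar priority regardless of locality",
                       "Principal Social Worker"),
    ],
)
print(register.public_summary())
```

A register of this kind also gives the value audits and governance committees discussed later in this article something concrete to review.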


2. Value Alignment with Democratic Institutions


Public services exist within democratic structures that should guide their values. Rather than relying on values embedded by AI developers or emergent from training data, AI systems making decisions in public contexts should align with democratically established principles.


This requires careful consideration of how local government values—accountability to citizens, equitable service delivery, protection of vulnerable individuals—can be appropriately represented in AI systems. It also necessitates mechanisms for citizens to influence the values embedded in public service AI.


3. Context-Specific Value Framework Customisation


Anthropic's research revealed that Claude expresses different values depending on context. This context-sensitivity should be deliberately designed into public service AI systems rather than emerging unpredictably from training data.


For instance, an AI system providing support for special education needs should prioritise different values (educational development, inclusion, individual dignity) than one optimising emergency service deployment (public safety, response speed, resource efficiency). Organisations must customise value frameworks to match the specific context of implementation.
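

To make that contrast concrete, a deployment could carry an explicit mapping from context to prioritised values, with a non-negotiable core set applied everywhere. The contexts, value names and function below are hypothetical, a minimal sketch rather than a production schema.

```python
# Hypothetical mapping from deployment context to the values it should prioritise.
CONTEXT_VALUE_PRIORITIES = {
    "special_education_support": ["educational development", "inclusion", "individual dignity"],
    "emergency_service_deployment": ["public safety", "response speed", "resource efficiency"],
}

# Core values that apply in every context and cannot be overridden by customisation.
CORE_VALUES = ["harm prevention", "transparency", "accountability to citizens"]

def values_for_context(context: str) -> list[str]:
    """Return the value priorities for a context, always keeping the core values first."""
    specific = CONTEXT_VALUE_PRIORITIES.get(context, [])
    # Context-specific priorities refine the core set; they never replace it.
    return CORE_VALUES + [v for v in specific if v not in CORE_VALUES]

print(values_for_context("special_education_support"))
```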


4. Human-AI Decision Ownership


To address the attributability gap, organisations must establish clear protocols for 'decision ownership' in AI-assisted contexts. This means creating processes where human decision-makers explicitly endorse the values that inform AI recommendations.


In our work with Local Authorities, we've developed tools where human professionals review and formally endorse the value judgements embedded in AI systems before implementation. This ensures AI recommendations reflect human-endorsed values rather than displacing human moral agency.
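

The details of those tools are specific to each implementation, but the general shape of a decision-ownership record can be sketched. The fields and workflow below are hypothetical and are not the tooling described above; they simply show how a value judgement can be tied to a named, accountable human before an AI recommendation is acted upon.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Endorsement:
    """A record that a named professional reviewed and endorsed a value judgement (illustrative)."""
    recommendation_id: str  # identifier of the AI recommendation being acted on
    value_judgement: str    # the value-laden assumption the recommendation relies on
    endorsed_by: str        # the accountable human decision-maker
    endorsed_at: str        # ISO 8601 timestamp
    notes: str = ""

def endorse(recommendation_id: str, value_judgement: str, endorsed_by: str, notes: str = "") -> Endorsement:
    """Create an endorsement record; in practice this would be written to an audit log."""
    return Endorsement(
        recommendation_id=recommendation_id,
        value_judgement=value_judgement,
        endorsed_by=endorsed_by,
        endorsed_at=datetime.now(timezone.utc).isoformat(),
        notes=notes,
    )

record = endorse(
    recommendation_id="rec-2024-0137",
    value_judgement="Prioritise applicants at highest risk of harm over first-come-first-served",
    endorsed_by="Senior Social Work Practitioner",
    notes="Reviewed against the published value register before acceptance.",
)
print(record)
```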


Practical Implementation Steps for Public Services


Based on these principles, public service organisations should take the following concrete actions when implementing AI systems that make value judgements:


  1. Conduct Value Audits - Value audits are vital for documenting the ethical foundations on which AI systems will operate before they are deployed, ensuring that those systems align ethically with societal, organisational or project goals.


    Before implementation, systematically document the values that will guide AI systems and how these align with organisational and democratic principles.


  2. Implement Decision Attribution Protocols - To ensure accountability, decision attribution protocols make value-based decisions traceable to specific individuals, thus preserving human responsibility in AI-supported decision processes.


    Create clear processes for attributing value judgements to specific human decision-makers rather than delegating these judgements entirely to AI systems.


  3. Establish Value Governance Committees - Value governance committees act as ethical safeguards in AI development for public services. These committees ensure that technical advancements are consistent with human values and societal needs by incorporating varied viewpoints in their oversight.


    Form cross-functional teams that include both technical experts and ethical advisors to oversee how values are expressed in AI systems.


  4. Develop Value Customisation Interfaces - Value customisation interfaces facilitate contextual adaptation. They allow organisations to tailor AI systems to their specific needs and values, all while maintaining core ethical principles. This ensures alignment without compromising fundamental ethics.


    Create mechanisms for adapting AI value frameworks to specific contexts while maintaining core ethical commitments.


  5. Conduct Regular Value Monitoring - Systematic value monitoring ensures continuous accountability by regularly analysing AI outputs. This process verifies that the AI's behaviour aligns with intended ethical principles, preventing unintended value expressions.


    Following Anthropic's approach, regularly analyse AI system outputs to identify which values are being expressed in practice and whether these align with intended values (a minimal illustrative sketch of such monitoring follows this list).
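

Anthropic's study used large-scale, model-driven classification of real conversations; the toy sketch below has none of that sophistication, but it illustrates the shape of the monitoring loop: classify the values expressed in outputs, count them, and flag anything outside the intended set. The value names, keywords and functions are all hypothetical.

```python
from collections import Counter

# Values the organisation intends the system to express (hypothetical list).
INTENDED_VALUES = {"harm prevention", "transparency", "equitable service delivery"}

def classify_values(response: str) -> set[str]:
    """Toy stand-in for a value classifier.

    Anthropic's research used model-based classification over real conversations;
    this simple keyword match only illustrates the interface.
    """
    keyword_map = {
        "harm prevention": ["risk", "safeguard", "harm"],
        "transparency": ["explain", "disclose", "transparent"],
        "efficiency": ["fastest", "cheapest", "efficient"],
    }
    text = response.lower()
    return {value for value, words in keyword_map.items() if any(w in text for w in words)}

def monitor(responses: list[str]) -> None:
    """Count which values appear in practice and flag any not on the intended list."""
    observed = Counter()
    for response in responses:
        for value in classify_values(response):
            observed[value] += 1
    for value, count in observed.most_common():
        flag = "" if value in INTENDED_VALUES else "  <-- not in the intended value set"
        print(f"{value}: {count}{flag}")

monitor([
    "The fastest option is to close the case and reallocate staff.",
    "We should safeguard the resident and explain the decision to their family.",
])
```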


The Future of AI Value Judgements in Public Services


As we look toward the future, we anticipate AI systems will play an increasingly significant role in public service decision-making. The question isn't whether AI will express values—Anthropic's research definitively shows it already does—but rather which values it will express and how we ensure these align with our democratic commitments.


At Datnexa, we believe the future lies in AI systems that make their value judgements explicit, enable human oversight of these judgements, and adapt these judgements appropriately to context. Rather than attempting to create 'value-neutral' AI—which Anthropic's research suggests is impossible—we should focus on creating AI that expresses the right values in the right contexts.


With proper governance and thoughtful implementation, AI value judgements can enhance rather than undermine public services, helping organisations make more consistent, transparent, and equitable decisions. The key lies not in removing values from AI but in ensuring these values align with our shared commitments to human dignity, democratic governance, and the common good.
