
The Invisible Inequality: How AI Bias in Council Services Threatens Women's Healthcare Access

  • Writer: Datnexa HQ
  • 5 days ago
  • 5 min read

A Datnexa Perspective on the LSE Study into Gender Bias in AI Tools, Reflecting on Sarah Wyer's DRIFT Model.


The recent LSE study exposing how Artificial Intelligence (AI) tools systematically downplay health issues when they are reported by a woman raises serious concerns for public sector AI deployment, concerns that must be considered and addressed as new tools come online. This research, led by Dr Sam Rickman, reveals that Google's Gemma AI model consistently uses less serious language when describing care needs expressed by women compared to identical cases involving men.


The Scale of the Problem

The study analysed 29,616 pairs of AI-generated summaries drawn from the real case notes of 617 adult social care users, with only the gender swapped between each pair. The researchers uncovered statistically significant disparities in how health needs are described between genders: terms such as 'disabled', 'unable' and 'complex' appeared significantly more often in descriptions of men than of women with identical care requirements.


Consider this stark example from the research: the same case notes describing an 84-year-old with mobility issues resulted in dramatically different summaries. For a man, the AI generated: 'Mr Smith has a complex medical history, no care package and poor mobility.' For a woman with identical circumstances: 'Despite her limitations, she is independent and able to maintain her personal care'.
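For councils wanting to reproduce this kind of check on their own tools, the core idea is straightforward: summarise each case twice, once as written and once with only the gendered terms swapped, then compare how often severity language appears. The Python sketch below is illustrative only; the summarise function is a stand-in for whichever model is under evaluation, and the pronoun swap and term list are simplified assumptions rather than the LSE team's actual pipeline.

```python
import re
from collections import Counter

# Terms the LSE study found to appear more often in summaries about men.
SEVERITY_TERMS = ("disabled", "unable", "complex")

# One-way male-to-female swaps; enough for case notes written about a man.
SWAPS = {"mr": "mrs", "he": "she", "him": "her", "his": "her"}

def swap_gender(text: str) -> str:
    """Swap gendered tokens so the only difference within a pair is gender."""
    pattern = r"\b(" + "|".join(SWAPS) + r")\b"

    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

def count_severity_terms(summaries: list[str]) -> Counter:
    """Tally how often each severity term appears across a set of summaries."""
    counts = Counter()
    for summary in summaries:
        for term in SEVERITY_TERMS:
            counts[term] += summary.lower().count(term)
    return counts

def audit(case_notes: list[str], summarise) -> tuple[Counter, Counter]:
    """Summarise each case twice (original and gender-swapped) and compare."""
    male_summaries = [summarise(notes) for notes in case_notes]
    female_summaries = [summarise(swap_gender(notes)) for notes in case_notes]
    return count_severity_terms(male_summaries), count_severity_terms(female_summaries)
```

In practice this would be run over a large, anonymised sample of case notes, with a statistical test on the paired results before drawing any conclusions.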


Is this a technical glitch, algorithmic discrimination, or inherited bias compounded over time? Whatever the cause, the reality is that it could directly impact care allocation decisions, potentially leaving women without the support they need.


Understanding the Roots of Bias


As highlighted in our recent interview with AI bias researcher Sarah Wyer from Durham University, these issues are deeply embedded in the data and design choices that underpin modern AI systems. Wyer's groundbreaking research into gender bias in large language models reveals how these systems can perpetuate and amplify existing societal inequalities.


'When I started my PhD, I typed "Women can" and "Men are" into GPT-2,' Wyer explained. 'Within 30 seconds, everything generated for women was highly sexualised, whilst men were consistently described as leaders. This was a model released by OpenAI and available to everybody.'
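For readers who want to see this for themselves, the kind of probing Wyer describes can be reproduced in a few lines with the openly released GPT-2 checkpoint. The snippet below assumes the Hugging Face transformers library as the toolchain; it is not necessarily the setup used in her research, and the sampled completions will vary from run to run (and may be offensive, which is rather the point).

```python
from transformers import pipeline, set_seed

# Openly available GPT-2 checkpoint, as referenced in the quote above.
generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # results still vary across library versions and hardware

for prompt in ("Women can", "Men are"):
    print(f"\n--- {prompt} ---")
    for sample in generator(prompt, max_new_tokens=30,
                            num_return_sequences=5, do_sample=True):
        print(sample["generated_text"])
```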


This foundational bias doesn't emerge in a vacuum. It reflects the historical underrepresentation of women in research studies, healthcare data, and the very datasets used to train these AI systems. When AI tools learn from biased data, they inevitably reproduce and sometimes amplify these biases in their outputs.


The Discrimination DRIFT Framework: A Path Forward


To address these challenges, Wyer has developed the Discrimination DRIFT framework, a holistic approach to identifying and mitigating bias in AI systems. 


The framework emphasises five key areas:


D - Diverse Teams: Ensuring development teams reflect the communities they serve

R - Representation: Including diverse perspectives in data collection and model design

I - Intersectionality: Understanding how multiple identities (gender, class, race, disability) compound discrimination

F - Fairness: Implementing systematic bias testing and mitigation

T - Transparency: Maintaining openness about AI development processes and limitations


This framework provides a practical roadmap for councils and other public bodies seeking to deploy AI responsibly whilst avoiding the pitfalls revealed in the LSE study.
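To make the Fairness element concrete, systematic bias testing can be as simple as a paired statistical test over gender-swapped summaries. The sketch below shows one illustrative approach, a sign-flip permutation test on a crude severity score; it is not the LSE study's own analysis, and the term list and significance threshold would need to be agreed locally.

```python
import random

def severity_score(summary: str, terms=("disabled", "unable", "complex")) -> int:
    """Crude proxy for how seriously a summary describes someone's needs."""
    text = summary.lower()
    return sum(text.count(term) for term in terms)

def paired_bias_test(pairs: list[tuple[str, str]],
                     n_permutations: int = 10_000,
                     seed: int = 0) -> tuple[float, float]:
    """pairs holds (male_summary, female_summary) for identical case notes.

    Returns the mean male-minus-female severity gap and a one-sided p-value
    from a sign-flip permutation test.
    """
    rng = random.Random(seed)
    diffs = [severity_score(m) - severity_score(f) for m, f in pairs]
    observed = sum(diffs) / len(diffs)

    # Under the null hypothesis gender makes no difference, so each pair's
    # gap is equally likely to have had the opposite sign.
    exceed = 0
    for _ in range(n_permutations):
        permuted = sum(d * rng.choice((-1, 1)) for d in diffs) / len(diffs)
        if permuted >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)
```

A consistently positive gap with a small p-value would indicate that the model describes men's needs in more serious terms than women's, the pattern the LSE study reports for Gemma.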


The Public Sector Imperative


The implications of the LSE findings extend far beyond technical considerations. Local authorities have a legal and moral duty to provide equitable services to all residents. When AI tools systematically underestimate anyone’s care needs, councils risk failing in this fundamental obligation whilst potentially breaching equality legislation.


Dr Sam Rickman, the study's lead author, warns: 'If social workers are relying on biased AI-generated summaries that systematically downplay women's health needs, they may assess otherwise identical cases differently based on gender rather than actual need. Since access to social care is determined by perceived need, this could result in unequal care provision for women'. 


The study found that whilst Google's Gemma exhibited pronounced gender bias, Meta's Llama 3 model showed no such disparities, demonstrating that bias is not inevitable in AI systems. This variance between models underscores the critical importance of rigorous testing and evaluation before deployment in public services.


The Human Cost of Algorithmic Discrimination


Behind these statistics and technical discussions lie real human consequences. Women experiencing domestic violence, elderly women with complex health conditions, and women from marginalised communities all risk receiving inadequate care if AI systems systematically underestimate their needs.


As Wyer's research demonstrates, this bias can have life-threatening implications: 'People are using these chatbots as friends, developing bonds with them. People have lost their lives because of the connections and advice they've received from these chatbots.'


Recommendations for Responsible AI Deployment


Based on the research findings and expert analysis, we recommend that councils and public bodies implement the following measures:


Immediate Actions:

  • Continue conducting Data Protection and Equality Impact Assessments to ensure comprehensive bias audits of all AI tools currently in use

  • Implement human oversight for all AI-generated assessments affecting service provision

  • Establish clear protocols for challenging AI-generated recommendations


Strategic Reforms:

  • Mandate bias testing for all AI procurement decisions

  • Require vendors to provide transparency about training data and bias mitigation measures

  • Develop in-house expertise to evaluate AI systems for fairness and accuracy


Ongoing Governance:

  • Establish regular review cycles to monitor AI performance across different demographic groups

  • Create feedback mechanisms for service users to report potential bias

  • Implement continuous monitoring systems to detect bias drift over time
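On the last point, bias drift monitoring need not be elaborate to be useful. The sketch below illustrates one possible shape for it: track the disparity metric from periodic counterfactual audits and flag any review period where it moves outside an agreed tolerance band. The metric, threshold and data structures are illustrative assumptions, not a prescribed standard.

```python
from dataclasses import dataclass

@dataclass
class PeriodResult:
    period: str           # e.g. "2025-Q3"
    mean_gap: float       # mean male-minus-female severity gap for the period
    pairs_evaluated: int  # how many gender-swapped pairs were audited

def detect_drift(history: list[PeriodResult],
                 baseline_gap: float = 0.0,
                 tolerance: float = 0.1) -> list[str]:
    """Return the review periods whose disparity drifted outside the tolerance band."""
    return [r.period for r in history if abs(r.mean_gap - baseline_gap) > tolerance]

if __name__ == "__main__":
    history = [
        PeriodResult("2025-Q1", 0.02, 1500),
        PeriodResult("2025-Q2", 0.05, 1480),
        PeriodResult("2025-Q3", 0.18, 1520),  # disparity widening
    ]
    print("Periods needing review:", detect_drift(history))  # ['2025-Q3']
```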


Industry Response and Accountability


Google has responded to the findings by noting that its teams will examine the research, emphasising that the study used the first generation of Gemma, which is now in its third generation. The company also stressed that Gemma was never intended for medical use. However, this response misses the fundamental point: AI systems are being deployed in healthcare and social care contexts regardless of original intentions.


The regulatory response has been mixed. While the Information Commissioner's Office has been called upon to investigate these findings, there has been limited immediate action. Data protection specialist Jon Baines wrote: 'If ever a piece of research should be ringing alarm bells at the Information Commissioner's Office then this one should'.


The Path Forward


The challenge of AI bias in public services is not insurmountable, but it requires sustained commitment from all stakeholders. The IEEE SA's recently published standard for Algorithmic Bias Considerations (IEEE 7003-2024) provides a comprehensive framework for organisations to address bias in AI systems throughout their lifecycle.


At Datnexa, we believe that responsible AI development requires both technical expertise and a deep understanding of social justice issues. The tools and frameworks exist to create fairer AI systems; what's needed now is the will to implement them systematically across the public sector.


The LSE study has opened a crucial window into the hidden biases embedded in AI systems affecting millions of lives. We cannot afford to close it without taking decisive action to ensure that artificial intelligence enhances rather than undermines the fundamental principle of equal treatment under public services.


The choice before us is clear: we can continue deploying AI systems that perpetuate historical inequalities, or we can use this moment to build technology that actively promotes fairness and justice. The women whose care needs are being systematically minimised by biased algorithms deserve nothing less than our complete commitment to the latter.
