Not Fair! Testing AI Bias and Organizational Values
About Me: Peter Varhol
• International speaker and writer
• Graduate degrees in Math, CS, Psychology
• Technology communicator
• AWS certified
• Former university professor, tech journalist
• Cat owner and distance runner
• peter@petervarhol.com
Gerie Owen
• Quality Engineering Architect
• Testing Strategist & Evangelist
• Test Manager
• Subject expert on testing for TechTarget’s SearchSoftwareQuality.com
• International and Domestic Conference Presenter
Gerie.owen@gerieowen.com
What You Will Learn
• Why bias is often an outcome of machine learning.
• How bias that reflects organizational values can be a desirable result.
• How to test bias against organizational values.
Agenda
• What is bias in AI?
• How does it happen?
• Is bias ever good?
• Building in bias intentionally
• Bias in data
• Summary
Bug vs. Bias
• A bug is an identifiable and measurable error in process or result
• Usually fixed with a code change
• A bias is a systematic skew in decisions that produces results inconsistent with reality
• Bias can’t be fixed with a code change
How Does This Happen?
• The problem domain is ambiguous
• There is no single “right” answer
• “Close enough” can usually work
• As long as we can quantify “close enough”
• We don’t quite know why the software responds as it does
• We can’t easily trace code paths
• We choose the data
• The software “learns” from past actions
How Can We Tell If It’s Biased?
• We look very carefully at the training data
• We set strict success criteria based on the system requirements
• We run many tests
• Most change parameters only slightly
• Some use radical inputs
• Compare results to success criteria (see the sketch below)
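As a rough illustration, here is a minimal Python sketch of this kind of test run: many slight perturbations of the inputs, a few radical ones, and a comparison against success criteria. The stand-in model, the tolerance, and the plausibility bounds are all hypothetical; a real harness would use the system's actual prediction function and requirements.

```python
import numpy as np

def perturbation_test(predict, baseline, tolerance=0.1, bounds=(-100.0, 100.0)):
    """Run many tests: mostly slight parameter changes, a few radical ones."""
    rng = np.random.default_rng(42)
    base_out = predict(baseline)
    failures = []
    # Most tests change parameters only slightly: output should move little.
    for _ in range(1000):
        x = baseline + rng.normal(0.0, 0.01, size=baseline.shape)
        if abs(predict(x) - base_out) > tolerance:
            failures.append(("unstable under small perturbation", x))
    # Some tests use radical inputs: output should at least stay plausible.
    for _ in range(50):
        x = baseline + rng.normal(0.0, 5.0, size=baseline.shape)
        if not bounds[0] <= predict(x) <= bounds[1]:
            failures.append(("implausible output on radical input", x))
    return failures

# Stand-in "model": a simple linear scorer, purely for illustration.
weights = np.array([0.5, -0.2, 0.8])
print(len(perturbation_test(lambda x: float(weights @ x), np.zeros(3))), "failures")
```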
Amazon Can’t Rid Its AI of Bias
• Amazon created an AI to crawl the web to find job candidates
• Training data was all resumes submitted for the last ten years
• In IT, the overwhelming majority were male
• The AI “learned” that males were superior for IT jobs
• Amazon couldn’t fix that training bias
Many Systems Use Objective Data
• Electric wind sensor
• Determines wind speed and direction
• Based on the cooling of filaments
• Designed a three-layer neural network
• Then used the known data to train it
• Cooling in degrees of all four filaments
• Wind speed, direction (see the sketch below)
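A toy version of such a network can be sketched with scikit-learn. The filament-cooling data and the speed/direction formulas below are synthetic stand-ins invented for illustration; the original system trained on real measured values.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Inputs: cooling (in degrees) of the four filaments. Synthetic values.
X = rng.uniform(0.0, 15.0, size=(500, 4))
# Targets: wind speed and direction, here a made-up function of cooling.
speed = X.mean(axis=1)
direction = (X[:, 0] - X[:, 2]) * 12.0 + 180.0
y = np.column_stack([speed, direction])

# "Three layers": 4 input units, one hidden layer, 2 output units.
net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(X, y)
print(net.predict(X[:3]))  # predicted (speed, direction) pairs

# If X had been collected under a single temperature/sunlight/humidity
# condition, the network would quietly inherit that condition as a bias.
```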
Can This Possibly Be Biased?
• Well, yes
• The training data could have been recorded under a single set of temperature/sunlight/humidity conditions
• Which could affect results under those conditions
• It’s a possible bias that doesn’t hurt anyone
• Or does it?
• Does anyone remember a certain O-ring?
Where Do Biases Come From?
• Data selection
• We choose training data that represents only one segment of the domain
• We limit our training data to certain times or seasons
• We overrepresent one population (see the sketch after this list)
• Or
• The problem domain has subtly changed
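A simple representativeness check can catch this kind of selection bias before training. In the sketch below, the `segment` column, the sample data, and the assumed domain proportions are all illustrative assumptions.

```python
import pandas as pd

# Hypothetical training set: 80% urban, 20% rural records.
train = pd.DataFrame({"segment": ["urban"] * 80 + ["rural"] * 20})
# What we believe the real problem domain looks like.
domain_share = {"urban": 0.55, "rural": 0.45}

observed = train["segment"].value_counts(normalize=True)
for segment, expected in domain_share.items():
    gap = observed.get(segment, 0.0) - expected
    if abs(gap) > 0.10:  # the tolerance is a judgment call
        print(f"{segment}: {observed.get(segment, 0.0):.0%} in training "
              f"vs {expected:.0%} in the domain (gap {gap:+.0%})")
```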
Where Do Biases Come From?
• Latent bias
• Concepts become incorrectly correlated
• Correlation does not mean causation
• But the correlation can look convincing enough to act on (see the sketch below)
• We could be promoting stereotypes
• This describes Amazon’s problem
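One way to probe for latent bias is to measure how strongly a protected attribute correlates with the model's scores. The sketch below uses synthetic, deliberately biased data; the attribute name and the 0/1 encoding are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
gender = rng.integers(0, 2, size=1000)          # 0/1 encoded, synthetic
scores = 0.3 * gender + rng.normal(0, 1, 1000)  # deliberately biased scores

r = np.corrcoef(gender, scores)[0, 1]
print(f"correlation between gender and score: {r:.2f}")
# A strong correlation is a warning sign, not proof of causation,
# but as the slide notes, it can look convincing enough to act on.
```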
Where Do Biases Come From?
• Interaction bias
• We may focus on keywords that users apply incorrectly
• User incorporates slang or unusual words
• “That’s bad, man”
• The story of Microsoft Tay
• It wasn’t bad, it was trained that way
Why Does Bias Matter?
• Wrong answers
• Often with no recourse
• Subtle discrimination (legal or illegal)
• And no one knows it
• Suboptimal results
• We’re not getting it right often enough
It’s Not Just AI
• All software has biases
• It’s written by people
• People make decisions on how to design and implement
• Bias is inevitable
• But can we find it and correct it?
• Do we have to?
Like This One
• A London doctor can’t get into her fitness center locker room
• The fitness center uses a “smart card” to access and record services
• While acknowledging the problem
• The fitness center couldn’t fix it
• But the software development team could
• They had hard-coded “doctor” to be synonymous with “male”
• It was meant as a convenient shortcut (see the sketch below)
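The actual code was never published, but a hypothetical reconstruction of the shortcut, and one way to fix it, might look like this:

```python
# Hypothetical reconstruction; the real system's code is not public.

# The shortcut: inferring a protected attribute from a job title.
def infer_gender_buggy(title):
    return "male" if title.lower() == "doctor" else "female"

# The fix: never infer a protected attribute; read it from the member
# record, where it was stated at registration.
def locker_room_access(member):
    return member["gender"]

print(infer_gender_buggy("Doctor"))  # "male", wrong for the London doctor
print(locker_room_access({"name": "Dr. Smith", "gender": "female"}))
```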
About That Data
• We use data from the problem domain
• What’s that?
• In some cases, scientific measurements are accurate
• But we can choose the wrong measures
• Or not fully represent the problem domain
• But data can also be subjective
• We train with photos of one race over another
• We train with our own values of beauty
Is Bias Always Bad?
• Bias can result in suboptimal answers
• Answers that reflect the bias rather than rational thought
• But is that always a problem?
• It depends on how we measure our answers
• We may not want the most profitable answer
• Instead we want to reflect organizational values
• What are those values?
Examples of Organizational Values
• Committed to goals of equal hiring, pay, and promotion
• Will not deny credit based on location, race, or other irrelevant factors
• Will keep the environment cleaner than we left it
• Net carbon neutral
• No pollutants into atmosphere
• We will delight our customers
Examples of Organizational Values
• These values don’t maximize profit at the expense of everything else
• They represent what we might stand for
• They are extremely difficult to train AI for
• Values tend to be nebulous
• Organizations don’t always practice them
• We don’t know how to measure them
• So we don’t know what data to use
• Are we achieving the desired results?
• How can we test this?
How Do We Design Systems With These Goals in Mind?
• We need data
• But we don’t directly measure the goal
• Is there proxy data?
• Training the system
• Data must reflect goals
• That means we must know or suspect that the data measures the bias we want (see the sketch below)
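One way to make this concrete is a mapping from each stated value to candidate proxy metrics, as in the sketch below. Every metric name, data source, and target here is an illustrative assumption, not a standard.

```python
# A sketch of mapping nebulous values to measurable proxy data.
value_proxies = {
    "delight our customers": {
        "csat_score": {"source": "quarterly survey", "target": ">= 4.5 of 5"},
        "resolution_hours": {"source": "ticket system", "target": "<= 24"},
    },
    "maintain a clean environment": {
        "net_co2_tonnes": {"source": "operations audit", "target": "<= 0"},
        "recycling_kg": {"source": "facilities report", "target": "trend up"},
    },
}

for value, proxies in value_proxies.items():
    print(value)
    for metric, spec in proxies.items():
        print(f"  {metric}: {spec['source']} (target {spec['target']})")
```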
Examples of Useful Data
• Customer satisfaction
• Survey data
• Complaints/resolution times
• Maintain a clean environment
• Emissions from operations/employee commute
• Recycling volume
• Equal opportunity
• Salary comparisons, hiring statistics
Sample Scenario
• “We delight our customers”
• AI apps make decisions on customer complaints
• Goal is to satisfy as many as possible
• Make it right if possible
• Train with
• Customer satisfaction survey results
• Objective assessment of customer interaction results (see the sketch below)
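A minimal training sketch for this scenario, using scikit-learn on synthetic stand-in data. The features, the "satisfied" labeling rule, and the noise level are assumptions, not the deck's actual dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 2000
# Features: complaint severity, customer tenure in years, refund offered.
X = np.column_stack([
    rng.uniform(0, 10, n),    # severity
    rng.uniform(0, 20, n),    # tenure
    rng.integers(0, 2, n),    # refund offered (0/1)
])
# Label: post-resolution survey said "satisfied" (synthetic rule plus noise).
satisfied = (X[:, 2] == 1) | (X[:, 0] < 3)
y = np.where(rng.random(n) < 0.9, satisfied, ~satisfied).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"holdout accuracy: {model.score(X_test, y_test):.2f}")
```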
Testing the Bias
• Define hypotheses
• Map vague values to operational definitions
• Establish test scenarios
• Specify the exact results expected
• With means and standard deviations
• Test using training data
• Measure the results in terms of those definitions (see the sketch below)
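A small sketch of that comparison step: check observed test results against the expected mean and standard deviation taken from the operational definitions. The target values and the two-sigma pass/fail rule below are hypothetical.

```python
import numpy as np

expected_mean, expected_std = 0.80, 0.05   # e.g., a satisfaction-rate target
results = np.array([0.78, 0.82, 0.79, 0.85, 0.77, 0.81])  # per-scenario outcomes

mean, std = results.mean(), results.std(ddof=1)
# Flag the run if the observed mean falls outside two expected standard
# deviations of the target: a simple operational pass/fail rule.
within = abs(mean - expected_mean) <= 2 * expected_std
print(f"mean={mean:.3f} std={std:.3f} -> {'PASS' if within else 'FAIL'}")
```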
Testing the Bias
• Compare test results to the data
• That data measures your organizational values
• Is there a consistent match?
• A consistent match means that the AI is accurately reflecting organizational values
• Does it meet the goals set forth at the beginning of the project?
• Are ML recommendations reflecting values?
• If not, it’s time to go back to the drawing board
• Better operational definitions
• New data
Finally
• Test using real-life data
• Put the application into production
• Confirm results in practice
• At first, side by side with human decision-makers
• Validate the recommendations with people
• Compare recommendations with results
• Yes/no: does the software reflect our values? (see the sketch below)
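A sketch of that side-by-side validation: run the model in shadow mode and measure agreement with the human decision-makers. The record format, the stand-in model, and the 90% agreement threshold are all assumptions for illustration.

```python
# Shadow-mode check: how often does the model agree with the humans?
def shadow_agreement(cases, predict):
    agree = sum(1 for c in cases if predict(c["features"]) == c["human_decision"])
    return agree / len(cases)

cases = [
    {"features": {"severity": 8}, "human_decision": "refund"},
    {"features": {"severity": 2}, "human_decision": "apology"},
    {"features": {"severity": 9}, "human_decision": "refund"},
]
predict = lambda f: "refund" if f["severity"] >= 5 else "apology"

rate = shadow_agreement(cases, predict)
verdict = "reflects values" if rate >= 0.9 else "investigate"
print(f"agreement with humans: {rate:.0%} -> {verdict}")
```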
Back to Bias
• Bias isn’t necessarily bad in ML/AI
• But we need to understand it
• And make sure it reflects our goals
• Testers need to understand organizational values
• And how they represent bias
• And how to incorporate that bias into ML/AI apps
Summary
• Machine learning/AI apps can be designed to reflect organizational values
• That may not result in the best decision from a strict business standpoint
• Know your organizational values
• And be committed to maintaining them
• Test to the data that represents the values
• As well as the written values themselves
• Draw conclusions about the decisions being made
Thank You
• Peter Varhol
peter@petervarhol.com
• Gerie Owen
gerie@gerieowen.com