2013-03-22
1
What We Actually Know
About Programming
And What We Ought To Do Next
Greg Wilson http://software-carpentry.org March 2013
Best Practices for Scientific Computing 2
You are free to:
Copy, share, adapt, or re-mix;
Photograph, film, or broadcast;
Blog, live-blog, or post video of;
This presentation. Provided that:
You attribute the work to its author and respect
the rights and licenses associated with its
components.
Arrr, Matey
Seven Years’ War (actually 1754-63)
Britain lost 1,512 sailors to enemy action...
...and almost 100,000 to scurvy
James Lind (1716-94)
1747: (possibly) the first-ever
controlled medical experiment
× cider
× sulfuric acid
× vinegar
× sea water
√ oranges
× barley water
Of course, no-one paid attention until a proper Englishman
repeated the experiment in 1794...
Oh, the Irony
1950: Hill & Doll publish a
case-control study comparing
smokers with non-smokers
Now called the “British Doctors”
study, it ran until 2001
It Took a While...
#1: Smoking causes
lung cancer
#2: Most people would
rather fail than change
“What happens ‘on average’ is
of no help when one is faced
with a specific patient…”
What They Found
The Cochrane Collaboration (http://www.cochrane.org/)
now archives results from hundreds of medical studies
“[Using domain-specific languages] leads to two primary
benefits. The first, and simplest, is improved programmer
productivity... The second...is...communication with
domain experts.”
– Martin Fowler,
IEEE Software,
July/August 2009
So Where Are We?
One of the smartest guys in the industry...
...made two substantive claims of fact…
Look Closer
...in a peer-reviewed journal...
...without a single citation…
…because nobody expected one
Growing emphasis on empirical studies in
software engineering since the mid-1990s
Papers describing new tools or
practices routinely include results
from some kind of field study
Many are flawed or incomplete,
but standards are constantly improving
A New Hope
Rigorous inspections can remove 60-90% of errors before
the first test is run. (Fagan 1975)
A Classic Result
The first review and the first hour matter most. (Cohen 2006)
A Classic Result Refined
Sackman, Erikson, and Grant (1968): “Exploratory
experimental studies comparing online and offline
programming performance.”
The best programmers are
up to 28 times more productive than the worst.
Or 10, or 40, or 100, or whatever other
large number pops into the head of
someone who can’t be bothered to
look up the reference...
Most Often Misquoted
1.  Study was designed to compare batch vs. interactive,
not measure productivity
2.  How was productivity measured, anyway?
3.  Best vs. worst exaggerates any effect
4.  Twelve programmers for an afternoon
Pick That Apart
Next major study was 54 programmers...
...for up to an hour
Boehm et al (1975): “Some Experience with
Automated Aids to the Design of Large-Scale
Reliable Software.”
...and many, many more since
1.  Most errors are introduced
during requirements analysis
and design
2.  The later they are removed,
the more expensive it is to
take them out
[Figure: number of errors and cost of fixing them, plotted against time]
Another Classic Result
Pessimists: “If we tackle the hump in the error injection curve, fewer bugs will get to the expensive part of the fixing curve.”
Optimists: “If we do lots of short iterations, the total cost of fixing bugs will go down.”
That Explains a Lot
Nagappan et al (2007) & Bird et al (2009):
Geography has little correlation with software quality
Distance in the org chart is a much better predictor
Isn’t That Interesting…
Are any metrics better at predicting faults/effort than LOC?
No.
Do more frequent releases improve software quality?
Yes, but it also changes the nature of the bugs.
Are there better ways to teach programming?
Yes: media-based instruction and peer instruction.
A Few More Results
Sampling Bias
I focus on quantitative
studies because they’re
what I know best
A lot of the best work in this
field is using qualitative
methods
All Together Now
Andy Oram & Greg Wilson (eds.):
Making Software: What Really
Works, and Why We Believe It.
O'Reilly, 2010, ISBN 978-0596808327.
http://neverworkintheory.org
Where to Start?
Many practices can be monitored
automatically
But top-down initiatives usually don’t work
Remember: some changes
will be qualitative
1.  How do you identify people with good ideas?
2.  How do you reward people for good ideas?
3.  How do they share those ideas with peers?
4.  How do you tell if it actually worked?
Words to Live By
If you build a man a fire,
you'll keep him warm for a night.
If you set a man on fire,
you'll keep him warm for the rest of his life.
— Terry Pratchett
http://software-carpentry.org
Thank You


Greg Wilson - We Know (but ignore) More Than We Think
