adding page while taking this MLTalks talk in:
Quantifying forgiveness: Watch the video from the recent #MLTalks with @ProPublica’s @JuliaAngwin and @Joi https://t.co/HwdHSuMBP3
Original Tweet: https://twitter.com/medialab/status/966422316510076928
7 min – the story we all know.. some people are forgiven for their crimes and some people are not.. and i feel bad about that (journalists get props for putting people away).. and so i started investigating forgiveness..t
the weird thing about automation and tech is.. it is auditable.. so we can see systemic bias in a way we can’t really see in human minds
8 min – the algo i looked at was the one that predicts risk of recidivism.. used for pre trial.. parole.. sentencing.. looked at 18 000 people over 2 yrs.. got to a sample of 7 000 people.. to look at a 2 yr stretch.. found: when you control for all factors.. black people 45% more likely to be assigned a higher risk score.. twice as likely to get a false positive
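the false positive gap above is just a counting exercise once outcome data exists.. a minimal sketch in python — all groups, labels and outcomes below are made up for illustration, not ProPublica’s data:

```python
# minimal audit sketch: false positive rate = share of people who did NOT
# reoffend within 2 yrs but were still labeled high risk
def false_positive_rate(records):
    did_not_reoffend = [r for r in records if not r["reoffended"]]
    wrongly_flagged = [r for r in did_not_reoffend if r["high_risk"]]
    return len(wrongly_flagged) / len(did_not_reoffend)

# hypothetical audit records: risk label assigned pre trial vs actual 2 yr outcome
records = [
    {"group": "A", "high_risk": True,  "reoffended": False},
    {"group": "A", "high_risk": True,  "reoffended": False},
    {"group": "A", "high_risk": False, "reoffended": False},
    {"group": "A", "high_risk": True,  "reoffended": True},
    {"group": "B", "high_risk": True,  "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": True},
]

for g in ("A", "B"):
    group_records = [r for r in records if r["group"] == g]
    print(g, round(false_positive_rate(group_records), 2))
# A 0.67
# B 0.33
```

in this toy sample group A’s wrongly-flagged share comes out at twice group B’s.. the same shape as the finding above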
18 min – finding: it’s more unjustified forgiveness (for whites) than bias..
19 min – also an algo that predicts risk of car accidents – for ins co’s.. minority neighborhoods get higher rates.. ins co’s put on or take away a charge based on zip codes
25 min – once again.. a big gap where we’ve chosen to give one group a pass
so i’m hoping we can look more at forgiveness than bias.. but .. i’m really thankful we are doing this automated.. so we can really look at it.. i’m hopeful that these kinds of data can help change the debate (it has in some places)
27 min – real problem.. a lot of the problems were in rural.. so not a lot of data.. so strung together neighborhoods.. i think it was .. not enough data so made a guess.. ie: here’s a lot of nice white people
29 min – congress gave ins co’s an exemption.. so only regulated by states.. and states don’t usually regulate
31 min – i think one thing frustrating about my findings.. people say.. of course (obvious to them).. but we do need the data to show it.. until you lob data over the fence.. don’t get a real policy dialogue going
34 min – all these data convos come down to that: them saying ‘you’re looking at the wrong pool’.. (ie: that doesn’t fit w our data.. but our data is secret) .. that’s why i feel so strongly about journalists collecting their own data rather than using received data sets.. there’s a reason they don’t collect it..
37 min – joi: weird thing about the word fair.. fair for who..t
39 min – predicting violence – 20% (on virginia and child abuse/neglect)
42 min – joi: if you had accurate data and predicting 100%.. would it be fair..t
hard for me to answer.. personally i feel very uncomfortable with the use of a future crime in the sentencing of a current crime.. i believe in human change and redemption..t
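the ‘fair for who’ question has a concrete version at the heart of the compas debate.. a score can be calibrated (a 0.6 score means a 60% reoffense rate in every group) and still give the group with the higher base rate a higher false positive rate.. a sketch with made-up numbers — groups a/b and all counts are hypothetical:

```python
# buckets of people sharing a risk score: (group, score, n_reoffended, n_did_not)
buckets = [
    ("A", 0.6, 6, 4), ("A", 0.2, 2, 8),    # group A: higher base rate
    ("B", 0.6, 3, 2), ("B", 0.2, 3, 12),   # group B: lower base rate
]

def calibration(group, score):
    # observed reoffense rate among people in this group with this score
    yes = sum(y for g, s, y, n in buckets if g == group and s == score)
    no = sum(n for g, s, y, n in buckets if g == group and s == score)
    return yes / (yes + no)

def fpr(group, threshold=0.5):
    # among people in this group who did NOT reoffend, share labeled high risk
    no_total = sum(n for g, s, y, n in buckets if g == group)
    no_flagged = sum(n for g, s, y, n in buckets if g == group and s >= threshold)
    return no_flagged / no_total

# calibrated: a 0.6 score means a 60% reoffense rate in BOTH groups..
assert calibration("A", 0.6) == calibration("B", 0.6) == 0.6
# ..yet the false positive rates come out unequal: 4/12 vs 2/14
print(fpr("A"), fpr("B"))
```

both groups see the exact same score-to-outcome mapping.. and group A still eats more than twice the false positives.. so ‘fair’ depends on which of those two properties you pick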
43 min – joi: on trying to focus less on prediction and more on causal…
47 min – the terrible ness of the criminal justice is so much deeper than algos.. ie: no more human contact.. no natural light.. outdoor space.. et al
49 min – joi: on letting people out of jail but with ridiculous rules they can never follow.. as you optimize for single score.. which seems good because jails are so bad.. but may just be smearing this around.. are you looking at the right problems.. or reducing problem.. to something else..
50 min – algos exist because people need to tell the public they are only letting low risk people out.. part of the goal of ending mass incarceration..telling people it’s ok.. the science is here.. it’s a political story more than a data story.. the data is there to solve that problem..
52 min – joi: don’t blame the algo.. it’s the political system that created the optimization that the algo was set for..t.. ie: optimize for money over convenience of the families.. i think that’s the real problem.. we don’t really seem to be good at figuring out how (who) to decide.. use data to make it so we can see what’s going on.. then decide..
maybe because these problems we keep trying to algo.. shouldn’t/needn’t be here in the first place.. ie: let’s work on a deeper problem (maté basic needs) that would make these problems irrelevant.. as it could be
i’m just really good at problems and i suck at solutions.. i do think correctly diagnosing a problem is.. valuable.. bringing quantification to these problems makes them addressable..
53 min – an addressable problem is the thing i showed .. you can buy ads targeted to jew haters.. i’m in the world of addressable problems..
55 min – q&a
q: what if what is done at the end is different ie: giving someone money or bus pass or babysitter or healthcare.. something that doesn’t hurt anybody if false..
a: that’s how they’re used in canada.. risk and needs assessments.. when it came to the us.. had a needs section.. but judges have said.. very limited resources.. and judges are also scared of being the one that let the guy out
58 min – joi: if you could at least go one layer deeper on what ‘failure to appear’ means.. massive database of needs.. could be used to help.. but also used to hurt them
1:01 – q: maybe fairness is less about analyzing more data and more about coming up with better alternatives.. ie: can we quantify the economic loss of society for all the biases that are being done
1:12 – all the due process req’s only happen when you go to trial.. and no one goes to trial.. no due process at pre trial.. defendants have very few rights to defend themselves
1:13 – problem is the extreme latitude of judges.. can judge however they want/feel
1:16 – q: is risk a reductionist neoliberal concept.. and if it is.. is there an alt concept you’d like to see data scientists start to orient themselves around for modeling purposes
a: i do think risk is often narrowly/politically defined and people are unwilling to acknowledge that.. i still think it’s useful.. ie: i’m not expanding beyond the scope of risk.. and i’m still showing they still don’t have it going on.. i’m going to their playing field.. and just using that to show this
1:25 – joi: at mit we can’t do a lot of studies we want to do.. and copyright ness that you can’t audit..all these stifling things.. a lot of people that talk about freedom of internet.. don’t talk about these laws that are impeding research
Julia Angwin is an award-winning investigative journalist at the independent news organization ProPublica.
From 2000 to 2013, she was a reporter at The Wall Street Journal, where she led a privacy investigative team that was a Finalist for a Pulitzer Prize in Explanatory Reporting in 2011 and won a Gerald Loeb Award in 2010. Her book, Dragnet Nation: A Quest for Privacy, Security and Freedom in a World of Relentless Surveillance, was published by Times Books in 2014.
In 2003, she was on a team of reporters at The Wall Street Journal that was awarded the Pulitzer Prize in Explanatory Reporting for coverage of corporate corruption. She is also the author of “Stealing MySpace: The Battle to Control the Most Popular Website in America” (Random House, March 2009).
She earned a B.A. in mathematics from the University of Chicago, and an MBA from the Graduate School of Business at Columbia University.