weapons of math
(2016) by Cathy O’Neil
many poisonous assumptions are camouflaged by math and go largely untested and unquestioned..t
the privileged, we’ll see time and again, are processed more by people, the masses by machines
let’s go deeper – why are we processing people..?
for years, washington teachers complained about the *arbitrary scores and clamored for details on what went into them.. it’s an algo .. they were told. it’s very complex.. this discouraged many from pressing further.. many people, unfortunately, are **intimidated by math
dang.. so much in this .. *sounds like school.. grades/degrees et al **school again.. does that.. let’s question the deeper things first.. rather than trying to get the math/algo unbiased.. let’s unbias.. (aka: free) us..
sarah bax: ‘how do you justify evaluating people by a measure for which you are unable to provide explanation’..t
good rule for life.. why are we evaluating/measuring each other at all..? too much.. ie: inspectors of inspectors
‘she had been pleased to see her incoming 5th graders had scored surprisingly well.. 29% of students ranked at an ‘advanced reading level’.. 5x avg in school district..
dang.. how is that ok.. how are you not seeing the juxta/fractal here..
yet when classes started she saw many struggled to read.. investigations.. revealed a high level of erasures on the standardized tests.. high rate of corrected answers points to greater likelihood of cheating..
we devalue the human spirit when we think we have to .. or even can.. measure it.. if some adult/admin isn’t cheating on scores.. kids surely are.. not as bad people.. but as pretty good survival/coping skills
teacher eval algos are a powerful tool for behavioral modification.. that’s their purpose.. feature both stick and carrot..
supposed to‘s are killing us all.. deeper than wmd.. wmd are just one of the tools.. sure..one of the main tools.. but a symptom of people growing up in school’s supposed-to-grip.. to the point that (back to your earlier point about not questioning anything math related).. we don’t question anything.. we just follow orders.. like.. do your math hw – which cycles us back around to neocortex fearing failure rejection in math at grade 2.. shutting us down.. oh my math.. so goes much deeper than algo’s evaluating people.. we’re looking at whales in sea world.. and assuming they are acting naturally..
do you see the paradox?.. an algo says someone might be a bad hire.. can turn someone’s life upside down.. when person fights back.. case must be ironclad.. we’ll see time and again.. human victims of wmds are held to a far higher standard of evidence than the algos themselves..
here’s the paradox: this is exactly what adults (in your version the algos) have been doing to kids (in your version humans) for years..
and yet when i heard interviews w the occupiers, they often seemed ignorant of basic issues related to finance
not a failure of learning/ignorance.. a failure of supposed to ness.. of thinking that money is natural/basic.. ie: making up money.. we have to let go of that if we want to get to global equity: everyone getting a go everyday.. (7bn everyday.. there’s some ginorm/small math for the math geeks)
ill conceived mathematical models now micromanage the economy, from advertising to prisons..
worse.. we’re micromanaging humans..
what we need most.. the energy of 7bn alive people.. let’s focus on that
the trouble is that profits end up serving as a stand in, or proxy, for truth.. we’ll see this dangerous confusion crop up again and again..
this happens because data scientists all too often lose sight of the folks on the receiving end of the *transaction.. they certainly understand that a data crunching program is bound to misinterpret people a **certain % of time, putting them in the wrong groups and denying them a job or a chance at their dream house.. but as a rule, the people running the wmds don’t dwell on those errors.. their feedback is money, which is also their incentive
*we’ve got to let go of measuring transactions.. ie: 10 day care centers ness
**like 100%.. because we’re algo\ing non legit data (ie: whales in sea world)
again.. let’s go ginorm/small on this.. ie: just focusing on labeling people w daily curiosities.. in order to match them locally.. sans money/measure .. if we think we need incentives.. pretty good sign we’re doing it wrong..
ie: i don’t think people would really crave a job and a dream home.. if we could get us back to an undisturbed ecosystem
we will explore harmful ie’s that affect people at *critical life moments: going to college, borrowing money, getting sentenced to prison, or finding and holding a job.. all of these **life domains are increasingly controlled by secret models wielding arbitrary punishments..
welcome to the dark side of big data
those aren’t our natural *critical life moments.. rather they are part of the broken feedback loop of supposed to‘s
those aren’t natural *life domains.. we need to get back to basic needs first.. that’s our deeper dark side.. we could get there by data.. but i’m thinking.. only if we focus on self-talk as data
1 – bomb parts – what is a model
on the stats in baseball and ted williams.. the answers to all of these questions (how much more is a double worth than a single.. when if ever is it worth it to bunt a runner from first to second).. are blended and combined into mathematical models of their sport. these are parallel universes of the baseball world, each a complex tapestry of probabilities.. the purpose is to run diff scenarios at every juncture.. looking for the optimal combos
does that make the fun..?
i’ve seen a lot of un fun situations in sports.. because of all the angles/rules
imagining in life especially.. but also in sport.. we’d be better off w/o all the stats
starting in the 1980s serious statisticians started to investigate what these figures.. really meant: how they translated into wins, and how executives could max success w a min of dollars..
play is not about winning.. life is not about money/profit/measure
moneyball is now shorthand for any statistical approach in domains long ruled by the gut
let’s get back to the gut..
but baseball reps a healthy case study – and it serves as a useful contrast to the toxic models..
baseball models are fair, in part, because they’re transparent.. everyone has access to the stats.. also has stat rigor.. ie: immense data set .. data highly relevant to outcomes they are trying to predict.. the folks building wmds routinely lack data for the behaviors they’re most interested in .. so they substitute stand in data or proxies.. these correlations are discriminatory..
models are opinions embedded in mathematics.
the question however is whether we’ve eliminated human bias or simply camouflaged it w technology..
again.. deeper.. it’s not the bias that’s killing us (though that’s a killer) .. it’s that we aren’t ourselves..
this is the basis of our legal system. we are judged by what we do, not by who we are..
maybe .. if all of us were actually ourselves.. a legal system .. people judging other people.. would become irrelevant/ridiculous.. ie: eudaimoniative surplus/undisturbed ecosystem
the model itself contributes to a toxic cycle and helps to sustain it.. that’s a signature quality of wmd
so.. hope you’d agree that any type of measuring of people is a toxic cycle.. or as you say.. wmd.. not new.. here from the very first disturbance to our ecosystem..
wmds are by design inscrutable black boxes..
2 – shell shocked – my journey of disillusionment
mathematical models, by their nature, are based on the past, and on the assumption that patterns will repeat
so.. not on alive (and so antifragile) people.. not on an undisturbed ecosystem..
to be clear, the subprime mortgages that piled up during the housing boom.. were not wmds. they were financial instruments, not models, and they had little to do w math.. (in fact, the brokers went to great lengths to ignore inconvenient numbers)
but when banks started loading mortgages .. into classes of securities and selling them, they were relying on flawed math models to do it.. the risk model attached to mortgage backed securities was a wmd..
very few people had the expertise and the info required to know what was actually going on statistically, and most of the people who did lacked the integrity to speak up..
i was esp disappointed in the part that math had played… people had deliberately wielded formulas to impress rather than clarify.. so i left the hedge fund in 2009 w the conviction that i would work to fix the financial wmds..
i soon realized i was in the rubber stamp business.. in 2011 it was time to move again.. newly proclaimed data scientists.. ready to plunge into the internet econ.. i started out building models to anticipate the behavior of visitors to various travel websites… the key question was whether someone showing up at the expedia site was just browsing or looking to spend money..
? still rubber stamping our broken feedback loop . .no?
i saw all kinds of parallels between finance and big data
money vindicates all doubts.. and the rest of his circle plays along, forming a mutual admiration society..
the inclination is to replace people w data trails, turning them into more effective shoppers, voters, workers.. to optimize some objective.. this is easy to do and to justify, when success comes back as an anonymous score and when the people affected remain every bit as abstract as the numbers dancing across the screen..
i was getting more involved w the occupy movement.. more and more i worried about the separation between tech models and real people..
in spite of blogging almost daily.. i could barely keep up w all the ways i was hearing of people being manipulated, controlled, and intimidated by algos.. truly alarmed, i quit my job to investigate the issue in earnest
total resonance.. only.. from the perspective of supposed to’s (get ed, job, money, et al) not simply algo’s.. algo’s just a facilitator of hastening..
please go deeper w me
3 – arms race – going to college
what does a single national diet have to do w wmds? scale a formula, whether it’s a diet or a tax code.. if it grows.. creates distorted and dystopian econ.. this is what has happened in higher ed
thinking one size fits all anything (econ or ed or food or whatever you call it) is a disturbance to our ecosystem.. we need all the parts of our one body.. or it won’t work.. we won’t dance..
today.. we have the means to facilitate that diversity (discrimination to infinity as equity).. so let’s focus our energy/math/whatever on that
(in the ranking et al) it was just people wondering what matters most in ed, then figuring out which of those variables they could count, and finally deciding how much weight to give each of them in the formula
they had no direct way to quantify how a 4 yr process affected one single student, much less tens of millions of them.. they couldn’t measure learning, happiness, confidence, friendships, or other aspects of a student’s 4 yr experience..
instead they picked proxies that seemed to correlate w success. they looked at sat scores, student teacher ratios.. acceptance/graduation rates…
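a minimal sketch of that kind of formula, in python.. the proxy names, the values, and the weights are all made up for illustration (not the actual us news inputs), and each proxy is assumed to be already scaled to 0..1 w higher meaning 'better'.. the point is just the shape: pick what can be counted, weight it, sum it into one score, sort..

```python
# hypothetical weights chosen by the model builders (illustrative only)
WEIGHTS = {
    "sat_score": 0.4,
    "student_teacher_ratio": 0.2,
    "acceptance_rate": 0.2,
    "graduation_rate": 0.2,
}


def ranking_score(proxies, weights=WEIGHTS):
    """Weighted sum of countable proxies; learning itself never enters."""
    return sum(weights[name] * proxies[name] for name in weights)


def rank(colleges, weights=WEIGHTS):
    """Return college names ordered from highest score to lowest."""
    return sorted(
        colleges,
        key=lambda name: ranking_score(colleges[name], weights),
        reverse=True,
    )


# made-up colleges w made-up proxy values
colleges = {
    "college_a": {"sat_score": 0.9, "student_teacher_ratio": 0.5,
                  "acceptance_rate": 0.8, "graduation_rate": 0.7},
    "college_b": {"sat_score": 0.6, "student_teacher_ratio": 0.9,
                  "acceptance_rate": 0.5, "graduation_rate": 0.9},
}

for name in rank(colleges):
    print(name, round(ranking_score(colleges[name]), 2))
```

change the weights and the order can flip.. the ranking says as much about what the builders chose to count as about the colleges..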
sad.. college presidents.. they were almost like students again, angling for good grades from a taskmaster, in fact, they were trapped by a rigid model, a wmd..
the problem isn’t the us news model but its scale. it forces everyone to shoot for exactly the same goals, which creates a rat race – and lots of harmful unintended consequences..
yeah.. like k-12 et al.. and work.. we need to reset.. we need a global do over.. we need a rat park for people..
we cannot place the blame for this trend entirely on the us news rankings. our entire society has embraced not only the idea that a college ed is essential but the idea that a degree from a highly ranked school can catapult a student into a life of power and privilege.. the us news wmd fed on these beliefs, fears, and neuroses. it created powerful incentives that have encouraged spending while turning a blind eye to skyrocketing tuitions and fees
as the rankings grow so do efforts to game them..
some 2000 stone throwing protesters gathered outside the school.. in china: ‘we want fairness. there is no fairness if you don’t let us cheat’.. they knew others were cheating .. so preventing them from cheating was unfair.. in a system in which cheating is the norm, following the rules amounts to a handicap..
each college’s admissions model is derived, at least in part, from the us news model, and each one is a mini wmd
the result is an ed system that favors the privileged..
result of school/work et al is a life system that favors people who conform.. blindly/passively follow rules
but even those who claw their way in to a top college lose out.. college admission game.. has virtually no ed value..
so is there a fix.. obama suggested coming up w a new college rankings model.. tie rankings to a diff set of metrics, including affordability.. job placement.. worthy goals to be sure but every ranking system can be gamed..
obama’s didn’t happen.. so govt capitulated.. and result might be better.. instead of ranking.. ed dept released loads of data on a website.. students can ask own questions
4 – propaganda machine – online advertising
as he saw it, most people object to advertisements because they were irrelevant to them.. in the future, they wouldn’t be.. people would welcome pitches tailored to them…
anywhere you find the combo of great needs and ignorance, you’ll likely see predatory ads.. the boom in for profit colleges is fueled by predatory ads..
they zero in on the most desperate among us at enormous scale.. in ed, they promise what’s usually a false road to prosperity..
it gets complicated.. various messaging campaigns all interact w each other and much of their impact can’t be measured.. it’s easier to track online messaging.. that’s why much of the ad money at for profit unis goes to google and fb.. each of these platforms allows advertisers to segment their target populations in meticulous detail..
recruiting in all of its forms is the heart of the for profit business and it accounts for far more of their spending, in most cases, than education.. a senate report on 30 for profit systems found that they employed one recruiter for every 48 students.. apollo group, the parent co for the uni of phoenix spent more than a bn dollars on marketing in 2010, almost all of it focused on recruiting.. that came out to $2225 per student on marketing and only $892 per student on instruction…
5 – civilian casualties – justice in the age of big data
thanks largely to the industry’s wealth and powerful lobbies, finance is underpoliced.. just imagine if police enforced their zero tolerance strategy in finance.. not likely of course.. bankers are virtually invulnerable.. they spend heavily on our politicians, .. and are also viewed as crucial to our econ.. that protects them.. if their banks go south, our econ could go w them.. (the poor have no such argument)
as a group, they made it thru the 2008 market crash practically unscathed. what could ever burn them now
my point is that police make choices about where they direct their attention.. today they focus almost exclusively on the poor.. and now data scientists are stitching this status quo of the social order into models.. that hold ever greater sway over our lives..
the result is that we criminalize poverty, believing all the while that our tools are not only scientific but fair..t
(on stop and frisk) .. about 85% young african american or latino men.. in certain neighborhoods, many were stopped repeatedly. only .1%, or 1 of 1000 was linked in any way to a violent crime. yet this filter captured many others for lesser crimes, .. that might have otherwise gone undiscovered.. some of the targets,..got angry, and a good number of those found themselves charged w resisting arrest..\93
nyclu sued bloomberg admin.. charging that the stop and frisk policy was racist.. black men, they argued, were 6x more likely to be incarcerated than white men and 21x more likely to be killed by police.. .. at least according to the available data (which is famously underreported)
stop and frisk isn’t exactly a wmd, because it relies on human judgment and is not formalized into an algo.. but it is built upon a simple destructive calculation.. not so diff from long shot calculations used by predatory advertisers or spammers.. if you give yourself enough chances you’ll reach your target..
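that long shot calculation can be sketched in a few lines of python.. this is only an illustration, assuming every stop is an independent try w the same small hit rate (the .1% figure quoted above), which real stops certainly are not.. the function names are hypothetical..

```python
def chance_of_at_least_one_hit(hit_rate, attempts):
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1 - (1 - hit_rate) ** attempts


def expected_misses(hit_rate, attempts):
    """Expected number of tries that find nothing (people stopped for no result)."""
    return attempts * (1 - hit_rate)


hit_rate = 0.001  # the 1-in-1000 figure quoted above

for attempts in (100, 1000, 5000):
    print(
        attempts,
        round(chance_of_at_least_one_hit(hit_rate, attempts), 3),
        round(expected_misses(hit_rate, attempts)),
    )
```

the chance of 'reaching the target' climbs toward certainty as attempts grow.. but so does the count of misses, and every miss is a person..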
aspects of stop and frisk were similar to wmds though.. ie: it had a nasty feedback loop.. it ensnared 1000s of black and latino men.. .. many for committing the petty crimes and misdemeanors that go on in college frats, unpunished, every saturday night.. but while the great majority of uni students were free to sleep off their excesses, the victims of stop and frisk were booked, and some dispatched to the hell that is rikers island.. each arrest created new data, further justifying the policy..
the constitution’s implicit judgment is that freeing someone who may well have committed a crime, for lack of evidence, poses less of a danger to our society than jailing or executing an innocent person.
wmds by contrast, tend to favor efficiency.. they feed on data that can be measured and counted.. but fairness is squishy and hard to quantify.. it is a concept. and computers, for all their advances in language and logic, still struggle mightily w concepts. so.. fairness isn’t calculated into wmds and the result is massive industrial production of unfairness..
people who favor policies like stop and frisk should experience it themselves. justice cannot just be something that one part of society inflicts upon the other..
and why are nonwhite prisoners from poor neighborhoods more likely to commit crimes? according to the data inputs for the recidivism models, it’s because they’re more likely to be jobless, lack a hs diploma, and have had previous run ins w the law.. and their friends have too.. another way of looking at the same data, though, is that these prisoners live in poor neighborhoods w terrible schools and scant opps.. and they’re highly policed.. the poor and non white are punished more for being who they are and living where they live..
(just after bit on amazon’s endless questions.. because future of co hinges upon a system that learns continually.. figuring out what makes customers tick): if i had a chance to be a data scientist for the justice system, i would do my best to dig deeply to learn what goes on inside those prisons and what impact those experiences might have on prisoners’ behavior.. i’d first look into solitary confinement..
we need a reset.. not a closer look (we already have a closer look.. ie: solitary, kalief, shaka, et al)
researchers have found that time in solitary produces deep feelings of hopelessness and despair.. could that have an impact on recidivism..? that’s a test i’d love to run, but i’m not sure the data is even collected.. how about rape
? what..? on recidivism? how about suicide..?
a serious scientist would also search for positive signals from the prison experience. what’s the impact of more sunlight, more sports, better food, literacy training? maybe these factors will improve convicts’ behavior after they go free..
wow.. how about a serious scientist working on no more incarceration..
the goal, if data were used constructively, would be to optimize prisons – much the way co’s like amazon optimize websites or supply chains.. for the benefit of both the prisoners and society at large
dang.. how about we use diff data (ie:self-talk as data).. and have the goal be .. no more judging other people.. just assume good .. crazy..? what we’re doing now is crazy.. we could so get there.. via ie: gershenfeld sel
instead of just trying to eradicate crimes, police should be attempting to build relationships in the neighborhood..
in camden nj.. in 2012, placed under state control w dual mandate: lowering crime and engendering community trust.. if building trust is the objective, an arrest may well become a last resort, not the first..
trust begs no judgment
from a mathematical pov however trust is hard to quantify
yeah.. because it’s either 100% or it’s just judge\ment (killer of relationships/community)
6 – ineligible to serve – getting a job
1971 – court ruled intelligence tests for hiring were discriminatory and therefore illegal.. instead the industry opted for personality tests.. like kronos (mit project gone 500 mn annual business).. purpose not to find best employee.. but to exclude as many people as possible as cheaply as possible..
much like college admin et al..
many of the tests used today force applicants to make difficult choices.. ie: do you get mad easily.. many leave prospect pleading either high strung or lazy..
some 72% of resumes are never seen by human eyes.. t..
so job applicants must craft their resumes w that automatic reader in mind.. ie: words the specific job is looking for, languages, honors.. .. images are useless.. fancy fonts just confuse machines.. arial and courier.. are safest.. no symbols such as arrows.. ..so.. those w money and resources to prep resumes come out on top
7 – sweating bullets – on the job
all the more reason to spread the word about these and other wmds. once people recognize them and understand their statistical flaws, they’ll demand evaluations that are fairer for both students and teachers..t
or perhaps we’ll all wake up and realize that evaluations/judgements.. are a disturbance to humanity/ecosystem
8 – collateral damage – landing credit
(on local bankers standing tall in a town.. controlled money.. put on sunday best to visit them.. he’d know all about you.. would judge you) and there’s a good chance he was more likely to trust people from his own circles. this was only human. but it meant that for millions of americans the predigital status quo was just as awful as some of the wmds i’ve been describing..
so too school.. where judge\ment of people is prime
it just wasn’t fair. and then along came an algo, and things improved.. the (fico) score was color blind
improved.. or perpetuated a broken feedback loop.. ie: that money (or ed or whatever) is what we needed.. when what we’ve needed most all along .. is the energy of 7bn (aka: all human) alive people
these scores (fico) have lots of commendable and non wmd attributes.. first, they have a clear feedback loop. credit co’s can see which borrowers default on loans and match those numbers against their scores.. *this is a sound use of stats
? maybe *no use of stats is sound .. for humanity
of math and men.. oh my math.. measuring things
fico’s website, for ie, offers simple instructions on how to improve your *(credit) score (reduce debt, pay bills on time, and stop ordering new credit cards)
dang.. debt et al.. how about we focus on how to improve *eudaimonia .. a sound use of energy
creditworthiness has become an all too easy stand in for other virtues.. conversely, bad credit has grown to signal a host of sins and shortcomings that have nothing to do w paying bills..
the practice of using credit scores in hirings and promotions creates a dangerous poverty cycle..
framing debt as a moral issue is a mistake..t
not letting go of the concept of debt.. is a cancer
9 – no safe zone – getting insurance
you can imagine how machine learning systems fed by diff streams of behavioral data will be soon placing us not just into one tribe but into hundreds of them.. even thousands.. through the process, we will rarely learn about the tribes we ‘belong’ to or why we belong .. many of those tribes will mutate hour by hour, even minute by minute.. as the systems shuttle people from one group to another.. after all, the same person acts very differently at 8am and 8pm..t
great description.. now imagine that w a positive twist.. ie: mech listening to daily curiosities and helping us find our tribe.. (day by day.. hour by hour.. whatever) .. according to our whimsy.. let’s spend our energy on that.. no?
ie: 2 convers.. as infra..augmenting interconnectedness
let’s ‘wrest back a measure of control’ with that
10 – the targeted citizen – civic life
voting and propaganda in elections
we have to explicitly embed better values into our algos, creating big data models that follow our ethical lead.. t
ie: self-talk as data
to eliminate wmds we must advance beyond establishing best practices in our data guild. our laws need to change, too. and to make that happen we must reevaluate our metric of success..
redefining success.. to be flavors of success.. leading to eudaimoniative surplus.. aka: undisturbed ecosystem
there’s no fixing a backward model like the value added model. the only solution in such a case is to ditch the unfair system. forget, at least for the next decade or two, about building tools to measure the effectiveness of a teacher..
how about we forget school.. ie: it’s measuring the effectiveness of a kid.. every day.. wmd
only when we have an ecosystem w positive feedback loops can we expect to improve teaching using data. until then it’s just punitive..
teaching something that wasn’t requested.. by the learner.. is punitive.. coercive.. destructive.. kids tests are totally fractal to teacher evals.. process is totally similar ie: ridiculous hoops..
ok .. dang.. mostly frustrating.. ie: spot on many of the math myths/fobias.. et al.. using it as a control mech because most think they don’t/can’t understand and are too ashamed (or whatever) to admit so they just go along.. but not going deep enough on how tech could address our dilemma today.. not improve tests/evals/credit-scores.. rather.. help us to disengage from them all together.. judge\ing people.. is not ok.. it’s the poison we need to let go of
ie: hlb via 2 convos that io dance.. as the day..[aka: not part\ial.. for (blank)’s sake…].. a nother way
the energy of 7bn alive people