Tuesday, August 22, 2017

Eight Ways of Looking at a Drone Strike

With apologies to Wallace Stevens.

I.
Among twenty snowy mountains
The only moving thing
Was the camera of the drone.


II.
I was of three minds,
Like a voting booth
In which there are two ballots.

III.
The drone cruised on the autumn winds.
It was a small part of the pantomime.

IV.
A Republican and a Democrat
Are one.
A Republican and a Democrat and a Predator Drone
Are one.

V.
I do not know which to fear,
The terrors of President Trump
Or the militarism of neoliberals,
The drone killing
Or just after.

VI.
A bird's-eye view on the monitor
A bomb "crashed" the wedding
The shadow of the drone
Crossed it, to and fro
The route
Traced on the radar
The word Realpolitik

VII.
Oh thin men of Bayda,
Why do you imagine golden birds?
Do you not see how the drone
Glides above the heads
Of the people about you?

VIII.
I know noble declarations
"The world safe for democracy"
But I know too
That the drone strike is involved
In what I know.

 

Sunday, August 13, 2017

Internment by Big Data

Big Data, IBM tells us, is "emerging as the world's newest resource for competitive advantage." Algorithms and neural networks fed with big data are driving cars, translating between languages, creating forged video clips of ex-presidents, classifying images, and -- and here is where our Skynet fears kick in -- beating the best human Go players, e-sports "athletes," and fighter pilots.

Software is already reading X-rays and, apparently, doing a better job than many radiologists. Not a day goes by without a newspaper article or TV segment on how big data and AI are ganging up and coming for our jobs. Implicit in this reporting is the idea that humans are fallible, labile, and emotional; we tire easily and our perceptions are distorted by animosities and prejudice. Computers on the other hand...

We cannot write this idea of algorithmic impartiality off as journalistic sensationalism. The actors who develop the software are often just as guilty, as are many academics. In a much-quoted paper from 2013, "The Future of Employment: How Susceptible Are Jobs to Computerisation?", Carl Benedikt Frey and Michael A. Osborne at Oxford University write:
"Computerisation of cognitive tasks is also aided by another core comparative advantage of algorithms: their absence of some human biases. An algorithm can be designed to ruthlessly satisfy the small range of tasks it is given. Humans, in contrast, must fulfill a range of tasks unrelated to their occupation, such as sleeping, necessitating occasional sacrifices in their occupational performance (Kahneman, et al., 1982). The additional constraints under which humans must operate manifest themselves as biases. Consider an example of human bias: Danziger, et al. (2011) demonstrate that experienced Israeli judges are substantially more generous in their rulings following a lunch break. It can thus be argued that many roles involving decision-making will benefit from impartial algorithmic solutions."
It could indeed be argued that "impartial algorithmic solutions" should supplant human jobs, but it would not make for a very good argument, for the simple reason that there is no such thing as an impartial algorithm. Attempts to outsource judicial decision-making to algorithms have proven catastrophic. Far from being impartial, the algorithms have turned out to be deeply racist. They not only replicate the prejudice of the humans who created the data in the first place, they also amplify it and remove any transparency (neural networks are "black boxes": they will tell you what they believe the answer is, but they are incapable of telling you why), all while giving it a veneer of objectivity.

This cluelessness is hardly new. What follows is a brief account of the first US Big Data revolution. Just like Messieurs Frey and Osborne, many of the actors believed that, by dint of mechanization, they were producing objective truth, and just like most developers in Silicon Valley, the principal players were convinced that the problems they tackled were purely technical.

This is the story of how the US Census Bureau used big data to locate and round up 110,000 Japanese-Americans in 1942.


 "The Electrical Enumerating Machine"

Computing Before Computers, Aspray, 1990, p. 131. 

The Constitution of the United States mandates a nationwide census every ten years. What the Founding Fathers had in mind was a tally of all "free Persons" and three-fifths of the slaves so as to be able to apportion taxes and congressional seats. The census was born out of a desire to carry out a white supremacist agenda under the guise of scientific objectivity (as in the three-fifths ratio), and it soon became clear that this agenda could be furthered by adding other, purportedly objective, demographic categories to the questionnaires, or, in Paul Schor's words:
"The study of statistics leads almost naturally to the study of the processes by which elites objectify other classes of the population ... The US census fits well in this process of constituting groups of individuals as social problems--especially from 1840 on, when it aimed to find answers to the big political questions about the population, such as slavery and the harmfulness of freedom for blacks, the inassimilability of new immigrants and the "racial suicide" of Anglo-Saxons, racial mixing, and the degeneracy of blacks. This is revealed in the multiplication of racial categories to distinguish groups that sometimes were numerically insignificant; thus the 2,039 Japanese enumerated in 1890 contrasts with the treatment of the white race, which was never defined during the entirety of the period under study." -- Schor, Paul, "Counting Americans: How the US Census Classified the Nation", 2017, p. 3.
With the added groups and categories, and with the population boom (from 2.5 million in 1777 to 63 million in 1890), tallying the data and crunching the numbers by hand became a problem. Not only was it time-consuming, it was almost impossible to do what we today would call a custom search. If we were interested in, for example, the number of illiterate men over the age of 50 nationwide, it might be possible to find that number through extensive cross-referencing, but it would be a very laborious task. The 1880 census had taken eight years to finalize, and with new data categories added for each census, there was a risk that the 1890 census would not be finished within the ten-year window.


Tabulating multiple categories by hand was a Sisyphean task. 

Seventy-five years earlier, a maverick British mathematician, Charles Babbage, had floated the idea of using steam to compute mathematical tables. His ideas were ahead of their time (his later design, the Analytical Engine, was the first proof of concept for an all-purpose computer as we understand the term today, with a CPU, a memory bank, and a printer). The Analytical Engine was to receive its input from punch cards. Lack of funding (and the fact that brilliant conceptual ideas appealed to him much more than humdrum but commercially viable designs) meant that he had to abandon his projects.

In 1888, however, with the Census Bureau struggling to keep up, there was suddenly a market for tabulation machines. Realizing that part of the work could be done by mechanical means, the Bureau announced a competition: whoever could tabulate sample data the fastest with the help of machinery would land a lucrative contract for the 1890 census. The winner, an ex-bureau employee named Herman Hollerith, bested his opponents by solving the task almost ten times faster. How? Just like Babbage before him, he based his design on punch cards.





 Hollerith punch card used in the 1890 U.S. census.

Using a special hole punch, a clerk would translate all the demographic data on each census record into discrete data points on a card. The first four columns, for example, signified state, county, and census district. Another column signified race. In his patent application, Hollerith explains that it is not very difficult to make a tally of individual categories, such as the number of men or women in the U.S. This can be done manually. But things get dicier if you want to run a tally with different variables (Hollerith's examples are telling in light of the Schor quote above):
"it is required to know the number of native whites, or of native white males of given ages, or groups of ages, &c., as in what is technically known as the age and sex tally; ... The labor and expense of such tallies, especially when counting combinations of items made by the usual methods, are very great." -- U.S. Patent 395,782
So, he constructed a machine -- a tabulator -- that could read the punch cards, and could be set through electrical relays to count combinations of categories. The total tallies were then displayed on dials.
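(A minimal sketch, for the programmatically inclined, of what the tabulator did with relays and dials: reduce each person to a row of categorical codes, then count whatever combination of categories you care about. The field names and codes below are my own stand-ins, not the Bureau's 1890 coding scheme.)

```python
from collections import Counter

# Each punch card reduces a census record to discrete, categorical fields.
# (Field names and codes are illustrative, not the Bureau's 1890 scheme.)
cards = [
    {"state": "MA", "sex": "M", "race": "W", "nativity": "native", "age_group": "40-49"},
    {"state": "MA", "sex": "F", "race": "W", "nativity": "native", "age_group": "20-29"},
    {"state": "NY", "sex": "M", "race": "W", "nativity": "foreign", "age_group": "40-49"},
    {"state": "NY", "sex": "M", "race": "W", "nativity": "native", "age_group": "40-49"},
]

def tally(cards, fields):
    """Count combinations of the chosen category fields -- the job the
    tabulator's relays did, with the running totals shown on dials."""
    return Counter(tuple(card[f] for f in fields) for card in cards)

# Hollerith's "age and sex tally," restricted to native whites:
native_whites = [c for c in cards if c["nativity"] == "native" and c["race"] == "W"]
print(tally(native_whites, ("sex", "age_group")))
# Counter({('M', '40-49'): 2, ('F', '20-29'): 1})
```

The point is not the dozen lines of Python but the data model: once a person has been reduced to a row of codes, any cross-classification becomes a cheap query -- a fact that will matter later in this story.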


Hollerith tabulator dials. The 1890 machine featured 40 of these; 
thanks to the double dials, each one could display a number from 0 to 10,000.


Machines based on Hollerith's original design were used by the agency until 1950, when they were partly replaced by computers. Reflecting on the Hollerith revolution in 1965, bureau director Ross Eckler notes how:
"The Superintendent of the Census of 1890 could rightly take pride in the gains that were accomplished through the use of the new equipment which initiated the use of punch cards in extensive statistical tabulations, though perhaps he did not realize the outstanding importance of the innovation which first reduced the data on the census schedule to a form which could be classified and counted by purely mechanical devices." -- Truesdell, p. 1.

"Purely mechanical" does more, linguistically, than explain that the human had been removed from the equation; the phrase also suggests that the classifications -- and the tallies computed by the machine -- had objective validity.

The quote is from the preface of a book by Leon E. Truesdell, who served as the bureau's chief demographer until 1955. Truesdell will soon make a more full-fledged appearance, but what is striking is that his account -- preoccupied with the problem of developing machinery to tally an ever-expanding array of demographic categories -- never once reflects on the categories themselves or what they refer to. His notion of progress is technological, as in the passage where he reflects on the rapid digitization of the agency between 1950 and 1965:
"[I]n a few years, the electronic computer, with its supporting devices for assembling census data, had made far more progress than the punch card had made in 60 years. For the contribution of [the] FOSDIC [computer] is in addition to the fantastic increase in operational speed of the computer and the tremendous increase in the possibilities for complex cross-classification, checking for consistency, inflation from sample, and even adjustment of variable data." -- Truesdell, p. 208.

 Internment by Big Data

  "Jp" in Column 14 denoted Japanese ancestry. The IBM machines were able to quickly identify all punch cards for Japanese-Americans on the West Coast; these were then de-anonymized and tied to the individual records with names and addresses.

After the bombing of Pearl Harbor, the Census Bureau -- with state-of-the-art punch card technology and census data on everyone in the US -- was inundated with requests from the military, and it was more than happy to "help with the war effort." The following quote is from Margo Anderson's social history, The American Census:
"In January 1942 [the Census Bureau] acknowledged they were getting many requests from the military and surveillance agencies for information on Germans, Italians, and especially Japanese. Leon Truesdell, chief population statistician, said, 'We got a request yesterday, for example, from one of the Navy offices in Los Angeles, wanting figures in more or less geographic detail for the Japanese residents in Los Angeles, and we are getting that out.' Assistant Director Virgil Reed followed up, noting that, for requests for data on the Japanese, Germans, and Italians, 'some of them wanted them by much finer divisions than States and cities; some of them wanted, I believe several of them, them by census tract even.' Truesdell agreed: 'That Los Angeles request I just referred to asked for census tracts.' [Director James Clyde] Capt was pleased with these new efforts, bragging, 'We think it is pretty valuable. Those who got it thought they were pretty valuable. That is, if they knew there were 801 Japs in a community and only found 800 of them, then they have something to check up on ... We're by law required to keep confidential information by individuals ... But in the end, if the defense authorities found 200 Japs missing and they wanted the names of the Japs in that area, I would give them further means of checking individuals.'"
-- The American Census: A Social History, Second Edition, Anderson, 1990, pp. 194-195.
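Strip away the hardware, and the "technical expertise" on offer amounted to two trivially small operations. Here is a hypothetical sketch; the field names are illustrative, and only the "Jp" code comes from the punch card caption above.

```python
from collections import Counter

# Illustrative records; only the "Jp" race code is taken from the caption above.
cards = [
    {"race_code": "Jp", "state": "CA", "city": "Los Angeles", "tract": "LA-0042"},
    {"race_code": "Jp", "state": "CA", "city": "Los Angeles", "tract": "LA-0042"},
    {"race_code": "W",  "state": "CA", "city": "Los Angeles", "tract": "LA-0042"},
    {"race_code": "Jp", "state": "WA", "city": "Seattle",     "tract": "SE-0007"},
]

# Step 1: the aggregate, tract-level counts the Bureau could legally hand over.
per_tract = Counter(c["tract"] for c in cards if c["race_code"] == "Jp")
print(per_tract)  # Counter({'LA-0042': 2, 'SE-0007': 1})

# Step 2: selecting the individual cards in a given tract is the same
# one-line filter; tying them back to names and addresses is a lookup on
# the original schedules. Technically trivial -- only the confidentiality
# provision stood in the way, until March 1942.
flagged = [c for c in cards if c["race_code"] == "Jp" and c["tract"] == "LA-0042"]
```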

For our friend Truesdell, the request for the whereabouts of Japanese-Americans in Los Angeles was a technical challenge, and one he relished. Unfortunately, as director Capt noted (while at the same time expressing his willingness to flout the law), the agency was not allowed to provide information on individuals. Together with the commerce secretary, Capt lobbied Congress. He wrote an amendment to an omnibus war powers bill that was adopted and passed into law in March 1942.

The New York Times reported how the amendment would remove the confidentiality protection of census records and allow for data sharing between agencies: "[Such] data, now a secret under law, government officials believe, would be of material aid in mopping up those who had eluded the general evacuation orders." (Quoted from Anderson, p. 196)

Capt's aggressive lobbying, Anderson notes, led to "the provision of technical expertise and small-area tabulations to the army for the roundup, evacuation, and incarceration of the Japanese-ancestry population -- more than 110,000 men, women, and children -- from the West Coast of the United States." (Anderson, p. 196)




Computing Before Computers, Aspray, 1990, p. 143.


PS. I started looking into the story of how big data was used to round up Japanese-Americans after watching this 1981 ABC Nightline interview with Steve Jobs and David Burnham. It is well worth watching.

Saturday, October 22, 2016

Remembering Charles Martell: The Case of the Incredible Shrinking Captain


"it seems hardly credible that the loss of bodies so tiny as the parathyroids should be followed by a result so disastrous." -- William Stewart Halsted (1852-1922)

If you were an M.D. practicing in Boston or New York in the '20s and '30s -- especially if you had an interest in metabolic disease and kept up to date with your field -- chances are that you would have been familiar with (astounded by, perplexed by, what have you) the case of Charles Martell: The Incredible Shrinking Captain. Today, his story is little more than a footnote in the annals of endocrine surgery -- someone to name-check if you, say, publish a paper on your experience in removing out-of-place parathyroid glands.

This is a pity. The story of the Merchant Navy captain, who shrank seven inches as his bones literally turned to dust, deserves far wider recognition. First, it has all the trappings of a Shakespearean tragedy: as the medical establishment failed to make the final breakthrough, it was the captain himself who figured it out, but by the time he cracked his own case (by telling his surgeons to crack his breastbone open) it was too late. Second, Martell's story touches on something that is hotly debated today: patient engagement. Should we allow the patient to become a co-investigator, and grant her full and direct access to her electronic health record with test results, radiology reports, and doctors' notes?

With the passing of the Health Insurance Portability and Accountability Act (HIPAA for short) in 1996, patients were given the right to access (though, crucially, not directly) their health records. The rationale, as the US Department of Health & Human Services explains, is simple: "[P]roviding individuals with easy access to their health information empowers them to be more in control of decisions regarding their health and well-being."

HIPAA was written with old-fashioned paper charts in mind -- charts that needed to be retrieved from archives, photocopied, and mailed by clerical staff. Because of this, clinics and hospitals are allowed a processing period of 30 days. Today, this provision should be moot. Electronic health records (EHRs) are ubiquitous, and with the push toward interoperability, that is, EHRs from different organizations being able to communicate with each other, the patient's complete chart could be made available with the click of a mouse. There are no technical or legal challenges.

And yet, there is push-back from some health care professionals, who worry about delays and disruptions to their workflows if they give patients direct access to their notes. Instead of cutting to the chase and documenting their findings and assessments point-blank, they fear they will have to spend time worrying about their tone of voice and about not offending the patient.

There is also a media narrative that plays into this; I am thinking of the various permutations of The Dangers of Using Google to Self-Diagnose articles. But for everyone who's convinced that the MMR jab gave their kid autism, there might be a patient who does her research not on fringe websites and support groups but in peer-reviewed publications -- someone who pores over her medical chart and picks up on something the doctor might have missed. Even if the former cohort outnumbers the latter 10:1, we still need a counter-narrative, which is why I am enlisting the help of my friend Captain Martell.

His story can be read today as an appeal to open access, something that only 16.7 million patients worldwide currently enjoy, according to data from the OpenNotes project.

Not only did Martell "go online" (avant la lettre, so to speak; he spent his final months in the Harvard Medical Library), he was eventually proven right and, in the process, helped advance our knowledge of the parathyroid glands. So, take it away, Captain!

From Hunk to Hunchback



Born in Somerville, Massachusetts, Charles Martell entered the Massachusetts Nautical School in 1914 and graduated two years later at the top of his class. He saw action in the First World War as navigating officer on the U. S. Army Transport Shoshone (originally a German vessel, she was apparently seized in 1917; c'est la guerre!) After the Armistice, Martell served in the Merchant Marine. He had suffered from back pain for a few years, but this was hardly unusual for a sailor at a time when occupational safety and ergonomics were not top concerns. But by 1919 other worrying symptoms kicked in. "[H]is fellow officers noticed that he was growing shorter and becoming pigeon-breasted," as Dr. Eugene DuBois put it in his 1929 case report.

As Martell lost his stature, his back pain soon spread to his legs. While climbing a ladder, his knee gave way and he fractured something. When the ship came to New York a month later, he was taken to the U. S. Marine Hospital on Staten Island. "X-rays were taken. Little was accomplished." Not many weeks went by before his other knee collapsed. "One certainly gets the impression that something was fundamentally wrong with the structure of the Captains'[sic] under-pinnings," Fuller Albright, who would go on to treat him at Massachusetts General Hospital, chuckled in 1948. But perhaps not a laughing matter as far as Captain Martell was concerned.

He was taken to the Methodist Episcopal Hospital in New York where they patched his knee up. In the next few years, as he was crisscrossing the world, Martell fractured several bones. He was diagnosed with "arthritis of the mixed perisynovial and hypertrophic type," which, at the time, was physician longhand for "we have no f*cking clue." But what the doctors did realize was that Martell's disease, whatever it was, did not spare a single bone in his body. "Roentgenograms ... showed osteomalacia [softening of the bones] involving the whole skeletal system."

In 1923, "he stumbled, fell against a chair, and broke both bones of the left forearm." Back to the Marine Hospital on Staten Island he went. While in the hospital, he managed to break two more bones: his left radius and his right humerus. He was at first treated with "diets high in calcium." This seemed to make sense. Martell's urine contained much more calcium than he was ingesting. He was, in other words, losing calcium from his bones. Putting him on a dairy diet rich in calcium might, so the doctors reasoned, correct this metabolic imbalance so that the food rather than the bones would supply the excess calcium that his body, for some reason, needed to metabolize.

When this did not improve matters, the doctors took a shotgun approach and subjected him to the full force of 1920s cutting edge medicine: "cod liver oil, calcium and phosphorus medications, thyroid extracts, epinephrin, heliotherapy, quartz lamp treatment, and irradiated milk, without noticeable improvement." This reads like Allen Ginsberg's riff on 1950s psychiatry ("insulin Metrazol electricity hydrotherapy psychotherapy occupational therapy pingpong & amnesia") and the effect was the same. Nil.

In January 1926, Martell was admitted to Bellevue Hospital in New York, under the care of Dr. Eugene DuBois. Martell's bones were crumbling and turning to cysts. Something was horribly wrong with the way his body processed calcium, but what? DuBois subjected his patient to a meticulous metabolic study. Martell was given food with varying amounts of calcium while the good doctor (or the good but, sadly, unsung nurse) measured the amount of calcium in his urine and feces. The fact remained: Martell lost more calcium than he ingested, and increasing his dietary intake did not improve matters.

Para...what?
First depiction of human parathyroid glands. Sandström, Ivar, "Om en ny körtel hos menniskan och åtskilliga däggdjur." Uppsala Läkareförenings Förhandlingar, Band XV:7-8 (1879-1880)

Deciding to up the ante, DuBois prescribed his patient parathyroid extract. The role of these diminutive glands, which had only been discovered some 40 years earlier, was still not entirely established, though they seemed to have something to do with calcium and bone metabolism. DuBois had come across case reports from Germany that described a disease with severe bone deformities and cysts -- osteitis fibrosa cystica -- that seemed to fit the bill. Patients dying from this excruciating ailment sometimes had an enlarged parathyroid gland. Conventional wisdom at the time had it that this was some kind of compensatory effect. The parathyroid gland protected against these bone changes by hypertrophy -- that is, by becoming larger and (it was assumed) secreting something into the bloodstream.

DuBois tested this hypothesis. "Three hundred and eighty units of parathormone were given between February 16 and March 2 with a maximum daily dosage of 40 units," but unfortunately the patient "complained of definite increase of pain all over his body, especially on motion. There was in addition, a slight elevation in the blood calcium. The medication was then stopped."
 
In 1915, a German physician had pointed out that it was odd that most patients with osteitis fibrosa cystica had only one enlarged gland (out of four). If the effect was compensatory, how come all four glands were not equally enlarged? Perhaps -- banish the thought -- this tiny tumor actually caused the bone disease by interfering with the way the body regulated calcium? In the mid-'20s this was still a very radical idea, but in the end DuBois got it:
"Our patient then presents a picture which agrees in its essentials with that produced by the excessive administration of parathyroid extract and opposite to that found in hypoparathyroidism. These considerations and the finding of parathyroid tumors in patients with osteomalacia and similar bone disturbances (18) led us to the conclusion that the underlying basis for the osteitis fibrosa cystica in our subject was a hyperactivity of the parathyroid bodies."
Martell was again referred to Massachusetts General -- this time for the surgeons to "consider the advisability of removing one or more of his parathyroid glands."

Bring out the Scalpel
In Boston, Martell went under the knife, courtesy of Dr. E. P. Richardson, who made a cervical incision and went looking for a rogue gland. He found a "small vascular area 6 mm in diameter, deeper red than the remainder of the thyroid lobe" and removed it. Histological examination revealed that it was indeed a parathyroid gland. Sadly, this made little difference to Martell's clinical course. Not to be deterred, Richardson went at it again. He found a nodule on the other side of the neck. The pathologist once again confirmed that it was a parathyroid gland, but once again, it made little difference.

 Chart showing little difference in blood calcium after removal of two parathyroid glands. 
"A Case of Osteitis Fibrosa Cystica (Osteomalacia?) with Evidence of Hyperactivity of the Parathyroid Bodies. Metabolic Study II," Bauer, Albright and Aub, 1929.
 
Although stumped, the doctors noticed a slight improvement -- both clinically and physiologically -- and Martell was promptly discharged. He was able to hold down a job as a maritime insurance clerk for a few years, but in 1932 things took a turn for the worse. His kidneys were failing and he was once again admitted to Massachusetts General Hospital. Dr. Patterson and a colleague, Dr. Oliver Cope, operated four more times, but did not find any enlarged glands. This is where things get interesting. Cope would later recall how Martell:
"took a scientific interest in his own case and became an investigator as well as an investigatee; he was often found in his room poring over an anatomy text, he demanded that the surgical search should be continued until it succeeded, even when the next step was a sternotomy"
At Martell's insistence, the surgeons cracked open his breastbone and found the culprit: a 30 mm encapsulated parathyroid tumor (adenoma) in his mediastinum. But the disease had taken its toll on the former sea captain, and Charles Martell died six weeks after the operation, following an emergency attempt to remove a kidney stone (the result of his long-standing parathyroid disease, which caused calcium to be excreted through his kidneys) that was stuck in his ureter.

What Martell realized -- albeit too late -- by poring over anatomy texts was that parathyroid glands are not always found adjacent to the thyroid. As a result of how the gland develops and migrates (or fails to migrate) in the fetus, it can end up high in the neck or in the area in the chest known as the mediastinum. How did Martell come to suspect this? Did he find an obscure reference to a mediastinal gland in the medical literature, or did he form the hypothesis himself by synthesizing what he had discovered by studying embryology and anatomy? Cope's description is elusive. On the one hand, it suggests someone who is barely literate (Martell was of course highly intelligent) and is clutching his book, like the pious peasant his bible. But it also conjures up the image of the lone genius: an Einstein waiting for a Millikan to confirm empirically what he had intuited.

If there is a moral to this Most Lamentable Tragedie, it would be: so much for the dangers of self-diagnosis.

Sources:

Albright, "A Page out of the History of Hyperparathyroidism," J Clin Endocrinol Metab. 1948;8(8):637-657.

Bauer, Albright, Aub, "A Case of Osteitis Fibrosa Cystica (Osteomalacia?) with Evidence of Hyperactivity of the Parathyroid Bodies. Metabolic Study II," J Clin Invest. 1930;8(2):229-248.

Cope, "The Story of Hyperparathyroidism at the Massachusetts General Hospital," N Engl J Med. 1966;274:1174-1182.

Hannon, Shorr, McClellan, DuBois, "A Case of Osteitis Fibrosa Cystica (Osteomalacia?) with Evidence of Hyperactivity of the Parathyroid Bodies. Metabolic Study I," J Clin Invest. 1930;8(2):215-227.

McClellan, Hannon, "A Case of Osteitis Fibrosa Cystica (Osteomalacia?) with Evidence of Hyperactivity of the Parathyroid Bodies. Metabolic Study III," J Clin Invest. 1930;8(2):249-258.

Sandström, "Om en ny körtel hos menniskan och åtskilliga däggdjur" ["On a new gland in man and several mammals"], Uppsala Läkareförenings Förhandlingar, Band XV:7-8 (1879-1880).



Sunday, July 6, 2014

Welcome to the World of Technochicracy! Revisiting the FB Debacle


"I have a joly wo, a lusty sorwe" - Chaucer, Troilus and Criseyde

A recent research paper published in PNAS, written by Adam D. I. Kramer - a member of Facebook's Core Data Team - and two information scientists from Cornell, made the news and caused some outcry last week. The authors admitted to having manipulated the news feeds of some 700,000 Facebook users back in January 2012. Two experiments were conducted in parallel: in one, posts with "positive emotional content" had a lower chance of cropping up in a user's feed; in the other, many posts with "negative emotional content" were omitted from the feed.

The theory goes something like this: it is well attested that emotional contagion occurs in face-to-face situations. Without knowing it, we scan other people's faces for clues on their emotional status. Once we've gauged that, mirror neurons in our brain fire away, and we adapt our own facial expressions accordingly. A sort of unconscious mimicry. Negativity breeds negativity, and happiness makes the world go round. Previous studies have focused on nonverbal cues, but what about verbal ones? What about situations where the people are miles apart? If a friend posts a negative status update on Facebook, will I catch the negativity bug and post a negative one myself? Or rather, given a large enough sample ("N=689,003" in this case), is there a statistically significant correlation between the emotions of group members exposed to positive and negative posts?

And the answer, sez Kramer et al., is yes! The connection is tenuous, like "gold to airy thinness beat" (incidentally not a phrase used in the paper), but it is there; about one in a thousand posts exhibited "emotional contagion." Given the scale of Facebook, however, "this would have corresponded to hundreds of thousands of emotion expressions in status updates per day."

Relying on a vague "research" clause in the Facebook User Policy, the authors conducted an exercise in manipulation on hundreds of thousands of users. This makes a mockery of the idea of informed consent - a mockery more egregious than the false pretenses used by Stanley Milgram in his harrowing 1961 experiment on authority and obedience. While Milgram's subjects were in the dark about the real purpose of the experiments, at least they knew that they were taking part in one. We are veering dangerously close to mind-control land here. In fact, in one of the first books written on the psychology of brainwashing, William Sargant explains how:
"Various beliefs can be implanted in many people after brain function has been sufficiently disturbed by accidentally or deliberately induced fear, anger, or excitement. Of the results caused by such disturbances, the most common one is temporarily impaired judgement and heightened suggestibility. Its various group manifestations are sometimes classed under the heading of 'herd instinct'" (William Sargant, Battle for the Mind: A Physiology of Conversion and Brain-washing, 1957)
Now, Dr. Sargant was a behaviorist (white coat, stopwatch in hand), and the point is not that his analysis is especially lucid (it isn't). No, what is eerie here is the similarity between his Pavlovian notions of human behavior and the underpinnings of the Facebook study: deliberately induced feelings? Heightened suggestibility? Herd instinct? Do we hear a bell ringing in the distance?

Here is what the study authors have to say about their methodology:
"Posts were determined to be positive or negative if they contained at least one positive or negative word, as defined by Linguistic Inquiry and Word Count software (LIWC2007) (9) word counting system, which correlates with self-report and physiological measures of well-being, and has been used in prior research on emotional expression (7, 8, 10). LIWC was adapted to run on [...] the News Feed filtering system, such that no text was seen by the researchers. As such, it was consistent with Facebook's Data Use Policy, to which all users agree prior to creating an account on Facebook, constituting informed consent for this research."
In the wake of this debacle, most people have focused on the second part - the dilution and slippery slopification of "informed consent"; there are excellent pieces on the responsibility of the social scientist (and less-than-stellar apologies), but what really rubs me the wrong way has more to do with the first part, and what I have previously called technochicracy. It is the lingering suspicion that the stunning "big data" vista (complete with the cloud services floating overhead) is a set-piece propped up - like a Potemkin village in the midst of Silicon Valley - in front of a 1950s or 1960s landscape of dumb terminals and behaviorist labs.

This experiment was framed as a groundbreaking study in emotional contagion; thanks to big data crunching and state-of-the-art software, an effect only previously observed in face-to-face interaction in an artificial milieu could now be studied on a massive scale with humans, so to speak, in their natural habitat. And yet, the underlying methodology is so hopelessly crude as to bring to mind Pavlov's experiments on conditioned reflexes in dogs. In the words of his American disciple B. F. Skinner, "Pavlov's attention was directed mostly to the glandular part of this total response, because it could be measured by measuring the flow of saliva." A stimulus (a bell ringing before food is served, an upbeat status message) triggers a physiological response that can be adequately measured in a beaker or by said Linguistic Inquiry and Word Count Software. Conceptually speaking, there is little difference between measuring the flow of saliva and counting positive/negative words in the flow of big data.
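To make the crudeness concrete, here is a toy version of that classification step - not the actual LIWC2007 dictionaries or Facebook's pipeline, just a sketch of the single-word test the methods section describes.

```python
# Toy word lists -- stand-ins for the LIWC2007 dictionaries, which run to
# thousands of entries; the logic, however, is the same single-word test.
POSITIVE = {"happy", "great", "love", "wonderful"}
NEGATIVE = {"sad", "awful", "hate", "miserable"}

def classify(post: str) -> set:
    """A post counts as 'positive' ('negative') if it contains at least one
    positive (negative) word; a post can be both, or neither."""
    words = set(post.lower().replace(".", " ").replace(",", " ").split())
    labels = set()
    if words & POSITIVE:
        labels.add("positive")
    if words & NEGATIVE:
        labels.add("negative")
    return labels

print(classify("so happy for you"))             # {'positive'}
print(classify("not happy, not happy at all"))  # also {'positive'}: negation is invisible
print(classify('she wrote "I hate Mondays"'))   # {'negative'}: quotation is invisible
```

Everything Byron and Proust are about to object to below - negation, quotation, irony, alloyed feeling - is invisible to that set intersection.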

In fact, the idea of trigger words in status messages correlating with emotional well-being, at least on the aggregate level, owes more to 19th century positivism than any "cutting edge" science. It makes a mockery of the human condition. Who better to knock down 19th century underpinnings than a 19th century poet? When accused of mixing gravity and levity in Don Juan, Lord Byron sent the following letter to his publisher to answer the critic:
"His metaphor is, that 'we are never scorched and drenched at the same time'. Blessings on his experience! Ask him these questions about 'scorching and drenching'. Did he never play at Cricket, or walk a mile in hot weather? Did he never spill a dish of tea over his testicles in handing the cup to his charmer, to the great shame of his nankeen breeches? [...] Did he never tumble into a river or lake, fishing, and sit in his wet cloathes in the boat, or on the bank, afterwards 'scorched and drenched', like a true sportsman? 'Oh for breath to utter!' - but make him my compliments; he is a clever fellow for all that - a very clever fellow." (Byron’s Letters and Journals VI:207)
If we think of Don Juan as a status update, the reason it is so difficult to parse is (and Byron makes this abundantly clear) that human emotion cannot be reduced to a discrete number on a happiness scale. Even in the extreme, seemingly most clear-cut cases, our sadness or happiness is seldom unalloyed:

"Every cloud has a silver lining..." 

"Ay, in the very temple of Delight / Veil'd Melancholy has her sovran shrine." 

Etc.

But what about the statistically significant (albeit minuscule) effect Kramer et al. observed? There are, in fact, many ways of accounting for it that have nothing to do with emotion whatsoever. I might, for example, tag a friend and quote what s/he says in my status update (the system is far too crude to take quotation into account). There is also the possibility that a few stray words I have recently read linger in my mind, winding their way into my next status message.

On a more general level, Chomsky's critique of Skinner's language behaviorism bears repeating. One of his points is that the causality between stimulus and response can only be "studied" by multiplying the categories or properties of the stimulus-object until the notion loses any and all pretense of objectivity:
"Consider first Skinner's use of the notions stimulus and response. In Behavior of Organisms he commits himself to the narrow definitions for these terms. [...] Evidently, stimuli and responses, so defined, have not been shown to figure very widely in ordinary human behavior. We can, in the face of presently available evidence, continue to maintain the lawfulness of the relation between stimulus and response only by depriving them of their objective character. A typical example of stimulus control for Skinner would be the response to a piece of music with the utterance Mozart or to a painting with the response Dutch. These responses are asserted to be "under the control of extremely subtle properties" of the physical object or event. Suppose instead of saying Dutch we had said Clashes with the wallpaper, I thought you liked abstract work, Never saw it before, Tilted, Hanging too low, Beautiful, Hideous, Remember our camping trip last summer? or whatever else might come into our minds when looking at a picture (in Skinnerian translation, whatever other responses exist in sufficient strength). Skinner could only say that each of these responses is under the control of some other stimulus property of the physical object. If we look at a red chair and say red, the response is under the control of the stimulus redness; if we say chair, it is under the control of the collection of properties (for Skinner, the object) chairness, and similarly for any other response. This device is as simple as it is empty." (Noam Chomsky, "A Review of B. F. Skinner's Verbal Behavior", 1967)
An even better example (pace Chomsky) of the "suppose instead" refutation would be Proust's musings on the "petite phrase" he once heard in a sonata:
"When, after that first evening at the Verdurins', he had had the little phrase played over to him again, and had sought to disentangle from his confused impressions how it was that, like a perfume or a caress, it swept over and enveloped him, he had observed that it was to the closeness of the intervals between the five notes which composed it and to the constant repetition of two of them that was due that impression of a frigid and withdrawn sweetness; but in reality he knew that he was basing this conclusion not upon the phrase itself, but merely upon certain equivalents, substituted (for his mind's convenience) for the mysterious entity of which he had become aware"
In much the same way as the sensuous effects and enveloping power of the musical phrase are not part and parcel of its "stimulus properties," the emotion behind a status update containing the word "happy" (or a synonym) is not a property intrinsic to the word "happy." Proust's tentative explanation involving "certain equivalents" is a marvelous debunking of the causality assumption. For all its duh!-simplicity, the "lingering" hypothesis (mine and Proust's) is far too complex to be accounted for by Skinnerian behaviorism (whether of the 1950s stamp or dressed up in technobabble). Emotional contagion does exist, but it does so in a complex interplay of facial expression, mimicry and thought - both conscious and unconscious. It cannot be reduced to a stimulus-response scheme.

In fact, when we peek through the cool Matrix-curtain of big data, we find a faded Polaroid from the past - Brylcreemed men in white coats studying lab rats or dogs cowering in cages - men whose theories and methodologies had already been debunked by the time horn-rimmed glasses went out of fashion... Welcome to the world of Technochicracy!

Tuesday, May 6, 2014

Multiple Choice and the Critic



Recently I wrote a rather heated polemic against standardized testing, but as Messrs. Emerson and Wilde so aptly put it: consistency is "the hobgoblin of little minds" and "the last refuge of the unimaginative." So today I will do a graceful volte-face and acclaim its virtues (well, in one specific context) while sporting a sheepish little smile.

Some weeks ago I took the GRE Literature Test, required by many graduate programs in English. I am already a PhD student, so there was no pressing need for me to do so. But what with English not being my first language, I wanted to see how I would stack up against the motley pool of test takers (college seniors majoring in English, students fresh out of MA programs in English, and "mature" students who graduated years or decades ago and want to attend grad school). Plus, a good result might be a feather in my beret should I ever feel like applying to adjunct positions in English.

The test consists of around 230 multiple choice questions on the analysis and identification of texts ranging from Beowulf to Elizabeth Bishop, Gower to Ginsberg. "Multiple choice? Literature?" I hear you say. Wouldn't that be a throwback to a prelapsarian time before meanings had started to multiply by mitosis? A time when Oxford was still a city of aquatint and dons in caps and gowns were the guardians of truth? A time, in short, when an exam on Edmund Spenser could look like this:
"In whose reign did he flourish? Repeat Thomson's lines. What is said of his parentage? What does Gibbon say? How did he enter Cambridge? What is a 'sizer,' and why so called? What work did he first publish? What does Campbell say of Raleigh's visit to Spenser?" (A Compendium of English Literature, Charles D. Cleveland, 1852)
Or perhaps (given that you need to be able to recognize an iambic tetrameter or ottava rima) we think of the diluted versions of New Criticism that seeped down to high school students in the '50s, brilliantly captured by the fictitious Understanding Poetry by the equally fictitious J. Evans Pritchard, PhD:
"To fully understand poetry, we must first be fluent with its meter, rhyme, and figures of speech. Then ask two questions: One, how artfully has the objective of the poem been rendered, and two, how important is that objective. Question one rates the poem's perfection, question two rates its importance. And once these questions have been answered, determining a poem's greatness becomes a relatively simple matter."
Ah, 'twas the best of times... A time when the canonical texts were treated with kid gloves and the pages of the textbooks were still firmly glued together. But today, when every single literary edifice has been subjected to the wrecking ball of Derrida and Sons, Deconstruction Company? Sure, you could test the trivial (as in the Victorian exam), but you're only a click away from those facts on your cell phone, so why bother with them in the first place? And while you might be able to scratch the surface with multiple choice, surely you'll never reach the murky, rhizomatic depths of literature?

Before I answer my own rhetorical questions, I need to throw in a caveat. The GRE Literature test is rather silly (but surprisingly fun to take); its "predictive value" is questionable. You can score very high and still be a mediocre critic. But then again, no English Department in the US will ever judge an application solely on the GRE score. In fact, GPAs, writing samples and published articles are infinitely more important. And this is how it should be. I do, however, think that the test says something. It's an indisputable fact that, say, Althusser's notion of interpellation is a form of "coercive address," that Thomas of Hales' "Hwer is Paris and Heleyne" exemplifies the "ubi sunt motif" and that the choice between my and mine in Shakespeare's Sonnet 23 rests on "the same rationale as the Modern English choice between a and an."

Factual recall, rudimentary close reading skills and linguistic inference, all pretty elementary skills, right? What about those pesky rhizomes? Well, consider this item (taken from the Practice Test Booklet):
So what with blod and what with teres
Out of hire yhe and of hir mouth.
He made hire faire face uncouth;
Sche lay swounende unto the deth,
Ther was unethes eny breth:
Bot yit when he hire tunge refte,
A litel part therof belefte,
Bot sche with al no word mai soune,
But chitre and as a brid jargoune.

Which of the following lines make use of the same story?

(a)
Twit twit twit
Jug jug jug jug jug jug
So rudely forc'd
Tereu

(b)
Tu – whit! – Tu – Whoo!
And hark, again! The crowing cock,
How drowsily it crew.

(c)
I do not know which to prefer,
The beauty of inflections
Or the beauty of innuendoes

(d)
A sudden blow: the great wings beating still
Above the staggering girl...

(e)
This I sat engaged in guessing, but no
Syllable expressing
To the fowl whose fiery eyes now burned
Into my bosom's core
In order to make the connection between John Gower's and T. S. Eliot's uses of the Philomela myth, you need to be conversant with English literature from vastly different periods. Factual recall might help you identify the different snippets, but it will only take you so far. In fact, only 19% of all students tested got this one right (i.e., fewer than the 20% you would expect from random guessing). This is hardly surprising; students take classes in Modernist Poetry and (though not very likely) in Middle English Poetry, but they are not taught to make thematic connections. And who can blame them when even tenured professors are bewitched by the siren-song of Foucault, with its call for "absolute discontinuity" between the modern and the pre-modern (someone who did have a good supply of organic beeswax was J. G. Merquior, whose takedown ought to be required reading for professors and undergraduates alike).

In fact, this item calls for a modicum of that quaint and curiously old-fashioned thing called erudition. And this is part of what I find appealing about the test. Everyone in grad school is supposed to have the tools necessary to delve deep into their chosen area, but in order to retrace the winding paths that lead from text to text, you also need a broad survey map. When novelists and poets prove to be better read than the academics dealing with them, something is clearly awry. Case in point: postmodern scholars (of some renown, I might add) writing on Chinua Achebe or Yvonne Vera who are blind to their grapples with and responses to T. S. Eliot, simply because they have never read him. As that much-maligned poet himself put it: "the most individual parts of [a poet's] work may be those in which the dead poets, his ancestors, assert their immortality most vigorously." (from "Tradition and the Individual Talent")

To recap, I would say that the wide-ranging reading required to answer questions spanning a period of 1,000 years is a good foundation for more focused work in grad school. Even when dealing with something highly specific, you still need to make connections and discern influences straddling the epochal divides.

You are also tested on the King James Bible and some Greco-Roman literature and mythology, and let's face it: every Western author writing before, say, 1950 writes for a reader well versed in the Bible. How can we feel for Leopold Bloom, and understand what an underdog and outsider he is, if we are as ignorant of the meaning of "I.N.R.I." as he is? Joyce takes for granted that we can appreciate both the ignorance and the sheer beauty of his folk-etymological stroke of genius: "Iron Nails Ran In."

And to take another example: for all his expertise in military and fortification history, Sterne's Uncle Toby has no idea who Cicero was. If we share his ignorance, he is no longer the exceptional oddball the contemporary audience loved. Such knowledge breeds familiarity; it might help bridge the divide between the writings of the past and the contemporary scholar. He or she might actually get it – not on an academic "[T]he 18th century readers, most of whom had studied Cicero in the original Latin..." level, but on a more visceral "Whoa, this guy is amazing!" one. Or in the words of Cleanth Brooks:
"We tend to say that every poem is an expression of its age; that we must be careful to ask of it only what its own age asked; that we must judge it only by the canons of its age. Any attempt to view it sub specie aeternitatis, we feel, must result in illusion.

Perhaps it must. Yet, if poetry exists as poetry in any meaningful sense, the attempt must be made. Otherwise the poetry of the past becomes significant merely as cultural anthropology, and the poetry of the present, merely as a political, or religious, or moral instrument […] We live in an age in which miracles of all kinds are suspect, including the kind of miracle of which the poet speaks. The positivists have tended to explain the miracle away in a general process of reduction which hardly stops short of reducing the "poem" to the ink itself. But the "miracle of communication," as a student of language terms it in a recent book, remains. We had better not ignore it, or try to "reduce" it to a level that distorts it. We had better begin with it, by making the closest possible examination of what the poem says as a poem." (The Well Wrought Urn)
While a knowledge of history, myth and the Bible, of metres and tropes, and of allusions, echoes and thematic connections (things that can be tested on a multiple choice exam) will not get you there, it might take you some way towards experiencing the miracle of language and literature.



Tuesday, April 15, 2014

Amateur vs. Pro: the Bout of the 19th Century



"Many words that are now unused will be rekindled,
Many now well-regarded will fail (if usage wills it so,
To whom the laws, rules, and control of language belong.)" – Horace, Ars Poetica

To trace how the "amateur" of the late 18th century – whether armchair artist or gentleman scholar – turned into a laughing-stock some hundred years later is to sketch the fall of the moneyed and leisured class; it is also to see the rise of the "middling" classes, whose members reconstituted themselves as professionals. The prerogatives of birth meant far less as more people could, at least in theory, gain some upward mobility. But this development came at a price; the widening specialization and division of labor, and the subdivision of life into a public and a private sphere, were hotbeds for alienation and anomie. With professionalism came educational pragmatism. The seven Latin roads of a liberal education – the trivium and quadrivium – were repaved by professional workmen into a Second Empire boulevard. You trained for your chosen career and did not stray from its path. In today's job market, the metaphor of the one road is ubiquitous. If you aspire to a life in the fast lane, simply follow the road to success. 7 steps is all it takes (job hoppers and amateur amblers need not apply)!

Throughout its 230-year history, the word "amateur" has been both inflected by, and vocally opposed to, these changes in society, and throughout much of this time it has held diverging meanings and connotations. It is as if the Zeitgeist itself dabbled as a lexicographer for competing dictionaries. This tension, as I will go on to show, is brilliantly captured in several Victorian works of art, from canonical literature to potboilers and pulp fiction, but let us start with its humble beginnings. Let us turn the clock back to a time when the word was newfangled enough to call out for a definition. In his 1803 Cyclopedia, the Rev. Abraham Rees (himself an amateur encyclopedist) provides the following gloss:
"In the arts [...] a foreign term introduced and now passing current amongst us, to denote a person understanding, and loving or practising the polite arts of painting, sculpture, or architecture, without any regard to pecuniary advantage. [...] Amateurs who practise were never perhaps in greater number or of superior excellence, and those who delight in and encourage the arts have been the means of raising them in this country to that eminence to which they are arrived. It is to be regretted, however, that the great works of former ages, collected by amateurs in this kingdom, are not as accessible to our professors as they are in foreign countries."
Derived from the Latin verb for 'to love' ('amare'), the coinage must be viewed against the backdrop of burgeoning professionalism. The disregard of "pecuniary advantage" pits the dabbler against the professional draftsman or artist, but, in Rees's view, the relationship can be a mutually beneficial one rather than a cause of animosity and class resentment. As a member of the gentry or aristocracy, the amateur collector can afford to do all the legwork while traveling the length and breadth of Europe and racking up artworks, which can then be exhibited and used for instruction by the professors. Today we would perhaps talk about the synergistic effects of non-profit crowdsourcing.

Challenging the Amateur
Unfortunately, Rees's call for amateur-professional collaboration went unheeded. The first decades of the century did see an explosion of books addressed to amateurs in the arts, but these were mostly written by professors or professionals who wanted to impart some (limited) knowledge to the armchair art-lover. Sometimes lip service was paid to his or her judgment; in his 1814 pamphlet Short Address to the Amateurs of Ancient Painting, for example, professional artist H. C. Andrews challenged "the world to produce a painting of equal merit" to da Vinci's St. John the Baptist. But while Rees might have believed this to be possible, maybe even likely (NEWSFLASH: Amateur Art Sleuth Discovers Lost Renaissance Masterpiece Gathering Dust in Venetian Palazzo), Andrews' challenge comes with the supercilious smile of someone who does not expect to be challenged.

Someone who did feel that a gauntlet had been thrown down was the British architect Joseph Gwilt. Reading the morning paper one day, he came across an unsigned review arguing that British architects and professors "afford proof how imperfectly every style of architecture appears to be understood." Enraged by this slight to his profession, he decided to set the record straight:
"may it not be asked, whether this sentence passed upon a whole profession by an amateur, who from his writing is but slenderly versed in the art, is not written with an acerbity which shows some latent feeling arising from the want of homage to amateurs on the part of the professors. It would be refreshing to see one of the designs of any of the amateurs and critics, who, like the reviewer, pronounce judgment on a body of men whose lives are passed in the study of the art." (Elements of Architectural Criticism for the use of Students, Amateurs and Reviewers 24)
Unless they produce a blueprint or design worthy of the pros, the amateurs should help themselves to some humble pie (preferably by reading his book). With supreme irony, Gwilt's hoping for a "refreshing" amateur design casts the non-professionals as musty and moldy – a class well past its sell-by date.

Heraldry and Whores
Judging from other how-to manuals from the time, amateurs were becoming (or perceived themselves as being) more and more marginalized. In 1828 Harriet Dallaway published A Manual of Heraldry for Amateurs – a "small essay intended chiefly for the use of my sex, or amateurs of heraldry, who may have a taste for such pursuits as connected with history and genealogy." The parallelism is striking. In the eyes of the world, the woman shouldn't abandon house and hearth for bookish learning, and, in much the same way, the amateur should throw his avocations to the winds and embark on a professional career. It wouldn't be reading too much into her caveat to say that the amateur was now, if not as frowned upon, at least somehow comparable to the woman with aspirations beyond her immediate "business."

Some decades after Rees's collaborative ideas, things had certainly changed. The ongoing professionalization spared no sector of society, least of all the underworld. In Henry Mayhew's newspaper reports on the seedy sides of Victorian London, he finds himself at a loss to account for the "amateur" prostitute:
"Those women who, for the sake of distinguishing them from the professionals [elsewhere termed "operatives"], I must call amateurs, are generally spoken of as 'Dollymops,' Now many servant-maids, nurse-maids who go with children into the Parks, shop girls and milliners who may be met with at the various 'Dancing Academies,' so called, are 'Dollymops.' We must separate these latter again from the 'Demoiselle de Comptoir,' who is just as much in point of fact a 'Dollymop,' because she prostitutes herself for her own pleasure, a few trifling presents or a little money now and then, and not altogether to maintain herself. But she will not go to the Casinos, or any similar places, to pick up men" (The London Underworld in the Victorian Period 43)
The incredulous "not altogether" (knitted brows, chin in hand) registers the confusion. Now, the point is not that the "Dollymops" were somehow pro bono ambassadors who enjoyed their business. In all likelihood they did not. But by this time it was getting increasingly difficult to wrap your head around the fact that some people had other, perhaps more complicated, motives than those dictated by their profession.

The March of Progress
This process of marginalization, which relegated the amateur to Cabinet of Curiosities fodder (an armchair architect, a female heraldist, a whore who is not quite a whore), did not arise in a vacuum. Writing when the industrial revolution was still in its infancy, Adam Smith extolled the wealth-creating virtues of labour specialization:
"To take an example, therefore, from a very trifling manufacture, but one in which the division of labour has been very often taken notice of, the trade of a pin-maker: a workman not educated to this business (which the division of labour has rendered a distinct trade), nor acquainted with the use of the machinery employed in it [...] could scarce, perhaps, with his utmost industry, make one pin in a day, and certainly could not make twenty. But in the way in which this business is now carried on [...] it is divided into a number of branches, of which the greater part are likewise peculiar trades. One man draws out the wire; another straights it; a third cuts it; a fourth points it; a fifth grinds it at the top for receiving the head; to make the head requires two or three distinct operations; to put it on is a peculiar business; to whiten the pins is another; it is even a trade by itself to put them into the paper; and the important business of making a pin is, in this manner, divided into about eighteen distinct operations [...] ten persons, therefore, could make among them upwards of forty-eight thousand pins in a day."  (An Inquiry into the Nature and Causes of the Wealth of Nations)


For him, the amateur pin-maker was an anachronism, a throwback to a bygone era when a blacksmith or farrier furnished the product all by himself (and, heaven forbid, didn't even stick to pins!). By mid-century the branches of economic theory and social science had converged, and through these "scientific" bifocals the amateur looked even more primitive. Herbert Spencer considered the jack-of-all-trades an atavism, an evolutionary cul-de-sac. Differentiation and professionalism were no longer "just" the order of the day (an ideological choice that made sense in terms of production) but the supreme law of civilization:
"The change from the homogeneous to the heterogeneous is displayed in the progress of civilization as a whole, as well as in the progress of every nation; and is still going on with increasing rapidity. As we see in existing barbarous tribes, society in its first and lowest form is a homogeneous aggregation of individuals having like powers and like functions: the only marked difference of function being that which accompanies difference of sex. Every man is warrior, hunter, fisherman, tool-maker, builder; every woman performs the same drudgeries. Very early, however, in the course of social evolution, there arises an incipient differentiation"  ("Progress: Its Law and Cause")

Major-Generals and Detectives
It is true that the amateur still had some lease on life. With the premiere of Gilbert & Sullivan's The Pirates of Penzance in 1879, he entered the stage as antihero. With his classical erudition and breadth of knowledge, the Major-General makes clear that he is the very model of the liberal scholar. Armed with an unquenchable thirst for knowledge (not to mention an impeccably twisted moustache), he had embarked on an intrepid journey through the trivium and quadrivium, quite oblivious to the fact that the only thing that really mattered, as the 19th century drew to a close, was marching the one road: from the military academy to decorations and promotions via the battlefield. We root for him and feel for him – much like we do for Sir John Falstaff (in terms of "pluck" and military "experience" surely his great forebear) – precisely because we sense the tragedy looming over the comedy. We know that Hal's drinking buddy will one day prove a liability, and we fear that when the curtain falls, the Major-General will be trampled underfoot by the "march" of progress, which turns amateurs, jacks-of-all-trades, dabblers, polymaths and generalists into roadkill.

It is telling that the most enduring character of 19th-century fiction is not the moribund Major-General but his antimatter avatar – someone who, to paraphrase an early review of the opera, "is uninformed on all subjects, except those connected with his profession." Take it away, Dr. Watson!
"Upon my quoting Thomas Carlyle, he inquired in the naivest way who he might be and what he had done. My surprise reached a climax, however, when I found incidentally that he was ignorant of the Copernican Theory and of the composition of the Solar System. That any civilized human being in this nineteenth century should not be aware that the earth travelled round the sun appeared to be to me such an extraordinary fact that I could hardly realize it.
"You appear to be astonished," he said, smiling at my expression of surprise. "Now that I do know it I shall do my best to forget it."
"To forget it!"
"You see," he explained, "I consider that a man's brain originally is like a little empty attic, and you have to stock it with such furniture as you choose. A fool takes in all the lumber of every sort that he comes across, so that the knowledge which might be useful to him gets crowded out, or at best is jumbled up with a lot of other things so that he has a difficulty in laying his hands upon it. Now the skilful workman is very careful indeed as to what he takes into his brain-attic. He will have nothing but the tools which may help him in doing his work, but of these he has a large assortment, and all in the most perfect order. It is a mistake to think that that little room has elastic walls and can distend to any extent. Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones."
"But the Solar System!" I protested.
"What the deuce is it to me?" he interrupted impatiently; "you say that we go round the sun. If we went round the moon it would not make a pennyworth of difference to me or to my work." (A Study in Scarlet)
I have always wondered whether the culture shock Dr. Watson experienced here was not as much a cause of his future PTSD as that fateful Jezail bullet at Maiwand. Fred Flintstone could not have been more confused and agitated had he crashed into George Jetson's atomic aerocar. While Holmes' rationale is anchored in Victorian science – the phrenological idea of a discretely ordered and finite brain – his desire to achieve a one-to-one correspondence between knowledge and the demands of his profession is a strikingly modern one. Even the metaphor he uses would later find its way into the 21st century: if the business self-help books are anything to go by, it is imperative that you develop the specific mental tools and tool kits required by your profession.

In fact, Holmes' methods of observation and deduction are so cutting-edge that it is the professionals who come across as bumbling amateurs:
"Gregson and Lestrade had watched the manoeuvres of their amateur companion with considerable curiosity and some contempt. They evidently failed to appreciate the fact, which I had begun to realize, that Sherlock Holmes' smallest actions were all directed towards some definite and practical end."
It is probably not a coincidence that when he makes his grand reappearance in "The Return of Sherlock Holmes" (both he and Professor Moriarty had previously fallen to their deaths grappling on top of the Reichenbach Falls, but the reading public would have none of that), he does so by pretending to be a Victorian eccentric before unmasking himself. So it turns out that Holmes was unscathed after all; it was the 19th-century amateur who had gone to meet his maker. It's a ham-fisted allegory, but it certainly gets the point across:
"I struck against an elderly, deformed man, who had been behind me, and I knocked down several books which he was carrying. I remember that as I picked them up, I observed the title of one of them, THE ORIGIN OF TREE WORSHIP, and it struck me that the fellow must be some poor bibliophile, who, either as a trade or as a hobby, was a collector of obscure volumes. I endeavoured to apologize for the accident, but it was evident that these books which I had so unfortunately maltreated were very precious objects in the eyes of their owner. With a snarl of contempt he turned upon his heel, and I saw his curved back and white side-whiskers disappear among the throng. [...] I had not been in my study five minutes when the maid entered to say that a person desired to see me. To my astonishment it was none other than my strange old book collector, his sharp, wizened face peering out from a frame of white hair, and his precious volumes, a dozen of them at least, wedged under his right arm. [...]
"Well, sir, if it isn't too great a liberty, I am a neighbour of yours, for you'll find my little bookshop at the corner of Church Street, and very happy to see you, I am sure. Maybe you collect yourself, sir. Here's BRITISH BIRDS, and CATULLUS, and THE HOLY WAR—a bargain, every one of them. With five volumes you could just fill that gap on that second shelf. It looks untidy, does it not, sir?"
I moved my head to look at the cabinet behind me. When I turned again, Sherlock Holmes was standing smiling at me across my study table. I rose to my feet, stared at him for some seconds in utter amazement, and then it appears that I must have fainted for the first and the last time in my life. Certainly a gray mist swirled before my eyes, and when it cleared I found my collar-ends undone and the tingling after-taste of brandy upon my lips. Holmes was bending over my chair, his flask in his hand" ("The Return of Sherlock Holmes")
TBC...

Monday, April 7, 2014

The Computer Illiterati Conspiracy (or "Why the Average Teaching Assistant Makes Six Times as Much as College Presidents")



With a growing college population and the implementation of the Common Core Standards for K-12 students, Automated Essay Scoring (AES for short) is slated to become one of the most lucrative fields in the education market within a few years. Teachers might be good enough when it comes to assessing their students' writing, but they are painfully slow (a computer algorithm can churn out grades for tens of thousands of essays in a matter of seconds); they are also inconsistent and biased, and – banish the thought! – they want to get paid for their services.

These are the arguments put forward by ed-policy makers and supported by one-dimensional (not to say shoddy) research, such as a much-quoted 2012 study from the University of Akron in which the authors compared human readers scoring student essays "drawn from six states that annually administer high-stakes writing assessments" with the performance of nine essay-scoring algorithms grading the same essays. They concluded that:
"automated essay scoring was capable of producing scores similar to human scores for extended-response writing items with equal performance for both source-based and traditional writing genre [sic!] Because this study incorporated already existing data (and the limitations associated with them), it is highly likely that the estimate provided represent a floor for what automated essay scoring can do under operational conditions." (2–3)
Between the lines of academic jargon in the last sentence we find a startling claim: if the high correlation between human readers and their silicon counterparts only represents a "floor" of what the programs are capable of, then the implication must surely be that they are, for all intents and purposes, better graders than the teachers. And true enough, the authors go on to deplore the human raters' inconsistency and inability to follow simple instructions:
"The limitation of human scoring as a yardstick for automatic scoring is underscored by the human ratings used for some of the tasks in this study, which displayed strange statistical properties and in some cases were in conflict with documented adjudication procedure." (27)
This is nonsense; nonsense wrapped in academic abstraction, but nonsense nonetheless. When teachers stray from "documented adjudication procedure", it is precisely because they are experienced and creative readers who know full well that an essay might be great even though it does not conform to – and sometimes consciously flouts – rigid evaluation criteria. And as for their grading exhibiting (gah!) "strange statistical properties", it is important to realize that this is not a sign of human fallibility. Quite the contrary. If there is a huge discrepancy between two readers evaluating the same essay, the likelier explanation is that at least one of them (possibly both, although the one recommending the conservative grade might be wary of repercussions if he or she does not follow the criteria to the letter) has discovered that it is an outstanding essay.
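As an aside, the "similarity" such studies report usually boils down to agreement statistics computed over pairs of raters. Here is a minimal, self-contained sketch of that arithmetic – the scores are invented, and real studies use larger samples and weighted kappa rather than plain correlation, but the principle is the same:

```python
# Illustrative only: two human raters and a machine scoring the same
# five essays on a 1-6 scale. All numbers are invented.
human_a = [4, 3, 5, 2, 4]
human_b = [4, 4, 5, 2, 3]
machine = [4, 3, 5, 2, 4]

def exact_agreement(x, y):
    """Share of essays on which two raters give the identical score."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def pearson(x, y):
    """Plain Pearson correlation, the headline statistic in AES studies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(exact_agreement(human_a, human_b))  # 0.6 -- humans disagree too
print(pearson(human_a, machine))          # 1.0 -- machine mimics rater A
```

The telling detail is that a machine trained to mimic one pool of human scores can "agree" with the humans about as well as the humans agree with each other – which is all the headline correlations establish.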

Computer algorithms will always penalize innovation, but surely the students are not supposed to pen Pulitzer-winning essays? Isn't the point of the essays rather to gauge whether they can craft coherent texts according to the K-12 Common Core Standards (the ones listed below are for informative/explanatory essays)?
"Introduce a topic clearly, provide a general observation and focus, and group related information logically; include formatting (e.g., headings), illustrations, and multimedia when useful to aiding comprehension.
Develop the topic with facts, definitions, concrete details, quotations, or other information and examples related to the topic.
Link ideas within and across categories of information using words, phrases, and clauses (e.g., in contrast, especially).
Use precise language and domain-specific vocabulary to inform about or explain the topic.
Provide a concluding statement or section related to the information or explanation presented."
Yes, but even though these criteria are highly mechanical and wouldn't necessarily (if you excuse my anthropomorphizing) recognize a good essay if it bit them in the face, the AES systems still fall woefully short of them. They can do a word count and a spell check; they can look for run-on sentences and sentence fragments, and compute the ratio of linking words and academic adverbs. The fourth bullet point shouldn't pose much of a problem either, since the systems have been fed hundreds of human-graded texts and have extrapolated the "domain-specific" words that correlate with high grades. And what about factual accuracy and logical progression, surely a piece of cake for the silicon cookie monster? Not quite.
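To make the mechanics concrete, here is a minimal sketch of this kind of surface-feature scoring. It is emphatically not any vendor's actual model – those are proprietary – and every word list, weight and threshold below is invented for illustration:

```python
import re

# Hypothetical word lists: a real system extrapolates these from
# hundreds of human-graded essays; these are invented for illustration.
LINKING_WORDS = {"moreover", "however", "thus", "furthermore", "consequently"}
DOMAIN_WORDS = {"teaching", "assistants", "accommodations", "capitalism"}

def surface_features(essay: str) -> dict:
    """Extract the shallow features an AES system typically relies on."""
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "linking_ratio": sum(w in LINKING_WORDS for w in words) / max(len(words), 1),
        "domain_hits": sum(w in DOMAIN_WORDS for w in words),
    }

def toy_score(essay: str) -> int:
    """Map surface features to a 1-6 grade with made-up weights.
    Note what is absent: nothing here reads the essay for sense."""
    f = surface_features(essay)
    score = 1.0
    score += min(f["word_count"] / 150, 2.0)    # longer essays score higher
    score += min(f["linking_ratio"] * 50, 1.5)  # reward connectives
    score += min(f["domain_hits"] * 0.5, 1.5)   # reward on-topic buzzwords
    return round(min(score, 6.0))
```

Run Perelman's essay below through something like this and the sheer length, the connectives and the buzzwords push it toward the top of the scale; at no point does the code ask whether teaching assistants actually out-earn college presidents.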

One of the most vocal critics of automated essay assessment, Les Perelman, director of writing at M.I.T., has taken one of the most commonly used automatic scoring systems for a spin. The e-Rater is used not by K-12 schools but by ETS to grade graduate-level GRE essays (i.e. one of the most high-stakes tests on the market). So how does it measure up? Let us not even consider creativity, subtlety, style and beauty (all important traits in grad-school work), but look at the rudimentary skills outlined in the Common Core Standards. Is the e-Rater able to discriminate factual accuracy from outlandish claims, logical progression from a narrative mess, sense from nonsense? The following essay, written by Perelman, received the highest grade possible – 6/6 (an essay with this score "sustains insightful in-depth analysis of complex ideas"):
Question: "The rising cost of a college education is the fault of students who demand that colleges offer students luxuries unheard of by earlier generations of college students -- single dorm rooms, private bathrooms, gourmet meals, etc." Discuss the extent to which you agree or disagree with this opinion. Support your views with specific reasons and examples from your own experience, observations, or reading. 

In today's society, college is ambiguous. We need it to live, but we also need it to love. Moreover, without college most of the world's learning would be egregious. College, however, has myriad costs. One of the most important issues facing the world is how to reduce college costs. Some have argued that college costs are due to the luxuries students now expect. Others have argued that the costs are a result of athletics. In reality, high college costs are the result of excessive pay for teaching assistants. 

I live in a luxury dorm. In reality, it costs no more than rat infested rooms at a Motel Six. The best minds of my generation were destroyed by madness, starving hysterical naked, and publishing obscene odes on the windows of the skull. Luxury dorms pay for themselves because they generate thousand and thousands of dollars of revenue. In the Middle Ages, the University of Paris grew because it provided comfortable accommodations for each of its students, large rooms with servants and legs of mutton. Although they are expensive, these rooms are necessary to learning. The second reason for the five-paragraph theme is that it makes you focus on a single topic. Some people start writing on the usual topic, like TV commercials, and they wind up all over the place, talking about where TV came from or capitalism or health foods or whatever. But with only five paragraphs and one topic you're not tempted to get beyond your original idea, like commercials are a good source of information about products. You give your three examples, and zap! you're done. This is another way the five-paragraph theme keeps you from thinking too much. 

Teaching assistants are paid an excessive amount of money. The average teaching assistant makes six times as much money as college presidents. In addition, they often receive a plethora of extra benefits such as private jets, vacations in the south seas, a staring roles in motion pictures. Moreover, in the Dickens novel Great Expectation, Pip makes his fortune by being a teaching assistant. It doesn't matter what the subject is, since there are three parts to everything you can think of. If you can't think of more than two, you just have to think harder or come up with something that might fit. An example will often work, like the three causes of the Civil War or abortion or reasons why the ridiculous twenty-one-year-old limit for drinking alcohol should be abolished. A worse problem is when you wind up with more than three subtopics, since sometimes you want to talk about all of them.
Factual accuracy aside, where is the "in-depth analysis" and the logical progression? This hilarious rant has the trappings of an excellent essay – an advanced vocabulary, plenty of academic linking words, as well as a good portion of the "domain words" used in high-scoring student essays on the same topic ("teaching assistants", "accommodations", "capitalism") – and the machine cannot tell the difference. The algorithm can easily be fooled, something ETS made no secret of in a 2001 paper. But while admitting that utter nonsense can score highly, they also claim that this is of little relevance since students do not set out to trick an algorithm; they write with human beings in mind (there is still a human reader involved in the GRE scoring process), and the overlap between essays deemed good by humans and by the algorithm is almost complete. We can illustrate this with a Venn diagram of essays receiving high scores:

[Venn diagram: blue circle = essays scored highly by human readers; green circle = essays scored highly by the algorithm; the two overlap almost completely]
It won't be long, however, before the human readers are given the boot. Plug the high predictive validity, specious though it might be, into a cost-benefit analysis and you will fool many a school board. And here's the rub: with no human reader involved, the green circle becomes a much more comfortable target to aim for than the blue bull's-eye. Chances are that K-12 teachers, pressured to teach to the Common Core tests rather than the skills these tests are supposed to measure, will be forced to coach their students in producing impressive-sounding gibberish, perhaps along the lines of:
"You see, start out with a phrase such as 'In today's society', 'During the Middle Ages', or, why not, 'In stark contrast to'. Then you rephrase the essay prompt and begin the second paragraph. Start with a linking word; "thus" or "firstly" are always a safe bet. And whatever you do, don't forget the advanced content words; if you're supposed to write about whether technology is good for mankind, how about a liberal sprinkling of "interaction", "alienation", "reliance" and "Luddite"... Oh yes, the last word will almost guarantee that you'll get an A! In the thirds paragraph..."
As loath as I am to beat the dystopian drum here, there is a real risk that the focus on discrete metrics (and consequently on uniformity and rote learning) in the Common Core Standards, rather than promoting transparency and equity, will make us blind to the intrinsic worth and unique skills of each student. No longer human beings, the students become points in a big-data matrix, in which their performance is mapped with mathematical precision onto the performance of their peers. This breaking down of students (pun very much intended) into metrics will most likely lead to a kind of "lessergy", in which total ability falls far short of the sum of the artificially measured skills. A car made out of papier-mâché parts might have the same dimensions as the real thing and pass for it at first glance, but it will not perform very well on the road. In much the same way, a student taught to fool the AES algorithms will hardly have gained any real-life skills in writing or critical thinking.

AES is of course only one facet of the big-data-fication of education, but it is one of the most egregious ones. Until the two-cultures divide has been bridged, policy makers will be as dumbfounded and seduced when told about the "chi-square" correlations of automated essay scoring algorithms, and the "strange statistical properties" of human raters, as Diderot was when (if we are to believe the anecdote) Euler explained that, given the equation:

$$\frac{a+b^n}{n} = x$$

...there is a God.

When I first read Hard Times 12 years ago, I thought it was a clunky, over-the-top satire. Now it seems eerily prophetic (yes, when he wasn't busy earning millions as a high-flying TA, Dickens actually found time to whip up a couple of novels):
"Utilitarian economists, skeletons of schoolmasters, Commissioners of Fact, genteel and used-up infidels, gabblers of many little dog’s-eared creeds [...]  Cultivate in them, while there is yet time, the utmost graces of the fancies and affections, to adorn their lives so much in need of ornament"
Perhaps this is precisely what is needed – a grassroots movement of teachers and educators, writers and poets, students and parents, who can do just that: cultivate some fancies and affections in the Commissioners of Fact, and tell the technocrats and Taylorists that there is more to life than is dreamt of in their philosophies. In the meantime, a good way to start would be to sign this petition against Machine Scoring in High-Stakes Essays (with Noam Chomsky among the signatories).