Tuesday, April 15, 2014

Amateur vs. Pro: the Bout of the 19th Century



"Many words that are now unused will be rekindled,
Many fail now well-regarded (If usage wills it so,
To whom the laws, rules, and control of language belongs.)" – Horace, Ars Poetica

To trace how the "amateur" of the late 18th century – whether armchair artist or gentleman scholar – turned into a laughing-stock some hundred years later is to sketch the fall of the moneyed and leisured class; it is also to see the rise of the "middling" classes, whose members reconstituted themselves as professionals. The prerogatives of birth meant far less as more people could, at least in theory, gain some upward mobility. But this development came at a price: the widening specialization and division of labor, and the subdivision of life into a public and private sphere, were hotbeds for alienation and anomie. With professionalism came educational pragmatism. The seven Latin roads of a liberal education – the trivium and quadrivium – were repaved by professional workmen into a Second Empire boulevard. You trained for your chosen career and did not stray from its path. In today's job market, the metaphor of the one road is ubiquitous. If you aspire to a life in the fast lane, simply follow the road to success. Seven steps is all it takes (job hoppers and amateur amblers need not apply)!

Throughout its 230-year history, the word "amateur" has both been inflected by, and vocally opposed to, these changes in society, and throughout much of this time it has held diverging meanings and connotations. It is as if the Zeitgeist itself dabbled as a lexicographer for competing dictionaries. This tension, as I will go on to show, is brilliantly captured in several Victorian works of art, from canonical literature to potboilers and pulp fiction, but let us start with the word's humble beginnings. Let us turn the clock back to a time when it was newfangled enough to call out for a definition. In his 1803 Cyclopedia, the Rev. Abraham Rees (himself an amateur encyclopedist) provides the following gloss:
"In the arts [...] a foreign term introduced and now passing current amongst us, to denote a person understanding, and loving or practising the polite arts of painting, sculpture, or architecture, without any regard to pecuniary advantage. [...] Amateurs who practise were never perhaps in greater number or of superior excellence, and those who delight in and encourage the arts have been the means of raising them in this country to that eminence to which they are arrived. It is to be regretted, however, that the great works of former ages, collected by amateurs in this kingdom, are not as accessible to our professors as they are in foreign countries"
The word derives from the Latin verb for 'to love' ('amare'), but it is against the backdrop of burgeoning professionalism that we must view the coinage. The disregard of "pecuniary advantage" pits the dabbler against the professional draftsman or artist, but, in Rees's view, the relationship can be a mutually beneficial one rather than the cause of animosity and class resentment. As a member of the gentry or aristocracy, the amateur collector can afford to do all the legwork, traveling the length and breadth of Europe and racking up artworks, which can then be exhibited and used for instruction by the professors. Today we would perhaps talk about the synergistic effects of non-profit crowdsourcing.

Challenging the Amateur
Unfortunately, Rees's call for amateur-professional collaboration went unheeded. The first decades of the century did see an explosion of books addressed to amateurs in the arts, but these were mostly written by professors or professionals who wanted to impart some (limited) knowledge to the armchair art-lover. Sometimes lip service was paid to his or her judgment; in his 1814 pamphlet Short Address to the Amateurs of Ancient Painting, for example, professional artist H. C. Andrews challenged "the world to produce a painting of equal merit" to da Vinci's St. John the Baptist. But while Rees might have believed this to be possible, maybe even likely (NEWSFLASH: Amateur Art Sleuth Discovers Lost Renaissance Masterpiece Gathering Dust in Venetian Palazzo), Andrews' challenge comes with the supercilious smile of someone who does not expect to be challenged.

Someone who did feel that a gauntlet had been thrown down was British architect Joseph Gwilt. Reading the morning paper one day he came across an unsigned review arguing that British architects and professors "afford proof how imperfectly every style of architecture appears to be understood." Enraged by this slight to his profession he decided to set the record straight:
"may it not be asked, whether this sentence passed upon a whole profession by an amateur, who from his writing is but slenderly versed in the art, is not written with an acerbity which shows some latent feeling arising from the want of homage to amateurs on the part of the professors. It would be refreshing to see one of the designs of any of the amateurs and critics, who, like the reviewer, pronounce judgment on a body of men whose lives are passed in the study of the art." (Elements of Architectural Criticism for the use of Students, Amateurs and Reviewers 24)
Unless they produce a blueprint or design worthy of the pros, the amateurs should help themselves to some humble pie (preferably by reading his book). With supreme irony, Gwilt's professed hope for a "refreshing" amateur design casts the non-professionals as musty and moldy – a class well past its sell-by date.

Heraldry and Whores
Judging from other how-to manuals from the time, amateurs were becoming (or perceived themselves as being) more and more marginalized. In 1828 Harriet Dallaway published A Manual of Heraldry for Amateurs – a "small essay intended chiefly for the use of my sex, or amateurs of heraldry, who may have a taste for such pursuits as connected with history and genealogy." The parallelism is striking. In the eyes of the world, the woman shouldn't abandon house and hearth for bookish learning, and, in much the same way, the amateur should throw his avocations to the winds and embark on a professional career. It wouldn't be reading too much into her caveat to say that the amateur was now, if not as frowned upon, at least somehow comparable to the woman with aspirations beyond her immediate "business."

Some decades after Rees's collaborative ideas, things had certainly changed. The ongoing professionalization spared no sector of society, least of all the underworld. In Henry Mayhew's newspaper reports on the seedy sides of Victorian London, he finds himself at a loss to account for the "amateur" prostitute:
"Those women who, for the sake of distinguishing them from the professionals [elsewhere termed "operatives"], I must call amateurs, are generally spoken of as 'Dollymops,' Now many servant-maids, nurse-maids who go with children into the Parks, shop girls and milliners who may be met with at the various 'Dancing Academies,' so called, are 'Dollymops.' We must separate these latter again from the 'Demoiselle de Comptoir,' who is just as much in point of fact a 'Dollymop,' because she prostitutes herself for her own pleasure, a few trifling presents or a little money now and then, and not altogether to maintain herself. But she will not go to the Casinos, or any similar places, to pick up men" (The London Underworld in the Victorian Period 43)
The incredulous "not altogether" (knitted brows, chin in hand) registers the confusion. Now, the point is not that the "Dollymops" were somehow pro bono ambassadors who enjoyed their business. In all likelihood they did not. But by this time it was getting increasingly difficult to wrap your head around the fact that some people had other, perhaps more complicated, motives than those dictated by their profession.

The March of Progress
This process of marginalization that relegated the amateur to Cabinet of Curiosity fodder (an armchair architect, a female heraldist, a whore who is not quite a whore) did not arise out of a vacuum. Writing when the industrial revolution was still in its infancy, Adam Smith extolled the wealth-creating virtues of labour specialization:
"To take an example, therefore, from a very trifling manufacture, but one in which the division of labour has been very often taken notice of, the trade of a pin-maker: a workman not educated to this business (which the division of labour has rendered a distinct trade), nor acquainted with the use of the machinery employed in it [...] could scarce, perhaps, with his utmost industry, make one pin in a day, and certainly could not make twenty. But in the way in which this business is now carried on [...] it is divided into a number of branches, of which the greater part are likewise peculiar trades. One man draws out the wire; another straights it; a third cuts it; a fourth points it; a fifth grinds it at the top for receiving the head; to make the head requires two or three distinct operations; to put it on is a peculiar business; to whiten the pins is another; it is even a trade by itself to put them into the paper; and the important business of making a pin is, in this manner, divided into about eighteen distinct operations [...] ten persons, therefore, could make among them upwards of forty-eight thousand pins in a day."  (An Inquiry into the Nature and Causes of the Wealth of Nations)


For him, the amateur pin-maker was an anachronism, a throwback to a bygone era when a blacksmith or farrier furnished the product all by himself (and, heaven forbid, didn't even stick to pins!). By mid-century the branches of economic theory and social science had converged, and through these "scientific" bifocals the amateur became even more primitive. Herbert Spencer considered the jack-of-all-trades an atavism, an evolutionary cul-de-sac. Differentiation and professionalism were no longer "just" the order of the day (an ideological choice that made sense in terms of production) but the supreme law of civilization:
"The change from the homogeneous to the heterogeneous is displayed in the progress of civilization as a whole, as well as in the progress of every nation; and is still going on with increasing rapidity. As we see in existing barbarous tribes, society in its first and lowest form is a homogeneous aggregation of individuals having like powers and like functions: the only marked difference of function being that which accompanies difference of sex. Every man is warrior, hunter, fisherman, tool-maker, builder; every woman performs the same drudgeries. Very early, however, in the course of social evolution, there arises an incipient differentiation"  ("Progress: Its Law and Cause")

Major-Generals and Detectives
It is true that the amateur still had some lease of life. With the premiere of Gilbert & Sullivan's The Pirates of Penzance in 1879, he entered the stage as an antihero. With his classical erudition and breadth of knowledge, the Major-General makes clear that he is the very model of the liberal scholar. Armed with an unquenchable thirst for knowledge (not to mention an impeccably twisted moustache), he had embarked on an intrepid journey through the trivium and quadrivium, quite oblivious to the fact that the only thing that really mattered as the 19th century drew to a close was marching the one road: from the military academy to decorations and promotions via the battlefield. We root for him and feel for him – much like we do for Sir John Falstaff (in terms of "pluck" and military "experience" surely his great forebear) – precisely because we sense the tragedy looming over the comedy. We know that Hal's drinking buddy will one day prove a liability, and we fear that when the curtain falls, the Major-General will be trampled underfoot by the "march" of progress, turning amateurs, jacks-of-all-trades, dabblers, polymaths and generalists into roadkill.

It is telling that the most enduring character of 19th-century fiction is not the moribund Major-General, but his antimatter avatar – someone who, to paraphrase an early review of the opera, "is uninformed on all subjects, except those connected with his profession." Take it away, Dr. Watson!
"Upon my quoting Thomas Carlyle, he inquired in the naivest way who he might be and what he had done. My surprise reached a climax, however, when I found incidentally that he was ignorant of the Copernican Theory and of the composition of the Solar System. That any civilized human being in this nineteenth century should not be aware that the earth travelled round the sun appeared to be to me such an extraordinary fact that I could hardly realize it.
"You appear to be astonished," he said, smiling at my expression of surprise. "Now that I do know it I shall do my best to forget it."
"To forget it!"
"You see," he explained, "I consider that a man's brain originally is like a little empty attic, and you have to stock it with such furniture as you choose. A fool takes in all the lumber of every sort that he comes across, so that the knowledge which might be useful to him gets crowded out, or at best is jumbled up with a lot of other things so that he has a difficulty in laying his hands upon it. Now the skilful workman is very careful indeed as to what he takes into his brain-attic. He will have nothing but the tools which may help him in doing his work, but of these he has a large assortment, and all in the most perfect order. It is a mistake to think that that little room has elastic walls and can distend to any extent. Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones."
"But the Solar System!" I protested.
"What the deuce is it to me?" he interrupted impatiently; "you say that we go round the sun. If we went round the moon it would not make a pennyworth of difference to me or to my work." (A Study in Scarlet)
I have always wondered whether the culture shock Dr. Watson experienced here was not as much a cause for his future PTSD as that fateful Jezail bullet in Maiwand. Fred Flintstone could not have been more confused and agitated had he crashed into George Jetson's atomic aerocar. While Holmes' rationale is anchored in Victorian science – the phrenologist idea of a discretely ordered and finite brain – his desire to achieve a one-to-one correspondence between knowledge and the demands of his profession is a strikingly modern one. Even the metaphor he uses to describe this would later find its way into the 21st century: if all the Business Self-Help books are anything to go by, it is imperative that you develop the specific mental tools and tool kits required by your profession.

In fact, Holmes' methods of observation and deduction are so cutting-edge that it is the professionals who come across as bumbling amateurs:
"Gregson and Lestrade had watched the manoeuvres of their amateur companion with considerable curiosity and some contempt. They evidently failed to appreciate the fact, which I had begun to realize, that Sherlock Holmes' smallest actions were all directed towards some definite and practical end."
It is probably not a coincidence that when he makes his grand reappearance in "The Adventure of the Empty House," the opening story of The Return of Sherlock Holmes (both he and Professor Moriarty had previously fallen to their deaths grappling on top of the Reichenbach Falls, but the reading public would have none of that), he does so by pretending to be a Victorian eccentric before unmasking himself. So it turns out that Holmes was unscathed after all; it was the 19th-century amateur who had gone to meet his maker. It's a ham-fisted allegory, but it certainly gets the point across:
"I struck against an elderly, deformed man, who had been behind me, and I knocked down several books which he was carrying. I remember that as I picked them up, I observed the title of one of them, THE ORIGIN OF TREE WORSHIP, and it struck me that the fellow must be some poor bibliophile, who, either as a trade or as a hobby, was a collector of obscure volumes. I endeavoured to apologize for the accident, but it was evident that these books which I had so unfortunately maltreated were very precious objects in the eyes of their owner. With a snarl of contempt he turned upon his heel, and I saw his curved back and white side-whiskers disappear among the throng. [...] I had not been in my study five minutes when the maid entered to say that a person desired to see me. To my astonishment it was none other than my strange old book collector, his sharp, wizened face peering out from a frame of white hair, and his precious volumes, a dozen of them at least, wedged under his right arm. [...]
"Well, sir, if it isn't too great a liberty, I am a neighbour of yours, for you'll find my little bookshop at the corner of Church Street, and very happy to see you, I am sure. Maybe you collect yourself, sir. Here's BRITISH BIRDS, and CATULLUS, and THE HOLY WAR—a bargain, every one of them. With five volumes you could just fill that gap on that second shelf. It looks untidy, does it not, sir?"
I moved my head to look at the cabinet behind me. When I turned again, Sherlock Holmes was standing smiling at me across my study table. I rose to my feet, stared at him for some seconds in utter amazement, and then it appears that I must have fainted for the first and the last time in my life. Certainly a gray mist swirled before my eyes, and when it cleared I found my collar-ends undone and the tingling after-taste of brandy upon my lips. Holmes was bending over my chair, his flask in his hand." ("The Adventure of the Empty House," The Return of Sherlock Holmes)
TBC...

Monday, April 7, 2014

The Computer Illiterati Conspiracy (or "Why the Average Teaching Assistant Makes Six Times as Much as College Presidents")



With a growing college population, and the implementation of the Common Core Standards for K-12 students, Automated Essay Scoring (AES for short) is slated to become one of the most lucrative fields in the education market within a few years. Teachers might be good enough when it comes to assessing their students' writing, but they are painfully slow (a computer algorithm can churn out grades for tens of thousands of essays in a matter of seconds); they are also inconsistent and biased, and – perish the thought! – they want to get paid for their services.

These are the arguments put forward by ed policy makers and supported by one-dimensional (not to say shoddy) research, such as a much-quoted 2012 study from the University of Akron in which the authors compared human readers scoring student essays "drawn from six states that annually administer high-stakes writing assessments" with the performance of nine essay algorithms grading the same essays. They concluded that:
"automated essay scoring was capable of producing scores similar to human scores for extended-response writing items with equal performance for both source-based and traditional writing genre [sic!] Because this study incorporated already existing data (and the limitations associated with them), it is highly likely that the estimate provided represent a floor for what automated essay scoring can do under operational conditions." (2–3)
Between the lines of academic jargon in the last sentence we find a startling claim: if the high correlation between human readers and their silicon counterparts only represents a "floor" of what the programs are capable of, then the implication must surely be that they are, for all intents and purposes, better graders than the teachers. And true enough, the authors go on to deplore the human readers' inconsistency and inability to follow simple instructions:
"The limitation of human scoring as a yardstick for automatic scoring is underscored by the human ratings used for some of the tasks in this study, which displayed strange statistical properties and in some cases were in conflict with documented adjudication procedure." (27)
This is nonsense; nonsense wrapped in academic abstraction, but nonsense nonetheless. When teachers stray from "documented adjudication procedure," this is precisely because they are experienced and creative readers who know full well that an essay might be great even though it does not conform to – and sometimes consciously flouts – rigid evaluation criteria. And as for their grading exhibiting (gah!) "strange statistical properties," it is important to realize that this is not a sign of human fallibility. Quite the contrary. If there is a huge discrepancy between two readers evaluating the same essay, this indicates that at least one of them (possibly both, although the one recommending the conservative grade might be wary of repercussions if he or she does not follow the criteria to the letter) has discovered that it is an outstanding essay.

Computer algorithms will always penalize innovation, but surely the students are not supposed to pen Pulitzer-winning essays? Isn't the point of the essays rather to gauge whether they can craft coherent texts according to the K-12 Common Core Standards (the ones listed below are for informative/explanatory essays)?
"Introduce a topic clearly, provide a general observation and focus, and group related information logically; include formatting (e.g., headings), illustrations, and multimedia when useful to aiding comprehension.
Develop the topic with facts, definitions, concrete details, quotations, or other information and examples related to the topic.
Link ideas within and across categories of information using words, phrases, and clauses (e.g., in contrast, especially).
Use precise language and domain-specific vocabulary to inform about or explain the topic.
Provide a concluding statement or section related to the information or explanation presented."
Yes, but even though these criteria are highly mechanical and wouldn't necessarily (if you'll excuse my anthropomorphizing) recognize a good essay if it bit them in the face, the AES systems still fall woefully short. They can do a word count and a spell check; they can look for run-on sentences and sentence fragments, and discover the ratio of linking words and academic adverbs. The fourth bullet point shouldn't pose much of a problem either, since they have been fed hundreds of texts graded by humans and have extrapolated the "domain-specific" words which correlate with high grades. And what about factual accuracy and logical progression, surely a piece of cake for the silicon cookie monster? Not quite.
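To make the list of surface checks above concrete, here is a minimal sketch in Python. It is emphatically not how the e-Rater or any other commercial system works; the word lists, the function names (`surface_features`, `toy_score`) and the weights are all invented for illustration. It only shows how far one can get by counting things without ever looking at meaning.

```python
import re

# Hypothetical word lists, invented for illustration only.
LINKING_WORDS = {"moreover", "however", "thus", "firstly", "furthermore", "consequently"}
DOMAIN_WORDS = {"college", "costs", "luxuries", "accommodations", "capitalism"}


def surface_features(essay: str) -> dict:
    """Count surface traits of a text; its meaning is never inspected."""
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    total = len(words)
    return {
        "word_count": total,
        "sentence_count": len(sentences),
        # Share of all words that are linking words.
        "linking_ratio": sum(w in LINKING_WORDS for w in words) / total if total else 0.0,
        # Number of words that appear in the (assumed) domain list.
        "domain_hits": sum(w in DOMAIN_WORDS for w in words),
    }


def toy_score(features: dict) -> float:
    """Made-up weighted sum, capped at 6 to mimic a six-point scale."""
    raw = (
        1.0
        + features["word_count"] / 100
        + 20 * features["linking_ratio"]
        + 0.5 * features["domain_hits"]
    )
    return min(6.0, raw)
```

Feed this toy a sentence such as "Moreover, college costs are egregious. However, luxuries abound." and it hits the ceiling of the scale, because every quantity it counts is present in abundance. Whether the sentence is true, or even coherent, never enters the calculation.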

One of the most vocal critics of automated essay assessment, Les Perelman, who is director of writing at M.I.T., has taken one of the most commonly used automatic scoring systems for a spin. The e-Rater is used not by K-12 schools but by the ETS to grade graduate-level GRE essays (i.e. one of the most high-stakes tests on the market). So how does it measure up? No, let us not even consider creativity, subtlety, style and beauty (all important traits in grad school work), but look at the rudimentary skills outlined in the Common Core Standards. Is the e-Rater able to discriminate factual accuracy from outlandish claims, logical progression from a narrative mess, sense from nonsense? The following essay, written by Perelman, received the highest grade possible – 6/6 (an essay with this score "sustains insightful in-depth analysis of complex ideas"):
Question: "The rising cost of a college education is the fault of students who demand that colleges offer students luxuries unheard of by earlier generations of college students -- single dorm rooms, private bathrooms, gourmet meals, etc." Discuss the extent to which you agree or disagree with this opinion. Support your views with specific reasons and examples from your own experience, observations, or reading. 

In today's society, college is ambiguous. We need it to live, but we also need it to love. Moreover, without college most of the world's learning would be egregious. College, however, has myriad costs. One of the most important issues facing the world is how to reduce college costs. Some have argued that college costs are due to the luxuries students now expect. Others have argued that the costs are a result of athletics. In reality, high college costs are the result of excessive pay for teaching assistants. 

I live in a luxury dorm. In reality, it costs no more than rat infested rooms at a Motel Six. The best minds of my generation were destroyed by madness, starving hysterical naked, and publishing obscene odes on the windows of the skull. Luxury dorms pay for themselves because they generate thousand and thousands of dollars of revenue. In the Middle Ages, the University of Paris grew because it provided comfortable accommodations for each of its students, large rooms with servants and legs of mutton. Although they are expensive, these rooms are necessary to learning. The second reason for the five-paragraph theme is that it makes you focus on a single topic. Some people start writing on the usual topic, like TV commercials, and they wind up all over the place, talking about where TV came from or capitalism or health foods or whatever. But with only five paragraphs and one topic you're not tempted to get beyond your original idea, like commercials are a good source of information about products. You give your three examples, and zap! you're done. This is another way the five-paragraph theme keeps you from thinking too much. 

Teaching assistants are paid an excessive amount of money. The average teaching assistant makes six times as much money as college presidents. In addition, they often receive a plethora of extra benefits such as private jets, vacations in the south seas, a staring roles in motion pictures. Moreover, in the Dickens novel Great Expectation, Pip makes his fortune by being a teaching assistant. It doesn't matter what the subject is, since there are three parts to everything you can think of. If you can't think of more than two, you just have to think harder or come up with something that might fit. An example will often work, like the three causes of the Civil War or abortion or reasons why the ridiculous twenty-one-year-old limit for drinking alcohol should be abolished. A worse problem is when you wind up with more than three subtopics, since sometimes you want to talk about all of them.
Factual accuracy aside, where is the "in-depth analysis" and the logical progression? This hilarious rant has the trappings of an excellent essay – an advanced vocabulary, plenty of academic linking words as well as a good portion of "domain words" used in student essays on the same topic that scored highly ("teaching assistants", "accommodations", "capitalism") – and the machine cannot tell the difference. The algorithm can be easily fooled, something ETS made no secret of in a 2001 paper. But while admitting that utter nonsense can score highly, they also claim that this is of little relevance since students do not set out to trick an algorithm; they write with human beings in mind (there is still a human reader involved in the GRE scoring process), and the overlap between essays deemed good by humans and the algorithms is almost complete. We can illustrate this with a Venn diagram of essays receiving high scores:

[Figure: Venn diagram of essays receiving high scores]

It won't be long, however, before the human readers are given the boot. If you plug the high predictive validity, specious though it might be, into a cost-benefit analysis you would fool many a school board. And here's the rub: with no human reader involved, the green circle is a much more comfortable target to aim for than the blue bull's eye. Chances are that K-12 teachers, pressured to teach to the Common Core tests rather than the skills these tests are supposed to measure, will be forced to coach their students in producing impressive-sounding gibberish, perhaps along the lines of:
"You see, start out with a phrase such as 'In today's society', 'During the Middle Ages', or, why not, 'In stark contrast to'. Then you rephrase the essay prompt and begin the second paragraph. Start with a linking word; "thus" or "firstly" are always a safe bet. And whatever you do, don't forget the advanced content words; if you're supposed to write about whether technology is good for mankind, how about a liberal sprinkling of "interaction", "alienation", "reliance" and "Luddite"... Oh yes, the last word will almost guarantee that you'll get an A! In the thirds paragraph..."
As loath as I am to beat the dystopian drum here, there is a real risk that the focus on discrete metrics (and consequently on uniformity and rote-learning) in the Common Core Standards, rather than promoting transparency and equity, might make us blind to the intrinsic worth and unique skills of each student. No longer human beings, they are now points in a big data matrix, in which their performance is mapped with mathematical precision to the performance of their peers. This breaking down of students (pun very much intended) into metrics will most likely lead to a kind of "lessergy" where total ability bears no relation to the sum of their artificially measured skills. A car made out of papier-mâché parts, which might have the same dimensions and at first glance pass for the real thing, will not perform very well on the road. And in much the same way, a student taught to fool the AES algorithms will hardly have gained any real-life skills in writing or critical thinking.

AES is of course only one facet of the big data-fication of education, but it is one of the most egregious ones. Until the two cultures divide has been bridged, policy makers will be as dumbfounded and seduced when told about "chi-square" correlations of automated essay scoring algorithms, and the "strange statistical properties" of human raters, as Diderot was when (if we are to believe the anecdote) Euler explained that given the equation:

\frac{a+b^n}{n}=x

...there is a God.

When I first read Hard Times 12 years ago, I thought it was a clunky, over-the-top satire. Now it seems eerily prophetic (yes, when he wasn't busy earning millions as a high-flying TA, Dickens actually found time to whip up a couple of novels):
"Utilitarian economists, skeletons of schoolmasters, Commissioners of Fact, genteel and used-up infidels, gabblers of many little dog’s-eared creeds [...]  Cultivate in them, while there is yet time, the utmost graces of the fancies and affections, to adorn their lives so much in need of ornament"
Perhaps this is precisely what is needed – a grassroots movement of teachers and educators, writers and poets, students and parents, who can do just that: cultivate some fancies and affections in the Commissioners of Fact, and tell the technocrats and Taylorists that there is more to life than what is dreamt of in their philosophies. Until then, a good way to start would be to sign this petition against Machine Scoring in High-Stakes Essays (with Noam Chomsky as one of the signatories).