Tell me where is fancy bred.
Or in the heart or in the head?
How begot? How nourished?
– The Merchant of Venice, Act 3, Scene 2
In addition, there is literature arguing that sharing explicit criteria with students is inadequate and almost inevitably leads to instrumental learning and “criteria compliance” among students …
– A Critical Review of the Arguments Against the Use of Rubrics
As with a kid in a candy shop, too much choice can lead to decisions based on superficial, merely external or cosmetic factors. Bassanio chose the lead casket, dismissing gold as “gaudy” and as “hard food for Midas”. Even so, he did not make the right choice because he knew that not all that glitters is gold.
Gold does glitter and this is the problem. How does one distinguish between options? In every conceivable domain where there is competition, it is not only possible but true in practice that market participants come to resemble each other at least on the surface.
This can be a very effective strategy if the substance is not as important as the appearance of substance. Since bandwidth is a scarce resource, something which looks good on the surface could be more valuable than something which is actually good. If good means eyeballs and the ability to inspire envy, glitter is gold.
Someone once said that those who wear Brand X watches have no character because all they see and, more importantly, all they want others to see is X emblazoned prominently. Some, like Bassanio, have realised that they could be judged for being shallow. So marketers have a new trick up their sleeves: they have introduced a new word, “understated”. Understated means not garish. But if something does not shine, how would it attract? Indeed, nature requires peacocking in every domain. Though Bassanio chose the casket which did not shine, he did so only because a word in the song Portia had her servant sing stood out to him.
Branding comprises educating the market about quality and communicating evidence of that quality. In every product and service category and, increasingly, every human one, manufacturers, providers or contenders have constructed a list of criteria on which the market should focus. Having constructed the list, they then go to great lengths to demonstrate fidelity to it. Apart from the obvious conflict of interest, this causes great unhappiness because of poor fit. The buyer is left thinking: I was told this is good. All of us know this is what good looks like. The rest look at me because I own good. Why am I still unhappy?
It is arguable that so long as selection or reward is based on some predetermined set of criteria, one is merely moving from one form of glitter to another, whether we call it understated or something else.
Predetermined criteria reduce diversity and autonomy – I have to be what everyone says is good and choose what everyone says is good. Reliance on predetermined criteria could also leave us in Arragon’s plight when he heard these words from Portia: “Too long a pause for that which you find there.” He too was wise enough to distrust “the fool multitude that choose by show, not learning more than the fond eye doth teach”. Flashiness is deceptive; it is what is inside that matters. However, he depended on reason and not instinct.
Students may not be able to rely on their instinct because it is not sufficiently developed or because they do not trust it. Teachers, too, cannot rely on instinct alone in their evaluations if they are to avoid giving the impression of arbitrariness, favouritism, other types of bias, or negligence.
For these reasons, students are introduced to rubrics which are used in assessments, especially those which are formative in nature. Formative assessments are premised on assessment for learning (AFL). In AFL, tests reveal gaps in the student’s current level of performance and highlight areas for growth. A “rubric is an assessment tool that lists the criteria for a piece of work or what counts (for example, purpose, organization, details, voice, and mechanics often are what count in a written essay) and articulates gradations of quality for each criterion, from excellent to poor” (Goodrich, 2005). The gradations might be expressed as score bands, for example: 10 and below, not meeting expectations; 11-14, barely meeting expectations; 15-17, meeting expectations; 18-20, exceeding expectations.
Since a rubric tells a student where he currently stands and what the next level requires, it lends itself to formative assessments. Theoretically, he would know the next level to work towards. Teachers want students to aim for the next level, and they know that if an overall numerical score or grade were given, students would zoom in on this alone and ignore all other information on the rubric. This is why marks are not given, especially for writing assignments which require a process of drafting.
However, knowing what the next level requires is different from knowing how to get there. Let’s consider a rubric for compositions. Compositions are marked on two broad criteria: language and content. Under content, keywords that recur across mark bands indicate a criterion. “Ideas”, for example, is one criterion for compositions. For the mid-range mark band, the indicator could read, “Ideas are mostly clear and relevant”, and for the top-range mark band, “Ideas are very clear and fully relevant to the topic”.
A student who has obtained a mid-range mark would know he needs his ideas to be clearer and more relevant to the topic, but he would not know how. To assist his progress, he may be shown an exemplar of what “fully relevant” and “very clear” look like. This might go some way towards convincing him of the objectivity of marking. However, since composition topics vary, he may not know how to be relevant in a new topic. Even if he finds the ideas in the exemplar clearly expressed, he may not know how to express the ideas in his own composition clearly. Exemplars have also been shown to restrict diversity of ideas, especially if a student is performance oriented.
Panadero and Jonsson (2020) enumerate the criticisms which have been levelled against the use of rubrics in assessment. One such criticism was classified as instrumentalism and criteria compliance – “In essence, this means that when being informed about the criteria, learners will focus on meeting these criteria with minimal effort and also limit their performance to what is explicated by the criteria, leaving other things aside”. This is known as studying to the test. Since the discipline is much more than the test, it is entirely possible to ace the test and give the appearance of mastery without mastery.
One example of instrumentalism and criteria compliance in the workplace could be a focus on relationships rather than on the job role, if employees realise that visibility and good relations are more profitable than earnest labour. In group presentations, the most confident and vocal speaker could be mistaken for the one who contributed the most.
Another noteworthy criticism is that rubrics do not take into account “emergent criteria” or “criteria that surface while evaluating student performance”. Wilson (2006, as cited in Panadero and Jonsson, 2020) has argued that evaluations should be made solely on the basis of the “individual reactions evoked by student performance” and elaborated thus: “I suggest that we make ourselves transparent as we read−that we pay attention to what goes on in our minds and try to put our reactions and questions and wonderings and musings and connections and images into words”. This may reduce “interrater agreement” when there are two markers, but it would keep faith with the principle that beauty is in the eye of the beholder.
Emergent criteria and individual reactions are connected. There may well be situations where we experience a product, a service or a piece of work and feel good without knowing why. Rubrics and other kinds of checklists for selection grow out of such reactions. Someone gathers all the items of some category which evoke positive reactions and studies their similarities. These similarities are then documented and measured on some scale: high X – low X. The similarities have now become criteria. “Criteria” carries the connotation of qualification: without these characteristics, an item does not pass muster and is not even considered seriously for selection. These criteria are then widely publicised in articles, and so firms and contenders in other domains adopt them. Now these similarities are so widely seen and accepted that no other good is imagined.
This is how popular culture works. When immersed in the drama of one culture, viewers form an impression of what it means to be acceptable and valued. This becomes an unchallenged truth. It is only when one accidentally or deliberately watches the drama of a very different culture that one realises there are other, equally valid conceptions of value.
If we take a step back to examine the now universally accepted criteria, we realise they were based on similarities. Items X, Y and Z have gained mass acceptance. They all have A, B and C. Therefore, the reasoning goes, A, B and C are what make something valuable. However, it could equally be the case that X, Y and Z were each good in their own way and to different people. These other characteristics would never see the light of day because they would not have made it to the list of criteria. Also, the very idea of mass acceptance is oxymoronic. What appears to be universal acceptance could simply be universal recognition. So characteristics A, B and C make an item noticeable, but they may not satisfy. It is difficult to see what universal list of criteria could satisfy the wide-ranging diversity of the market.
If something is chosen on criteria, it has not been chosen at all.
The Brain Dojo