Assessment in language education
(MA in English language teaching)
According to Erwin (1991), as cited in Javaherbakh (2010), assessment is a continuous process of learning and development. Assessment is an approach that makes it possible to gather information and make inferences about learners’ potential, or about the quality or success of teaching, from different sources of learner performance. Assessment can take various forms, such as tests, interviews, questionnaires, and observation. For example, it is important to assess a learner’s comprehension ability to determine whether the learner can follow a course of study in a school or whether extra instruction is needed.
Delivering assessment is one of the main activities that teachers carry out in order to provide feedback on students’ learning. According to Walvoord (2010), assessment is the systematic collection of information used to make decisions about students’ achievement. Knowing their attainment usually motivates students to make a greater effort to achieve more. Students’ engagement in the learning process can make assessment more effective in raising their motivation and learning efficiency (Khonbi & Sadeghi, 2012). Therefore, assessment is an important tool that must be used in the learning process to evaluate the targeted learning objectives in order to increase both students’ achievement and motivation. Recently, educators around the world have been putting more emphasis on the use of authentic assessment, which is regarded as a more valid tool for assessing students’ real competence. Authentic assessment is any type of assessment that is implemented in order to demonstrate skills and competencies by representing problems and situations likely to be encountered in daily life (Collins, 2013). It is performance-based and requires students to exhibit the extent of their learning through a demonstration of mastery. The learning process moves beyond memorization of fact or theory and encourages students to respond by using the knowledge they have learned. The characteristics of authentic assessment are performing a task, real life, construction/application, student-structured, and direct evidence (Mueller, 2006). Therefore, authentic assessment is a form of assessment that requires students to perform the tasks they encounter in their daily life by using the knowledge and skills they obtain during the learning process (Marhaeni & Artini, 2015).
Taras (2001) alludes to increasing student consumerism manifesting itself in a greater demand for involvement in and control of the assessment process. In the context of students being customers of an institution of higher learning, assessment grades “represent the final package that students want or expect to be delivered” (p. 612). This emphasis on satisfying the student customer in the form of assessment grades creates a dilemma for assessment practices. On the one hand, teachers or academics may desire to provide opportunities for assessment to meet a variety of educational objectives. Such objectives may include enhancing students’ learning and developing students’ assessment ability. On the other hand, if consumer satisfaction were the ultimate determiner of students’ grades, then the integrity of those grades and their role in the assessment process would be questionable.
Another context for assessment practice is its potential for providing greater transparency in the assessment process for students. It allows students to understand how they are assessed by their teachers, by having their assessments compared with (and discussed in terms of) the teacher’s assessment (Tan, 2003).
In EFL contexts, teachers need to help students develop their assessment ability so that they can use peer and teacher assessment with proper evaluation criteria. Approaches to peer and teacher assessment are necessary to support students’ professional L2 development. Assessment is a huge topic encompassing everything from formal tests to everyday classroom tests. It is a challenging task, but it helps language instructors find effective ways to determine what and how much students are learning. Instructors need to think carefully about their instructional goals and what kinds of assessments support these goals. Effective assessment tools can help teachers modify and focus instruction on what students need to know and be able to do in the target language in order to achieve communicative competence.
According to Bachman (2004), assessment is “a process of collecting information about something that we are interested in, according to procedures that are systematic and substantially grounded” (pp. 6-7). The result of an assessment procedure can be either a score or a verbal description. Spolsky and Ari Huhta (2008) referred to assessment as “all kinds of procedures used to assess individuals (e.g., informal observations, self-assessments, quizzes, interviews, tests)” (p. 469).
Assessment can be classified, based on the decisions made according to the possible outcomes, as formal vs. informal assessment and summative vs. formative assessment. Formal assessments are systematically planned and designed to get information about students’ achievement at predetermined times. Brown (2004) likened formal assessment to tennis tournaments and made a distinction between formal assessment and testing. Although he considered all kinds of tests to be formal assessment, the reverse is not true; that is, formal assessment is not necessarily performed as a test. He associated tests with time constraints, which are not always present in formal assessment. For example, systematic observation of students’ oral performance is a kind of formal assessment, but it can hardly be called a test, which is limited to a specific time and gathers limited pieces of information (Brown, 2004).
Based on Brown’s definition (2004), informal assessment includes occasional and unplanned comments and feedback. Informal assessment is not designed before the class. The results of this kind of assessment are not recorded, and no judgment is made based on them. According to Brown, informal assessment takes the form of various types of feedback, from simply saying “Nice job!” to giving detailed comments about students’ performance (Brown, 2004). Informal assessment is interwoven with every second of the teaching process, as teachers always give feedback to students. So, informal assessment, according to Brown’s definition, is more about giving feedback than about making decisions on students’ performance.
Summative assessment, as its name suggests, summarizes what students have learnt during a course, and it is usually done at the end of the semester (Brown, 2004). This kind of assessment indicates what objectives have been accomplished, but it lacks feedback or any suggestion to improve performance. Final exams or proficiency tests are examples of summative assessment. Alderson (2005) associated summative assessment with long traditional tests that were very stressful for students. Any kind of test that lacks further feedback, and whose only use in the eyes of students is gathering scores, can be summative even if teachers have primarily designed the test to facilitate learning and teaching.
Taking place during learning, formative assessment aims to support learning and teaching by giving appropriate feedback (Lewy, 1990). Nitko (1993) identified two purposes of formative assessment: (a) selecting or modifying learning procedures, and (b) choosing the best remedies for improving weak points in learning and teaching. Gattullo (2000) characterized formative assessment as “(a) an ongoing multi-phase process that is carried out on a daily basis through teacher–pupil interaction, (b) a feedback for immediate action, and (c) modifying teaching activities in order to improve learning processes and results.” (p. 279). Students form their knowledge by analyzing and internalizing teachers’ comments via classroom assessments (Brown, 2004).
It seems that formative assessment has not always been the focus of attention in second language studies. Before 2000, a number of studies were done on classroom assessment in regular school programs (Rogers, 1991; Wilson, 2000), but very few studies were conducted on formative assessment in EFL learning contexts (Cheng, Rogers, & Hu, 2004). Rea-Dickins and Gardner (2000) pointed to this neglect too and said that, compared to other topics in language testing, formative assessment had received less attention. However, it should be said that Bachman (1990) was one of the first scholars to discuss the complexities and difficulties of formative assessment. He stated that the types of feedback received by students could affect the results of future formal tests. Bachman put more emphasis on formal tests, and he did not discuss the construct of formative assessment. In his later book with Palmer (1996), he focused on feedback and the relation between “formative evaluation” (p. 98) and formal tests. Shohamy (1995) was another scholar to take an early step in the discussion of formative assessment. She named some methods, such as portfolios and projects, which teachers used in order to rely less on formal tests and to capture different aspects of language competence. However, the construct and practices of formative assessment in EFL contexts were only sparsely discussed before 2000, in spite of the fact that its usefulness had long been recognized by both teachers and researchers.
Students should use the language in order to learn it, and if they are graded all the time, they do not have the opportunity to do so. They should receive feedback, analyze it, and have the chance to test their hypotheses based on the feedback received. This is the most basic requirement for learning a language (Brown, 2004; Harris & McCann, 1994). Summative formal assessment makes use of traditional paper-and-pencil tests and is followed only by scores, without any further feedback. Such tests are usually given at the end of a course, which is very stressful for students and teachers. Lack of feedback results in a lack of diagnostic information, so students do not clearly know their weak points. However, students’ performance depends heavily on appropriate feedback from the teacher, which is the defining feature of formative assessment. Teachers can make use of formative assessment to prevent the negative washback effect of formal testing, namely the separation of teaching and learning in the eyes of students.
To assess their students every session, teachers have to consider an assessment task and keep some questions in mind, such as when and how often they should assess the students, or how they should conduct an assessment procedure. The questions of “what” and “why” rarely come to teachers’ minds (Bachman & Palmer, 2010). The reason for not asking the what-question is quite clear, as teachers usually know what they want the learners to learn. However, it is less clear why teachers do not ask the why-question; they either know the answer or seldom consider the reason for assessment. The why-question is important since it defines the decisions to be made on the basis of the outcomes of an assessment. The primary use of language assessment is to make decisions about individuals (micro-evaluation), programs (macro-evaluation), and other stakeholders (Bachman & Palmer, 2010). It can be used to select individuals, place them into an appropriate course of study, make changes in instruction, predict the future performance of test-takers, make changes in educational programs (formative or summative decisions), formulate new research questions, and modify the understanding of a specific language phenomenon (Bachman, 2004).
Cizek (2010) identified 10 important features of formative assessment, as follows:
- Students take responsibility for their own learning through formative assessment.
- It clarifies specific learning goals of any course.
- Students and teachers may focus on goals representing valuable educational outcomes with applicability beyond the learning context.
- Students may be able to identify their current knowledge/skills and the necessary steps for reaching the desired goals.
- Formative assessment supports the development of plans for attaining the desired goals.
- Students are encouraged to self-monitor their progress toward the learning goals.
- It provides examples of learning goals including, when relevant, the specific grading criteria or rubrics that will be used to evaluate the student’s work.
- It involves frequent assessment, including peer and student self-assessment and assessment embedded within learning activities.
- It includes feedback that is non-evaluative, specific, timely, and related to the learning goals, and that provides opportunities for the student to revise and improve work products and deepen understandings.
- Students’ metacognition and reflection on their work are promoted.
Heritage (2007) further categorizes formative assessments into three types contributing to the learning cycle:
- On-the-fly assessment, which happens during a lesson,
- Planned-for-interaction assessment, which is decided before instruction, and
- Curriculum-embedded assessment, which is embedded in the curriculum and used to gather data at significant points during the learning process.
To be successful assessors, teachers must develop “assessment literacy” (Gallagher & Turley, 2012): a deep understanding of why, when, and how they assess learners in ways that positively impact student learning. In addition, successful assessors view assessment through an inquiry lens, using varying assessments to learn from and with their students in order to adjust classroom practices accordingly. A deep knowledge of assessment and an inquiry approach to assessment may create a particular stance toward assessment. Teachers who hold this stance as knowledgeable inquirers are equipped with the autonomy to make decisions about assessment practices that in turn provide them with meaningful information in their own classrooms. Formative assessment can indeed be powerful and productive, especially when assessments are planned, designed, implemented, and studied by the classroom teacher (Stephens & Story, 1999). Such assessments provide teachers with information that enables them to understand students better and to support them in taking the next steps in their learning (Vygotsky, 1986).
Teachers as knowledgeable inquirers are able to choose among a variety of tools and strategies that best suit the context of their own classrooms. Analogous to the work of ethnographers or teacher researchers, meaningful formative assessment is used by teachers to help them study students in action and their artifacts of learning so that they better understand their performance. As teachers conduct their assessment work from this stance of knowledgeable inquirers, they have many strategies and tools from which to choose. Successful teacher assessors carefully select or create the right assessment at the right time in order to inform instruction and support the learner, thoughtfully administering the assessment with the least disruption to the ongoing learning in the classroom (Serafini, 2010). These assessments might be grouped into four types—Observations, Conversations, Student Self-Evaluations, and Artifacts of Learning—briefly described below:
- Observations
The foundation of a teacher’s assessment work is observation. Teachers who observe students involved in language use and learning tend to know their students’ strengths and challenges better and are then able to plan supportive classroom learning experiences. Central to developing a formative assessment stance is learning to observe closely in order to see beyond assumptions and predictions. Observations take the following forms:
- Field Notes
Descriptions of classroom interactions are recorded (in journals, on a computer, or on sticky notes), with judgment and interpretation withheld until later. Some teachers scribble down notes during class, some wait until the end of the day, and others videotape the class and later take notes based on viewing particular segments.
- Running Records
While listening to learners’ oral reading and to their retelling of what has been read, teachers take quick notes about student reading.
- Checklists and Observation Guides
Teachers gather information about pre-selected learning behaviors or interactions by marking tallies on a chart or keeping a record of examples of specific student actions (such as the types of questions being asked or the particular strategies being used).
- Conversations
Teachers may specifically ask students for further information by conducting surveys, interviews, or conferences based on questions they have about students’ learning. Among the conversational tools teachers use for assessment are these:
- Surveys
General information about reading or writing preferences or attitudes toward classroom literacy experiences can be gathered through written or oral surveys. The data obtained via surveys can indicate general trends in a class or for a group of students across time. Ideally, teachers would use this information to plan more focused follow-up assessments or observations.
- Interviews
Interviews provide a more targeted look than surveys. Teachers may work with open-ended questions, such as “When you are reading and you come to something you don’t know, what do you do?” (Burke, as cited in Serafini, 2010) or “What would you like to do better as a writer?”, or with other questions based on specific questions they have about student learning.
- Conferences
In reading and writing conferences, teachers invite students to share specific information about their intentions, processes, and/or products in order to help both teacher and student better understand the student’s learning and identify next steps. Teachers often talk with students about the processes they use to select a topic for a writing piece, or the writing strategies they learned in a recent writing project. Through reading conferences, teachers learn why a student chose to abandon a particular book or what a student is working to understand in a current reading selection.
- Student Self-Evaluations
Student self-evaluations, an important component of formative assessment, are deliberate efforts to elicit student perspectives on their own learning. Students may reflect on progress toward a goal, on processes used for reading or writing, on new goals, or on lingering questions. Students are encouraged to monitor their own learning and learning needs through self-assessment which serves as an additional source of information on student learning. Student self-evaluations can take many forms:
- Exit Slips
To gather information about current understandings and/or current questions, students are invited to complete a quick “exit slip” as they leave the room or at the end of a lesson.
- Rubrics and Checklists
Students assess their own work and use the information to revise or to plan future learning experiences using pre-determined or student-generated lists of quality indicators.
- Process Reflections
Students write reflections highlighting the process they used to create particular artifacts or understandings and lessons they learned that will influence the way they approach similar work in the future.
- Student-Led Conferences
Conversations between student and parent, between student and teacher, or among student, parent, and teacher are designed to help the student highlight significant areas of growth and set goals for future learning.
- Artifacts of Learning
Teachers review data about individual students or groups of students for planning future learning experiences. For example, teachers may:
- Collect a variety of sources of information on a single learner (case study) in order to identify patterns of understanding across the data set. Data may include samples of student work, notes based on classroom observations, input from other adults including parents, as well as standardized assessment data.
- Review a class set of work samples or observations in order to group students for further instruction or to plan learning experiences for the entire group.
- Look back at a variety of points along a student’s learning journey over the school year or over several years in order to see patterns of growth and to identify important next steps.
Teacher assessors also engage in ongoing analysis of the information available via the tools and strategies used to gather information about learning. Classroom teachers constantly make decisions based on their analysis of the information available at any given moment as learners engage in learning. Formative assessment allows teachers to immediately match instruction to students’ needs. As teachers refine their powers of observation and their skill in analyzing, they become better able to see what students are learning so that they can plan for future learning experiences. In addition, teachers protect time to engage in more thoughtful analysis by capturing information about learning that can be reviewed and studied over time. During this focused analysis, teachers review the information available and ask themselves and one another three key questions: “What do you see?”; “What do you make of it?”; “What will you do about it?” (Boudett, City, & Murnane, 2013).
Connecting Assessment and Instruction
The first way of conceptualizing a relationship between assessment and instruction that Rea-Dickins (2004) discussed has to do with the impact of formal testing on teaching and learning. This phenomenon is generally referred to as the washback effect (Cheng, 2005). Washback manifests itself predominantly in situations of high-stakes testing, where obtaining high test scores comes to be the goal of education, with the result that the scores themselves are not representative of knowledge or ability in a given domain but rather indicate how well students have been trained for the test (Alderson & Wall, 1993; Bailey, 1996). Some authors, such as Frederiksen and Collins (1989), have suggested that test impact could be good or bad. Describing what they term a test’s systemic validity, they argue that a test has high systemic validity if it promotes favorable instructional practices and low systemic validity to the extent that it inhibits learning (p. 28). While one can appreciate this perspective, it is nevertheless the case that the social value placed on attaining high test scores is sometimes so great that tests themselves actually stand in the way of instructional practice.
While washback studies investigate the impact of assessment on instruction, other researchers reverse this relationship and assign the leading role to instruction (Poehner, 2008). In this approach to linking assessment and instruction, assessment procedures emerge from a grounded analysis of instructional interactions and pedagogical practices as observed in the classroom. This approach enables classroom teachers to assume a more agentive role in determining assessment practices. Rea-Dickins (2004) explains that an added advantage of curriculum-driven assessment is that it lends itself well to evaluations of program effectiveness. In other words, because the assessments are derived from curricular objectives, students’ assessment performances can be taken as an indicator of how well those objectives are being met.
A third approach to bringing assessment and instruction together involves establishing pedagogical goals and then devising parallel instruction and assessment activities (Poehner, 2008). Rather than imposing an assessment on an extant educational context or using classroom practices to generate assessment procedures, instruction and assessment from this perspective should be developed alongside each other. The task-based framework is an excellent example of such an approach. In task-based pedagogies, both instruction and assessment are modeled after the kinds of communicative activities that characterize everyday life (Skehan, 2001; Wigglesworth, 2001). Learning tasks are intended to simulate real-life communicative interactions that promote students’ “individual expression” (Chalhoub-Deville, 2001, p. 214). These types of interactions are also used in assessment situations, where it is argued that their authenticity allows examiners to make generalizations about learners’ abilities that extend beyond the “learning/testing situation” and that predict how they will perform in other settings.
The final perspective on the relationship between assessment and instruction discussed by Rea-Dickins (2004) attempts to break down the wall between the two by carrying out assessments during the course of instructional activities. This “instruction-embedded” assessment is usually carried out by classroom teachers in order to fine-tune instruction to learners’ needs, and as such represents a type of formative assessment. Formative assessment refers to assessment practices intended to feed back into teaching by providing important information regarding learners’ strengths and weaknesses that can be used for subsequent instructional decisions (Brown, 2004). As Bachman (1990) explains, formative assessment is usually contrasted with summative assessment, that is, assessment that occurs at the end of an instructional period and is intended to report on learning outcomes. Both summative and formative assessments are concerned with learners’ futures, but in very different ways. Summative assessments report on individuals’ past achievements in order to make decisions about their future possibilities, including promotion to the next level of study and certification of competence required for graduation or employment. Formative assessments, on the other hand, are more directly connected to teaching and learning.
The Impact of Assessment Techniques on Students
The impact of assessment on students will be greatly determined by the purpose of the assessment, type of assessment tool used and the way in which feedback is given to students (Lynch, 2001). The research described above shows the diversity in purposes of assessment, from measuring and quantifying knowledge to supporting and enhancing the learning process. The majority of research on assessment measures the impact on students in terms of achievement, aligning with the dominant assessment paradigm. In this subsection, research that focuses on the affective impact of assessment will be discussed.
Harlen and Deakin Crick (2003) conducted a systematic review of the literature based on the question “What is the evidence of the impact of summative assessment and testing on students’ motivation for learning?” (p. 178), covering research published up until 2000. They examined 19 studies that were deemed the most relevant to the topic after inclusion criteria were applied. They defined motivation as including self-esteem, self-concept, and self-efficacy, as well as effort, interest, and attitude. They found a positive relationship between achievement and self-esteem, which was reinforced by high-stakes testing. This suggests that recognition of success in schooling is related to high self-esteem, while perceived failure, represented by low grades, is linked to low self-esteem.
Brookhart, Walsh, and Zientarski (2006) also studied the relationships between motivation, effort, and assessment by examining middle school science and social science classes in Pennsylvania. They found that there was little relationship between effort and achievement, but that there was a positive relationship between motivation and achievement. Moreover, motivation was strongly linked to self-efficacy. Students who believed they were capable were more motivated, and performed better on assessment tasks, both tests and performance assessments.
Tal (2005) implemented a multi-faceted approach to assessment in an environmental education course for pre-service teachers. While students responded positively to these types of assessments, many acknowledged that they struggled with the new assessment framework, and did not understand the purpose of many of the tasks, such as self-assessments. Her work indicates that the intended impact of assessment practices may not be realized if students are not prepared for new types of assessments.
Hanrahan and Isaacs (2001) studied the responses to self- and peer-assessment of 233 university students. As in Tal’s study, there seemed to be a disconnection between the research methodology and the assessment philosophy posed by the researchers, since the assessments were summative and did not support the learning process. They also found, like Tal, that students required training with these assessment techniques in order to benefit from them. Their research indicated that students felt that self- and peer-assessment helped them develop critical thinking skills, increased motivation, and engendered empathy for their teachers. While the development of critical thinking skills was an obvious goal of the exercise, the development of empathy was an unexpected by-product.
According to Anderman and Wolters (2006), students’ appraisals of whether they have achieved or made sufficient progress toward their target goals have emotional implications. This recognition of the emotional impact of self-assessment is uncommon in the literature on the impact of assessment, and is especially relevant in the context of outdoor education. Anderman and Wolters also note that affective experiences in assessment have an effect on students’ goal selection and thus their future activities.
The research on the impact of assessment on students suggests a link between assessment and motivation, which has the potential to work for high-achieving students and against low achievers. Assessment practices also have unintended effects on students, and can be ineffective if students do not understand or relate to the purpose. However, Anderman and Wolters (2006) open up the possibility of other emotional impacts of assessment and how these are manifested in students’ behavior and self-concept.
Assessment in Language Education
Given the varied and often conflicting responsibilities teachers face in their daily activities, it is not surprising that assessment becomes a demanding task and raises the question “Why do we assess anyway?” Students frequently become angry when they have to be assessed in order to demonstrate their mastery of a specific content or competency to pass to the next level of instruction (Poehner, 2008). Questioning the purpose of assessment may seem quite unusual since it has become a part of our everyday life (Brown, 2004). However, assessment specialists are increasingly reflecting on the reasons behind specific assessment practices as well as the role of assessment in society. Traditionally, assessment is described as an information-gathering activity (Lynch, 2001). For instance, McNamara (2004) explains that we assess in order to gain insights into learners’ level of knowledge or ability. One might imagine that the information gained through assessment procedures would be welcomed and viewed as an integral component of good teaching. However, the use of terms such as “teaching to the test,” “narrowing of the curriculum,” and “assessment-driven instruction” suggests that assessment is seen as an activity that is distinct from the goals of teaching (Lynch, 2001; McNamara, 2001). Rea-Dickins’ research (2004) into classroom-based assessment leads her to the conclusion that teachers often feel compelled to choose “between their role as facilitator and monitor of language development and that of assessor and judge of language performance as achievement”.
Approaches to Assessment
- Traditional Assessment
In theory, the primary purpose of assessment is to measure whether or not learning has taken place (Fiddler, Marienau, & Whitaker, 2006). Tyler (1949) described the purpose of assessment as a “process for finding out how far the learning experiences as developed and organized are actually producing the desired results” (p. 105). Assessment refers to the systematic gathering of information for the purposes of making decisions or judgments about individuals. Assessment is the superordinate term for a range of procedures that includes measurement and testing but is not restricted to these forms (Lynch, 2001). Both of these definitions of the goals of assessment are somewhat idealized and do not suggest a mechanism for how assessment actually takes place.
An assessment paradigm is a framework that reflects the underlying assumptions about the nature of knowledge and learning. Traditional assessment practices are rooted in the behaviorist perspective of learning. This perspective is based on the idea that learning is a process of accumulating knowledge and information in discrete pieces, with limited transfer or synthesis (Shepard, 2000). Shepard (2000) argues that assessment under this paradigm tries to determine whether students have retained the information given to them by their teachers. The main consideration in designing assessment tools under this perspective is to ensure that the material covered is being assessed, as information retention is the primary goal.
The primary concerns in traditional assessment are ensuring validity and reliability in assessment tools. These concerns are in line with the goal of measuring and quantifying learning and the accumulation of knowledge. Validity here refers to the extent to which a tool actually measures what it is intended to measure. Reliability refers to the consistency with which a tool produces the same results. That these factors are the primary concerns in assessing students reflects the product of traditional assessment, which is the final grade (Brown, 2004). When learning is represented as a number, it is very important that the number be as valid and reliable as possible, or it lacks meaning (Alderson, 2000). Traditional assessment philosophy has produced tools that allow teachers to rank students against each other; these are called norm-referenced assessment techniques. Norm-referenced assessment can be inaccurate or unreliable in some populations, and is thus considered to be less valid than criterion-referenced assessment (Dunn et al., 2004). The shift towards criterion-referenced assessment, also known as standards-based assessment, is indicative of the need to develop more accurate representations of students’ learning than can be provided through norm-referenced assessment (Fulcher & Davidson, 2007).
The result of the traditional assessment framework is a heavy emphasis on norm-referenced assessments used for summative evaluation of student learning (Black & Wiliam, 1998). The resulting grades are then used to make judgments about students’ abilities and potential. As Haertel and Herman (2005) observe, assessment and evaluation have traditionally been used to sort and select students. Grades provide a basis for differentiating between students and a means of selecting the ‘best’ students for further learning opportunities. Assessment results are used by external bodies, and students are not active participants in the process of assessment.
- Alternative Assessment
There are a number of alternatives to traditional assessment practices. The two most prevalent alternatives are authentic assessment and assessment for learning. The common underlying philosophy of alternatives to traditional assessment will be discussed here, and the assessment practices associated with both authentic assessment and assessment for learning will be described. Table 1 shows the differences between traditional and alternative assessment.
Table 1. Traditional and alternative assessment (taken from Brown, 2004, p. 13)
Traditional assessment | Alternative assessment
One-shot standardized exams | Continuous long-term assessment
Timed, multiple-choice format | Untimed, free-response format
Decontextualized test items | Contextualized communicative tasks
Scores suffice for feedback | Individualized feedback and washback
Focus on the “right” answer | Open-ended, creative answers
Oriented to product | Oriented to process
Many researchers have recognized that alternative assessment techniques are efficient and dynamic means of assessing learners’ educational development. Alternative assessment includes procedures and techniques which facilitate the process of instruction and are easily incorporated into the daily activities of the students (Hamayan, 1995). It is especially efficient in ESL/EFL contexts, in which students can demonstrate what they can produce rather than what they can remember and recall (Huerta-Macias, 1995). Alternative assessment is intended to gather information about how students are able to process and complete real-life tasks (Huerta-Macias, 1995). Self-assessment is a process through which students learn about themselves (Dikel, 2005). Put another way, a good language learner monitors his or her own speech and that of others; that is, the learner pays attention to how well the speech is being received and whether the performance meets the standards already learned (Rubin, 1975).
Since the emergence of alternative assessment methods, many researchers have attempted to probe the efficiency of implementing new methods of assessing the language learning of different learners. Ross (1998) studied the effect of using formative assessment on foreign language proficiency development by involving eight cohorts of foreign language learners in an eight-year longitudinal study. He found that formative assessment procedures had very positive effects on language proficiency development. Cheng and Warren (2005) studied the advantages of peer-assessment in English language programs. In their research, undergraduate engineering students attending a university in Hong Kong were asked to assess the English language proficiency of their peers. The authors also compared peer and teacher assessments. Their results showed that the students had a negative perception of assessing their peers’ language proficiency, but that they scored their peers’ language proficiency in a manner similar to the teachers when using the same assessment criteria.
Zakian, Moradan, and Naghibi (2012) also examined the relationship between self-, peer-, and teacher-assessments of EFL learners’ speaking ability. A questionnaire was used to compare and contrast the learners’ attitudes towards their involvement in assessment. The learner-assessors had a training session before the assessment task. Their findings, based on the Pearson product-moment correlation coefficient, showed a strong correlation between self-, peer-, and teacher-assessments. They also found that involving students in the assessment makes the testing environment safe and stress-free.
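To illustrate how such agreement between raters can be quantified, the following is a minimal sketch of the Pearson product-moment correlation coefficient applied to three sets of speaking scores. The function name `pearson_r` and all score values are hypothetical and chosen only for illustration; they are not data or procedures taken from Zakian, Moradan, and Naghibi (2012).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each score from its own list mean
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    # Sum of cross-products and sums of squared deviations
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores in a list are equal")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical speaking scores (0-20) for six learners; NOT data from any cited study.
self_scores = [14, 16, 12, 18, 15, 11]
peer_scores = [13, 17, 12, 17, 14, 10]
teacher_scores = [12, 16, 11, 18, 14, 10]

pairs = {
    "self vs. peer": pearson_r(self_scores, peer_scores),
    "self vs. teacher": pearson_r(self_scores, teacher_scores),
    "peer vs. teacher": pearson_r(peer_scores, teacher_scores),
}

for label, r in pairs.items():
    print(f"{label}: r = {r:.2f}")
```

Values of r close to 1 would indicate that learners, peers, and teachers rank the same performances in a similar order. A high correlation does not by itself show that the three groups award the same absolute scores, so mean differences would need to be examined separately.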
As for general course achievement, Khonbi and Sadeghi (2012) investigated the effect of self-, peer-, and teacher-assessment on Iranian undergraduate EFL students’ course achievement. The students were pretested on their current knowledge of Teaching Methods. After receiving relevant instruction and training, the first experimental group (N = 21) was involved in self-assessment activities, the second (N = 23) was engaged in peer-assessment tasks, and the third (N = 21) was subjected to teacher-assessment; the control group (N = 19) received no assessment-related treatment. The results indicated differences in performance among the self-assessment, peer-assessment, teacher-assessment, and control groups, in favor of the peer-assessment group.
As for speaking ability, Ahangari, Rassekh-Alqol, and Akbari Hamed (2013) examined the effect of peer assessment on the oral presentations of Iranian EFL students. Peer assessment was incorporated into the experimental group’s course to explore whether and to what extent their oral presentation skills might improve. Data were obtained through a Likert-scale peer assessment questionnaire. The results showed a statistically significant difference among the groups. The findings also suggested that, when assessment criteria are clearly established, peer assessment empowers students to evaluate the performance of their peers in a manner comparable to that of teachers.
Xiao and Lucking (2008) compared the effects of two peer assessment methods on university students’ academic writing performance and their satisfaction with peer assessment. They also examined the validity and reliability of student-generated assessment scores. Students in the experimental group demonstrated greater improvement in their writing than those in the comparison group, and they also reported higher levels of satisfaction with both the structure of the peer assessment and the peer feedback. Additionally, the validity and reliability of the student-generated rating scores were found to be extremely high. Using Wiki interactive software to provide an online collaborative learning environment added value to peer assessment.
Peer-assessment is a type of alternative assessment technique through which learners assess one another’s progress using checklists provided by their teachers. It draws on several principles, chief among them cooperative learning. It is simply “one arm of a plethora of tasks and procedures within the domain of learner-centered and collaborative education” (Brown, 2004, p. 270).
Peer assessment can be described generally as a process whereby students evaluate, or are evaluated by, their peers. In educational practice, it takes many different forms, such as grading a peer’s research report, providing qualitative feedback on a classmate’s presentation, or evaluating a fellow trainee’s professional task performance (Zundert, Sluijsmans, & Merrienboer, 2010).
Peer assessment aims to foster future learning and to mitigate the difficulties that are expected to occur. It also aims to transform students from mere recipients of knowledge, which they memorize and recall on tests, into active learners and participants in the learning and evaluation process who interact, search, explore, and discover relationships between objects in order to generate new knowledge characterized by critical thinking and creativity. Peer assessment also helps to ensure a quality education for all students (Rogers & Threatt, 2000) and to develop learners’ self-direction, one of the quality measures in education (Papinczak, Young, Groves, & Haynes, 2007).
Peer assessment has been used increasingly as an assessment tool in education in recent decades (Gielen, Dochy, Onghena, Struyven, & Smeets, 2011). It represents a learner-centered system of learning that depends on effective learning with others and focuses on the full integration of the student in the process of collaborative learning with peers under the supervision of the teacher (Thomas, Martin, & Pleasants, 2011). Peer assessment is used to enhance learning: it is an effective way to increase students’ motivation by engaging them in the evaluation process, which has received attention in recent years from a number of international universities (Rimer, 2007), and it encourages peers to help each other master the topic of learning (Alzaid, 2017).
Benefits of Peer-assessment
The benefits of incorporating peer assessment into regular assessment procedures have been discussed in a number of studies (Burnett & Cavaye, 1980; Earl, 1986; Goldfinch & Raeside, 1990; Kwan & Leung, 1996; Webb, 1995). Peer assessment is believed to enable learners to develop abilities and skills denied to them in a learning environment in which the teacher alone assesses their work. In other words, it provides learners with the opportunity to take responsibility for analyzing, monitoring, and evaluating aspects of both the learning process and the product of their peers. Research studies examining this mode of assessment have revealed that it can work towards developing students’ higher order reasoning and higher level cognitive thought (Birdsong & Sharplin, 1986), helping to nurture student-centered learning among undergraduate learners (Oldfield & MacAlpine, 1995), encouraging active and flexible learning (Entwistle, 1993), and facilitating a deep approach to learning rather than a surface approach (Entwistle, 1987, 1993; Gibbs, 1992). Peer assessment can also act as a socializing force and enhance relevant skills and interpersonal relationships between learner groups (Cheng & Warren, 1999).
As a common classroom practice, teachers evaluate their students’ progress using assessment of learning and formative assessment (Afitska, 2014). Assessment of learning, known as summative assessment, refers to testing that takes place at the end of a term (DeLuca & Klinger, 2010), and it has been the more widely used of the two (Ataç, 2012). However, with respect to the relationship between assessment and instruction, assessment for learning (henceforth, AFL) has been developed more recently (Harlen, 2006). On this view, assessment and learning are inextricably interwoven, and assessment is viewed as a means of nurturing student learning (Davison & Leung, 2009).
In the language teaching context, the term “assessment bridge” refers to a classroom-based area connecting language assessment and second language acquisition with the hoped-for learning outcomes (Colby-Kelly & Turner, 2007). Language teachers draw conclusions about learner performance through assessment (Brown, 2001). This concern is shared by those who learn, those who teach, and those who are responsible for the development and accreditation of courses. Language teaching researchers believe that although teachers usually correct learners, these corrections rarely lead to much improvement, especially in productive skills.
Language teachers’ assessment practices can be affected by contextual and experiential variables. According to Author (2018), contextual factors are defined as larger educational, social, political, historical, or other factors, whereas experiential factors include assessment background, training, and practice. Recent research has shown that contextual factors can holistically form an assessment culture that affects teachers’ assessment practices in a local context (e.g., Rea-Dickins, 2001; Vogt & Tsagari, 2014), particularly the development and application of assessments in school settings. Moreover, research on experiential factors suggests that (1) teachers use the assessment practices with which they are more familiar (Reynolds-Keefer, 2010); and (2) in the case of new assessment activities, methods, or tools, teachers learn assessment on the job, through which they develop assessment intuitions (Scarino, 2013; Vogt & Tsagari, 2014).
References
Alderson, J. C. (2005). Diagnosing foreign language proficiency: The interface between learning and assessment. London: Continuum.
Anderman, E. M., & Wolters, C. (2006). Goals, Values, and Affect: Influences on Student Motivation. In P. Alexander & P. Winne (Eds.), Handbook of Educational Psychology (2nd ed., pp. 369-389). Mahwah: Lawrence Erlbaum Associates.
Ahangari, S., Rassekh-Alqol, B., & Akbari Hamed, L. (2013). The Effect of Peer Assessment on Oral Presentation in an EFL Context. International Journal of Applied Linguistics & English Literature, 2(3), 45-54.
Alzaid, J. M. (2017). The Effect of Peer Assessment on the Evaluation Process of Students. International Education Studies.
Afitska, O. (2014). Use of formative assessment, self-and peer-assessment in the classrooms: Some insights from recent language testing and assessment (LTA) research. Journal on English Language Teaching, 4(1), 29–39.
Ataç, B. A. (2012). Foreign language teachers’ attitude toward authentic assessment in language. Journal of Language and Linguistic Studies, 8(2), 7–19.
Alderson, J. C., & Wall, D. (1993). Does washback exist? Applied Linguistics, 14(2), 115-129.
Bailey, K. M. (1996). Working for washback: A review of the washback concept in language testing. Language Testing, 13, 257-279.
Bachman, L. (2004). Statistical analysis for language assessment. Cambridge: Cambridge University Press.
Brown, H. D. (2004). Language assessment: Principles and classroom practices. New York, NY: Pearson Education.
Bachman, L., & Palmer, A. (2010). Language assessment in practice. Oxford: Oxford University Press.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Boudett, K., City, E., & Murnane, R. (Eds.). (2013). Data wise: A step-by-step guide to using assessment results to improve teaching and learning (2nd ed.). Cambridge, MA: Harvard Education Press.
Brown, D. (2001). Teaching by principles: An interactive approach to language pedagogy. San Francisco State University: Longman.
Brookhart, S. M., Walsh, J. M., & Zientarski, W. A. (2006). The dynamics of motivation and effort for classroom assessments in middle school science and social studies. Applied Measurement in Education, 19(2), 151-184.
Birdsong, T., & Sharplin, W. (1986). Peer evaluation enhances students’ critical judgment. Highway One, 9, 23-28.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5, 7-74.
Collins, R. (2013). Authentic assessment: Assessment for learning. Curriculum and Leadership Journal, 11(7).
Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors’ classroom assessment practices: Purposes, methods, and procedures. Language Testing, 21(3), 360-389.
Cizek, G. (2010). An introduction to formative assessment: History, characteristics, and challenges. In H. Andrade, & G. Cizek (Eds.), Handbook of formative assessment (pp. 3-17). New York, NY: Routledge.
Cheng, L. (2005). Changing language teaching through language testing: A washback study. Studies in language testing. Cambridge: Cambridge University Press.
Cheng, W., & Warren, M. (2005). Peer assessment of language proficiency. Edward Arnold Publishers.
Chalhoub-Deville, M. (2001). Task-based assessments: Characteristics and validity evidence. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks: Second language learning, teaching and testing. Harlow: Longman.
Colby-Kelly, C., & Turner, C. E. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review, 64(1), 9–37.
Dikel, M. R. (2005). A guide to going online for self-assessment tools: A critical view. An assessment in Language, 6(5), 64-74.
DeLuca, C., & Klinger, D. A. (2010). Assessment literacy development: Identifying gaps in teacher candidates’ learning. Assessment in Education: Principles, Policy & Practice, 17(4), 419–438.
Davison, C., & Leung, C. (2009). Current issues in English language teacher-based assessment. TESOL Quarterly, 43(3), 393–415.
Erwin, T. D. (1991). Assessing student learning and development: A practical guide for college faculty and administrators. San Francisco: Jossey-Bass Publishers.
Entwistle, A. C., & Entwistle, N. J. (1992). Experiences of understanding in revising for degree examinations. Learning & Instruction, 2, 1-22.
Gibbs, G. (1992). Improving the quality of student learning. Bristol: Technical & Educational Services.
Fiddler, M., Marienau, C., & Whitaker, U. (2006). Assessing learning: Standards, principles, and procedures (2nd ed.). Dubuque, IA: Kendall/Hunt.
Fulcher, G., & Davidson, F. (2007). Language testing and assessment. London: Routledge.
Gattullo, F. (2000). Formative assessment in ELT primary (elementary) classrooms: An Italian case study. Language Testing, 17(2), 278–288.
Gielen, S., Dochy, F., Onghena, P., Struyven, K., & Smeets, S. (2011). Goals of peer assessment and their associated quality concepts. Studies in Higher Education, 36(6), 719-735.
Goldfinch, J., & Raeside, R. (1990). Development of a peer assessment technique for obtaining individual marks on a group project. Assessment and Evaluation in Higher Education, 15, 210-231.
Gallagher, C. W., & Turley, E. D. (2012). Our better judgment: Teacher leadership for writing assessment. Urbana, IL: NCTE.
Hamayan, E. V. (1995). Approaches to Alternative Assessment. Annual Review of Applied Linguistics, 15, 212-226.
Huerta-Macias, A. (2002). Alternative assessment: Responses to commonly asked questions. In J. C. Richards & W. A. Renandya (Eds.), Methodology in language teaching: An anthology of current practice. Cambridge: Cambridge University Press.
Huerta-Macias, A. (1995). Alternative assessment: Responses to commonly asked questions. TESOL Journal, 5(1), 8-11.
Hanrahan, S. J., & Isaacs, G. (2001). Assessing self and peer assessment: The students’ views. Higher Education Research and Development, 20, 53-70.
Harlen, W. (2006). The role of teachers in the assessment of learning. Pamphlet produced by the Assessment Systems for the Future (ASF) project, Assessment Reform Group. London, UK.
Harlen, W., & Deakin Crick, R. (2003). Testing and motivation for learning. Assessment in Education: Principles, Policy & Practice, (2), 169-207.
Harris, M. & McCann, P. (1994). Assessment (Handbook for the English classroom). Oxford: Heinemann Publishers.
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89.
Khonbi, Z., & Sadeghi, K. (2012). The effect of assessment type (self vs. peer vs. teacher) on Iranian university EFL students’ course achievement. Language Testing in Asia, 2(4), 47-74.
Lewy, A. (1990). Formative and summative evaluation. In Walberg, H. & Haertel, G. (Eds.), The International Encyclopedia of Educational Evaluation, 26-28.
Lynch, B. K. (2001). The ethical potential of alternative language assessment. In C. Elder et al. (Eds.), Experimenting with uncertainty: Essays in honor of Alan Davies (Studies in Language Testing 11, pp. 228-239). Cambridge: Cambridge University Press.
McNamara, T. (2004). Language testing. In A. Davies & C. Elder (Eds.), The handbook of applied linguistics (pp. 763–783). Malden, MA: Blackwell.
Mueller, J. (2006). The authentic assessment toolbox. Retrieved January 3rd, 2016 from http://jonathan.mueller.faculty.noctrl.edu/toolbox/standardtypes.htm
Marhaeni, A. A. I. N., & Artini, L. P. (2015). Asesmen autentik dan pendidikan bermakna: Implementasi Kurikulum 2013. Jurnal Pendidikan Indonesia, 4(1), 499-51
Oldfield, K. A., & MacAlpine, J. M. K. (1995). Peer and self-assessment at the tertiary level: An experiential report. Assessment and Evaluation in Higher Education, 20, 125-132.
Poehner, M. E. (2008). Dynamic assessment: A Vygotskian approach to understanding and promoting L2 development. Berlin: Springer Publishing.
Papinczak, T., Young, L., Groves, M., & Haynes, M. (2007). An analysis of peer, self, and tutor assessment in problem-based learning tutorials. Medical Teacher, 29(5), 122-132.
Reynolds-Keefer, L. (2010). Rubric-referenced assessment in teacher preparation: An opportunity to learn by using. Practical Assessment, Research & Evaluation, 15(8), 1-9.
Rubin, J. (1975). What the “good language learner” can teach us. TESOL Quarterly, 9(1), 41-51.
Ross, S. (1998). Self-assessment in second language testing: A meta-analysis and analysis of experimental factors. Language Testing, 15(1), 1-19. 
Rimer, S. (2007). Harvard task force calls for new focus on teaching and not just research. The New York Times.
Rea-Dickins, P., & Gardner, S. (2000). Snares and silver bullets: Disentangling the construct of formative assessment. Language Testing, 17(2), 215-243.
Rogers, T. (1991). Educational assessment in Canada: Evolution or extinction? The Alberta Journal of Educational Research, 37(2), 179-192.
Rogers, R. K., & Threatt, D. (2000). The Coaching of Teaching. Vol. 29, 14-16.
Rea-Dickins, P. (2004). Understanding teachers as agents of assessment. Language Testing, 21(3), 249–258.
Rea-Dickins, P. (2001). Mirror, mirror on the wall: Identifying processes of classroom assessment. Language Testing, 18(4), 429-462.
Shohamy, E. (1995). Performance assessment in language testing. Annual Review of Applied Linguistics, 15, 188-211.
Stephens, D., & Story, J. (1999). Assessment as inquiry: Learning the hypothesis-test process. Urbana, IL: NCTE.
Scarino, A. (2013). Language assessment literacy as self-awareness: Understanding the role of interpretation in assessment and in teacher learning. Language Testing, 30(3), 309-327.
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4-14.
Skehan, P. (2001). Tasks and language performance. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks: Second language learning, teaching and testing. Harlow: Longman.
Serafini, F. (2010). Classroom reading assessments: More efficient ways to view and evaluate your learners. Portsmouth, NH: Heinemann.
Tal, T. (2006). Implementing multiple assessment modes in an interdisciplinary environmental education course. Environmental Education Research, 11(5), 575-601.
Thomas, G., Martin, D., & Pleasants, K. (2011). Using self-and peer-assessment to enhance students’ future-learning in higher education. Journal of University Teaching and Learning Practice, 8(1), 5.
Vygotsky, L. (1986/1934). Thought and language. Cambridge, MA: Massachusetts Institute of Technology.
Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European study. Language Assessment Quarterly, 11(4), 374-402.
Walvoord, B. E. (2010). Assessment clear and simple. San Francisco: John Wiley & Sons, Inc.
Wigglesworth, G. (2001). Influences on performance in task-based oral assessments. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks: Second language learning, teaching and testing (pp. 186–209). Harlow: Longman.
Xiao, Y., & Lucking, R. (2008). The impact of two types of peer assessment on students’ performance and satisfaction within a Wiki environment. The Internet and Higher Education, 11(3–4), 186–193.
Zakian, M., Moradan, M., & Naghibi, S. E. (2012). The relationship between self-, peer-, and teacher-assessments of EFL learners’ speaking. World J Arts, Languages, and Social Sciences, 1(1).
Zundert, M., Sluijsmans, D., & Merrienboer, J. (2010). Effective peer assessment processes: Research findings and future directions. Learning and Instruction, 20, 270-279.