Why Level 2 Assessments Given Immediately After Learning Are Generally Dangerous and Misleading
Note from Will Thalheimer: This is an updated version of a newsletter blurb written a couple of years ago, where I made too strong a point about the dangers of Level 2 Assessments. Specifically, I claimed that Level 2 Assessments should never be used immediately after learning, which may have been pushing the point too far. Upon rethinking the issue a bit, I've concluded that Level 2 Assessments immediately after learning are still dangerous, but there may be some benefits to using them if the overall assessment process is designed correctly. You can be the judge by reading below.
Introduction to Levels 1 and 2 Assessments
Before we get to Level 2 evaluations, let's talk about Level 1 of Kirkpatrick's 4-level model of assessment. Level 1 is represented by the "smile sheets" that we hand out after training or include at the end of an e-learning course. They typically ask learners to rate the course and to judge how likely they are to use the information they learned. These evaluations are valuable to get learner reactions and opinions, but they provide a very poor gauge of learning and performance. The fact that we rely on these almost exclusively to assess the value of our instruction is unconscionable.
Level 1 Assessments are not always good predictors of learning. Learners may give a course a high score but not remember what they learned. Learners are also famously optimistic about what they will remember. Just because they tell us they'll remember information and use it in their work doesn't mean they will. Learners also fill in smile sheets based on whether they like the course or the instructor. Courses that challenge learners may be rated poorly, even though a challenge might be exactly what is needed to push a significant behavior change.
Level 2 Assessments are intended to measure learning and retention. We want to know whether the information learned is retrievable from memory. Ideally, we want to know whether the information is retrieved and used on the job. If we measure actual on-the-job performance, we're really utilizing a Level 3 Assessment. In comparison, Level 2 Assessments measure the retrievability of information, not its actual use. This is where the problems start.
What is Meant Here by the Word Assessment?
First, let me clarify that I am using the word "assessment" to mean a test given for the purposes of evaluating a learning intervention. Assessments can also be used to bolster learning, as when they are used to promote retrieval practice or provide feedback to the learners. It is the first use of assessments that I am concerned with in this article. Specifically, I will argue that Level 2 Assessments given for the purpose of evaluating the success or failure of a learning intervention are dangerous if given immediately after the learning. However, this does not mean that assessments used for the purpose of aiding retrieval or providing corrective feedback are not valuable at the end of learning. In fact, they are excellent for that purpose.
The analysis in this article also assumes that Level 2 Assessments are well designed. Specifically, it assumes that the assessments prompt learners to retrieve from memory the same information that they will have to retrieve in their on-the-job situations. It also assumes that the cues that trigger retrieval will be similar to those that will trigger their thinking and performance on the job. It is true that most current Level 2 Assessments don't meet these criteria, but they should.
The Problems With Immediate Assessments
When we learn a concept, we think about it. When we think about something, it becomes highly retrievable from memory, at least for a short time. Thus, during learning and immediately afterward, our newly learned information is highly retrievable. If we test learners then, they are likely to remember all kinds of stuff they'll forget in a day.
This problem is compounded because learning is contextualized. If we learn in Room A, we'll be better able to retrieve the information we learned if we have to retrieve it in Room A as opposed to Room B (by up to 55% or so). Thus, if we test learners in the training room or while they're still at their desks using an e-learning program, we're priming them for a level of success they won't attain when they're out of that learning situation and back on the job.
Giving someone a test immediately after they learn something is cheating. It provides an inflated measure of their learning. More importantly, it tells us very little---if anything---about how well learners will be able to retrieve information when they get back to their jobs.
On-the-job retrieval depends on both the amount of learning and the amount of forgetting.
Retrieval = Learning - Forgetting
Our instructional designs need to maximize learning and minimize forgetting. If we measure learners immediately after they learn, we've accounted for the learning part of the retrieval equation, but we've ignored forgetting altogether. Not only are immediately-given Level 2 Assessments poor tools to use in measuring an individual's performance, but they also give us poor feedback about whether our instructional designs are any good. In short, they're double trouble. First they don't measure what we want them to measure, and then they don't hold us accountable for our work.
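As a rough illustration of the retrieval equation above, here is a minimal Python sketch. The function name, the 0-100 scale, and the two sets of course numbers are all hypothetical and invented for this example; they are not data from any study. The sketch only shows why two courses can tie on an immediate test and still differ in what learners can retrieve later.

```python
def on_the_job_retrieval(learning: float, forgetting: float) -> float:
    """Apply the article's equation: Retrieval = Learning - Forgetting.

    Both values are on the same hypothetical 0-100 scale,
    for example percent of test items answered correctly.
    """
    return learning - forgetting


# Hypothetical numbers, invented for illustration only.
course_a = {"learning": 90, "forgetting": 60}
course_b = {"learning": 90, "forgetting": 20}

# An immediate test sees only the "learning" term, so the two courses tie.
immediate_a = course_a["learning"]
immediate_b = course_b["learning"]

# A delayed measure reflects both terms and separates the two designs.
delayed_a = on_the_job_retrieval(**course_a)
delayed_b = on_the_job_retrieval(**course_b)

print("Immediate scores:", immediate_a, immediate_b)
print("Delayed scores:", delayed_a, delayed_b)
```

With these made-up numbers, the immediate assessment cannot tell the two designs apart, while the delayed measure can.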
But What Happens On The Job?
All this is true in most cases, but there are complications when we consider what happens after the learning event. The analysis above is accurate in those situations when learners forget much of what they learn as they move from learning events back to their jobs. Look at the following graph and imagine doing a Level 2 Assessment at the end of the learning---before the forgetting begins. It would show strong results even though later performance would be poor.
But what happens in those all-too-rare situations when learners take the learning and begin to apply what they've learned as soon as they get back to the workplace? When they do this, they're much less likely to forget---and they may even take their competence and learning to a higher level than they achieved in the actual learning event. Check out the graph below as an example.
In this case, if we did a Level 2 Assessment at the end of the initial learning, before the workplace learning begins, the assessment again wouldn't be accurate. This time it might not adequately assess the ability of the learning intervention to facilitate the workplace learning.
Real learning interventions often generate both types of results. Learners utilize some of what they've learned back on the job---facilitating their memory; but the rest of what they learned is not used and so is forgotten. The following graph depicts this dichotomous effect.
So Why Use Level 2 Assessments At All?
It should be clear that Level 2 Assessments delivered immediately after the learning are virtually impossible to interpret. However, it may be useful to use them in conjunction with a later Level 2 Assessment to determine what is happening to the learning after the learners get back to the job.
If the learners' level of retrieval improves, then we can be fairly certain that the learners have made positive use of the learning event. Of course, a better way to draw this conclusion is to use a comparison-group design, but such an investment is normally not feasible.
If the learners' level of retrieval remains steady, then we can be fairly certain that the course did some good in preventing forgetting. Again, a comparison-group design will be more definitive.
If the learners' level of retrieval deteriorates, then we can be fairly certain that the learning event did not prevent forgetting and/or was probably not targeted at real work skills. Deteriorating retrieval is the one result we want to make sure we don't produce with our learning designs. Because forgetting is central to human learning, if we're preventing forgetting, we're doing something important. Finally, because forgetting is normal, a comparison-group design is not as critical in ruling out alternative explanations. In other words, if we find forgetting after the learner returns to the job, we can conclude that the learning event wasn't good enough.
To summarize this section, it appears that though Level 2 Assessments given immediately after learning have dubious merit because they're impossible to interpret, there may be value in using immediate Level 2 Assessments in combination with delayed Level 2 Assessments.
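The three cases above (retrieval improves, stays steady, or deteriorates) can be sketched as a simple comparison of an immediate and a delayed score. This is only an illustrative sketch: the function name, the 0-100 scale, and especially the `tolerance` threshold for what counts as "steady" are assumptions made for the example, not values given in this article. As noted above, a comparison-group design would support firmer conclusions.

```python
def interpret_retrieval_change(immediate: float, delayed: float,
                               tolerance: float = 5.0) -> str:
    """Classify the change between an immediate and a delayed Level 2 score.

    `tolerance` is a hypothetical band (in score points) within which
    the two scores are treated as roughly equal.
    """
    change = delayed - immediate
    if change > tolerance:
        return "improved: learners likely made positive use of the learning event"
    if change < -tolerance:
        return "deteriorated: the learning event did not prevent forgetting"
    return "steady: the course likely did some good in preventing forgetting"


# Hypothetical score pairs (immediate, delayed), invented for illustration.
for immediate, delayed in [(70, 85), (80, 78), (90, 55)]:
    print(immediate, delayed, "->", interpret_retrieval_change(immediate, delayed))
```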
How to Do Level 2 Assessments
Level 2 Assessments should:
1. Be administered at least a week or two after the original learning ends, or be administered twice---immediately after learning and a week or two later.
2. Be composed of authentic questions that require learners to retrieve information from memory in a way that is similar to how they'll have to retrieve it on the job. Simulation-like questions that provide realistic decisions set in real-world contexts are ideal.
3. Cover a significant portion of the most important performance objectives.
Hi I was really interested in your recent article.
Why not check out a UK based Further Education website, who are particularly focused on Work Based Learning: www.fenews.co.uk
Or if any of your friends or colleagues are interested in FE or WBL career in the UK check out www.fecareers.co.uk
Keep up the posts.
Gavin
Posted by: FE News | Monday, 05 June 2006 at 11:51 AM
Will,
Good meeting you in Boston recently.
I believe I mentioned at that time some results obtained by a client measuring post-test and retention-test gains - which for 70% of the trainees followed your second graph, or at least a somewhat flattened version of it. (See http://www.automatedlearning.com/customers/analysis.cfm)
Realistically, the logistics in most of our client sites would not facilitate a delayed Level 2.
But if results on a test taken immediately following instruction correlate highly with subsequent retention tests (including a level 2 taken two weeks after the training event) then it would seem the result is neither dangerous nor misleading. If the correlation exists, then we should have some confidence in the immediate test results. So I guess I question if the problem with an immediate assessment is as serious as you imply, provided that one has confidence that he is dealing appropriately with transferable skills.
If that correlation does not exist, then there seem to be at least three possible explanations.
(1) The apparent knowledge is deliberately not reinforced in the workplace. (We might not, for example, generate emergencies and accidents in order to reinforce emergency response training. Here we assume the forgetting will occur, and schedule refreshers or recertification on a regular basis.)
(2) The testing is not reflective of true workplace skills in the first place.
(3) The instruction was very "forgettable", focussing on short term memory and possibly lacking appropriate structure. Much of the industrial training I see still relies on a "just tell them what to do" approach rather than focussing on real depth of understanding.
By delaying the Level 2 test for a couple of weeks, one is able to identify the retention and possible transfer effect. This is good, because you can identify the ineffective and non-transfer training, and your results should be reflected in job performance.
But this seems to me to be compensation for less than ideal teaching or testing, and can significantly impact on training logistics.
Best regards,
Bob.
Posted by: Dr. Bob Abell | Tuesday, 06 June 2006 at 01:34 PM
Thanks Bob. Good comments.
I think we're basically in agreement. I'm interested in your comment about the correlation between an immediate test and a delayed test. I'll have to think more deeply about this, but it occurs to me that maybe such a correlation is not what we're aiming for, per se. A positive correlation would mean that everyone who did well on the immediate test would also do well on the delayed test, and everyone who did poorly on the immediate test would also do poorly on the delayed test. While we may want the good performers to remain good, we don't really want the bad performers to remain bad.
Although the comparison might be apples to oranges, Level 2 and Level 3 metrics typically show very low correlations with each other. For example, a meta-analysis of a number of published studies found an r of only .12 between Level 2 and 3 (where .29 or below is considered a weak relationship). The study citation is: Alliger, Tannenbaum, Bennett, Traver, & Shotland (1997). A meta-analysis of the relations among training criteria. Personnel Psychology, 50, 341-357.
Bottom Line: It's complicated.
Posted by: Will Thalheimer | Tuesday, 06 June 2006 at 02:17 PM