
Tuesday, 27 February 2007


Comments

Ann Busby

Will, I am aghast at such responses! Do we all really know so little?

I do have a request: how about some feedback, like the opportunity to see the questions again with the answers, so we can try to learn something from it? Thanks, Ann

Will Thalheimer

Ann, I've added links to the quiz feedback in the body of the post. Thanks!

Jack Handley

I'm not aghast (surprised) but am appalled. I entered the field with Pipe and Mager (et al.) and have seen no progress in the practice or the competence of the practitioners.
Odiorne saw none, also. Has anyone?

In my experience, 70% in the trade are dullards, 20% are voluble trend fashionistas, and 10% are skilled, often self-taught.

Training, performance improvement, and task analysis were "rationalized" in WWII, and yet the trade still does not have a standards handbook on massed vs. distributed practice, for example.

Stephen Downes

Well, I got 2 out of 15 correct. That is substantially worse than the average, which is, as you point out, barely above what they would get from pure guesswork.

(Actually, the 32 percent is about exactly what you would expect. It's an old adage among trivia game players: 'when in doubt pick 3' (i.e., C, the middle response). And this quiz fits true to the pattern: A was the correct response 2 times, B 4 times, C 6 times, D 2 times, E none, and F once.)
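As a rough check on the tally above, the following sketch computes what a test-taker would score by picking the same letter on every question. It uses only the answer-key counts quoted in the comment; the variable names are illustrative, and it assumes every letter is available on every question, which the quiz may not guarantee.

```python
from fractions import Fraction

# Answer-key counts as reported in the comment above (15 questions in total).
key_counts = {"A": 2, "B": 4, "C": 6, "D": 2, "E": 0, "F": 1}
total = sum(key_counts.values())

# Score earned by someone who picks the same letter on every question.
# Assumption: each letter is a selectable option on every question.
fixed_letter_score = {
    letter: Fraction(count, total) for letter, count in key_counts.items()
}

for letter, score in fixed_letter_score.items():
    print(f"always {letter}: {key_counts[letter]}/{total} = {float(score):.0%}")
```

With these counts, always choosing C is right on 6 of the 15 questions, or 40 percent.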

All this is a round-about way of saying: have you considered the possibility that it's the quiz that's the problem, not the quiz-takers?

I mean - I went into the test with the expectation that I might not do well. I have a healthy doubt of my own abilities. But I am not a 2 out of 15 in my own field. That's an unreasonable result.

There is, in my view, a systematic flaw in this test. And it can be expressed generally as the following:

The test author believes (based on some research, which is never cited) that "Learning is better if F" where 'F' is some principle, such as "Performance objectives that most clearly specify the desired learner behavior will produce the best instructional design."

This principle is treated as linear. That is to say, the more the principle is exemplified in the answer (per the author's interpretation) the more learning will be better.

But these principles are not linear. There is a point of diminishing returns. There is a point at which slavish adherence to the principle produces more problems than good. Experienced designers understand this, and hence build some slack into the application of the principles.

Question 1 provides a good object lesson:

The feedback states: "Performance objectives that most clearly specify the desired learner behavior will produce the best instructional design."

Option B (which I selected) is: “As each web page is developed, and after the full website is developed, each web page should be tested in both Netscape Navigator and Internet Explorer.”

Option C (which is considered correct) is: Same as B, with the addition of the following: “One month after completing the training, learners should test each web page during its development at least twice 90 percent of the time, and test each web page once after the whole website is complete at least 98 percent of the time.”

Now the question is, is the performance objective "more clearly stated" in C than in B? According to the author (obviously) it is. But sometimes making things more precisely stated does not make them more clear. It does not even make them more precise.

Which is clearer:

a. Test the page after design

b. Test the page 98 percent of the time after design

In my view, (a) is clearer.

Moreover, (b) is no more precise than (a). Because what (a) *means* is "Test the page 100 percent of the time after design".

Therefore, it would be unreasonable to select (c) on the ground that it is clearer. The unthinking effort to make it more precise went over the top and resulted in a statement that is more an example of nonsense than clarity.

The entire test is constructed this way. I got a couple where it was pretty obvious what the examiner was looking for. But otherwise, I picked what I felt was the best answer, which in every case was the less extreme version of the over-the-top choice.

In question number 2, for example, the principle is: "When the learning and performance contexts are similar, more information will be retrieved from memory."

Well, this is generally true. But will somebody prepare better spending a week on the road, living in a hotel, unable to keep up with work at home in Boston or to be there to help the kids? Being on the road creates an impact. So even if the test is being conducted in San Francisco, there comes a point where the advantage of studying and testing in a similar environment is overwhelmed by the disadvantage of being on the road.

The test author created an extreme case - a test location in San Francisco instead of a test location in downtown Boston. Thus, complications that an experienced person would automatically take into account - the time lost in airports, the rigors of travel, etc. - are built into their thinking.

The only way to get through such questions is to be able to figure out what the author is looking for. In this case, I looked at the example and it was pretty clear that it would be based on 'similarity of environment' and not any real question about 'effective learning'. It was one of the two I got right.

But the author's intention is very deliberately disguised throughout the test. Or more accurately, the test addresses such a specific context that only people who work in that specific context have any real chance of divining the author's intent (and as it turns out, the context was so narrow it didn't even show up statistically).

This, I think, is one of the problems of testing generally, and not just this test in particular.

In a test like this, each question is designed to measure only one point of learning (more precisely: to measure responses only along one vector). Theoretically, you could have questions that measure more than one vector, but it results in confusing questions and too many possible responses.

If the test measures simple things, that's fine. The question of whether 2+2=4 is not going to be impacted by external considerations.

But if the test measures complex phenomena, then it is going to systematically misrepresent the student's understanding of the phenomena.

Specifically, a very simple one-dimensional understanding will fare as well as (and in this case, better than) a complex, nuanced understanding. People who understand a discipline as a set of one-dimensional principles will do the best - understanding simply becomes a case of picking which principle applies, then selecting the example that fits the best.

This test fails because it is too narrowly defined to let the simple understanders spot the principle being defined, and too dependent on single principles to give people who genuinely understand the phenomena any advantage.

The test author is right: don’t trust gurus.

Unfortunately, the test author didn't consider the possibility of recursion.

Will Thalheimer

Stephen,

Thanks for taking the time to write such an exhaustive analysis of my test.

Your critique is intriguing, but off the mark.

The test clearly is not perfect. No test is perfect. Yes, it grades questions in a rather one-dimensional way, but that's okay for the purposes it was designed for---to get a general idea of how much people know about the most fundamental principles of learning.

It is not appropriate to think of the scoring system employed in comparison to the straw man of typical tests we are all used to. In typical tests, it is considered a failure to get an answer wrong. I evaluate my test results in a probabilistic manner, under the assumption that those who know more about learning will do better on the test. For example, an expert in learning may get a 75% not a 40%. This takes care of the dimensionality problem you mentioned.

One of the skills of being a professional is to be able to notice what aspects of the environment are most important to pay attention to. Your criticism of the multidimensionality of the test questions is misplaced. A truly knowledgeable person must know the most critical dimensions to pay attention to. That is one of the hallmarks of expertise---to know what is important.

Again, and I believe I have reiterated this many times, the test is admittedly not perfect. The test could, for example, be improved by validating it with real learning experts other than myself.

Here's a more ideal design, implementable with more time and resources than I had available:

1. Develop a set of questions based on fundamental research-based learning principles.

2. Select a group of learning researchers to offer suggestions for improvement of each test item. Then improve the questions.

2A. If we had unlimited time and resources, we could create experiments for each question, where we actually created different learning interventions for each answer choice and randomly assigned real learners to the various interventions to see which answer choice actually produced the best learning results. I didn't feel this was necessary since I based my questions on the collective wisdom of dozens of the world's most knowledgeable learning researchers.

3. Have a second group of learning researchers take the test as test takers. Look at the result for each question. Keep questions that are reliably answered in the "correct" way---in other words keep the questions in which all, or almost all, of the experts agree. Discard questions where the experts disagree.

4. Provide the remaining questions as the assessment.
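Step 3 of the procedure above (keep only the questions on which the experts agree) could be sketched roughly as follows. The 90 percent threshold, the function names, and the sample responses are all hypothetical; the comment only says "all, or almost all" of the experts should agree.

```python
from collections import Counter

def agreement(responses):
    """Return the most common answer and the share of experts who chose it."""
    answer, count = Counter(responses).most_common(1)[0]
    return answer, count / len(responses)

def filter_items(expert_responses, answer_key, threshold=0.9):
    """Keep items where the experts' modal answer matches the key
    and at least `threshold` of the experts chose it."""
    kept = []
    for item, responses in expert_responses.items():
        answer, share = agreement(responses)
        if answer == answer_key[item] and share >= threshold:
            kept.append(item)
    return kept

# Hypothetical responses from ten experts on three items (not real data).
expert_responses = {
    "Q1": ["C"] * 10,              # unanimous
    "Q2": ["B"] * 9 + ["A"],       # near-unanimous
    "Q3": ["A"] * 5 + ["B"] * 5,   # experts split
}
answer_key = {"Q1": "C", "Q2": "B", "Q3": "A"}

kept = filter_items(expert_responses, answer_key)
print(kept)
```

In this made-up example the split item Q3 is discarded, while Q1 and Q2 survive as the assessment.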

While this procedure would certainly result in a better test, I am confident that the test as currently designed still can separate the wheat from the chaff.

As you can imagine, using the exhaustive approach I described above would be rather onerous, which is why I created the test in a more straightforward manner.

By the way, since you commented on the lack of research citations in my feedback---an intentional design decision to keep the feedback reasonably short and accessible---let me say that each of these questions is based on dozens or hundreds of studies from the world's preeminent refereed journals on learning, instruction, and memory. To get access to my research, interested readers may visit my catalog of research reports at www.work-learning.com/catalog/

In your critique of my Question 1, you completely failed to include the actual scenario-based question. The context in which decisions are made---as can be included in the question scenarios---is critically important and goes back to the primacy of noticing. An expert in instructional design would have considered the importance of creating an objective that can be measured. Only the "correct" choice creates such an objective.

Critiques are most valuable if they include suggestions for improvement. Your critique complained about the limitations of testing, but did not suggest improvements. How would you create a test that assessed people's knowledge of fundamental learning principles?


CR Geissler

Hi Will,
From your conclusions, either everyone who has taken your test is a dullard and/or is uneducated (including members of various professional organizations and, likely, some PhDs in education) or your test is flawed.

If Occam's Razor (the simplest explanation is the correct one) is applied here, then the problem is with your test. You challenge Stephen Downes to be more constructive with his feedback; please accept mine, as I have tried to do just that. I have found many problems; here is a sample:

- Confusing wording - Q1, Answer "C": I'm a big believer in the ABCD (audience, behaviour, condition, degree) approach to learning objectives, but when the written objective is this confusing, the simpler objective is better. The problem is that in the stem you talk about websites, but in the outcome you talk about webpages. Question stems and responses should be not just related but should use the same terms. When reading the stem, I couldn’t understand how someone would test a website 90% of the time; you either test it or you don’t. Of course, your objective was about testing webpages, but even then the wording is odd and takes far too much cognitive processing.

- Content problem - Q3, Answer B - 10 min review, 20 min new material. Your feedback states “Choice A is best because it provides the most repetition.” So if Choice D had been to provide 25 minutes of review and 5 minutes of new material, would it have been correct? Research suggests that an adult’s attention span is 15 to 20 minutes and that the first 10 minutes are most productive. If this is the case, then all of the productive time in your correct scenario has been spent on review, with diminishing attention going to the new materials. Answer “A” balances 5 minutes of good review and then the remaining time for new material, which seems more prudent and more grounded in theory.

- Too simplistic, not enough context - Q8: even in your feedback you acknowledge that you don’t know: “It seems reasonable that, after a couple of hours of instruction, learners who receive interesting yet irrelevant information may be able to re-energize themselves to pay attention to further learning materials.” Yes, it does seem reasonable, and the telling of on-topic and off-topic stories to vary pace, engage, refresh, motivate, entertain, etc. is the same in the classroom, boardroom, lunchroom and locker room. Done right, it can be a good thing, and if this client is a professional comedian, then he should be able to do it right. You should remove this question from your statistics.

- Poorly written distracters – Q9: There are plenty of evaluation experts who will tell you that the six distracters you provided are fraught with problems that render the question almost useless in its discriminating power. The key issue is not the use of statements like “Choices B and C are correct,” but their use mixed with distracters like “The prequestions will have little if any effect.” The fair way to write this question would be to enumerate all of the choices in the stem, then have all of the distracters match, so the distracters would be along the lines of “Statement 1 alone is correct” / “Statements 1 and 3 are correct”. BTW, your distracters in Q1 and Q5 (e.g., ‘same as B but add’) suffer from the same problem but for a different reason – I’m sure John Sweller and Ruth Colvin Clark talk about it in their books …

- ??? – Q11 your feedback says “They knew 50 of the answers and guessed correctly on 10 without really knowing the answer. How do we know this?” well, you don’t. I don’t know you or your work, but for someone who claims that the feedback they have written is based on research and who offers to sell that research – this is shocking. Any evaluation expert and/or statistician will say that the premise you base your feedback on is false. If what you wrote was true, then anyone who scores 100% on a test, really only knows 90% and guessed at the final 10%. You cannot twist probability in statistics like this. If probability in testing were so predictable then in any 4-distracter multiple choice test the minimum anyone can get is 25% because to do worse would be statistically impossible.

- Illogical question – Q14 asks “The instructional guru recommends that they develop two tests, one with questions and one without, and see which one works best.” This is obviously a zen question – what is a test with no questions? I guess it is a blank page. So which produces better learning outcomes – blank pages or pages with questions written on them?

- Missing Details – Q15 compares a classroom video seminar with an e-learning video-based course. Without further details about the elearning course, one could assume that a classroom seminar would include other participants and some kind of facilitator – which would mean there would also be opportunities for conversation, collaboration, questions and feedback. The inclusion of words like ‘classroom’ and ‘seminar’ connote all of the interaction that usually takes place in those environments. Thus without a better explanation of the context of instruction in both presentations the question leaves too much room for argument – so the answers cannot be argued definitively.
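The Q11 point in the list above, about how many answers were "really" known versus guessed, can be illustrated with a small binomial calculation. This is only a sketch under assumed numbers: the quiz feedback quoted in the comment gives "50 known, 10 guessed correctly", and the figures of 40 guessed items and 4 options per item are hypothetical values chosen so that the expected number of lucky guesses comes out to 10.

```python
from math import comb

def lucky_guess_distribution(n_guessed, n_options):
    """Binomial probabilities of getting k = 0..n_guessed blind guesses right."""
    p = 1 / n_options
    return [
        comb(n_guessed, k) * p**k * (1 - p) ** (n_guessed - k)
        for k in range(n_guessed + 1)
    ]

# Hypothetical setup: 40 items answered by blind guessing, 4 options each.
n_guessed = 40
n_options = 4
dist = lucky_guess_distribution(n_guessed, n_options)

# The average number of lucky guesses is n_guessed / n_options.
expected = sum(k * pk for k, pk in enumerate(dist))

# Any particular outcome, including the average itself, is far from certain.
p_exactly_10 = dist[10]
p_fewer_than_10 = sum(dist[:10])

print(f"expected lucky guesses: {expected:.1f}")
print(f"P(exactly 10 lucky guesses): {p_exactly_10:.3f}")
print(f"P(fewer than 10 lucky guesses): {p_fewer_than_10:.3f}")
```

Under these assumptions, 10 correct guesses is the long-run average across many test-takers, not something that can be read off one individual's score, and scores below the chance rate remain possible.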

I enjoyed this little test and you did present some arguments in the feedback that I found reassuring to my own practice and some that challenged my understanding. However I must disagree with you that it has any kind of predictive or explanatory power for the state of Instructional Technology. Occam’s Razor …

Will Thalheimer

CR,

Thanks for your suggestions for improvements.

I think Stephen's point, though, was that the whole way of asking the questions was flawed, because they offered several interpretations, depending upon what the reader focused on.

As I wrote previously, this design actually enables expertise to be demonstrated because experts can tell what to focus on when given the real-world messiness of multidimensionality.

Mark Frank

Will

Just discovered your blog and the test. Great resources - thanks.

However, I think CR has it spot on. The test is an excellent tool for self-assessment and learning but I wouldn't take it seriously as a way of evaluating competence. I did quite well :-)

Jessa Marie Agnes

Hi, can I get the test results for free? I am Jessa Marie Agnes. I have no PayPal account and I also have no credit cards. I took the personality test because it is our assignment, but I cannot get the results because I have no money to pay. I am still a college student here in the Philippines and I have an allowance of just 500 PHP a week. Please help me; I need the results. If you will send them to me for free, just send them to my email address, gabjessa@hotmail.com. Thank you.

Will Thalheimer

The quiz feedback is free and is listed at the bottom of the original post. Here it is again:

http://www.work-learning.com/quiz_questions_feedback.htm
