Random Thoughts: Playing Fair in Education: Standardized Tests

Today is Tuesday, March 12, 2013.

Much of my frustration as an educator was due to standardized testing. For eighteen and a half years I taught English as a Second Language (now called ELL for English Language Learners or ELD for English Language Development). Our students came from over one hundred different language backgrounds, but the majority of the ELL kids were either Hmong, Somali, or Latino (mostly Mexican).

In the course of each school year, my ELL kids were tested even more than the "regular" kids who spoke only English. The testing was the district's response to requirements set by the No Child Left Behind Act, or NCLB, which is the latest re-authorization of a bill originally called the Elementary and Secondary Education Act, or ESEA. This is a federal statute that originally mandated remedial reading and math instruction for qualified "general education" students and special education classes for those students who qualified and had a document called an Individualized Education Program (IEP) written for them. It also mandated English language instruction for students who are not native English speakers and, in general, equal access to education. The act also provides funding for school libraries and professional in-service training for teachers. NCLB also gives the U.S. military access to students in 11th and 12th grades on each school campus. The military can request students' names, addresses and phone numbers. Believe it or not, the act also explicitly forbids the establishment of a national curriculum.

But I digress...

Here's how the year went the last year that I taught. In late August and September, I began by testing the kids who qualified for ELL services by virtue of the fact that another language besides English was spoken in their homes. (This was verified by a Home Language Questionnaire that every parent in Minnesota is required to fill out.) The PreLAS test was mostly oral, but I did test to see if the kids recognized numbers and letters. (LAS stands for Language Assessment Scales and the Pre indicated that it was for preschool and kindergarten only.) Kids were asked to name objects, respond to common questions such as "What's your name?" and "How old are you?" They were asked to repeat a few sentences, a harder task than you think, especially in a language that is not your native tongue. Then I asked them to do a series of actions, such as point to the middle of a piece of paper, put one hand on top of the other, etc. It doesn't sound like much of a test, but it did help us to establish who needed ELL services and who didn't. The problem was that the test took at least 20 minutes, sometimes longer, and I had to test each child individually. Since my normal schedule only allowed me to devote about 30 minutes a day to the kindergarten kids, I just tested kids all day until I was done before I started classes for the older kids. But of course, I couldn't start right away at the beginning of the school year, because the kindergarten students always started the second week of school, and I couldn't test the kids right away on the first or second day because the teachers were trying to get a classroom routine established. (Just putting on coats to go home takes 30 minutes at the beginning of the year. By the end of the year, it takes only ten.) I understood that the last year I taught was the last year that this particular test would be given, and it has been replaced by something else. 
The PreLAS was given at the beginning of the year and again at the end to measure progress. Those who did not progress normally were referred to special classrooms in the district called Language Academy, where the focus was language development in addition to core curriculum.

Also at the beginning of the year the classroom teachers were required to give a series of tests to their classes as a whole as well as to individual students in order to rank them for reading instruction. This testing was mandated by the reading series that we adopted, which the teachers universally hated. It should be noted that most of the people responsible for this textbook adoption were either fired or rotated to other jobs a couple of years later. The textbook remains, though, because the district spent millions for it. The individual testing takes so long that teachers are required to get substitutes to teach their classes for at least two days each time the test is given during the year. That's a lot of subs!

In early to mid October, the first of the standardized tests for the older kids was given, the MAP test. I was interested to read that some teachers in California were refusing to participate in this test, and I hope their resistance will grow louder. (The Measures of Academic Progress (MAP) test is done on the computer, and as most schools only have one computer lab, only one classroom can take it at a time. Most school districts only buy the grade 3-12 package, but there are tests developed for kindergarten through second grade, as well.) Note that I said the districts buy these tests, and they cost the taxpayers millions of dollars. According to one source, since NCLB was enacted (in 2002), the burden of mandated testing has been placed on the states. In Texas, for example, the cost to taxpayers for these tests was $88 million per year for the years 2009 through 2012. Nationally, the Pew Center on the States reported that annual state spending on standardized tests rose from $423 million in 2002 to $1.1 billion in 2008, a 160% increase, compared to a 19.22% increase in inflation during the same period.
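For anyone who wants to check the arithmetic on those two Pew figures, here is a minimal sketch in Python. The function name percent_increase and the variable names are my own, not from any source; the only inputs are the two dollar amounts quoted above.

```python
def percent_increase(old, new):
    """Return the percentage change from old to new."""
    return (new - old) / old * 100

# Annual state spending on standardized tests, as quoted above.
spending_2002 = 423_000_000      # $423 million
spending_2008 = 1_100_000_000    # $1.1 billion

increase = percent_increase(spending_2002, spending_2008)
print(f"Spending rose by about {increase:.0f}%")
```

Going from $423 million to $1.1 billion means spending more than doubled, and then some.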

Oh, and guess who trains teachers to give these tests. Right, a company sales rep. Tests have been written for Reading, Math and Science; the St. Paul Public Schools bought the reading and math package, thank God.

The test is not timed, so kids can take as long as they want. Teachers are asked not to stop kids unless they are making no more progress. The test is computer-adaptive, so it starts out asking easy questions, then gradually moves on to harder ones, depending on how many answers the student gets right. When the computer establishes the student's level, it asks a few more specific questions to establish which specific areas of content (such as multiplication, fractions, etc. for math) the student needs help with. Some kids have more questions to answer than others, but the ones who have more questions generally finish earlier anyway, because they are more able. The computer can tell if the student is guessing randomly, and will end the test early in that case, and the student's score will not be recorded, making it necessary for the student to be tested again. Since students could take up to two hours to take the test, we had to schedule each classroom for at least that long, which meant that only two classrooms could be scheduled in the morning and one in the afternoon. Each classroom had to come in twice – once for the Reading test and once for Math. With eleven classrooms, that meant that 22 test slots of two hours were needed. Some days we couldn't test because other activities were taking place. We also had to work around classroom field trips. The upshot was that it took three weeks to finish all the MAP testing, and that meant that for three weeks the computer lab was unavailable to classrooms for other use.
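The scheduling math above can be sketched in a few lines of Python. The variable names are my own, and the three slots per day (two in the morning, one in the afternoon) come straight from the description above. This is a back-of-the-envelope check, not anything the district actually used.

```python
import math

classrooms = 11        # classrooms that had to take the MAP test
subjects = 2           # Reading and Math, one two-hour slot each
slots_per_day = 3      # two morning slots plus one afternoon slot

slots_needed = classrooms * subjects
min_days = math.ceil(slots_needed / slots_per_day)

print(f"Test slots needed: {slots_needed}")
print(f"Minimum lab days if nothing else gets in the way: {min_days}")
```

Eight clean testing days is the best case. Add in the days lost to other activities and field trips, and you can see how the testing window stretched to three weeks.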

The kicker is that the MAP test was given not once but three times a year, in October, January and March. Keep in mind, also, that each computer in the lab had to be set up, in advance, for each individual student, so there had to be a seating chart. This meant that there had to be someone to set up the computers, monitor the tests, and sit with the stragglers who were still finishing after the teacher took the rest of the class back to their classroom. So in addition to buying the test, each school building had to hire its own testing supervisors or pull existing staff off their normal jobs to do this testing. I was the one pulled off my job to do this, and the ELL department was not happy about it, but they couldn't complain. I didn't have to be in the room for the entire testing time, but I did have to monitor some groups, and I had to upload our results each day to the district server.

Right after the MAP testing, the first report cards were due, so you can imagine how much testing there was going on in the individual classrooms (chapter tests in math, social studies, etc.). And then there were days off from school for the kids while teachers conducted parent conferences.

November and December were blessedly test-free, but of course there were the usual interruptions to instruction for major holidays. In January, the MAP test loomed once again, and it was time for the classroom teachers to re-test the students in reading to determine their progress relative to the reading curriculum materials.

In late February and early March the ESL kids in grades 3 through 12 were required to take the TEAE test (Test of Emerging Academic English). I could combine grades 3-4 and grades 5-6, which helped, but I had to schedule two different testing sessions for each group, to accommodate a reading and vocabulary test in one session and a writing test in a separate session. These tests were not to be given on Mondays or Fridays, so I scheduled the reading test for one group in the morning and the other in the afternoon, so that reading took one whole day and writing took another whole day. Naturally, my regular ELL classes were cancelled for those two days. The more time-consuming part of this test was an individually-administered oral assessment that I did for each and every one of the 90+ students on my caseload. Naturally I had to cancel regular ELL classes to do this. After each assessment, I took that student's answer sheet for 3rd through 6th graders and bubbled in their oral score. For kindergarten through second graders, I bubbled in a special oral-only answer sheet for each student. There were special instructions on how to bundle the test papers and account for all testing materials to be sent back for scoring or destroyed. That alone took up two or three evenings after school. Preparation of the test materials before the test was administered also took up a couple of evenings. No extra compensation was given.

In March, we had the last three-week round of MAP testing. As if that wasn't enough, the last year I taught, our school was chosen to take the NAEP test, and guess who was asked to coordinate that. The National Assessment of Educational Progress (NAEP) is the only test that is administered nationally, using the same materials and procedures, so it is really the only tool that can measure progress in various schools across the nation. The test is more or less the same from year to year, which enables us to compare scores from year to year. The test is run by the Commissioner of Education Statistics, who works in the U.S. Department of Education. Results are reported in "The Nation's Report Card," which now has its own web site. The test is given to students in grades 4, 8 and 12. Schools are chosen each year to provide a representative sample of students in public schools, private schools, Bureau of Indian Education schools, and Department of Defense schools. Private schools include Catholic, Conservative Christian, Lutheran, and other private schools. The state results are based on public school students only. (I don't understand why that is so - private schools should be included in the results, in my opinion.) A team of women came to administer the test, so all I really had to do was fill out a mountain of paperwork, including data for each child on their ability to speak English, their general rank (high, medium, low) in reading and math, and whether they were a part of the Free and Reduced Lunch program. Believe me, anything to do with the government always involves more paperwork than you can shake a stick at! The classroom teachers were asked to cover certain information normally displayed on the walls of classrooms, such as multiplication tables, word walls, and spelling lists. Then they were told to leave the room and not come back in until the test was over. Some of our kids got the reading test and others got the math test. 
The testing team then took all the test materials and filled out a bunch more paperwork in our conference room for at least a couple of hours. When they left, they gave me a folder full of paperwork that I was to save until the last day of school, then shred. I was given no extra compensation for this, but awarded some "clock hours" that I could use toward teacher re-certification – which I never used because I retired the following year. We were also given an "award" that could be framed and hung in somebody's office, which we promptly put in the "round file."

Also in March we had our second reporting period, followed by spring parent conferences. Naturally they wanted all the walls covered with student art work, so other lessons had to be given short shrift so that kids could do art projects for the walls, to please the parents.

Right after the NAEP test, it was time to think of MCA testing. The Minnesota Comprehensive Assessment (MCA) was given once a year in April in grades 3-12. Scores for third, fifth and eighth-graders were reported to the state and used as the basis for ranking schools and teachers in Minnesota according to student performance. Testing was done in three areas: reading, math, and science. The reading and math tests were pencil-and-paper tests, but the science test (given only to the fifth-graders, thankfully) was done on computers. The problem with this test is that the special ed kids have to be given separate versions of the test, and have to be tested in small groups or even individually (according to each child's IEP) to reduce stress, so teacher aides were required to assist in monitoring the tests. ELL kids who had been in the country less than three years were exempt, which meant that they had to be babysat while the others were doing testing. There also had to be someone available to escort kids to the bathroom if necessary, usually an aide who was pulled off his or her regular job during testing. The tests were given on specific days for the entire district, only in the morning if possible, and had to be done in the middle of the week, so Tuesdays and Wednesdays were testing days. Thursday was a make-up test day. Each subject took two days to test, so the reading test was given one week, the math test the next week, and science the week after that. We were given a few days after the regular testing window to test absentees, a 100% test participation rate being the goal. I was given some extra compensation for administering the MCA test school-wide, including preparing the test booklets for each and every student, making sure all classroom teachers were "trained" in this year's rules (there's a new rule every year, it seems – no joke!), and getting all the teachers to sign a "confidentiality agreement" saying they will not divulge the test contents to anyone.
One teacher from some school emailed the State Department of Education about a certain question, quoting the question exactly, and the school was dinged for this no-no by not having its scores counted in the state tally, and guaranteeing that someone from the State Department of Education would come nosing around that school the following year to see that they followed protocol. (I know this sounds extreme, and it is, but I am absolutely not joking, here. This really happened.) Naturally part of my job was also to send all the materials back to the company that created the test, so the tests could be scored.

By this time it was mid-May, and time for the ELL kindergartners to have their final PreLAS test, which once again took several days to complete, because I tried to use only "ESL time" for each classroom, to minimize disruption. After this, it was final report card time, and of course the individual classrooms had final tests or projects in various subjects, plus final reading testing for the reading curriculum. This reading curriculum testing involved seven different tests, several of which had to be administered individually, as I mentioned before.

At the end of the year, classrooms had to be packed up, and we were notified that there would be a major renovation of the school building, making it necessary that books normally kept on shelves be completely packed away in boxes so they could be moved. Several teachers were also to be moved to different classrooms. (The school was being wired for air conditioning, which irked me no end, it being my last year of teaching. Why did they wait until I retired to get air conditioning in the building?!) Believe me, when this was all done, the teachers couldn't get out of the building fast enough. As usual, I was the last teacher to turn in my key. I had always dreamed of walking out of the building in full sunshine on my last day of teaching, but, of course, the weatherman didn't care about my dreams; nor did the clouds. It rained cats and dogs my last day, and it was dusk before I finally left. (The principal came in to get my key, and I just left the door open for the custodian to vacuum and lock up.)

Anyway, that was my experience with testing for one year, and I have to tell you that I have had it up to my eyeballs with standardized testing. And this was only in elementary school. They also have grad standard tests in high school that are required for graduation. I shudder to think how little teaching went on in the upper grades.

Oh, and the results – the PreLAS and MAP test scores were given right away, but they were mostly important for the classroom teachers. We did not get results for the TEAE or MCA tests until the following fall. I called in to ask whether my ELL students "made AYP" (adequate yearly progress) and was told that ELLs were the only group of kids who did. So I guess I managed to do my job, even though I spent way, way too much time testing and working with kids who weren't even on my caseload.

If the above isn't enough to tell you why I hate standardized tests, I have more reasons.

First of all, and most importantly, student achievement has not improved. "After NCLB passed in 2002, the U.S. slipped from 18th in the world in math on the Programme for International Student Assessment (PISA) to 31st place in 2009, with a similar drop in science and no change in reading," according to the Committee on Test-Based Accountability in Public Education at the National Research Council. But of course teachers don't need a fancy report to tell them that.

Secondly, testing is unfair and discriminatory to ELL students, not only because they don't all speak English that well, but also because the tests are skewed toward people with experience in American cultures. Here's an example of what I mean by this. In one of the early grad standard tests for high school students (I was a student teacher, then), the ELL kids came out of the testing room and asked us what "Twinkies" were. Evidently, this product was mentioned in one of the reading passages, but our ELL kids didn't know what it was, as their main contact with "American culture" was limited to their school attendance. At home they reverted to their culture of origin. Complaints were made, and the Twinkies passage was deleted from the test. (By the way, it works both ways: very few Americans know what "bubble tea" is, but the Hmong kids have it regularly as a treat instead of things like Twinkies. The Mexican kids have their treats, and so do the Somalis. But once again, I digress.)

As I mentioned above, ELL kids have to take a lot of tests, even though it takes an average of five to seven years to master the language after one enters the country (or after a student who comes from a home where English is not spoken starts school). The upshot is that ELL students are tested before they begin to approach their English-only peers in mastery of the language. (I've had kids start using the bubble answer sheets upside-down before I was able to catch the problem. Also, many of the kids don't understand all the directions, including the one that tells students not to make any stray marks on their answer sheets – so, of course, I was obligated to spend time erasing all the little doodles they drew in the margins, when they didn't know what else to do.)

The tests are not as "objective" as many people imagine, due to the fact that many of the questions are poorly designed, or contain culturally specific information. Some questions are "weighted" and count slightly more than others, and that alone can cause gaps to appear between white students and those of color.

Many students, especially at the high school level, do not take the tests seriously, because their scores do not count toward graduation or university admission. Every year, it seems there is some group of kids somewhere who make a pact to give random answers on the tests. If they don't like the math teacher, they may retaliate by intentionally doing poorly on the math test, knowing that their scores will count against the teacher and the school, rather than against the students.

In many states, test results are used to reward and punish teachers. This alone is unfair, but what about teachers whose subjects are not tested? The physical education teacher, the music teacher, the social studies teacher (No, social studies is not tested, at least not yet.), the art teacher, and the computer teacher are exempt from pressure to coax their students to perform well on standardized tests. Because scores can be punitive, teachers are now told explicitly to "teach to the test," meaning that more creative types of lessons are now held to be an unnecessary waste of time.

Although the NAEP test does allow comparisons nationally, it is not given in every school to every grade level every year. The state tests (such as Minnesota's MCA) vary from state to state, guaranteeing that no meaningful comparisons can be made.

These days, standardized tests always include open-ended questions, especially in math, where the student has to write the answer instead of choosing which of several answer choices is correct. This means these questions must be individually graded by hand. Standardized tests that include a writing section (such as the fifth-grade portion of the MCA, which I forgot to mention above) also have to be hand-graded on a holistic scale. Not only does this take time, but the people hired by the testing companies to grade these tests generally have no educational training. They are monitored to ensure a certain productivity quota, and underpaid to boot, earning only $11-13 per hour for work that is tedious and energy-draining. A former test scorer said it best. "All it takes to become a test scorer is a bachelor's degree, a lack of a steady job, and a willingness to throw independent thinking out the window..."

I should mention also, as one who has been hired over the summer to do this, that standardized test questions are not necessarily written by teachers. I was hired by a testing company in the Twin Cities to write tests that would be given in another state. Those who were hired and "trained" along with me were not teachers, but out-of-work businessmen, housewives, and college students on summer break. I was the only teacher in the group. People are hired as "consultants," which means they are part-time and temporary workers, and income taxes are not taken out of their pay, even though the income is reported to the IRS. As I learned to my chagrin, the taxes levied on "consultants" are higher than those on other types of employees. That was a nasty little surprise that resulted in my having, for the first and only time, to pay extra income tax, rather than get a refund. The work for question-writers was not paid per hour, as for scorers, but instead paid on a commission basis per question or reading passage accepted for inclusion into a test. This meant that my work for that summer was not necessarily guaranteed payment, and that when I was paid, the check did not arrive until October of that year, with another surprise payment in December, when the company discovered that there were not enough acceptable reading passages and they relaxed their requirements for acceptance. (That, alone, should scare you.)

Finally, standardized testing only measures a few things, and not necessarily what is important in life. A test cannot pick out someone who will make a great mechanic, a famous hair-stylist (or even a regular one), a religious leader, a car salesman, or a police officer. Tests cannot even determine exactly who will do well in college, as many believe, because there are a lot of other factors involved, such as the student's level of independence, work ethic, and maturity. Besides which, students who normally do well on tests are the types of students who have learned to game the system, and who essentially wake up in the morning, blink, and get an A. Few of these students have any study skills when they finally get to a place where they are not necessarily the alpha dog, academically. They are the ones for whom Plan A always works, so they have no idea how to devise a Plan B.

To sum up, I give you a quote from Jonathan Pollard's online article, "Measuring What Matters Least."

"Our children are tested to an extent that is unmatched in the history of our society. There is no more discussion of learning or of new educational methods. ...the educational discourse in our nation has been limited to the following statement: 'Test scores are too low. Make them go up.'" :-/

***
Sources used for this blog:

Committee on Incentives and Test-Based Accountability in Public Education at the National Research Council. Incentives and Test-Based Accountability in Education. 2011. www.nap.edu

Contreras, Russell. "Some 11th-Graders Turned Test into a Game." August 2, 2004. www.abqjournal.com

DiMaggio, Dan. "The Loneliness of the Long-Distance Test Scorer." Monthly Review, December 2010.

Jacobs, Bruce. "No Child Left Behind's Emphasis on 'Teaching to the Test' Undermines Quality Teaching." Endeavors, December 2007.

Koretz, Daniel M. "Limitations in the Use of Achievement Tests as Measures of Educators' Productivity." Journal of Human Resources, Fall 2002.

Martinez, Marcy. "TAKS Test Taking a Bite Out of Budget?" April 28, 2011. www.valleycentral.com

McKnight, Katherine PhD. "Opting Out of NCLB Testing." March 25, 2011. www.katherinemcknight.com

National Council of Churches Committee on Public Education and Literacy. "Ten Moral Concerns in the Implementation of the No Child Left Behind Act." www.ncccusa.org (accessed June 21, 2011)

Pollard, Jonathan. "Measuring What Matters Least." StandardizedTesting.net, World Prosperity, Ltd.

Rizga, Kristina. "What Standardized Tests Miss." Mother Jones, May 19, 2011.

"Standardized Tests." ProCon.org (nonprofit public charity). Visited on March 11, 2013.

Strauss, Valerie. "Unanswered Questions About Standardized Tests." Washington Post, April 26, 2011.

Toppo, Greg. "When Test Scores Seem Too Good to Believe." USA TODAY, March 17, 2011.

Valli, Linda and Robert Croninger. "High Quality Teaching of Functional Skills in Mathematics and Reading." drdc.uchicago.edu (accessed June 20, 2011).

Zagursky, Erin. "Smart? Yes. Creative? Not So Much." February 3, 2011. www.wm.edu
