Predictable Failure of Federal Sanctions-Driven Accountability for School Improvement—And Why We May Retain It Anyway
HEINRICH MINTROP is an associate professor in the Graduate School of Education at the University of California, Berkeley, 3647 Tolman Hall, Berkeley, CA 94720; mintrop@berkeley.edu. As a researcher, he explores issues of school improvement and accountability in both their academic and civic dimensions.
The federal accountability system, made universal through the No Child Left Behind Act of 2002, is a system driven by quotas and sanctions, stipulating the progression of underperforming schools through sanctions based on meeting performance quotas for specific demographic groups. The authors examine whether the current federal accountability system is likely to succeed or fail, by asking, Does the sanctions-driven accountability system work? Is it practical? And is it legitimate among those who must implement it? The authors argue that even though sanctions-driven accountability may fail on practical outcomes, it may be retained for its secondary benefits and because there is a sense that credible policy alternatives are lacking. They conclude by proposing alternative policies and approaches to the current system.
Key Words: accountability, educational policy, educational reform, policy
Sanctions are a fact of life. When children do not do their homework, they lose points. When employees do not come to work, their pay gets docked. When shopkeepers do not wash their display windows, customers shun their wares. Sanctions make intuitive sense: We want people to do the right thing, and we feel it is only fair when they bear the consequences for lack of performance. Time and again, industrial psychologists and organizational sociologists have shown how rewards and sanctions function at the core of work performance (Cooper & Robertson, 1986; Latham & Pinder, 2005; Lawler, 1973). But how rewards and sanctions play out has differed quite remarkably over time (Sennett, 2006). The "organization man," working within stable private or public bureaucracies, could look forward to climbing the career ladder, given adequate work effort, loyalty to the organization, and seniority. Being passed over for a promotion at the normative time may have been financially disadvantageous and socially shameful, but one did not lose one's place in the organization. Nowadays, the stakes are higher. Those who are nimble and flexible and with up-to-date skills can reap enormous rewards, whereas those who cannot keep up in the competition are apt to lose job, status, and livelihood. Today's high-performance work organizations are of this decidedly more high-stakes type. Although they dangle high rewards for some, they foster a climate of punitive uncertainty for others, particularly those who find themselves working in struggling or losing industries.
Not unlike large corporations in the business world, schools in the past were organized as stable hierarchies, although with rather flat career ladders, that rested on faculty members' solidarity and attachment to a given school or school district. Tenure and seniority increased longevity in the job and decreased uncertainty, envy, competition, and turnover.
The "apprenticeship of observation" (Lortie, 1977) laid the groundwork for socializing new teachers into established performance standards. Group solidarity among teachers shored up collective expectations of average or middling effort; poor performance of a few "bad apples" was tolerated, and high performance was largely ignored. A decade or so after corporate restructuring, the new high-stakes work organization arrived in the public school system with the advent of accountability systems, first introduced in a few states in the early 1990s and then made universal through the federal No Child Left Behind (NCLB) legislation. Interestingly, after a few experiments with the rewards aspect of high-stakes accountability, most notably in Kentucky and California, school accountability systems gravitated to the sanctions side (Mintrop & Trujillo, 2005). With the additional prod of NCLB, state accountability systems for the most part narrowed their mission. Increasingly bypassing schools for the middle class, they instead targeted, with laser-sharp focus, the lowest performing schools, with the goal of closing the achievement gap. As they became more equity oriented, they also became more punitive. The prevalence of incentives and sanctions is tied to a new centralism in goal setting and goal monitoring made possible by new information and data warehousing technologies (Sennett, 2006). It has now become practical for a central planning agency, be it private or public, to set targets based on a small set of quantitative performance indicators, monitor whether large numbers of relatively small performance units reach these targets, and surgically order sanctions for underperforming units. This new performance management system increases central control by top management, freezing out the mediating functions of the middle layers of the organization. 
In the political realm, this new approach to public administration increases the potential for a small group of centrally positioned elites to steer a whole system, whether these be efficiency-oriented politicians or equity-oriented civil rights leaders. But in contrast to private corporations, in education central management of school performance is grafted onto, and runs up against, the traditional loose coupling of the educational system, political contestation at various layers of the system, and the enduring unionization of teachers. Whereas in private industry the new performance management system could be bolstered by a good dose of coercion, in the educational system the coercive power of the center is greatly reduced, making the success of the system much more incumbent on legitimacy. Current accountability systems in education consist of standards as broadly framed orientations for subject matter content and skills, standardized tests as the basis for performance indicators, and performance targets and quotas for measuring performance and underperformance. Sanctions are the means by which higher levels of the system put pressure on lower level performance units—schools, districts, and states—to take the central performance demands seriously. At its heart, the federal accountability system for schools is driven by quotas and sanctions. The substance of academic content, testing rigor, the regulation of inputs (with the exception of the requirement for highly qualified teachers), the specific method of school or program improvement, the specific methods of restructuring, all this is left for states to decide, as long as the elements exist formally. 
What is strictly stipulated, however, is the staged progression of underperforming units through a set of increasingly severe sanctions based on meeting performance quotas for specific demographic groups: from identification and publication of "school improvement" status (a kind of public shaming with potentially far-reaching market consequences), to mounting loss of organizational autonomy through required external intervention and service contracting, and finally to termination through reorganization or takeover of the organization. Even though the law formulates the sanction stages in the language of improvement, support, and radical renewal, the punitive core for districts and schools is apparent: When improvement efforts fail, loss of control and threat to organizational survival are at stake. Raising the overall achievement of a whole national educational system and closing the achievement gap is obviously an enormously complex problem. NCLB is the simple policy answer to that complex problem that currently holds sway. If indeed a combination of quotas and sanctions connected to new data-processing and warehousing technologies could do the job, we would be the first to support such an approach, given the dire situation of many poor students and students of color in the nation's schools. But we argue that, at this juncture, the contours of the failure of the federal sanctions-driven approach have come into view, as the simplicity of the remedy has become increasingly submerged by the complexity of school and district improvement. But because it is seen as an "easy" way to make the system work, it may be retained anyway.
"Accountability is here to stay," a slogan often heard in the early 1990s to persuade disbelieving teachers to make the necessary adjustments, was not so wrong after all. State accountability systems have shown staying power—certainly beyond what critics imagined. And by all indications, they have proved to be quite powerful instruments in reshaping how schools go about their business, especially schools for poor and minority children (Au, 2007; Herman, 2004). Accountability systems divide into two main components: guidance in the form of standards and test data that orient instruction and inform performance; and firm performance targets, pressures, and sanctions that make the system compelling. By their very nature, pressures and sanctions should be perceived as more negative than standards and tests, the former being more controlling, the latter being more informative (Frey, 1997). Sanctions are penalties for noncompliance with authoritative regulations or powerful demands. They may inflict loss of benefits, prestige, or status on individuals or collectives and trigger attendant feelings of displeasure, shame, or fear (Posner & Rasmusen, 1999). In the extreme, they threaten freedom or survival. Sanctions can be quite costly, as the outlays for the criminal justice system in the United States demonstrate. Costs are reduced when the actual imposition of sanctions is the exception because the threat of sanctions is sufficient to compel the desired behavior or the expected behavior occurs largely voluntarily. Sanctions are credible when they are properly targeted on those actors who are responsible for expected behaviors, when they cause clearly unwanted discomfort, and when they can be enforced. 
Voluntary compliance is more likely when the expected behaviors are valued because they "work," that is, lead to expected outcomes or correspond to personal values, dispositions, or ideologies, and when actors have the requisite capacities to fulfill expectations (Coleman, 1990; Lawler, 1973; Shamir, 1990, 1991). Conversely, sanctions are likely to fail when they produce ambiguous or uncertain outcomes so that their effect is in doubt. They are likely to fail when they are designed with less-than-credible threats, aimed at diffuse target actors, and beset with visible enforcement challenges, and when they require behaviors for which there is not sufficient capacity. Under these conditions they become impractical. Legitimacy will be compromised when sanctions only tenuously correspond to target actors' values and norms, befall actors who do not feel at fault, or instill doom as opposed to hope or expectation of success. Sanctions lacking practicality and legitimacy may induce defensiveness (Argyris, 1990; Staw, Lance, & Dutton, 1981) that prevents organizational actors from learning and trying out new solutions. Nevertheless, the approach may still be maintained (Meyer & Zucker, 1989) as long as secondary benefits can be derived from it, such as symbolic or ideological satisfaction or economic gain or simply because of the lack of realistic alternatives that address the problem as forcefully as the sanctions approach, on the face of it, promises to do. We argue in this article that we have reached a fork in the road: The federal sanctions-driven approach to school performance has developed the adverse conditions enumerated above, in which case its failure becomes predictable. It may be fixed or made to work without real success, in which case it survives but its survival becomes educationally undesirable; or it may hang on despite its failure, propped up by a coalition of secondary beneficiaries with political power.
In the following sections we elaborate our arguments by asking three main questions: (a) Does the sanctions-driven approach work; that is, does it produce the intended results? (b) Is it practical; that is, can it be implemented? and (c) Is it valued and legitimate?
Does the Sanctions-Driven Accountability System Work?
Does the system produce the expected outcomes?
Whether high-stakes accountability under NCLB improves student achievement depends in part on the metric used. Most state accountability systems appear to be a success, given that, in most systems, test scores continue to rise. In fact, analyses that use state assessment results show increases in overall average scores (Center for Education Policy, 2008b; The Education Trust, 2004, 2005; Neal & Schanzenbach, 2007). But the picture appears far less positive when one looks at the National Assessment of Educational Progress (NAEP), the only cross-state metric we have. The idea behind comparing state test results to NAEP is that if gains on high-stakes state tests represent real gains in achievement, they should generalize to low-stakes tests, such as NAEP (Koretz, 2008; Lee, 2007). When NAEP scores are used, gains appear to be much lower (Fuller, Gesicki, Kang, & Wright, 2006; Fuller, Wright, Gesicki, & Kang, 2007; Lee, 2006, 2007). The picture looks better for elementary grades than secondary ones, and better for math than for reading: reading scores tend to remain flat, whereas Grade 4 math scores showed some improvement. A meta-analysis of large-scale studies using national data showed modestly positive policy effects on average but no significant effect on narrowing the achievement gap (Lee, 2008). Nonetheless, there are substantial variations among states, and few states have narrowed the achievement gap among racial and socioeconomic subgroups and improved overall performance at the same time (Lee, 2007). Although NAEP scores have risen since NCLB, it is difficult to attribute gains to NCLB simply because the scores represent trends that began prior to NCLB and do not reflect any significant acceleration in the pace of improvement after NCLB passage. Lee (2007) examined long-term trends in national average math and reading scores and in the achievement gap between 1971 and 2004.
He found small or moderate improvements in both reading and math, but no indication that the improvements in achievement were related to any educational reform policies (Nation at Risk in 1983, Goals 2000, Improving America's Schools Act in 1994, and NCLB in 2001); the trend lines were linear, and there were no significant changes in the performance trajectories over the entire 33-year history of NAEP. Long-term trends in reducing the racial and socioeconomic achievement gap showed a curvilinear pattern, with reductions in achievement gaps in the 1970s and 1980s and an increase in the 1980s and 1990s. Since NCLB passage, the gap has not narrowed significantly. Given the large discrepancies between the NAEP and state assessment results, it is not quite clear what the state tests measure. By all indications, state accountability systems with their own pressures and sanctions are successful at focusing schools' and districts' attention on state assessments. The literature is replete with accounts of schools' attempts to streamline or "align" teaching with state demands. But whether students actually learn more and are given a better education in higher performing schools is doubtful (Mintrop & Trujillo, 2007). It has become increasingly apparent that teachers in low-achieving schools, who must generate larger gains than those in high-achieving schools, have strong incentives to adopt practices that inflate test scores (Koretz, 2008).
Do the sanctions work?
Corrective action and restructuring options under NCLB, such as reconstitution, charter school conversion, or takeover by education management organizations (EMOs), may work in some situations but do not appear to work across the board and are often accompanied by negative side effects (see Mintrop & Trujillo, 2005, for more detail; Mathis, 2009). For example, in Maryland, some local reconstitutions actually exacerbated schools' capacity problems, reduced schools' social stability, and did not lead to the hoped-for improvements, although a few schools benefited from a fresh start (Malen, Croninger, Muncey, & Redmond-Jones, 2002). Results from reconstitutions in Chicago (Hess, 2003) and in New York's Schools Under Registration Review program were inconclusive as well (Brady, 2003; New York State Education Department, 2003). School takeover by EMOs has worked in some cases but not in others, as research from Baltimore and Philadelphia suggests (Blanc, 2003; Bracey, 2002; Saltman, 2005; Travers, 2003a; Useem, 2005). State takeovers of entire districts have also produced uneven outcomes. Financial management is often cited as the most promising area for potential success of district takeover by states (Garland, 2003). However, equally dramatic academic success has been much harder to achieve (Education Commission of the States, 2004; Ziebarth, 2002). Although the research base on charter schools is expanding, little is known about charter school conversion as a means of corrective action and school redesign (Bulkley & Wohlstetter, 2003). Available data seem to suggest that converting district-administered schools into charter schools has had uneven results (The Brown Center, 2003; Gill, Zimmer, Christman, & Blanc, 2007). Charter schools also tend to show up on states' lists of failing schools in larger proportions than regular public schools (The Brown Center, 2003).
Thus among the variety of corrective action and restructuring strategies that have been tried, none stand out as universally effective or robust enough to overcome the power of local context. Competence of provider personnel, intervention designs, political power of actors in the system, and district and site organizational capacity to absorb the strategies all strongly influence how a particular strategy will turn out. A recent review of the available research evidence on the NCLB restructuring options corroborates this conclusion (Mathis, 2009). The reluctance of states and districts to embrace the strong tools provided to them by NCLB (Center for Education Policy, 2007, 2008c; U.S. Government Accountability Office, 2007) may be telling in this context. The school restructuring options have not been widely adopted, with schools and districts preferring more traditional school reform strategies, such as attending to how instructional time is used, hiring coaches to improve instruction, increasing staff collaboration, or providing school-based tutoring. In a survey of 340 districts conducted by the Center for Education Policy (2007), district officials cited their own strategies as more effective in improving student performance than the more radical corrective measures stipulated by the federal law. Finally, if the NCLB sanctions "worked," we might expect schools to move out of improvement status in large numbers, but in many states this is not happening (Center for Education Policy, 2008a; Owens & Sunderman, 2006). In sum, if there were clear-cut evidence of convincing student-learning gains, the debate about the current sanctions-driven approach would end even if a causal connection between specific sanctions and test scores could not be found. But there is no clear-cut evidence. The evidence of improved student learning is ambiguous, and the effectiveness of the prescribed sanctions for school improvement is mixed.
It is safe to say that, as of now, a universally effective treatment for low-performing schools in the corrective action and restructuring stages has not materialized. Thus we are left with great uncertainties as to the effect of NCLB on student achievement—uncertainties that undermine a justification for the costs associated with the law. For those committed to the idea of sanctions-based accountability, there may be grounds for a more positive answer: State test scores keep rising, and NAEP at least moves in the right direction. This kind of evidence may quell their doubts about the wisdom of sanctions-based systems, but it does not silence those with second thoughts.
Is the Sanctions-Driven System Practical?
Simplistic goal setting and misidentification of schools
Adequate yearly progress (AYP) is the measure used to hold schools and districts accountable. Schools that make AYP are assumed to be functioning well. As it turns out, AYP is not very good at differentiating schools that are making progress from those that are not. There are a number of technical reasons for this, most notably the fact that AYP compares the current proficiency status of a school or district to a fixed annual target. According to this metric, schools report the percentage of students who are performing at or above the proficiency target for a given year. Thus AYP, as currently defined and used in most states, is not a measure that captures improvement, or gains in student achievement, from one year to the next (Linn, 2008). Because students in schools identified for improvement for the most part begin with lower average test scores, they can continue to make substantial improvements while failing to reach the fixed AYP performance targets. As a result, overall student achievement gains are often similar in schools that are identified for improvement and schools that met the federal AYP goals. For example, an analysis of data from Virginia found similar levels of improvement in proficiency in both types of schools (Kim & Sunderman, 2005). (Whether or how the growth model pilot program now allowed under NCLB may attenuate this problem is discussed below.) If AYP, as currently used in most states, were one among many school quality indicators used to gauge school and district performance, these inaccuracies would be tolerable, perhaps, but it is not. As the sole authoritative indicator that triggers sanctions, it creates powerful realities for schools and districts on the ground. Irrationally, many schools may be forced to own the low-performance label when they are in fact fairly healthy and still making progress.
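The difference between a status measure and a gain measure can be made concrete with a small sketch. The Python below is an illustration only: the two schools, their percentages, and the 40% annual target are hypothetical numbers chosen to show the mechanism, and the check deliberately ignores subgroup targets and the other provisions that enter an actual AYP determination.

```python
from dataclasses import dataclass


@dataclass
class School:
    name: str
    pct_proficient_prev: float  # percent at or above proficiency last year
    pct_proficient_now: float   # percent at or above proficiency this year


def makes_ayp(school: School, annual_target: float) -> bool:
    """Status check: compare this year's proficiency rate to a fixed target."""
    return school.pct_proficient_now >= annual_target


def gain(school: School) -> float:
    """Year-to-year change in percent proficient, which the status check ignores."""
    return school.pct_proficient_now - school.pct_proficient_prev


# Hypothetical values, not data from any state.
ANNUAL_TARGET = 40.0
schools = [
    School("School A (low starting point)", 20.0, 30.0),
    School("School B (high starting point)", 60.0, 61.0),
]

for s in schools:
    print(f"{s.name}: gain = {gain(s):+.1f} points, "
          f"makes AYP = {makes_ayp(s, ANNUAL_TARGET)}")
```

In this toy case the school that improves by 10 points is identified for improvement, while the school that improves by 1 point is not, because only the current level is compared with the target.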
Insensitivity to exclusion
Insensitivity to special needs
Meeting subgroup targets for students with limited English proficiency (LEP) or learning disabilities has turned out to be a particularly vexing problem for schools and districts. Researchers have identified a number of challenges to implementing the NCLB requirements for LEP students: the instability of the LEP subgroup, the failure of standardized test scores to accurately reflect what LEP students understand, and the lack of proven accommodations that would make these scores more reliable, among others (Abedi, 2004; Batt, Kim, & Sunderman, 2005; Coltrane, 2002; Kieffer, Lesaux, & Snow, 2008). State and local education officials have questioned the fairness of the provisions because students who achieve English proficiency are generally moved out of the subgroup, while new students with very low levels of English proficiency are continually added to the subgroup, greatly diminishing the chances that schools serving large numbers of LEP students will be able to improve the performance of this subgroup and make AYP. In addition, states have found that schools reporting an LEP subgroup are more likely to be identified as needing improvement than those without this subgroup, a pattern that applies to the subgroup of students with disabilities as well (Batt et al., 2005; Sunderman et al., 2005). It is not surprising that a system inclined to capture performance with highly standardized and simple measurement tools and averse to exceptions from uniform proficiency goals would come under enormous strains in dealing with special needs students. But the remedy is not straightforward. In the logic of the system's incentives, any group excluded from or relieved of accountability provisions runs the risk of being short-changed, as it is in schools' and districts' self-interest to ration their services according to the most pressing accountability demands.
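The instability of the LEP subgroup described above can be illustrated with a deliberately simplified sketch. All numbers are hypothetical, and the model makes strong assumptions that are not drawn from the studies cited: a single score stands in for both English proficiency and tested achievement, every student gains the same amount each year, and each student who reaches the cut score leaves the subgroup and is replaced by a low-scoring newcomer.

```python
def percent_proficient_by_year(start_scores, years, yearly_gain,
                               cut_score, entrant_score, churn):
    """Percent of the subgroup at or above cut_score at the end of each year.

    With churn=True, students who reach the cut score exit the subgroup after
    testing and are replaced one-for-one by new entrants at entrant_score.
    """
    scores = list(start_scores)
    history = []
    for _ in range(years):
        # Every student improves by the same amount during the year.
        scores = [s + yearly_gain for s in scores]
        n_at_or_above = sum(s >= cut_score for s in scores)
        history.append(100.0 * n_at_or_above / len(scores))
        if churn:
            # Successful students leave; newcomers take their places.
            scores = [s for s in scores if s < cut_score]
            scores += [entrant_score] * n_at_or_above
    return history


# Hypothetical starting scores for a four-student subgroup.
START = [10, 20, 30, 40]

fixed_cohort = percent_proficient_by_year(
    START, years=4, yearly_gain=10, cut_score=50, entrant_score=10, churn=False)
churning_subgroup = percent_proficient_by_year(
    START, years=4, yearly_gain=10, cut_score=50, entrant_score=10, churn=True)

print("fixed cohort:     ", fixed_cohort)
print("churning subgroup:", churning_subgroup)
```

Under these assumptions the fixed cohort rises from 25% to 100% proficient over four years, whereas the churning subgroup stays at 25% every year even though each individual student gains 10 points annually.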
Discouraging rigor
State accountability systems that operate within Horizon 1 seem to be more practical in the present architecture of NCLB. Operating with fairly low test rigor pegged to presently available state and teacher capacity, these systems produce low numbers of failing schools and have capacity-building needs that are fairly light, affordable, and manageable (Peterson & Hess, 2006). Pressure may be sufficient to prod schools toward reaching the systems' modest proficiency goals. Thus in low-demand or low-rigor systems, a mere sanctions approach of the NCLB type would work. In contrast, we have systems that are more ambitious in their performance demands while at the same time producing an intervention burden that seems to make the system unworkable. To illustrate, data from the 2003–2004 school year (Mintrop, 2008; Mintrop & Trujillo, 2005) show that states such as California, with high testing rigor in which the definition of proficiency is relatively close to that of NAEP, had up to a quarter of their schools in federal school improvement. States with large gaps between state-defined and NAEP-defined proficiency, such as Texas (where the gap is between 50 and 60 points), had a much smaller intervention burden: about 5% of the schools in Texas were identified as needing improvement. Kentucky is a state with medium testing rigor and a correspondingly medium intervention burden (about 10% of the total number of schools in 2003–2004). States that adopted accountability systems prior to NCLB (first-generation accountability states) tended to deal with these high-intervention burdens by scaling back their programs. After an initial period when some states classified large numbers of schools as needing improvement, most settled on an intervention burden of no more than 2% to 4% of the total number of schools in the state. But this winnowing down took place prior to NCLB, when states had flexibility in how many schools they wanted to sanction as low performing.
It is no longer possible under NCLB, with its firm staging of sanctions and increasing proficiency targets. Thus if states were to adopt definitions of proficiency close to NAEP, as California did and as some advocates have demanded of all states, the result would be a staggering number of schools in need of improvement, for which enormous intervention capacity would have to be provided. Indeed, by 2008, 48% of schools and 61% of districts in California were in improvement status (Asimov, 2008). Short of that, states have the option to keep their school accountability systems well within Horizon 1 or else risk impracticality.
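The relationship between testing rigor and intervention burden can be sketched in a few lines. The four schools, their scores, the two cut scores, and the 50% target below are hypothetical and serve only to show the direction of the effect; they are not estimates for any state.

```python
def share_identified(schools, cut_score, target_pct):
    """Fraction of schools whose percent proficient falls below target_pct."""
    identified = 0
    for scores in schools:
        pct = 100.0 * sum(s >= cut_score for s in scores) / len(scores)
        if pct < target_pct:
            identified += 1
    return identified / len(schools)


# Each inner list holds hypothetical student scores for one school.
schools = [
    [35, 45, 55, 65],
    [45, 55, 65, 75],
    [55, 65, 75, 85],
    [65, 75, 85, 95],
]
TARGET = 50.0  # required percent proficient, identical in both scenarios

lenient = share_identified(schools, cut_score=50, target_pct=TARGET)
strict = share_identified(schools, cut_score=70, target_pct=TARGET)

print(f"lenient definition of proficiency: {lenient:.0%} of schools identified")
print(f"strict definition of proficiency:  {strict:.0%} of schools identified")
```

With identical students and an identical target, raising the cut score from 50 to 70 moves the share of identified schools from none to half in this toy example, which is the mechanism behind the larger intervention burden in high-rigor states.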
Turbulence
In all likelihood, many low-performing schools, unable to meet federal AYP, will have previously been subjected to substantial local reform measures. Districts that anticipate state action and carry out local school restructuring often move principals and staff, conduct inspections, and mandate programs before a school appears on the state or federal radar screen. When that happens, schools may have to repeat improvement stages or cycles once they enter federal or state corrective action, and they may have to adopt, yet again, corrective action and restructuring strategies with uncertain and contingent prospects for improvement. Something of this nature is bound to happen in places like Philadelphia, where a fairly large number of the lowest performing schools will make their journey through the NCLB stages as already redesigned schools (Travers, 2003b), and has already happened in districts that have a long history of reconstitution, such as San Francisco. As was pointed out above, charter schools tend to show up on states' failing schools lists in larger proportions than do regular public schools. For these schools as well, fundamental redesign happened prior to the introduction of federal sanctions. In other words, rather than being distinct stages of well-articulated intervention intensity, NCLB interventions will increasingly look like déjà vu to affected schools, with more hoops to jump through, unless states design intervention approaches that are truly different from "all the other things" a school has already tried. Such approaches need to decrease turbulence rather than add to it. But the rigid staging of federal sanctions makes designing measures appropriate to the developmental needs of a given school or district much more difficult.
Strained capacities
High-stakes accountability systems seem to intensify a two-tier structure of high- and low-capacity schools and districts. Research has found that high-capacity schools often already possess the capacity and resources needed to perform at high levels and are thus able to use the additional impetus and guidance from the accountability system to respond as expected, that is, to improve instruction and curriculum (Diamond & Spillane, 2004; Elmore, 2004; Sunderman, 2001). They are therefore more likely than low-capacity schools to avoid the negative repercussions of the sanctions. In contrast, many poorly performing schools lack the resources and capacity to respond, on their own, to sanctions in ways that will improve curriculum and instruction (Elmore, 2004). Intensifying pressure through sanctions will not result in improvements but in further rigidity, fragmentation, and deterioration (Mintrop, 2004). Low-capacity schools are predestined to bank on short-term strategies that require little added capacity (Sunderman, Tracey, Kim, & Orfield, 2004). Common strategies are test preparation activities, content alignment, and concentration on tested subjects, benchmark grades, and students near proficiency. In some low-performing schools, this can amount to a parallel test-remediation curriculum that is different from the regular curriculum taught in less-pressured schools, with the result that students are excluded from intellectually challenging content and learning (Diamond & Spillane, 2004; Sunderman, 2001; Valenzuela, 2005). In low-rigor, low-demand accountability systems, these strategies might actually work to keep a school from facing corrective action, but it is unlikely that they suffice in more rigorous accountability systems. When the mere threat of sanctions is not strong enough to lead schools and districts to meet AYP goals or when it has become a detriment to schools' chances for improvement, a support infrastructure is needed.
NCLB relies on state education agencies to play a crucial role in implementing the federal mandates but provides relatively modest resources to help them do so. Under NCLB, states are required to develop testing and accountability systems that in many instances go beyond what they had in place previously. They must collect and publish data on student achievement that include disaggregation by subgroup categories and teacher quality, which is more extensive than previous data requirements. Even more important, states have a role in helping schools and districts improve under NCLB, a requirement that traditionally has not been a state function. State education agencies are relatively small and generally devote modest efforts to distribute resources and assure compliance with federal and state laws. The traditional focus of state agencies—to enforce federal requirements, enact state policies, and act as conduits for the flow of federal money to school districts—means they lack both the staff and the expertise to reform schools (McDermott, 2004; Sunderman & Orfield, 2006). In such a system, responsibility for school improvement gets passed down to the next level of the educational system, often leaving low-capacity schools to improve on their own.
Coping with impracticality
States have also put pressure on the federal government to make implementation more flexible. And indeed, the federal government has moved to attenuate some of the shortcomings of NCLB but has not removed the sense of pervasive impracticality surrounding the sanctions approach. This is particularly evident in two pilot programs: the growth model pilot program and the differentiated accountability pilot program. As of January 2009, the U.S. Department of Education had approved the growth model pilot program in 15 states. This pilot program was intended to allow states to take into account student progress when determining AYP, but early, although not definitive, indications are that its use has made little difference in the number of schools identified for improvement (Klein, 2007; Weiss, 2008a, 2008b). Under current guidelines the pilot maintains many of the requirements of the current status measure (the 100% proficiency requirement remains in effect, states are required to incorporate grade-level proficiency targets into the model, and only assessments in reading and math are included) and will do little to mitigate misidentification and the enormous intervention burden that accumulates in states with medium to high standards. Moreover, growth models are challenging to design, requiring very sophisticated state databases and a high level of technical expertise (Haertel, 2005; McCaffrey, Lockwood, Koretz, & Hamilton, 2003), both in short supply in the majority of states. The differentiated accountability pilot program would allow some states to determine how to intervene in schools and districts. The rationale was straightforward: Schools need "differentiated" interventions that are linked to the reasons for which they were identified for improvement in the first place.
However, the pilot program continued to require states to increase the number of students participating in the supplemental services and choice options, to maintain the NCLB sanctions stages with stipulated time lines, and to ensure that the restructuring and corrective action sanctions were retained (U.S. Department of Education, 2008). In short, states have shown a tendency to lessen compliance pressures, whereas the federal government has made allowances for minor design changes that leave the overall approach with its attendant impracticalities firmly in place. In sum, the federal sanctions-driven approach to school performance as currently designed with its simplistic method of determining performance indicators and setting goals and its rigid staging of sanctions has proved to be quite impractical. In state systems with at least moderately high performance demands, NCLB has led to high numbers of failing schools that by far outstrip district and state intervention capacities. But it is not even clear if the bulk of these schools are in fact correctly classified. Most notably, the sanctions system has no practical answers in stock for the full spectrum of student performance and learning needs, especially for students far below proficient, special needs students, and excluded marginally performing students; moreover, it does not speak to the predicament of low-capacity schools and districts. Although it may appear that the pressures of looming sanctions have succeeded in fermenting a climate of reform, such ferment, in many instances, is more likely to result in unproductive turbulence than in sustained school improvement.
Is the Sanctions-Driven System Valued and Legitimate?
Accountability systems fashioned after NCLB principles violate core professional norms of educators and produce widespread frustration and demoralization among those charged with carrying out needed school improvement efforts. Although teaching to the test is acceptable to a certain degree, high pressure to do so to the exclusion of other more complex and far-reaching goals is not. As a result, teachers widely report that they need to compromise standards of good teaching when striving to meet accountability goals (Abrams et al., 2003; McNeil, 2000; Valenzuela, 2005). Indeed, schools' performance or accountability status may be a poor indicator of their overall educational quality (Mintrop & Trujillo, 2007). The moral discourse of accountability assigns failure to schools' lack of high expectations and standards for all students and places the burden of responsibility on educators. Educators themselves are torn. They assume guilt and at the same time discount it (Booher-Jennings, 2005; Finnigan & Gross, 2007; Hargreaves, 2004; Mintrop, 2004). The belief is widespread that sanctions penalize teachers and administrators who have to work under the most difficult conditions in schools that serve children in poverty from many different demographic subgroups, a belief that resonates with evidence documented by research (Sunderman et al., 2004). As a result, low-performance labels attached to the organization are rejected as valid judgments of individual work quality (Mintrop, 2004). Studies have documented that accountability goals are not seen as realistic, and sanctions are seen as ill guided and of little personal consequence, unfairly placing blame on teachers.
Yet despite misgivings, the humiliation or discomfort of working in a publicly labeled low-performing school seems to trigger an initial surge of energy and determination, if not frenzy, among educators to meet the goals (Finnigan & Gross, 2007; Fullan, 2003; Malen et al., 2002; Mintrop, 2004). Often the most activist teachers and administrators who have the least reason to own the low-performance label are the ones who assume most of the responsibility. In other cases, district and school administrators use the label to demand compliance with centrally adopted prescriptions and delegitimize teachers' traditional defenses against administrative intrusions. When hoped-for improvements either are not forthcoming or cannot be sustained after the short-term fixes have been exploited, as is often the case in struggling low-capacity schools, resentment and demoralization set in that trigger exit (Finnigan & Gross, 2007; Mintrop, 2004). The reality of high rates of school failure in the more demanding states, in the face of threatened or imposed sanctions and despite educators' efforts or willingness to comply and suspend judgment, reinforces the overall negativity of the sanctions-driven approach. Educators who are guided by the idea of public service in a difficult environment (Sennett, 2006; Shamir, 1990) and who do not embrace the system's self-interested performance calculus feel especially devalued. But values are not fixed, and teachers as an occupational group are highly susceptible to external normative influences. Accountability systems, with their measured outcomes, performance targets, sanctions, and attendant programmatic prescriptions rooted in powerful ideologies of effectiveness and science, may reshape values.
We have examples of schools that have imbued accountability goals with moral purpose and that function with a sense of goal integrity, or good balance between external demands and internal values (Elmore, 2004; Mintrop & Trujillo, 2008; Reeves, 2000); have taken determined steps to tightly align their teaching to state assessments; and have been extraordinarily successful. Furthermore, the loosening connection between teachers and schools of education that traditionally inculcated novice teachers into a discourse of professionalism and progressivism, combined with the tremendous teacher turnover in urban schools, may result in dwindling teacher agency and make accountability systems and sanctions-driven accountability an overwhelming force that creates not only new work routines but also new values. At the present time, however, it is probably safe to say that negativism is the prevalent mood. And despite an almost 20-year run in some states, accountability systems, and particularly NCLB, still encounter serious legitimacy and acceptability problems among the very groups that they are designed to primarily target. It is indicative in this context that, in their majority, state accountability systems established prior to NCLB have either rarely used, or turned away from, high pressure and sanctions as a main lever to motivate teachers. Instead they came to emphasize mild pressure. By contrast, under NCLB, pressure as an improvement strategy is a central feature, and schools may face severe sanctions in a rather short time, with all the concomitant problems of legitimacy.
The federal sanctions-driven approach to school performance is not a powerless system but is nevertheless likely to fail. The system would not fail if it were simply a matter of skepticism on the part of those who need to carry out school reform or because it was not valued by teachers or administrators in low-performing schools and districts. It would probably not fail based on the inconclusive data on the effectiveness of accountability systems, because interpretations can be slanted and one can cling to the encouraging upswing of test scores in most state systems. But the combination of uncertain effects, loose connection to the broader educational values and norms of educators, and the difficulties or impossibilities of carrying out the law's regulations day-to-day makes it a prime candidate for being declared a failing system. But there is a way out. As long as states maintain a low goal horizon and the more lenient options for school improvement or restructuring continue to be chosen, the architecture of NCLB will hold. But it will be a solidly Horizon 1 undertaking. The problem with such accountability systems is not that they concentrate schools on Horizon 1 challenges but that they tend to squelch teacher activities in Horizon 2, particularly when these systems work well, that is, when they push educators to run a tight ship around test-driven basic skills remediation. And once a system has operated within the confines of Horizon 1 for a while, and educators have internalized the intellectual habits rewarded in such a system, school improvement dynamics cannot simply be switched over into Horizon 2. Thus learning gets stuck while the system succeeds. This is particularly destructive for poor students and students of color who, more so than White students, are concentrated in the schools that NCLB identifies as failing. There are two more reasons why the system may persist even when it fails.
One is that, as the literature on failing organizations has shown (Meyer & Zucker, 1989), failing structures are kept in place when groups derive secondary benefits from the maintenance of those structures. Secondary beneficiaries of NCLB are those ideologically or politically committed to it, those deriving economic benefit from it (e.g., testing agencies, educational management organizations, segments of the school improvement industry), and those deriving political benefit from the dysfunction of the sanctions-driven approach (e.g., politicians campaigning on a platform of educational reform). Perhaps more important among those who are committed to educational equity is the sense that we lack credible policy alternatives. A focus on sanctions-driven accountability is the dominant paradigm that is currently driving federal and state educational policy, and consequently many policy proposals focus on ways to improve on NCLB. Some of these proposals could improve NCLB. For example, accountability systems that set targets pegged to real growth achieved by a sizable number of demographically similar high-performing Title I schools are preferable to the current status measures. Also preferable are systems that incorporate multiple indicators of performance. Although state standards are good orientations and state tests are good devices for system monitoring and self-monitoring, multiple indicators of school quality are better able to cover a wide spectrum of educational goals and valued outcomes. Multiple-indicator systems have a better chance to connect to concerns such as student engagement in learning and instructional quality. There are a number of good suggestions that if adopted could improve on the meaningfulness of accountability systems for educational practice but fail to address the underlying flaws of NCLB. We argue that what is needed is an alternative to the current sanctions approach and a broadening of the social welfare agenda. 
Recognizing the complexity of schooling, the need for capacity building, and the importance of factors external to the school that affect student performance is one alternative to the current sanctions-based accountability system. Policy makers would be required to think about school reform differently and acknowledge that schools alone cannot overcome the social and economic inequities in our society that contribute to unequal educational outcomes. More comprehensive investments in student welfare that link education with health, job development, and community building, as well as redistributive investments to attract and keep top-flight professionals in schools for the poor, would be paramount. But even more comprehensive social welfare measures do not by themselves solve the problems of the adversarial sanctions system currently in place. An alternative approach to sanctions-driven accountability would be a partnership between government, the teaching profession, and empowered low-income parents. Schools cannot be improved against the better judgment, and without the enthusiastic participation, of those charged with making the improvements. Their commitment cannot be coerced through sanctions, but it can be motivated through guidance and mild and positive pressures that mobilize internal ideals and standards of competence and care (Darling-Hammond, 2004). Such standards, pertaining to teachers' inherent desire to reach students personally and effect learning, are exemplified by the actions of the stronger teachers and need to be made compelling for all teachers through professional socialization in teacher preparation programs and sustained by way of good instructional supervision, learning communities at school sites, professional networks, and the soft power of accountability systems that are redesigned to inspire educators.
Accountability systems inspire educators when they connect to broader educational values and give the stronger teachers enough flexibility to model best practices. In such systems, state tests become a feedback device rather than an automatic sanctions trigger. Soft accountability is powerfully augmented when parents are mobilized to support their children's achievement and press for high-quality schools. We submit that after about 15 years of state and federal sanctions-driven accountability that has yielded relatively little, it is time to try a new approach, one that centers on the idea of sharing responsibility among government, the teaching profession, and low-income parents. The hard cultural work of broader-based movements, nourished by government and civic action, will have to replace legal-administrative enforcement and mandates as the centerpiece of such an equity agenda.
This research was supported in part by a grant from the Charles Stewart Mott Foundation to the Civil Rights Project/Proyecto Derechos Civiles, University of California, Los Angeles. Received for publication January 25, 2009. Revision received May 7, 2009. Accepted for publication May 8, 2009.
Educational Researcher, Vol. 38, No. 5, 353-364 (2009)





