Annual Professional Performance Review (APPR)
Section 100.2 of the Commissioner’s Regulations requires each district and BOCES to conduct annual teacher evaluations. An APPR plan must be updated annually. Beginning July 1, 2011, the following nine criteria are the performance criteria to be used to evaluate teachers of instructional services. These criteria apply to classroom teachers who are not included in the 2011-12 phase-in of the new teacher evaluation requirements:
- Content Knowledge — Knowledge of the subject area and curriculum.
- Pedagogical Preparation — Employ the necessary pedagogical practices to support
- Instructional Delivery — Demonstrate delivery of instruction that results in active student involvement, appropriate teacher/student interaction, and meaningful lesson plans resulting in student learning.
- Classroom Management — Demonstrate classroom management skills, supportive of diverse student learning needs, which create an environment conducive to student learning.
- Student Development — Demonstrate knowledge of student development, an understanding and appreciation of diversity, and regular application of developmentally appropriate instructional strategies for the benefit of all students.
- Student Assessment — Implement assessment techniques based on appropriate learning standards designed to measure student progress in learning and successfully use analysis of available student performance data and other relevant information.
- Collaboration — Demonstrate effective collaborative relationships with students, parents, or caregivers and appropriate support personnel to meet the learning needs of students.
- Reflective and Responsive Practice — Demonstrate that practice is reviewed and effectively assessed, and appropriate adjustments are made on a continuing basis.
- Student Growth — A positive change in student achievement between at least two points in time as determined by the school district or BOCES, taking into consideration the unique abilities and/or disabilities of each student, including English language learners.
According to section 3012-c of Education Law, as added by Chapter 103 of the Laws of 2010, each school district and BOCES is required to establish an appeals procedure through collective bargaining under which the evaluated teacher can challenge the substance of the APPR, the district’s or BOCES’ adherence to the standards and methodologies for such reviews, adherence to the Commissioner’s regulations and locally negotiated procedures, and the issuance or implementation of a Teacher Improvement Plan.
Approved Student Assessment
Approved student assessment means a standardized student assessment on the list approved by the Commissioner for the locally selected measures subcomponent and/or the measures of student growth in non-tested subjects.
Approved Teacher Practice Rubric
An approved teacher practice rubric must broadly cover the New York State Teaching Standards and their related elements. The rubric must be grounded in research about teaching practice that supports positive student learning outcomes. Four performance rating categories — “Highly Effective,” “Effective,” “Developing,” and “Ineffective” — must be identified, or the rubric’s summary ratings must be easily convertible to the four rating categories that New York State has adopted. The rubric must clearly define the expectations for each rating category. The “Highly Effective” and “Effective” rating categories must encourage excellence beyond a minimally acceptable level of effort or compliance.
The rubric shall be applicable to all grades and subjects; if it is designed explicitly for specific grades and/or subjects, it will be approved only for use in the grades or subjects for which it is designed. It must use clear and precise language that facilitates common understanding among teachers and administrators, and it must be specifically designed to assess the classroom effectiveness of teachers. To the extent possible, the rubric should rely on specific, discrete, observable, and/or measurable behaviors by students and teachers in the classroom with direct evidence of student engagement and learning. The rubric must include descriptions of any specific training and implementation details that are required for the rubric to be effective.
Artifacts are samples of student or teacher work that demonstrate knowledge, skills, and/or dispositions related to a standard or goal. A student artifact could be an essay that shows progression from draft to final copy. A teacher artifact could be a lesson plan with annotation as to successes and areas to reexamine.
Assessment refers to the process of gathering, describing, or quantifying information about an individual’s performance. Different types of assessment instruments include (but are not limited to) achievement tests, minimum competency tests, developmental screening tests, aptitude tests, observation instruments, performance tasks, and authentic assessments.
For the purpose of teacher evaluations, assessment approaches are the methods that school districts or BOCES employ to assess student or teacher performance. The methods may include, but are not limited to, the following: classroom observation, videotape assessment, self-reflection, surveys, and portfolio review.
The effectiveness of a particular approach to assessment depends on its suitability for the intended purpose. For instance, multiple-choice, true-or-false, and fill-in-the-blank tests can be used to assess basic skills or to find out what students remember. To assess other abilities, performance tasks may be more appropriate.
For purposes of measurement of student growth, baseline data is basic information gathered to provide a comparison for assessing individual student achievement at the beginning of instruction.
A principal is defined as an administrator in charge of an instructional program of a school district or BOCES.
Classroom Teacher or Teacher
A classroom teacher is a teacher in the classroom teaching service, as defined in Section 80-1.1, who is the teacher of record. The definition exempts evening school teachers of adults enrolled in nonacademic, vocational subjects, and supplemental school personnel. (Part 80-1.1 excludes pupil personnel services from the definition.)
Observation of classroom teaching practice by a trained evaluator, administrator, or peer is one measure of teacher evaluation. To be a fair and valid assessment element, the observation requires a common standard and rubric of expectations for performance.
Common Branch Subjects
Means common branch subjects as defined in 80-1.1 (any or all subjects usually included in the daily program of an elementary classroom).
Comparable Across Classrooms
Means that the same locally selected measures of student achievement or growth are used across a subject and/or grade level within the school district or BOCES.
Chapter 103 of the Laws of 2010 specifies student achievement will comprise 40 percent of teacher evaluations. Initially, 20 percent will be based on student growth on State Assessments or “comparable measures.” In subsequent years following Regents’ approval of a Value-Added Model, 25 percent will be based on student growth on State Assessments or “comparable measures.”
Guidance on the definition of comparable measures may be obtained by examining the State Education Department’s criteria for alternative assessments. New York State Education Commissioner’s Regulations Part 100.2 (f)(1)-(6) states: “With the approval of the commissioner, assessments which measure an equivalent level of knowledge and skill may be substituted for Regents examinations.” Based on these criteria, comparable measures are suggested to be assessments that:
- Measure the state learning standards in the content area;
- Are as rigorous as state assessments;
- Are consistent with technical criteria for validity, reliability, and freedom from bias; and
- Are administered, and have their results interpreted, by appropriately qualified school staff in accordance with described standards.
Composite Score of Teacher Effectiveness
According to Part 30 of the Rules of the Board of Regents, a composite score of teacher effectiveness means a score based on a 100-point scale that includes three subcomponents:
(1) Student growth — As measured on State assessments or other comparable measures, 0-20 points for the 2011-12 school year and 0-25 points in subsequent years for those grades/subjects where a Value-Added Growth Model is approved by the Board of Regents.
(2) Student achievement — Based on locally selected measures, 0-20 points for the 2011-12 school year and 0-15 points in subsequent years for those grades/subjects where a Value-Added Growth Model is approved by the Board of Regents.
(3) Teacher effectiveness — For the 2011-12 school year and all subsequent years, 0-60 points.
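As a rough illustration of the point arithmetic above, the following Python sketch sums the three subcomponents and checks each one against its stated maximum. The function name, the parameter names, and the `value_added_approved` flag are hypothetical and are not taken from Part 30; only the point ranges come from the list above.

```python
def composite_score(growth, local, effectiveness, value_added_approved=False):
    """Sum the three subcomponent scores into a 0-100 composite.

    Point caps follow the ranges listed in the text; all names are illustrative.
    """
    # Caps shift from 20/20 to 25/15 where a Value-Added Growth Model is approved.
    growth_max, local_max = (25, 15) if value_added_approved else (20, 20)
    effectiveness_max = 60

    limits = {
        "growth": (growth, growth_max),
        "local": (local, local_max),
        "effectiveness": (effectiveness, effectiveness_max),
    }
    # Reject any subcomponent that falls outside its allowed point range.
    for name, (points, cap) in limits.items():
        if not 0 <= points <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}, got {points}")

    return growth + local + effectiveness


print(composite_score(15, 18, 50))  # 2011-12 caps: 15 + 18 + 50 = 83
print(composite_score(22, 12, 50, value_added_approved=True))  # 22 + 12 + 50 = 84
```

Under either set of caps the maximum possible total is 100, which matches the 100-point scale described above.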
Comprehensive Teacher Evaluation System (CTES)
A continuous improvement cycle of teacher evaluation that links teaching standards, performance expectations defined in a rubric, individual goal-setting for improvement of practice and differentiated professional development to meet the needs of the individual teacher throughout the span of a teaching career. The five key components include:
- Professional teaching standards;
- Multiple measures used to assess teaching performance;
- Details for effective teacher evaluation;
- The teaching and learning conditions affecting good teaching and positive student learning; and
- Teacher support and assistance.
A certified administrator under Part 80 who has authority, management, and instructional leadership responsibility for all or a portion of a school or BOCES instructional program in which there is more than one designated administrator.
Section 100.2 (dd) of the Commissioner’s Regulations requires that every school district and BOCES provide mentored experience for holders of initial teaching certificates. The goal of mentoring is to provide support for new teachers in the classroom teaching service in order to ease the transition from teacher preparation to practice, thereby increasing retention of teachers in the public schools, and to increase the skills of new teachers in order to improve student achievement in accordance with state learning standards. Mentoring programs should be developed and implemented consistent with any collective bargaining obligation negotiated under Article 14 of the Civil Service Law. The mentoring program must also be described in the district’s Professional Development Plan (PDP). Participation in mentoring is a requirement for an individual to receive a professional certificate.
The measurement, comparison, and judgment of the value, quality, or worth of students’ work and/or of their schools, teachers, or a specific educational program based upon valid evidence gathered through assessment.
An evaluator is an appropriately trained individual who conducts an evaluation of a classroom teacher or building principal. Evaluators may include school administrators, principals, outside evaluators, and teacher peer reviewers.
Evidence includes concrete proof or examples that document student learning or teacher effectiveness and/or improvement. Evidence may be included as part of a portfolio or summarized in a report.
Assessment questions, tools, and processes that are embedded in instruction and are used by teachers and students to provide timely feedback for purposes of adjusting instruction to improve learning are considered formative assessments. Formative assessment is used primarily to determine what students have learned in order to plan further instruction. By contrast, an examination used primarily to document students’ achievement at the end of a unit or course is considered a summative test.
A formative evaluation provides a teacher with feedback on how to improve their teaching practice to advance student learning. It is a critical component of career professional growth. Data from formative evaluation also can identify specific professional development opportunities for teachers that will facilitate student learning (e.g., instructional techniques that meet the needs of diverse learners, effective classroom management strategies, and use of student assessments).
Means the Board of Education of each school district or the Chancellor of the City School District of New York City, BOCES, or to the extent provided by the law, the Board of Education of the City of New York.
Means to measure the change in the performance of students on specified assessments over time.
A key question in the design of a growth system is to determine how “academic progress” over time is to be measured and how much growth is “enough.” New York will adopt the use of the Common Core State Standards and the resulting assessments as they become available, and the growth system will be aligned concurrently.
High Stakes Tests
One-shot exams administered to students with results used for determining consequences to students, teachers, and schools. Such tests include Regents Examinations, Teacher Certification Examinations and the grades 3-8 English language arts and math state assessments.
The extent to which two or more individuals (coders or raters) agree. Inter-rater reliability addresses the consistency of the implementation of a rating system. Ongoing training for all evaluators on the use of a teacher evaluation tool or protocol is one way to ensure continuous inter-rater reliability.
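One way to put a number on agreement between two evaluators is sketched below in Python. It computes simple percent agreement and Cohen's kappa, a commonly used statistic that corrects for agreement expected by chance. The source does not prescribe either statistic; the function names and the sample ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters assigned the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random,
    # given how often each rater used each category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four lessons by two evaluators.
a = ["Effective", "Effective", "Developing", "Highly Effective"]
b = ["Effective", "Developing", "Developing", "Highly Effective"]
print(percent_agreement(a, b))       # 0.75
print(round(cohens_kappa(a, b), 3))  # 0.636
```

Kappa is lower than raw agreement here because some matches would be expected even if the evaluators rated at random.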
The primary individual responsible for conducting and completing an evaluation of a classroom teacher or building principal is the lead evaluator. To the extent practicable, the building principal, or his or her designee, will be the lead evaluator of a classroom teacher.
An experienced, skilled teacher who helps or coaches primarily beginning teachers to strengthen their instructional and pedagogical skills. In New York State, the mentor’s role is confidential and non-evaluative, unless the negotiated collective bargaining agreement states otherwise. Ideally, a mentor will have certification and expertise in the same content area as the person being mentored. Generally, mentors and mentees may be located in the same building.
The array of different assessments and evaluation tools used to obtain evidence of a teacher’s knowledge, skills, and dispositions. The purpose of a measure or set of measures is to provide “strong and convincing” evidence of an individual’s performance in a way that results in professional growth and improved student learning. Multiple measures allow teachers to provide evidence of their wide-ranging skills and activities, and provide evaluators with useful and meaningful information and evidence of an individual teacher’s effectiveness (Little, Goe & Bell, 2009).
- Multiple Measures of Student Growth
Two or more assessment measures used to obtain evidence of student learning. Some examples include observation, tests (state, district, grade level, classroom, standardized, criterion-referenced, norm-referenced), essays, tasks, projects, laboratory work, presentations, and portfolios.
- Multiple Measures of Teacher Effectiveness
Two or more measures of teaching effectiveness based on prescribed standards, including observation, creation of a professional evidence binder (portfolio), student achievement scores, parent and student surveys, self-reflection, and others.
Peer Assistance and Review (PAR)
The goal of a PAR system is to help teachers to improve their teaching effectiveness. PAR includes two separate and distinct components — assistance and review. The assistance program ensures that teachers receive the support and guidance to improve their teaching performance. Peer review involves teachers in the assessment of a colleague’s performance. It is a negotiated process in which teachers assess the performance of teachers. Peer reviewers may also be referred to as Consulting Teachers. Peer assistance can exist without peer review but peer review should not exist without an assistance program such as mentoring and professional development. All PAR programs in New York State are bargained collectively.
A professional development strategy for educators to consult with one another, to discuss and share teaching practices, to observe one another’s classrooms, to promote collegiality and support, and to help ensure quality teaching for all students. Relationships between and among PAR participants and coaches are built on confidentiality and trust in a non-threatening, secure environment in which they learn and grow together; therefore, peer coaching is usually not part of an evaluative system. (ASCD, formerly the Association for Supervision and Curriculum Development.)
A collection of work, which, when subjected to objective analysis, becomes an assessment tool. This occurs when (1) the assessment purpose is defined; (2) criteria or methods are made clear for determining what is put into the portfolio, by whom, and when; and (3) criteria for assessing either the collection or individual pieces of work are identified and used to make judgments about student learning (CCSSO).
Portfolio of Teacher Work/Evidence Binder
A collection of items, exhibits, and artifacts intended to show a teacher’s or student’s accomplishments and abilities, including an increase in knowledge and skill. Teacher portfolios, when used as a method of evaluation, involve goal-setting, collection of artifacts, self-reflection, and self-reporting.
A comprehensive, sustained, and intensive approach to improving teachers’ and principals’ effectiveness in raising student achievement. Professional development promotes collective responsibility for improved student performance and comprises professional learning that:
- Is aligned with rigorous state student learning standards;
- Is conducted among educators at the school and facilitated by well-prepared professional development coaches, mentors, master teachers, or other teacher leaders;
- Is ongoing and engages educators in a continuous cycle of improvement.
Professional development may be provided through courses, workshops, seminars, technology, networks of content-area specialists and other education organizations and associations.
Quality Rating Categories/Criteria
The performance of teachers evaluated on or after July 1, 2011, will be rated as one of the following categories based on a single composite effectiveness score:
- Highly Effective means a teacher is performing at a higher level than typically expected based on the evaluation criteria prescribed in the regulations, including but not limited to acceptable rates of student growth.
- Effective means a teacher is performing at the level typically expected based on the evaluation criteria prescribed in the regulations, including but not limited to acceptable rates of student growth.
- Developing means a teacher is not performing at the level typically expected and the reviewer determines that the teacher needs to make improvements based on the evaluation criteria prescribed in the regulations, including but not limited to less than acceptable rates of student growth.
- Ineffective refers to a teacher whose performance is unacceptable based on the evaluation criteria prescribed in the regulations, including but not limited to unacceptable or minimal rates of student growth.
An estimate of how closely the results of a test would match if the tests were given repeatedly to the same student under the same conditions (and there was no practice effect). Reliability is a measure of consistency.
Means that locally selected measures are aligned to the New York State Learning Standards and to the extent practicable, are valid and reliable as defined by the Testing Standards.
Describes a set of rules, guidelines, or benchmarks at different levels of performance, or prescribed descriptors for use in quantifying measures of program attributes and performance (adapted from Western Michigan University Evaluation Center). Rubrics:
- Promote learning by giving clear performance targets based on agreed-upon learning goals.
- Are used to make subjective judgments about work or status more objective through clearly articulated criteria for performance.
- Can be used to understand next steps in learning or how to improve programs (adapted from CCSSO).
Rubric to Evaluate Teacher Effectiveness
Describes performance for each criterion at each level of effectiveness: “Highly Effective,” “Effective,” “Developing,” and “Ineffective.”
Tests that are administered and scored under uniform (standardized) conditions. Because most machine-scored, multiple-choice tests are standardized, the term is sometimes used to refer to such tests, but other tests may also be standardized.
As defined by federal policy, student growth is the change in student achievement for an individual student between two or more points in time. Student achievement in the tested grades and subjects means: (1) a student’s score on the state’s assessments required under the federal Elementary and Secondary Education Act (ESEA); and, as appropriate, (2) other measures of student learning, such as those described for the non-tested grades and subjects, provided they are rigorous and comparable across classrooms.
For non-tested grades and subjects: alternative measures of student learning and performance such as student scores on pre-tests and end-of-course tests; student performance on English language proficiency assessments; and other measures of student achievement that are rigorous and comparable across classrooms.
Student growth is the change in student achievement for an individual student between two or more points in time. A state may also include other measures that are rigorous and comparable across classrooms.
Student Growth Percentile Score
A statistical calculation that compares a student’s achievement on state assessments or comparable measures to that of similar students.
A test given to evaluate and document what students have learned at the end of a period of instruction. The term is used to distinguish such tests from formative tests, which are used primarily to diagnose what students have learned in order to plan further instruction.
Summative Evaluation for Teachers
Assessment of whether a standard has been met. It can be used for tenure decisions, intensive assistance decisions, dismissal decisions, career path decisions and compensation decisions.
Establish a framework and definition of specific expectations for what teachers should know and be able to do.
- Provide a clear definition of effective instructional practice;
- Define teacher competencies and describe what teachers should know and be able to do;
- Promote student learning;
- Serve as the base for teacher evaluation; and
- Inform professional learning and development.
Teacher (Principal) Improvement Plan (TIP)
On or after July 1, 2011, Chapter 103 of the Laws of 2010 requires a teacher receiving a rating of “developing” or “ineffective” to receive a Teacher Improvement Plan. The TIP must be developed and implemented no later than 10 days after the date on which teachers are required to report prior to the opening of classes for the school year. The TIP is required to include, but is not limited to, identification of the needed area of improvement, a timeline for achieving improvement and the manner in which improvement will be assessed. Where appropriate, the TIP should also differentiate activities to support a teacher’s or principal’s improvement in those areas. The TIP is to be developed locally through negotiations and consistent with the regulations of the commissioner.
Teacher or Principal Growth Percentile Score
The student growth percentile score with certain student characteristics (poverty, students with disabilities, and English language learners) taken into consideration.
Teacher of Record
For 2011-12, this includes the teachers who are primarily and directly responsible for student learning activity aligned to the performance measures of a course, consistent with guidelines prescribed by the Commissioner. For 2012-13, this term will be defined by the Commissioner.
Means that scores obtained from an instrument (test) represent what they are intended to represent. Validity refers to the appropriateness, meaningfulness, and usefulness of the specific inferences made from test scores. For example, if a test is designed to measure achievement, then scores from the test really do represent various levels of achievement.
Aims to estimate fairly a teacher’s contribution to achievement growth of his/her students.
The model compares class-wide achievement growth to expected growth.
Statistical adjustments account for what each student brings to the classroom:
- Student’s previous achievement.
- Other student factors such as poverty, attendance, special education status, etc.
In principle, it is the fairest way to use student achievement in teacher evaluation (Gill).
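The comparison of class-wide growth to expected growth can be sketched in Python as follows. This is a deliberately simplified illustration, not the model the Board of Regents would approve: it assumes each student's expected score has already been produced by some statistical adjustment for prior achievement and other student factors, and it only takes the difference of the class averages. The function name, field names, and sample scores are hypothetical.

```python
from statistics import mean


def class_value_added(students):
    """Average actual growth minus average expected growth for one class.

    Each student is a dict with 'prior', 'actual', and 'expected' scores.
    The 'expected' score is assumed to come from a separate statistical model.
    """
    actual_growth = mean(s["actual"] - s["prior"] for s in students)
    expected_growth = mean(s["expected"] - s["prior"] for s in students)
    return actual_growth - expected_growth


# Hypothetical class of three students.
students = [
    {"prior": 60, "actual": 70, "expected": 66},
    {"prior": 80, "actual": 85, "expected": 86},
    {"prior": 50, "actual": 58, "expected": 55},
]
print(round(class_value_added(students), 2))  # 2.0
```

A positive result means the class grew more than expected; a negative result means it grew less. A real value-added model would also estimate the uncertainty around this figure.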
Value-Added Growth Score
The result of a statistical model that incorporates a student’s academic history and other demographics and characteristics, school characteristics and/or teacher characteristics to isolate statistically the effect on student growth from those characteristics not in the teacher’s or principal’s control.
Determining teacher effectiveness requires that the evidence of multiple measures — classroom observations, parent surveys, student test scores, and other evidence of student learning — be incorporated into a single composite score. In calculating the composite score, all evidence may not have equal value or significance to the specific purpose(s) of the evaluation. Weighting refers to assigning different levels of value to the evidence obtained by classroom observations, parent and student surveys, and to student work.