Making Pay-For-Performance Pay Off (Part 3)

As the Bush administration’s initiatives to establish “pay-for-performance” move toward implementation, questions and doubts prevail. (Part 3 of 3 in a series of articles on pay for performance).

In previous articles, suggestions were made concerning the changes in selection, training, and priorities needed to maximize the possibility of successful implementation. What remains to be examined is a new appraisal system designed to support merit pay. That is the subject of this article, the last in the series.

Who understands this stuff?

There are few people in the Federal Government who have much expertise in the area of performance evaluations. The government acknowledges specialization in many human resources disciplines. There are Staffing, Classification, Employee Relations, Employee Development, Labor Relations, and EEO Specialists. There is no such thing as a “Performance Appraisal Specialist”… and no other specialty really wants to claim it.

Appraisal has been the “third rail” of HR offices for decades. Commonly, those with the most savvy and talent have avoided program responsibilities in the area of performance evaluation. If salary increases are going to be tied to appraisals, this will have to change. Federal agencies (especially DHS and DoD) are in need of real expertise in this area.

Unfortunately, personnel specialists who become familiar with the literature on performance evaluation will soon recognize sad and glaring deficiencies. Few, if any, texts speak realistically to the subject of “performance standards” or individual measurement. Yet such standards are at the heart of every evaluation system.

Report cards and widgets

Back in school most of us had a well-defined set of standards. Grades of 90-100 earned A’s, 80-89 led to B’s, and so forth. If a student thought she deserved an “A,” rather than the “B” she received on her report card, she would simply ask the teacher to recalculate her average.

In the adult world of Federal employment, things are not so simple. Exams and term papers were contrived for measurement purposes. Real work is assigned to accomplish a mission. It is often irregular in character, inconsistent in importance, and unpredictable in outcome. It will be difficult to quantify.

Those who write about performance appraisals often speak in terms of “widgets.” In point of fact, our government is not a widget factory. Most jobs require judgment, communication, customer service, creativity, and other intangibles.

Weasel words and rodent rhetoric

In many areas of government, the more honest question is “How can we measure success and/or failure without a bean count?” The answer has proven a disappointment — namely the use of “weasel worded” standards.

For several years now many federal agencies have diluted standards of achievement into generic descriptions — full of impressive adjectives yet devoid of real meaning. Words such as “rarely,” “frequently,” “occasionally,” and “normally” help neither employees nor supervisors. Most employees take one look at them and conclude that evaluations will be subjective. In such an environment, it’s more important to look good than to be good.

The rater’s dilemma

Whether the system has two levels, five, or a number in-between, what’s written in the standards needs to be credible. Management’s dilemma looks like this: either standards are written in terms of metrics (“hard numbers”) or in terms of weasel words (“soft prose”).

With a hard number approach, agencies are hoping that first level supervisors will actually produce and maintain performance appraisal data. There is a tide of historical experience to the contrary.

With soft prose, supervisors are left to make evaluation determinations that may soon affect salaries (and eventually retirements) based on vague perceptions. Thus, the first level supervisor is caught in a devilish trap. Either he gathers data (week in and week out) to justify fair appraisals, or he stands up to inevitable perceptions of favoritism and bias.

As if this dilemma over the form standards will take weren’t enough, critics of appraisal (the late W. Edwards Deming, godfather of TQM, being the most notable) argue that investing time and effort to ensure evaluations are objective and fair is unwise. Deming and others well outside the TQM community point out that where the bean count rules, employees compete rather than cooperate; what’s not counted carries no priority; and there is a tendency to meet a standard rather than do one’s best. Quite simply, the assumption that counting behind employees will improve their performance has been challenged for decades.

What’s the payoff?

So why would any public or private employer introduce a pay-for-performance system? The additional time, paperwork, and consternation can in no way justify implementing such a system, absent a belief that pay incentives will result in improvements and efficiencies.

At present, it seems that both DoD and DHS are anchoring their expectations of payback in a belief that employees who compete against each other for money will work both harder and smarter. The importance of this assumption cannot be overemphasized. Evaluation systems (whether tied to pay or not) succeed only when they result in improved performance.

If the objective is improving productivity at the individual level (and comments related to previous articles in this series indicate other motives may be at work), an alternative evaluation concept is available. Standards That Enhance Performance (STEPs) may be what’s needed. This concept is anchored in the belief that how performance standards are designed is a primary factor in determining whether administrative costs associated with an appraisal system are exceeded by performance improvements.

STEP basics

The STEP method assumes that line managers will develop standards from scratch rather than deploying generics from some book or an HR office. It also presumes that employees will be evaluated in “critical elements” (or categories) that reflect the work they’re paid to perform.

By this method of determining what’s to be evaluated, a Management Assistant would have critical elements such as “Creating Reports,” “Maintaining Files,” “Arranging Meetings,” “Designing Presentations,” etc. Generic elements like “Teamwork,” “Adaptability,” or “Initiative” would become obsolete. In other words, begin by evaluating folks on what they’re paid to do. Critical elements would come from the essence of the job description.

Once the categories (or “critical elements”) are established, supervisors begin to develop STEPs with a brainstorming exercise. Consider what can and/or does go wrong in the performance of each critical element. The more real and potential problems identified, the better. Thus, a list of headaches, fears, and other negatives is developed for each element.

Accentuate the negative

These problem areas are the germs from which real improvements grow. Examples might be “files aren’t updated,” “doesn’t read directives before beginning work,” and “fails to meet deadlines.” If appraisals are to result in pay determinations and overall improvements, then supervisors should identify where improvements are needed.

After editing the list of headaches related to each critical element, STEPs are then designed. These standards inform the employee of the work habits needed to ensure the identified problems are eliminated (or less frequently experienced) in the future. For example, how can an employee ensure fewer deadlines are missed? Traditional standards might read, “Meets 98% of all deadlines” (hard numbers) or “Deadlines are seldom missed” (soft prose). Neither has made much of an impression on employees to date.

Moreover, both of the standards shown above assume the employee already knows how to meet more deadlines and would now do so for fear of failing his standard. A STEP in this case might read, “Employee will review pending deadlines each morning upon arriving at work. If a deadline is in jeopardy, will notify supervisor in time to adjust schedules or priorities.”

STEPs are designed to improve performance by telling the employee how she is to work — not how fast or how accurately. In essence, STEPs are observable work habits. They don’t ask a supervisor to monitor completed work for met/missed deadlines throughout the year. Rather, this technique presumes that better individual work habits are the foundation upon which improvements are built.

If everyone around you is catching fish and you are not, will it really help to have someone tell you to catch more fish or you’ll be thrown out of the boat? The tougher assignment is to figure out what’s wrong and how you can fish more effectively. In the STEPs framework, this becomes the primary role of front line management.

Putting it into practice

This different technique for evaluating performance is by no means perfect. It simply accomplishes more with less pretense than “hard numbers” or “soft prose.” Employees are evaluated more on how they do their jobs than on how many, how fast, or how accurately. The intent is to improve work habits rather than adjudicate outcomes. What follows is an example of traditional standards in a five level system, followed by STEPs.

In addition to bogus numbers and endless weasel words found in traditional standards, note the absence of criteria relating to “Outstanding” and “Unacceptable” performance. This is a common problem in government. Also note that in the STEPs approach, 3 of the 5 standards are preformatted. They simply relate back to “Fully Successful” and/or “Outstanding”. One need only fill in the blanks (or numbers) to complete them.

Traditional (Results-Oriented) Standards

(Five Level System)

Critical Job Element: Report Writing

Exceeds Fully Successful

Reports are extremely thorough yet concise and readable. Research is performed independently and demonstrates a thoughtful and highly professional approach. Only minor stylistic revisions are required. Fewer than 4% of reports exceed established deadlines.

Fully Successful

Reports are clear and sensible. Research is complete and accurate, requiring only occasional supervisory assistance. No major revisions that involve significant additional time and effort are required. Fewer than 8% of reports exceed established deadlines.


Minimally Successful

Reports are disorganized and/or unclear. Major revisions may be required. Close supervisory involvement is necessary to ensure the adequacy of research. More than 12% of reports exceed established deadlines.


STEPs (Standards That Enhance Performance)

(Five Level System)

Critical Job Element: Report Writing


Outstanding

Performs as described in Fully Successful and, in addition, demonstrates all of the following work habits:

  • Discusses the nature of each report with all team members to understand how findings may affect other program areas.
  • Coordinates editing with the assigned Technical Writer at least a week before draft is complete.
  • Solicits input from other agency departments both at the beginning of the process, and after development of the initial outline.
  • Explores the possible use of graphics to make each report clearer and easier to understand.
  • Uses a tracking system to update the status of all pending reports on a daily basis.
Exceeds Fully Successful

Meets all conditions of the “Fully Successful” standard and, in addition, at least 3 of the items shown in the “Outstanding” standard.

Fully Successful

Performs as described in all of the following:

  • Outlines report before attempting first draft. Outline will be e-mailed to a team member and supervisor for review and comment.
  • Identifies any expenses (printing, outside consulting, etc.) associated with developing the report and includes cost estimates with the initial outline.
  • Develops a time line for each part of the report and submits it to the supervisor. Adjustments will be made on a weekly basis. Delays will be explained in writing.
  • Reviews accuracy of factual information and includes footnotes and bibliography with first draft.
  • Includes a table of contents in final reports.

Unacceptable

After formal counseling, fails to follow any one work habit shown in the “Fully Successful” standard on two occasions.

This approach is offered to those who accept the reality of impending changes and want to make the best of them. These standards require little if any counting and are designed with an eye toward improvement rather than rating. They are welcomed by many employees who want to know how to achieve higher ratings and by supervisors who want to get something in return for those same ratings.

“STEPs” are no more a panacea than are “pay-for-performance” and “pay banding”. As with all forms of performance standards, they presume an engaged, informed, and empowered supervisor. They take more time to develop and require critical thought if they are to be at all meaningful. They are a step in the right direction.

About the Author

Robbie Kunreuther is the Director of Government Personnel Services (GPS). GPS provides 1 to 3-day seminars to Federal agencies in four subject areas: Dealing with performance and conduct issues; Developing sensible performance appraisal criteria; Fostering cooperative labor-management relations; and Applying mediation skills in the workplace. Over the years, Robbie has trained thousands of Federal supervisors, managers, HR specialists, and union officials. For more information about him and GPS, go to