Evaluating candidates

Topics/tags: Miscellaneous, academia

A few weeks ago, I attended Grinnell’s summer workshop on equitable evaluation. Since we’ve been spending so much time dealing with the complexities of faculty evaluation, particularly with regard to the role of end-of-course student evaluations [1], I had assumed that it would be mostly about equitable evaluation of current faculty. However, the workshop ended up being more about evaluating candidates for faculty and staff positions [2]. That’s okay; CS is hiring this year, so it’s good to think about such things [3].

On the second day of the workshop, we ended up discussing the criteria we use to evaluate candidates. Starting in the past few years, Grinnell has asked us to determine those criteria in advance of reading the applications. Our discussions of criteria have been quite useful; they help us think more closely about what we care about, and they help us make sure that we’re on the same page, as it were.

While CS was happy to identify criteria, we ended up arguing regularly with the administration about their request that we rank the criteria. In part, it was not clear that we would agree on the ranking [4]. More importantly, it’s not clear how rankings help. Suppose, for example, that we have three criteria, which we’ll designate as 1, 2, and 3, which are ranked in that order. Suppose also that we have two candidates, P and Q. P appears preferable under criterion 1, and Q appears preferable under criteria 2 and 3. Which do we choose? The ranking has not helped. It gets even worse when we have sixteen primary criteria.
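
A small Python sketch makes the problem concrete. Everything in it is hypothetical and chosen only for illustration: the function names, the two sets of weights, and the idea of scoring at all are assumptions, not part of any actual process. It shows that two decision rules that both respect the ranking 1, 2, 3 can still pick different candidates.

```python
# Hypothetical illustration: criteria 1, 2, and 3 are ranked in that order.
# prefers[c] names the candidate who appears preferable under criterion c.
prefers = {1: "P", 2: "Q", 3: "Q"}


def lexicographic(prefers):
    """Let the top-ranked criterion decide outright."""
    return prefers[min(prefers)]


def weighted(prefers, weights):
    """Credit each candidate with the weights of the criteria they win."""
    totals = {}
    for criterion, candidate in prefers.items():
        totals[candidate] = totals.get(candidate, 0) + weights[criterion]
    return max(totals, key=totals.get)


# Both sets of (made-up) weights respect the ranking 1 > 2 > 3.
steep = {1: 5, 2: 2, 3: 1}   # criterion 1 outweighs 2 and 3 combined
gentle = {1: 4, 2: 3, 3: 2}  # criteria 2 and 3 together outweigh 1

print(lexicographic(prefers))     # P
print(weighted(prefers, steep))   # P
print(weighted(prefers, gentle))  # Q
```

The ranking is the same in all three cases. What changes the answer is how much more the first criterion matters than the others, and a ranking by itself does not say.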

We were also worried that there was a push for us to use numbers in assessing candidates. Our department generally eschews numbers, since, by themselves, they convey very little information. It doesn’t help that the administration kept talking about matrices, which makes it sound like a problem in linear algebra [5]. But it seems that matrix is just another word for rubric. That is, for each primary criterion (e.g., support for diversity), they have what seems to be a set of sub-criteria, and a description of what makes a candidate unsatisfactory, effective, or exceptional [6] for each sub-criterion. For example, the last row of the Plans and Organizes matrix has the following:

Unsatisfactory: Doesn’t follow an orderly method of setting goals and laying out work.
Effective: Measures performance against goals; evaluates results.
Exceptional: Consistently exceeds performance goals; most often using SMART [7] goals.

I liked the idea of these rubrics; they can help make sure that we agree not only on criteria, but also on what makes people weak or strong on those criteria. But the rubrics I saw focused on past experience. We do not generally evaluate faculty candidates on what they’ve done; we evaluate them on their potential. That’s particularly true for teaching; many of our candidates have not taught before, and some have only been at places that don’t employ student-centered teaching methodologies. So I asked about evaluating potential. After some false starts [8], the group helped me think about how we might do that.

And so my project for the workshop became drafting some rubrics for our department to use. I was joined in that process by a colleague from the Center for Teaching, Learning, and Assessment and by the Peer Education and Outreach Coordinator for Computer Science and Statistics, the latter of whom plays an active and important role in CS searches.

In the end, it will be a department decision whether or not to use rubrics and, if so, what the rubrics will look like. But I feel that it was a useful exercise and that having something will support department discussions of both issues.

While the rubrics were my project for the workshop, I also enjoyed the many discussions and readings and learned much from them. Here are some takeaways that I should remember:

It helps candidates if you let them know the primary phone questions in advance [9].
It would be useful to have our PEOC/CSS do a post-class interview with candidates [10,11].
It was suggested that we assess understanding of SLAC [12] education relatively late in the process. I’m not sure how I feel about that. It strikes me that it’s particularly important in a technical discipline like CS.
It’s useful to think about why we have each criterion; I appreciate that my CTLA [14] colleague regularly reminded me of that issue [15].
I need to be careful about anchoring bias [16].

All in all, I’m glad I went. I look forward to the subsequent departmental discussions.

[1] Our current method appears to be biased; the faculty voted to continue using the evaluations while we examine them more closely and, hopefully, develop a new approach.

[2] If I’d read the description of the workshop more closely, I probably would have known that.

[3] It’s good to think about such things even if we’re not hiring.

[4] We didn’t have a lot of difficulty selecting criteria; as I said, it was a useful process. However, knowing our department, we’d still be discussing the ranking of criteria this summer.

[5] E.g., multiply a vector of scores by some matrix associated with the ranked criteria to get some kind of rating of each candidate.

[6] Their terms, not mine.

[7] Specific, Measurable, Attainable, Relevant, and Timely.

[8] At least some folks seemed to have difficulty with the idea that you don’t have past experience to evaluate folks on, or that relying on past experience can itself be biasing.

[9] The department uses a script. However, we regularly go off script when someone says something interesting or has trouble answering a question.

[10] Okay, I knew that already. But we should consider making it consistent practice.

[11] During one colleague’s interview on campus, I used our interview time to talk about what they could have done better in their class. They tell me it was quite intimidating. At the same time, it suggested to them that they would get good mentoring. However, I should have realized that “Would you like feedback on your class?” is a question that only has one answer. The moral: Someone less threatening than I am should be discussing such issues with candidates.

[12] Small Liberal Arts College.

[14] Center for Teaching, Learning, and Assessment.

[15] “Why do you care whether they understand the liberal arts?” That one was easy. Our primary goal is to educate the whole student within the liberal arts tradition. One can’t do that without some understanding of SLACs.

[16] Relying too heavily on one piece of information or an initial impression (the anchor) and neglecting subsequent information [17].

[17] That description is taken from an NSF survey.

Version 1.0 released 2019-07-09.

Version 1.0.1 of 2019-07-10.

The opinions stated herein are those of Samuel A. Rebelsky and do not necessarily reflect those of Grinnell College, Grinnell's Computer Science Department, the Rebelsky family, CMD-IT, SIGCAS, SIGCSE, any other organizations I am or have been affiliated with, or even most other sentient beings.

SamR's Assorted Musings and Rants: Evaluating candidates by Samuel A. Rebelsky is licensed under a Creative Commons Attribution 4.0 International License.
