AI with Michal

Using personality tests for hiring

The practice of selecting a validated trait instrument, placing it at the right funnel stage, and routing scores through a documented human review step so personality data informs decisions without replacing structured evaluation or obscuring group pass rates.

Michal Juhas · Last reviewed May 15, 2026

What is using personality tests for hiring?

Using personality tests for hiring means selecting a validated instrument, placing it at the right stage in the funnel, and treating scores as one input among several rather than as a ranking mechanism. The most defensible path starts with criterion validity: evidence that the specific trait predicts performance for your role family, not just for a general population. A conscientiousness measure that works for sales roles may add no signal for technical or creative roles, and using it anyway creates adverse impact risk without any offsetting prediction benefit.

The practical steps are: choose a validated instrument, set a funnel position after at least one human screen, route scores through a documented review gate, log group pass rates from the first hire, and correlate scores to your own performance ratings after 20 or more closed roles. Each step is doable without an IO psychologist on staff, but each step also requires someone to own it.

[Illustration: personality tests for hiring as a structured workflow. A candidate completes a trait assessment, scores flow through a human review gate, and a compliance log tracks group pass rates before the advance or reject decision.]


In practice

  • A TA manager preparing to deploy a conscientiousness screen for a high-volume customer service role runs a pilot on 30 past hires first, correlates their scores to manager ratings at six months, and presents the correlation coefficient to legal before going live.
  • A recruiter on a debrief call receives the personality report after all panelists have shared structured observations, not before, so scores do not anchor the conversation before direct evidence is on the table.
  • An HR director reviewing a quarter-end hiring audit spots that the personality filter pass rate for one demographic group is 62 percent of the pass rate for the majority group, triggering a vendor conversation about the norming sample before the next intake cycle opens.
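The pass-rate audit in the last example is the EEOC four-fifths rule: a group whose selection rate falls below 80 percent of the highest group's rate is flagged as potential adverse impact. A minimal sketch, assuming you can pull passed/total counts per group from your ATS (the counts and group names below are invented):

```python
# Hypothetical quarterly counts per demographic group: (passed, total assessed).
counts = {
    "group_a": (124, 200),  # highest-passing group in this example
    "group_b": (46, 120),
    "group_c": (31, 60),
}

rates = {group: passed / total for group, (passed, total) in counts.items()}
highest = max(rates.values())

# Four-fifths rule: flag any group under 80% of the highest pass rate.
for group, rate in rates.items():
    ratio = rate / highest
    flag = "FLAG" if ratio < 0.8 else "ok"
    print(f"{group}: pass rate {rate:.0%}, {ratio:.0%} of highest -> {flag}")
```

In this made-up data, group_b passes at roughly 62 percent of the top group's rate, which is exactly the pattern the HR director in the example would take back to the vendor.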

Quick read, then how hiring teams use it

This is for recruiters, sourcers, TA, and HR partners who need shared language in vendor reviews, debrief rooms, and quarterly audits. Skim the first section for a shared picture. Use the second when you are deciding how a personality layer connects to live reqs, ATS steps, and compliance reporting.

Plain-language summary

  • What it means for you: Using a personality test in hiring means choosing a tool that was built and tested for your type of role, placing it after an initial screen, and treating the score as one piece of evidence alongside structured interviews and work samples.
  • How you would use it: Pick one trait your scorecard already names as critical for the role, find a validated measure of that trait, and run it as an optional data point before the panel stage.
  • How to get started: Ask your current vendor or shortlist whether they have a technical validity report for your role family. If they do not, request one before signing. If they cannot produce one, look at vendors that publish peer-reviewed criterion validity studies.
  • When it is a good time: After you have a named trait that matters for the role, after legal or HR has reviewed the lawful basis (GDPR in the EU), and after you have a plan for logging group pass rates from day one.

When you are running live reqs and tools

  • What it means for you: In a live workflow, personality scores appear as ATS fields or vendor dashboards. Without explicit rules about when a score can flag or advance a candidate, it becomes a silent automated gate that nobody audits.
  • When it is a good time: After the pilot correlation is positive and after the human-in-the-loop gate is documented in writing: who sees the scores, in what order, and what a low score triggers (investigate, override, or reject with documented reason).
  • How to use it: Keep assessment version and model version in the candidate record. Run the four-fifths check every quarter. Separate the score field from the advance or reject field in your ATS so an audit can show the two decisions were made independently.
  • How to get started: Integrate the tool API into your ATS or export pipeline so scores land in the same record as interview notes. Assign a compliance owner who reviews group pass rates monthly for the first six months of any new assessment deployment.
  • What to watch for: Vendors who bundle personality scoring with AI inferred from video or text without a separate validity study for that inference layer. Bundled claims are harder to audit and harder to defend if challenged. See AI bias audit for the questions to ask.
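Keeping the score field and the decision field separate makes one audit possible: checking whether advance decisions simply mirror a score threshold, which would mean the score is acting as the silent automated gate described above. A minimal sketch with invented records, field names, and threshold (your ATS export will differ):

```python
# Hypothetical ATS export: score and decision stored as separate fields,
# plus assessment version retained for the audit trail.
records = [
    {"candidate": "c1", "score": 81, "decision": "advance", "assessment_version": "2.3"},
    {"candidate": "c2", "score": 42, "decision": "advance", "assessment_version": "2.3"},
    {"candidate": "c3", "score": 77, "decision": "reject",  "assessment_version": "2.3"},
    {"candidate": "c4", "score": 35, "decision": "reject",  "assessment_version": "2.3"},
]

THRESHOLD = 60  # whatever cut-off the vendor dashboard suggests

# If every decision matches the threshold exactly, scores are gating candidates.
matches = sum((r["score"] >= THRESHOLD) == (r["decision"] == "advance")
              for r in records)
agreement = matches / len(records)
print(f"Decisions matching score threshold: {agreement:.0%}")
if agreement == 1.0:
    print("Warning: decisions track the score exactly; audit the human review step.")
```

Perfect agreement over a full quarter is not proof of misuse, but it is the first thing an auditor will ask the compliance owner to explain.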

Where we talk about this

On AI with Michal live sessions the legal and ethics modules of the AI in recruiting track cover personality tests as a concrete case study in responsible tooling: how to read a technical manual, how to run a pass-rate audit, and how to brief a sceptical hiring manager on why the score is one input, not a ranking. If you want the peer discussion with real vendor names and real data, join a session at Workshops.

Around the web (opinions and rabbit holes)

Third-party creators move fast. Treat these as starting points, not endorsements, and verify any vendor-specific claims before deploying a tool.

YouTube

These links open search results pages; use Filters > Upload date to find recent content. Mix IO psychology research with employment law explainers.

Reddit

  • r/IOPsychology is the practitioner and researcher community for discussions on which instruments have defensible criterion validity for specific role types.
  • r/recruiting captures real recruiter experience with personality tools: vendor claims, hiring manager pushback, and practical audit stories.
  • r/humanresources surfaces HRBP perspectives on policy, lawful basis documentation, and candidate feedback on assessment experience.

Validated tool checklist

| Criteria | What to ask the vendor | Why it matters |
| --- | --- | --- |
| Criterion validity | What does this trait predict, for which role family? | A general validity claim does not apply to your role |
| Norming sample | How many, what industry, what seniority? | Norms built on one group do not transfer cleanly to another |
| Adverse impact data | Pass rates by race, gender, and age from the norming study | Required under EEOC Uniform Guidelines for any selection tool |
| Inference method | Is the score from self-report or AI inference from behaviour? | Inferred scores have weaker validity and higher bias risk |
| Version tracking | Can I see which version produced a given score? | Needed for audit trails and complaint investigations |

Frequently asked questions

How should a hiring team choose which personality test to use?
Start with criterion validity: ask the vendor for a technical manual that links the specific trait to job performance in a sample that matches your role family, seniority level, and industry. If the manual cites a general population study rather than one matched to your role, treat that as a gap. Second, ask for adverse impact data by race, gender, and age from the norming sample. Third, check whether the instrument maps to the Big Five or a peer-reviewed derivative, since frameworks like MBTI lack the criterion validity needed for selection. See personality test for employment for a validation checklist and framework comparison.
When in the hiring funnel should candidates complete a personality assessment?
After a structured screening step, not at the top of the funnel. Placing a long questionnaire at the apply stage filters by assessment fatigue and technical access rather than by the trait you care about. The cleaner sequence is: application, recruiter screen, then structured interview or work sample, with a personality tool placed after the first human touchpoint so you have initial role fit before adding psychometric data. Async screening steps can sit in the same window. Avoid placing the test after a live panel interview: candidates interpret a late-stage request as a signal of distrust and completion rates drop.
How do you stop personality test scores from overriding recruiter judgment?
Log the score and the hiring decision as two separate records in the ATS so you can audit whether score thresholds are acting as automated gates. Require the recruiting lead to write a brief note naming the evidence used for the advance or reject decision; that note should reference at least two sources (interview, work sample, reference) alongside the personality data. If scores are fed directly to hiring managers without recruiter mediation, the manager often anchors on the number. Keep the human-in-the-loop gate explicit: define who reviews flagged scores and what action options exist: advance, investigate further, or override with a documented reason.
What should a debrief look like when personality data is part of the evaluation?
Brief panelists on the trait labels before the debrief, not on the scores. Panelists who see a low conscientiousness score before they share observations will anchor on it even if their direct experience with the candidate contradicts it. Share scores after structured discussion and let panelists note whether the data matched or diverged from their in-room observations. A divergence is often diagnostic: either the test is not measuring what the interview is probing, or the candidate presented differently in a structured context. Pair the debrief format with your scorecard so each trait maps back to a named job competency, not a vague culture label.
How do you verify that a personality tool is actually predicting job success?
After closing 20 or more hires for the same role family, pull the personality scores and the manager performance ratings for those hires at the three-month and twelve-month mark. Calculate a simple correlation. If the trait scores do not correlate with your internal performance measure, the tool is not valid for your context regardless of the vendor's general sample. Also run the four-fifths calculation by group to detect any pass-rate drift since launch. Log model and assessment version for every run so future audits can trace score to instrument. Recruiting analytics tools and your own spreadsheet are enough for this check at small sample sizes.
Can AI tools surface personality insights from interviews without a questionnaire?
Some vendors now infer trait scores from video facial expressions, speech patterns, or interview transcript text without asking candidates to complete a validated questionnaire. The psychometric literature is sceptical: correlation between inferred and self-report Big Five scores is low in independent studies, and the inference method introduces bias against candidates with accents, neurodiverse communication styles, or slower speech rates. A tool claiming to measure personality from a video interview should provide an independent validity study from a peer-reviewed source, not internal vendor benchmarks. See AI bias audit for the questions to ask before deploying any AI scoring layer in your hiring funnel.
How do AI in recruiting workshops address personality test use?
Sessions treat personality data as a compliance topic as much as a sourcing or screening topic. Participants practice writing a vendor questionnaire covering what the instrument predicts, for which job families, and for which groups it was normed, then read sample technical manuals in pairs to distinguish criterion validity from face validity claims. The goal is to give recruiters and TA leads enough vocabulary to push back on a vendor, brief a sceptical legal team, or run a retrospective audit on a tool already in use. Join a workshop to work through live vendor evaluation, then continue the conversation in membership office hours.

← Back to AI glossary in practice