Product · Feb 2026 · 3:12

We built a hiring assessment ChatGPT can't cheat

Most assessments have right answers, which means AI can solve them. We built one where the right answer changes with every role. Here's how it works.


Key takeaways

  • The problem with most hiring assessments in 2026 is that they have right answers. Anything with a right answer can be solved by ChatGPT. The fix is not better proctoring, it is removing the right answers.
  • Truffle's personality assessment is scored against the recruiter's own preferences for the role. Hiring a customer success rep? You want high warmth and agreeableness. Hiring a QA engineer? You want high conscientiousness and low risk tolerance.
  • Those preferences are set by you, hidden from the candidate, and unique to each role. There is nothing to reverse-engineer.
  • The assessment uses validated Big Five personality research. The scoring is configured per-role. Candidates answer honestly and score against what you defined as a fit for that specific position.
  • This sits alongside one-way video interview responses, so the personality score is one layer of evidence next to how a candidate communicated and what they said about their experience. Layers of signal get harder to fake the deeper you go.

Every recruiter has the same problem in 2026. Polished resumes. Polished cover letters. Polished assessments. You can’t tell what’s real, and you can’t tell what came from the candidate versus what came from a model.

We took a different approach. Instead of building better detection, we asked a different question: what if the assessment didn’t have right answers? What if there was nothing for AI to optimize for?

Here is what we built.

Before the candidate ever sees the test, you configure it

This is Truffle’s personality assessment. Before a candidate takes it, the recruiter or hiring manager sets their preferences for the role. You define what matters.

  • Hiring a customer success rep? You probably want high warmth and high agreeableness.
  • Hiring a QA engineer? You probably want high conscientiousness and low risk tolerance.
  • Hiring a salesperson? You probably want high extraversion and high ambition.

These preferences are yours. Not a universal model. Not what some industrial psychologist decided a “good employee” looks like. What you told us actually matters for this specific hire.

The candidate sees the same test, every time

The candidate has no idea what configuration you set. They cannot see your preferences. They cannot reverse-engineer the scoring. There is no right answer to search for.

The questions themselves are based on validated Big Five personality research. The scoring behind them is configured by you for this specific role.

A candidate can answer completely honestly and score very high if their natural personality aligns with what you defined as a fit. Or they can answer honestly and score low. That is not a bad outcome. It just means this particular role is not a natural fit for them, which is exactly the signal you wanted.

There is no way to hack this with ChatGPT, because there is nothing to hack. The right answer changes with every role, every team, every hiring manager’s preferences.

What you see on the recruiter side

Each candidate gets a directional alignment score based on how closely their personality profile maps to the preferences you set. You can see where they are closely aligned and where there is a clear gap.
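To make the idea concrete, here is a minimal sketch of how a per-role directional alignment score could be computed. The trait names, weights, and formula are illustrative assumptions for this post, not Truffle's actual implementation — the point is only that the same honest answers produce different scores under different role configurations.

```python
# Hypothetical sketch of a per-role "directional alignment" score.
# Traits are scored 0-1; each role preference is (target, weight).

def alignment(candidate: dict, preferences: dict) -> float:
    """Weighted closeness of a candidate's trait profile to the
    recruiter's per-role targets, scaled to 0-100."""
    total_weight = sum(w for _, w in preferences.values())
    score = sum(
        w * (1 - abs(candidate[trait] - target))
        for trait, (target, w) in preferences.items()
    )
    return round(100 * score / total_weight, 1)

candidate = {"warmth": 0.9, "agreeableness": 0.8,
             "conscientiousness": 0.5, "risk_tolerance": 0.7}

# The same honest answers, two different role configurations:
cs_rep = {"warmth": (1.0, 2), "agreeableness": (1.0, 2)}              # both high
qa_eng = {"conscientiousness": (1.0, 2), "risk_tolerance": (0.0, 1)}  # high / low

print(alignment(candidate, cs_rep))   # → 85.0  (strong fit for the CS role)
print(alignment(candidate, qa_eng))   # → 43.3  (weaker fit for the QA role)
```

Because the preference vector is hidden and differs per role, a candidate optimizing answers for one configuration would only hurt their score against another.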

Crucially, this sits right next to their one-way video interview responses, work samples, and other screening data inside Truffle. You are not looking at a personality score in isolation. You are looking at it alongside:

  • How they communicated on camera
  • What they said about their experience
  • How their responses map against the criteria you defined for the role

That is the difference between a personality test that generates a number and a personality test that helps generate a decision.

The principle: layers of signal that get harder to fake

The AI application problem is not going away. Candidates will keep using AI to polish what they submit, and detection arms races are not winnable.

The answer is not to fight that. The answer is to add evidence that can’t be polished:

  • Personality assessments scored against your hidden criteria
  • One-way video interviews where someone has to actually show up and answer a question
  • Work samples that require explaining tradeoffs in their own words

Each layer is harder to fake than the last. By the third one, the candidate you are looking at is the candidate you would hire.

Try Truffle free for 7 days. No credit card.

Watch on YouTube

More on the Truffle YouTube channel.

Transcript


Every recruiter I talk to has the same problem right now. They’re getting polished, perfect answers on resumes, in cover letters, even in assessments, and they can’t tell what’s real. So, we asked a different question. What if the assessment didn’t have right answers? What if there was nothing for AI to actually optimize for? Let me show you what we’ve built.

This is Truffle’s personality assessment. Before a single candidate takes it, the recruiter or the hiring manager sets their preferences. You’re defining what matters for this role. Maybe you’re hiring a customer success rep, and you want someone who’s high on warmth and agreeableness. Or maybe it’s a QA engineer, and you want someone high on conscientiousness and low on risk tolerance. This is the key part. These preferences are yours. They’re not a universal model. They’re not what some psychologist decided a good employee looks like. They’re what you told us actually matters for this specific hire.

And here’s the thing. The candidate has no idea what the configuration is. They can’t see your preferences. They can’t reverse engineer the scoring. There is no right answer to search for.

All right, so here’s what the candidate sees. It looks and feels like a pretty straightforward personality assessment. The questions are based on validated Big Five personality research, but the scoring behind it is totally configured by you. A candidate can answer completely honestly and score really high if their natural personality aligns with what you’re looking for. Or they can answer honestly and score low, which isn’t necessarily a bad thing. It just means this particular role isn’t a natural fit for them. There’s no way to hack this with ChatGPT because there’s nothing to hack. The right answer changes with every role, every team, every hiring manager’s preferences.

Now, here’s what you see on the recruiter side. Each candidate gets a directional alignment based on how closely their personality profile aligns with the preferences that you set. You can see the breakdown where they’re closely aligned and where there’s a clear gap. This sits right next to their video interview responses and any other screening data that’s in Truffle. So, you’re not just seeing a personality score in isolation. You’re seeing it alongside how they communicated, what they said about their experience, and how all of it maps against what you told us actually matters. That’s the difference between a personality test that generates a number and one that helps to generate a decision.

The AI application problem isn’t going away. Candidates are going to keep using AI to polish everything they submit. The answer isn’t to fight that, it’s to add evidence that can’t be polished. Personality assessments scored against your hidden criteria, video interviews where someone has to actually show up and answer a question. This is layers of signal that get harder to fake the deeper you go. That’s what Truffle is building. The link is in the description if you want to give it a try.

Frequently asked questions

How can a hiring assessment be uncheatable by ChatGPT?
Remove the universal right answer. Most assessments score every candidate against the same model, so once ChatGPT learns the model, it solves every test. Truffle's personality assessment is scored against preferences the recruiter sets per role, and those preferences are hidden from the candidate. There is no single right answer to search for.
What makes Truffle's personality assessment different from a standard personality test?
A standard personality test scores candidates against a universal model (often someone else's idea of a "good employee"). Truffle's assessment is built on validated Big Five personality research, but the scoring is configured by the recruiter for each role. The same candidate might score high for one position and low for another, because the question is not "are they good," it is "do they fit this specific role."
Can candidates reverse-engineer the scoring?
No. The configuration is set by the recruiter and the candidate never sees it. They cannot tell whether you weighted warmth high or low for the role. There is nothing to reverse-engineer because the right answer for this position is not the right answer for the next one.
What if a candidate answers dishonestly?
They can. Personality assessments are a directional signal, not a polygraph. The point is they sit alongside other signals: one-way video interview responses, work samples, references. Truffle's recruiter view shows personality alignment next to how the candidate communicated and what they said about their experience. Each layer is harder to fake than the last.
Why not just block AI use during assessments?
Browser locks and proctoring make the test slower and more expensive without making it more accurate. The deeper fix is to design assessments where AI use does not give the candidate an advantage. If there is no right answer to look up, there is nothing for AI to optimize.

See it in Truffle

Replace 25 hours of phone screens with 25 minutes of Candidate Shorts. 7-day free trial, no credit card.
