Written by Ted Kinney, Vice President, Research and Development
As technology continues to advance, new tools and resources become available to us. One tool that has recently gained a lot of attention is ChatGPT. This AI-driven tool generates relevant, human-like responses to the questions it is asked, drawing on the vast body of text it was trained on. We're always excited about new technology, which brings a wealth of opportunities to the assessment industry. At the same time, we see concerns about the potential negative impacts AI-driven chatbots can have on assessments, such as cheating. While ChatGPT has been used successfully to cheat on some talent assessments, it is important to understand its limitations and potential impact on the assessment industry. Let's look at its impact on different assessment types.
First, although it might look like ChatGPT is "figuring out" the answers to questions, it is not. It rapidly generates content that is relevant to the question asked rather than reasoning through it. As a result, it is most effective at giving guidance on assessments built around a single item type: conventional, text-based, multiple-choice questions with right and wrong answers. Furthermore, as of now, ChatGPT can only process text-based content. Even as it evolves to consume more complex item types, it will still struggle with items that collect trace data or with interactive measures of information processing (e.g., multitasking, working memory).
When it comes to personality assessments, ChatGPT provides good advice when participants ask how to respond to items. While the exact wording varies each time, the engine typically responds with something like “As an AI engine, I am not designed to help cheat on assessments.” The responses typically include language like “You should read the item and answer honestly about what best characterises who you are.” This is good news because ChatGPT is actually providing the type of response that assessment vendors and clients would want it to provide.
When it comes to situational judgment tests (SJTs), ChatGPT is surprisingly good at creating content, but not as good at responding to it. Over time, it may get better at simple formats that ask respondents to pick the single best response option. But in more complex formats where participants are asked to rate every response individually, as is the case with most of Talogy's SJT content, ChatGPT is less likely to "figure out" the best response option. This is particularly true given that our SJT content uses an ideal point-scoring approach, where the best response might fall anywhere on the rating scale continuum.
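To illustrate why ideal point scoring frustrates a chatbot's "pick the strongest-sounding answer" strategy, here is a minimal, hypothetical sketch (the function name and scale are illustrative, not Talogy's actual scoring method). Each response option has an ideal point somewhere on the rating scale, and a participant's credit falls off with distance from that point, so choosing the most extreme rating is not reliably the best move.

```python
# Hypothetical sketch of ideal point scoring on a 1-5 rating scale.
# Each SJT response option has an "ideal point" that may sit anywhere
# on the scale; credit decreases linearly with distance from it.

def ideal_point_score(rating: int, ideal: float, max_distance: float = 4.0) -> float:
    """Return a score in [0, 1] for one rated response.

    1.0 when the rating lands exactly on the option's ideal point,
    falling off linearly with distance; max_distance is the widest
    possible gap on the scale (4 for a 1-5 scale).
    """
    distance = abs(rating - ideal)
    return max(0.0, 1.0 - distance / max_distance)

# Suppose one response option is only "moderately effective", so its
# ideal point is mid-scale (3.0). Rating it 3 earns full credit, while
# the extreme rating of 5 earns only partial credit.
print(ideal_point_score(3, ideal=3.0))  # 1.0
print(ideal_point_score(5, ideal=3.0))  # 0.5
```

The design point is that the scoring key is not visible in the item text: without knowing where each option's ideal point sits, a tool that simply endorses the most agreeable-sounding answers gains little.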
Many of the interventions used for general test security, such as random item presentation, alternate forms, ideal point scoring, mixed methods of measurement, and simulation, are the same interventions that safeguard against ChatGPT. Our test security measures are designed so that even a "smart roommate" like ChatGPT can't help candidates cheat. We have refined these interventions over the past decade, and we are confident they will work just as well against ChatGPT. Therefore, while participants may occasionally attempt to use ChatGPT when responding to assessment content in the future, these test-taking strategies are unlikely to give them much of an advantage.
Test security has always been important to us and a strong focus of our work. That will continue, and we are confident that our commitment to continuous innovation will keep us ahead of a smart AI machine like ChatGPT.