Assessment design in an AI world
AI does not make assessment impossible. It makes some older assumptions less secure. This page gives you practical strategies for designing assessments that remain valid, meaningful, and defensible — across disciplines.
The question worth asking first
Before redesigning any assessment, ask: what is this task actually measuring? If a student used AI on this task, what would be lost — and does that matter?
For some tasks, the answer is: very little. If the task is to produce a generic literature summary, AI can produce a reasonable version, and a student who wrote one from scratch was demonstrating little beyond the ability to summarise. That task may have needed redesigning before AI existed.
For other tasks, the answer is: everything. An oral presentation of original research, a clinical skills assessment, a studio critique, a live problem-solving session — these cannot be outsourced because they require the person to be present, thinking, and accountable.
So what does that mean? The starting point is not "how do I stop students using AI?" It is "what evidence of learning am I actually trying to capture, and how do I capture it in a way that AI cannot simply produce for them?"
Less secure and stronger evidence
Not all assessment types are equally vulnerable. The following gives you a practical map.
Four practical redesign strategies
These approaches work across disciplines — from humanities essays to lab reports to professional programme assessments. None requires abandoning existing assessment formats entirely.
- Add a process component. Ask students to submit not just the final product but evidence of the process — a research journal, annotated bibliography, draft with reflective commentary, or planning document. The process is much harder to outsource than the product. A student who cannot describe how they arrived at their argument has not done the intellectual work, regardless of whether AI was involved.
- Make it specific to your module. Tasks that require engagement with your actual teaching — specific lectures, seminars, your module's primary sources, discussions that happened in class — are much harder to shortcut with a general-purpose AI tool. The AI does not know what you taught. Your students do.
- Add a brief oral element. A five-minute follow-up conversation with the student about their submitted work (not a formal viva, just a short academic conversation) tells you more about their understanding than the written submission alone. Most students who have engaged with the work can explain it. Most who have not, cannot. This scales better than it sounds: even allowing ten minutes per student, including changeover, a cohort of thirty takes about five hours in total.
- Use staged submission. Break larger tasks into a sequence of checkpoints — a topic proposal, an outline, a partial draft, the final submission. Each stage is evidence. A student who produces a sophisticated final submission with no plausible development trail from their outline has a gap that is worth discussing.
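The scaling claim for the oral element can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original guidance: the function name `oral_followup_hours` and the option to share the cohort between several markers are illustrative assumptions.

```python
def oral_followup_hours(cohort_size: int,
                        minutes_per_student: float,
                        markers: int = 1) -> float:
    """Hours of contact time each marker needs for short oral follow-ups.

    Assumes (hypothetically) that students are split evenly between markers
    and that minutes_per_student already includes changeover time.
    """
    if cohort_size < 0 or minutes_per_student < 0 or markers < 1:
        raise ValueError("cohort_size and minutes must be >= 0, markers >= 1")
    total_minutes = cohort_size * minutes_per_student
    return total_minutes / 60 / markers


if __name__ == "__main__":
    # The case from the text: thirty students, ten minutes each, one marker.
    print(oral_followup_hours(30, 10))
    # A larger cohort shared between two markers.
    print(oral_followup_hours(60, 10, markers=2))
```

For thirty students at ten minutes each, the total is 300 minutes, or five hours for a single marker. Changing the arguments shows how the load grows with larger cohorts or falls when the conversations are shared across a teaching team.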
Assessment types by discipline — what works
Different disciplines have different conventions and different starting points. These are not prescriptions — they are starting points for thinking.
Humanities and social sciences
The conventional essay is the most vulnerable format. Moves that help: require engagement with primary sources taught in your module rather than general research; require a critical commentary on the student's own argument; add an in-class component of 20–30 minutes that links to the essay topic; or shift toward oral presentation with Q&A. Seminar participation assessed via a reflective log, written by the student about what they said and heard in seminars, is difficult to fabricate and high in value.
Science and engineering
Lab reports are partially vulnerable — AI can produce the discussion and conclusion sections from given data. The lab work itself, the data collection, and the lab notebook remain strong evidence. Require students to annotate their own data with decisions they made during collection. Problem sets completed in supervised conditions remain robust. Open-ended design problems where students must justify their choices orally are strong.
Professional programmes — nursing, education, law, engineering
Clinical placement evidence, professional practice portfolios, and supervised performance assessments are inherently robust — the student must demonstrate capability in person. Written reflective components are more vulnerable. The most effective approach is to anchor reflective writing explicitly to specific, documented events in the student's own practice — an AI tool cannot replicate what happened during a specific placement shift with a specific patient or client.
Arts, design, and creative programmes
The studio critique, portfolio review, and process portfolio are already strong formats — the work and its development are visible, the student must discuss and defend it in real time. Written components that explain creative decisions made in the student's own practice are more robust than general critical essays. AI use in creative work itself is a substantive disciplinary conversation worth having explicitly with your programme.
Using AI to improve your assessment design
This is one of the highest-value uses of AI for HE staff. You can use it to stress-test your existing assessments, generate rubric language, produce sample student responses at different grade levels, or draft marking criteria.