Industry · 2026-06-26 · 8 min read

UPSC Chairman Says Future Exams Must Test What AI Cannot: The Case for Rigorous Subjective Evaluation

Dr. Ajay Kumar, UPSC Chairman, says future examinations must emphasise skills that AI cannot replicate. This shifts the burden squarely onto robust subjective evaluation infrastructure — exactly what India's universities are still struggling to build.

A Watershed Statement from India's Top Examiner

In an interview published in The Week magazine on May 30, 2026, Dr. Ajay Kumar — Chairman of the Union Public Service Commission, India's most consequential examination body — made a statement that every university examination controller, registrar, and vice chancellor in the country should read carefully.

"The future of exams will emphasise skills that AI cannot easily replicate," he said. "Critical thinking, creativity, interpretation and problem-solving."

The implications of this statement extend far beyond civil services examinations. They reframe the entire debate about what India's higher education assessment infrastructure needs to do — and why investment in rigorous subjective evaluation systems is not optional.

The AI Paradox in Examination Design

Generative AI has created a paradox for examination designers. The same tools that can instantly produce structured, grammatically correct responses to factual questions make those question types increasingly unreliable as tests of student knowledge. Multiple-choice questions, short-answer recall tests, and fill-in-the-blank formats — the staple of India's high-volume university examination system — can be answered with reasonable accuracy by AI tools accessible to any student with a smartphone.

The logical response, as Dr. Kumar outlines, is to shift assessment toward higher-order cognitive skills: analysis, synthesis, evaluation, and creative application. These are precisely the skills tested through well-designed essay questions, case study analysis, research project evaluation, and problem-solving tasks that require demonstrating reasoning, not just recalling information.

The challenge is that these AI-proof assessment formats are subjective. They cannot be evaluated by optical mark recognition (OMR) scanners or automated scoring systems. They require trained human evaluators to read, interpret, and award marks based on the quality of reasoning displayed in a handwritten or typed response.

And that is where India's university examination system faces its most significant infrastructure gap.

The Subjective Evaluation Bottleneck

India's affiliated university system evaluates hundreds of millions of subjective answer scripts every year. A single affiliating university may oversee examinations for 200,000–400,000 students across hundreds of affiliated colleges, with each student writing 6–8 papers per academic year containing multiple long-form questions.

This scale, combined with tight result declaration timelines driven by admission deadlines and academic calendar pressures, creates conditions that systematically undermine evaluation quality:

Single evaluator per script in most institutions, eliminating the consistency check that double valuation provides

No mechanism for identifying outlier marking patterns among evaluators in real time

Physical answer scripts vulnerable to loss, damage, and misrouting during transport

No granular digital record of individual question marks until the final tabulation

Evaluator fatigue effects across high-volume marking sessions that are invisible in final results data

The result is an evaluation infrastructure that is structurally ill-equipped to reliably assess the very skills that Dr. Kumar says the future of examinations must emphasise.
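To make the "outlier marking patterns" gap in the list above concrete, here is a minimal sketch of how evaluators with unusually lenient or harsh averages could be flagged. It assumes every evaluator marks the same question on the same scale, and it uses the modified z-score (median and median absolute deviation) as one reasonable choice of statistic. The function name, the data layout, and the 3.5 cut-off are illustrative assumptions, not a description of any deployed system.

```python
from statistics import median


def flag_outlier_evaluators(marks_by_evaluator, threshold=3.5):
    """Return evaluators whose average mark sits far from the pool.

    Uses the modified z-score, built on the median and the median absolute
    deviation (MAD), so one extreme evaluator does not distort the baseline.
    """
    # Average mark awarded by each evaluator (evaluators with no marks are skipped).
    means = {ev: sum(m) / len(m) for ev, m in marks_by_evaluator.items() if m}
    centre = median(means.values())
    mad = median(abs(v - centre) for v in means.values())
    if mad == 0:
        # More than half the evaluators share the same average;
        # this simple sketch does not attempt to flag anyone in that case.
        return []
    flagged = []
    for ev, v in means.items():
        score = 0.6745 * (v - centre) / mad  # modified z-score
        if abs(score) > threshold:
            flagged.append(ev)
    return sorted(flagged)


# Hypothetical marks (out of 10) awarded by five evaluators on one question.
marks = {
    "E1": [6, 7, 5, 6],
    "E2": [5, 6, 6, 7],
    "E3": [6, 6, 7, 6],
    "E4": [7, 6, 5, 5],
    "E5": [10, 9, 10, 10],
}
print(flag_outlier_evaluators(marks))  # prints ['E5']
```

A flag of this kind is a prompt for a moderator to review the evaluator's scripts, not proof of mis-marking: an evaluator may simply have received a stronger batch.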

What Digital Evaluation Makes Possible

The architecture of digital on-screen marking directly addresses each of the bottlenecks in subjective assessment at scale:

Assessment Challenge	Digital Evaluation Solution
Single-point evaluation failure	Mandatory double valuation on the same script image
Evaluator bias (marking own students)	Evaluator anonymity by design: the evaluator never knows which student or institution wrote the script
Undetected outlier marking	Real-time marking analytics flagging statistical anomalies across questions
Physical script loss or tampering	Scanned images stored with tamper-evident checksums
No per-question audit trail	Complete question-by-question marking history accessible for re-evaluation
Speed vs. quality tradeoff	Parallel evaluation by multiple evaluators compresses timelines without compromising rigor
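As an illustration of the double valuation row above, the sketch below reconciles two independent marks for a single answer. The rule it applies (average the two marks when they agree within a tolerance, otherwise escalate to a third evaluator) and the 15% tolerance are assumptions chosen for the example; each university sets its own moderation rules.

```python
def reconcile(first, second, max_marks, tolerance=0.15):
    """Combine two independent marks for one answer.

    Returns (final_mark, needs_third_evaluator). When the marks differ by
    more than `tolerance` (as a fraction of `max_marks`), no final mark is
    awarded and the answer is escalated for a third valuation.
    """
    if not (0 <= first <= max_marks and 0 <= second <= max_marks):
        raise ValueError("mark outside the range 0..max_marks")
    if abs(first - second) > tolerance * max_marks:
        return None, True
    return (first + second) / 2, False


# Hypothetical 20-mark essay question.
print(reconcile(12, 13, 20))  # prints (12.5, False)
print(reconcile(8, 15, 20))   # prints (None, True)
```

Because both evaluators mark the same scanned image, the comparison can run the moment the second mark is entered, rather than after tabulation.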

The NBA's outcome-based accreditation framework for engineering and technical education already requires institutions to demonstrate attainment of Course Outcomes and Program Outcomes — a fundamentally analytical, subjective assessment process. Institutions pursuing NBA accreditation must produce evidence that each student's analytical and design capabilities have been rigorously and consistently assessed. That evidence cannot be generated by manual single-valuation systems that leave no granular per-question record.

NAAC's Criterion 2 similarly requires evidence of the quality of examination and evaluation processes, including mechanisms to address student grievances and the fairness of assessment. A digital evaluation system with complete audit trails — showing which evaluator marked which question, when, and for how much — is a significantly stronger evidence base than folders of manually processed marks and handwritten award lists.
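A rough sketch of what such an audit record could look like follows. Each entry captures which evaluator marked which question, when, and for how much, together with a SHA-256 checksum of the script image that was marked. The field names and helper functions are hypothetical, and a production system would also need to protect the log itself (for example with access controls or append-only storage), which this sketch does not attempt.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


def script_checksum(image_bytes: bytes) -> str:
    """SHA-256 digest of the scanned script image."""
    return hashlib.sha256(image_bytes).hexdigest()


@dataclass(frozen=True)
class MarkEntry:
    script_id: str
    question: str
    evaluator_id: str
    marks: float
    marked_at: str      # UTC timestamp in ISO 8601 format
    script_sha256: str  # checksum of the image the evaluator marked


def record_mark(log, script_id, question, evaluator_id, marks, image_bytes):
    """Append one per-question marking entry to the log and return it."""
    entry = MarkEntry(
        script_id=script_id,
        question=question,
        evaluator_id=evaluator_id,
        marks=marks,
        marked_at=datetime.now(timezone.utc).isoformat(),
        script_sha256=script_checksum(image_bytes),
    )
    log.append(entry)
    return entry


def image_unchanged(entry, image_bytes):
    """True if the stored image still matches the checksum taken at marking time."""
    return entry.script_sha256 == script_checksum(image_bytes)


def export_log(log):
    """Serialise the log as one JSON object per line."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in log)


# Hypothetical usage with placeholder bytes standing in for a scanned page.
log = []
entry = record_mark(log, "S-001", "Q3", "EV-17", 7.5, b"scanned page bytes")
print(image_unchanged(entry, b"scanned page bytes"))           # prints True
print(image_unchanged(entry, b"scanned page bytes, altered"))  # prints False
```

During a re-evaluation request, entries like these let an examination office show exactly what was marked and confirm that the image on file is the one the evaluator saw.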

The Assessment Formats India Must Move Beyond

Dr. Kumar's statement implies a clear taxonomy of what examination systems should evolve away from, and toward:

AI-Replaceable Assessment Formats

Multiple-choice questions testing factual recall

Short-answer definitions that follow predictable templates

Fill-in-the-blank exercises on memorized content

Numerical problems with algorithmic solutions

AI-Proof Assessment Formats (Subjective, Analytical)

Essay questions requiring original argument construction

Case study analysis requiring situation-specific reasoning

Design or problem-solving tasks with open-ended constraints

Research project evaluation involving literature synthesis

The irony is that India's examination system has historically used an AI-proof format, the long-form subjective answer, as its primary mode of assessment. The infrastructure to reliably evaluate those answers at scale, however, has not kept pace with the growth of higher education enrollment.

The 2026 Examination Season's Most Important Lesson

The CBSE on-screen marking (OSM) controversy of May 2026 and the NEET paper leak are, at their core, failures of examination infrastructure. But they contain a less-discussed lesson: both failures occurred in assessments that were predominantly objective or semi-objective in nature.

NEET tests recall and pattern recognition across biology, physics, and chemistry — domains where AI tutoring tools can simulate thousands of question patterns. CBSE's Class 12 examinations, while they include long-form answers, are evaluated under conditions that have historically prioritized volume throughput over analytical rigor.

If India's universities respond to the 2026 crisis by doubling down on objective testing as a perceived "safe" alternative to subjective evaluation, they will move in precisely the wrong direction. The right response is to invest in the infrastructure that makes rigorous subjective evaluation reliable, transparent, and scalable — the model that produces the kind of evidence NAAC, NBA, and increasingly NIRF ranking frameworks are beginning to require.

What University Leaders Must Prioritize

University administrators reviewing examination system investments should consider a framework that aligns with Dr. Kumar's direction of travel:

Redesign assessment blueprints to increase the proportion of analytical, interpretive, and creative questions, not only as a response to AI but because these formats test graduate capabilities that employers and regulators value.

Invest in the evaluation infrastructure that can handle subjective marking with the integrity those question types require — double valuation, evaluator anonymity, per-question audit trails, and real-time outlier detection.

Produce machine-readable evidence of learning outcomes that demonstrates to NAAC and NBA peer teams that each student's analytical growth has been measured rigorously, not just recorded numerically.
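As one possible reading of "machine-readable evidence", the sketch below turns per-question marks into a Course Outcome attainment summary that can be exported as JSON. It assumes a threshold-style calculation (the share of students scoring at least a target fraction of the marks mapped to each outcome), which is a common convention in outcome-based education. The 60% target, the question-to-outcome mapping, and the function name are illustrative assumptions and are not taken from NBA or NAAC documentation.

```python
import json


def co_attainment(student_marks, question_map, target=0.6):
    """Share of students reaching `target` of the available marks per outcome.

    student_marks: {student_id: {question: marks_awarded}}
    question_map:  {question: (course_outcome, max_marks)}
    """
    if not student_marks:
        raise ValueError("no students to evaluate")
    # Total marks available for each course outcome.
    available = {}
    for co, max_marks in question_map.values():
        available[co] = available.get(co, 0) + max_marks
    reached = {co: 0 for co in available}
    for marks in student_marks.values():
        # Marks this student scored on the questions mapped to each outcome.
        scored = {co: 0 for co in available}
        for question, (co, _) in question_map.items():
            scored[co] += marks.get(question, 0)  # unanswered counts as zero
        for co in available:
            if scored[co] >= target * available[co]:
                reached[co] += 1
    n = len(student_marks)
    return {co: round(reached[co] / n, 3) for co in sorted(reached)}


# Hypothetical paper: Q1 and Q2 map to CO1, Q3 maps to CO2.
question_map = {"Q1": ("CO1", 10), "Q2": ("CO1", 10), "Q3": ("CO2", 20)}
student_marks = {
    "A": {"Q1": 8, "Q2": 6, "Q3": 10},
    "B": {"Q1": 4, "Q2": 5, "Q3": 15},
    "C": {"Q1": 7, "Q2": 7, "Q3": 11},
    "D": {"Q1": 9, "Q2": 9, "Q3": 18},
}
print(json.dumps(co_attainment(student_marks, question_map)))
# prints {"CO1": 0.75, "CO2": 0.5}
```

A summary like this is only possible when marks are captured per question; a single total per paper cannot be broken down by outcome after the fact.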

Dr. Kumar's statement is a signal, not just an observation. India's most authoritative examination body is telling the country's universities that the assessment paradigm is shifting. The institutions that build their evaluation infrastructure now, before the shift becomes a mandate, will be best positioned to demonstrate that their assessments are rigorous.