The IMF Rates Its Technical Assistance. Why not its loans?

Thought Piece · MDB Reform Monitor · Institutional Accountability Series

Every major multilateral development bank rates its lending operations independently and publishes the results. The IMF evaluates its technical assistance by programme and publishes ratings. Its loans — disbursing tens of billions annually to countries in fiscal and balance-of-payments crisis — receive no equivalent independent outcome rating at all.

Parminder Brar | mdbreform.com | Institutional Accountability Series

The Core Argument

The IMF has already built a rating infrastructure for its technical assistance operations. It has not extended it to the lending programmes that define its institutional purpose. This is not a technical accident. It is a governance choice — and it is time to reverse it.

I. The Accountability Architecture the IMF Has Not Built

When the World Bank closes a project, its Independent Evaluation Group assigns an outcome rating on a six-point scale, from Highly Satisfactory down to Highly Unsatisfactory. That rating is published, attached to the operation in a public database, and disaggregated by sector, region, country, instrument, and year. The Asian Development Bank validates every project completion report, not a sample, and produces an independent rating. The IDB's Office of Evaluation and Oversight validates every project and publishes a management-evaluator comparison annually. This framework of independent assessment, discrete rating, and publication is a standard of the multilateral lending system.

The IMF stands apart. When a Stand-By Arrangement ends, when a Rapid Financing Instrument disbursement is made, when an Extended Fund Facility closes, the IMF produces no independent outcome rating. There is no six-point scale, no five-point scale, no binary satisfactory/unsatisfactory determination by an evaluator independent of the operational department that managed the programme. The loan closes. Staff write a concluding Article IV. The programme is absorbed into institutional memory.

The IMF rates its technical assistance by programme. It does not rate the loans that define its institutional purpose.

This is not a peripheral inconsistency. Stand-By Arrangements, Extended Fund Facility programmes, Rapid Financing Instrument disbursements, and Poverty Reduction and Growth Trust facilities determine fiscal trajectories, restructuring timelines, social spending floors, and exchange rate regimes in economies under stress. The stakes of programme quality are at least as high as in the MDB system — arguably higher, because the interventions are directly macroeconomic and immediately sovereign.

II. What the IMF Provides Instead — and Why It Falls Short

The IEO: Lessons Without Verdicts

The Independent Evaluation Office was established in 2001 to provide the evaluation function the IMF lacked. It is structurally independent of management, reports directly to the Board, and has produced a body of serious analytical work. Its evaluations of the Euro Area crisis, the 2010 SBA with Greece, the structural conditionality doctrine, and Fund surveillance have shaped institutional debates. But the IEO operates at the thematic and systemic level, not the operation level. Individual country programmes appear as case study evidence within broader thematic arguments. They are not individually rated, individually accountable, or individually retrievable in a public database.

The consequence is structural. A programme that disburses in full against conditionality inappropriate to the country context, produces no structural reform, and leaves the borrower worse positioned than at approval will appear in an IEO report as a data point illustrating a systemic pattern. The failure is acknowledged in aggregate. It is not attributed to the specific programme, the specific mission chief, or the specific Board decision to approve.

The Nigeria RFI: How Thematic Evaluation Conceals Individual Failure

The IMF's April 2020 Rapid Financing Instrument disbursement to Nigeria is instructive. At the time of Board approval, Nigeria had recorded approximately 288 confirmed COVID-19 deaths, placing it among the least affected large economies globally at that point. The public health justification was extraordinarily thin at the moment of approval.

The public financial management (PFM) risk was documented and known. Nigeria's Open Treasury Portal data, available to civil society in real time, recorded transfers to ghost worker accounts, irregular payroll entries, and procurement transactions inconsistent with emergency health expenditure during and after the disbursement period. The subsequent prosecution and conviction of the Accountant General of the Federation for PFM fraud confirmed that the control environment against which the Fund approved disbursement was materially deficient.

The RFI disbursed approximately US$3.4 billion under compressed Board review timelines, with conditionality lighter than a full SBA, relying on member representations rather than monitored structural benchmarks. None of this has resulted in an IEO operation-level review. The programme outcomes appear as lesson-generating evidence in thematic discussions. They do not appear as a rated, attributed, published assessment of whether the April 2020 Nigeria disbursement was satisfactory or unsatisfactory.

An accountability system that generalises every failure into institutional lessons and attributes no failure to any specific programme is not an accountability system. It is a lesson-laundering mechanism.

The Article IV: Vague by Design

Article IV consultations are the IMF's annual surveillance reports for each member. They are published, substantial, and read by markets and policymakers. But they do not constitute programme outcome evaluation. An Article IV produced twelve months after an RFI disbursement assesses the current macroeconomic outlook. It does not assess whether the RFI achieved its stated objectives, whether conditionality was appropriate, whether PFM risks materialised, or whether the disbursement improved the member's fiscal trajectory relative to any counterfactual.

Moreover, Article IV language is institutionally calibrated toward diplomatic understatement. Assessments are written by the same teams that managed the engagement, negotiated the conditionality, and recommended Board approval. The language that emerges — fiscal consolidation proceeded broadly in line with programme targets; structural reform implementation faced some delays — will rarely communicate the scale of a failure with the directness that an independent rating of Unsatisfactory would convey.

III. The IMF Already Has What It Needs: The TA Rating System

The most revealing fact in this accountability gap is that the IMF has already built a rating infrastructure for one category of its operations — and has not extended it to the other. The Technical Assistance Information Management System (TAIMS) assigns outcome ratings to IMF TA engagements across fiscal affairs, monetary and capital markets, statistics, and legal. TA projects receive ratings on relevance, effectiveness, and sustainability. The system is imperfect — independence from delivering departments is partial — but the conceptual infrastructure exists. The IMF knows how to build a rating scale, apply it to programme-level operations, and use it for institutional portfolio management.

Technical incapacity does not explain why the TAIMS rating infrastructure stops at technical assistance and does not extend to lending operations. Institutional incentives do. A finding that a tax administration TA in a small economy was partly effective carries modest reputational consequence. A finding that an EFF programme in a systemically important emerging market was Unsatisfactory carries immediate consequences for the country relationship, for the Board members who approved it, and for the mission chief who designed it. This is not an argument against lending ratings. It is an explanation of why institutional incentives have prevented them from being built.

IV. How the IMF Compares to Its Peer Institutions

Evaluation Architecture: Major Multilateral Finance Institutions. Sources: IEG Annual Report on Evaluation 2023; OVE Annual Validation Cycle 2022; ADB AER 2024; IMF IEO mandate documentation.

Accountability Dimension	World Bank (IEG)	ADB (IED)	IDB (OVE)	IMF (IEO)
Coverage	All lending ops	All operations	All operations	Selected ex-post only
Outcome rating produced	Yes — 6-point scale	Yes — 4-point scale	Yes — 4-point scale	No — narrative only
Published per operation	Yes	Yes	Yes	No
Evaluator independent of management	Yes	Yes	Yes	Partial (Board reporting)
Rating gap tracked publicly	Yes	Yes	Yes	Not applicable
TA rating system exists	Yes	Yes	Yes	Yes (TAIMS)
Lending rating system exists	Yes	Yes	Yes	No
Individual failure identifiable	Yes	Yes	Yes	No — absorbed in themes

The IEO is genuinely independent in its reporting line — unlike AfDB IDEV, which reports to management, the IEO reports to the Board. But independence of reporting line without operation-level coverage produces an accountability structure that is formally correct and substantively insufficient. You cannot have a management-evaluator rating gap if the evaluator does not produce ratings.

V. What a Transparent IMF Rating System Would Look Like

The design challenge is real but surmountable. IMF lending programmes are policy-based instruments, not infrastructure deliverables. But MDB policy-based lending faces exactly the same challenge, and it is rated. World Bank Development Policy Operations and IDB policy-based loans are assessed on outcome scales identical to those used for investment projects. The IMF's structural benchmarks and performance criteria, which already define programme success for waiver and review purposes, provide the natural basis for a rating framework.

Proposed Five-Point Scale for IMF Lending Programme Outcomes

Rating	Description	Criteria threshold
Highly Satisfactory	Programme objectives substantially achieved; conditionality appropriate to context; macroeconomic stability sustained beyond programme period	All core criteria met
Satisfactory	Objectives largely met; minor slippages in conditionality or timeline; stability maintained with caveats	Most core criteria met
Partly Satisfactory	Mixed results; structural reform objectives partially achieved; conditionality partially appropriate; sustainability uncertain	Some core criteria met
Unsatisfactory	Objectives not achieved; conditionality poorly calibrated; disbursement without reform delivery; member not better positioned at closure	Few or none met
Programme Failed	No objectives met; conditionality counterproductive; member materially worse positioned than at approval	None met
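
To make the scale above concrete, the following is a minimal Python sketch of how the share of core criteria met might map to a rating. The scale itself gives no numbers for "most", "some", or "few", so the two cut-offs below are purely illustrative assumptions, as are the function name and the `worse_positioned` flag, which is used here only to separate Programme Failed from Unsatisfactory when no criteria are met.

```python
from enum import Enum


class Rating(Enum):
    HIGHLY_SATISFACTORY = "Highly Satisfactory"
    SATISFACTORY = "Satisfactory"
    PARTLY_SATISFACTORY = "Partly Satisfactory"
    UNSATISFACTORY = "Unsatisfactory"
    PROGRAMME_FAILED = "Programme Failed"


# Hypothetical cut-offs: the proposed scale does not quantify "most" or "some".
MOST_THRESHOLD = 0.5   # strictly more than half of core criteria counts as "most"
SOME_THRESHOLD = 0.25  # at least a quarter of core criteria counts as "some"


def rate_programme(criteria_met: int, criteria_total: int,
                   worse_positioned: bool) -> Rating:
    """Map core criteria met to a rating on the proposed five-point scale.

    worse_positioned: evaluator's judgement (an assumed input) that the member
    is materially worse positioned at closure than at approval.
    """
    if criteria_total <= 0:
        raise ValueError("criteria_total must be positive")
    if not 0 <= criteria_met <= criteria_total:
        raise ValueError("criteria_met must be between 0 and criteria_total")

    share = criteria_met / criteria_total
    if criteria_met == criteria_total:
        return Rating.HIGHLY_SATISFACTORY
    if share > MOST_THRESHOLD:
        return Rating.SATISFACTORY
    if share >= SOME_THRESHOLD:
        return Rating.PARTLY_SATISFACTORY
    if criteria_met == 0 and worse_positioned:
        return Rating.PROGRAMME_FAILED
    return Rating.UNSATISFACTORY
```

A real framework would rest on evaluator judgement rather than a mechanical count. The sketch only shows that the five categories can be given non-overlapping definitions.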

Three further dimensions should be rated independently: (1) Design Quality: whether conditionality was appropriate to the country's institutional capacity at approval; (2) PFM Risk Management: whether the Fund adequately assessed fiduciary risk in the disbursement environment; (3) Sustainability: whether the member's fiscal and external position was more or less stable two years post-closure. These ratings should be produced by the IEO, published per operation, and disaggregatable by instrument type, region, and income category.

VI. The Objections — and Why They Fail

Objection 1

“IMF programmes are too politically sensitive to rate”

This proves too much. World Bank DPOs involve civil service restructuring, subsidy removal, and banking sector consolidation in economies under stress, and they are rated. IDB policy-based loans cover the most politically volatile reform contexts in Latin America, and they are rated. The IMF's claim to exceptionalism on sensitivity grounds is a preference for non-accountability dressed as methodological humility.

Objection 2

“IEO thematic evaluations are sufficient”

Thematic evaluations provide institutional learning at the systemic level. They do not provide operational accountability at the programme level. The distinction matters because operational accountability calibrates incentives at the decision-making level — the mission chief, the area department, the Board member who approved. When accountability is exclusively systemic, responsible agents are diffused into institutional patterns. No one approves a bad programme; there are only recurrent patterns of suboptimal design. This diffusion is the accountability gap. Thematic evaluation cannot close it by definition.

Objection 3

“The IMF membership would not accept it”

This is the most honest of the objections. Borrowing members have an obvious interest in programme failures not being rated and published. But the appropriate constituency for accountability architecture design is not the borrowing membership alone. The United States, United Kingdom, Germany, Japan, and other major quota contributors have provided the capital that enables IMF lending. They are entitled to know whether that capital is deployed in programmes that achieve stated objectives. A system that protects programme confidentiality at the cost of shareholder accountability has resolved the governance tension in the wrong direction.

VII. Three Reforms, No New Architecture Required

Reform 1

Mandatory IEO programme-level review above a de minimis threshold

Every SBA, EFF, ECF, and RFI disbursement above SDR 100 million should receive an IEO outcome review within 24 months of programme closure, producing a rating on the five-point scale above.
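
The rule in Reform 1 is simple enough to state as a short check. The sketch below is one possible reading, in Python: it assumes amounts are expressed in SDR millions, treats "above SDR 100 million" as strictly greater than 100, and treats "within 24 months" as two calendar years after closure. The function and constant names are hypothetical.

```python
from datetime import date
from typing import Optional

COVERED_INSTRUMENTS = {"SBA", "EFF", "ECF", "RFI"}  # instruments named in Reform 1
THRESHOLD_SDR_MILLION = 100.0                       # de minimis threshold
REVIEW_WINDOW_YEARS = 2                             # 24 months after closure


def review_deadline(instrument: str, disbursed_sdr_million: float,
                    closure: date) -> Optional[date]:
    """Return the date by which an IEO outcome review is due, or None if the
    arrangement falls outside the mandatory-review rule."""
    if instrument not in COVERED_INSTRUMENTS:
        return None
    if disbursed_sdr_million <= THRESHOLD_SDR_MILLION:
        return None
    target_year = closure.year + REVIEW_WINDOW_YEARS
    try:
        return closure.replace(year=target_year)
    except ValueError:
        # Closure on 29 February and the target year is not a leap year.
        return closure.replace(year=target_year, day=28)
```

For example, under these assumptions a hypothetical RFI of SDR 500 million closing on 30 April 2020 would fall due for review by 30 April 2022.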

Reform 2

Extension of the TAIMS rating methodology to lending operations

The IMF has already built the methodological infrastructure for outcome rating in its TA system. Relevance, effectiveness, and sustainability translate directly to lending programme evaluation. The extension requires institutional will, not technical development.

Reform 3

A public programme outcome database structured to match the World Bank Project Portal

Each closed IMF lending arrangement should carry a public record: programme identifier, approval and closure dates, total disbursement, key structural benchmarks, IEO outcome rating, and a link to the full IEO review. The database should be disaggregatable by instrument type, region, income group, and rating, and each record should be added within 24 months of programme closure.
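
As an illustration of how little structure such a record needs, here is a minimal Python sketch. The field names, the `count_by` helper, and the two sample entries are hypothetical placeholders, not real programmes or an existing IMF schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProgrammeOutcome:
    programme_id: str
    instrument: str                # e.g. "SBA", "EFF", "ECF", "RFI"
    region: str
    income_group: str
    approval_date: date
    closure_date: date
    total_disbursement_sdr_m: float
    structural_benchmarks: tuple[str, ...]
    outcome_rating: str            # rating on the proposed five-point scale
    review_url: str                # link to the full IEO review


def count_by(records, attribute):
    """Count records by (attribute value, outcome rating) for disaggregation."""
    return Counter((getattr(r, attribute), r.outcome_rating) for r in records)


# Placeholder entries for illustration only; they describe no real programme.
records = [
    ProgrammeOutcome("EXAMPLE-001", "RFI", "Region A", "Lower middle income",
                     date(2020, 1, 15), date(2020, 6, 30), 500.0,
                     ("Publish procurement contracts",),
                     "Unsatisfactory", "https://example.org/reviews/EXAMPLE-001"),
    ProgrammeOutcome("EXAMPLE-002", "EFF", "Region B", "Upper middle income",
                     date(2018, 3, 1), date(2021, 3, 1), 1200.0,
                     ("Adopt fiscal rule", "Audit state enterprises"),
                     "Satisfactory", "https://example.org/reviews/EXAMPLE-002"),
]

by_instrument = count_by(records, "instrument")
by_region = count_by(records, "region")
```

The same helper works for any of the disaggregation dimensions listed above, which is the point: once ratings exist per operation, portfolio-level views follow with very little extra machinery.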

None of these reforms requires a quota review, a governance restructuring, or a change to the Articles of Agreement. They require a Board decision to extend the IEO’s existing mandate from selective thematic evaluation to systematic operation-level review. That decision has not been taken. It should be.

Conclusion

The Nigeria RFI is a specific case — not an isolated one. Emergency lending under compressed timelines, to countries with documented PFM weaknesses, against light conditionality frameworks, in crisis environments where political pressure to disburse overwhelms the analytical case for withholding — this is a structural vulnerability in the IMF operating model, not an occasional aberration. The accountability system should identify, rate, and publish these outcomes at the operation level.

Instead, the system generalises them. The Nigeria RFI becomes a data point on emergency lending governance. The Greece SBA becomes a case study in programme design in currency unions. The Argentina EFF illustrates balance-of-payments conditionality challenges. The lessons are absorbed. The verdicts are never delivered.

The World Bank tells you whether its projects worked. The IMF does not. This is not a technical accident. It is a governance choice — and it is time to reverse it.

Member governments that contribute capital to the IMF, civil society organisations that monitor conditionality in borrowing countries, and the academic community that evaluates stabilisation programme effectiveness have operated for decades without access to a simple, authoritative answer to the most basic accountability question: did this programme work? The IMF has the institutional infrastructure to provide that answer. It has chosen not to. That choice should no longer be accepted as a permanent feature of the international monetary system.
