Opening Lecture for
“The Impact on Academic Institutions of Research Evaluation Systems”,
Confederation of European Union Rectors Conference,
LNEC, Lisbon – 23-24 June 2000.


EVALUATION CULTURES AND THEIR POLITICAL IMPACT

Luis T. Magalhães
President of the Fundação para a Ciência e a Tecnologia


EVALUATION CULTURES ARE INSTRUMENTAL FOR ENHANCING COMPETITIVE ADVANTAGE AND FOR ASSURING POLITICAL SUPPORT

Some of the major social and economic trends of the present day are:

These four major trends are strongly interrelated and reinforce each other in multiple ways. All of them are connected to the building of evaluation cultures and, at the same time, it is because of them that a deeply rooted evaluation culture becomes a major competitive advantage in the present world.

A more participatory democracy, wider access to information, and more open comparisons on a global scale have brought the political need for improved accountability, effectiveness and efficiency in government expenditure.

After all, what is at stake is how taxpayers' money is used and what benefits result from its use. In fact, in the present circumstances the requirement for accountability can only be expected to grow.

Governments need evaluations for several different purposes: optimizing budget allocations, re-orienting policies, rationalizing organizations, redefining institutional missions, augmenting productivity, and so on.

Better communication and absolute transparency of evaluation results, not only for specialized audiences but also for the general public, are now mandatory requirements. Policies have to be rooted in a well-informed and supportive public base in order to be sustainable under increasing public scrutiny.

These are, in my view, the main reasons for the importance of an evaluation culture as an instrument for enhancing competitive advantage and for assuring political support for policies and for institutions.



EVALUATION IS CRITICAL FOR THE DEVELOPMENT OF A SOUND SCIENCE AND TECHNOLOGY SYSTEM AND OF THE RESEARCH FUNCTION OF UNIVERSITIES

The capacity to create, disseminate and use knowledge and information is increasingly the main factor in economic growth and in the improvement of the quality of life. For this reason, the science and technology system assumes a structural role of fundamental importance for economic and social progress. In each country it appears as a basic infrastructure for the knowledge-based economy and society.

The quality of human resources is the main factor underlying the invention and diffusion of knowledge and technology. The training of researchers and other skilled personnel must necessarily be supported by the scientific system, even where technical training is concerned, such as that of, say, engineers.

The size and quality of the science and technology system, and its capacity to foster the emergence of new, up-to-date leadership, are the essential elements for a permanent renewal of teaching and training at the forefront of knowledge and technical skills. This is particularly true for the part of the science and technology system directly related to higher education institutions, and especially to universities.

In fact, the science and technology system performs a central role in stimulating creativity, the use of knowledge, innovation, modernization, continuous updating of skills, and entrepreneurship, which are increasingly critical for the knowledge-based economy and society.

This line of reasoning can of course be traced back to the concept of research-based training and comprehensive humanistic education developed by Wilhelm von Humboldt in 1810 for the University of Berlin. The American graduate school was built on this concept focusing on research and higher learning, a development that took place mostly from the second half of the 19th century to the 1930s.

But this orientation became more clearly visible in the modern university, which came into being mostly during the second half of the 20th century.

To the classical model of von Humboldt, a market-oriented model was added. Students came to be seen as consumers or customers seeking competencies or skills certified through diplomas, while universities compete with each other to satisfy this demand and attract the best students.

Governments and industry increasingly appeared as customers of research, in particular of research projects delivered within a specified time frame and at competitive quality and prices. Contract-based research funding was adopted, and universities compete to satisfy this demand for research contracts.

This trend was made clear in the USA shortly after World War II, with the creation of the National Science Foundation (NSF) in 1950. The NSF was based on the concepts advanced by Vannevar Bush in the 1945 report “Science: The Endless Frontier”, which called for a government agency that would support only the best research in colleges, universities and research institutes. The early NSF procedures were built on the project evaluation methods of contract research, which had been pioneered by the Office of Naval Research under the leadership of Alan Waterman, who was appointed the first NSF Director in 1951.

To teaching and research, a third mission of the university was added: service to the community in the form of lifelong education, technology transfer to the business sector, advice on policy decisions, and contribution to regional development strategies.

Institutional competition is a clear feature of this market-oriented approach: to function effectively as a market, university education needs to have a wide range of providers, and consumers must be able to make well-informed choices and, in fact, to switch “suppliers”.

In this context, regular independent evaluations with results widely disseminated become absolutely essential to the science and technology system and, in particular, to the research function of universities.



EVALUATIONS OF THE SCIENCE AND TECHNOLOGY SYSTEM HAVE A SPECIAL ROLE IN FOSTERING WIDE EVALUATION CULTURES

No system can be efficiently run without permanent observation and checking of its results and outcomes. We have seen that evaluation cultures are instrumental for enhancing competitive advantages and for assuring political support for public policies. And this happens in all public policy areas.

However, we have also seen that, due to the very special characteristics of science and technology in advancing the frontiers of knowledge, evaluation methodologies were pioneered by the science and technology system and have been operational in advanced countries for several decades.

This means that research evaluation systems, in particular those for research at universities, have the political responsibility of contributing to foster a wide evaluation culture encompassing all fields of public policy.

This is, by itself, a very important reason to run clear and widely visible evaluations whose results and procedures are transparently disclosed and subjected to public scrutiny and wide debate.



THE PRINCIPLES OF RESEARCH EVALUATION PRACTICE


Research evaluation is best done by independent international peer review

A first thought on evaluation practices is clearly conveyed in a 1670 statement by Blaise Pascal: “It is not permitted to the most equitable of men to be a judge in his own cause.”

Thus evaluations have to be run by independent observers who, of course, need to be experts in the fields being evaluated. Independent peer review is therefore an inescapable ingredient of research evaluations.

In small countries this leads directly to the need to involve foreign experts in evaluations, but there are other advantages of international evaluations which are universal:

First, assessments have to be made against international best practices.

Second, in a globalized setting, internationalization of research is critical; there is no better way to emphasize an internationalization policy than simply doing international evaluations — the message is very clear.

Third, international networking is critical for scientific and technological development; exposing one's research to international evaluation brings natural visibility to competencies around which further collaboration links can be established, and word about them then spreads very easily, in particular through the evaluators themselves.

Fourth, evaluation results more easily obtain international recognition and credibility if the evaluators are foreign international experts, as comparison references are then easier to establish.

Fifth, as an evaluation is a check on the system that is later used through a complex mediated process, there are more advantages than disadvantages in having it done by detached outside experts, who are more likely to detect established undesirable practices or the rise of new opportunities and leaderships, and to voice them irrespective of institutional conservatism.


The reasons for conducting evaluations are varied and involve improvement of activities, accountability and “market” information

There are several reasons for conducting an evaluation, among them the improvement of activities, accountability, and the provision of “market” information.

Most good evaluations nowadays involve all of these aspects in some form. Therefore, they take into consideration not only the proposed activities but also the way they are organized and the track record of results and outcomes of the performers. Evaluators are also increasingly asked directly for their advice on specific funding decisions and on policy change. Transparent disclosure of evaluation procedures and results is also now seen as mandatory for good evaluation practice.

In my opinion, useful evaluations — those that make a difference for the evaluated and for the system performance — always require consequences in funding decisions.

When an evaluation involves a whole system, like a comprehensive evaluation of research in universities, it also requires a simple ranking system, in order to provide clear messages for institutional change and to set a clearly identifiable relief on the institutional landscape that can be easily understood by prospective “customers” — students, their families, businesses, public agencies, governments.

The publication of textual evaluation reports alone, in a large setting, will provide very little practical guidance for outsiders and will convey a sense that everything is essentially the same and requires major improvement, since a good textual evaluation must always concentrate on criticisms that may contribute to improvements.

In other words, I believe that the caution some evaluation systems show in avoiding rankings, for fear that they will be misunderstood and overshadow more important observations, works against the whole system by making it difficult for outsiders to identify clear signs of difference.


Quality and originality should have precedence over quantity in research evaluation

There is no single best way of performing research evaluations. In particular, the choice between qualitative and quantitative evaluations has been a subject of long controversy, but the conventional wisdom nowadays is that each has different strengths and weaknesses, and that a mixed approach is advisable.

Nothing is better than allowing the evaluators to obtain a diversely informed understanding of the situation complemented with direct observation and interpersonal interaction, and to rely on their expertise to arrive at a balanced assessment and recommendations. In any case, originality and quality should always be given precedence over quantity in criteria for performance evaluation and the allocation of resources.

Oversimplifications in this matter can lead to quite unacceptable results. For instance, relying on a long set of criteria to be marked, and on weighted averages of these marks to mechanically arrive at a final grade, has been found to lead to bad results. Another example is that bibliometric data can be informative in research evaluations, in particular at the institutional or wider level, but relying on them automatically can lead to serious mistakes and even have undesirable consequences for the quality of publications and for scientific ethics (as reported, for instance, in 1998 by the German funding agency DFG in the publication Safeguarding Good Scientific Practice).
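To make the first of these pitfalls concrete, here is a minimal sketch, in Python, of mechanical grading by weighted average; the criteria, weights, and marks are invented purely for this illustration and are not taken from any actual evaluation scheme:

```python
# Hypothetical illustration only: criteria, weights, and marks are invented.

# Equal weights over four marking criteria (marks on a 0-5 scale).
WEIGHTS = {
    "originality": 0.25,
    "quality": 0.25,
    "productivity": 0.25,
    "organization": 0.25,
}

def mechanical_grade(marks):
    """Final grade computed mechanically as a weighted average of criterion marks."""
    return sum(WEIGHTS[criterion] * mark for criterion, mark in marks.items())

# A uniformly mediocre research unit...
uniform = {"originality": 3.0, "quality": 3.0, "productivity": 3.0, "organization": 3.0}
# ...and a unit of exceptional originality with average results elsewhere.
exceptional = {"originality": 5.0, "quality": 3.0, "productivity": 2.0, "organization": 2.0}

print(mechanical_grade(uniform))      # 3.0
print(mechanical_grade(exceptional))  # 3.0 -- the exceptional originality is averaged away
```

The formula treats all criteria as interchangeable, which is precisely why a balanced expert judgement, with originality and quality given precedence, is preferable to automatic aggregation.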

As a side remark, let me recall that the grading of competencies, which we now take for granted in grading students, is a rather recent development in history. In fact, the first known instance of grading papers occurred only in 1792 at Cambridge University, on the initiative of a tutor named William Farish who believed that a quantitative value should be assigned to human thoughts. This idea was later adopted, after the French Revolution at the end of the 18th century, by the École Polytechnique de Paris as a device for assuring equal treatment of all students irrespective of their social origin. The first use of grading on student papers in the USA seems to have occurred only in 1817 at the US Military Academy, where managerial ideas are reported to have appeared for the first time (cf. Keith Hoskin and Richard Macve, “The Genesis of Accountability: The West Point Connection”, 1988).



THE POLITICAL IMPACT OF EVALUATION CULTURES

To conclude, let me emphasize my view that: