TRYST WITH A LEGEND

  Calyampudi Radhakrishna Rao was born on September 10th, 1920 in Huvvina Hadagalli, 
Karnataka, to C.D. Naidu and Lakshmikanthamma. He received an M.A. in mathematics from 
Andhra University in 1940, an M.A. in statistics from Calcutta University in 1943, a 
Ph.D. from Cambridge in 1948 for his thesis "Statistical Problems In Biological 
Classification" and an Sc.D. from the same university in 1965. Thanks to the Cramér-Rao 
inequality and the Rao-Blackwellisation of unbiased estimates, he is a household name 
among statisticians and engineers alike. He is the author of more than 280 
papers and 11 books in fields as diverse as Estimation Theory, Multivariate Analysis, 
Characterization Problems, Combinatorics and Design of Experiments, Matrix Algebra, 
Generalised Inverse of a Matrix, Differential Geometric Methods in Statistics, and 
Mathematical Genetics, among others.

  His contributions to the development of the Indian Statistical Institute and to the 
foundation of statistics in India have gone down in history. Among the many 
distinguished positions that C.R. Rao has occupied are the Directorship and the 
Jawaharlal Nehru chair at the I.S.I., a Distinguished Professorship at Pittsburgh and 
the Eberly Professorship at Pennsylvania State University, a position he holds to the 
present day.

 Some excerpts from the interview ---
1> Sir, how has statistics as a subject evolved from its past, and how will it be in 
the future?

A>Well, historically statistics was used mainly by the government to take policy 
decisions. Actually, statistics as used by rulers started maybe two thousand years 
ago. If you look at the Bible, they have described how to take a census. In the Indian 
context, we have Kautilya's Arthashastra, which is also full of statistics. From the 
Mughal period, you have a publication, the Ain-i-Akbari, which has information on 
everything in minute detail.

By the way, do you know whether the word statistics is an English or a French or a German 
word? It sounds English, right? But it is actually a German word, and it was coined by 
a German administrator by the name of Achenwall. By statistics he meant all the 
information needed by the government about the people. Now, the English people did not 
like the use of the German word and they coined the word 'publistics', which was used in 
England for several years. Later on, since it is a more complicated word, they opted for 
the word statistics. Actually, somebody in Scotland collected information on all the 
people in Scotland and published it in the form of 'Statistics of Scotland', which 
attracted criticism from the English.

But statistics as a method of inference actually started only in 1900. The idea is that 
whenever you have some data, there is some amount of uncertainty in the inference you 
draw from it, because of errors in measurement and because we can't take a complete 
set of measurements on any phenomenon; you can observe only a part of the phenomenon. So, 
when we want to infer from statistics to the theory or mechanism which is producing the 
observations, there is some amount of uncertainty. It was realized in 1900 that if 
you can measure the amount of uncertainty in the data, then you can use it for 
decision-making purposes; and if you know the amount of uncertainty, you know the loss 
you will incur by taking a certain decision.

In 1900, Karl Pearson started the use of chi-square. That is the first example of using 
some criterion for deciding whether a certain hypothesis is true or not. He was comparing 
(observed - expected) (this chi-square formula, you know), and that was invented by Karl 
Pearson. He had developed a system of frequency curves or frequency distributions and he 
wanted to know which one would fit a given set of data. His hypothesis was that a certain 
family of frequency distributions is applicable to the given data. So, he needed 
chi-square to measure the uncertainty in the hypothesis, and when the probability was 
high, he accepted the hypothesis.
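
[Editorial illustration, not part of the interview.] The "(observed - expected)" 
comparison mentioned above is usually written as Pearson's statistic, the sum over 
categories of (observed - expected)^2 / expected. A minimal Python sketch follows; the 
function name and the die-roll counts are hypothetical and chosen only to show the 
arithmetic.

```python
def pearson_chi_square(observed, expected):
    """Return Pearson's statistic: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, with the count of each face.
observed = [8, 12, 9, 11, 10, 10]
# Under the hypothesis of a fair die, each face is expected 10 times.
expected = [10] * 6

statistic = pearson_chi_square(observed, expected)
# With 6 categories there are 5 degrees of freedom; the 5% critical value
# of the chi-square distribution with 5 degrees of freedom is about 11.07.
print(statistic, "fits" if statistic < 11.07 else "does not fit")
```

A small value of the statistic corresponds to a high probability under the 
hypothesis, which is the case in which, as described above, the hypothesis is accepted.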

Once it was shown that it is possible to make decisions on the basis of data by 
computing the amount of uncertainty in the data, people started using statistics in 
several areas. Probably the first area of science where statistics was used was 
demography, where they computed life tables for insurance purposes, and then came 
applications in agriculture, which led to the design of experiments. Very early on, 
psychologists used statistics to find out which components of human intelligence 
can be measured, inferring them from performance in various tests like intelligence 
tests and so on. Then came applications in industry, in QC (did you study industrial 
statistics?). Then came design of experiments in QC to find the optimum mix of factors, 
that is, how to combine various components in order to produce a certain given quantity.

Then came applications in medicine, for medical diagnosis. If you go to a doctor, he takes 
a large number of tests so that proper treatment can be administered. Now how 
does he use the information from all these tests to arrive at a proper diagnosis? How 
will he synthesize all this information from the various tests, which he can look at only 
one at a time? He says, 'test for cholesterol, good', 'blood pressure, not so good' and 
so on. So how will he put all the information together? That's where multivariate analysis 
comes in, which tells us how to combine the information from the various tests and make a 
proper diagnosis. So, this is the most useful application of statistics.



2> We have heard the phrase 'Lies, Damn Lies and Statistics'. Has that idea changed 
now? (CRR interjects with a 'Yes, of course'). Do people have faith in statistical 
results now?

A>Well, in sciences like physics, chemistry, etc., statistics is not used so much. 
Actually, physicists say that they do not have any use for statistics because they have a 
very precise way of generating measurements, and so the errors in physical measurements 
are much smaller than usual. Suppose you sow a variety of wheat to measure the production. 
Well, it all depends on the fertility of the soil. But in the case of measurements taken 
by physicists, they are very, very accurate. Secondly, they can generate a very large 
number of observations; they can take measurements repeatedly, which you cannot do in 
other sciences. So, when you have a large number of measurements you don't need accurate 
statistical methods; a simple argument is sufficient and accurate. Because of this, 
physicists do not use statistics very much. But in all sciences where measurements cannot 
be made without error and there is a limitation on the number of observations, 
statistics is essential.
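
[Editorial illustration, not part of the interview.] One common way to see why a large 
number of repeated measurements leaves little room for error is the standard error of 
an average, which for independent measurements with standard deviation sigma is 
sigma / sqrt(n). The Python sketch below assumes independent measurements; the 
function name and the numbers are hypothetical.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean of n independent measurements,
    each with standard deviation sigma (an assumed, idealised model)."""
    if n <= 0:
        raise ValueError("n must be a positive number of measurements")
    return sigma / math.sqrt(n)


# Hypothetical measurement error of 1 unit per observation.
for n in (10, 1000, 100000):
    print(n, standard_error(1.0, n))
```

Under this assumption the uncertainty in the average shrinks as the number of 
measurements grows, whereas a field trial with only a handful of plots has no such 
luxury.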