IQ scale - the importance of norming
This is part 5 of a series of posts in which I explain the coming of the modern-day IQ scale. The introduction of standard scores by Wechsler in 1939 changed IQ testing for the better: IQ score volatility was greatly reduced, so IQ scores became more meaningful and reliable, even as the test taker aged. Despite what seemed like a no-brainer upgrade to IQ testing, it took a further 25 years before the publishers of the different versions of the Binet test got rid of the antiquated (MA/CA) x 100 IQ scoring method and propelled the IQ scale into the 20th century with the introduction of standard scores.

Most contemporary IQ scales are based on standard scores, which give the test assessor a neat advantage: the assessor can evaluate how each test taker's IQ score compares to the general population.

Most well-known IQ tests are set to have a mean score of 100 and a standard deviation of 15, 16 or 24 points. The assessor can then use a Z-table to establish the percentile score of the test taker in question. The percentile score tells the test taker where his or her own intelligence sits relative to the general population.
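To make the Z-table step concrete, here is a minimal sketch in Python. It assumes IQ scores are normally distributed around a mean of 100, and it uses the standard library's normal distribution in place of a printed Z-table. The function name iq_percentile is my own, chosen for illustration only.

```python
from statistics import NormalDist


def iq_percentile(iq, sd, mean=100.0):
    """Percentile rank of an IQ score, assuming a normal distribution.

    iq_percentile is a hypothetical helper name, not part of any test manual.
    """
    z = (iq - mean) / sd                # standard (z) score
    return 100.0 * NormalDist().cdf(z)  # share of the population scoring below


# The same z-score of +2 corresponds to a different IQ on each scale,
# yet to the same percentile (roughly the 97.7th).
for sd in (15, 16, 24):
    iq = 100 + 2 * sd
    print(f"SD {sd}: IQ {iq} -> percentile {iq_percentile(iq, sd):.1f}")
```

The point of the loop is that a raw IQ figure only means something once you know the standard deviation of the scale: 130, 132 and 148 all describe the same relative standing on scales with standard deviations of 15, 16 and 24 respectively.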

However, as I discussed in my first posting on the topic of IQ scales, these scales need to be carefully crafted by way of a robust norming exercise, to ensure that the results and interpretations associated with the test statistics can truly be extrapolated to the general population.

Think back to Binet, who in the early 1900s developed an IQ test to help the Paris ministry of education weed out intellectually weaker children from the classroom. Binet needed to recruit a large number of children to hone his Mental Age (MA) scale. Once his MA scale had been created, it was unclear whether the scale was applicable to, and representative of, children living outside Paris, or how balanced the sample of children was across economic and social classes. The most likely scenario is that Binet had developed a Parisian IQ test for children, rather than a French IQ test.

For this reason, Wechsler would decades later spend a very large amount of time and money with his psychologist friends to recruit as representative a sample as possible of US test takers. Wechsler had set his sights on understanding the intellectual ability of the entire United States of America.

The picture that emerges here is that IQ scales must be carefully normed relative to the country where the testing is taking place. Now, the Flynn effect tells us that the populations of several industrial nations have experienced an average increase in IQ of between 2 and 7 points per decade. The United States, for instance, has experienced a Flynn effect of 2-3 points per decade since the turn of the 20th century, while the Netherlands has experienced a 6-7 point Flynn effect, largely over the same period.
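The per-decade rates above add up quickly. As a rough sketch in Python, and assuming (for simplicity only) that the gain accrues at a constant rate, the cumulative average increase over a given number of years can be worked out as follows. The function name flynn_gain and the 30-year span are mine, chosen for illustration.

```python
def flynn_gain(points_per_decade, years):
    """Cumulative average IQ gain, assuming a constant (linear) rate.

    flynn_gain is a hypothetical helper; real gains need not be linear.
    """
    return points_per_decade * years / 10.0


# Low and high ends of the per-decade ranges quoted above, over 30 years.
for country, (low, high) in {"United States": (2, 3), "Netherlands": (6, 7)}.items():
    print(f"{country}: {flynn_gain(low, 30):.0f} to {flynn_gain(high, 30):.0f} points")
```

Under that linear assumption, 30 years corresponds to an average gain of 6 to 9 points at the US rates and 18 to 21 points at the Dutch rates.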

This has several implications for IQ testing and I will discuss these in future postings.

