THE UTILITY OF CANONICAL CORRELATION ANALYSIS, COUPLED WITH TARGET ROTATION, IN COPING WITH THE EFFECTS OF DIFFERENTIAL SKEWNESS OF VARIABLES

The principal objective of the study was to determine the utility of canonical correlation analysis, coupled with target rotation, in coping with the effects of differential skewness of variables representing two batteries of tests. Generally speaking, joint factor analyses of two or more batteries of tests result in factors of skewness rather than factors of content. To examine the problem, the General Scholastic Aptitude Test (GSAT) and Senior Aptitude Tests (SAT) were jointly applied to a sample of 1598 first-year university students, and the resulting scores were subjected to both a principal factor analysis (PFA) and a canonical correlation analysis (CCA), coupled with target rotation. Three factors were obtained in both instances. The PFA yielded factors of skewness and the CCA factors of content. The target rotation gave a good fit with the theoretically specified values. The implications of the findings are discussed.

An issue which often arises, for example in studies of the Big Five, is whether two or more test batteries, given to the same sample of participants, have a common factor structure. Traditionally, researchers have simply conducted a joint factor analysis of such batteries of tests. However, due to the effects of differential skewness of the variables involved, the resulting factor structures have often been distorted. Factors of skewness rather than content have usually been obtained. Finch and West (1997, p. 470) pointed out in this regard that joint factor analyses confound two sources of covariation, namely covariation within batteries and covariation between batteries.
To overcome the confounding of the two sources of covariation mentioned, Tucker (1958) proposed his interbattery factor analysis. His model will now be briefly described.
Assume that two batteries of tests, with a postulated common factor structure, have been applied to a representative sample of participants. The variables were intercorrelated and yielded the super matrix depicted in Figure 1. According to the fundamental theorem of factor analysis the super matrix R can be resolved into its factors as follows:

$$R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix} = FF', \qquad F = \begin{bmatrix} A_1 & S_1 & 0 \\ A_2 & 0 & S_2 \end{bmatrix}$$

where $A_1$ = Factors of battery 1 shared in common with battery 2; $A_2$ = Factors of battery 2 shared in common with battery 1; $S_1$ = Factors specific to battery 1; $S_2$ = Factors specific to battery 2. R can therefore be presented as follows:

$$R = \begin{bmatrix} A_1 A_1' + S_1 S_1' & A_1 A_2' \\ A_2 A_1' & A_2 A_2' + S_2 S_2' \end{bmatrix}$$

It is therefore clear that $R_{12} = A_1 A_2'$. The product $A_1 A_2'$ contains the factors common to batteries 1 and 2. Browne (1979) provided a maximum likelihood solution to Tucker's model of interbattery factor analysis. He obtained estimates of the interbattery factor loadings by scaling the correlations of the original variables with the canonical variates. He subsequently extended his technique to more than two batteries of tests (Browne, 1980).
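The block structure of the interbattery model can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original study: the loadings are random illustrative values, the arrangement of F as two block rows (A1, S1, 0) and (A2, 0, S2) is inferred from the definitions above, and the specific parts S1 and S2 are taken to be diagonal purely for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
p1, p2, k = 4, 5, 2          # hypothetical battery sizes and number of common factors

# Illustrative loadings on the factors shared by the two batteries.
A1 = rng.uniform(0.3, 0.6, size=(p1, k))
A2 = rng.uniform(0.3, 0.6, size=(p2, k))

# Assumption: battery-specific parts are diagonal, scaled so that diag(R) = 1.
S1 = np.diag(np.sqrt(1.0 - np.sum(A1**2, axis=1)))
S2 = np.diag(np.sqrt(1.0 - np.sum(A2**2, axis=1)))

# Assumed block layout of F: common factors first, then the specific parts.
F = np.block([
    [A1, S1, np.zeros((p1, p2))],
    [A2, np.zeros((p2, p1)), S2],
])

R = F @ F.T                  # super matrix R = FF'
R12 = R[:p1, p1:]            # between-battery block

# The specific parts cancel out of the between-battery block.
print(np.allclose(R12, A1 @ A2.T))
```

Because S1 and S2 occupy different columns of F, they contribute only to the within-battery blocks, which is why the between-battery block depends on the common factors alone.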
More recently Schepers (2004b) showed that the Multiple Battery Factor Analysis (MBFA) technique of Browne (1980) can cope with the effects of differential skewness of variables from two different batteries of tests.
He applied the General Scholastic Aptitude Test (GSAT) and Senior Aptitude Tests (SAT) jointly to a sample of 1598 first-year university students, and subjected the intercorrelation matrix to a principal factor analysis. The factor matrix was rotated to simple structure by means of a Direct Oblimin rotation.
The principal factor analysis yielded three factors, viz. a nonverbal (spatial) factor, and two verbal factors. The verbal tests of the GSAT loaded on one factor and the verbal tests of the SAT on another.
Following this, the intercorrelation matrix was subjected to a multiple battery factor analysis (MBFA) and rotated to simple structure by means of a Direct Quartimin rotation. Again a three-factor structure was obtained. A Tucker-Lewis reliability coefficient of 0,967 was obtained, which is highly acceptable. The average absolute off-diagonal residual was 0,046, which indicates a very good fit.
Three clear-cut factors were obtained, which were identified as a non-verbal reasoning factor, a verbal factor, and a number factor.
The three factors were strongly positively correlated, suggesting an underlying factor of general intelligence.
From the coefficients of skewness of the various measures of the GSAT and SAT it would appear that the distributions of the GSAT are quite skew. The indices range from 1,818 to -2,111. By contrast, the distributions of the SAT are moderately skew. The indices range from 0,450 to -1,248.
From the foregoing it should be clear that even moderate variations in degrees of skewness can distort the factor structure of two batteries of tests if a joint factor analysis is done. By contrast, MBFA seems to cope quite well with moderate degrees of skewness.
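For readers who wish to compute comparable indices, the sketch below shows one common coefficient of skewness, the moment coefficient (third central moment divided by the 1.5th power of the second). The source does not state which coefficient was used, so this choice, like the function name `skewness`, is an assumption made for illustration only.

```python
import numpy as np

def skewness(x):
    """Moment coefficient of skewness, m3 / m2**1.5 (population form)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()              # deviations from the mean
    m2 = np.mean(d**2)            # second central moment
    m3 = np.mean(d**3)            # third central moment
    return m3 / m2**1.5

# A long right tail gives a positive index, a long left tail a negative one.
print(skewness([0, 0, 0, 10]) > 0, skewness([10, 10, 10, 0]) < 0)
```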
According to Browne (1979, p. 75) the interbattery factor analysis model is "a genuine factor analysis model in that a single set of unobservable factor variables accounts for all correlation coefficients between two batteries of tests". By contrast, canonical correlation analysis "is strictly a method of component analysis since two sets of observable linear combinations of variables are employed to investigate relationships between the two batteries of tests" (p. 75).
Despite the fact that the rationales of the two models are quite different, the numerical procedures of canonical correlation analysis are very similar to those involved in obtaining maximum-likelihood estimates of interbattery factor loadings (Browne, 1979, p. 75). It would therefore be very interesting to examine the utility of canonical correlation analysis in coping with the effects of differential skewness of variables.
The objective of canonical correlation analysis is to form linear combinations of two sets of continuous variables so as to maximise the correlation between the two composites (Cliff, 1987, p. 453). According to Cliff (1987, p. 455) canonical correlation analysis can be used if "one set of variables is dependent and the other independent or when there is no distinction in the roles of the two sets". It can therefore also be applied to variables from two batteries of tests.
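The maximisation described above has a standard closed-form solution based on a singular value decomposition. The sketch below is one textbook route, not necessarily the numerical procedure used in the study; the function name and the partition into R11, R12 and R22 (within-battery and between-battery correlation blocks) are illustrative assumptions.

```python
import numpy as np

def canonical_correlations(R11, R12, R22):
    """Canonical correlations and weights from partitioned correlation blocks.

    Returns (r, a, b): r holds the canonical correlations in descending
    order; the columns of a and b are weight vectors scaled so that every
    composite has unit variance.
    """
    L1 = np.linalg.cholesky(R11)            # R11 = L1 L1'
    L2 = np.linalg.cholesky(R22)            # R22 = L2 L2'
    M = np.linalg.solve(L1, R12)            # L1^{-1} R12
    M = np.linalg.solve(L2, M.T).T          # L1^{-1} R12 L2^{-T}
    U, r, Vt = np.linalg.svd(M, full_matrices=False)
    a = np.linalg.solve(L1.T, U)            # weights for battery 1
    b = np.linalg.solve(L2.T, Vt.T)         # weights for battery 2
    return r, a, b

# Usage with synthetic data: two variables in battery 1, three in battery 2.
rng = np.random.default_rng(1)
Z = rng.standard_normal((500, 5))
R = np.corrcoef(Z, rowvar=False)
r, a, b = canonical_correlations(R[:2, :2], R[:2, 2:], R[2:, 2:])
print(r)
```

The singular values are the canonical correlations because whitening each battery by its Cholesky factor turns the maximisation into an ordinary singular value problem.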
A statistical test is performed to determine how many significant components there are (Bartlett, 1950, 1951). Each component (dimension) is represented by two vectors of weights, one in respect of the first battery of tests and the other in respect of the second battery of tests. The two vectors of weights representing a component are normally referred to as a variate, and the correlation between the two composites of a variate yields the canonical correlation in respect of that component. Thus there are as many canonical correlations as there are statistically significant components.
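A commonly cited form of Bartlett's chi-square approximation for testing the canonical correlations that remain after the first k have been removed can be sketched as follows. Whether the study used exactly this form is an assumption; the function name and arguments (sample size n, battery sizes p and q) are illustrative.

```python
import numpy as np

def bartlett_chi2(r, n, p, q, k=0):
    """Bartlett's approximate chi-square for the canonical correlations
    remaining after the first k have been removed.

    r : all canonical correlations, in descending order
    n : sample size; p, q : number of variables in each battery
    Returns (chi2, df).
    """
    r = np.asarray(r, dtype=float)
    multiplier = n - 1 - (p + q + 1) / 2.0
    chi2 = -multiplier * np.sum(np.log(1.0 - r[k:] ** 2))
    df = (p - k) * (q - k)
    return chi2, df

# Hypothetical values, not taken from the study.
chi2, df = bartlett_chi2([0.5, 0.2], n=100, p=2, q=3, k=0)
print(chi2, df)
```

The statistic is referred to a chi-square distribution with df degrees of freedom; the test is applied for k = 0, 1, 2, ... until the remaining correlations are no longer significant.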
From an interpretive point of view it is normally very difficult to identify the components underlying the canonical structure matrix as it resembles an unrotated factor matrix. Rotation to simple structure is therefore necessary. In this regard Cliff (1987, p. 456) states that the "structure correlations" between the observed variables and the canonical variates "can be transformed by the rotational methods of factor analysis, although the same transformation must be applied to the structure correlations of both batteries". Target rotation would seem to be ideal for this purpose.
From a theory testing point of view target rotation is more appropriate than the usual rotations to simple structure such as Varimax, Promax, Direct Oblimin, Quartimax, Quartimin, and other procedures. With target rotation the common factor structure of two batteries of tests can be specified on theoretical grounds. This is particularly useful whenever theoretical models are being tested.
From the foregoing it should be clear that differential skewness of variables is very disruptive when doing joint factor analyses of two or more batteries of tests (Ferguson, 1941; Gorsuch, 1974; Schepers, 2004a, 2004b; Finch & West, 1997). There is thus a real need for techniques that can cope with the effects of differential skewness of variables of a continuous nature.
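The idea of rotating towards a theoretically specified target can be illustrated with an orthogonal Procrustes rotation. This is a deliberately simpler stand-in: the study itself reports an oblique rotation with the Tarrot program, whose algorithm is not described here, so the function below (a hypothetical name) should be read only as a sketch of the general principle of least-squares fitting to a target.

```python
import numpy as np

def orthogonal_target_rotation(A, B):
    """Rotate loadings A (variables x factors) towards target B.

    Finds the orthogonal matrix T that minimises the sum of squared
    differences between A @ T and B, and returns (A @ T, T).
    """
    U, _, Vt = np.linalg.svd(A.T @ B)
    T = U @ Vt                      # orthogonal Procrustes solution
    return A @ T, T

# Usage: a hypothetical target in which each variable marks one factor.
target = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
angle = np.deg2rad(30)
Q = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
unrotated = 0.7 * target @ Q        # the same structure, rotated out of view
rotated, T = orthogonal_target_rotation(unrotated, target)
print(np.round(rotated, 3))
```

Stacking the structure correlations of both batteries into the single matrix A ensures that one and the same transformation T is applied to both, as Cliff's requirement demands.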

Objectives of the study
The principal objective of the study was to determine the utility of canonical correlation analysis, coupled with target rotation, in coping with the effects of differential skewness of variables from two batteries of tests.

Research approach
The primary goal of the study was to evaluate a particular statistical technique. A cross-sectional field survey was used in the collection of the data.

Participants
As the sample has been fully described in a previous study (cf. Schepers, 2004b, pp. 78-79), only the essential details are given here: A representative sample of first-year university students at the Rand Afrikaans University, during 1995, was used in the study.
Complete records were available for 1598 participants on the General Scholastic Aptitude Test (GSAT) and Senior Aptitude Tests (SAT), amongst others.

Measuring instruments
As a complete description of the measuring instruments has been given in a previous study (cf. Schepers, 2004b, p. 79), only the essential details are given here.

The General Scholastic Aptitude Test (GSAT)
The GSAT yields a measure of academic intelligence or scholastic aptitude. It consists of six subtests, three verbal and three non-verbal, and measures both verbal and non-verbal intelligence (Claassen, De Beer, Hugo & Meyer, 1998).

The Senior Aptitude Tests (SAT)
The SAT was designed for the measurement of a number of aptitudes of pupils in Grades 10, 11 and 12, and of adults. It consists of verbal, numerical, non-verbal reasoning, spatial and memory tests. Coordination and Writing Speed were excluded for the purposes of the present study (Fouché & Verwey, 1991).

Procedure
For the purposes of the present study only the records of students who had completed both the GSAT and the SAT were used. A total of 1598 complete records were obtained.

RESULTS
Principal objective: To determine the utility of canonical correlation analysis, coupled with target rotation, in coping with the effects of differential skewness of variables from two batteries of tests
As a first step in the analysis, the canonical correlations of the subtests of the GSAT with the various measures of the SAT were computed. Bartlett's (1950, 1951) test of significance was used to determine the number of significant canonical correlations, and the results are given in Table 1.
From Table 1 it is clear that there are at least three significant canonical correlations. Accordingly three canonical variates, together with their associated canonical correlations, were computed. The complete analysis is given in Table 2.
Table 2 shows that the first canonical variate yielded a canonical correlation of 0,741, the second a canonical correlation of 0,393 and the third a canonical correlation of 0,275. The first canonical variate suggests a general factor, with loadings ranging from 0,421 to 0,869. The second and third variates, however, are more difficult to interpret as no simple structure is visible. It was therefore decided to rotate the matrix of canonical variates to simple structure. For this purpose use was made of a target matrix in conjunction with a Tarrot rotation. The target matrix was specified on theoretical grounds after studying the subtests of the GSAT and SAT. The target matrix is given in Table 3.
The target matrix was specified with high loadings on Factor 1 in respect of the non-verbal reasoning tests.Factor 2 was specified with high loadings on all the verbal tests, together with the two memory tests, and Factor 3 was specified with high loadings on the numerical tests.
Accordingly an oblique Tarrot rotation was performed on the matrix of canonical variates. The rotated matrix is given in Table 4.
Table 4 shows that rotation of the canonical variates to simple structure resulted in a well-defined structure, yielding a good fit with the theoretically specified target matrix. The square root of the average squared deviation was equal to 0,144480.
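The fit index quoted above, the square root of the average squared deviation between the rotated and target matrices, can be computed as in the following sketch. The function name is hypothetical, and averaging over all cells of the matrix is an assumption, since the source does not say which cells entered the average.

```python
import numpy as np

def rms_deviation(rotated, target):
    """Square root of the mean squared difference over all matrix cells."""
    rotated = np.asarray(rotated, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.sqrt(np.mean((rotated - target) ** 2))

# Illustrative matrices, not the values of Tables 3 and 4.
print(rms_deviation([[1, 0], [0, 1]], [[1, 0], [0, 0]]))
```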

DISCUSSION
The outcome in respect of the principal objective of the study was positive: rotation of the canonical variates by means of a target rotation yielded a structure that is very similar to that obtained with the MBFA. A CCA followed by a target rotation might even be preferable to an MBFA when doing confirmatory studies, as the target matrix can be specified on theoretical grounds prior to initiating the study. Target rotation can of course also be used with MBFA, but then the current program would have to be adapted.

TABLE 1 STATISTICAL SIGNIFICANCE OF CANONICAL CORRELATIONS: BARTLETT'S TEST IN RESPECT OF THE GSAT AND SAT