Multivariate Independence Tests

Thought I should mention the obvious multivariate independence tests.   These can be used in a first pass to trim out the obviously independent variables.   Assume a d-dimensional joint distribution p(x1, x2, …, xd) as a starting point, with sample vector X = { x1, x2, …, xd }.

Let us consider subdividing the above d-dimensional distribution into n- and m-dimensional distributions respectively (where d = n + m).  We want to test whether the marginal distribution of the n variables is independent of the marginal distribution of the m variables.   Our sample vector for the full joint distribution can be expressed as:

X = { {a1, a2, …, an}, {b1, b2, …, bm} }

and we have marginal distributions p(A) and p(B), each of which covers a subset of the dimensions of p(X).   The covariance matrix for p(X) then has the block structure:

Σ = [ Σaa  Σab ]
    [ Σba  Σbb ]

where Σaa and Σbb are the within-block covariances and Σab = Σba' is the cross-covariance between A and B.
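As a quick sketch of the block structure (assuming numpy is available; the variable names are my own, not from any library):

```python
import numpy as np

rng = np.random.default_rng(42)
n, m, N = 2, 3, 1000
X = rng.normal(size=(N, n + m))     # N samples of the full joint vector

S = np.cov(X, rowvar=False)         # (n+m) x (n+m) sample covariance
S_aa = S[:n, :n]                    # covariance within the A block
S_bb = S[n:, n:]                    # covariance within the B block
S_ab = S[:n, n:]                    # cross-covariance between A and B
```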

Intuitively, if A and B are independent then Σab should be 0 (well, not significantly different from 0).   We set up the null hypothesis H0 to indicate independence.   Wilks' test rejects independence if:

−N ln( |Σ| / ( |Σaa| |Σbb| ) ) > χ²(nm, α)

where N is the number of samples and the threshold is the (1 − α) quantile of a χ² distribution with nm degrees of freedom.   This test is valid even for non-elliptical distributions, so it makes a good first-line test.   The intuition behind the above measure is that the determinant ratio compares the volume of the full covariance matrix to the volumes of the non-crossing (block-diagonal) components.   If the volume is mostly explained by the non-crossing components, then A and B are independent.
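Putting the pieces together, here is a minimal sketch of the test (assuming numpy and scipy are available; wilks_independence_test is my own name for it, not a library function):

```python
import numpy as np
from scipy.stats import chi2

def wilks_independence_test(A, B, alpha=0.05):
    """Test H0: the columns of A are independent of the columns of B.

    A: (N, n) samples of the first block; B: (N, m) samples of the second.
    Returns (statistic, threshold, reject).
    """
    N, n = A.shape
    _, m = B.shape
    X = np.hstack([A, B])
    S = np.cov(X, rowvar=False)            # full (n+m) x (n+m) covariance
    S_aa = S[:n, :n]
    S_bb = S[n:, n:]
    # Determinant ratio: volume of the full covariance relative to the
    # volume explained by the non-crossing (block-diagonal) components.
    stat = -N * np.log(np.linalg.det(S) /
                       (np.linalg.det(S_aa) * np.linalg.det(S_bb)))
    threshold = chi2.ppf(1 - alpha, df=n * m)
    return stat, threshold, stat > threshold
```

With strongly dependent blocks the statistic blows far past the threshold; with genuinely independent data the determinant ratio sits near 1 and the statistic stays small.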
