On Sun, 5 Mar 2017 13:01:17 -0800 (PST), Regina <[email protected]> wrote:
Rich: Okay, so now I have the within group correlations, how do I get the correlations overall, controlling for the within group correlations?
On the one hand -- "That's complicated."
On the other hand -- I'm not sure that you and I would
be talking about the same thing. So I'm going to ramble a
bit, and you can tell me if I don't cover what you have in mind.
When I started considering within-group correlations, I
started wondering about the parallel to ANOVA. That is,
for variances: Total = Within + Between . How could I
apply that to covariances?
Well, the ultimate conclusion was that I should try to stay
very aware that between-group associations are not always
the same as within-group associations. "Correlation does
not prove causation" comes out strongly when the between-group
correlation is contradicted by the within-group value and
someone has drawn wrong conclusions from the Between.
What is the Between-Group r? For your data where Persons
are the groups, what you do is: aggregate data for each person,
and then look at the correlation across persons.
The Total r is what you have if you just pool all the data.
[I am conceptualizing a model here, not working on the arithmetic.
The arithmetic surely will not work out readily for unequal Ns.]
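
A minimal sketch of the three r's described above, assuming NumPy and made-up data for three persons with unequal Ns. The function name `decompose` and the data are hypothetical, for illustration only. The Between r here is the plain correlation across person means, as described above. The additive identity Total = Within + Between is checked only for sums of cross-products, with group means weighted by group size; the r's themselves do not add.

```python
import numpy as np

def decompose(x, y, groups):
    """Return total, within-group and between-group summaries for x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    labels, idx = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(idx)

    # Per-group means (one value per group), then spread back to observations.
    mx = np.bincount(idx, weights=x) / counts
    my = np.bincount(idx, weights=y) / counts
    gx, gy = mx[idx], my[idx]

    # Deviation scores: total, within (from own group mean), between.
    tx, ty = x - x.mean(), y - y.mean()
    wx, wy = x - gx, y - gy
    bx, by = gx - x.mean(), gy - y.mean()

    def r(a, b):
        return (a @ b) / np.sqrt((a @ a) * (b @ b))

    return {
        "sscp_total": tx @ ty,
        "sscp_within": wx @ wy,
        "sscp_between": bx @ by,  # implicitly weighted by group size
        "r_total": r(tx, ty),     # pool all the data
        "r_within": r(wx, wy),    # pooled within-group r
        # Unweighted r across the group (person) means.
        "r_between": r(mx - mx.mean(), my - my.mean()),
    }

# Made-up data: within each person y falls as x rises,
# but persons with higher mean x also have higher mean y.
x = [1, 2, 3, 4, 5, 6, 7, 9, 10]
y = [3, 2, 1, 7, 6, 5, 4, 10, 9]
person = ["A", "A", "A", "B", "B", "B", "B", "C", "C"]

res = decompose(x, y, person)
for name, value in res.items():
    print(f"{name}: {value:.3f}")
```

In this constructed case the within-person r is negative while the between-person and pooled r's are positive, which is the kind of contradiction discussed here.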
Surveys have a "problem", that they start out with pooled data which
can give them "Total" correlations that reflect group means, but
might badly represent what happens at the level of influence and
prediction within Groups -- be those groups recognized or not.
In other words, surveys provide Total correlations by default, and
often without appreciating how many assumptions they are making.
Part of the complication of looking at r's for T=W+B arises because,
especially when there are contradictions, the interesting Groups
are apt to differ in means and variances, and /not/ necessarily
the same way for both variables you are looking at.
Any r is a measure of an association /in a particular sample/ or
universe. Comparing any two r's is always vulnerable to differences
in range -- which is why, for comparing samples, it is far better to
look for "consistent regression coefficients" rather than comparing
two r's. That is, the hypothesis of interest is that the regression
lines are "consistent with" being parallel; differences in mean
should not affect a test, and near-absence of variation in one
variable should not affect a test.
- That implies that one should be careful about drawing inferences
from groups to persons, or vice-versa. I've been satisfied with
making cautious and limited statements rather than introducing
some test procedure to an audience.
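
To illustrate the earlier point about range, here is a small sketch, assuming NumPy and artificial data, of two samples drawn from the same line y = 2x with the same residuals, where one sample covers a much narrower range of x. The helper `slope_and_r` and the alternating +1/-1 residual pattern are hypothetical choices made only to keep the example deterministic. This is not a formal test of parallel regression lines, just a demonstration of why slopes compare more safely than r's.

```python
import numpy as np

def slope_and_r(x, y):
    """Least-squares slope of y on x, and the Pearson r, for one sample."""
    slope = np.polyfit(x, y, 1)[0]
    r = np.corrcoef(x, y)[0, 1]
    return slope, r

# Artificial residuals: alternating +1 and -1, identical in both samples.
resid = np.tile([1.0, -1.0], 10)

# Same underlying line, different ranges of x.
x_wide = np.arange(20, dtype=float)     # x spans 0..19
x_narrow = np.linspace(9.0, 11.0, 20)   # x spans 9..11

slope_wide, r_wide = slope_and_r(x_wide, 2.0 * x_wide + resid)
slope_narrow, r_narrow = slope_and_r(x_narrow, 2.0 * x_narrow + resid)

print(f"wide:   slope={slope_wide:.3f}  r={r_wide:.3f}")
print(f"narrow: slope={slope_narrow:.3f}  r={r_narrow:.3f}")
```

The two slopes stay close to each other, while the r in the narrow-range sample is noticeably smaller, even though the relationship is the same in both.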
Hope this helps.
--
Rich Ulrich