Computing difference scores when the scores are factor scores?

Computing difference scores when the scores are factor scores?

Post by Fusto » Wed, 11 Feb 2009 14:43:29


Hi,

I've got a two-wave (i.e., PRE/POST) study. 100 subjects filled out
the same measures before and after a course of treatment. The measure
I'm interested in here is a scale consisting of 20 items.

Wave 1 OF STUDY: I've extracted, rotated, and interpreted three
factors from the 20 items administered in Wave 1, using common factor
analysis (i.e., not principal components analysis), and I would like
to compute factor scores. This is easy enough to do in SPSS.

Wave 2 OF STUDY: I would now like to compute factor scores using the
Wave 2 administration of these same 20 items, but instead of running
another factor analysis, I need to "assume" the same factor structure
as was estimated for Wave 1, because my ultimate goal is to compute
DIFFERENCE SCORES for each of the 100 subjects. My reasoning is that
if two sets of factor scores from SEPARATE factor analyses are
subtracted from each other to form difference scores, it would be like
subtracting apples from oranges instead of apples from apples, to use
an extremely crude metaphor.

My question is: How do I calculate Wave 2 factor scores using the raw
scores at Wave 2 and the factor score coefficient matrix output from
the abovementioned Wave 1 factor analysis? I am not familiar with the
SPSS matrix language and am wondering if there's a way to do it
without resorting to it.

Thanks!
 
 
 

Computing difference scores when the scores are factor scores?

Post by Art Kendall » Thu, 12 Feb 2009 01:16:36

Of course a lot depends on the overall purposes of your project. What
happened to the respondents between the study waves?
3 scales from 20 items may be a little iffy. What did you use as a
stopping rule?
How did the eigenvalues obtained compare to the eigenvalues from a
parallel analysis?


I would suggest the conventional scale construction approach:
common factor analysis, varimax rotation, and using only items that
load cleanly on the scales that represent the factors. That way you
get divergent validity.
With SPSS it helps to use the options to sort the items in the order
they load on the factors and to suppress loadings under .3 or so.
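In FACTOR syntax that would look something like this (a sketch only; the
item names are placeholders, and the FSCORE keyword also prints the factor
score coefficient matrix, which bears on your Wave 2 question):

FACTOR
  /VARIABLES=item1 TO item20
  /PRINT=INITIAL EXTRACTION ROTATION FSCORE
  /FORMAT=SORT BLANK(.30)
  /EXTRACTION=PAF
  /ROTATION=VARIMAX.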
Get the scores by something like:

compute scalescore1 = mean(item11, item3, item17, item20).
compute scalescore2 = mean(item12, item4, item18, item1).
compute scalescore3 = mean(item13, item5, item19).

Using unit weights creates a clearer interpretation of the meaning of a
score, and provides a scoring key that is usable at other times with
lessened capitalization on chance.
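And if the same scoring key is applied to the Wave 2 items, the difference
scores are just simple subtractions along these lines (variable names are
placeholders):

compute diff1 = scalescore1_post - scalescore1_pre.
compute diff2 = scalescore2_post - scalescore2_pre.
compute diff3 = scalescore3_post - scalescore3_pre.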


Art Kendall
Social Research Consultants

 
 
 

Computing difference scores when the scores are factor scores?

Post by martijnvan » Thu, 12 Feb 2009 06:09:23

On Feb 10, 5:16 pm, Art Kendall < XXXX@XXXXX.COM > wrote:
> compute scalescore3 = mean(item13, item5, item19).
>
> Using unit weights creates a clearer interpretation of the meaning of a
> score, and provides a scoring key that is usable at other times with
> lessened capitalization on chance.
>
> Art Kendall
> Social Research Consultants

I agree with Art: the conventional scale construction approach usually
leads to results that are far more likely to hold up under cross-
validation.

Fusto says: "My reasoning is that
if two sets of factor scores from SEPARATE factor analyses are
subtracted from each other to form difference scores, it would be like
subtracting apples from oranges instead of apples from apples, to use
an extremely crude metaphor. "

This is exactly the problem with factor scores derived from exploratory
factor analysis: they are way too dependent upon the unique
characteristics of your measurement sample. This is especially true
since your number of cases is quite small compared to the number of
items under analysis.

By far the best approach would be to construct your scales based on
theoretical considerations. In that case you can do a confirmatory
factor analysis, which is less prone to sampling error. A very simple
technique for confirmatory factor analysis is the MultiGroup centroid
method (MGM). Basically it boils down to comparing the corrected item-
total correlation for an item to the correlations of that item with
the scales it does not belong to (in theory).
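As a rough sketch of that check in SPSS (the item and scale names are only
placeholders, and the scale totals are assumed to be computed already):

RELIABILITY
  /VARIABLES=item11 item3 item17 item20
  /SCALE('Scale 1') ALL
  /MODEL=ALPHA
  /SUMMARY=TOTAL.
CORRELATIONS
  /VARIABLES=item11 item3 item17 item20 WITH scalescore2 scalescore3.

The corrected item-total correlations from RELIABILITY are then compared,
item by item, with the correlations of each item with the other scales.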

Another thing: since you are computing difference scores, your
number of items per factor is quite small, and your sample size is
quite modest, you might be running into power problems here. Mind that
the standard error of a difference score is about 1.4 (the square root
of two) times as large as the standard error of the individual factor
scores, because, when the errors at the two waves are roughly independent,
SE_diff = sqrt(SE1^2 + SE2^2).
 
 
 

Computing difference scores when the scores are factor scores?

Post by RichUlrich » Thu, 12 Feb 2009 08:34:07

On Mon, 9 Feb 2009 21:43:29 -0800 (PST), Fusto < XXXX@XXXXX.COM >



I agree with the posters who say that you will probably be happier
with your factors constructed from simple averages of a few items.
Those have intelligibility and can be separately described for item
reliability, etc.

However, if you are determined to use exact factor scores, you can see
some description of how to do it in my replies in this group from
January (the 16th and 19th), in the thread
"Manual Factor Score vs. SPSS Factor Score".
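
Very roughly (a sketch only, not the detail from that thread), it amounts
to standardizing the Wave 2 items (one reasonable choice is to use the
Wave 1 means and SDs, so the two waves share a metric) and then weighting
them by the Wave 1 factor score coefficient matrix. Every number below is
a placeholder for a value from the Wave 1 output:

* z-score the Wave 2 items using the Wave 1 means and SDs.
compute zitem1_w2 = (item1_w2 - 2.41) / 1.10.
compute zitem2_w2 = (item2_w2 - 3.07) / 0.94.
* (and so on for the remaining items).
* weight the z-scores by the Wave 1 factor score coefficients.
* (only two of the 20 terms per factor are shown).
compute fac1_w2 =  0.31*zitem1_w2 + 0.02*zitem2_w2.
compute fac2_w2 = -0.04*zitem1_w2 + 0.27*zitem2_w2.
compute fac3_w2 =  0.05*zitem1_w2 + 0.22*zitem2_w2.
execute.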

--
Rich Ulrich
 
 
 

Computing difference scores when the scores are factor scores?

Post by Ryan » Thu, 12 Feb 2009 09:04:44


> Manual Factor Score vs. SPSS Factor Score
>
> --
> Rich Ulrich

Hey Rich,

Hope it's okay if I ask a question. Is there any reason to prefer
taking the means of the items over, say, summing them?

Thanks,

Ryan
 
 
 

Computing difference scores when the scores are factor scores?

Post by Fusto » Thu, 12 Feb 2009 10:01:38





Thanks to all of you for your very helpful responses!!!
 
 
 

Computing difference scores when the scores are factor scores?

Post by RichUlrich » Fri, 13 Feb 2009 04:51:50

On Tue, 10 Feb 2009 16:04:44 -0800 (PST), Ryan



[snip, previous. Keeping what is relevant to new question.]
[...]

It won't make any difference to your ANOVA tests, if
that is a worry to anyone, but I have a definite preference
for using the means over the sums. I'm thinking especially
of ad-hoc scales that are used in clinical trials. (Given someone's
"standard scale" that has often been reported, we are pretty-
much stuck with it. Even though the BPRS or the Hamilton
might be easier to teach if they used averages.) Here are
the main reasons.

- The sums each have an arbitrary Maximum, different for
scales with different numbers of items, so the only way to
know the 'meaning' of a total is to know the scale intimately.
That is an unnecessary burden on the reader, who is not the
PI who loves his scale. Or it is a burden on the statistician
who is dealing with dozens of scales, and wants to deal
intelligently with this scale without spending his life on it.

- The items have verbal labels which can be used to
interpret the average. For one thing, it gives labels like
"never". Also, it is the easiest way to show any reader
that a difference of 0.1 points is trivial, while a difference
of 1.0 points is large. The scoring reflects the units of
the "effect size" in the terms of the measurement.

- When there are occasional items that were left blank, the
question of "What did you do with the missing?" is readily
answered. The score is the "average of those answered,
requiring (say) at least 3/4ths of the items to be present"
or it will be scored missing.
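
In SPSS syntax that rule is just the .n suffix on the MEAN function; for a
hypothetical 4-item scale, requiring at least 3 of the 4 items to be valid:

compute scalescore1 = mean.3(item11, item3, item17, item20).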



The other alternative that I have used frequently for
composite scores - not often for a Likert scale, but usually for
factors that are formed across different domains - is the T-score.
Scales are standardized with a mean of 50 and a standard deviation
of 10 - usually using the mean and SD for the whole sample at Pre.
That makes it relatively easy to look at group differences and
changes across time. (The standard deviation of 10 means
that you can report interesting differences without the clutter
of decimal points. The mean of 50 means that you don't
have the clutter of negative values for group means.)
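
In syntax terms that is just a linear rescaling; the 3.10 and 0.80 below
are placeholders for the whole-sample mean and SD of the composite at Pre:

compute tscore1_pre  = 50 + 10 * (scalescore1_pre  - 3.10) / 0.80.
compute tscore1_post = 50 + 10 * (scalescore1_post - 3.10) / 0.80.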


--
Rich Ulrich
 
 
 

Computing difference scores when the scores are factor scores?

Post by Ryan » Fri, 13 Feb 2009 23:03:45


> [...] of ad-hoc scales that are used in clinical trials. (Given someone's
> "standard scale" that has often been reported, we are pretty-much
> stuck with it. Even though the BPRS or the Hamilton
> might be easier to teach if they used averages.) Here are
> the main reasons.
>
> - The sums each have an arbitrary Maximum, different for
> scales with different numbers of items, so the only way to
> know the 'meaning' of a total is to know the scale intimately.
> That is an unnecessary burden on the reader, who is not the
> PI who loves his scale. Or it is a burden on the statistician
> who is dealing with dozens of scales, and wants to deal
> intelligently with this scale without spending his life on it.
>
> - The items have verbal labels which can be used to
> interpret the average. For one thing, it gives labels like
> "never". Also, it is the easiest way to show any reader
> that a difference of 0.1 points is trivial, while a difference
> of 1.0 points is large. The scoring reflects the units of
> the "effect size" in the terms of the measurement.

I follow--makes sense. Thanks.

> - When there are occasional items that were left blank, the
> question of "What did you do with the missing?" is readily
> answered. The score is the "average of those answered,
> requiring (say) at least 3/4ths of the items to be present"
> or it will be scored missing.

Right.

> The other alternative that I have used frequently for
> composite scores - not often for a Likert scale, but usually for
> factors that are formed across different domains - is the T-score.
> Scales are standardized with a mean of 50 and a standard deviation
> of 10 - usually using the mean and SD for the whole sample at Pre.
> That makes it relatively easy to look at group differences and
> changes across time. (The standard deviation of 10 means
> that you can report interesting differences without the clutter
> of decimal points. The mean of 50 means that you don't
> have the clutter of negative values for group means.)

This has been my common practice.

> --
> Rich Ulrich

Thank you,

Ryan