Ah, ok. So is this a sort of Markov model, where you are predicting the probability of getting an exercise right after observing (some subset of) the previous exercises? And E is not 1 or 0, but the predicted probability of getting it right? I'm still confused about where all the different E_i's fit in.

That would explain the magnitude, and I agree the negative weight on T would just be due to the direct correlation between E and T.

Edit: I just realized that an exercise consists of multiple problems, so you're predicting whether or not the student will get >= 85% of the problems right on an exercise.
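To check that I'm reading the target correctly, here is a minimal sketch of the label as I understand it. The function name `exercise_label` and the list-of-0/1 input format are my own assumptions, not something taken from your code; only the 85% threshold comes from the discussion.

```python
def exercise_label(problem_results, threshold=0.85):
    """Return 1 if the share of correct problems meets the threshold, else 0.

    problem_results: list of 0/1 outcomes, one per problem (assumed format).
    threshold: fraction of problems that must be correct (0.85 per the thread).
    """
    if not problem_results:
        raise ValueError("need at least one problem result")
    accuracy = sum(problem_results) / len(problem_results)
    return 1 if accuracy >= threshold else 0


# 9 of 10 correct clears the threshold; 8 of 10 does not.
passed = exercise_label([1] * 9 + [0])
failed = exercise_label([1] * 8 + [0] * 2)
```

If that's right, then the model output would be the estimated probability that this label is 1 for the next exercise.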