Figure 8 in Source Paper

Hello,

Suppose that on the exam we were given the partial residual plot shown in Figure 8 of the source paper and were asked to select all of the changes below that would make the partial residual plot appropriate:

a. Add a polynomial second degree to AOI

b. Add a hinge function at age 15

c. Add a categorical variable at age 20, 30, 40, and 50

d. Add a natural cubic spline

I'm under the impression that all of these answers would be correct, right? Does it matter how we bin the variable in c)?


Thanks!

Comments

  • Great question! Adding a second degree polynomial to AOI is definitely correct and even a third degree one would likely be acceptable.

    A hinge function at age 15 corresponds to a change in slope at ln(15) ≈ 2.71, which is roughly where the apex of the residuals is, so this would be fine.

    Fitting a natural cubic spline is certainly a valid option too as it's a more advanced version of modeling with polynomials.

    Option C is dubious because of the way it is worded. A categorical variable defined only at the points 20, 30, 40, and 50 would leave all the other ages still to be modeled. It would be more reasonable if it said something like bucket 0 to 20, 21 to 30, ..., 41 to 50. However, even then I still wouldn't like it because we know the residual behavior changes around log(AOI) = 2.75, i.e. an age of about 15.6 years (e^2.75 ≈ 15.6). So a first bucket of 0 to 20 is too coarse and would contain changing behavior. In the absence of any exposure information for credibility, I would use something like 0 to 15 and 16 to 20 or 16 to 30 for bucketing purposes.
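
    For anyone who wants to check the arithmetic, here is a minimal sketch of the hinge term and the bucketing described above. The function names, the bucket edges, and the assumption that age is recorded in whole years are illustrative choices for this sketch, not something taken from the source paper.

```python
import math

# Assumption for this sketch: the hinge is placed at age 15, as in option (b).
HINGE_AGE = 15


def hinge(log_x, knot):
    """Hinge term max(0, log_x - knot): zero below the knot, linear above it."""
    return max(0.0, log_x - knot)


def bucket(age, edges=(15, 20, 30, 40, 50)):
    """Label the age band containing `age` (ages assumed to be whole years).

    The default edges are hypothetical; they split the first band at 15 so
    that the change in residual behavior is not hidden inside one bucket.
    """
    lower = 0
    for upper in edges:
        if age <= upper:
            return f"{lower}-{upper}"
        lower = upper + 1
    return f"{lower}+"


knot = math.log(HINGE_AGE)
print(round(knot, 2))                  # log of age 15
print(round(math.exp(2.75), 1))        # age at which the log equals 2.75
print(hinge(math.log(10), knot))       # below the knot: contributes nothing
print(hinge(math.log(30), knot) > 0)   # above the knot: slope can change
print(bucket(12), bucket(18), bucket(60))
```

    The hinge term only changes the slope above the knot, while the buckets give each band its own level, which is why the placement of the first bucket edge matters.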

  • Is it possible that option A is also dubious with the way it's worded? Since the parabola is facing downwards, wouldn't we be adding a negative square of the logged building age? Though, the text on page 51 simply says "adding the square of the logged building age".

  • I see where you're coming from and commend you for being very precise with your language.

    In some ways, the text is being careless by saying "adding the square of the logged building age" because, as you point out, the coefficient associated with the squared logged building age variable should be negative according to the shape of the residual plot. However, I would say it's common practice to say you're "adding a variable" to a model when you really mean you're "including a variable" in the model. In this context, "adding a variable" isn't referring to the sign of the coefficient associated with the variable.

    I'm not sure the graders would have the time to appreciate this level of nuance unless you spelled it out really clearly in your answer.
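
    To illustrate the point about "adding" versus the sign of the coefficient, here is a rough sketch that includes a squared term in a least-squares fit and lets the fit choose the sign. The data are made up (a downward-facing parabola with its apex near 2.7 on the log scale), and `fit_quadratic` is a hypothetical helper written for this sketch, not anything from the source paper or a GLM library.

```python
def fit_quadratic(xs, ys):
    """Ordinary least squares for y = b0 + b1*x + b2*x^2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    # Build the augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (aug[i][3] - tail) / aug[i][i]
    return beta


# Made-up concave pattern on the log scale; not data from the source paper.
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0.2 + 2.7 * x - 0.5 * x * x for x in xs]

b0, b1, b2 = fit_quadratic(xs, ys)
print(b2 < 0)            # the squared term is "added", yet its coefficient is negative
print(-b1 / (2 * b2))    # location of the apex implied by the fit
```

    The squared variable is included in the model with no sign attached; the negative coefficient comes out of the fit because the pattern is concave.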
