2017 Fall (1)c

edited May 2024 in GENERAL

Hi,

The sample solution states that "candidate could interpret Manufacturing to be part of All Other Industry type OR the base class"

Could you please explain the difference between the two?

In Sample 1 ("Using Manufacturing as the base industry class"), the solution applies the "All Other" industry type coefficient of -0.55. This is a manufacturing risk, so if manufacturing is the base class, shouldn't we apply no industry type coefficient at all?

It's also odd that sample solutions 3 and 4 are likewise labelled "Using Manufacturing as the base industry class" but do not apply the -0.55.

Thanks in advance for clearing this up!

Comments

  • The CAS accepted a lot of different answers to this question and not all of them are laid out in a consistent manner. Let's look at the Industry Type variable and recall what we know about base classes.

    The output for a categorical variable in a GLM includes coefficients for any categories that are not the base class. The base class always implicitly has a coefficient of 0. We're told we have a manufacturing risk and from the text it's clear this should be a categorical variable. There are three ways we could look at this risk.

    1. Manufacturing isn't listed under Industry Type so it must be the base class, in which case our coefficient is 0.
    2. Manufacturing is a type of construction so it should have coefficient 0.35.
    3. Manufacturing isn't called out as the base class and isn't construction so it must be "All Other" and have a coefficient of -0.55.

    Sample answer 1 is "wrong" in the sense that it does none of these. They have decided the "All Other" category is the base class and, for some reason, that the GLM output for the Industry Type variable has been rebased to something entirely unknown, perhaps so the weighted average overall is 1.000.

    Sample 2 says it takes approach 3 above but then doesn't include the coefficient, so it is inconsistent. Samples 3 and 4 take approach 1 above.
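
    To see how much the choice matters, here is a rough sketch comparing the three readings listed above. It uses only the two coefficients quoted in this thread (0.35 and -0.55) plus the implicit 0 for the base class, and it assumes a log link so that each coefficient becomes a multiplicative factor. The names `interpretations` and `relativity` are made up for illustration and are not from the exam or the sample solutions.

    ```python
    import math

    # Coefficients quoted in this thread for the Industry Type variable.
    # Which level manufacturing belongs to is exactly the ambiguity discussed.
    CONSTRUCTION_COEF = 0.35
    ALL_OTHER_COEF = -0.55
    BASE_COEF = 0.0  # the base class implicitly has a coefficient of 0

    # Hypothetical mapping from each reading to the coefficient it implies.
    interpretations = {
        "1. Manufacturing is the base class": BASE_COEF,
        "2. Manufacturing is a type of construction": CONSTRUCTION_COEF,
        "3. Manufacturing falls under All Other": ALL_OTHER_COEF,
    }


    def relativity(coef):
        """Multiplicative factor implied by a coefficient, assuming a log link."""
        return math.exp(coef)


    for label, coef in interpretations.items():
        print(f"{label}: coefficient {coef:+.2f}, factor {relativity(coef):.3f}")
    ```

    Whichever reading you pick, the point is that the coefficient you add to the linear predictor should match the assumption you wrote down.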

    Ultimately, the only advice we can give is that, when faced with ambiguity, you should be really clear about

    a) what you're assuming, and

    b) how you get from your assumptions to, say, the coefficients used in the GLM.

  • The solution seems to assume a log link, but I don't see anywhere in the question where that is explicitly stated. Should we always assume a log link then?

  • Hi,

    It is actually in the question this time but the volume of material given in this IQ makes it hard to keep track of things. It's even worse in the Excel environment when you have to scroll incessantly or zoom out and squint.

    As a rule of thumb, it's safe to assume a log link function if nothing is said, unless you're doing logistic regression, in which case you should assume a logit link function. As always, make sure to state your assumptions on the exam.
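
    As a small illustration of why the link matters, the sketch below turns the same linear predictor into a prediction under each of the two links mentioned above. The function names are made up for this example; only the standard definitions of the log and logit links are assumed.

    ```python
    import math


    def inverse_log_link(eta):
        """Log link: prediction = exp(linear predictor), so effects multiply."""
        return math.exp(eta)


    def inverse_logit_link(eta):
        """Logit link: prediction = 1 / (1 + exp(-eta)), always between 0 and 1."""
        return 1.0 / (1.0 + math.exp(-eta))


    eta = 0.0  # a linear predictor of zero, chosen purely for illustration
    print(inverse_log_link(eta), inverse_logit_link(eta))
    ```

    With a linear predictor of 0, the log link gives a factor of 1 while the logit link gives a probability of 0.5, which is why stating the assumed link up front matters.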

