Weights vs offset
In the example of a policy that is twice as long, I understand that the offset term can be used to scale the predicted claim counts accordingly, all else equal. However, I'm still confused about the weights (w) adjustment to the variance function in a frequency GLM. Using the same simple example of a policy twice as long, the exposure would be doubled, so the variance would be halved. Why would we need to reduce the variance for the longer policy, since it is a frequency model?
Comments
This is due to the law of large numbers. https://en.wikipedia.org/wiki/Law_of_large_numbers
A 6-month policy and a 12-month policy may have the same claim frequency but, as you point out, the 12-month policy will have twice the expected number of claims because it has twice the exposure. By using the exposures as weights in a claim frequency model, we reflect that a 12-month policy contains more information than a 6-month policy. The more information we have, the lower the variance should be, due to the law of large numbers. Concretely, the observed frequency is the claim count divided by the exposure, so if claim counts are Poisson, the variance of the observed frequency is the true frequency divided by the exposure; doubling the exposure halves that variance even though the expected frequency is unchanged. In other words, we have greater confidence that we "know" the true claim frequency for a 12-month policy than we do for a 6-month policy where both have the same claim frequency and identical other risk characteristics.
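The point above can be checked with a small simulation. This is only a rough sketch under assumptions that are not in the original question: claim counts are taken to be Poisson, the annual frequency of 0.2 and the 50,000 simulated policies are made-up values, and the helper names are hypothetical. It draws claim counts for 6-month and 12-month policies, converts them to observed frequencies (claims divided by exposure), and compares the sample variances.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def observed_frequency_variance(annual_frequency, exposure_years, n_policies, rng):
    """Sample variance of observed frequency (claims / exposure) across policies."""
    freqs = [
        sample_poisson(annual_frequency * exposure_years, rng) / exposure_years
        for _ in range(n_policies)
    ]
    mean = sum(freqs) / n_policies
    return sum((f - mean) ** 2 for f in freqs) / (n_policies - 1)


rng = random.Random(0)       # fixed seed so the run is repeatable
ANNUAL_FREQUENCY = 0.2       # assumed true claims per policy-year (illustrative)
N_POLICIES = 50_000          # assumed sample size (illustrative)

var_6m = observed_frequency_variance(ANNUAL_FREQUENCY, 0.5, N_POLICIES, rng)
var_12m = observed_frequency_variance(ANNUAL_FREQUENCY, 1.0, N_POLICIES, rng)

# Under the Poisson assumption, theory gives frequency / exposure:
# 0.2 / 0.5 = 0.4 for 6 months and 0.2 / 1.0 = 0.2 for 12 months.
print("6-month variance of observed frequency: ", round(var_6m, 3))
print("12-month variance of observed frequency:", round(var_12m, 3))
print("ratio:", round(var_6m / var_12m, 2))
```

Both groups have the same expected frequency, but the ratio of the two variances should come out close to 2, which is the exposure ratio. That is the relationship the weights encode in the frequency model.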
Taken to an extreme, suppose we have a one-day policy that has the same claim frequency as a 12-month policy. If you're using data from these policies to price a standard 6-month policy, which do you have more confidence in? The one which reflects a single day of experience, or the one that's got 365 days of experience embedded? The more confidence we have in an estimate, the lower the variance should be around its expected value.
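To put rough numbers on the one-day versus 12-month comparison, here is a back-of-the-envelope calculation. It assumes Poisson claim counts and a made-up annual frequency of 0.10; neither comes from the original post, and the function name is hypothetical. Under that assumption, the standard deviation of the observed annualized frequency is sqrt(frequency / exposure).

```python
import math

ANNUAL_FREQUENCY = 0.10  # assumed claims per policy-year (illustrative only)


def frequency_std(annual_frequency, exposure_years):
    """Std. dev. of observed annualized frequency, assuming Poisson claim counts."""
    return math.sqrt(annual_frequency / exposure_years)


one_day = frequency_std(ANNUAL_FREQUENCY, 1 / 365)
full_year = frequency_std(ANNUAL_FREQUENCY, 1.0)
ratio = one_day / full_year  # equals sqrt(365) regardless of the frequency chosen

print("std dev from a one-day policy: ", round(one_day, 2))
print("std dev from a 12-month policy:", round(full_year, 2))
print("ratio:", round(ratio, 1))
```

The ratio depends only on the exposures, so whatever the true frequency is, the one-day observation is about 19 times noisier (in standard deviation terms) than the 12-month one. Weighting by exposure is how the model is told to trust it correspondingly less.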