GLM.Comments
Reading: Goldburd, M.; Khare, A.; Tevet, D.; and Guller, D., "Generalized Linear Models for Insurance Rating," CAS Monograph #5, 2nd ed., Chapters 8 & 9.
Synopsis: We'll learn about the importance of good model documentation (including what's involved in that) and discuss some scenarios where a GLM may fall short.
Study Tips
Alice: "Chapter 8 is a brief read on why it's good to document your work well. Although the key points are outlined in this wiki article, you should read the source a couple of times before the exam to be highly familiar with the ideas. Chapter 9 covers some of the situations which can limit the usefulness of a GLM. This material is possibly more testable than Chapter 8."
Estimated study time: 3 Hours (not including subsequent review time)
BattleTable
Based on past exams, the main things you need to know (in rough order of importance) are:
- How to handle deductibles and territories in a GLM.
- The importance of good model documentation.
Currently there are no exam questions for this reading.
Full BattleQuiz | Excel Files | Forum |
In Plain English!
Model Documentation
The authors of the GLM text describe model documentation as fulfilling at least three important purposes:
- To check your own work and improve your communication skills.
- To transfer knowledge to the next owner of the model.
- To comply with the requirements of internal and external stakeholders.
Alice: "All of the documentation you produce should comply with ASOP 41 on Actuarial Communications (even if you're not yet credentialed it's just good practice). Don't forget you can read ASOP 41 for some continuing ed credit if you need to brush up."
Documenting your work should ideally be done in parallel with the work itself, because this lets you correct mistakes rather than merely identify them after it's too late. Documentation is generally written at a higher level than the work itself, so you're forced to reflect on how to communicate the work effectively. As you describe your processes and assumptions to someone else, it becomes easier to spot mistakes or conceptual misunderstandings; it's much easier to do this in a high-level narrative than by reading line by line through your favourite programming language. Two additional benefits of the documentation process are:
- You'll gain a deeper understanding of the topic, so your work will have a higher quality than that of someone who is just following the steps in a text.
- You'll improve your communication skills. The first draft of your documentation may not be pretty, but as you refine it you'll get better at concisely passing along the key details.
Good documentation will allow you to rest easily when you've transferred the model into production and/or moved on to the next project. If someone has a question, you can flip back to your documentation and not have to rely on your memory. Auditors, risk managers and regulators often have questions on various aspects of a model and these may arise several months after the modeling work was completed. With good documentation you can easily address the questions and update assumptions etc. without the time and frustration of getting back up to speed on prior work.
The authors of the GLM text say your documentation should meet the following requirements:
- Include everything needed to reproduce the model, starting from the source data through to the model output.
- State all assumptions and justify why they were made and are reasonable.
- Disclose any data issues/limitations and how they were addressed.
- Discuss any reliance on other models as input, such as catastrophe model output being used in a rate-level indication.
- Discuss any reliance on external stakeholders, such as constraints imposed by regulators or statutes.
- Discuss the model structure, performance and limitations.
- Comply with ASOP 41 (or your local actuarial communication standard).
Computer code is a form of model documentation. It needs to be clearly written and commented to complement the higher-level technical narrative you're also creating. Examples of good code documentation include the following (a brief illustrative sketch appears after the list):
- Making sure the version is known — is it the first draft, pre-production, in production?
- Having a brief description of the purpose of any routines or modules.
- Noting any limitations — is something counted using an "Int" and therefore limited to a certain maximum number of entries?
- Saying why any non-intuitive code is needed — what exactly was the purpose of that large temporary array?
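For instance, a minimal hypothetical Python sketch (the file name, function, and limitation are invented purely for illustration) showing a version note, a purpose description, a stated limitation, and an explanation of a non-intuitive choice might look like:

```python
# territory_relativities.py -- version 0.3 (pre-production draft; not yet peer reviewed)

def cap_relativities(relativities, lower=0.5, upper=2.0):
    """Cap territory relativities to the range approved by underwriting.

    Limitation: the capped relativities are NOT rebalanced to average 1.0;
    any off-balance is handled later in the indication workbook.
    """
    # The capped values are built in a new list rather than edited in place
    # so the raw model output is preserved for the audit trail.
    capped = []
    for r in relativities:
        capped.append(min(max(r, lower), upper))
    return capped
```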
Why you probably shouldn't model coverage options with GLMs
Rating variables can be separated broadly into two categories: those over which the insured has limited control, such as the characteristics of their home or car, or where they live and work, and those over which they have a choice, such as their deductible, policy limits, and coverage options (rental coverage or not?).
A GLM can produce odd results for variables where the insured has a choice. For instance, it may indicate that a higher deductible should be charged more than a lower deductible, or that higher limits should be charged less than lower limits.
Possible reasons for this model behaviour include:
- The GLM tries to account for correlations regardless of whether causation is present. Choosing a higher deductible could indicate a higher risk appetite, which shows up in the insured's loss history (adverse selection). Or, choosing a higher deductible may reflect the greater financial stability of the insured, who is willing and able to retain more risk and tends to have better loss experience (favourable selection).
- An underwriter may have recognized the policy as being a higher risk and required it to be written at a higher deductible.
Ultimately, using a GLM to model deductibles and similar choices may result in a good model for existing policyholders but a poor one for future business. In particular, moving away from pure-premium-based pricing for variables such as deductibles will likely shift insureds' behaviour, causing historical experience to no longer reflect future expectations.
It is better to estimate coverage options such as deductibles, ILFs, peril group factors, etc. using traditional actuarial loss elimination methods rather than a GLM. Include the results of these analyses in the GLM as an offset.
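As a rough illustration only, here is a short Python sketch of a loss elimination calculation for deductible relativities; the ground-up loss amounts and deductible options are made up, and in practice you would use the book's actual ground-up experience (appropriately developed and trended):

```python
import numpy as np

# Made-up ground-up claim amounts, for illustration only.
ground_up = np.array([300, 750, 1200, 2500, 4000, 9000, 15000, 40000])

def loss_elimination_ratio(losses, deductible):
    """Share of ground-up losses eliminated by a straight deductible."""
    return np.minimum(losses, deductible).sum() / losses.sum()

base_deductible = 500
options = [500, 1000, 2500]

# Deductible relativity = losses retained above each option deductible,
# relative to losses retained above the base deductible.
base_retained = 1 - loss_elimination_ratio(ground_up, base_deductible)
relativities = {d: (1 - loss_elimination_ratio(ground_up, d)) / base_retained
                for d in options}
print(relativities)   # base deductible gets 1.0; higher deductibles get smaller factors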
Territory Modelling
There may be too many territories for a GLM to handle (Alice: "Remember a GLM assumes its data is fully credible..."). However, aggregating territories into a manageable number of levels may cause a loss of signal. Instead of using a GLM, it is better to use other methods, such as spatial smoothing, to model territories.
However, you should still keep territory in the GLM as an offset variable. To offset territory in a GLM, include the territory loss cost relativity from the separate model on each record in the data set. Don't forget to transform the relativity to match the link function scale before including it in the linear predictor (for a log link, take the natural log of the relativity); the offset effectively adds 1 × (transformed territory relativity) to the linear predictor. This ensures the GLM variables don't become a proxy for territory.
However, note that the territory modelling process should take into account potential differences in rating characteristics between territories - such as high value homes being concentrated in one territory. Thus, ideally both the territory and GLM models are run alternately in an iterative process until both converge to a stable solution.
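A minimal sketch of both ideas, assuming a log-link Tweedie pure premium GLM in Python with statsmodels: the territory relativity enters the GLM only through a log offset, and the GLM and a stand-in territory model are alternated until the relativities stop moving. The synthetic data, column names, Tweedie variance power, and the simple one-way "territory model" are all assumptions for illustration (the text suggests something like spatial smoothing for the real territory model):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "territory": rng.integers(0, 5, n),
    "amount_of_insurance": rng.uniform(100, 500, n),
    "high_deductible": rng.integers(0, 2, n),
})
df["pure_premium"] = rng.gamma(shape=2.0, scale=50 + 0.2 * df["amount_of_insurance"])

X = sm.add_constant(df[["amount_of_insurance", "high_deductible"]])
terr_rel = pd.Series(1.0, index=df.index)   # start with neutral territory relativities

for _ in range(10):
    # GLM step: territory enters only via the offset, i.e. log(relativity) is
    # added to the linear predictor with an implicit coefficient of 1.
    glm = sm.GLM(df["pure_premium"], X,
                 family=sm.families.Tweedie(var_power=1.6),   # log link by default
                 offset=np.log(terr_rel)).fit()

    # Territory step: stand-in for the separate territory model. Actual-to-GLM
    # ratios (excluding territory) are averaged by territory and rebalanced.
    non_terr_pred = np.exp(X @ glm.params)
    ratio = (df["pure_premium"] / non_terr_pred).groupby(df["territory"]).mean()
    ratio /= ratio.mean()
    new_rel = df["territory"].map(ratio)

    if np.max(np.abs(new_rel - terr_rel)) < 1e-4:   # stop once relativities stabilise
        break
    terr_rel = new_rel
```

In practice the territory step would be a genuine spatial model, but the alternating structure of the loop is the point: each model is refit holding the other's output fixed until both are stable.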
Ensembling
Combining the output from two or more models is called an ensemble. The most straightforward way is to take the straight average of each model's output. The intuition is that the average of everyone's guess is likely closer to the answer than any one person's estimate.
Ensembling works best when the models being combined are truly independent, i.e. no sharing of information/insights between modelling teams. The model errors should be uncorrelated if possible. Also, the models shouldn't all systematically miss some information - if everyone guesses low, then the average guess will also be low.
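As a trivial sketch (the prediction values are made up), a straight-average ensemble of two models' outputs is just:

```python
import numpy as np

# Hypothetical predictions for the same four policies from two
# independently built models (say, a GLM and a GBM from separate teams).
pred_a = np.array([410.0, 275.0, 980.0, 130.0])
pred_b = np.array([455.0, 240.0, 1010.0, 150.0])

ensemble = (pred_a + pred_b) / 2   # straight average: 432.5, 257.5, 995.0, 140.0
```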
Full BattleQuiz | Excel Files | Forum |