Underwriting Manual Construction: Tables vs. Calculators

Yeah, I know, exciting, right?

Well, I have had a good amount of experience authoring manual sections, and it seems that no matter where you go there are a lot of opinions about the use of calculators vs. tables, especially in sections dealing with labs like liver function tests or with certain cancers like breast, prostate, and colon cancer.

To review the points on both sides of the debate, it helps to consider why the debate arises at all. I think it comes from the desire to incorporate as many factors (both favorable and unfavorable) as possible in order to enable the most accurate underwriting possible. This is a fine goal, but it comes with a price: complexity. I have found that once the number of variables one needs to consider grows beyond about six, tables get very long and difficult to read. This is often the point at which a calculator becomes appealing to the author of the manual section.

The major strength of a calculator is consistency: put in the correct inputs and you will get the correct output. The weakness of a calculator is related to this strength. If a mistake is made in the calculator's construction, then the resulting errors are guaranteed to occur in every case. Also, some have criticized calculators as being detrimental to the education of the underwriters using them. If all you do is 'plug and chug', you never develop a sense of which factors are important to the mortality risk. I have seen this myself in the process of interviewing underwriters. I often ask an interviewee whether positive estrogen receptors are a favorable or unfavorable risk factor in breast cancer (they are favorable). I have heard the answer "I'm not sure, I just use the calculator" more than once.

This is unfortunate, and I don't think one needs to sacrifice the promotion of important educational concepts in order to reap the benefits of a calculator. Instead, a well-designed calculator can include indicators, pop-up boxes, or short descriptions that help develop the underwriter's knowledge. For instance, if a breast cancer calculator has a drop-down menu for estrogen receptor status, it is a simple matter to indicate with color codes, up/down arrows, or pop-up boxes that choosing "positive" is a favorable factor. Even making the choice read "positive receptors (favorable)" reinforces the concept without confusion or additional complexity.
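To make the idea concrete, here is a minimal sketch of what a "teaching" drop-down might look like behind the scenes. Everything here is hypothetical: the factor, the labels, and the debit values are illustrative only and are not real underwriting guidance.

```python
# Hypothetical sketch: a drop-down entry for estrogen receptor status whose
# labels carry the risk direction, plus a pop-up-style explanation.
# Debit values are made up for illustration, not real underwriting figures.

# Each option pairs the label shown in the drop-down with a debit adjustment.
ER_STATUS_OPTIONS = {
    "positive receptors (favorable)": -25,    # favorable: lowers the rating
    "negative receptors (unfavorable)": 25,   # unfavorable: raises the rating
    "unknown (neutral)": 0,
}

def describe_choice(label: str) -> str:
    """Return a short pop-up-style explanation for the selected option."""
    adjustment = ER_STATUS_OPTIONS[label]
    if adjustment < 0:
        return f"{label}: improves the rating by {-adjustment} debits."
    if adjustment > 0:
        return f"{label}: worsens the rating by {adjustment} debits."
    return f"{label}: no effect on the rating."

print(describe_choice("positive receptors (favorable)"))
```

The point is that the teaching happens at the moment of data entry: the underwriter sees the direction of the factor every time they use it, at no cost in complexity.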

So, in my opinion, good calculator design can improve accuracy and consistency while also promoting clarity and understanding. Of course, good calculator design is a complex task and may be a topic for another post.

Machine Learning in Healthcare (and otherwise)

The latest issue of JAMA arrived in my snail-mail box this week, and it contained an editorial about machine learning in healthcare. This was, no doubt, in follow-up to a recent issue of JAMA which contained two articles about deep learning algorithms that performed well in the detection of diabetic retinopathy and metastatic cancer.

This editorial does an excellent job of balancing the exciting potential of such algorithms with realistic expectations. Further, it succinctly encapsulates the various tiers of algorithmic complexity. Perhaps most interestingly, the editorial contains a figure which maps many familiar and not-so-familiar algorithms onto a chart of human involvement vs. data volume. One of the cited algorithms comes from a study of coronary risk scoring developed by examining the electronic records of primary care patients in England. The authors used several techniques (random forest, logistic regression, gradient boosting, and neural networks) and compared their performance to a classic risk score from the American Heart Association. They found that all of the machine learning algorithms outperformed the risk score. The intriguing part to me was that logistic regression performed nearly as well as neural networks, and quite a bit better than random forest. Two reasons why that is interesting: 1) I am a nerd, 2) neural nets and random forests are notoriously 'black box', while logistic regression is very clear about how the risk prediction is affected by the input variables (so-called 'glass box').

The bottom line is this: it is possible to find models which perform extremely well and that still maintain clarity.
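A small sketch of what "glass box" means in practice, using synthetic data (not the cited study's data, and the factor names are invented for illustration): after fitting a logistic regression, each coefficient directly shows how much that input shifts the log-odds of the outcome.

```python
# Illustrative sketch with synthetic data: logistic regression is a 'glass
# box' because its fitted coefficients show how each input shifts the
# log-odds of the outcome. Factor names here are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
# Two standardized synthetic risk factors.
X = rng.normal(size=(n, 2))
# True model: the first factor matters three times as much as the second.
log_odds = 1.5 * X[:, 0] + 0.5 * X[:, 1]
y = rng.random(n) < 1 / (1 + np.exp(-log_odds))

model = LogisticRegression().fit(X, y)
# The coefficients recover the relative importance of each factor --
# exactly the kind of insight a random forest or neural net hides.
print(dict(zip(["factor_1", "factor_2"], model.coef_[0].round(2))))
```

A random forest fitted to the same data would predict about as well, but you would need extra tooling (feature importances, partial dependence) to get anything like this direct a read on each factor.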

You can find the full article here.

The Analytic Process

Spotted this at Harvard Business Review. It is a short review of the analytic process in business and how it can go astray. It is definitely worth a read if you are involved with the implementation or creation of analytic models for business.