Credit risk analysis (finance risk analysis, loan default risk analysis) and credit risk management
are important to financial institutions that provide loans to businesses and individuals. Credit
arises in many forms: bank mortgages (or home loans), motor vehicle purchase finance,
credit card purchases, installment purchases, and so on. Every loan carries a risk of
default. To understand the risk levels of credit users, credit providers normally collect vast
amounts of information on borrowers. Statistical predictive analytic techniques can be used to
analyze and determine the risk levels involved in credit, finance, and loans, i.e., default risk
levels.
It has been argued that the current global financial meltdown is the consequence of the abuse of
risky holistic risk management methods, such as derivative-based insurance and Monte Carlo
techniques, by management and professionals. Note that Monte Carlo is a gambler's method that
does not reflect real risk. By relying on such methods, they encouraged risky lending such as
NINJA and subprime mortgages. They prescribed risk models which in essence hide the risk involved
and transfer it directly to insurers through shadowy derivative financial instruments such as CDS
and CDO. Isn't it time to implement sound predictive risk management systems based on empirical
data, as described on this page?
Personal credit scores are normally computed from information in credit reports
collected by external credit bureaus and ratings agencies. Credit scores may indicate a person's
financial history and current situation. However, they do not tell you exactly what separates a
"good" score from a "bad" score. More specifically, they do not tell you the level of risk for the
lending you may be considering. The internal credit scoring methods described on this page address
this problem. Internal credit scoring techniques can be applied to commercial
credit as well.
Analyzing such vast information is an extremely difficult and challenging task! In conventional
methods, factor analysis is performed on a few (to several) variables at a time using statistical
software. As the total number of variables increases, the number of combinations to be examined
in this way grows combinatorially. When a large number of variables is involved, the number of
combinations is too large to be examined manually. Thorough, systematic, and accurate analysis is
all but impossible! The conventional workaround is to examine only the combinations that are
likely to have influence. However, relying on hunches can leave out important factors unnoticed.
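To make the combinatorial growth concrete, the short sketch below counts how many variable groups of up to three variables would have to be examined. The function name and the chosen variable counts are illustrative assumptions, not figures from any particular data set.

```python
from math import comb


def count_combinations(num_variables: int, max_group_size: int) -> int:
    """Count the distinct variable groups of size 1..max_group_size."""
    return sum(comb(num_variables, k) for k in range(1, max_group_size + 1))


# Illustrative variable counts only.
for n in (10, 50, 200):
    print(n, "variables ->", count_combinations(n, 3), "groups of up to 3")
```

This counts only the variable groups; each group also has to be examined across the values its variables take, so the real workload is larger still.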
Fortunately, this problem can be overcome with CMSR Hotspot Profiling Analysis. Hotspot
profiling analysis drills down into data systematically and detects important relationships,
co-factors, interactions, dependencies, and associations among many variables and values accurately
using Artificial Intelligence techniques, and generates profiles of the most interesting segments.
Hotspot analysis can identify profiles of high (and low) risk loans accurately through thorough,
systematic analysis of all available data. The following are examples of hotspot profiling applied
to credit information.
Finance risk factor profiling examples
Finance risk factor profiles can be easily developed with CMSR. The following examples describe how
CMSR hotspot analysis tools can be used in developing profiles.
[Example 1] A financing firm (or bank) keeps loan records on motor vehicle purchases in its
database, including default information: gender, age, education, occupation, income; vehicle type,
manufacturer, model, year of make, price, loan amount, default, default amount, etc. The firm
wishes to know which types of loans for motor vehicle purchases are at the highest risk, i.e.,
have the highest default ratio by probability.
[Example 2] For the same data, the bank wishes to know which types of loans for motor vehicle
purchases are at the lowest risk in terms of the lowest average default amounts.
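The idea behind Example 1 can be sketched as a brute-force search that groups loan records by every combination of up to two attributes and ranks the resulting segments by default ratio. This is not the CMSR hotspot algorithm, only a minimal illustration; the records, field names, and the `rank_segments` helper are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical loan records; field names and values are illustrative only.
LOANS = [
    {"age_band": "18-25", "vehicle_type": "sports", "default": 1},
    {"age_band": "18-25", "vehicle_type": "sports", "default": 1},
    {"age_band": "18-25", "vehicle_type": "sedan", "default": 0},
    {"age_band": "26-40", "vehicle_type": "sports", "default": 0},
    {"age_band": "26-40", "vehicle_type": "sedan", "default": 0},
    {"age_band": "26-40", "vehicle_type": "sedan", "default": 0},
    {"age_band": "41+", "vehicle_type": "sedan", "default": 0},
    {"age_band": "41+", "vehicle_type": "suv", "default": 1},
    {"age_band": "41+", "vehicle_type": "suv", "default": 0},
    {"age_band": "18-25", "vehicle_type": "sports", "default": 1},
]


def rank_segments(records, attributes, max_depth=2, min_size=2):
    """Return (default_ratio, size, profile) tuples, highest ratio first.

    A profile is a tuple of (attribute, value) pairs describing the segment.
    Segments with fewer than min_size records are skipped as too small.
    """
    results = []
    for depth in range(1, max_depth + 1):
        for attrs in combinations(attributes, depth):
            groups = defaultdict(list)
            for rec in records:
                key = tuple((a, rec[a]) for a in attrs)
                groups[key].append(rec["default"])
            for key, outcomes in groups.items():
                if len(outcomes) >= min_size:
                    ratio = sum(outcomes) / len(outcomes)
                    results.append((ratio, len(outcomes), key))
    # Highest default ratio first; larger segments win ties.
    results.sort(key=lambda r: (-r[0], -r[1]))
    return results


for ratio, size, profile in rank_segments(LOANS, ["age_band", "vehicle_type"])[:3]:
    print(f"{ratio:.2f}  n={size}  {dict(profile)}")
```

For Example 2, the same grouping could rank segments by average default amount in ascending order instead of by default ratio.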
• Neural Network is a very powerful modeling tool. It generally offers the most accurate and
versatile models. It's very easy to develop neural network predictive models with CMSR.
Network visualization tools guide users through configuration, training, testing, and,
more importantly, direct application to databases.
• Cramer Decision Tree produces the most compact and thus most general decision trees.
Decision trees can be used for predicting segmentation-based statistical probability of
credit loan defaults.
• Regression produces mathematical functions for predicting default risk levels. It can be
very limiting when used as a general-purpose credit risk predictive modeling method.
However, when it is used together with the above methods, it can be very useful.
Classification models assign events to categorical classes, say, "risky" or "safe". Classification
methods are supported by decision tree, SVM, neural network, etc. Intuitively, this is a very
appealing approach, as predictions are made in terms that anyone can understand! However,
there is a serious drawback in applying classification techniques to credit risk management. The
problem lies in the fact that credit defaults are in general very low-ratio events, say, less than
10%. Developing predictive models with skewed data is very difficult, especially with decision
tree classification. Decision trees develop predictive models by segmenting populations into
smaller groups recursively. They use the dominant category (or most frequent value) of each
segment as the predicted value for the segment. With two classes, the dominant category is the
value held by over 50% of the segment population. Credit users are already well screened. It is
possible that no segment contains more than 50% risky customers! Even if one exists, it may be only
slightly over 50%! Segments in which 49% of customers have a default history will be predicted as
"not" risky, although they are very high risk segments! This type of model will have very low
accuracy in predicting risky customers as "risky". Much worse, as a consequence, more non-risky
customers may end up being classified as "risky". These are not very useful properties! It is
important to note that all classification techniques have this limitation. To overcome this
problem, you may be tempted to use tricks such as introducing extra instances. However, such tricks
will necessarily distort the overall representation of the population. The problem still remains!
A better approach is credit scoring using statistical probability, described in the next sections.
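The dominant-category problem can be illustrated with a few lines of code. The sketch below compares the label a classification leaf would assign with the default ratio of the same segment; the segment names and counts are made-up figures for illustration.

```python
def majority_label(defaults: int, total: int) -> str:
    """Label a segment by its dominant class, as a classification leaf does."""
    return "risky" if defaults / total > 0.5 else "safe"


def default_ratio(defaults: int, total: int) -> float:
    """Score a segment by its observed default ratio instead of a label."""
    return defaults / total


# Hypothetical segments: (number of defaulted customers, segment size).
segments = {"A": (49, 100), "B": (8, 100), "C": (51, 100)}

for name, (defaults, total) in segments.items():
    label = majority_label(defaults, total)
    ratio = default_ratio(defaults, total)
    print(f"segment {name}: label={label}, default ratio={ratio:.2f}")
```

Segments A and B receive the same "safe" label even though A defaults about six times as often; the default ratio keeps that difference visible.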
Do regression methods work?
Generally speaking, regression methods don't work well for complex modeling. This is
especially true if the modeling data have severe skews. They tend to produce rather random
predictions. The following histograms compare different modeling techniques
under severe data skew:
By Neural network
Neural network is a very powerful modeling framework. As shown in the left figure, it can
learn in great detail. Most green areas are located below 0.4. Most red areas are located
above 0.4.
Credit Scoring
An (internal) credit score is a numerical rating of credit loans. It measures the level of risk of
default. The level of default risk can be best predicted with predictive modeling. Credit scores
can be measured in terms of default probability or relative numerical ratings. The following
subsections outline several credit scoring methods:
A decision tree divides customer loan segments into smaller sub-segments recursively. At each
segment, the split is made in a way that boosts the proportion of either defaulted loans or
fully-recovered loans in each resulting sub-segment. This process repeats until no further
improvement can be made.
The above figure shows a CMSR decision tree. Customer loan segments are partitioned
recursively in a way that increases the proportion of either defaulted or fully-recovered loans. In
the figure, red represents the defaulted portion of loans and green the fully-recovered portion.
Nodes in red indicate that over 50% of the customers in the segment have defaulted loans. Green
nodes have less than 50% defaulted customers.
For a new loan application, when the customer's information is applied to the tree, it will normally
lead to a terminal node segment. The default ratio of that node is used as the credit score of the
customer. If the segment had a 35% default ratio in the past, the score will be 35% (or 0.35). For
more information, please read Decision Tree Software.
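The scoring step described above can be sketched with a small hand-built tree. The split variables, thresholds, and leaf counts below are hypothetical and are not taken from a CMSR model; the point is only that the applicant is routed to a terminal node and the historical default ratio of that node becomes the score.

```python
# A hand-built toy tree; split variables, thresholds and counts are hypothetical.
TREE = {
    "split": ("income", 30000),  # go left if income < 30000
    "left": {
        "split": ("age", 25),  # go left if age < 25
        "left": {"defaults": 35, "total": 100},
        "right": {"defaults": 12, "total": 150},
    },
    "right": {"defaults": 9, "total": 300},
}


def credit_score(node: dict, applicant: dict) -> float:
    """Walk the tree to a terminal node and return its historical default ratio."""
    while "split" in node:
        field, threshold = node["split"]
        node = node["left"] if applicant[field] < threshold else node["right"]
    return node["defaults"] / node["total"]


applicant = {"income": 22000, "age": 23}
print(f"credit score: {credit_score(TREE, applicant):.2f}")
```

Every applicant who lands in the same terminal node receives the same score, so the score is only as fine-grained as the tree's segmentation.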
Better modeling method: Predicting relative default risk level
Tree-based credit scoring provides coarse-level prediction. It lacks the accuracy that neural
network models can produce. Neural network is a very powerful predictive modeling technique.
It is modeled on animal nervous systems (e.g., human brains). At the heart of the
technique is an (artificial) neural network. Neural networks can learn to predict in detail with
high accuracy. The following shows the neural network module of CMSR:
A neural network works differently from a decision tree. It can be trained to predict either
relative default levels or expected default amounts. In the former case, the network predicts the
relative level of credit default. In the latter, it predicts expected default amounts. The following
are histograms showing the distribution of credit scores predicted by a neural network credit
scoring model. Note that red marks defaulted loans and green marks fully-recovered loans.
Clearly, the neural network model gives defaulted loans higher scores and fully-recovered
loans lower scores. By analyzing the distribution of scores, default probability may be
deduced.
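One simple way to deduce default probability from a score distribution is to bin the scores and measure the observed default rate in each bin. The sketch below assumes scores lie in the range 0 to 1; the sample scores, outcomes, and the `default_rate_by_bin` helper are hypothetical.

```python
def default_rate_by_bin(scores, defaulted, num_bins=5):
    """Return the observed default rate for each equal-width score bin.

    scores are assumed to lie in [0, 1]; defaulted holds 1 for a defaulted
    loan and 0 for a fully-recovered one. Empty bins yield None.
    """
    counts = [0] * num_bins
    defaults = [0] * num_bins
    for score, outcome in zip(scores, defaulted):
        idx = min(int(score * num_bins), num_bins - 1)  # a score of 1.0 goes in the last bin
        counts[idx] += 1
        defaults[idx] += outcome
    return [defaults[i] / counts[i] if counts[i] else None for i in range(num_bins)]


# Hypothetical model scores and actual outcomes for past loans.
scores = [0.05, 0.10, 0.15, 0.30, 0.35, 0.45, 0.55, 0.65, 0.70, 0.85, 0.90, 0.95]
defaulted = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

for i, rate in enumerate(default_rate_by_bin(scores, defaulted)):
    print(f"score bin {i}: default rate = {rate}")
```

A new loan's score can then be mapped to the default rate of the bin it falls into. With real data, each bin needs enough loans for the rate to be meaningful.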
*** Find out the limitations of predictive modeling based credit risk management in the next
section.
Credit industries rely heavily on judgmental methods. Judgments are made from past experience
on important factors such as customer payment history, debt service capacity, leverage, relevant
references, credit agency ratings, and information extracted from various financial statements.
Judgmental rules are used to arrive at ratings.
Normally, this process is performed manually. With the advancement of predictive rule engines,
it is now possible to automate this process. Automation can incorporate the best of both judgmental
scoring and statistical scoring methods. Critical data which form the basis of judgment can be
collected from financial statements, credit agency reports, past customer payment records, and so
on. Judgmental data may be included as well. Judgmental data are subjective soft data. From
financial statements, certain judgmental data may be extracted as subjective assessments by staff.
Rules are developed to score risks based on critical and judgmental data. This type of automated
system promotes scoring consistency and accuracy in ratings while maintaining flexibility.
Predictive models may be included in judgmental rules. That is, rules can be used to assess
the outcomes of statistical predictive models. Combining judgmental and statistical predictive
models can result in best industry practices.
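As a rough illustration of combining judgmental rules with a statistical model, the sketch below applies a policy rule, then judgmental adjustments, to a model-predicted default probability. It is written in plain Python, not in any rule engine's language, and every field name, threshold, and adjustment in it is a hypothetical assumption.

```python
def model_score(applicant: dict) -> float:
    """Stand-in for a trained predictive model; returns a default probability."""
    return applicant["model_probability"]


def rate_applicant(applicant: dict) -> str:
    """Combine a model score with judgmental rules (all thresholds hypothetical)."""
    # Hard policy rule: too many recent missed payments overrides the model.
    if applicant["missed_payments_12m"] >= 3:
        return "decline"

    score = model_score(applicant)

    # Judgmental adjustments based on subjective staff assessment and ratios.
    if applicant["management_quality"] == "weak":
        score += 0.05
    if applicant["debt_service_ratio"] < 1.0:
        score += 0.10

    # Map the adjusted score to a rating.
    if score < 0.10:
        return "approve"
    if score < 0.25:
        return "refer"
    return "decline"


applicant = {
    "model_probability": 0.04,
    "missed_payments_12m": 0,
    "management_quality": "weak",
    "debt_service_ratio": 0.9,
}
print(rate_applicant(applicant))
```

Keeping the rules in one place means every applicant is rated by the same logic, which is the consistency benefit described above.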
Rule-based modeling is a very powerful platform that combines the best of the knowledge of
experienced human experts and the power of predictive modeling. It is ideally suited to
overcoming the limitations of predictive modeling for risk management, and it incorporates
judgmental scoring. Rosella BI Platform provides two rule-based modeling engines: RME and
RME-EP. Both are based on SQL-like rule specification languages. They are very powerful
languages incorporating predictive models along with logical expressions and mathematical
formulas. RME is a procedural language. RME-EP is for rule-based expert systems. Together
they serve as a very powerful platform for risk modeling. For more, please read Expert Systems
Shell - Rule Engines.
The rule-based model specification language in Rosella Platform is based on the powerful SQL
database query language, with enhanced predictive modeling support. The intuitiveness and
expressive power of SQL are well proven. It can easily incorporate the following into credit
scoring models:
• Government regulations.
• Internal business policies.
• Common sense and judgmental rules.
• Industry professional heuristics.