Sparse data is a major concern when building loss given default (LGD) models for corporate credit risk. Most LGD predictors are instrument-related, firm-specific, macroeconomic or industry-specific variables, and collecting such data can be relatively costly. In one example from Loeffler and Posch’s book, the industry-wise average default rate, the yearly average default rate and firm-wise leverage were used to predict LGD, and LGD itself was put through a cumbersome transformation to improve predictive power [Ref. 1]. A nonlinear model is an alternative worth considering.

In a conference paper on consumer risk scoring, Liu et al. point out that a generalized additive model (GAM) can detect nonlinear relationships between risk behavior and predictors [Ref. 2]. In this example, the parameter of most interest is arguably that of firm-specific leverage (lev). I therefore used Proc GAM to estimate a linear parameter for lev while smoothing the other two predictors with LOESS functions. For comparison, I also used Proc LOESS to fit a fully nonparametric regression on all three predictors. Plotted together as series, the LGD predictions from the two methods are quite close. As a result, Proc GAM may offer a useful tool for constructing a meaningful semiparametric regression to predict LGD.
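In symbols, the semiparametric specification described above can be sketched as follows. This assumes a Gaussian response with an identity link; the symbols \beta_0, \beta_1, s_1, s_2 and \varepsilon_i are notation introduced here for illustration and are not taken from the references.

```latex
\[
\mathrm{lgd}_i \;=\; \beta_0 \;+\; \beta_1\,\mathrm{lev}_i
\;+\; s_1(\mathrm{i\_def}_i) \;+\; s_2(\mathrm{lgd\_a}_i) \;+\; \varepsilon_i
\]
```

Here \beta_1 is the linear (parametric) effect of leverage, while s_1 and s_2 are unspecified smooth functions estimated by LOESS. The Proc LOESS fit, by contrast, replaces the whole right-hand side with a single smooth function of all three predictors.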

References:

1. Gunter Loeffler and Peter Posch. ‘Credit Risk Modeling Using Excel and VBA’. 2nd edition. Wiley. 2011.

2. Wensui Liu, Chuck Vu and Jimmy Cela. ‘Generalizations of Generalized Additive Model (GAM): A Case of Credit Risk Modeling’. SAS Global Forum 2009.

data _tmp01;

infile "h:\raw_data.txt" delimiter = '09'x missover dsd firstobs=2;

informat lgd lev lgd_a i_def 8.3;

label lgd = 'Real loss given default'

lev = 'Leverage coefficient by firm'

lgd_a = 'Mean default rate by year'

i_def = 'Mean default rate by industry';

input lgd lev lgd_a i_def;

run;

ods html gpath = 'h:\' style = money;

ods graphics on;

proc loess data=_tmp01;

model lgd = lev lgd_a i_def / scale = sd select = gcv degree = 2;

score;

ods output scoreresults = predloess;

run;

proc gam data= _tmp01 plots = components(clm);

model lgd = loess(i_def) loess(lgd_a) param(lev) / method = gcv;

output out = predgam p = pbygam;

run;

ods graphics off;

data _tmp02;

/* One-to-one merge without a BY statement: assumes both outputs keep the input row order */

merge predloess predgam;

keep p_LGD LGD pbygamLGD obs;

label p_LGD = 'Prediction by Proc LOESS'

pbygamLGD = 'Prediction by Proc GAM';

run;

proc sgplot data = _tmp02;

series x = obs y = lgd ;

series x = obs y = p_lgd;

series x = obs y = pbygamlgd;

yaxis label = 'loss given default';

run;

ods html close;