Sparse data is a major concern when building loss given default (LGD) models for corporate credit risk. Most LGD predictors are instrument-related, firm-specific, macroeconomic, or industry-specific variables, and collecting such data can be costly. In one example from Loeffler and Posch's book, the industry-level average default rate, the yearly average default rate, and the firm-level leverage are used to predict LGD, and LGD is put through a cumbersome transformation to improve predictive power [Ref. 1]. Nonlinear models are an alternative worth considering.

In a conference paper on consumer risk scoring, Wensui Liu and his coauthors note that the generalized additive model (GAM) can detect nonlinear relationships between risk behavior and predictors [Ref. 2]. In this example, the parameter of main interest is the one for firm-specific leverage (lev). I therefore used Proc GAM to estimate this variable's parameter while smoothing the other predictors with LOESS functions. In addition, I used Proc LOESS to fit a fully nonparametric regression. When the two methods are compared in a series plot, their LGD predictions are quite close. As a result, Proc GAM may be a useful tool for building an interpretable semiparametric regression to predict LGD.
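
As a rough sketch, the two specifications fitted in the SAS code can be written out as follows. The symbols \beta_0, \beta_1, s_1, s_2, g and \varepsilon are notation introduced here for illustration and do not come from the references. The identity link is an assumption, based on the code not setting a distribution option for Proc GAM.

```latex
% Semiparametric model (Proc GAM): lev enters linearly, the other two
% predictors enter through univariate LOESS smoothers s_1 and s_2.
\[
\mathrm{lgd}_i = \beta_0 + \beta_1\,\mathrm{lev}_i
               + s_1(\mathrm{i\_def}_i) + s_2(\mathrm{lgd\_a}_i) + \varepsilon_i
\]

% Nonparametric model (Proc LOESS): one smooth surface g over all three
% predictors, with no separate coefficient for lev.
\[
\mathrm{lgd}_i = g(\mathrm{lev}_i,\ \mathrm{lgd\_a}_i,\ \mathrm{i\_def}_i) + \varepsilon_i
\]
```

The appeal of the first form is that \beta_1 can be read directly as the leverage effect, while the second form gives a flexible benchmark for the predictions.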

1. Gunter Loeffler and Peter Posch. ‘Credit Risk Modeling using Excel and VBA’. 2nd edition. Wiley. 2011
2. Wensui Liu, Chuck Vu, Jimmy Cela. ‘Generalizations of Generalized Additive Model (GAM): A Case of Credit Risk Modeling’. SAS Global Forum 2009

/* Read the tab-delimited LGD data; the first line holds the column headers */
data _tmp01;
  infile "h:\raw_data.txt" delimiter = '09'x missover dsd firstobs = 2;
  informat lgd lev lgd_a i_def 8.3;
  label lgd = 'Real loss given default'
        lev = 'Leverage coefficient by firm'
        lgd_a = 'Mean default rate by year'
        i_def = 'Mean default rate by industry';
  input lgd lev lgd_a i_def;
run;

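/* Send output to HTML and enable ODS Graphics for the fit plots */
ods html gpath = 'h:\' style = money;
ods graphics on;

/* Nonparametric fit: local quadratic regression, smoothing parameter chosen by GCV */
proc loess data = _tmp01;
  model lgd = lev lgd_a i_def / scale = sd select = gcv degree = 2;
  /* Score the input data so that the ScoreResults table (with p_lgd) is available */
  score data = _tmp01;
  ods output scoreresults = predloess;
run;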

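/* Semiparametric fit: lev enters linearly, i_def and lgd_a are smoothed by LOESS */
proc gam data = _tmp01 plots = components(clm);
  model lgd = loess(i_def) loess(lgd_a) param(lev) / method = gcv;
  output out = predgam p = pbygam;
run;
/* Turn graphics off only after the step has run, so the component plots are still drawn */
ods graphics off;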

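/* One-to-one merge of the two prediction sets, which share the row order of _tmp01 */
data _tmp02;
  merge predloess predgam;
  /* Row index for the x-axis, rebuilt here so the plot does not depend on the ODS table supplying it */
  obs = _n_;
  keep p_LGD LGD pbygamLGD obs;
  label p_LGD = 'Prediction by Proc LOESS'
        pbygamLGD = 'Prediction by Proc GAM';
run;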

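/* Overlay the actual LGD and both sets of predictions */
proc sgplot data = _tmp02;
  series x = obs y = lgd;
  series x = obs y = p_lgd;
  series x = obs y = pbygamlgd;
  yaxis label = 'loss given default';
run;
/* Close the HTML destination only after the plot step has run */
ods html close;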