Sparse data is a major concern when building loss given default (LGD) models for corporate risk. Most LGD predictors are instrument-related, firm-specific, macroeconomic or industry-specific variables, and collecting such data can be relatively costly. In one example from Loeffler and Posch’s book, the industry-wise average default rate, the yearly average default rate and firm-wise leverage were used to predict LGD, and a cumbersome transformation of LGD was applied to improve predictive power [Ref. 1]. Nonlinear models are worth considering as an alternative.

In a conference paper on consumer risk scoring, Liu, Vu and Cela note that the generalized additive model (GAM) can detect nonlinear relationships between risk behavior and predictors [Ref. 2]. In this example, the parameter of primary interest is arguably that of firm-specific leverage (lev). Thus I used Proc GAM to estimate this variable’s parameter while smoothing the other predictors with LOESS functions. For comparison, I also used Proc LOESS to fit a fully nonparametric regression. Plotted together in a series plot, the LGD predictions from the two methods are quite close. As a result, Proc GAM may be a useful tool for building interpretable semiparametric regressions to predict LGD.
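
As a rough sketch, the two specifications compared here can be written as follows. The symbols \beta_0, \beta_1, s_1, s_2, g and \varepsilon_i are notation introduced for illustration and do not come from the references. The first line assumes Proc GAM's default Gaussian distribution with identity link, which is what the code below uses since no other distribution is requested.

```latex
% Semiparametric model (Proc GAM): lev enters linearly, the other two predictors are smoothed
\mathrm{LGD}_i = \beta_0 + \beta_1\,\mathrm{lev}_i + s_1(\mathrm{i\_def}_i) + s_2(\mathrm{lgd\_a}_i) + \varepsilon_i

% Fully nonparametric model (Proc LOESS): one local regression surface over all three predictors
\mathrm{LGD}_i = g(\mathrm{lev}_i,\ \mathrm{lgd\_a}_i,\ \mathrm{i\_def}_i) + \varepsilon_i
```

Under the first form, \beta_1 can be read as the change in expected LGD per unit change in leverage, holding the smoothed terms fixed. The second form offers no such single coefficient.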

References:
1. Gunter Loeffler and Peter Posch. ‘Credit Risk Modeling using Excel and VBA’. 2nd edition. Wiley. 2011
2. Wensui Liu, Chuck Vu and Jimmy Cela. ‘Generalizations of Generalized Additive Model (GAM): A Case of Credit Risk Modeling’. SAS Global Forum 2009

/* Read the tab-delimited raw data, starting from the second line */
data _tmp01;
    infile "h:\raw_data.txt" delimiter = '09'x missover dsd firstobs = 2;
    informat lgd lev lgd_a i_def 8.3;
    label lgd   = 'Real loss given default'
          lev   = 'Leverage coefficient by firm'
          lgd_a = 'Mean default rate by year'
          i_def = 'Mean default rate by industry';
    input lgd lev lgd_a i_def;
run;

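/* Open the HTML destination; graph files are written to the gpath folder */
ods html gpath = 'h:\' style = money;
ods graphics on;

/* Fully nonparametric fit: local quadratic LOESS on all three predictors, */
/* with the smoothing parameter selected by GCV */
proc loess data = _tmp01;
    model lgd = lev lgd_a i_def / scale = sd select = gcv degree = 2;
    score;
    ods output scoreresults = predloess;
run;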

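/* Semiparametric fit: lev enters as a linear (parametric) term, */
/* i_def and lgd_a are LOESS-smoothed with GCV-selected smoothing parameters */
proc gam data = _tmp01 plots = components(clm);
    model lgd = loess(i_def) loess(lgd_a) param(lev) / method = gcv;
    output out = predgam p = pbygam;
run;
ods graphics off;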

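/* One-to-one merge (no BY statement): relies on both outputs */
/* keeping the row order of _tmp01 */
data _tmp02;
    merge predloess predgam;
    keep p_LGD LGD pbygamLGD obs;
    label p_LGD     = 'Prediction by Proc LOESS'
          pbygamLGD = 'Prediction by Proc GAM';
run;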

/* Overlay actual LGD and the two sets of predictions by observation number */
proc sgplot data = _tmp02;
    series x = obs y = lgd;
    series x = obs y = p_lgd;
    series x = obs y = pbygamlgd;
    yaxis label = 'loss given default';
run;
ods html close;