for all the customers of one of my clients.
The churn prediction model I have built so far is giving good results.
The client and I now want to add Customer Satisfaction survey results to the model to improve its predictive power.
The issue with the survey data is that responses exist only for the customers who answered the survey. Right now, I have them for 5,000 of the 23,000 customers.
I don't think I can impute the missing values here, because the fill rate is only about 5/23. Those features would be mostly NA when I score the whole customer base.
How can I use the survey results effectively?
Bottom line: how can I use a feature that is available for only 22% of the dataset?
The only ethical way to do this, IMO, is to treat the non-answer itself as a feature.
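A minimal sketch of what that could look like with pandas, in case it helps. The column names (`customer_id`, `tenure_months`, `satisfaction`), the helper `add_survey_features`, and the `-1` sentinel are all made up for illustration, and it assumes at most one survey row per customer. The idea is a left join that keeps all customers, plus a `responded` flag so the model can tell a real answer from a missing one.

```python
import pandas as pd


def add_survey_features(customers: pd.DataFrame, survey: pd.DataFrame,
                        key: str = "customer_id") -> pd.DataFrame:
    """Attach survey answers to every customer and flag who responded.

    Hypothetical helper: assumes `survey` has at most one row per customer.
    """
    # Left join keeps every customer; non-respondents get NaN in survey columns.
    merged = customers.merge(survey, on=key, how="left", indicator=True)

    # 1 if the customer answered the survey, 0 otherwise.
    merged["responded"] = (merged["_merge"] == "both").astype(int)
    merged = merged.drop(columns="_merge")

    # Fill missing answers with a sentinel (an assumption, not a real score);
    # the `responded` flag lets the model distinguish it from a true answer.
    survey_cols = [c for c in survey.columns if c != key]
    merged[survey_cols] = merged[survey_cols].fillna(-1)
    return merged


# Toy data standing in for the real customer base and survey export.
customers = pd.DataFrame({"customer_id": [1, 2, 3, 4],
                          "tenure_months": [12, 3, 40, 7]})
survey = pd.DataFrame({"customer_id": [2, 4],
                       "satisfaction": [4, 2]})

features = add_survey_features(customers, survey)
print(features)
```

A sentinel like this tends to suit tree-based models; for a linear model you would probably want a different fill value (for example the respondent mean) alongside the same flag. It is also worth checking whether `responded` alone carries signal, since who answers a survey is rarely random.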