Today's customer spotlight highlights Resuelve, a Mexican consumer credit agency. There, the engineering and data team, led by José Akle, is using data science to model consumer intent, qualifying potential customers from the tens of thousands of inbound leads generated each month.
Let's dig in:
First, tell us about yourself and your professional background.
We are a team of math and software geeks working to help Resuelve make better data-based decisions.
Currently, machine learning (ML) is what’s making our work fun: for example, natural language processing on phone calls and support tickets, learning about customer behavior patterns, and creating predictions from structured data.
What is Resuelve, and what type of business is it?
Resuelve Tu Deuda (translated: resolve your debt) is a savings program for folks who have gotten in over their heads with debt such as credit card debt, car loans, etc. They come to us, and we give them financial help in the form of restructured debt and more lenient payment plans.
Our job in the engineering organization is to use data to make the company smarter and faster.
What came first for your data/engineering department?
Our first step was simply gathering all the data touchpoints for our current and potential customers, then aggregating data from our various databases into a data warehouse. Before that, our process was largely based on spreadsheets and CRM reports, with data analysts looking at spreadsheets and doing manual joins.
Last, we connected other external data sources, such as email and IVR data, which allowed us to start doing basic lead scoring.
What’s your data stack like now?
We use many PostgreSQL databases and a few MongoDB databases. For business intelligence, we use Metabase and for dashboards we use Klipfolio, which we found after a lot of experimentation.
Our aim was to give people direct access to data and enable each area's analysts to conduct analysis by themselves, without needing help from the engineering team. This also left our data team free to focus on creating better ML models.
Ultimately, our goal is to stop looking at the past, and start predicting the future - as in what type of customer leads are most valuable - using data and machine learning.
The most impactful contribution we’ve made to the business is lead scoring. As a company, we generate around 30,000 leads per month, but we don’t close them all. It takes a lot of effort to call that many people per month, so a lead scoring model helps our company immensely in choosing who to call first.
We started experimenting with prediction models, but soon found out that trying to draw conclusions from just one data source wasn’t effective. However, when we joined Salesforce data with sources such as IVR and email open rates, we got much more predictive models.
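Joining per-lead records from several sources can be sketched as a simple key-based merge. This is purely illustrative: the field names, email keys, and helper below are hypothetical, and the real pipeline runs inside the warehouse rather than in application code.

```python
def join_lead_sources(crm, *extras):
    """Merge per-lead records from several sources, keyed by lead email.
    CRM records come first; extra sources add their fields on top.
    Missing values simply stay absent rather than being invented."""
    merged = {key: dict(fields) for key, fields in crm.items()}
    for source in extras:
        for key, fields in source.items():
            merged.setdefault(key, {}).update(fields)
    return merged

# Hypothetical rows from three sources, keyed by lead email.
crm = {"ana@example.com": {"age": 34}, "luis@example.com": {"age": 41}}
email_stats = {"ana@example.com": {"open_rate": 0.6}}
ivr = {"luis@example.com": {"answered_calls": 2}}

merged = join_lead_sources(crm, email_stats, ivr)
```

After the merge, each lead carries whatever behavioral fields its sources provided, which is the shape a scoring model would consume.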
We classify leads on a 100-point score that predicts which of our potential customers are most likely to close. Pulling in all these data points from their various sources was something we could only do via Panoply.
What data sources do you ingest into Panoply?
- Zendesk (customer service platform)
- Asterisk for IVR (Interactive Voice Response)
- Postgres and Mongo databases
This is fascinating - what are the components of the lead score?
First, we started with demographic data such as age, gender, location, etc., from our CRM. Then, from email, we analyze open rates, click-through rates, email clients, etc. (for example, @gmail addresses are more ‘valuable’ leads than @yahoo or @hotmail addresses). We also look at users’ browsers - for example, Chrome users are highly valuable to us.
We also have to do a lot of data cleansing. We started with around 1,000 variables, and now we have around 30. We eliminated things like variables with no variance, incomplete data, and redundant or leaky data.
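Two of the pruning steps mentioned (dropping no-variance and redundant variables) can be sketched in a few lines. This is a minimal illustration, not Resuelve's pipeline; the thresholds and the "keep the first of each correlated pair" rule are assumptions.

```python
import numpy as np

def prune_features(X, names, var_eps=1e-12, corr_max=0.95):
    """Drop columns with (near-)zero variance, then drop one column of
    each highly correlated (redundant) pair. Thresholds are illustrative."""
    X = np.asarray(X, dtype=float)
    keep = X.var(axis=0) > var_eps          # remove no-variance features
    X = X[:, keep]
    names = [n for n, k in zip(names, keep) if k]

    corr = np.corrcoef(X, rowvar=False)
    drop = set()
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if j not in drop and abs(corr[i, j]) > corr_max:
                drop.add(j)                 # keep the first of each redundant pair
    keep_idx = [i for i in range(len(names)) if i not in drop]
    return X[:, keep_idx], [names[i] for i in keep_idx]

# Toy example: a constant column and a column that is 2x another.
X = [[1, 5, 10, 0.2],
     [1, 6, 12, 0.9],
     [1, 7, 14, 0.4]]
names = ["constant", "opens", "opens_x2", "noise"]
X2, kept = prune_features(X, names)  # "constant" and "opens_x2" are removed
```

Leaky features (variables that encode the outcome itself) still need human judgment; no variance or correlation test catches those automatically.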
We then run it all through neural networks to generate the lead score models. In terms of perfecting our model, the biggest improvements have come from adding more examples and new variables that give us visibility into prospects’ behavior. For example, we recently started validating the email addresses of prospects, which has been highly valuable.
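As a rough illustration of neural-network lead scoring on a 0-100 scale, the toy model below trains a one-hidden-layer network on synthetic data. The features, labels, and architecture are all assumptions for the sketch, not Resuelve's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical behavioral features per lead: [email_opens, clicks, answered_ivr]
X = rng.random((200, 3))
# Toy labeling rule: leads that both open and click tend to close.
y = ((X[:, 0] + X[:, 1]) > 1.0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 tanh units, trained with plain gradient descent.
W1 = rng.normal(0, 0.5, (3, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

lr = 0.5
for _ in range(2000):
    H = np.tanh(X @ W1 + b1)             # hidden activations
    p = sigmoid(H @ W2 + b2).ravel()     # predicted closing probability
    d2 = (p - y)[:, None] / len(y)       # binary cross-entropy gradient
    dW2 = H.T @ d2; db2 = d2.sum(0)
    dH = d2 @ W2.T * (1 - H ** 2)        # backprop through tanh
    dW1 = X.T @ dH; db1 = dH.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

def lead_score(x):
    """Map a feature vector to a 0-100 score, as described in the interview."""
    h = np.tanh(x @ W1 + b1)
    return float(100 * sigmoid(h @ W2 + b2)[0])
```

An engaged lead (high opens and clicks) should score above an unengaged one, which is exactly the ordering the sales team uses to decide whom to call first.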
How long have you had your data warehouse lead scoring model in place?
We’re on version 7 and we’ve had our model in place for 5 months.
How do you decide which elements to add to the data scoring algorithms?
Anywhere we can discover new behavioral data, as it beats demographic data every time when it comes to predicting consumer behavior.
Next, we’d like to roll in marketing analytics information to perfect our algorithm. For now, our phone and other data points have been very valuable to us.
What were your requirements for a data warehouse?
- Being able to integrate with dozens of different data sources
- Not needing a dedicated staff to manage the data warehouse
- Being able to query the data directly when needed
- Being built on a trusted engine such as Redshift or BigQuery
- Strong customer support (Panoply’s has been unrivaled)
What’s your favorite Panoply feature?
My favorite facet of Panoply is how easy it is to add new data sources to our warehouse. It’s point and click and the data’s there. It’s fantastic.
Customer support would be number two, and the fact that Panoply ‘just works’ is priceless.