When Your Best Customers Don’t Want What You’re Selling

Predicting Customer Behavior: A Data Science Journey Through Insurance Marketing
Imagine you’re a marketing director at an insurance company launching a new product. Your team is excited — this offering consolidates coverage in ways customers have been asking for. You’ve got a database of 14,000 customers. The question keeping you up at night: Who should we target?
Common wisdom says target your loyal customers, right? People who already trust you and buy your products. Spend your marketing budget on those established relationships.
But what if the data told you the exact opposite? What if your most loyal customers, the ones already using your products, were the least likely to buy your new offering?
This is the story of a real predictive analytics project that challenged conventional marketing wisdom — and revealed surprising truths about customer behavior. It’s also a story about mistakes, corrections, and the messy reality of data science work.
The Challenge: 14,000 Customers, One Question
The insurance company had rich data on 14,016 customers:
- Demographics (age, relationship length)
- Purchase history (which products they owned)
- Spending patterns (how much they spent on each product)
- Loyalty classification (though half the customers were mysteriously “unclassified”)
- The outcome: whether each customer purchased the new product (yes or no)
With 43% of customers having purchased, this wasn’t a rare event — there was a real signal to find. The business wanted a model to predict which customers would buy, allowing them to focus marketing resources where they’d have the most impact.
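A quick sanity check makes the starting point concrete. Here's a minimal sketch in Python (pandas), assuming the data lives in a CSV with a binary purchased column; the actual file and column names in the repo may differ:

```python
import pandas as pd

# Hypothetical file and column names; adjust to the actual dataset.
df = pd.read_csv("customers.csv")

print(df.shape)                 # expecting (14016, number_of_columns)
print(df["purchased"].mean())   # base purchase rate, roughly 0.43
```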
Simple enough, right?
The First Surprise: When “Good” Customers Look Bad
Before building any predictive model, we explored the data — a critical step that reveals patterns and problems. This is where things got interesting.
Discovery #1: The Loyalty Paradox
The company classified customers into loyalty levels 0–3, with 3 being most loyal. Conventional wisdom says loyal customers = higher purchase rates.
The data said otherwise:
- Loyalty Level 3 (most loyal): Only 24% purchased
- Loyalty Level 1: 37% purchased
- Loyalty Level 99 (unclassified): 55% purchased

The “unclassified” group — representing half of all customers — had the highest conversion rate by far. These weren’t the established, categorized, loyal customers. They were likely newer customers still building their insurance portfolios, still in “shopping mode.”
The insight: Sometimes your hottest prospects aren’t your most established customers — they’re the ones still figuring out what they need.
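This kind of pattern falls out of a one-line group-by. A minimal sketch, assuming a loyalty column holding the raw codes (0-3 and 99) and a 0/1 purchased column:

```python
# Purchase rate and group size per loyalty level.
conversion = (
    df.groupby("loyalty")["purchased"]
      .agg(rate="mean", customers="size")
      .sort_values("rate", ascending=False)
)
print(conversion)  # the unclassified (99) group tops the table at ~55%
```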
Discovery #2: The Product Substitution Effect
Here’s where it got really counterintuitive. We analyzed purchase rates based on whether customers already owned the company’s existing products (Product A and Product B):
Product A:
- Customers WITHOUT Product A: 60% purchased the new product
- Customers WITH Product A: 28% purchased

Product B:
- Customers WITHOUT Product B: 64% purchased the new product
- Customers WITH Product B: 29% purchased
Customers who didn’t own the existing products were more than twice as likely to buy the new one.

Think about what this means: The new product wasn’t complementing existing coverage — it was replacing it. Customers who already had Products A or B felt their needs were met. But customers without those products saw the new offering as an attractive all-in-one solution.
The insight: Not all products are add-ons. Sometimes you’re selling a replacement, and your best prospects are those who haven’t committed to your existing offerings yet.
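Conditioning purchase rate on product ownership is equally quick. A sketch assuming 0/1 ownership flags named owns_product_a and owns_product_b (the real column names may differ):

```python
import pandas as pd

# Share of purchasers within each ownership group.
for product in ["owns_product_a", "owns_product_b"]:
    rates = pd.crosstab(df[product], df["purchased"], normalize="index")
    print(product)
    print(rates[1])  # purchase rate for non-owners (0) vs. owners (1)
```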
Discovery #3: Age Matters More Than You Think
The average purchaser was 39.7 years old. The average non-purchaser was 33.0 years old.
That gap of nearly seven years might not sound dramatic, but it revealed a clear pattern: older customers, likely with more life complexity (families, mortgages, assets) and higher incomes, were significantly more receptive to comprehensive insurance solutions.
The Messy Middle: When Good Data Scientists Make Mistakes
Here’s where I’ll share something you don’t often see in polished Medium articles: we made a significant error halfway through this analysis.
The Mistake
When building our initial predictive models, we treated the loyalty variable (values: 0, 1, 2, 3, 99) as a single continuous number. The model tried to find a single coefficient that captured the relationship between loyalty level and purchase behavior.
This was wrong for a subtle but important reason: 99 doesn't mean "extremely loyal"; it means "unclassified." By treating it as a number, we told the model that loyalty sat on an ordered numeric scale (0 < 1 < 2 < 3 < 99) and that purchase behavior changed smoothly along it, with 99 far beyond 3.
But we already knew from our exploration that loyalty 99 customers had the highest purchase rate, not because they were “more loyal than 3,” but because they represented a fundamentally different category.
Our initial models showed weak performance and confusing coefficient signs that contradicted what we’d seen in the data exploration.
The Fix
The solution: one-hot encoding. Instead of treating loyalty as one variable with values 0–3, 99, we created separate binary variables:
- loyalty_1 (yes/no)
- loyalty_2 (yes/no)
- loyalty_3 (yes/no)
- loyalty_99 (yes/no)
- loyalty_0 as the reference category
This allowed the model to capture the unique relationship between each loyalty level and purchase behavior, without assuming any particular ordering.
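In pandas, the fix is a few lines. A sketch with pd.get_dummies, again assuming a loyalty column holding the raw codes:

```python
# One indicator column per loyalty level; loyalty_0 becomes the reference.
loyalty_dummies = pd.get_dummies(df["loyalty"].astype("category"), prefix="loyalty")
loyalty_dummies = loyalty_dummies.drop(columns="loyalty_0")

# Replace the miscoded numeric column with the binary indicators.
df = df.drop(columns="loyalty").join(loyalty_dummies)
```

Inside a scikit-learn pipeline, OneHotEncoder(drop="first") accomplishes the same thing.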
After this correction, model performance jumped from 71% accuracy to over 76% accuracy, and the coefficients finally aligned with our exploratory findings.
The Lesson
Data science isn’t a straight line from question to answer. It’s iterative. You explore, you hypothesize, you model, you get weird results, you reconsider your assumptions, and you try again.
The key skill isn’t avoiding mistakes — it’s recognizing when something doesn’t make sense and having the curiosity to dig deeper.
Building the Solution: Three Models, One Winner
We built three distinct predictive models to test different hypotheses:
Model 1: The Full Model
Used all available features with log transformations to handle skewed data (customer ages and spending amounts weren’t normally distributed). This kitchen-sink approach aimed for maximum predictive power.
Model 2: The Parsimonious Model
Focused only on the strongest predictors identified in exploration: age, product ownership, and loyalty. This simpler model traded some accuracy for interpretability.
Model 3: The Original Scale Model
Used untransformed variables to test whether the log transformations actually helped.
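To give a feel for the workflow, here's a compressed sketch of the Model 1 setup: log-transform the skewed columns, hold out a test set, and fit a logistic regression. Column names and split parameters are illustrative, not the repo's exact choices:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# log1p handles zero values in age/spending gracefully.
skewed = ["age", "spend_product_a", "spend_product_b"]
for col in skewed:
    df[f"log_{col}"] = np.log1p(df[col])

# Use the log versions in place of the raw skewed columns.
features = [c for c in df.columns if c not in skewed + ["purchased"]]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["purchased"],
    test_size=0.2, stratify=df["purchased"], random_state=42,
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # held-out accuracy
```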

The Results
- Model 1 (Full): 10 features, 76.2% accuracy, 0.833 ROC-AUC
- Model 2 (Simple): 7 features, 71.2% accuracy, 0.759 ROC-AUC
- Model 3 (Original): 10 features, 75.7% accuracy, 0.832 ROC-AUC
Model 1 won, achieving 75.6% accuracy on unseen test data — a 32% improvement over simply guessing the most common outcome.
More importantly, the model showed excellent generalization — it performed nearly identically on new data as on training data, with less than 2% performance drop. This meant it had learned genuine patterns, not just memorized the training examples.
What the Model Revealed: The Power of Coefficients
Here’s where predictive models become not just prediction machines, but insight generators. The model’s coefficients tell us how strongly each factor influences purchase decisions:
The Strongest Drivers (Increasing Purchase Likelihood):
1. Product A Spending (coefficient: +6.40)
The single strongest predictor. Customers who spent more on Product A were dramatically more likely to buy the new product: each unit increase in log-transformed spending multiplied the odds of purchase by a staggering 599.
Translation: High-value customers who are already engaged and spending are your hottest prospects.
2. Customer Age (coefficient: +2.12)
Each unit increase in log-age multiplied the odds of purchase by 8.3.
Translation: Target mature customers in their late 30s and 40s.
3. Unclassified Loyalty Status (coefficient: +0.81)
Customers with loyalty_99 status had 2.2x the odds of purchasing relative to the baseline (loyalty 0).
Translation: That mysterious “unclassified” segment is gold. They’re likely newer customers still shopping.
The Strongest Deterrents (Decreasing Purchase Likelihood):
1. Owning Product B (coefficient: -1.05)
Owning Product B reduced purchase odds by 65%.
2. Product B Spending (coefficient: -0.87)
Higher spending on Product B further reduced likelihood.
3. Owning Product A (coefficient: -0.54)
Owning Product A reduced purchase odds by 42%.
Translation: Existing product owners see less value in the new offering. It’s a replacement, not a complement.
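For readers who want to reproduce these translations: a logistic regression coefficient is a change in log-odds, so exponentiating it gives the odds multiplier. A sketch, continuing from the fitted model above (feature_names_in_ is populated when the model was fit on a DataFrame):

```python
import numpy as np
import pandas as pd

# Odds ratio = exp(coefficient): >1 increases purchase odds, <1 decreases them.
odds_ratios = pd.Series(np.exp(model.coef_[0]), index=model.feature_names_in_)
print(odds_ratios.sort_values(ascending=False))
# e.g. exp(6.40) ~ 600, exp(2.12) ~ 8.3, exp(-1.05) ~ 0.35
```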
The Business Strategy: Who to Target
Based on these insights, here’s the actionable marketing strategy:
Priority 1: High-Spending Prospects Without Existing Products
Target customers who:
- Have high Product A spending (engaged, high-value)
- Do NOT own Products A or B (open to new solutions)
- Are aged 35–45 (mature life stage, higher needs)
This segment shows the highest predicted conversion rates.
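Operationally, "highest predicted conversion" means scoring every customer with the model and ranking within the segment. A sketch under the same assumed column names:

```python
# Score all customers, then rank the Priority 1 segment by predicted probability.
df["purchase_probability"] = model.predict_proba(df[features])[:, 1]

priority_1 = (
    df[(df["owns_product_a"] == 0)
       & (df["owns_product_b"] == 0)
       & df["age"].between(35, 45)]
    .sort_values("purchase_probability", ascending=False)
)
print(priority_1.head(10))  # top candidates for the first campaign wave
```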
Priority 2: The Unclassified Loyalty Segment
That mysterious loyalty_99 group — 50% of the customer base, 64% of actual purchasers. These are likely:
- Newer customers (shorter relationship length)
- People shopping for coverage (not yet locked into specific products)
- Building their insurance portfolios
They’re in active decision-making mode. Don’t ignore them because they’re not “classified loyal customers.”
Priority 3: Product Consolidation Messaging
For customers who DO own existing products, the message needs to shift. Don’t sell the new product as “more coverage” — sell it as “simpler, consolidated coverage that replaces multiple products.”
Address their objection head-on: “Already have Product B? Here’s why upgrading to our comprehensive solution saves you money and hassle.”
The Bigger Picture: Five Lessons for Data-Driven Decision Making
1. Challenge Your Assumptions
“Target loyal customers” sounds right until the data shows loyal customers are saturated and the “unclassified” group converts at double the rate. Always let the data challenge your intuitions.
2. Understand Your Product’s Role
Is your new offering a complement (like fries with a burger) or a substitute (like a newer iPhone replacing your old one)? This fundamentally changes who will buy it.
3. Not All “Good Customers” Are Good Prospects
A loyal customer who’s happy with their current products might be a terrible prospect for a new product that makes their current ones redundant. Customer lifetime value and immediate conversion likelihood are different metrics.
4. The 80/20 Rule of Features
Model 2 used only 7 features instead of 10 and got 71% accuracy. Model 1 used all 10 and got 76% accuracy. That last 5-point accuracy gain required 43% more features. Know when simplicity is worth the tradeoff.
5. Mistakes Are Part of the Process
We spent significant time with models that had categorical variables encoded incorrectly. The performance was mediocre, the coefficients were confusing, and we had to start over. That’s normal. What matters is recognizing when something’s off and iterating.
What This Means for You
You might not be in insurance marketing, but these principles apply broadly:
If you’re in marketing: Stop assuming existing customers are always your best prospects for new offerings. Segment by product ownership and lifecycle stage.
If you’re in product development: Understand whether you’re building a complement or a substitute. This changes everything about your go-to-market strategy.
If you’re learning data science: Exploratory data analysis isn’t just a box to check — it’s where you develop intuition about your problem. Build models that align with what you learned in exploration, not against it.
If you’re a business leader: A 32% improvement over baseline might not sound sexy, but in a company with 14,000 customers and typical marketing costs, that translates to hundreds of thousands in saved marketing spend and increased revenue.
The Final Numbers
The deployed model achieved:
- 75.6% accuracy on unseen test data
- 74.9% precision (3 out of 4 predicted purchasers actually buy)
- 64.9% recall (catches about 2 out of 3 actual purchasers)
- 0.827 ROC-AUC (excellent discrimination ability)
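Those numbers come straight from scikit-learn's standard metrics on the held-out test set; a sketch continuing the earlier code:

```python
from sklearn.metrics import classification_report, roc_auc_score

y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))  # precision and recall per class
print("ROC-AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```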

More importantly, it revealed:
- The unclassified loyalty segment as the highest-value target
- Product substitution dynamics that challenged conventional targeting
- The power of customer spending patterns as predictive signals
- Age-based segmentation opportunities
The Journey Continues
This analysis answered one question (who will buy?) but raised others:
- Why are 50% of customers “unclassified” in the loyalty system?
- Could we predict lifetime value, not just immediate purchase?
- How do these patterns change over time?
- What external factors (economic conditions, competition) influence these relationships?
Good data science doesn’t just solve problems — it reveals new questions worth asking.
The most valuable insight from this project wasn’t a specific number or model. It was this: Your intuitions about customers are hypotheses, not facts. Test them. And be prepared for the data to surprise you.
Sometimes the customers you’d write off — the unclassified, the non-owners, the ones who don’t fit neat categories — are exactly where your next growth opportunity lies.
Key Takeaways:
- Explore your data thoroughly before modeling — patterns you find will guide better models
- Encode categorical variables properly — don’t assume ordering that doesn’t exist
- Test multiple approaches and compare systematically
- Interpret your model’s coefficients for business insights, not just predictions
- Challenge conventional wisdom with data — your “best customers” might not be your best prospects
- Iterate when things don’t make sense — errors are learning opportunities
- Connect technical findings to actionable business strategy
The intersection of data science and business strategy is where the real magic happens. Not in perfect models, but in the insights that change how you think about your customers — and what you do differently because of it.
Find the code and more details on my GitHub: https://github.com/olimiemma/predicting-insurance-product-purchases/tree/main