Posts

When Your Best Customers Don’t Want What You’re Selling

Predicting Customer Behavior: A Data Science Journey Through Insurance Marketing
Imagine you’re a marketing director at an insurance company launching a new product. Your team is excited: this offering consolidates coverage in ways customers have been asking for. You’ve got a database of 14,000 customers. The question keeping you up at night: Who should we target?
Common wisdom says target your loyal customers, right? People who already trust you and buy your products. Spend your marketing budget on those established relationships. But what if the data told you the exact opposite? What if your most loyal customers, the ones already using your products, were the least likely to buy your new offering?
This is the story of a real predictive analytics project that challenged conventional marketing wisdom and revealed surprising truths about customer behavior. It’s also a story about mistakes, corrections, and the messy reality of data science work.
The Challenge: 14,000 Customers...

How DeepSeek Turned "A Picture is Worth 1,000 Words" into a Powerful AI Compression Algorithm.

DeepSeek-OCR Isn’t About OCR, It’s About Token Compression
DeepSeek can use just 100 vision tokens to represent what would normally require 1,000 text tokens, and then decode it back with 97% accuracy.
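The arithmetic behind that claim is easy to check. The snippet below is a minimal sketch, not part of DeepSeek-OCR itself: the function name `compression_ratio` is hypothetical, and the only inputs are the two token counts quoted in the teaser.

```python
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Return how many text tokens each vision token stands in for.

    Hypothetical helper for illustration; not a DeepSeek API.
    """
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


# Figures quoted in the teaser: 1,000 text tokens versus 100 vision tokens.
ratio = compression_ratio(text_tokens=1000, vision_tokens=100)
print(f"Each vision token replaces about {ratio:.0f} text tokens")
```

With the quoted figures the ratio works out to 10, which is the sense in which the post describes a tenfold compression.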

Predicting Dropouts: How Regression Models Reveal Hidden Patterns in New York’s High Schools

A step-by-step data science journey from messy datasets to meaningful insights
Quick Overview: What You’ll Learn
Before we dive in, here’s what this article covers:
- Understanding the Problem: Predicting high school dropout rates, why it matters, and how data can help.
- From Raw to Ready: Cleaning and preparing real-world education data for machine learning.
- Exploring the Story in the Data (EDA): How visualization uncovers trends and data integrity issues.
- Choosing the Right Model: Why “one-size-fits-all” doesn’t work for predicting counts like dropouts.
- Modeling in Action: Comparing Linear Regression, Poisson, and Negative Binomial models.
- Evaluating and Validating: How cross-validation ensures that your model isn’t just lucky.
- Lessons and Takeaways: What this project teaches us about modeling, data storytelling, and real-world decision-making.
1. Understanding the Problem
Each year, educators and policymakers grapple with the same question: why...

When AI Learns to Lie: Inside the Neural Machinery of Machine Deception

What You’ll Learn in This Article:
- The critical distinction between AI making mistakes (hallucination) and AI deliberately deceiving (lying)
- How researchers discovered the “rehearsal process” where AI practices lies before saying them
- The three-step assembly line AI systems use to construct deceptions
- Detection and control techniques that can identify and steer AI honesty in real time
- The disturbing trade-off between honesty and performance that creates economic incentives for deceptive AI
- Why this matters now and what it means for the future of AI safety
Ask an AI a simple question: “What’s the capital of Australia?” It answers: “Canberra.” Now ask it to lie about the capital of Australia. It says: “Sydney.” This might seem like a parlor trick, but groundbreaking research from Carnegie Mellon University reveals something far more concerning: the AI knows the correct answer is Canberra, consciously decides to deceive you, and systematically plans how to construct that ...

Cracking the Code of Online Popularity: Lessons from Feature Selection and PCA

Predicting whether an article will go viral is a puzzle that blends data science with human behavior. In this project, we worked with a large dataset of online news articles, aiming to forecast popularity (measured as the number of shares) using dozens of explanatory variables. The assignment was straightforward in its goal but complex in its execution: reduce dimensionality, train models, and report performance. Along the way, we uncovered lessons about interpretability, complexity, and the limits of linear regression in messy, real-world data.
The Dimensionality Challenge
Our dataset contained nearly 40,000 articles with 60+ explanatory variables, ranging from keyword frequency to sentiment polarity. This posed the classic curse of dimensionality: too many features relative to the predictive signal often lead to overfitting, inefficiency, and inscrutable models. To tackle this, we explored three modeling paths:
- Full features: a baseline model with all predictors.
- Feature...

KASA: Your Voice, Your Community, Your Success

Speech by Emmanuel Kasigazi, KASA President
Introduction
Good morning, everyone. My name is Emmanuel Kasigazi, and I’m honored to serve as your President of the Katz School African Student Association (KASA). I’m here studying Data Analytics and Visualization, originally from Uganda with an undergraduate degree in Information Systems. But more importantly, I’m here as someone who understands your journey. I’m a seasoned engineer and entrepreneur, and just like you, I’m an immigrant. I arrived here last year. Before that, I spent time in Toronto, and years ago, I lived in South Sudan. I know what it’s like to not live in your home country. I understand what you might be going through, what you might be experiencing. Different countries, yes, but leaving home is leaving home, and that experience connects us all.
My Background
Back home, I’ve been a leader throughout my life. I ran companies with teams of employees, worked across various sectors from tech to branding, printing prod...

Venture Capital Is Just Business: Lessons from Chapati Stands, Microfinance, and AI

Why This Matters
When people hear “venture capital,” they picture boardrooms full of billionaires, complex financial models, and Silicon Valley jargon. But the deeper I’ve gone into this world (through programs like Venture Institute, NSF I-Corps, and my own entrepreneurial journey), I’ve learned a simple truth: venture capital is still business. Like running a chapati stand in high school or lending to SMEs in East Africa, it comes down to the same loop: Get resources. Add value. Deliver results. Grow trust and relationships. The magnitude changes, but the fundamentals don’t.
The People Game
One of my biggest insights from the Venture Institute is that venture is a people’s game. Limited Partners (LPs) are the customers of a fund, not startups. General Partners (GPs) live and die by relationships, just as I once did when one client made up 50% of my branding business, and we nearly collapsed when they pulled out. In VC, no LP should ever hold more than 20% of a fund. Diversification isn’t just...