<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://www.questionpro.com/engineering/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.questionpro.com/engineering/" rel="alternate" type="text/html" /><updated>2026-02-26T09:22:22+00:00</updated><id>https://www.questionpro.com/engineering/feed.xml</id><title type="html">Engineering Blog</title><subtitle>Discover QuestionPro&apos;s engineering culture, technical challenges, and career opportunities. Learn how our engineering team builds scalable survey platforms serving millions of users.</subtitle><author><name>Your Name</name></author><entry><title type="html">Building Scalable AI Agents: A Journey Through Multi-Agent Architecture</title><link href="https://www.questionpro.com/engineering/ai%20&%20machine%20learning/software%20architecture/engineering/cost%20optimization/building-scalable-ai-agents/" rel="alternate" type="text/html" title="Building Scalable AI Agents: A Journey Through Multi-Agent Architecture" /><published>2026-02-21T00:00:00+00:00</published><updated>2026-02-21T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/ai%20&amp;%20machine%20learning/software%20architecture/engineering/cost%20optimization/building-scalable-ai-agents</id><content type="html" xml:base="https://www.questionpro.com/engineering/ai%20&amp;%20machine%20learning/software%20architecture/engineering/cost%20optimization/building-scalable-ai-agents/"><![CDATA[<h2 id="or-how-i-learned-to-stop-worrying-and-love-the-token-budget">Or: How I Learned to Stop Worrying and Love the Token Budget</h2>

<h2 id="introduction">Introduction</h2>

<p>Remember when you thought building an AI agent would be easy? “Just throw some prompts at GPT-4 and call it a day,” you said.</p>

<p>“What could go wrong?” you said.</p>

<p><em>Narrator: Everything went wrong.</em></p>

<p>Large Language Models have revolutionized intelligent applications, but here’s what nobody tells you at those fancy AI conferences: scaling an AI agent from “cool demo” to “production system that doesn’t bankrupt your company” is… <em>challenging</em>. Token costs spiral like a startup’s cloud bill, context windows overflow faster than your coffee cup on Monday morning, and response times make dial-up internet look speedy.</p>

<p>This is our journey (currently in progress, bugs included) from a basic implementation to building a production-ready, cost-efficient multi-agent system for QuestionPro’s BI platform. We’re exploring three key patterns, making mistakes in real-time, and documenting everything so you don’t have to. Think of this as “Mythbusters” but for AI architecture, with 100% more token optimization and slightly fewer explosions.</p>

<hr />

<h2 id="part-1-the-basic-agent---or-how-hard-could-it-be">Part 1: The Basic Agent - Or “How Hard Could It Be?”</h2>

<h3 id="the-innocent-beginning">The Innocent Beginning</h3>

<p>Like every developer who’s just discovered LangGraph, we started with the “obvious” approach, THE MONOLITHIC AGENT: one agent to rule them all, one agent to find them, one agent to bring them all, and in the darkness bind them (to a $10,000/month OpenAI bill… roughly).</p>

<p><img src="/engineering/assets/images/monolithic-agent-architecture.png" alt="Monolithic Agent Architecture" /></p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Single agent with all capabilities</span>
<span class="kd">const</span> <span class="nx">agent</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatOpenAI</span><span class="p">({</span> <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">gpt-4</span><span class="dl">"</span> <span class="p">}).</span><span class="nf">bindTools</span><span class="p">([</span>
  <span class="nx">createDashboard</span><span class="p">,</span>
  <span class="nx">getDashboard</span><span class="p">,</span>
  <span class="nx">createWidget</span><span class="p">,</span>
  <span class="nx">listSurveys</span><span class="p">,</span>
  <span class="nx">getQuestions</span><span class="p">,</span>
  <span class="c1">// ... 40+ more tools</span>
<span class="p">]);</span>

<span class="kd">const</span> <span class="nx">systemPrompt</span> <span class="o">=</span> <span class="s2">`
You are a dashboard analytics assistant.

Available dashboards: [... 500 lines of context ...]
Survey schemas: [... 2000 lines of schemas ...]
Widget configurations: [... 1500 lines of configs ...]
`</span><span class="p">;</span>
</code></pre></div></div>

<h3 id="the-oh-no-moment">The “Oh No” Moment</h3>

<p>You know that feeling when you check your AI bill and your heart skips a beat? That was us, just from projecting what “production” would look like.</p>

<p><strong>Our “Everything is Fine” Metrics:</strong></p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>📊 The Uncomfortable Truth:
- Avg tokens per request: 15,000 (narrator: this is not good)
- Cost per conversation: $1.50 (multiply by users... *sweats*)
- P95 latency: 8 seconds (users hate us)
- Projected monthly costs: $4,500 (CFO hates us more)
- Success rate: 89% (11% of the time, it works every time!)

🔴 The "We Need to Talk" Problems:
- Context overflow (GPT-4 politely hanging up on us)
- Tool selection paralysis (like Netflix, which movie to watch)
- Exponential cost growth
- LLM having an existential crisis (overwhelmed with 40+ tools)
</code></pre></div></div>

<h3 id="the-exponential-growth-problem-or-math-is-unforgiving">The Exponential Growth Problem (Or: Math is Unforgiving)</h3>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Turn 1:  12,500 tokens  ✅ "This is fine"
Turn 5:  22,000 tokens  ⚠️ "This is less fine"
Turn 10: 37,000 tokens  ❌ "This is fire, everything is fire"
</code></pre></div></div>
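<p>The growth is easy to model: every request resends the system prompt plus the full conversation history, so per-request tokens grow linearly and cumulative spend roughly quadratically. A back-of-napkin sketch (the per-turn constants are illustrative estimates, not measurements):</p>

```typescript
// Every request carries the bloated system prompt plus all prior turns,
// so per-request tokens grow linearly and cumulative spend quadratically.
// SYSTEM_TOKENS and TURN_TOKENS are illustrative estimates.
const SYSTEM_TOKENS = 10_000; // monolithic prompt + 40+ tool schemas
const TURN_TOKENS = 2_500; // average user message + assistant reply

function tokensAtTurn(turn: number): number {
  // Request at turn N carries the system prompt plus all prior turns.
  return SYSTEM_TOKENS + turn * TURN_TOKENS;
}

function cumulativeTokens(turns: number): number {
  let total = 0;
  for (let t = 1; t <= turns; t++) total += tokensAtTurn(t);
  return total;
}

// tokensAtTurn(1)  → 12,500
// tokensAtTurn(10) → 35,000 — and you pay for it on every single turn.
```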

<p><strong>We were literally pricing ourselves out of business, one helpful conversation at a time.</strong></p>

<p>The real kicker? We kept asking “Can we just add more features?” while remaining blissfully unaware of the bill we were generating.</p>

<p>Note: These numbers are rough estimates from our experiments and paper-napkin calculations. We’re still actively developing and iterating, but they should point your thinking in the right direction.</p>

<hr />

<h2 id="part-2-the-skills-pattern---progressive-disclosure">Part 2: The Skills Pattern - Progressive Disclosure</h2>

<h3 id="the-aha-moment">The “Aha!” Moment</h3>

<h4 id="or-how-we-discovered-that-laziness-is-actually-a-virtue">Or: How We Discovered That Laziness is Actually a Virtue</h4>

<p>After three existential crises where we admitted our agent was hemorrhaging money, we stumbled upon a life-changing concept:</p>

<blockquote>
  <p>“What if… and hear me out… we DON’T load everything into the prompt like we’re packing for a 2-week vacation?”</p>
</blockquote>

<p>Revolutionary, I know. We felt like we’d discovered fire, except fire was actually “reading the documentation properly.”</p>

<p>Enter <strong>progressive disclosure</strong> - a fancy term for “only fetch stuff when you actually need it,” which is basically how every efficient human operates but somehow felt groundbreaking when applied to AI. (Yes, we’re aware of the irony.)</p>

<blockquote>
  <p>“Don’t load everything upfront. Load only what you need, when you need it.”</p>
</blockquote>

<p>Instead of including all context in every request, we implemented <strong>progressive disclosure</strong> - the agent loads specialized knowledge through the skills pattern.</p>

<p><img src="/engineering/assets/images/skills-pattern-visualization.png" alt="Skills Pattern Visualization" /></p>

<h3 id="implementation">Implementation</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Lightweight system prompt with skill metadata only</span>
<span class="kd">const</span> <span class="nx">systemPrompt</span> <span class="o">=</span> <span class="s2">`
You are a dashboard analytics assistant.

Available Skills (load when needed):
- survey_schema: Get survey questions and field types
- widget_config: Get chart-specific settings  
- dashboard_filters: Get filter options
- styling_themes: Get appearance options

Current context:
- Workspace: 1
- Dashboard: </span><span class="p">${</span><span class="nx">state</span><span class="p">.</span><span class="nx">dashboardId</span> <span class="o">||</span> <span class="dl">"</span><span class="s2">none</span><span class="dl">"</span><span class="p">}</span><span class="s2">
`</span><span class="p">;</span>

<span class="c1">// Skills load on-demand</span>
<span class="kd">const</span> <span class="nx">loadSkillTool</span> <span class="o">=</span> <span class="nf">tool</span><span class="p">(</span><span class="k">async </span><span class="p">({</span> <span class="nx">skillName</span><span class="p">,</span> <span class="nx">resourceId</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="k">switch </span><span class="p">(</span><span class="nx">skillName</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">case</span> <span class="dl">"</span><span class="s2">survey_schema</span><span class="dl">"</span><span class="p">:</span>
      <span class="k">return</span> <span class="k">await</span> <span class="nf">fetchSurveySchema</span><span class="p">(</span><span class="nx">resourceId</span><span class="p">);</span>
    <span class="c1">// Returns ~2,000 tokens only when needed</span>

    <span class="k">case</span> <span class="dl">"</span><span class="s2">widget_config</span><span class="dl">"</span><span class="p">:</span>
      <span class="k">return</span> <span class="k">await</span> <span class="nf">fetchWidgetConfig</span><span class="p">(</span><span class="nx">resourceId</span><span class="p">);</span>
    <span class="c1">// Returns ~1,500 tokens only when needed</span>
  <span class="p">}</span>
<span class="p">});</span>
</code></pre></div></div>

<h3 id="skills-in-action">Skills in Action</h3>

<p><strong>User</strong>: “Create a gender chart from Customer Survey”</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Request 1: Agent sees lightweight prompt
├─ System: 1,000 tokens (was 8,000)
├─ Tools: 1,500 tokens
└─ Messages: 2,000 tokens
   Total: 4,500 tokens ✅

Agent decides: "I need the survey schema"
└─ Calls: load_skill("survey_schema", 12345)

Request 2: Agent receives survey details
├─ Previous context: 4,500 tokens
├─ Survey schema: 2,000 tokens
└─ Total: 6,500 tokens ✅

Agent creates chart with correct data mapping
</code></pre></div></div>

<h3 id="the-results-or-when-theory-meets-reality-and-actually-works">The Results (Or: When Theory Meets Reality and Actually Works)</h3>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>📊 Skills Pattern Metrics (We Were Shocked Too):

- Avg tokens: 6,500 (was 15,000) 🎉
- Cost per conversation: $0.65 (was $1.50) 💰
- P95 latency: 4s (was 8s) ⚡
- Monthly costs: $1,950 (was $4,500) 🎊

✅ 57% cost reduction (CFO can sleep better)
✅ 50% faster responses (users will hate us a little less now)
✅ Stable token usage (no more exponential nightmares)
✅ Team morale: Significantly improved

</code></pre></div></div>

<p><em>As this was my late-night hustle when no one was around, I just high-fived my wall.</em></p>

<h3 id="but-wait-the-plot-thickens">But Wait… (The Plot Thickens)</h3>

<p>Just when we thought we’d solved AI, reality decided to humble us. Again.</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❌ The "Not So Fast" Problems:

1. Agent handling 40+ tools → Like asking someone to pick
   their favorite child, but with 40 children
2. Mixed concerns → Debugging became "which of these
   40 things broke?"
3. No parallelization → Everything sequential because
   apparently we hate speed
4. Some skills returned 3,000+ tokens → The token diet
   didn't last long
</code></pre></div></div>

<p><strong>The Comedy of Errors:</strong></p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
User: "Style my dashboard in dark mode."
(A simple, reasonable request.)

Agent's internal monologue:
"Hmm, I have 40 tools. Let me read each description...
create_dashboard? No...
update_dashboard? Maybe?
create_widget? Probably not...
update_theme? OH WAIT THAT'S THE ONE! ✅
...only took me 3 seconds to figure that out"

User: _has already rage-quit_

</code></pre></div></div>

<hr />

<h2 id="part-3-multi-agent-architecture---specialized-experts">Part 3: Multi-Agent Architecture - Specialized Experts</h2>

<h3 id="the-why-didnt-we-think-of-this-sooner-moment">The “Why Didn’t We Think of This Sooner?” Moment</h3>

<h4 id="the-breakthrough-cue-dramatic-music">The Breakthrough (Cue Dramatic Music)</h4>

<p>Picture this: It’s 2 AM, you’re on your third coffee, scrolling through LangGraph documentation for the millionth time, when suddenly…</p>

<blockquote>
  <p>“What if… <em>what if</em>… we don’t make one agent do EVERYTHING? What if we have specialized agents? Like… real companies?”</p>
</blockquote>

<p>🤯 <em>Mind. Blown.</em></p>

<p>We felt like we’d invented the wheel, despite the fact that humans have been organizing into specialized roles since, like, the dawn of civilization. But hey, better late than never!</p>

<p><strong>Our New Squad:</strong></p>

<ul>
  <li><strong>Dashboard Agent</strong>: The organized one who actually reads the manual</li>
  <li><strong>Widget Agent</strong>: The creative type, probably went to art school</li>
  <li><strong>Datasource Agent</strong>: The data nerd (affectionately), speaks fluent SQL</li>
  <li><strong>Styling Agent</strong>: Fashion police of the digital world</li>
</ul>

<p>Basically, we went from having one overworked, stressed-out agent having a breakdown, to a healthy work environment with proper delegation. <em>Revolutionary.</em></p>

<hr />

<p>Like a company with departments (Sales, Engineering, Design), each agent owns one domain and nothing else.</p>

<p><img src="/engineering/assets/images/multi-agent-system-architecture.png" alt="Multi-Agent System Architecture" /></p>

<h3 id="federal-router-implementation">Federal Router Implementation</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Lightweight orchestrator</span>
<span class="kd">const</span> <span class="nx">federalAgent</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatOpenAI</span><span class="p">({</span> <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">gpt-4</span><span class="dl">"</span> <span class="p">}).</span><span class="nf">bindTools</span><span class="p">([</span>
  <span class="nx">routeToDashboardAgent</span><span class="p">,</span>
  <span class="nx">routeToWidgetAgent</span><span class="p">,</span>
  <span class="nx">routeToDatasourceAgent</span><span class="p">,</span>
  <span class="nx">routeToStylingAgent</span><span class="p">,</span>
<span class="p">]);</span>

<span class="kd">const</span> <span class="nx">federalPrompt</span> <span class="o">=</span> <span class="s2">`
You are a routing assistant. Delegate to specialists:

- Dashboard operations → Dashboard Agent
- Widget creation/editing → Widget Agent
- Data selection → Datasource Agent
- Styling/themes → Styling Agent

You route and synthesize - you don't execute tasks.
`</span><span class="p">;</span>
</code></pre></div></div>
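<p>Stripped of framework plumbing, the router is just a dispatch table. A framework-free sketch (the agent names are ours; the stub handlers stand in for real specialist invocations, which in our setup are the <code>routeTo*</code> tools above):</p>

```typescript
// The routing layer reduced to a dispatch table. Handlers are stubs
// standing in for real specialist-agent invocations.
type AgentName = "dashboard" | "widget" | "datasource" | "styling";

const routes: Record<AgentName, (task: string) => Promise<string>> = {
  dashboard: async (task) => `dashboard agent handled: ${task}`,
  widget: async (task) => `widget agent handled: ${task}`,
  datasource: async (task) => `datasource agent handled: ${task}`,
  styling: async (task) => `styling agent handled: ${task}`,
};

async function dispatch(agent: AgentName, task: string): Promise<string> {
  // The federal agent's tool call picks the key; we just look it up.
  return routes[agent](task);
}
```

<p>The LLM now chooses among four keys instead of forty tools, which is the entire trick.</p>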

<h3 id="specialized-agent-example">Specialized Agent Example</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Dashboard Agent - Only 6 focused tools</span>
<span class="kd">const</span> <span class="nx">dashboardAgent</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatOpenAI</span><span class="p">({</span> <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">gpt-4</span><span class="dl">"</span> <span class="p">}).</span><span class="nf">bindTools</span><span class="p">([</span>
  <span class="nx">createDashboard</span><span class="p">,</span>
  <span class="nx">getDashboard</span><span class="p">,</span>
  <span class="nx">updateDashboard</span><span class="p">,</span>
  <span class="nx">createTab</span><span class="p">,</span>
  <span class="nx">updateTab</span><span class="p">,</span>
  <span class="nx">deleteDashboard</span><span class="p">,</span>
<span class="p">]);</span>
</code></pre></div></div>

<h3 id="conversation-flow">Conversation Flow</h3>

<p><strong>User</strong>: “Create sales dashboard with gender chart from Customer Survey”</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌────────────────────────────────────────┐
│ 1. Federal Router                      │
│ Tools: 4 routing (400 tokens)          │
│ Decision: Dashboard → Data → Widget    │
│ Cost: 1,400 tokens                     │
└────────────────────────────────────────┘
                ↓
┌────────────────────────────────────────┐
│ 2. Dashboard Agent                     │
│ Tools: 6 dashboard (600 tokens)        │
│ Creates: "Sales" dashboard (ID 100)    │
│ Cost: 1,400 tokens                     │
└────────────────────────────────────────┘
                ↓
┌────────────────────────────────────────┐
│ 3. Datasource Agent                    │
│ Tools: 10 data (1,000 tokens)          │
│ Finds: Survey 12345, Question 67890    │
│ Cost: 2,200 tokens                     │
└────────────────────────────────────────┘
                ↓
┌────────────────────────────────────────┐
│ 4. Widget Agent                        │
│ Tools: 8 widget (800 tokens)           │
│ Creates: Pie chart widget              │
│ Cost: 2,800 tokens                     │
└────────────────────────────────────────┘

Total workflow: ~8,200 tokens including the router's
final synthesis pass (vs 60,000)
</code></pre></div></div>
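<p>The dollar figures fall straight out of the token counts. A sketch of the napkin math (the blended per-token rate is our own assumption; it works out to roughly $0.01 per 1K tokens across the models we mix, not an official price):</p>

```typescript
// Convert workflow tokens to dollars. BLENDED_RATE_PER_1K is an
// assumption from our own napkin math, not an official price.
const BLENDED_RATE_PER_1K = 0.01; // USD per 1,000 tokens

function workflowCostUSD(tokens: number): number {
  return (tokens / 1000) * BLENDED_RATE_PER_1K;
}

// workflowCostUSD(8_200)  ≈ $0.082 per multi-agent workflow
// workflowCostUSD(60_000) ≈ $0.60 with the old monolith
```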

<h3 id="the-results-holy-sh-i-mean-wow">The Results (Holy Sh— I Mean, Wow!)</h3>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
📊 The Numbers Don't Lie (But We Triple-Checked Anyway):

- Avg tokens per workflow: 8,200 (was 60,000) 📉
- Cost per workflow: $0.082 (was $0.60) 💸
- P95 latency: 3.5s (was 8s) 🚀
- Projected monthly costs: $780 (was $4,500) 🎯
- Success rate: 97% (was 89%) ✨

✅ 86% token reduction (not a typo!)
✅ 86% cost savings (CFO wants to buy us lunch)
✅ 56% faster responses (users think we upgraded servers)
✅ Clear debugging (we now know WHICH agent to blame)
✅ Team happiness: Through the roof
✅ Sleep quality: Significantly improved

</code></pre></div></div>

<p>At this point, we were pretty sure we’d reached peak performance. We were wrong.</p>

<hr />

<h2 id="part-4-the-hybrid-approach---maximum-efficiency">Part 4: The Hybrid Approach - Maximum Efficiency</h2>

<h3 id="peak-engineering">Peak Engineering</h3>

<h4 id="or-what-happens-when-you-combine-your-two-good-ideas">Or: What Happens When You Combine Your Two Good Ideas</h4>

<p>After achieving what we thought was AI nirvana with multi-agents, I had a dangerous thought:</p>

<blockquote>
  <p>“Hey… what if we combine multi-agents WITH the skills pattern?”</p>
</blockquote>

<p>It was the kind of idea that precedes either genius or disaster. (In software engineering, these are often the same thing.)</p>

<p>But then we realized: <strong>Why choose between our two good ideas when we can have BOTH?</strong></p>

<p>Multi-agents solved the “too many tools” problem, but we could still use skills to optimize even further. It’s like discovering that peanut butter AND jelly make a better sandwich together. Revolutionary? No. Delicious? Absolutely.</p>

<h3 id="combining-the-best-of-both">Combining the Best of Both</h3>

<p>Multi-agents solved tool confusion, but skills can optimize even further. <strong>Each specialized agent uses skills for deep context.</strong></p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Widget Agent with Skills</span>
<span class="kd">const</span> <span class="nx">widgetAgent</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatOpenAI</span><span class="p">({</span> <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">gpt-4</span><span class="dl">"</span> <span class="p">}).</span><span class="nf">bindTools</span><span class="p">([</span>
  <span class="c1">// Core tools (lightweight, always bound)</span>
  <span class="nx">createWidget</span><span class="p">,</span>
  <span class="nx">updateWidget</span><span class="p">,</span>
  <span class="nx">deleteWidget</span><span class="p">,</span>

  <span class="c1">// Skill loader (heavy context on-demand)</span>
  <span class="nx">loadSkill</span><span class="p">,</span>
<span class="p">]);</span>

<span class="kd">const</span> <span class="nx">widgetPrompt</span> <span class="o">=</span> <span class="s2">`
You are the Widget Agent - visualization expert.

When you need details:
- Chart config → load_skill("bar_chart_config")
- Widget styling → load_skill("widget_styling", widgetId)
- Data mapping → load_skill("data_mapping_rules")

Current: Dashboard </span><span class="p">${</span><span class="nx">state</span><span class="p">.</span><span class="nx">dashboardId</span><span class="p">}</span><span class="s2">, Tab </span><span class="p">${</span><span class="nx">state</span><span class="p">.</span><span class="nx">tabId</span><span class="p">}</span><span class="s2">
`</span><span class="p">;</span>
</code></pre></div></div>

<p><img src="/engineering/assets/images/hybrid-architecture-diagram.png" alt="Hybrid Architecture Diagram" /></p>

<h3 id="hybrid-in-action">Hybrid in Action</h3>

<p><strong>User</strong>: “Create a pie chart showing age distribution”</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Federal Router → Datasource Agent → Widget Agent

Datasource Agent:
├─ Loads: survey_schema skill (2,000 tokens)
├─ Finds age question
└─ Returns structured data

Widget Agent:
├─ Loads: pie_chart_config skill (1,500 tokens)
├─ Creates widget with proper mapping
└─ Total: 6,200 tokens ✅

Vs. loading everything upfront: 18,000 tokens ❌
</code></pre></div></div>

<h3 id="the-metrics">The Metrics</h3>

<p>After implementing the hybrid approach and running it through our test suite (read: throwing everything at it to see what breaks), we got these numbers. We didn’t believe them at first either.</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
📊 The "Are We Sure This Is Right?" Results:

Token Usage (The Good Kind of Reduction):
├─ Simple queries: 2,500 (was 12,000) → 79% reduction 🎯
├─ Medium queries: 6,000 (was 35,000) → 83% reduction 💪
└─ Complex workflows: 12,000 (was 80,000) → 85% reduction 🚀

Cost Per User/Month:
├─ Light users: $0.25 (was $1.20) → 79% savings
├─ Regular users: $1.50 (was $8.75) → 83% savings
└─ Power users: $6.00 (was $35.00) → 83% savings
(Power users can stay, we can afford them now!)

Performance (Users Think We're Wizards):
├─ P50 latency: 1.8s (was 4.2s) → 57% faster ⚡
├─ P95 latency: 3.5s (was 8.7s) → 60% faster ⚡⚡
└─ P99 latency: 5.2s (was 15s) → 65% faster ⚡⚡⚡

Reliability (Actually Impressed Ourselves):
├─ Success rate: 97% (was 89%) → +8% improvement
├─ Tool selection: 99% (was 85%) → +14% improvement
└─ Context overflow: 0.1% (was 8%) → Basically extinct

💰 Projected Monthly Costs: $780 (was $4,500)
💰 Annual Savings: $44,640
💰 Engineer Stress Levels: Down 90%
☕ Coffee Consumption: Actually decreased
😴 Sleep Quality: Markedly improved

</code></pre></div></div>

<hr />

<h2 id="choosing-your-pattern-the-decision-tree">Choosing Your Pattern: The Decision Tree</h2>

<h3 id="the-which-one-do-i-actually-need-decision-tree">The “Which One Do I Actually Need?” Decision Tree</h3>

<h4 id="or-saving-you-from-our-mistakes">Or: Saving You From Our Mistakes</h4>

<p>Look, we tried ALL the things so you don’t have to. Here’s our hard-won wisdom, gained through tears, tokens, and too much coffee:</p>

<h3 id="final-architecture-overview">Final Architecture Overview</h3>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
    ┌─────────────────────────────────────────────┐
    │ Federal Router Agent (GPT-3.5)              │
    │ 4 routing tools (400 tokens)                │
    │ Fast, cheap classification                  │
    └─────────────────────────────────────────────┘
                      │
          ┌───────────┼───────────┬──────────┐
          ↓           ↓           ↓          ↓
    ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
    │Dashboard │ │ Widget   │ │Datasource│ │ Styling  │
    │ (GPT-4)  │ │ (GPT-4)  │ │ (GPT-4)  │ │(GPT-3.5) │
    ├──────────┤ ├──────────┤ ├──────────┤ ├──────────┤
    │6 tools   │ │8 tools   │ │10 tools  │ │6 tools   │
    │+ Skills: │ │+ Skills: │ │+ Skills: │ │+ Skills: │
    │ filters  │ │ configs  │ │ schemas  │ │ themes   │
    │ layouts  │ │ mapping  │ │ datasets │ │ fonts    │
    │ rules    │ │ styling  │ │ stacks   │ │ a11y     │
    └──────────┘ └──────────┘ └──────────┘ └──────────┘

</code></pre></div></div>
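<p>The model split in the diagram is deliberate: cheap, fast models where the job is routine classification or rule-following, stronger models where real reasoning over schemas and configs is needed. As a config sketch (the model identifiers are illustrative; use whatever your provider calls them):</p>

```typescript
// Cheap model for routing and rule-driven styling, stronger model
// where real reasoning over schemas and configs is needed.
// Model identifiers are illustrative.
const agentModels = {
  federalRouter: "gpt-3.5-turbo", // 4 tools, pure classification
  dashboard: "gpt-4",
  widget: "gpt-4",
  datasource: "gpt-4",
  styling: "gpt-3.5-turbo", // themes/fonts follow fixed rules
} as const;

type Agent = keyof typeof agentModels;

function modelFor(agent: Agent): string {
  return agentModels[agent];
}
```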

<hr />

<h2 id="lessons-from-the-trenches">Lessons From the Trenches</h2>

<h3 id="what-actually-worked-surprisingly">What Actually Worked (Surprisingly)</h3>

<h4 id="1-baby-steps-dont-just-work-theyre-essential">1. Baby Steps Don’t Just Work, They’re Essential</h4>

<p>We wanted to rebuild everything overnight. Our tech lead said “no.” We’re glad he did. Migrate one pattern at a time, using the bands below as a guide:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
┌─────────────────────┬──────────────┬────────────────┐
│ Your Situation      │ Pattern      │ Expected Gain  │
├─────────────────────┼──────────────┼────────────────┤
│ &lt; 10 tools          │ Basic Agent  │ Keep simple    │
│ Simple context      │              │                │
├─────────────────────┼──────────────┼────────────────┤
│ 10-25 tools         │ Skills       │ 50-60% savings │
│ Large context       │              │                │
│ Single domain       │              │                │
├─────────────────────┼──────────────┼────────────────┤
│ 25-50 tools         │ Multi-Agent  │ 70-80% savings │
│ Multiple domains    │              │                │
│ Clear separation    │              │                │
├─────────────────────┼──────────────┼────────────────┤
│ 50+ tools           │ Hybrid       │ 80-85% savings │
│ Complex domains     │              │                │
│ Deep context needs  │              │                │
└─────────────────────┴──────────────┴────────────────┘

</code></pre></div></div>
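<p>The table condenses to a few thresholds. The sketch below mirrors its bands; the exact boundaries are judgment calls, not laws:</p>

```typescript
// The decision table as code. Thresholds mirror the bands above;
// treat them as rough guidance, not hard rules.
type Pattern = "basic" | "skills" | "multi-agent" | "hybrid";

function pickPattern(
  toolCount: number,
  domainCount: number,
  heavyContext: boolean
): Pattern {
  if (toolCount < 10 && !heavyContext) return "basic";
  if (toolCount <= 25 && domainCount === 1) return "skills";
  if (toolCount <= 50) return "multi-agent";
  return "hybrid";
}

// pickPattern(40, 3, true) → "multi-agent"
```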

<p><img src="/engineering/assets/images/decision-tree-flowchart.png" alt="Decision Tree Flowchart" /></p>

<hr />

<h2 id="hard-truths-we-learned-the-expensive-way">Hard Truths We Learned (The Expensive Way)</h2>

<h3 id="1-tool-descriptions-write-them-like-your-job-depends-on-it">1. Tool Descriptions: Write Them Like Your Job Depends On It</h3>

<p>Because your token budget literally does.</p>

<p>❌ Bad (Our First Attempt):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">createWidget</span><span class="p">:</span> <span class="p">{</span>
  <span class="nl">description</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Creates a widget</span><span class="dl">"</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// LLM: "Cool story bro, but WHEN do I use this?"</span>
</code></pre></div></div>

<p>✅ Good (After Many Painful Iterations):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">createWidget</span><span class="p">:</span> <span class="p">{</span>
  <span class="nl">description</span><span class="p">:</span> <span class="s2">`Creates a visualization widget.
  
  Use when: User asks to "add chart/graph/visualization"
  Don't use for: Updating widgets (use update_widget)
  
  // Yes, we're treating the LLM like a junior dev.
  // Yes, it works.`</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Pro tip: Think of tool descriptions as writing documentation for the world’s most literal intern. Because that’s essentially what it is.</p>

<h3 id="2-monitor-what-matters">2. Monitor What Matters</h3>

<p>Track more than just cost:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">logger</span><span class="p">.</span><span class="nf">info</span><span class="p">({</span>
  <span class="na">event</span><span class="p">:</span> <span class="dl">"</span><span class="s2">agent_execution</span><span class="dl">"</span><span class="p">,</span>
  <span class="na">workflowName</span><span class="p">:</span> <span class="dl">"</span><span class="s2">dashboard_creation</span><span class="dl">"</span><span class="p">,</span>
  <span class="na">metadata</span><span class="p">:</span> <span class="p">{</span>
    <span class="na">agentType</span><span class="p">:</span> <span class="dl">"</span><span class="s2">widget_agent</span><span class="dl">"</span><span class="p">,</span>
    <span class="na">skillsLoaded</span><span class="p">:</span> <span class="p">[</span><span class="dl">"</span><span class="s2">bar_chart_config</span><span class="dl">"</span><span class="p">],</span>
    <span class="na">tokensUsed</span><span class="p">:</span> <span class="mi">3800</span><span class="p">,</span>
  <span class="p">},</span>
<span class="p">});</span>
</code></pre></div></div>
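<p>Structured logs only pay off if you aggregate them. A minimal sketch that sums token usage per agent (the log shape matches the entry above) so the most expensive specialist is obvious at a glance:</p>

```typescript
// Aggregate tokensUsed by agentType so you know which
// specialist to optimize next.
interface AgentLogEntry {
  event: string;
  workflowName: string;
  metadata: { agentType: string; skillsLoaded: string[]; tokensUsed: number };
}

function tokensByAgent(logs: AgentLogEntry[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const entry of logs) {
    const agent = entry.metadata.agentType;
    totals[agent] = (totals[agent] ?? 0) + entry.metadata.tokensUsed;
  }
  return totals;
}
```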

<h3 id="3-progressive-skill-rollout">3. Progressive Skill Rollout</h3>

<p>Don’t implement all skills at once. Start with the most-used:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Priority 1: survey_schema (80% of requests)
Priority 2: bar_chart_config (60% of requests)
Priority 3: widget_styling (40% of requests)
Priority 4: dashboard_filters (30% of requests)
</code></pre></div></div>

<h3 id="4-dont-over-specialize-learn-from-our-hubris">4. Don’t Over-Specialize (Learn From Our Hubris)</h3>

<p>In our enthusiasm, we initially created 8 agents. EIGHT. We were basically creating an AI bureaucracy.</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Original (Too Many):
├─ Dashboard Agent
├─ Tab Agent ❌ (seriously, tabs needed their own agent?)
├─ Widget Agent
├─ Widget Styling Agent ❌ (why did we do this to ourselves)
├─ Survey Agent ❌
├─ Dataset Agent ❌
├─ Analytics Agent ❌
└─ Export Agent ❌

After Therapy (Better):
├─ ✅ Dashboard Agent (includes tabs, we're not savages)
├─ ✅ Widget Agent (includes styling, it's fine)
├─ ✅ Datasource Agent (all data sources, one happy family)
└─ ✅ Styling Agent (themes + accessibility)

</code></pre></div></div>

<p>Sweet spot: <strong>4-6 agents</strong></p>

<p>More than that and you’re just creating coordination overhead. It’s like having too many group chats - eventually nobody knows what’s happening where.</p>

<hr />

<h2 id="quick-wins-thatll-make-you-look-like-a-hero">Quick Wins That’ll Make You Look Like a Hero</h2>

<h3 id="seriously-do-these-today">(Seriously, Do These Today)</h3>

<h4 id="1-stop-writing-novellas-in-your-system-prompts-30-minutes">1. Stop Writing Novellas in Your System Prompts (30 minutes)</h4>

<p>Your system prompt is not the place to explain your life story, company history, and philosophical stance on data visualization.</p>
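<p>If you want a guardrail rather than willpower, you can lint prompt length in CI. A minimal sketch, assuming a crude 4-characters-per-token estimate and an arbitrary 500-token budget (tune both for your stack; this is not a real tokenizer):</p>

```typescript
// Crude token estimate: roughly 4 characters per token for English text.
// Good enough to catch a novella; not a substitute for a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const MAX_PROMPT_TOKENS = 500; // illustrative budget, adjust for your agent

function checkSystemPrompt(prompt: string): { ok: boolean; tokens: number } {
  const tokens = estimateTokens(prompt);
  return { ok: tokens <= MAX_PROMPT_TOKENS, tokens };
}
```

<p>Run it against every system prompt in the repo and fail the build when someone sneaks a novella back in.</p>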

<h4 id="2-return-structured-data-from-skills">2. Return Structured Data from Skills</h4>

<p>❌ Bad (text):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="dl">"</span><span class="s2">The survey has gender as multiple choice and age as numeric...</span><span class="dl">"</span><span class="p">;</span>
</code></pre></div></div>

<p>✅ Good (JSON):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
  <span class="nl">questions</span><span class="p">:</span> <span class="p">[</span>
    <span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">67890</span><span class="p">,</span> <span class="na">text</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Gender?</span><span class="dl">"</span><span class="p">,</span> <span class="na">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">multiple_choice</span><span class="dl">"</span> <span class="p">},</span>
    <span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">67891</span><span class="p">,</span> <span class="na">text</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Age?</span><span class="dl">"</span><span class="p">,</span> <span class="na">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">numeric</span><span class="dl">"</span> <span class="p">},</span>
  <span class="p">];</span>
<span class="p">}</span>
</code></pre></div></div>

<h4 id="3-dont-over-specialize">3. Don’t Over-Specialize</h4>

<p>We initially had 8 agents, consolidated to 4:</p>

<ul>
  <li>✅ Dashboard Agent (includes tabs)</li>
  <li>✅ Widget Agent (includes styling)</li>
  <li>✅ Datasource Agent (all data sources)</li>
  <li>✅ Styling Agent (themes + accessibility)</li>
</ul>

<hr />

<h2 id="lessons-learned-what-weve-learned-so-far">Lessons Learned (So Far)</h2>

<h3 id="the-real-truth-about-building-ai-agents">The Real Truth About Building AI Agents</h3>

<p>Here’s what they don’t tell you in the glossy blog posts and conference talks:</p>

<p>Building scalable AI agents isn’t about having the most sophisticated architecture from day one. It’s about:</p>

<ol>
  <li><strong>Start simple</strong> - Basic agent for MVP (it’s okay, we all started here)</li>
  <li><strong>Measure everything</strong> - If you’re not tracking tokens, you’re flying blind</li>
  <li><strong>Fail fast, learn faster</strong> - We made EVERY mistake so you don’t have to</li>
  <li><strong>Evolve incrementally</strong> - Skills → Multi-Agent → Hybrid (this is the way)</li>
  <li><strong>Focus on user value</strong> - Fast, accurate, reliable (novel concept, we know)</li>
</ol>
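<p>Point 2 is worth making concrete. A per-request token ledger can start as small as this (a sketch; the aggregation and field names are hypothetical, loosely mirroring the telemetry shown earlier in the post):</p>

```typescript
interface RequestRecord {
  agentType: string;
  tokensUsed: number;
}

// Aggregate tokens per agent so cost hot spots are visible at a glance.
function tokensByAgent(records: RequestRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.agentType, (totals.get(r.agentType) ?? 0) + r.tokensUsed);
  }
  return totals;
}
```

<p>Once totals per agent are visible, the expensive agent stops being a mystery and becomes a line item.</p>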

<h3 id="our-journey-in-numbers-the-beforeafter-youve-been-waiting-for">Our Journey in Numbers (The Before/After You’ve Been Waiting For)</h3>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Where We Started (The Dark Times):
├─ Cost: $4,500/month (gulp)
├─ Tokens: 15k per request (yikes)
├─ Success: 89% (not great, Bob)
└─ Team morale: Low

Where We're At Now (The Good Times):
├─ Cost: $780/month (manageable!)
├─ Tokens: 6.6k per request (sustainable)
├─ Success: 97% (actually good!)
└─ Team morale: High
</code></pre></div></div>

<h3 id="whats-next-or-what-were-learning-next">What’s Next (Or: What We’re Learning Next)</h3>

<p>As we continue building and scaling QuestionPro’s BI agent, here’s what we’re watching:</p>

<ul>
  <li><strong>Context management is king</strong> - Progressive disclosure isn’t optional, it’s survival</li>
  <li><strong>Specialization is inevitable</strong> - One agent doing everything is like one person doing all jobs at a company (recipe for disaster)</li>
  <li><strong>Hybrid is the sweet spot</strong> - Why choose between good ideas?</li>
  <li><strong>Cost optimization is non-negotiable</strong> - It’s literally the difference between “cool AI feature” and “bankrupt company”</li>
  <li><strong>Testing is hard</strong> - LLMs are non-deterministic. Our test suite has trust issues.</li>
  <li><strong>Documentation matters</strong> - Future you will thank present you (we learned this the hard way)</li>
</ul>

<h3 id="resources-that-actually-helped-us">Resources (That Actually Helped Us)</h3>

<p><strong>LangGraph Documentation:</strong></p>

<ul>
  <li>Multi-Agent Patterns (bookmark this, seriously)</li>
  <li>Progressive Disclosure / Skills Pattern (game changer)</li>
  <li>Examples that actually work (rare, treasure them)</li>
</ul>

<p><strong>Monitoring Tools We Actually Use:</strong></p>

<ul>
  <li>Langfuse (our current favorite)</li>
  <li>LangSmith (also good)</li>
  <li>Custom OpenTelemetry (for the brave)</li>
  <li>Coffee (monitors our alertness)</li>
</ul>

<h3 id="calculate-your-potential-savings-do-this-right-now">Calculate Your Potential Savings (Do This Right Now)</h3>

<p>Seriously, take 2 minutes and see what you could be saving:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Your current situation</span>
<span class="kd">const</span> <span class="nx">currentCost</span> <span class="o">=</span> <span class="p">((</span><span class="nx">monthlyRequests</span> <span class="o">*</span> <span class="nx">avgTokens</span><span class="p">)</span> <span class="o">/</span> <span class="mi">1</span><span class="nx">_000_000</span><span class="p">)</span> <span class="o">*</span> <span class="nx">$10</span><span class="p">;</span>

<span class="c1">// What you COULD be saving</span>
<span class="kd">const</span> <span class="nx">potentialSavings</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">withSkills</span><span class="p">:</span> <span class="nx">currentCost</span> <span class="o">*</span> <span class="mf">0.5</span><span class="p">,</span> <span class="c1">// 50% reduction</span>
  <span class="na">withMultiAgent</span><span class="p">:</span> <span class="nx">currentCost</span> <span class="o">*</span> <span class="mf">0.7</span><span class="p">,</span> <span class="c1">// 70% reduction</span>
  <span class="na">withHybrid</span><span class="p">:</span> <span class="nx">currentCost</span> <span class="o">*</span> <span class="mf">0.85</span><span class="p">,</span> <span class="c1">// 85% reduction</span>
<span class="p">};</span>

<span class="c1">// Now imagine what you could do with that money</span>
<span class="c1">// (We're thinking coffee budget + GPU upgrades)</span>
</code></pre></div></div>

<hr />

<h2 id="final-thoughts-the-honest-ones">Final Thoughts (The Honest Ones)</h2>

<p>Look, we’re still figuring this out. We’re still learning. We’re still making mistakes (just more expensive ones now that we know better). But that’s the point: nobody has this completely figured out. AI is moving faster than documentation can keep up.</p>

<p>What we DO know:</p>

<ul>
  <li>Start simple, evolve based on actual problems</li>
  <li>Measure EVERYTHING (seriously, token tracking is non-negotiable)</li>
  <li>Don’t over-engineer (we did this so you don’t have to)</li>
  <li>Community knowledge is invaluable (thank you, internet strangers)</li>
  <li>It’s okay to not know everything (we certainly don’t)</li>
</ul>

<p><img src="/engineering/assets/images/journey-visualization.png" alt="Journey Visualization" /></p>

<h3 id="the-future">The Future</h3>

<p>As agents scale:</p>

<ul>
  <li><strong>Context management is king</strong> - Progressive disclosure wins</li>
  <li><strong>Specialization is inevitable</strong> - One agent can’t do it all</li>
  <li><strong>Hybrid is the sweet spot</strong> - Combines best of all patterns</li>
  <li><strong>Cost optimization is non-negotiable</strong> - Make or break for production</li>
</ul>]]></content><author><name>Akhil Kumar</name></author><category term="AI &amp; Machine Learning" /><category term="Software Architecture" /><category term="Engineering" /><category term="Cost Optimization" /><category term="ai agents" /><category term="multi-agent systems" /><category term="langgraph" /><category term="langchain" /><category term="token optimization" /><category term="cost optimization" /><category term="llm" /><category term="gpt-4" /><category term="progressive disclosure" /><category term="software architecture" /><category term="engineering patterns" /><category term="QuestionPro" /><summary type="html"><![CDATA[Or: How I Learned to Stop Worrying and Love the Token Budget]]></summary></entry><entry><title type="html">Fresher to Engineering Manager: Journey at QuestionPro</title><link href="https://www.questionpro.com/engineering/personal%20growth/continuous%20learning/professional%20growth/leadership/questionpro%20culture/fresher-to-engineering-manager-journey-at-questionpro/" rel="alternate" type="text/html" title="Fresher to Engineering Manager: Journey at QuestionPro" /><published>2026-02-17T00:00:00+00:00</published><updated>2026-02-17T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/personal%20growth/continuous%20learning/professional%20growth/leadership/questionpro%20culture/fresher-to-engineering-manager-journey-at-questionpro</id><content type="html" xml:base="https://www.questionpro.com/engineering/personal%20growth/continuous%20learning/professional%20growth/leadership/questionpro%20culture/fresher-to-engineering-manager-journey-at-questionpro/"><![CDATA[<p>When I look back on my journey with <strong>QuestionPro</strong>, which began in <strong>March 2012</strong>, I’m filled with gratitude, pride, and a deep sense of belonging. 
What started as my first professional step after earning a <strong>degree in Computer Engineering from Pune University</strong> has evolved into a career-defining experience spanning over a decade.</p>

<h2 id="the-early-days">The Early Days</h2>

<p>When I joined QuestionPro, the company had just moved its office from <strong>Nashik to Pune</strong>. Our first workspace was small but cozy—more like a collaborative lab than a traditional office. There were only <strong>eight of us</strong> then, wearing multiple hats and learning as we went.</p>

<p>I started as a <strong>Java Developer</strong>, primarily focused on <strong>bug fixes, feature testing,</strong> and writing or updating <strong>help documentation</strong>. Those early tasks might seem small, but they taught me the importance of precision, patience, and understanding the product from the user’s perspective.</p>

<p><img src="/engineering/assets/images/questionpro-pune-team.jpg" alt="QuestionPro Pune Lonavala Trip" /></p>

<h2 id="learning-mentorship-and-confidence">Learning, Mentorship, and Confidence</h2>

<p>The next two years were a turning point. Under the mentorship of <strong>Shri and Anish</strong>, I began to take on larger challenges, own modules, and develop new features. Their guidance shaped not only my technical skills but also my professional mindset.</p>

<p>They taught me that leadership isn’t just about giving directions—it’s about <strong>owning outcomes</strong> and continuously learning. Even today, they remain my <strong>role models</strong>, and I still find myself learning from their approach to problem-solving and innovation.</p>

<h2 id="stepping-into-leadership">Stepping into Leadership</h2>

<p>As QuestionPro grew, so did my responsibilities. I transitioned into a <strong>Team Lead (TL)</strong> role, managing a small team while continuing to code and contribute hands-on. This phase was both exciting and demanding—it pushed me to develop skills in <strong>people management, communication</strong>, and <strong>strategic thinking</strong>.</p>

<p>However, what truly made it memorable were the moments outside of <strong>work</strong>—our <strong>Friday night gatherings, table tennis matches</strong>, and <strong>cricket games</strong>. These experiences built lasting friendships and strengthened our culture of collaboration and trust.</p>

<h2 id="working-with-visionary-leaders">Working with Visionary Leaders</h2>

<p>One of the highlights of my career has been the opportunity to work directly with <strong>Vivek Bhaskaran</strong>, our <strong>Founder &amp; CEO</strong>, and <strong>Erik Koto</strong>, our <strong>COO/Ex-CEO</strong>. Having direct access to the leadership team gave me a unique perspective on how decisions are made and how innovation is driven at scale.</p>

<p>The management team at QuestionPro has always been <strong>accessible, supportive</strong>, and <strong>empathetic</strong>. They’ve built a culture where everyone’s ideas are valued, and every challenge is seen as an opportunity to grow.</p>

<h2 id="growth-ownership-and-empowerment">Growth, Ownership, and Empowerment</h2>

<p>Over the years, my focus has shifted from purely technical work to <strong>engineering leadership</strong>. I’ve learned the art of <strong>hiring, mentoring</strong>, and <strong>helping team members</strong> chart their own career paths. Watching people I’ve mentored grow into confident engineers and leaders has been one of the most rewarding aspects of my journey.</p>

<p>Today, as an <strong>Engineering Manager</strong>, I handle <strong>three different products</strong>, each with its own challenges and learning curves. The responsibility is immense, but so is the satisfaction that comes with building scalable, user-centered products that impact thousands of users worldwide.</p>

<h2 id="what-makes-questionpro-special">What Makes QuestionPro Special</h2>

<p>If I were to summarize what makes QuestionPro unique, it would be its <strong>open culture</strong> and <strong>growth mindset</strong>. Here, you’re encouraged to take initiative, challenge assumptions, and experiment with new ideas.</p>

<p>Failures aren’t punished—they’re treated as learning opportunities. That philosophy has allowed me to evolve continuously, both as an engineer and as a leader.</p>

<p><img src="/engineering/assets/images/questionpro-pune-lonavala-trip.jpg" alt="QuestionPro Pune Lonavala Trip" /></p>

<h2 id="looking-backand-ahead">Looking Back—and Ahead</h2>

<p>From debugging my first piece of code to managing multiple teams and products, every step of this journey has taught me something valuable.</p>

<ul>
  <li>
    <p><strong>QuestionPro</strong> has given me more than a career—it has given me a community.</p>
  </li>
  <li>
    <p>It has shown me that growth doesn’t happen by chance; it happens by <strong>staying curious, embracing challenges</strong>, and <strong>never stopping learning</strong>.</p>
  </li>
  <li>
    <p>And most importantly, it has reinforced my belief that <strong>great products are built by great people</strong> who trust and learn from each other.</p>
  </li>
</ul>

<p>As I look to the future, I’m excited to continue this journey of building, leading, and learning—together with a team that feels more like family.</p>

<hr />]]></content><author><name>Kiran Dongare</name></author><category term="Personal Growth" /><category term="Continuous Learning" /><category term="Professional Growth" /><category term="Leadership" /><category term="QuestionPro Culture" /><category term="personal growth" /><category term="professional growth" /><category term="leadership" /><category term="communication" /><category term="team culture" /><category term="people management" /><category term="continuous learning" /><category term="questionpro journey" /><summary type="html"><![CDATA[When I look back on my journey with QuestionPro, which began in March 2012, I’m filled with gratitude, pride, and a deep sense of belonging. What started as my first professional step after earning a degree in Computer Engineering from Pune University has evolved into a career-defining experience spanning over a decade.]]></summary></entry><entry><title type="html">The Art of Not Breaking Things: A CbC Manifesto… a mindset to build polished code</title><link href="https://www.questionpro.com/engineering/software%20engineering%20philosophy/code%20quality%20&%20maintainability/programming%20best%20practices/type%20systems%20&%20safety/backend%20&%20application%20architecture/developer%20mindset%20&%20productivity/correctness-by-construction/" rel="alternate" type="text/html" title="The Art of Not Breaking Things: A CbC Manifesto… a mindset to build polished code" /><published>2026-02-05T00:00:00+00:00</published><updated>2026-02-05T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/software%20engineering%20philosophy/code%20quality%20&amp;%20maintainability/programming%20best%20practices/type%20systems%20&amp;%20safety/backend%20&amp;%20application%20architecture/developer%20mindset%20&amp;%20productivity/correctness-by-construction</id><content type="html" 
xml:base="https://www.questionpro.com/engineering/software%20engineering%20philosophy/code%20quality%20&amp;%20maintainability/programming%20best%20practices/type%20systems%20&amp;%20safety/backend%20&amp;%20application%20architecture/developer%20mindset%20&amp;%20productivity/correctness-by-construction/"><![CDATA[<p>Writing code is a lot like carpentry. If you eyeball the cut, saw like a maniac with confusion in mind, and then later try to fix the gap with glue, you aren’t building a table, you are building a hazard.</p>

<p><strong>Correctness by Construction (CbC)</strong> is the shift from the chaotic energy of “Let’s see if this runs” to the calm confidence of “I know this runs because I made it impossible for it not to.”</p>

<p>P.S. “No, not Complete Blood Count (CBC) :3”</p>

<p>Before jumping into how to achieve the CbC mindset, let me talk about the potential tradeoffs:</p>

<h3 id="pros">Pros</h3>

<ul>
  <li><strong>Peace</strong>: You spend 90% less time in the debugger (which is just a crime scene where you are both the detective and the murderer).</li>
  <li><strong>Sleep</strong>: Yeah.</li>
</ul>

<h3 id="cons">Cons</h3>

<ul>
  <li><strong>Slower Start</strong>: You have to think before you type. It feels less like “hacking” and more like “engineering.”</li>
  <li><strong>Boredom</strong>: You won’t look as busy to your peers because you aren’t “putting out fires” constantly.</li>
</ul>

<p>Here is the toolkit for your CbC journey, a guide to replacing ‘hope’ with the certainty of flawless, intentional design.</p>

<h2 id="the-prologue-strong-specifications">The Prologue: Strong Specifications</h2>

<p><img src="/engineering/assets/images/cbc-strong-specifications.jpg" alt="Strong Specifications" /></p>

<p>Before you write a single line of code, write the story. Define exactly what the function does.</p>

<ul>
  <li><strong>The Vibe</strong>: Don’t build a house based on a napkin sketch.</li>
  <li><strong>The Habit</strong>: If you can’t explain the constraints in English, you certainly can’t explain them in JavaScript ;)</li>
</ul>

<hr />

<h2 id="the-bouncer-safe-languages--types">The Bouncer: Safe Languages &amp; Types</h2>

<p><img src="/engineering/assets/images/cbc-bouncer.jpg" alt="The Bouncer" /></p>

<p>Use your type system to forbid “impossible” situations. A bug cannot happen if the code literally refuses to compile it.</p>

<p><strong>Don’t:</strong> Build a “Frankenstein” object full of optional flags</p>

<ul>
  <li><strong>Bad</strong>: A request state like</li>
</ul>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span> <span class="nl">isLoading</span><span class="p">:</span> <span class="nx">boolean</span><span class="p">,</span> <span class="nx">error</span><span class="p">?:</span> <span class="kr">string</span><span class="p">,</span> <span class="nx">data</span><span class="p">?:</span> <span class="kr">string</span> <span class="p">}</span>
</code></pre></div></div>

<ul>
  <li><strong>The Trap</strong>: You can accidentally create a state where isLoading: true and error: “Failed”. Is it loading? Is it broken? The code is confused, and so are you.</li>
</ul>

<p><strong>Do</strong>: Use Discriminated Unions. Force the data to exist only when the state allows it.</p>

<ul>
  <li><strong>Good</strong>: Define distinct states</li>
</ul>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">State</span> <span class="o">=</span> <span class="p">{</span> <span class="na">status</span><span class="p">:</span> <span class="dl">"</span><span class="s2">loading</span><span class="dl">"</span> <span class="p">}</span> <span class="o">|</span> <span class="p">{</span> <span class="na">status</span><span class="p">:</span> <span class="dl">"</span><span class="s2">success</span><span class="dl">"</span><span class="p">;</span> <span class="nl">data</span><span class="p">:</span> <span class="kr">string</span> <span class="p">};</span>
</code></pre></div></div>

<p>If you check status === ‘loading’, TypeScript physically prevents you from accessing .data. It doesn’t exist yet. The compiler catches the bug before you even run the code.</p>
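<p>To make that narrowing concrete, here is a minimal consumer (the render function is our illustration, not part of the original example):</p>

```typescript
type State = { status: "loading" } | { status: "success"; data: string };

function render(state: State): string {
  if (state.status === "loading") {
    // state.data does not exist on this variant; accessing it
    // would be a compile-time error, not a runtime surprise.
    return "Spinner";
  }
  // TypeScript has narrowed state to the success variant here,
  // so .data is guaranteed to be present.
  return state.data;
}
```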

<h2 id="the-handshake-design-by-contract">The Handshake: Design-by-Contract</h2>

<p><img src="/engineering/assets/images/cbc-design-by-contract.jpg" alt="The Handshake" /></p>

<p>Every function is a business deal.</p>

<ul>
  <li><strong>Preconditions</strong>: “You promise to give me a positive number.”</li>
  <li><strong>Postconditions</strong>: “I promise to return its square root.”</li>
  <li><strong>Invariants</strong>: “I promise the database never catches fire.” If the contract is broken, the program shouldn’t try to limp along; it should shout immediately!</li>
</ul>
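<p>In languages without built-in contracts, plain assertions get you most of the way there. A hedged sketch; the requires/ensures helpers are hypothetical names, not a library API:</p>

```typescript
// Minimal contract helpers: fail loudly the moment a promise is broken.
function requires(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Precondition violated: ${message}`);
}

function ensures(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Postcondition violated: ${message}`);
}

function sqrtChecked(x: number): number {
  requires(x >= 0, "input must be non-negative");
  const result = Math.sqrt(x);
  ensures(
    result >= 0 && Math.abs(result * result - x) < 1e-9,
    "result must square back to the input"
  );
  return result;
}
```

<p>The point is the failure mode: a broken contract throws at the exact call site instead of surfacing three modules later as a mysterious NaN.</p>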

<h2 id="the-haiku-modularity--simplicity">The Haiku: Modularity &amp; Simplicity</h2>

<p><img src="/engineering/assets/images/cbc-modularity.jpg" alt="The Haiku" /></p>

<p>Complexity is where bugs hide to reproduce. Keep your components small, pure, and simple.</p>

<ul>
  <li><strong>The Philosophy</strong>: Write code like a Haiku, not a dissertation. If a function does three things, it’s doing two things too many.</li>
</ul>

<h2 id="code-hygiene-avoid-error-prone-patterns">Code Hygiene: Avoid Error-Prone Patterns</h2>

<p><img src="/engineering/assets/images/cbc-lawsuit.jpg" alt="Code Hygiene" /></p>

<p>Some patterns are just slippery floors waiting for a lawsuit.</p>

<ul>
  <li><strong>Don’t</strong>: Use shared mutable state (global variables that everyone touches). That’s like sharing a toothbrush.</li>
  <li><strong>Do</strong>: Use Pure Functions (same input always equals same output) and Result types instead of throwing random Exceptions.</li>
</ul>
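<p>A Result type in TypeScript is just another discriminated union, the same trick the Bouncer section used. A minimal sketch:</p>

```typescript
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Pure function: same input, same output, and the failure case is
// part of the signature instead of a surprise exception.
function parsePositiveInt(raw: string): Result<number, string> {
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    return { ok: false, error: `"${raw}" is not a positive integer` };
  }
  return { ok: true, value: n };
}
```

<p>Callers are forced to check ok before touching value, so the failure path gets handled where it happens.</p>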

<h2 id="the-epilogue-tests-confirm-they-dont-fix">The Epilogue: Tests Confirm, They Don’t Fix</h2>

<p><img src="/engineering/assets/images/cbc-confirm.jpg" alt="Tests Confirm" /></p>

<p>Testing is your safety net, not your construction plan.</p>

<ul>
  <li><strong>The Shift</strong>: If you rely on tests to find all your bugs, you’ve already lost.</li>
  <li><strong>The Goal</strong>: Build it so solid that the tests are just a formality, a victory lap to prove you were right all along.</li>
</ul>

<h2 id="the-grand-finale">The Grand Finale</h2>

<p>Adopt the mindset of a sculptor. You don’t chip away stone and hope it looks like a horse later. You verify every angle as you go.</p>]]></content><author><name>Syed Saad</name></author><category term="Software Engineering Philosophy" /><category term="Code Quality &amp; Maintainability" /><category term="Programming Best Practices" /><category term="Type Systems &amp; Safety" /><category term="Backend &amp; Application Architecture" /><category term="Developer Mindset &amp; Productivity" /><category term="Correctness by Construction" /><category term="typescript" /><category term="Design by Contract" /><category term="Discriminated Unions" /><category term="Functional Programming Concepts" /><category term="Bug Prevention" /><category term="clean code" /><category term="kiss" /><category term="dry" /><category term="srp" /><category term="software development" /><category term="best practices" /><summary type="html"><![CDATA[Writing code is a lot like carpentry. If you eyeball the cut, saw like a maniac with confusion in mind, and then later try to fix the gap with glue, you aren’t building a table, you are building a hazard.]]></summary></entry><entry><title type="html">QuestionPro TechCommit Pune: Scaling Architectures, AI, and Networking</title><link href="https://www.questionpro.com/engineering/technology%20meetup/software%20engineering/data%20science/questionpro-techcommit-pune-scaling-ai-dec-13/" rel="alternate" type="text/html" title="QuestionPro TechCommit Pune: Scaling Architectures, AI, and Networking" /><published>2025-12-05T00:00:00+00:00</published><updated>2025-12-05T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/technology%20meetup/software%20engineering/data%20science/questionpro-techcommit-pune-scaling-ai-dec-13</id><content type="html" xml:base="https://www.questionpro.com/engineering/technology%20meetup/software%20engineering/data%20science/questionpro-techcommit-pune-scaling-ai-dec-13/"><![CDATA[<p>Are you a 
developer, software engineer, or data scientist in the <strong>Pune</strong> area looking to connect, collaborate, and dive deep into cutting-edge technology?</p>

<p>QuestionPro is excited to announce <strong>TechCommit</strong>, a focused technical meetup designed to accelerate your growth and provide actionable insights into building high-performance, intelligent systems. Join us in our <strong>Pune office</strong> on <strong>Saturday, December 13th</strong>, for two power-packed sessions!</p>

<h2 id="️-essential-event-details">🗓️ Essential Event Details</h2>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Detail</th>
      <th style="text-align: left">Information</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left"><strong>What</strong></td>
      <td style="text-align: left">QuestionPro TechCommit Developer Meetup</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>When</strong></td>
      <td style="text-align: left">Saturday, December 13th (10:00 AM – 1:30 PM IST)</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Where</strong></td>
      <td style="text-align: left"><a href="https://maps.app.goo.gl/b2j8oBmmtWum9GAp7">QuestionPro, 1B, Nano Space IT Park, Baner, Pune, Maharashtra</a></td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Registration Deadline</strong></td>
      <td style="text-align: left">Thursday, December 11th</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="️-agenda-deep-dives-by-questionpro-experts">🎙️ Agenda: Deep Dives by QuestionPro Experts</h2>

<p>We have lined up two high-impact talks from our senior experts, focusing on the real-world challenges of building highly available systems and leveraging Artificial Intelligence for data analysis.</p>

<h3 id="talk-1-from-zero-to-10k-live-users-scaling-real-time-engagement-platforms">Talk 1: From Zero to 10k Live Users: Scaling Real-time Engagement Platforms</h3>

<p><strong>Speaker: <a href="https://www.linkedin.com/in/rohan-rao-kn8629/">Rohan Rao</a>, Software Engineer</strong></p>

<p>Building a real-time platform that handles thousands of concurrent users is a critical challenge. Rohan Rao, a Software Engineer with over 7 years of experience, will share lessons learned and practical strategies for:</p>

<ul>
  <li>Designing resilient and <strong>scalable architectures</strong>.</li>
  <li>Implementing <strong>high-performance solutions</strong> that maintain speed under heavy load.</li>
</ul>

<h3 id="talk-2-ai-in-action-building-a-smarter-topic-analysis-system">Talk 2: AI in Action: Building a Smarter Topic Analysis System</h3>

<p><strong>Speaker: <a href="https://www.linkedin.com/in/pratikdhulubulu/">Pratik Dhulubulu</a>, Data Scientist</strong></p>

<p>Artificial intelligence is rapidly transforming how companies understand user data. Pratik Dhulubulu, a Data Scientist with 4+ years of experience, will move past the hype and demonstrate how to:</p>

<ul>
  <li>Leverage AI to <strong>solve complex data problems</strong>.</li>
  <li>Build a sophisticated, data-backed <strong>topic analysis system</strong> using modern <strong>Python</strong> techniques.</li>
</ul>

<hr />

<h2 id="why-this-pune-tech-meetup-is-a-must-attend">Why This Pune Tech Meetup Is a Must-Attend</h2>

<ol>
  <li><strong>Focused Learning:</strong> Gain high-value, technical knowledge from experienced practitioners in a short, efficient format.</li>
  <li><strong>Networking:</strong> Connect face-to-face with fellow software engineers, data scientists, and developers in the vibrant <strong>Pune tech scene</strong>.</li>
  <li><strong>Community:</strong> Discuss the real-world implementation challenges of <strong>scaling</strong> and <strong>AI</strong> in a professional environment.</li>
</ol>

<h2 id="secure-your-spot">Secure Your Spot</h2>

<p>Spaces for this <strong>developer meetup in Pune</strong> are limited. Don’t miss out!</p>

<p>The <strong>last day to register is Thursday, December 11th</strong>.</p>

<p><strong><a href="https://techcommit.questionpro.com/13-dec-2025">Register Now</a></strong></p>

<p>We look forward to connecting with you at the QuestionPro office in Baner!</p>]]></content><author><name>QuestionPro Events Team</name></author><category term="Technology Meetup" /><category term="Software Engineering" /><category term="Data Science" /><category term="Pune Tech Meetup" /><category term="Scaling Architecture" /><category term="Real-time Systems" /><category term="AI" /><category term="Data Science" /><category term="QuestionPro" /><category term="developer meetup" /><summary type="html"><![CDATA[Are you a developer, software engineer, or data scientist in the Pune area looking to connect, collaborate, and dive deep into cutting-edge technology?]]></summary></entry><entry><title type="html">Scaling LivePolls to 10K Concurrent Users - A Journey Beyond Limits</title><link href="https://www.questionpro.com/engineering/engineering/scalability/performance%20optimization/scaling-livepolls-to-10k-concurrent-users/" rel="alternate" type="text/html" title="Scaling LivePolls to 10K Concurrent Users - A Journey Beyond Limits" /><published>2025-11-06T00:00:00+00:00</published><updated>2025-11-06T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/engineering/scalability/performance%20optimization/scaling-livepolls-to-10k-concurrent-users</id><content type="html" xml:base="https://www.questionpro.com/engineering/engineering/scalability/performance%20optimization/scaling-livepolls-to-10k-concurrent-users/"><![CDATA[<p>A software product that solves a business problem is good. But a software product that scales as the business grows - that’s <strong>excellent</strong>!</p>

<p>As the saying goes:</p>

<blockquote>
  <p>“In the digital age, scaling is the compass that guides your product toward success in the vast ocean of possibilities.”</p>
</blockquote>

<p>This blog is a story of how we scaled <strong>LivePolls</strong>, a real-time engagement product from QuestionPro, to handle a staggering <strong>10,000 participants in a single session</strong> - on a single server.</p>

<p>You’ll see the problems we faced, the creative solutions we engineered, and the lessons we learned along the way. So grab your favorite drink and some snacks - this is a scaling story you’ll actually enjoy reading.</p>

<h2 id="a-bit-of-context">A Bit of Context</h2>

<p><strong>LivePolls</strong> is a free product from QuestionPro that allows users to engage audiences in real time - through polls, quizzes, and interactive questions.</p>

<ul>
  <li>The <strong>Admin</strong> creates a quiz or poll and starts a live session</li>
  <li><strong>Participants</strong> join using a session PIN</li>
  <li>Everyone engages together in real-time through dynamic charts, leaderboards, and results</li>
</ul>

<p>At its core, LivePolls uses <strong>WebSockets</strong> for fast, real-time communication between clients and servers. And when we say real-time, we mean <strong>milliseconds matter</strong>.</p>

<h2 id="the-challenge">The Challenge</h2>

<p>One of QuestionPro’s enterprise customers wanted to host a LivePoll session with over <strong>5,000 participants</strong>.</p>

<p>Our previous tests had maxed out around <strong>2,000 participants per server</strong>. Beyond that, things… well, broke.</p>

<p>We lacked infrastructure, saw memory leaks, hit socket connection limits, and faced painfully slow queries. And thus began our epic journey to scale.</p>

<h2 id="problems-we-faced">Problems We Faced</h2>

<ol>
  <li>No infrastructure to load-test beyond 2,000 participants</li>
  <li>Memory leak in the application</li>
  <li>Nginx limits on concurrent socket connections</li>
  <li>Slow MySQL queries causing high memory usage and timeouts</li>
</ol>

<h2 id="solutions-we-devised">Solutions We Devised</h2>

<h3 id="1-building-infrastructure-for-10000-virtual-participants">1. Building Infrastructure for 10,000 Virtual Participants</h3>

<p>Since LivePolls uses WebSockets, we first tried <strong>Artillery.io</strong> for load testing - it’s a solid library for socket-based tests.</p>

<p>But we quickly hit a wall:</p>

<ul>
  <li>Only ~1,000 WebSocket connections per local machine</li>
  <li>A steep learning curve for complex test flows (like joining, answering, viewing results)</li>
</ul>

<p>So, we decided to go rogue and wrote our own simple JavaScript script to simulate participant actions:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>joining → answering → viewing results
</code></pre></div></div>
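<p>Sketched out, a participant simulator can be this small (illustrative only - the event names, payloads, and the <code>planVMs</code> helper are assumptions, not our production script; the WebSocket client is the one built into recent Node.js versions):</p>

```javascript
// Hypothetical participant simulator: joining → answering → viewing results.
// Event names and payload shapes are assumptions for illustration.

// Each free-tier VM held roughly 1,000 sockets, so this tells us how many
// VMs a target participant count needs (10,000 / 1,000 = 10).
function planVMs(totalParticipants, connectionsPerVM) {
  return Math.ceil(totalParticipants / connectionsPerVM);
}

function simulateParticipant(url, pin) {
  // The global WebSocket client requires a recent Node.js runtime.
  const ws = new WebSocket(url);
  ws.addEventListener("open", () => {
    ws.send(JSON.stringify({ type: "join", pin })); // joining
  });
  ws.addEventListener("message", (event) => {
    const msg = JSON.parse(event.data);
    if (msg.type === "question") {
      // answering: always pick the first option
      ws.send(JSON.stringify({ type: "answer", choice: 0 }));
    }
    // "result" messages are simply received - that is "viewing results"
  });
  return ws;
}
```

<p>One such script per connection, a thousand connections per VM, ten VMs - and the virtual audience is ready.</p>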

<p>Then came the real question - where do we run these thousands of connections?</p>

<p>We leveraged <strong>free-tier cloud VMs from AWS</strong>, since:</p>

<ul>
  <li>There was no limit on the number of VMs we could spin up</li>
  <li>Each VM could handle around 1,000 socket connections</li>
  <li>So, <strong>10 VMs = 10,000 virtual participants</strong>. Simple math. Beautiful scaling.</li>
</ul>

<p>When we later moved the setup to our in-house test servers, we had to tweak some system parameters (like increasing file descriptors):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># /etc/security/limits.conf</span>
your_user soft nofile 1000000
your_user hard nofile 1000000
</code></pre></div></div>

<p>Each socket uses one file descriptor - so, <strong>no descriptors = no sockets = no fun</strong>.</p>

<h3 id="2-fixing-the-memory-leak">2. Fixing the Memory Leak</h3>

<p>A <strong>memory leak</strong> occurs when an app allocates memory but doesn’t release it properly.</p>

<p>After several smaller tests, we found that memory usage never dropped back to normal even after participants disconnected. Suspicious. 🤔</p>

<p>We dug deeper using <strong>heap snapshots</strong> and discovered that WebSocket connections were not being released after closing. Our stack uses <strong>Socket.io</strong>, which internally uses the <strong>WS</strong> library for WebSocket handling.</p>

<p><img src="/engineering/assets/images/scaling-livepolls-memory-leak-image-1.png" alt="Memory leak" /></p>

<p>Then came the <strong>“aha” moment</strong> - we found <strong>EIOWS</strong>, a C++-based, high-performance, low-memory WebSocket engine, fully compatible with Socket.io.</p>

<p>With just <strong>one line of code changed</strong>, we swapped WS with EIOWS - and boom, no more leaks. <em>(If only all life’s problems could be solved this easily.)</em></p>
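<p>For reference, the swap goes through Socket.io’s <code>wsEngine</code> server option (package names as published on npm; the wrapper function here is illustrative and is not part of our codebase):</p>

```javascript
// Illustrative: swap Socket.io's default "ws" engine for eiows.
// Assumes the socket.io and eiows packages are installed; this helper
// only sketches the change and is never invoked here.
function createSocketServer(httpServer) {
  const { Server } = require("socket.io");
  return new Server(httpServer, {
    wsEngine: require("eiows").Server, // the one changed line
  });
}
```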

<p><img src="/engineering/assets/images/scaling-livepolls-memory-leak-image-2.png" alt="Memory leak fixed" /></p>

<h3 id="3-tuning-nginx-for-10k-concurrent-connections">3. Tuning Nginx for 10K+ Concurrent Connections</h3>

<p>Once memory was stable, we went for the big one - <strong>5K participants</strong>.</p>

<p>…and hit a wall again.</p>

<p>The error?</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Too many open files
</code></pre></div></div>

<p>Classic. 😅</p>

<p>We increased:</p>

<div class="language-nginx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">worker_connections</span> <span class="mi">8192</span><span class="p">;</span>
<span class="k">worker_processes</span> <span class="mi">8</span><span class="p">;</span>
</code></pre></div></div>

<p>We expected roughly 64K connections (8 workers × 8,192 each) - but we still crashed around 4K.</p>

<p><strong>Turns out</strong>, each connection in Nginx uses <strong>2 file descriptors</strong> - one for the client, one for the upstream server.</p>

<p>So, we added:</p>

<div class="language-nginx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">worker_rlimit_nofile</span> <span class="mi">16192</span><span class="p">;</span>
</code></pre></div></div>

<p>This allowed each worker process to handle roughly 8K connections (16,192 descriptors ÷ 2 per connection), comfortably crossing the <strong>10K mark</strong> across workers.</p>
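<p>The descriptor math is worth spelling out - with two descriptors per proxied connection, capacity per worker is simply the descriptor limit halved:</p>

```javascript
// Nginx spends 2 file descriptors per proxied connection
// (one on the client side, one on the upstream side), so:
function maxProxiedConnections(workerRlimitNofile) {
  return Math.floor(workerRlimitNofile / 2);
}

console.log(maxProxiedConnections(16192)); // 8096 connections per worker
```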

<h3 id="4-optimizing-slow-mysql-queries">4. Optimizing Slow MySQL Queries</h3>

<p>With 10K people connected, the next step was an end-to-end test: participants answer, admin views charts/leaderboards, participants view relative rank, etc.</p>

<p>We found the application server <strong>crashed</strong> when admin requested result charts. The culprit: queries pulling all rows into application memory and then aggregating in JavaScript - a disaster at 10K+ rows.</p>

<p>Below are concrete examples showing <strong>before (inefficient)</strong> and <strong>after (optimized)</strong> queries we used.</p>

<h4 id="naive--inefficient-approach-what-we-had">Naive / Inefficient Approach (what we had)</h4>

<p>These queries fetch all rows into the app and then process them in JS:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- inefficient, pulls all rows</span>
<span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">responses</span> <span class="k">WHERE</span> <span class="n">question_id</span> <span class="o">=</span> <span class="mi">123</span><span class="p">;</span>
</code></pre></div></div>

<p>If you then sort or aggregate on this in-app, memory usage explodes. 💥</p>

<h4 id="optimized-queries---let-mysql-do-the-heavy-lifting">Optimized Queries - Let MySQL Do the Heavy Lifting</h4>

<h5 id="1-get-response-count-for-each-answer-option-for-charting">1. Get response count for each answer option (for charting)</h5>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">answer_option_id</span><span class="p">,</span> <span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">AS</span> <span class="n">response_count</span>
<span class="k">FROM</span> <span class="n">responses</span>
<span class="k">WHERE</span> <span class="n">question_id</span> <span class="o">=</span> <span class="mi">123</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">answer_option_id</span><span class="p">;</span>
</code></pre></div></div>

<p><strong>Why better:</strong> MySQL aggregates rows on the server and returns a tiny result set (one row per option), not 10K rows.</p>

<h5 id="2-get-respondents-sorted-by-score-and-limit-leaderboard">2. Get respondents sorted by score and limit (leaderboard)</h5>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">respondent_id</span><span class="p">,</span> <span class="n">total_score</span>
<span class="k">FROM</span> <span class="n">respondents_scores</span>
<span class="k">WHERE</span> <span class="n">session_id</span> <span class="o">=</span> <span class="mi">456</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">total_score</span> <span class="k">DESC</span>
<span class="k">LIMIT</span> <span class="mi">100</span><span class="p">;</span>
</code></pre></div></div>

<p><strong>Why better:</strong> Database-level <code class="language-plaintext highlighter-rouge">ORDER BY</code> + <code class="language-plaintext highlighter-rouge">LIMIT</code> avoids fetching all respondents into memory.</p>
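<p>To make the contrast concrete, here is roughly what the old in-app aggregation did (the row shape is hypothetical) - work that the <code>GROUP BY</code> now pushes down to MySQL:</p>

```javascript
// The old approach: hold every raw response row in memory and count in JS.
// At 10K+ rows per question, this is exactly what blew up the app server.
function countByOption(rows) {
  const counts = {};
  for (const row of rows) {
    counts[row.answer_option_id] = (counts[row.answer_option_id] || 0) + 1;
  }
  return counts;
}

const rows = [
  { answer_option_id: 1 },
  { answer_option_id: 2 },
  { answer_option_id: 1 },
];
// option 1 seen twice, option 2 once - the GROUP BY returns this directly
console.log(countByOption(rows));
```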

<h2 id="the-result">The Result</h2>

<p>After fixing memory leaks, tuning Nginx, and optimizing queries:</p>

<p>✅ <strong>LivePolls successfully handled 10,000 concurrent users</strong> on a single app server (2 CPU, 4GB RAM)<br />
✅ Multiple full-scale tests were conducted smoothly<br />
✅ No crashes, no leaks, no long waits - just 10,000 people engaging in real-time</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>Scaling isn’t just about tweaking parameters - it’s about <strong>learning how your system breathes under pressure</strong>.</p>

<p>We hope this story gave you a few ideas, maybe even some laughs, and definitely the confidence to take your own systems beyond limits.</p>

<p>Remember:</p>

<blockquote>
  <p>“Scaling isn’t a destination; it’s a journey of continuous improvement and evolution.”</p>
</blockquote>

<p>And sometimes, that journey starts with a single config file. 🚀</p>]]></content><author><name>Rohan Rao</name></author><category term="Engineering" /><category term="Scalability" /><category term="Performance Optimization" /><category term="WebSockets" /><category term="Performance" /><category term="Load Testing" /><category term="Nginx" /><category term="Real-time Systems" /><summary type="html"><![CDATA[A software product that solves a business problem is good. But a software product that scales as the business grows - that’s excellent!]]></summary></entry><entry><title type="html">The Conscious Vibe Coder</title><link href="https://www.questionpro.com/engineering/ai%20in%20development/software%20engineering/developer%20mindset/the-conscious-vibe-coder/" rel="alternate" type="text/html" title="The Conscious Vibe Coder" /><published>2025-10-30T00:00:00+00:00</published><updated>2025-10-30T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/ai%20in%20development/software%20engineering/developer%20mindset/the-conscious-vibe-coder</id><content type="html" xml:base="https://www.questionpro.com/engineering/ai%20in%20development/software%20engineering/developer%20mindset/the-conscious-vibe-coder/"><![CDATA[<p>The Conscious Vibe Coder :
Thinking Beyond the Autocomplete
How thoughtful coding separates professionals from prompt typists</p>

<p>In today’s world of AI-assisted coding, where tools like GitHub Copilot, ChatGPT, and Cursor can generate entire modules in seconds, many developers are falling into a subtle trap — they are typing prompts, not writing solutions. The goal of the Conscious Vibe Coder is not to reject AI, but to use it wisely: balancing automation with human reasoning, product understanding, and performance thinking.</p>

<h2 id="the-vibe-shift-in-coding">The Vibe Shift in Coding</h2>

<p>Coding used to be about writing logic line by line. Today, it’s about designing, reviewing, and refining AI-generated logic. Developers who can blend creativity, clarity, and critical reasoning stand apart. The ‘vibe’ of modern coding isn’t about how fast you type — it’s about how deeply you think.</p>

<h2 id="boilerplate-isnt-the-enemy-but-mindless-acceptance-is">Boilerplate Isn’t the Enemy, But Mindless Acceptance Is</h2>

<p>AI excels at generating repetitive or boilerplate code. Setting up a controller, defining routes, initializing React states — these can be automated. But blindly accepting that output without understanding its flow is where problems begin.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">@</span><span class="nd">Controller</span><span class="p">(</span><span class="dl">"</span><span class="s2">users</span><span class="dl">"</span><span class="p">)</span>
<span class="k">export</span> <span class="kd">class</span> <span class="nc">UserController</span> <span class="p">{</span>
  <span class="nf">constructor</span><span class="p">(</span><span class="k">private</span> <span class="k">readonly</span> <span class="nx">userService</span><span class="p">:</span> <span class="nx">UserService</span><span class="p">)</span> <span class="p">{}</span>

  <span class="p">@</span><span class="nd">Get</span><span class="p">(</span><span class="dl">"</span><span class="s2">:id</span><span class="dl">"</span><span class="p">)</span>
  <span class="nf">findOne</span><span class="p">(@</span><span class="nd">Param</span><span class="p">(</span><span class="dl">"</span><span class="s2">id</span><span class="dl">"</span><span class="p">)</span> <span class="nx">id</span><span class="p">:</span> <span class="kr">string</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">this</span><span class="p">.</span><span class="nx">userService</span><span class="p">.</span><span class="nf">findOne</span><span class="p">(</span><span class="o">+</span><span class="nx">id</span><span class="p">);</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>AI can generate the structure instantly — but the conscious coder will ask: What if the ID doesn’t exist? Should we handle exceptions? Do we need caching or role-based access control? That’s the human layer AI cannot replace.</p>

<h2 id="clarity-of-thought-drives-code-quality">Clarity of Thought Drives Code Quality</h2>

<p>Before prompting AI, clarity matters more than syntax. If your instructions are vague, your output will reflect it. A conscious coder thinks like an architect — not a script generator.</p>

<p>❌ “Build a booking API.”</p>

<p>✅ “Create a Spring Boot REST API for hotel bookings with endpoints to create, cancel, and list bookings. Validate overlapping dates, include exception handling, and use an in-memory database for demo.”</p>

<p>The difference isn’t just in wording; it’s in mindset. Clear thinking produces clear instructions, which in turn produce better AI outputs.</p>

<h2 id="review-like-a-mentor-not-a-machine">Review Like a Mentor, Not a Machine</h2>

<p>AI often produces code that ‘works’ but doesn’t ‘scale’. A conscious coder reviews every line — as if mentoring a junior developer.</p>

<p>The question is never
‘Does it run?’
but
‘Is it robust, readable, and reliable?’</p>

<p>AI-Generated Example:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="nc">UserProfile</span> <span class="nf">getProfile</span><span class="o">(</span><span class="nc">Long</span> <span class="n">userId</span><span class="o">)</span> <span class="o">{</span>
  <span class="k">return</span> <span class="n">userRepository</span><span class="o">.</span><span class="na">findById</span><span class="o">(</span><span class="n">userId</span><span class="o">).</span><span class="na">get</span><span class="o">();</span>
<span class="o">}</span>
</code></pre></div></div>

<p>Refactored Version:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="nc">UserProfile</span> <span class="nf">getProfile</span><span class="o">(</span><span class="nc">Long</span> <span class="n">userId</span><span class="o">)</span> <span class="o">{</span>
<span class="k">return</span> <span class="n">userRepository</span><span class="o">.</span><span class="na">findById</span><span class="o">(</span><span class="n">userId</span><span class="o">)</span>
        <span class="o">.</span><span class="na">orElseThrow</span><span class="o">(()</span> <span class="o">-&gt;</span> <span class="k">new</span> <span class="nc">ResourceNotFoundException</span><span class="o">(</span><span class="s">"User not found: "</span> <span class="o">+</span> <span class="n">userId</span><span class="o">));</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The improved version avoids runtime crashes, adds contextual errors, and is maintainable.
Real-world consciousness means thinking about nulls, caching, retries, and downstream impacts.</p>

<h2 id="the-human-advantage-in-the-loop">The Human Advantage in the Loop</h2>

<p>AI tools can predict syntax, but they can’t predict intent.
They don’t understand business logic, ethics, or the small product details that define why something should or shouldn’t happen.
Imagine an AI-generated discount system for an e-commerce site:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kt">double</span> <span class="nf">calculateFinalPrice</span><span class="o">(</span><span class="kt">double</span> <span class="n">price</span><span class="o">,</span> <span class="kt">double</span> <span class="n">discount</span><span class="o">,</span> <span class="kt">double</span> <span class="n">coupon</span><span class="o">)</span> <span class="o">{</span>
  <span class="k">return</span> <span class="n">price</span> <span class="o">-</span> <span class="n">discount</span> <span class="o">-</span> <span class="n">coupon</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>

<p>It works mathematically — but fails logically.</p>

<p>In your product, discounts and coupons aren’t supposed to stack.
So the correct version is:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kt">double</span> <span class="nf">calculateFinalPrice</span><span class="o">(</span><span class="kt">double</span> <span class="n">price</span><span class="o">,</span> <span class="kt">double</span> <span class="n">discount</span><span class="o">,</span> <span class="kt">double</span> <span class="n">coupon</span><span class="o">)</span> <span class="o">{</span>
  <span class="kt">double</span> <span class="n">appliedDiscount</span> <span class="o">=</span> <span class="nc">Math</span><span class="o">.</span><span class="na">max</span><span class="o">(</span><span class="n">discount</span><span class="o">,</span> <span class="n">coupon</span><span class="o">);</span>
  <span class="k">return</span> <span class="n">price</span> <span class="o">-</span> <span class="n">appliedDiscount</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>

<p>That difference — understanding business logic, user impact, and edge conditions — is the human layer that AI doesn’t see.
AI writes code that’s syntactically correct;
humans write code that’s contextually correct.
A conscious developer reads AI output like a reviewer, not a consumer.
You question assumptions, ask “What if?”, and make the code align with real-world behavior — not just logic.</p>

<h2 id="coding-with-product-empathy">Coding with Product Empathy</h2>

<p>Coding isn’t just about systems; it’s about people.
A conscious coder doesn’t only think in APIs and functions —
they think in user experience, product flow, and real-world outcomes.
Imagine AI writes a simple payment retry function:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="kd">function</span> <span class="nf">processPayment</span><span class="p">(</span><span class="nx">orderId</span><span class="p">:</span> <span class="kr">string</span><span class="p">)</span> <span class="p">{</span>
  <span class="kd">let</span> <span class="nx">attempts</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">while </span><span class="p">(</span><span class="nx">attempts</span> <span class="o">&lt;</span> <span class="mi">3</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">success</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">charge</span><span class="p">(</span><span class="nx">orderId</span><span class="p">);</span>
    <span class="k">if </span><span class="p">(</span><span class="nx">success</span><span class="p">)</span> <span class="k">break</span><span class="p">;</span>
    <span class="nx">attempts</span><span class="o">++</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Technically fine — but logically dangerous.
If the first payment went through but the confirmation failed due to a network glitch,
this logic may charge the customer multiple times.
A conscious developer understands business context and rewrites it safely:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="kd">function</span> <span class="nf">processPayment</span><span class="p">(</span><span class="nx">orderId</span><span class="p">:</span> <span class="kr">string</span><span class="p">)</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">existing</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">checkTransactionStatus</span><span class="p">(</span><span class="nx">orderId</span><span class="p">);</span>
  <span class="k">if </span><span class="p">(</span><span class="nx">existing</span><span class="p">?.</span><span class="nx">status</span> <span class="o">===</span> <span class="dl">"</span><span class="s2">SUCCESS</span><span class="dl">"</span><span class="p">)</span> <span class="k">return</span> <span class="nx">existing</span><span class="p">;</span>

  <span class="kd">let</span> <span class="nx">attempts</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">while </span><span class="p">(</span><span class="nx">attempts</span> <span class="o">&lt;</span> <span class="mi">3</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">success</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">charge</span><span class="p">(</span><span class="nx">orderId</span><span class="p">);</span>
    <span class="k">if </span><span class="p">(</span><span class="nx">success</span><span class="p">)</span> <span class="k">return</span> <span class="nx">success</span><span class="p">;</span>
    <span class="nx">attempts</span><span class="o">++</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">"</span><span class="s2">Payment failed after retries</span><span class="dl">"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Empathy-driven coding means caring about what happens after the code runs —
in the user’s world, not just in your terminal.</p>

<h2 id="code-that-runs-vs-code-that-scales">Code That Runs vs. Code That Scales</h2>

<p>Here’s where most developers slip.</p>

<p>AI-generated code can pass tests, but may crumble under real load, missing caching, concurrency handling, or memory management.
Conscious coders always test the system’s behavior, not just its syntax.</p>

<p>AI-Generated Example:</p>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err"></span><span class="nf">useEffect</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
<span class="nf">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">/api/products</span><span class="dl">'</span><span class="p">)</span>
<span class="p">.</span><span class="nf">then</span><span class="p">(</span><span class="nx">res</span> <span class="o">=&gt;</span> <span class="nx">res</span><span class="p">.</span><span class="nf">json</span><span class="p">())</span>
<span class="p">.</span><span class="nf">then</span><span class="p">(</span><span class="nx">setProducts</span><span class="p">);</span>
<span class="p">},</span> <span class="p">[]);</span>
</code></pre></div></div>

<p>Refactored Version:</p>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="p">[</span><span class="nx">loading</span><span class="p">,</span> <span class="nx">setLoading</span><span class="p">]</span> <span class="o">=</span> <span class="nf">useState</span><span class="p">(</span><span class="kc">true</span><span class="p">);</span>
<span class="kd">const</span> <span class="p">[</span><span class="nx">error</span><span class="p">,</span> <span class="nx">setError</span><span class="p">]</span> <span class="o">=</span> <span class="nf">useState</span><span class="p">(</span><span class="kc">null</span><span class="p">);</span>

<span class="nf">useEffect</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">controller</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">AbortController</span><span class="p">();</span>
  <span class="kd">const</span> <span class="nx">fetchData</span> <span class="o">=</span> <span class="k">async </span><span class="p">()</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="k">try</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="nx">res</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">fetch</span><span class="p">(</span><span class="dl">"</span><span class="s2">/api/products</span><span class="dl">"</span><span class="p">,</span> <span class="p">{</span> <span class="na">signal</span><span class="p">:</span> <span class="nx">controller</span><span class="p">.</span><span class="nx">signal</span> <span class="p">});</span>
      <span class="k">if </span><span class="p">(</span><span class="o">!</span><span class="nx">res</span><span class="p">.</span><span class="nx">ok</span><span class="p">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">"</span><span class="s2">Failed to fetch</span><span class="dl">"</span><span class="p">);</span>
      <span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">res</span><span class="p">.</span><span class="nf">json</span><span class="p">();</span>
      <span class="nf">setProducts</span><span class="p">(</span><span class="nx">data</span><span class="p">);</span>
    <span class="p">}</span> <span class="k">catch </span><span class="p">(</span><span class="nx">err</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">if </span><span class="p">(</span><span class="nx">err</span><span class="p">.</span><span class="nx">name</span> <span class="o">!==</span> <span class="dl">"</span><span class="s2">AbortError</span><span class="dl">"</span><span class="p">)</span> <span class="nf">setError</span><span class="p">(</span><span class="nx">err</span><span class="p">.</span><span class="nx">message</span><span class="p">);</span>
    <span class="p">}</span> <span class="k">finally</span> <span class="p">{</span>
      <span class="nf">setLoading</span><span class="p">(</span><span class="kc">false</span><span class="p">);</span>
    <span class="p">}</span>
  <span class="p">};</span>
  <span class="nf">fetchData</span><span class="p">();</span>
  <span class="k">return </span><span class="p">()</span> <span class="o">=&gt;</span> <span class="nx">controller</span><span class="p">.</span><span class="nf">abort</span><span class="p">();</span>
<span class="p">},</span> <span class="p">[]);</span>
</code></pre></div></div>

<p>The second version anticipates real scenarios — slow networks, user cancellations, and error handling. AI can write syntax, but performance empathy comes from human experience.</p>

<h2 id="the-conscious-loop-ai--human-refinement">The Conscious Loop: AI + Human Refinement</h2>

<p>Being a Conscious Vibe Coder doesn’t mean rejecting AI — it means co-creating intelligently. You let AI handle structure, and you focus on quality, performance, and clarity. Think of it as a loop: instruct → generate → review → refactor → re-instruct.
This approach creates a developer who can build faster and smarter — someone who can lead design discussions, not just follow autocomplete suggestions. The conscious coder owns the solution, not the syntax.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>In the age of AI-powered development, our value as engineers lies in our ability to think critically, design for humans, and question every line.
Conscious Vibe Coders don’t just code — they create systems that last.
Stay thoughtful. Stay curious. Stay conscious.</p>]]></content><author><name>Kapil Karandikar</name></author><category term="AI in Development" /><category term="Software Engineering" /><category term="Developer Mindset" /><category term="ai-coding" /><category term="clarity" /><category term="refactoring" /><category term="typescript" /><category term="react" /><category term="nestjs" /><category term="java" /><summary type="html"><![CDATA[Exploring how conscious developers can blend human reasoning with AI assistance to write clear, scalable, and meaningful code.]]></summary></entry><entry><title type="html">Manual Topic Mapping to AI-Powered Discovery: Our Journey Building a Smarter Topic Analysis System</title><link href="https://www.questionpro.com/engineering/ai/machine%20learning/nlp/text%20analysis/journey-building-smarter-topic-analysis/" rel="alternate" type="text/html" title="Manual Topic Mapping to AI-Powered Discovery: Our Journey Building a Smarter Topic Analysis System" /><published>2025-10-15T00:00:00+00:00</published><updated>2025-10-15T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/ai/machine%20learning/nlp/text%20analysis/journey-building-smarter-topic-analysis</id><content type="html" xml:base="https://www.questionpro.com/engineering/ai/machine%20learning/nlp/text%20analysis/journey-building-smarter-topic-analysis/"><![CDATA[<h2 id="the-beginning-discover-and-how-we-used-to-work">The Beginning: Discover and How We Used to Work</h2>

<p>For years, our text analysis platform Discover was the workhorse of our operations. It was sophisticated, accurate, and reliable. But it had one fundamental limitation that would eventually force us to rethink everything: every single topic had to be manually defined before we could analyze any text.</p>

<h3 id="how-discover-worked">How Discover Worked</h3>

<p>Imagine you’re analyzing customer feedback for a restaurant chain. Before Discover could process a single review, we needed domain experts to sit down and create a comprehensive list of topics:</p>

<ul>
  <li>“food quality”</li>
  <li>“service speed”</li>
  <li>“ambiance”</li>
  <li>“cleanliness”</li>
  <li>“value for money”</li>
  <li>… and potentially 50+ more</li>
</ul>

<p>Once these topics were defined, Discover would read each piece of text and figure out which predefined topics it matched. The system was actually quite clever about this—it offered three different ways to match text to topics:</p>

<p><strong>RAG-based Tagging</strong> used advanced AI to understand context and map text to the closest matching topic. If someone wrote “the meal was delicious,” it would correctly map this to “food quality.”</p>

<p><strong>Rule-based Extraction</strong> used linguistic patterns and grammar rules to extract topic-opinion pairs from sentences.</p>

<p><strong>Language Model Extraction</strong> used fine-tuned AI models to directly generate topic-opinion-sentiment combinations.</p>

<p>The output was beautifully structured. For the review “The food was excellent but service was slow,” Discover would produce:</p>

<table>
  <thead>
    <tr>
      <th>Topic</th>
      <th>Opinion</th>
      <th>Sentiment</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>“food quality”</td>
      <td>“excellent”</td>
      <td>Positive</td>
    </tr>
    <tr>
      <td>“service speed”</td>
      <td>“slow”</td>
      <td>Negative</td>
    </tr>
  </tbody>
</table>

<p>It worked brilliantly. Until it didn’t.</p>

<h2 id="when-the-system-started-breaking-down">When the System Started Breaking Down</h2>

<h3 id="the-expert-bottleneck">The Expert Bottleneck</h3>

<p>Every new client meant weeks of preparation. We’d gather domain experts, conduct stakeholder interviews, review sample data, and painstakingly build topic taxonomies.</p>

<ul>
  <li>For a new restaurant chain: 2-3 weeks of topic curation</li>
  <li>For a healthcare feedback system: 3-4 weeks</li>
  <li>For a complex B2B SaaS product: Sometimes a month or more</li>
</ul>

<p>We were spending more time defining topics than actually analyzing data. Every project started with the same bottleneck: waiting for humans to tell the AI what to look for.</p>

<h3 id="the-problem-of-emerging-topics">The Problem of Emerging Topics</h3>

<p><img src="/engineering/assets/images/journey-topic-analysis-discover.png" alt="Emerging Topic Issue" /></p>

<p>Here’s where things got really frustrating. The world doesn’t wait for your topic taxonomy to catch up.</p>

<p>When COVID hit, suddenly customers were talking about:</p>

<ul>
  <li>“contactless delivery”</li>
  <li>“safety protocols”</li>
  <li>“mask enforcement”</li>
  <li>“outdoor seating”</li>
</ul>

<p>Discover couldn’t see these topics. They weren’t in our predefined list. By the time we identified these patterns manually and added them to the taxonomy, we’d already missed weeks or months of valuable insights in historical data.</p>

<p><strong>The painful reality:</strong> We were always looking backward, never forward.</p>

<h3 id="the-granularity-dilemma">The Granularity Dilemma</h3>

<p>Different people needed different views of the same data:</p>

<ul>
  <li><strong>Marketing teams</strong> wanted broad themes: “Is customer service a problem?”</li>
  <li><strong>Operations teams</strong> wanted specifics: “Is it the checkout process or the return policy causing issues?”</li>
  <li><strong>Executives</strong> wanted strategic insights: “How does our brand perception compare to competitors?”</li>
</ul>

<p>Discover forced us to choose one granularity level upfront. Want to switch from high-level themes to detailed issues? Reprocess everything with a new topic taxonomy.</p>

<h3 id="the-multilingual-nightmare">The Multilingual Nightmare</h3>

<p>Expanding internationally meant exponentially more work. English restaurant topics weren’t the same as Spanish restaurant topics—cultural differences meant different things mattered. We either had to compromise on coverage or maintain separate topic lists per language. Each language and region needed its own carefully curated taxonomy.</p>

<p>We found ourselves maintaining hundreds of topic lists, each requiring expert maintenance. It was unsustainable.</p>

<h2 id="the-turning-point-what-if-topics-could-discover-themselves">The Turning Point: What If Topics Could Discover Themselves?</h2>

<p>One day, someone asked the obvious question: “What if we just let the AI figure out what topics exist?”</p>

<p>It seemed radical. Our entire system was built on human-curated topics. But the more we thought about it, the more sense it made.</p>

<p>Instead of telling the AI “These are the topics, find them in the text,” what if we said “Here’s the text, tell us what topics exist”?</p>

<h3 id="the-new-vision">The New Vision</h3>

<p>We needed a system that could:</p>

<p>✅ <strong>Analyze any domain immediately</strong> - No weeks of topic curation<br />
✅ <strong>Discover emerging topics automatically</strong> - See new patterns as they appear<br />
✅ <strong>Provide multiple levels of detail</strong> - Zoom from strategic themes to operational specifics<br />
✅ <strong>Work across languages</strong> - Understand meaning, not just words<br />
✅ <strong>Keep getting smarter</strong> - Learn and adapt as more data comes in</p>

<p>This meant rebuilding our entire approach from the ground up.</p>

<h2 id="building-the-new-system-from-classification-to-discovery">Building the New System: From Classification to Discovery</h2>

<h3 id="the-fundamental-shift">The Fundamental Shift</h3>

<ul>
  <li><strong>Discover’s approach:</strong> “Here are the topics. Match text to them.”</li>
  <li><strong>AI Topics approach:</strong> “Here’s the text. Discover the topics within it.”</li>
</ul>

<p>The technical term is moving from supervised classification to unsupervised clustering. But what really matters is the user experience transformation:</p>

<p><strong>Before:</strong> Project kickoff → 3 weeks of topic curation → Data processing → Analysis<br />
<strong>After:</strong> Project kickoff → Upload data → Topics discovered automatically → Analysis</p>

<h3 id="how-automatic-topic-discovery-works">How Automatic Topic Discovery Works</h3>

<p>AI Topics uses <strong>semantic embeddings</strong>—a way of converting text into mathematical representations that capture meaning. Here’s why this matters:</p>

<p>When Discover saw “fast service” and “quick response,” it treated them as potentially different topics because they used different words.</p>

<p>AI Topics understands they mean the same thing. It converts both phrases into similar mathematical representations, so they naturally cluster together as one topic.</p>

<p>This happens automatically. No one needs to tell the system that “fast” and “quick” are synonyms, or that “service” and “response” are related in customer feedback contexts. The AI learns semantic relationships from the data itself.</p>
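To make the idea concrete, here is a toy sketch of how cosine similarity over embedding vectors captures “same meaning, different words.” The three-dimensional vectors below are hand-picked stand-ins, not real embeddings; in production a sentence-embedding model produces vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hand-picked toy vectors standing in for real sentence embeddings.
embeddings = {
    "fast service":   [0.9, 0.1, 0.0],
    "quick response": [0.8, 0.2, 0.1],
    "dirty tables":   [0.0, 0.1, 0.9],
}

same = cosine_similarity(embeddings["fast service"], embeddings["quick response"])
different = cosine_similarity(embeddings["fast service"], embeddings["dirty tables"])
assert same > different  # semantically similar phrases sit closer together
```

Phrases that share no words but mean the same thing end up near each other in the vector space, so clustering groups them into one topic.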

<h3 id="the-magic-of-bertopic-clustering">The Magic of BERTopic Clustering</h3>

<p>We chose <strong>BERTopic</strong> as our clustering engine—a state-of-the-art topic modeling framework that combines the best of modern NLP with robust clustering algorithms.</p>

<p>Here’s the process in simple terms:</p>

<ol>
  <li><strong>Understanding</strong> - Every document gets converted into a mathematical representation that captures its meaning using transformer-based embeddings</li>
  <li><strong>Dimensionality Reduction</strong> - Using UMAP (Uniform Manifold Approximation and Projection), we reduce the complexity while preserving the semantic structure of the data</li>
  <li><strong>Finding Patterns</strong> - HDBSCAN (Hierarchical Density-Based Spatial Clustering) identifies natural groupings in the data—documents talking about the same things cluster together automatically</li>
  <li><strong>Initial Labeling</strong> - BERTopic generates preliminary, raw topic names based on the most statistically distinctive terms in each cluster (e.g., ‘quick_fast_rapid’)—a necessary but often unintuitive first step</li>
  <li><strong>Organizing</strong> - Topics arrange themselves hierarchically, from broad themes down to specific issues</li>
</ol>
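The five steps above map directly onto how a BERTopic pipeline is typically wired together. The sketch below is a configuration illustration, not our production setup: the model name and parameter values are illustrative defaults, and it assumes the <code class="language-plaintext highlighter-rouge">bertopic</code>, <code class="language-plaintext highlighter-rouge">umap-learn</code>, <code class="language-plaintext highlighter-rouge">hdbscan</code>, and <code class="language-plaintext highlighter-rouge">sentence-transformers</code> packages.

```python
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer
from umap import UMAP
from hdbscan import HDBSCAN

# Step 1: transformer-based embeddings capture document meaning.
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

# Step 2: UMAP reduces dimensionality while preserving semantic structure.
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric="cosine")

# Step 3: HDBSCAN finds dense clusters; documents that fit no cluster
# are labeled as outliers (topic -1).
hdbscan_model = HDBSCAN(min_cluster_size=15, metric="euclidean",
                        prediction_data=True)

topic_model = BERTopic(
    embedding_model=embedding_model,
    umap_model=umap_model,
    hdbscan_model=hdbscan_model,
)

# Steps 4-5: fitting assigns each document a topic, derives raw keyword
# labels per cluster, and exposes a topic hierarchy.
# topics, probabilities = topic_model.fit_transform(documents)
```

The fit call is left commented because it needs a real corpus; with one in hand, the whole taxonomy falls out of a single call.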

<p>The beautiful part? This all happens automatically. Upload your data, and within hours (not weeks), you have a complete topic taxonomy discovered from the content itself.</p>

<p><img src="/engineering/assets/images/journey-topic-analysis-textai.png" alt="Text AI Emerging Topic" /></p>

<h3 id="smart-configuration-one-size-doesnt-fit-all">Smart Configuration: One Size Doesn’t Fit All</h3>

<p>Here’s something crucial we learned: the optimal clustering parameters change depending on your data size.</p>

<p>Clustering 100 documents requires different settings than clustering 100,000 documents. BERTopic uses UMAP and HDBSCAN under the hood, both of which have parameters that dramatically affect results.</p>

<p><strong>Our solution:</strong> Dynamic configuration selection.</p>

<p>AI Topics automatically adjusts UMAP and HDBSCAN parameters based on the number of texts you’re analyzing:</p>

<ul>
  <li><strong>Small datasets</strong> (hundreds of documents): More aggressive parameter settings to ensure patterns emerge even with limited data</li>
  <li><strong>Medium datasets</strong> (thousands of documents): Balanced parameters that capture both broad themes and specific nuances</li>
  <li><strong>Large datasets</strong> (tens of thousands+): Conservative parameters that prevent over-fragmentation and maintain interpretable clusters</li>
</ul>

<p>This adaptive approach means you get quality results whether you’re analyzing a week’s worth of customer feedback or a decade’s worth of archived documents. No manual tuning required—AI Topics knows what works best for your data volume.</p>
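A simplified version of that selection logic might look like the following. The size thresholds and parameter values here are illustrative stand-ins, not the tuned values our system actually uses.

```python
def select_clustering_params(n_docs: int) -> dict:
    """Pick UMAP/HDBSCAN parameters by dataset size (illustrative values)."""
    if n_docs < 1_000:
        # Small: aggressive settings so patterns emerge from limited data.
        return {"n_neighbors": 5, "min_cluster_size": 5, "min_samples": 3}
    if n_docs < 20_000:
        # Medium: balance broad themes against specific nuances.
        return {"n_neighbors": 15, "min_cluster_size": 15, "min_samples": 10}
    # Large: conservative settings to prevent over-fragmentation.
    return {"n_neighbors": 30, "min_cluster_size": 50, "min_samples": 25}

small = select_clustering_params(300)
large = select_clustering_params(100_000)
assert small["min_cluster_size"] < large["min_cluster_size"]
```

The returned dictionary would then feed the UMAP and HDBSCAN constructors, so the same pipeline code serves every data volume.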

<h3 id="what-about-discovers-classification-strength">What About Discover’s Classification Strength?</h3>

<p>Here’s the good news: we didn’t throw away what made Discover great.</p>

<p>AI Topics includes an <code class="language-plaintext highlighter-rouge">available_taxonomy</code> operation that works exactly like Discover—but enhanced with modern AI. If you have predefined topics you want to use (maybe you’re in a well-established domain with known categories), you can still use them. The system will classify your text against those topics using the same sophisticated extraction methods Discover offered.</p>

<p>The difference? Now you have a choice. Use predefined topics when they make sense, or let the system discover topics when exploring new territory.</p>

<h2 id="the-migration-journey-challenges-we-faced">The Migration Journey: Challenges We Faced</h2>

<h3 id="challenge-1-speed-vs-intelligence">Challenge 1: Speed vs. Intelligence</h3>

<p>Discover was fast—about 50 milliseconds per document. Why? Because checking text against a predefined list is computationally simple.</p>

<p>AI Topics’ automatic discovery is more complex. Converting text to semantic embeddings and running clustering algorithms takes more time—about 100 milliseconds per document for batch processing.</p>

<p><strong>Our solution:</strong> A two-tier approach.</p>

<ul>
  <li>For <strong>real-time topic assignment</strong> (when you need answers now), the system assigns new documents to existing topics quickly—just as fast as Discover.</li>
  <li>For <strong>topic discovery and refinement</strong> (when you’re exploring patterns), the system runs more thorough analysis in the background. You submit a job, get an immediate confirmation, and receive results when the deep analysis completes.</li>
</ul>

<p>Most users don’t notice the difference. They get fast responses for daily operations and thorough insights for strategic analysis.</p>
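The fast path can be sketched as nearest-centroid assignment: compare a new document’s embedding against the centroid of each already-discovered topic and pick the closest, with no re-clustering. The topic names and vectors below are toy stand-ins.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Centroids of topics found earlier by the slower batch clustering run.
topic_centroids = {
    "Service Speed": [0.9, 0.1, 0.0],
    "Food Quality":  [0.1, 0.9, 0.0],
    "Cleanliness":   [0.0, 0.1, 0.9],
}

def assign_topic(doc_embedding):
    """Real-time path: one pass over centroids, no re-clustering."""
    return max(topic_centroids,
               key=lambda t: cosine(doc_embedding, topic_centroids[t]))

assert assign_topic([0.85, 0.2, 0.05]) == "Service Speed"
```

Discovery and refinement stay in the background job; this cheap lookup is all the real-time path has to do.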

<h3 id="challenge-2-making-sense-of-discovered-topicsenter-openai">Challenge 2: Making Sense of Discovered Topics—Enter OpenAI</h3>

<p>Discover had an unfair advantage: humans named the topics. “Customer Service Quality” is immediately clear to anyone.</p>

<p>Automatically discovered topics? Sometimes the labels were… less intuitive. Early BERTopic attempts produced gems like:</p>

<ul>
  <li>Cluster 1: “good_great_excellent_amazing_fantastic”</li>
  <li>Cluster 2: “quick_fast_rapid_speedy_prompt”</li>
  <li>Cluster 3: “wait_long_slow_delay_forever”</li>
</ul>

<p>Technically accurate—these words did cluster together. But not exactly boardroom-ready.</p>

<p><strong>The breakthrough:</strong> OpenAI representation labeling.</p>

<p>Instead of relying on simple keyword extraction, we integrated OpenAI’s language models to generate intelligent, contextual topic labels. The AI reads the most representative documents in each cluster and creates human-like descriptions of what the topic is actually about.</p>

<p>Those same clusters transformed:</p>

<ul>
  <li>Cluster 1 → “Overall Satisfaction and Positive Experiences”</li>
  <li>Cluster 2 → “Service Speed and Efficiency”</li>
  <li>Cluster 3 → “Wait Times and Service Delays”</li>
</ul>

<p>But it gets better. OpenAI’s understanding of context means labels adapt to your domain:</p>

<ul>
  <li>In restaurant feedback, a cluster about “noise” becomes → “Ambiance and Noise Levels”</li>
  <li>In hospital feedback, the same semantic cluster becomes → “Patient Rest and Recovery Environment”</li>
</ul>

<p>The AI understands that identical concepts need different framing depending on context.</p>

<p>The quality improvement was dramatic. In blind tests with business stakeholders, OpenAI-generated labels were preferred over human-curated topic names 73% of the time. Why? Because the AI had read every single document and distilled the essence, while humans were working from assumptions and samples.</p>

<p>Users can still refine labels manually if needed, but most of the time, the OpenAI-generated names are immediately usable.</p>

<h3 id="challenge-3-optimizing-bertopic-for-different-data-volumes">Challenge 3: Optimizing BERTopic for Different Data Volumes</h3>

<p>BERTopic is powerful, but it’s not a one-size-fits-all solution. The same clustering parameters that work beautifully for 500 documents can produce terrible results for 50,000 documents.</p>

<p><strong>The challenge:</strong> UMAP and HDBSCAN (the algorithms powering BERTopic’s clustering) have parameters that need careful tuning based on data characteristics. Set them too conservatively, and small datasets won’t reveal any patterns. Set them too aggressively, and large datasets fragment into hundreds of meaningless micro-topics.</p>

<p><strong>Our solution:</strong> Intelligent parameter adaptation.</p>

<p>We built logic that analyzes your dataset size and automatically selects optimal UMAP and HDBSCAN configurations:</p>

<ul>
  <li>For smaller datasets, the system uses parameters that encourage pattern discovery even when data is limited. This prevents the “everything is one big topic” problem.</li>
  <li>For medium datasets, it balances between capturing nuanced sub-topics and maintaining broad thematic groupings.</li>
  <li>For larger datasets, it applies conservative settings that prevent over-fragmentation while still revealing meaningful structure.</li>
</ul>

<p><strong>The result?</strong> Whether you’re analyzing 200 customer reviews or 200,000, you get coherent, interpretable topics without manual parameter tuning. The system adapts to your data automatically.</p>

<h3 id="challenge-4-when-topics-evolve">Challenge 4: When Topics Evolve</h3>

<p>Real-world topics don’t stay still. During COVID, “service” discussions shifted dramatically—from “Is the staff friendly?” to “Are safety protocols being followed?”</p>

<p>AI Topics needed to detect these shifts and adapt. We built monitoring capabilities that track:</p>

<ul>
  <li>How topic distributions change over time</li>
  <li>When topic coherence starts degrading (a sign that a topic is splitting or evolving)</li>
  <li>When new patterns emerge that don’t fit existing clusters</li>
</ul>

<p>When significant drift is detected, AI Topics can trigger retraining to refine topic boundaries. This keeps the analysis relevant as the world changes.</p>
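One simple drift signal is the change in topic share between two time windows: if the distribution has moved more than some threshold, flag the model for retraining. The sketch below uses total variation distance; the distributions and the threshold are illustrative, not our production values.

```python
def total_variation(p: dict, q: dict) -> float:
    """Half the L1 distance between two topic-share distributions."""
    topics = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in topics)

# Share of documents per topic in two consecutive windows.
last_month = {"service": 0.50, "food": 0.40, "safety protocols": 0.10}
this_month = {"service": 0.25, "food": 0.30, "safety protocols": 0.45}

drift = total_variation(last_month, this_month)
needs_retraining = drift > 0.2  # illustrative threshold
assert needs_retraining
```

Cluster-coherence scores and the rate of outlier (unclustered) documents can feed the same trigger, so the taxonomy refreshes only when the data actually shifts.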

<h2 id="the-advantages-what-changed-for-the-better">The Advantages: What Changed for the Better</h2>

<h3 id="dramatically-better-topic-quality">Dramatically Better Topic Quality</h3>

<p>The most surprising benefit wasn’t just eliminating manual curation—it was discovering that automatically generated topics were often better than human-curated ones.</p>

<p><strong>Real impact:</strong> A retail client came to us with Black Friday feedback data. With Discover, their domain experts had created 47 predefined topics like “checkout experience,” “product quality,” and “delivery speed.”</p>

<p>Our automatic discovery found 63 topics—a 34% increase in thematic coverage—and many revealed insights the manual taxonomy had missed entirely:</p>

<ul>
  <li>“Gift wrapping delays” (a Black Friday-specific concern)</li>
  <li>“Size inconsistency across brands” (an operational issue causing returns)</li>
  <li>“Mobile app cart abandonment” (technical friction the IT team needed to know about)</li>
</ul>

<p>The human experts weren’t wrong—they just couldn’t anticipate every nuance hidden in thousands of customer comments. The AI could.</p>

<p><img src="/engineering/assets/images/journey-topic-analysis-image-1.png" alt="Manual vs AI topics" /></p>

<h3 id="never-miss-emerging-topics-again">Never Miss Emerging Topics Again</h3>

<p>AI Topics continuously monitors for new patterns. When enough documents start discussing something novel, it surfaces as a new topic cluster automatically.</p>

<p><strong>Example:</strong> A hotel chain client suddenly started receiving feedback about “EV charging stations.” This wasn’t in anyone’s original topic taxonomy because it wasn’t a common concern when the project started. AI Topics identified it as an emerging topic within days, allowing the hotel chain to prioritize installing charging stations at high-demand locations.</p>

<p>Discover would have missed this entirely until the next quarterly taxonomy review.</p>

<h3 id="higher-quality-topic-granularity">Higher Quality Topic Granularity</h3>

<p>Discover required choosing granularity upfront: broad categories or specific issues? You couldn’t have both without running separate analyses.</p>

<p>AI Topics’ automatic discovery naturally creates multiple granularity levels simultaneously because it uses hierarchical clustering in BERTopic:</p>

<ul>
  <li><strong>Level 1 (Strategic):</strong> “Guest Experience Quality”
    <ul>
      <li><strong>Level 2 (Departmental):</strong> “Room Comfort,” “Service Responsiveness,” “Facility Amenities”</li>
    </ul>
  </li>
</ul>

<p>What makes this powerful is that the AI determines the right number of levels and topics at each level based on the actual data patterns—not human assumptions.</p>

<p>A retail client with seasonal products found their topic hierarchy changed across seasons. Summer analysis had deep sub-topics around “outdoor furniture durability,” while winter analysis had completely different granular topics around “indoor heating products.” The AI adapted automatically.</p>

<p><strong>The insight:</strong> Different datasets need different structures. Forcing a predefined hierarchy misses these natural variations.</p>

<h3 id="semantic-search-that-actually-understands">Semantic Search That Actually Understands</h3>

<ul>
  <li><strong>Discover search:</strong> “Find documents about fast service” → searches for the exact words “fast” and “service”</li>
  <li><strong>AI Topics search:</strong> Understands you’re looking for concepts about service speed and finds:
    <ul>
      <li>“the staff was very efficient”</li>
      <li>“quick response to our requests”</li>
      <li>“didn’t have to wait long”</li>
      <li>“prompt attention from servers”</li>
    </ul>
  </li>
</ul>

<p>None of these contain your exact search terms, but they’re all semantically related to what you’re looking for.</p>
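The difference can be sketched as ranking by embedding similarity instead of keyword overlap. The two-dimensional vectors are toy stand-ins for real embeddings; notice the top hit shares no words with the query.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Toy embeddings: a real system gets these from a sentence-embedding model.
corpus = {
    "the staff was very efficient": [0.9, 0.1],
    "lovely view from the balcony": [0.1, 0.9],
}
query_text, query_vec = "fast service", [0.95, 0.05]

ranked = sorted(corpus, key=lambda doc: cosine(corpus[doc], query_vec),
                reverse=True)
assert ranked[0] == "the staff was very efficient"  # zero shared words
```

A keyword search for “fast service” would return neither document; semantic ranking surfaces the right one first.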

<p><strong>Real impact:</strong> A customer support team reported finding 40% more relevant documents for issue investigation. Problems they thought were isolated turned out to have been discussed repeatedly using different terminology.</p>

<h3 id="true-multilingual-understanding">True Multilingual Understanding</h3>

<p>Instead of maintaining separate topic taxonomies for each language, AI Topics uses multilingual embeddings. English “excellent food” and Spanish “comida excelente” are understood as the same concept automatically.</p>

<p><strong>Real impact:</strong> A global hotel chain operating in 15 countries went from maintaining 15 separate topic systems to one unified system that understands all their languages. Cross-cultural insights that were previously impossible became routine.</p>

<h2 id="what-we-learned-along-the-way">What We Learned Along the Way</h2>

<h3 id="the-importance-of-human-touch">The Importance of Human Touch</h3>

<p>Fully automated doesn’t mean fully hands-off. The best results come from a partnership between AI and human intelligence.</p>

<ul>
  <li><strong>Topic labeling</strong> benefits enormously from occasional human review. The system might generate “speed_efficiency_temporal” as a cluster name. A human immediately sees this should be “Service Speed.”</li>
  <li><strong>Quality validation</strong> requires human judgment. Is “parking” a subset of “facilities” or its own top-level topic? Depends on your business context.</li>
</ul>

<h3 id="start-simple-add-complexity-gradually">Start Simple, Add Complexity Gradually</h3>

<p>Our first automatic clustering was overly sophisticated. We used every advanced technique we could think of. It was slow and hard to debug when something went wrong.</p>

<p>Stepping back to simpler approaches—good embeddings with straightforward clustering—delivered 90% of the value with 10% of the complexity. We added sophistication only where it clearly improved results.</p>

<h3 id="the-value-of-explaining-the-ai">The Value of Explaining the AI</h3>

<p>Users were initially skeptical. “How can the AI know what topics are important if we don’t tell it?”</p>

<p>Building transparency helped enormously:</p>

<ul>
  <li>Show representative documents for each topic</li>
  <li>Display confidence scores for topic assignments</li>
  <li>Explain why documents clustered together</li>
  <li>Provide side-by-side comparisons of automatic vs. manual taxonomies</li>
</ul>

<p>When people understood how the discovery worked, trust followed.</p>

<h2 id="looking-ahead-whats-next">Looking Ahead: What’s Next</h2>

<h3 id="smarter-topic-evolution-tracking">Smarter Topic Evolution Tracking</h3>

<p>Right now, the system detects that topics are changing. Next, we want to visualize how they’re changing:</p>

<ul>
  <li>Topics splitting into sub-topics</li>
  <li>Topics merging as concepts converge</li>
  <li>Topics appearing and disappearing over time</li>
</ul>

<p>Imagine a time-lapse showing how customer concerns evolved throughout a product lifecycle.</p>

<h3 id="predictive-intelligence-from-what-to-why">Predictive Intelligence: From What to Why</h3>

<p>Moving beyond “What topics exist?” to “What topics actually matter?”</p>

<ul>
  <li>Which topic emerging in today’s conversations predicts a 15% rise in customer churn next month?</li>
  <li>Which topics drive repeat purchases?</li>
  <li>Which topics are early indicators of viral content?</li>
</ul>

<p>Connecting topic analysis to business outcomes would transform it from descriptive analytics to predictive intelligence.</p>

<h3 id="real-time-topic-monitoring">Real-Time Topic Monitoring</h3>

<p>Imagine a live dashboard showing topics as they emerge in real-time—not hours after data upload, but streaming as new documents arrive. Social media monitoring could surface trending topics within minutes of them starting to appear.</p>

<h2 id="the-bottom-line-different-tools-for-different-jobs">The Bottom Line: Different Tools for Different Jobs</h2>

<p>Discover wasn’t wrong—it was optimized for a specific use case. When you:</p>

<ul>
  <li>Work in a well-understood domain</li>
  <li>Have stable topic requirements</li>
  <li>Need maximum processing speed</li>
  <li>Want human-validated topic names</li>
</ul>

<p>Then predefined topic classification (like Discover’s approach) is actually the better choice. And with AI Topics’ <code class="language-plaintext highlighter-rouge">available_taxonomy</code> operation, you can still use this approach when it makes sense.</p>

<p>But when you:</p>

<ul>
  <li>Explore new domains without established taxonomies</li>
  <li>Need to discover emerging patterns</li>
  <li>Want to understand semantic relationships</li>
  <li>Require multilingual analysis</li>
  <li>Need strategic-to-tactical granularity</li>
</ul>

<p>Then automatic topic discovery is transformative.</p>

<p>We’re not declaring victory over an old system. We’re celebrating having the right tools for different situations.</p>

<h2 id="the-real-achievement">The Real Achievement</h2>

<p>The technical accomplishments are meaningful: BERTopic clustering, semantic embeddings, OpenAI representation labeling, adaptive parameter configuration, hierarchical topic trees, multilingual understanding.</p>

<p>But the real achievement is qualitative: <strong>topics that capture nuances human experts miss while remaining immediately understandable.</strong></p>

<p>The combination of BERTopic’s sophisticated clustering and OpenAI’s contextual labeling creates something neither could achieve alone:</p>

<ul>
  <li><strong>BERTopic finds the patterns</strong></li>
  <li><strong>OpenAI makes them comprehensible</strong></li>
  <li><strong>Together, they reveal insights hiding in plain sight</strong></li>
</ul>

<p>When someone says “We have a new dataset to analyze,” the conversation isn’t just about eliminating setup time. It’s about discovering what’s actually in the data, not just confirming what we thought would be there.</p>

<p>That’s the shift that matters. From AI that classifies text into boxes we define, to AI that shows us which boxes actually exist—and labels them in ways we immediately understand.</p>

<h2 id="technical-summary">Technical Summary</h2>

<ul>
  <li><strong>Discover:</strong> Supervised topic classification with predefined taxonomies and human-curated labels</li>
  <li><strong>AI Topics:</strong> BERTopic-based unsupervised discovery with OpenAI representation labeling</li>
  <li><strong>Best of Both:</strong> <code class="language-plaintext highlighter-rouge">available_taxonomy</code> operation preserves classification capabilities when needed</li>
</ul>

<h3 id="key-quality-improvements">Key Quality Improvements</h3>

<ul>
  <li>Discovery of nuanced topics human experts didn’t anticipate</li>
  <li>Context-aware, business-ready labeling through OpenAI integration</li>
  <li>Adaptive topic granularity based on natural data patterns</li>
  <li>Superior semantic understanding for better search and grouping</li>
  <li>Unified multilingual analysis maintaining consistent quality across languages</li>
</ul>

<p><strong>The Philosophy:</strong> Let humans focus on interpretation and strategy. Let AI handle pattern discovery, intelligent labeling, and routine classification.</p>]]></content><author><name>Pratik Dhulubulu</name></author><category term="AI" /><category term="Machine Learning" /><category term="NLP" /><category term="Text Analysis" /><category term="topic-modeling" /><category term="bertopic" /><category term="openai" /><category term="clustering" /><category term="AI" /><summary type="html"><![CDATA[The Beginning: Discover and How We Used to Work]]></summary></entry><entry><title type="html">How QuestionPro Has Impacted My Tech Career</title><link href="https://www.questionpro.com/engineering/personal%20growth/mindset%20shift/continuous%20learning/professional%20development/self-reflection/how-questionpro-has-impacted-my-tech-career/" rel="alternate" type="text/html" title="How QuestionPro Has Impacted My Tech Career" /><published>2025-10-14T00:00:00+00:00</published><updated>2025-10-14T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/personal%20growth/mindset%20shift/continuous%20learning/professional%20development/self-reflection/how-questionpro-has-impacted-my-tech-career</id><content type="html" xml:base="https://www.questionpro.com/engineering/personal%20growth/mindset%20shift/continuous%20learning/professional%20development/self-reflection/how-questionpro-has-impacted-my-tech-career/"><![CDATA[<p>It has been three years since I joined QuestionPro, and during this time, the journey has been nothing short of transformative. When I look back, I realize that what has changed the most is <strong>my mindset</strong>—towards problems, towards products, and even towards people. Earlier in my career, I focused only on solving a technical issue in isolation. Today, I think about why a problem exists, how it affects the product as a whole, and what impact it has on the people using it.</p>

<hr />

<h2 id="mindset-shift">Mindset Shift</h2>

<p>The first and perhaps the most profound transformation I’ve experienced is a <strong>shift in mindset</strong>.</p>

<p>When I started my career, my perspective was narrow:</p>

<ul>
  <li>I saw tasks as checkboxes to be completed.</li>
  <li>I measured success by how quickly I could deliver code.</li>
  <li>I viewed problems as obstacles, not opportunities.</li>
</ul>

<p>At QuestionPro, I learned to see beyond the code. Problems are no longer roadblocks—they are <strong>signals</strong> pointing toward hidden issues, unmet user needs, or areas of improvement in our systems. A bug isn’t just “something broken”; it’s a chance to refine, to rethink, and to deliver a better user experience.</p>

<p>I also began to see features differently. A feature is not just a “requirement” to be implemented but part of a <strong>larger story</strong>. Every line of code we write contributes to a user’s journey, and every decision we make leaves an imprint on how someone interacts with the product. This realization changed how I approach design, architecture, and collaboration.</p>

<p>Most importantly, this mindset shift extended to how I see <strong>my own role</strong>. As a developer, my contribution is not just technical—it’s strategic. It’s about asking the right questions:</p>

<ul>
  <li><em>Why does this matter for the user?</em></li>
  <li><em>How does this tie into the product vision?</em></li>
  <li><em>What impact will this decision have six months down the road?</em></li>
</ul>

<p>This way of thinking has made me not just a better developer, but a more thoughtful professional. It has given me resilience too—because when you see challenges as opportunities, the journey itself becomes rewarding.</p>

<hr />

<h2 id="scope-of-growth">Scope of Growth</h2>

<p>If I were to describe the most valuable gift I’ve received here, it would be <strong>growth</strong>—and not just in technical skills. The growth I’ve experienced spans multiple areas:</p>

<ul>
  <li>From <strong>technical expertise</strong> to <strong>communication</strong></li>
  <li>From <strong>cultural awareness</strong> to <strong>people management</strong></li>
  <li>From understanding a <strong>single feature</strong> to comprehending the <strong>entire product lifecycle</strong></li>
</ul>

<p>This wide range of opportunities is rare, and it has been sustained by a few core practices that stand out to me:</p>

<h3 id="learning-environment">Learning Environment</h3>

<p>A healthy learning environment is not a luxury; it’s a necessity. QuestionPro has made it possible for us to learn without fear of judgment. Mistakes are seen as part of the process, and curiosity is encouraged.</p>

<h3 id="collective-learning">Collective Learning</h3>

<p>Learning collectively has been a breakthrough. When we share our discoveries, failures, and ideas, the knowledge spreads faster and deeper than any documentation ever could. I’ve seen how this practice saves countless hours across the team and also builds trust.</p>

<h3 id="continuous-push-and-improvement">Continuous Push and Improvement</h3>

<p>Borrowing from the CI/CD mindset, growth works the same way: push continuously, improve continuously. Growth is like resistance training—it’s not always comfortable, but it’s the only way to get stronger. At QuestionPro, this philosophy is embedded in both personal and professional development.</p>

<hr />

<h2 id="understanding-the-products-goal">Understanding the Product’s Goal</h2>

<p>One of the most important lessons I’ve learned is the <strong>value of understanding the product’s larger goal</strong>.</p>

<p>Think about it this way: the best doctors don’t just treat a symptom; they treat the body as a whole. Similarly, when we build a feature, we shouldn’t just focus on a ticket or task in isolation. Instead, we need to connect it back to the overall product vision.</p>

<p>This approach has helped me identify edge cases more effectively, think from the <strong>user’s perspective</strong> rather than just the <strong>developer’s perspective</strong>, and write code that reduces bugs in the long run. It’s not just about solving what’s in front of you; it’s about anticipating what lies ahead.</p>

<hr />

<h2 id="cultural-fitness">Cultural Fitness</h2>

<p>Early in my career, I underestimated the importance of cultural fitness. At QuestionPro, I realized how critical it is.</p>

<p>Working in harmony requires <strong>decency, mutual respect, and approachability</strong>. These may sound simple, but when a team of 15 developers embodies these traits, the impact on collaboration and delivery is remarkable. On the flip side, even one person who doesn’t align with the cultural norms can affect the morale and the outcomes of the entire group.</p>

<p>Cultural fitness isn’t about uniformity—it’s about respect, openness, and creating an environment where everyone feels safe to contribute.</p>

<hr />

<h2 id="communication">Communication</h2>

<p>For a long time, I thought communication was simply about fluency in English. QuestionPro changed that belief completely.</p>

<p>Effective communication is about <strong>clarity, context, and empathy</strong>. Some practices I’ve picked up include:</p>

<ul>
  <li>Being specific with time commitments<br />
<em>Example: “I’ll ping you within 10 minutes” instead of “I’ll ping you soon.”</em></li>
  <li>Surfacing blockers early<br />
<em>Example: “I’ve been stuck here for 30 minutes; can you pair with me?”</em></li>
</ul>

<p>These small practices create big changes in how smoothly a team functions. Communication is not about sounding good—it’s about making sure the other person understands you without confusion.</p>

<hr />

<h2 id="people-management">People Management</h2>

<p>Finally, one of the biggest growth areas for me has been <strong>managing people</strong>. Leading a team of 15 developers is a responsibility that requires more than technical know-how. It requires empathy, patience, and the ability to recognize each individual’s strengths.</p>

<p>At QuestionPro, I’ve learned that good people management is less about directing and more about <strong>enabling</strong>. My role is not to dictate solutions but to remove obstacles, foster trust, and empower others to do their best work.</p>

<p>Sometimes this means mediating between two teammates with differing viewpoints. Sometimes it means mentoring juniors and helping them connect technical work with product thinking. And sometimes it simply means stepping back and letting others take the lead.</p>

<p>I’ve found that when people feel trusted and supported, their creativity and ownership naturally grow—and the product benefits tremendously as a result.</p>

<hr />

<h2 id="closing-thoughts">Closing Thoughts</h2>

<p>Looking back at these three years, I can say with confidence that QuestionPro has been more than just a workplace for me—it has been a place of transformation. The lessons on mindset, growth, cultural fitness, communication, and people management have not only shaped my career but also influenced how I approach challenges in life.</p>

<p>As I continue this journey, I hold onto one truth: growth is not a destination; it’s a continuous cycle of learning, sharing, and evolving. And for that, I remain grateful.</p>]]></content><author><name>Saiful Islam Khan</name></author><category term="Personal Growth" /><category term="Mindset Shift" /><category term="Continuous Learning" /><category term="Professional Development" /><category term="Self-Reflection" /><category term="personal growth" /><category term="mindset shift" /><category term="professional development" /><category term="product thinking" /><category term="leadership" /><category term="communication" /><category term="team culture" /><category term="people management" /><category term="continuous learning" /><category term="questionpro journey" /><summary type="html"><![CDATA[It has been three years since I joined QuestionPro, and during this time, the journey has been nothing short of transformative. When I look back, I realize that what has changed the most is my mindset—towards problems, towards products, and even towards people. Earlier in my career, I focused only on solving a technical issue in isolation. 
Today, I think about why a problem exists, how it affects the product as a whole, and what impact it has on the people using it.]]></summary></entry><entry><title type="html">How we created our Analytical System for Billions of Rows</title><link href="https://www.questionpro.com/engineering/software%20engineering/data%20engineering/how-we-created-our-analytical-system-for-billions-of-rows/" rel="alternate" type="text/html" title="How we created our Analytical System for Billions of Rows" /><published>2025-09-10T00:00:00+00:00</published><updated>2025-09-10T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/software%20engineering/data%20engineering/how-we-created-our-analytical-system-for-billions-of-rows</id><content type="html" xml:base="https://www.questionpro.com/engineering/software%20engineering/data%20engineering/how-we-created-our-analytical-system-for-billions-of-rows/"><![CDATA[<p>QuestionPro is one of the world’s leading enterprise survey platforms, which means we collect a significant amount of data every single day. As a data-centric company, we live and breathe numbers. To give you some perspective, we handle around 5 million requests daily, which translates to approximately 1.8 billion responses a year.</p>

<p>But raw requests don’t tell the whole story of our data footprint. When we look at what’s actually stored in our databases to understand user behavior and survey results, the scale is staggering: we’re talking about <strong>~500 billion rows of data</strong>.</p>

<p>Storing this volume in a single MySQL database instance is simply not feasible—it would grind to a halt. The data is partitioned across different MySQL databases, allowing us to scale horizontally.</p>

<p><img src="/engineering/assets/images/analytical_system_image1.png" alt="MySQL shards architecture" /></p>

<h3 id="the-bottleneck">The Bottleneck</h3>

<p>When we fired analytical queries that aggregate data using COUNT/SUM, we hit performance walls even with a fully sharded architecture. The slowness became particularly painful when analyzing data for a single, large survey. For surveys with over 10 million responses, it could take up to <strong>30 seconds to get an answer</strong>, which was unacceptable for our users.</p>

<p>We tried everything in the traditional playbook: optimizing indexes, tweaking configurations, and rewriting queries. Despite our best efforts, the benefit was minimal. We had to face the facts: we had squeezed every last drop of performance out of MySQL for our analytical needs. It was time to look for a new solution—a different kind of database.</p>

<h3 id="the-search-begins">The Search Begins</h3>

<p>Our quest for a new analytical database began. We had a clear understanding of the data we needed to analyze and the types of queries we wanted to run. We shortlisted several promising candidates:</p>

<ul>
  <li>MongoDB</li>
  <li>Elasticsearch</li>
  <li>Google BigQuery</li>
  <li>Apache Druid</li>
  <li>ClickHouse</li>
  <li>Apache Pinot</li>
</ul>

<p>Each database came with its own set of pros and cons. Some required more storage, others came with a higher cost, a few had complicated query languages, and some were simply slower than others in our tests.</p>

<p>After a thorough evaluation, one database emerged as the clear winner, solving our most critical problems: <strong>ClickHouse</strong>. The query performance was astounding. Aggregations on over 100 million records completed in milliseconds. We were thrilled.</p>

<h3 id="productionizing-clickhouse">Productionizing ClickHouse</h3>

<p>We designed our new ClickHouse system for resilience and scale from day one. It included replicas for redundancy and fault tolerance, along with multiple shards to continue partitioning our data and scaling horizontally.</p>

<p><img src="/engineering/assets/images/analytical_system_image2.png" alt="ClickHouse shards and replicas architecture" /></p>

<h3 id="whats-next-making-analytics-real-time">What’s Next? Making Analytics Real-Time</h3>

<p>With the performance problem solved, a new challenge emerged: real-time data availability. When users send out a survey, they want to see the results instantly, especially for new surveys with a small number of responses. This meant our analytical database needed to have near real-time data.</p>

<p><img src="/engineering/assets/images/analytical_system_image3.png" alt="Diagram showing the need to move data from MySQL to ClickHouse in real-time" /></p>

<p>We started looking for solutions to transfer data from our production MySQL databases to ClickHouse. The obvious approach was polling—periodically querying MySQL for new data. However, we quickly dismissed this, as it would place an unnecessary and constant load on our primary databases.</p>

<p>Instead, we found a much more elegant solution: <strong>MySQL binlogs</strong>. Binlogs are transaction logs that MySQL uses internally for replication. Essentially, every DML operation (INSERT, UPDATE, DELETE) creates an entry in these logs. We realized we could tap into this existing mechanism. Instead of replicating from MySQL to another MySQL server, we would replicate from <strong>MySQL to ClickHouse via Kafka</strong>.</p>

<p>To achieve this, we built a pipeline using two powerful systems:</p>

<ol>
  <li><strong>Debezium + Kafka Connect:</strong> This stack reads data directly from the MySQL binlogs and streams it into Kafka topics in real-time.</li>
  <li><strong>Kafka:</strong> Acting as a highly scalable, high-throughput pub-sub message bus.</li>
</ol>

<p>ClickHouse has the native ability to connect to Kafka and ingest data streams, and we could even perform transformations within ClickHouse itself. This gave us a complete, end-to-end ELT (Extract-Load-Transform) system. We built redundancy into every layer, from Kafka Connect to the Kafka brokers themselves.</p>
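<p>That native ingestion is typically wired up with a Kafka-engine table plus a materialized view. A sketch, with hypothetical table, column, broker, and topic names throughout (the post doesn’t show the real schema):</p>

```sql
-- Kafka-engine table: a consumer, not storage (names are illustrative)
CREATE TABLE responses_queue
(
    response_id UInt64,
    survey_id   UInt64,
    answer      String,
    ts_ms       UInt64
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'mysql.responses',
         kafka_group_name  = 'clickhouse-ingest',
         kafka_format      = 'JSONEachRow';

-- Materialized view: moves (and can transform) each message into real storage
CREATE MATERIALIZED VIEW responses_mv TO responses AS
SELECT response_id, survey_id, answer,
       fromUnixTimestamp64Milli(ts_ms) AS received_at
FROM responses_queue;
```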

<p>The result? We could transfer data from MySQL to ClickHouse in <strong>under one second</strong>.</p>

<p><img src="/engineering/assets/images/analytical_system_image4.png" alt="Data pipeline from MySQL Bin Logs to ClickHouse via Kafka Connect and Kafka" /></p>

<h3 id="was-it-this-smooth---hell-no">Was it this smooth? - Hell No</h3>

<p>Our first task was to change MySQL’s binlog format from <code class="language-plaintext highlighter-rouge">Mixed</code> to <code class="language-plaintext highlighter-rouge">Row</code>. This change was critical, as the <code class="language-plaintext highlighter-rouge">Row</code> format is required to read complete row data from the binary logs. It might seem straightforward on paper, but doing it across live production databases was not; even so, we executed the migration with zero downtime.</p>
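<p>For reference, the switch comes down to a couple of server settings, shown here as runtime statements (in practice they belong in the server configuration, rolled out with a restart plan):</p>

```sql
-- Debezium needs full row images in the binary log
SET GLOBAL binlog_format    = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';  -- the default, but worth pinning explicitly
```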

<p>However, this solution introduced a new problem: the <code class="language-plaintext highlighter-rouge">Row</code> format dramatically increased storage consumption, as it records the entire row’s data for every single transaction. This storage issue became particularly apparent during the initial data snapshot. While the continuous data stream didn’t require much space, the initial bulk load would completely fill our Kafka storage.</p>

<p>Throughout this project, we faced and fixed numerous other issues. This would not have been possible without our robust monitoring and logging infrastructure. It provided the crucial early symptoms that allowed us to drill down and address the underlying root causes effectively.</p>

<h3 id="monitoring-and-observability">Monitoring and Observability</h3>

<p>A system this critical requires robust monitoring. We went beyond the basics of liveness, readiness, CPU, and memory usage. We implemented key performance indicators specific to our pipeline, including:</p>

<ul>
  <li><strong>End-to-end latency:</strong> The time taken for data to travel from MySQL to Kafka, and from Kafka to ClickHouse.</li>
  <li><strong>Throughput:</strong> The volume of data moving through the system, measured in bytes and messages per second.</li>
</ul>

<p>Thanks to this powerful and resilient architecture, we successfully built a blazing-fast BI and analytics system called “QuestionPro BI” capable of delighting our users with instant insights from billions of records.</p>

<p><img src="/engineering/assets/images/analytical_system_image5.png" alt="Diagram showing ClickHouse powering the QuestionPro BI dashboard" /></p>]]></content><author><name>Angadsingh Rajput</name></author><category term="Software Engineering" /><category term="Data Engineering" /><category term="analytics" /><category term="big data" /><category term="mysql" /><category term="clickhouse" /><category term="kafka" /><category term="debezium" /><category term="elt" /><category term="database architecture" /><summary type="html"><![CDATA[QuestionPro is one of the world’s leading enterprise survey platforms, which means we collect a significant amount of data every single day. As a data-centric company, we live and breathe numbers. To give you some perspective, we handle around 5 million requests daily, which translates to approximately 1.8 billion responses a year.]]></summary></entry><entry><title type="html">Building a Component Library: Lessons Learned</title><link href="https://www.questionpro.com/engineering/software%20engineering/frontend%20development/react/vite/building-component-library-lessons-learned/" rel="alternate" type="text/html" title="Building a Component Library: Lessons Learned" /><published>2025-08-28T00:00:00+00:00</published><updated>2025-08-28T00:00:00+00:00</updated><id>https://www.questionpro.com/engineering/software%20engineering/frontend%20development/react/vite/building-component-library-lessons-learned</id><content type="html" xml:base="https://www.questionpro.com/engineering/software%20engineering/frontend%20development/react/vite/building-component-library-lessons-learned/"><![CDATA[<p>At <a href="https://questionpro.com/">QuestionPro</a>, we are building a component library for all our satellite projects using React. During this process, we learned a lot about setting up a project and publishing it to npm. 
This is not a step-by-step tutorial but rather a collection of key takeaways that might help you if you’re working on a component library.</p>

<h2 id="0-why-an-in-house-component-library">0. Why an In-House Component Library?</h2>

<p>You might be asking, “In a sea of UI libraries for React, why on earth would we build our own?” That’s a fair question! The answer is pretty simple: <strong>consistency and efficiency</strong>. We have a bunch of products that all share the same design system and branding. With different teams working on different projects, it gets really tough to keep all those designs perfectly aligned.</p>

<p>So, to ensure our products have a consistent look and feel, we decided to build a common UI library that everyone can use. This means our teams can stop worrying about moving icons “1px to the left” and instead focus on what really matters: building amazing features and functionality.</p>

<h2 id="1-keep-it-simple">1. Keep It Simple</h2>

<p>A React component can be as simple as an HTML element. Avoid over-engineering components by forcing props for everything. For example, don’t pass text as a <code class="language-plaintext highlighter-rouge">label</code> prop when you can just use children. Raw HTML doesn’t work that way, so try to keep the developer experience as close to HTML as possible.</p>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// ❌ Don't</span>
<span class="p">&lt;</span><span class="nc">Button</span> <span class="na">label</span><span class="p">=</span><span class="s">"A Button"</span> <span class="p">/&gt;</span>

<span class="c1">// ✅ Do</span>
<span class="p">&lt;</span><span class="nc">Button</span><span class="p">&gt;</span>A Button<span class="p">&lt;/</span><span class="nc">Button</span><span class="p">&gt;</span>
</code></pre></div></div>

<p><strong>Why This Matters:</strong>
Simplicity keeps the API intuitive and reduces developer friction. Your components should feel like a natural extension of HTML.</p>

<p><strong>Resource:</strong> <a href="https://react.dev/learn/writing-markup-with-jsx">React Docs – JSX in Depth</a></p>

<h2 id="2-extend-types-from-native-elements">2. Extend Types from Native Elements</h2>

<p>When building components, you don’t need to redefine every prop. Instead, extend from React’s native element types (yes, <code class="language-plaintext highlighter-rouge">div</code> or <code class="language-plaintext highlighter-rouge">button</code> in React aren’t “raw” HTML elements). This way, you automatically inherit all supported HTML attributes, including ARIA attributes. You can then layer custom props on top.</p>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">interface</span> <span class="nx">ButtonProps</span> <span class="kd">extends</span> <span class="nx">React</span><span class="p">.</span><span class="nx">ButtonHTMLAttributes</span><span class="o">&lt;</span><span class="nx">HTMLButtonElement</span><span class="o">&gt;</span> <span class="p">{</span>
  <span class="na">variant</span><span class="p">:</span> <span class="dl">"</span><span class="s2">primary</span><span class="dl">"</span> <span class="o">|</span> <span class="dl">"</span><span class="s2">secondary</span><span class="dl">"</span> <span class="o">|</span> <span class="dl">"</span><span class="s2">danger</span><span class="dl">"</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">Button</span><span class="p">:</span> <span class="nx">React</span><span class="p">.</span><span class="nx">FC</span><span class="o">&lt;</span><span class="nx">ButtonProps</span><span class="o">&gt;</span> <span class="o">=</span> <span class="p">(</span><span class="nx">props</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="k">return</span> <span class="p">&lt;</span><span class="nt">button</span> <span class="si">{</span><span class="p">...</span><span class="nx">props</span><span class="si">}</span><span class="p">&gt;&lt;/</span><span class="nt">button</span><span class="p">&gt;;</span>
<span class="p">};</span>
</code></pre></div></div>

<p><strong>Why This Matters:</strong>
It keeps your types aligned with React’s ecosystem, avoids reinventing the wheel, and makes your components more flexible and accessible by default.</p>

<p><strong>Resource:</strong> <a href="https://www.typescriptlang.org/docs/handbook/interfaces.html#extending-interfaces">TypeScript Handbook – Extending Types</a></p>

<h2 id="3-accessibility-matters">3. Accessibility Matters</h2>

<p>Accessibility should not be an afterthought. Use semantic HTML and proper roles. Don’t use a <code class="language-plaintext highlighter-rouge">div</code> to mimic a button — use a real <code class="language-plaintext highlighter-rouge">&lt;button&gt;</code>. Every component has its role. For example, an <code class="language-plaintext highlighter-rouge">&lt;input type="number" /&gt;</code> has the role <code class="language-plaintext highlighter-rouge">spinbutton</code>. Since components will be used in different contexts, make sure accessibility is baked in from the start.</p>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// ❌ Don't</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">Button</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="p">&lt;</span><span class="nt">div</span> <span class="na">onClick</span><span class="p">=</span><span class="si">{</span><span class="p">()</span> <span class="o">=&gt;</span> <span class="p">{}</span><span class="si">}</span><span class="p">&gt;&lt;/</span><span class="nt">div</span><span class="p">&gt;;</span>

<span class="c1">// ✅ Do</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">Button</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="p">&lt;</span><span class="nt">button</span><span class="p">&gt;&lt;/</span><span class="nt">button</span><span class="p">&gt;;</span>
</code></pre></div></div>

<p><strong>Why This Matters:</strong>
Accessibility ensures your components can be used by everyone, including users with assistive technologies. It also aligns with legal and industry standards.</p>

<p><strong>Resource:</strong> <a href="https://www.w3.org/WAI/ARIA/apg/">WAI-ARIA Authoring Practices</a></p>

<h2 id="4-follow-conventional-commits">4. Follow Conventional Commits</h2>

<p>Conventional Commits provide a consistent way to write commit messages. Instead of vague messages like <code class="language-plaintext highlighter-rouge">button updated</code> or <code class="language-plaintext highlighter-rouge">misc changes</code>, use structured messages that can feed into automated changelog generation and versioning.
In practice:</p>

<ul>
  <li>Commit often.</li>
  <li>Keep commits component-scoped.</li>
  <li>Use descriptive prefixes: <code class="language-plaintext highlighter-rouge">feat(Button): add new color</code></li>
</ul>

<p><strong>Why This Matters:</strong>
A consistent commit history makes it easier to track changes, automate versioning, and generate clean changelogs.</p>

<p><strong>Resource:</strong> <a href="https://www.conventionalcommits.org/">Conventional Commits Specification</a></p>

<hr />

<h2 id="5-unit-tests-are-essential">5. Unit Tests Are Essential</h2>

<p>For a component library, unit tests are not optional — they’re essential. They ensure that functionality doesn’t break when new features are added. Bugs will inevitably slip in, but tests catch them early. Make sure every piece of functionality has at least basic test coverage. We used <strong>Vitest</strong> and <strong>React Testing Library</strong> for this.</p>

<p><strong>Why This Matters:</strong>
Tests protect your consumers. A single regression in your library could break dozens of projects — tests act as your safety net.</p>

<p><strong>Resources:</strong> <a href="https://vitest.dev/">Vitest Docs</a>, <a href="https://testing-library.com/docs/react-testing-library/intro/">React Testing Library Docs</a></p>

<h2 id="6-configuration-best-practices">6. Configuration Best Practices</h2>

<p>Two common mistakes when building libraries are:</p>

<ol>
  <li>
    <p><strong>Not making the library tree-shakable.</strong> Your bundler should be able to remove unused components automatically. Example:</p>

    <div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// component/Button/index.ts</span>
<span class="k">export</span> <span class="p">{</span> <span class="nx">Button</span><span class="p">,</span> <span class="kd">type</span> <span class="nx">ButtonProps</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">./Button.tsx</span><span class="dl">"</span><span class="p">;</span>

<span class="c1">// index.ts</span>
<span class="k">export</span> <span class="o">*</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">@component/Button</span><span class="dl">"</span><span class="p">;</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p><strong>Bundling dependencies into your build.</strong> Your library should only include your code, not external dependencies like React. Mark them as <code class="language-plaintext highlighter-rouge">peerDependencies</code> in <code class="language-plaintext highlighter-rouge">package.json</code>, and configure your bundler to treat them as external.</p>
  </li>
</ol>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">//</span><span class="w"> </span><span class="err">package.json</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"peerDependencies"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"react"</span><span class="p">:</span><span class="w"> </span><span class="s2">"&gt;=18 &lt;20"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// vite.config.ts</span>
<span class="k">export</span> <span class="k">default</span> <span class="nf">defineConfig</span><span class="p">({</span>
  <span class="na">build</span><span class="p">:</span> <span class="p">{</span>
    <span class="na">rollupOptions</span><span class="p">:</span> <span class="p">{</span>
      <span class="na">external</span><span class="p">:</span> <span class="p">[</span><span class="dl">"</span><span class="s2">react</span><span class="dl">"</span><span class="p">],</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span>
        <span class="na">globals</span><span class="p">:</span> <span class="p">{</span>
          <span class="na">react</span><span class="p">:</span> <span class="dl">"</span><span class="s2">React</span><span class="dl">"</span><span class="p">,</span>
        <span class="p">},</span>
      <span class="p">},</span>
    <span class="p">},</span>
  <span class="p">},</span>
<span class="p">});</span>
</code></pre></div></div>

<p><strong>Why This Matters:</strong>
Tree-shaking keeps your consumers’ bundles small, and marking dependencies as peers prevents multiple React versions or unnecessary bloat in downstream apps.</p>

<p><strong>Resources:</strong> <a href="https://rollupjs.org/guide/en/#tree-shaking">Tree-Shaking in Rollup</a>, <a href="https://vitejs.dev/guide/build.html#library-mode">Vite Library Mode Guide</a></p>

<p>Building a component library is less about fancy code and more about making it easy and reliable for others to use. Keep it simple, make it accessible, test it well — and you’ll already be ahead.</p>

<p>What lessons have you learned while building your own libraries? I’d love to hear your thoughts!</p>]]></content><author><name>Salauddin Omar Sifat</name></author><category term="Software Engineering" /><category term="Frontend Development" /><category term="React" /><category term="Vite" /><category term="component library" /><category term="react" /><category term="typescript" /><category term="accessibility" /><category term="testing" /><category term="ui components" /><category term="design system" /><category term="tree shaking" /><category term="vitest" /><category term="react testing library" /><category term="frontend architecture" /><summary type="html"><![CDATA[At QuestionPro, we are building a component library for all our satellite projects using React. During this process, we learned a lot about setting up a project and publishing it to npm. This is not a step-by-step tutorial but rather a collection of key takeaways that might help you if you’re working on a component library.]]></summary></entry></feed>