"Just call ChatGPT" isn't the answer

A raw general-purpose language model (a generic ChatGPT, Claude, or similar) knows nothing specific about your business. Ask it "do you ship the ZT-500 to Germany" or "what are your hours on Sundays" and it will produce a confident, entirely fabricated answer. Not because it's malicious, but because generating plausible-sounding text is what language models do.

The fix is to constrain the model to your actual business content and to put a thin rule layer on top that handles the things language models are unreliable at: deciding when to ask the visitor for their name, knowing when to stop pitching, recognizing a request for a live person, and routing that request to you quickly. None of this is exotic, but doing it well is the difference between "a chatbot" and "a chat assistant I'd actually put on my site."

Grounded retrieval, in plain English

Grounded means the bot's reply has to come from content you gave it: your website text, an FAQ you maintain, policy documents, maybe an inventory feed. When a visitor asks a question, the system first looks up the relevant chunks of your content, then hands those to the language model with the instruction "answer only from this."
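The "answer only from this" step can be sketched in a few lines. This is a minimal illustration, not any vendor's actual implementation; the instruction wording and function name are assumptions for the example:

```python
def build_grounded_prompt(question, retrieved_chunks):
    """Assemble an LLM prompt that restricts the answer to retrieved content.

    `retrieved_chunks` would come from the retrieval step over your own
    website text, FAQ, and policy documents.
    """
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer ONLY from the business content below. "
        "If the answer is not there, say you don't know and offer a handoff.\n\n"
        f"Business content:\n{context}\n\n"
        f"Visitor question: {question}"
    )

prompt = build_grounded_prompt(
    "Do you ship the ZT-500 to Germany?",
    ["Shipping policy: we ship to the EU, including Germany, in 5-7 days."],
)
```

The point of the structure is that the model never sees the question without the content it is allowed to answer from.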

The retrieval step is where most of the quality lives. Two different techniques each catch things the other misses:

  1. Dense (semantic) retrieval matches on meaning, so a word like "durable" can find a page that says "reinforced construction" without sharing a single word.
  2. Lexical (keyword) retrieval matches on exact terms, so a branded model name like "ZT-500" lands on exactly the right page.

A well-built assistant runs both in parallel and fuses their rankings — a technique the industry usually calls reciprocal rank fusion, or RRF. That way a question like "I need something durable for daily commuting" gets help from the dense side (catches "durable" in a product description that uses "reinforced construction"), and a question like "ZT-500 price" gets help from the lexical side (pins the exact product to #1). Neither approach alone is enough for a business with both narrative answers and branded products.
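The fusion step itself is simple arithmetic. Here is a minimal sketch of reciprocal rank fusion: each document's fused score is the sum of 1/(k + rank) over every ranked list it appears in, with k = 60 as the conventional constant. The document IDs and the two ranked lists are hypothetical:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one combined ranking.

    A document scores 1/(k + rank) for each list it appears in; documents
    found by both retrievers rise toward the top.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results for the query "ZT-500 price":
dense_results = ["pricing-faq", "shipping-policy", "zt500-page"]  # semantic match
lexical_results = ["zt500-page"]                                  # exact keyword match
fused = reciprocal_rank_fusion([dense_results, lexical_results])
```

Because "zt500-page" appears in both lists, its two small scores add up and it outranks pages that only one retriever found.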

What you should not need to know as an operator: vector dimensions, embedding models, chunk sizes, rank fusion formulas. What you should need to know: the bot answers from your content, it handles paraphrase, and it catches your product names exactly. If a vendor can't explain how the second of those (paraphrase handling) works, that's a red flag.

A rule-based layer on top of the LLM

Here's a thing that sounds boring but matters a lot: pure prompt engineering is unreliable. If you just tell a language model "always ask for the visitor's name before closing the lead," it will comply often enough to look fine in a demo, but not reliably enough to run a real lead-capture flow on. Missing the name on some leads is a real business cost, and the failures are hard to catch during testing because they look identical to the successes until you review a week of transcripts.

The fix is a small deterministic policy layer: a rule-based decision per turn about what shape the next reply should take. Should the bot just answer? Should it answer and then offer to connect the visitor with a real person? Is it time to ask for a name? A contact method? Is the visitor winding down, in which case the bot should stop pitching? Is the visitor frustrated, in which case the bot should acknowledge that before anything else?

These decisions are made by code, not by asking the model. The model still writes the words, but the instructions it receives are different depending on which rule fired. Same vocabulary, different guardrails.

Why this matters for you: it's how the bot stops asking for your email on the third turn in a row, how it refuses to offer to connect twice to the same visitor, how it knows to wind down the conversation instead of keeping the pitch going when you've clearly decided not to buy. None of that is a prompt trick. It's a policy layer.
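A deterministic policy layer of this kind can be sketched as a small per-turn function. The rule order, keyword triggers, and state field names below are illustrative assumptions, not a real product's rules; the shape is what matters: plain code decides, and only the wording is left to the model:

```python
def next_reply_shape(state, visitor_msg):
    """Deterministic per-turn policy: decide which instruction template
    the LLM receives this turn. `state` is a plain dict of session facts.
    """
    msg = visitor_msg.lower()
    # Frustration is acknowledged before anything else.
    if any(w in msg for w in ("frustrat", "ridiculous", "useless")):
        return "acknowledge_frustration_first"
    # A request for a live person fires at most one offer.
    if any(w in msg for w in ("human", "real person", "talk to someone")):
        return "confirm_handoff" if state.get("handoff_offered") else "offer_handoff"
    # A visitor winding down gets a wind-down, not another pitch.
    if any(w in msg for w in ("no thanks", "just looking", "that's all")):
        return "wind_down"
    # Ask for a name only from an engaged visitor who hasn't given one.
    if state.get("engaged") and not state.get("name"):
        return "answer_then_ask_name"
    return "answer_only"
```

Each returned label maps to a different set of guardrails wrapped around the same model, which is exactly why the bot can't ask for your email three turns in a row: the rule that asks only fires when the state says it hasn't fired yet.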

Session state and the "snapshot" lead

A chat is a conversation. The bot needs to remember things across turns. When a visitor says their name on turn 2 and their phone number on turn 5, the bot has to hold onto "name given" the whole time, even if the intervening turns talk about something else. That's session state.

The important constraint: state is per-session and time-bounded. It's not a running memory of "everything this visitor has ever told us." It's scoped to the conversation, it expires, and it doesn't train the underlying model. That matters for privacy and for predictability.

Lead capture is a related but separate event. Most chat platforms get this wrong by firing a notification every time the bot picks up a new fact about the visitor. A well-built system treats the lead as a single snapshot event: the bot collects what it can across the conversation, and when the visitor shares contact info, the system fires one notification with everything. No duplicate emails, no CRM pollution, no "is this the same person" guessing game downstream.
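The snapshot pattern is easy to show in miniature. This sketch assumes a per-session object and a single `record` method; the field names are illustrative. The key property is the `lead_fired` flag: the notification fires once, with everything collected so far, and never again for the same session:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Session:
    """Per-session, time-bounded state. Nothing here trains a model."""
    name: Optional[str] = None
    contact: Optional[str] = None
    lead_fired: bool = False

    def record(self, name=None, contact=None):
        """Store a new fact; return a lead snapshot exactly once,
        and only after contact info exists."""
        if name:
            self.name = name
        if contact:
            self.contact = contact
        if self.contact and not self.lead_fired:
            self.lead_fired = True
            return {"name": self.name, "contact": self.contact}
        return None  # no per-fact drip of notifications

session = Session()
session.record(name="Dana")            # turn 2: name stored, nothing fires
lead = session.record(contact="dana@example.com")  # turn 5: one snapshot, all at once
```

A second call with the same contact info returns nothing, which is where "no duplicate emails, no CRM pollution" comes from.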

What a well-built chat assistant deliberately doesn't do

Some of the most useful architectural choices are about what the bot won't do:

  1. It doesn't search the open web; it answers only from content you've provided or connected.
  2. It doesn't learn from customer conversations; transcripts never train the underlying model.

That list is a feature set, not a limitation. A bot that "learns from customer conversations" is a bot you cannot audit. A bot that "searches the web" introduces a moving source of truth you cannot review or sign off on. Neither is acceptable on a small business site.

Per-industry tuning, briefly

A message like "severe pain and bleeding" means something different to a dental office, a legal intake form, and an e-commerce support queue. A well-tuned chat assistant recognizes that and routes accordingly. This is usually done with industry-specific keyword patterns and verticalized behavior rules: a dental bot knows what an emergency looks like; a legal bot doesn't try to diagnose an injury; an e-commerce bot cares about SKUs and order numbers.
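Keyword-pattern routing of this sort is mechanically simple. The patterns and industry labels below are made up for illustration; a real system would maintain vetted pattern sets per vertical rather than these guesses:

```python
import re

# Illustrative per-vertical escalation patterns (assumptions, not real rules).
EMERGENCY_PATTERNS = {
    "dental": re.compile(r"severe pain|bleeding|swelling|knocked[ -]?out", re.I),
    "legal": re.compile(r"court date|arrest|statute of limitations", re.I),
    "ecommerce": re.compile(r"order\s*#?\d+|refund|damaged in transit", re.I),
}

def route(industry, message):
    """Escalate when the industry's emergency pattern matches, else
    continue the normal answer flow."""
    pattern = EMERGENCY_PATTERNS.get(industry)
    if pattern and pattern.search(message):
        return "escalate"
    return "standard"
```

Note that the same message routes differently per vertical: "severe pain and bleeding" escalates for a dental office but is just a standard message for an e-commerce queue.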

None of this is glamorous. It's the difference between a generic "AI for any business" that kind of works for everyone and a per-industry assistant that actually belongs on your site.

How Simple Business Bots handles each of these

Questions to ask any chat vendor

If you want a short checklist for evaluating any AI chat vendor, these five questions tend to separate the careful systems from the thin wrappers:

  1. "Can your bot look things up on the open internet, or only from content and systems I've explicitly provided or connected?" The right answer is only content and systems I've connected (your FAQ, website, inventory feed, calendar, CRM, etc.), not the open internet.
  2. "How does the bot decide when to hand off to me?" The right answer is some version of a rule, not the model decides on its own.
  3. "What does the bot do when it can't answer?" The right answer is admits it, captures contact info, offers a handoff, not guesses and not replies with a dead end.
  4. "Do customer conversations train the underlying model?" The right answer is no.
  5. "Is there behavior specific to my industry, or is it the same bot for everyone?" Industry tuning isn't always essential, but it's a meaningful signal about how much care went into the product.