The website crawler is the fastest way to build your chatbot’s knowledge base. Enter a URL and InsiteChat crawls your entire site, extracts the content, and indexes it — no manual copy-pasting required.

Add your website as a knowledge source

1. Open your dashboard
   Log in to InsiteChat and select the chatbot you want to train.

2. Go to Sources
   In the left sidebar, click Sources, then click Add source.

3. Choose Website
   Select Website from the list of source types.

4. Enter your URL
   Type or paste your website’s root URL, for example https://yourcompany.com. InsiteChat starts from this page and follows links to discover the rest of your site.

5. Start crawling
   Click Crawl. InsiteChat begins indexing your site in the background.

6. Wait for indexing to complete
   The source card shows a progress indicator. Once indexing is complete, the status changes to Ready and your chatbot can answer questions based on the content.

If your site is large, start by submitting your most important pages (such as your homepage, pricing page, and FAQ) as separate sources. This gets your chatbot answering key questions quickly while the full crawl finishes.

What gets indexed

InsiteChat follows links from your starting URL and indexes every page it can reach, including:
  • Product and service pages
  • Blog posts and articles
  • FAQ and help center pages
  • About, pricing, and contact pages
Pages behind a login or blocked by robots.txt are not crawled.
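
To preview which of your pages a robots.txt file would exclude, you can run a quick local check. The sketch below uses Python's standard urllib.robotparser module. The user agent name ExampleCrawler and the sample rules are assumptions made for illustration; this page does not document the user agent that InsiteChat's crawler sends.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /drafts/
"""


def allowed_urls(robots_txt, urls, user_agent="ExampleCrawler"):
    """Return the URLs that robots_txt allows user_agent to fetch."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    candidates = [
        "https://yourcompany.com/pricing",
        "https://yourcompany.com/admin/settings",
    ]
    for url in allowed_urls(SAMPLE_ROBOTS_TXT, candidates):
        print(url)
```

With the sample rules, only the pricing URL is reported as crawlable. Replace SAMPLE_ROBOTS_TXT with the contents of your own robots.txt to check your site.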

Crawl depth and what to expect

InsiteChat crawls your site up to several levels deep, following internal links automatically. For most sites, the full crawl completes within a few minutes. Very large sites may take longer.

The crawler extracts the main text content from each page. It does not index images or embedded videos. For that content, use Document Upload or add a YouTube transcript as a separate source.
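
To get a feel for what "several levels deep" means for your own site, the following sketch walks a small link graph breadth-first and stops at a depth limit. It is not InsiteChat's crawler: the max_depth value, the SITE_LINKS sample data, and the discover_pages function are all hypothetical, and the sketch reads links from an in-memory dictionary instead of fetching pages.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical site map: page URL -> links found on that page.
SITE_LINKS = {
    "https://yourcompany.com/": ["/pricing", "/blog", "https://other.example/partner"],
    "https://yourcompany.com/pricing": ["/"],
    "https://yourcompany.com/blog": ["/blog/post-1"],
    "https://yourcompany.com/blog/post-1": ["/blog/post-1/comments"],
    "https://yourcompany.com/blog/post-1/comments": [],
}


def discover_pages(start_url, get_links, max_depth=2):
    """Return same-host pages reachable within max_depth links of start_url."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    order = [start_url]
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for href in get_links(url):
            target = urljoin(url, href)  # resolve relative links
            if urlparse(target).netloc != host or target in seen:
                continue  # skip external links and pages already queued
            seen.add(target)
            order.append(target)
            queue.append((target, depth + 1))
    return order


if __name__ == "__main__":
    pages = discover_pages("https://yourcompany.com/", lambda u: SITE_LINKS.get(u, []))
    print("\n".join(pages))
```

With max_depth=2, the comments page (three links from the homepage) is not discovered, and the external partner link is ignored. Pages that matter to your chatbot are easier to reach when they are linked within a few clicks of your starting URL.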

Keep your content up to date

Your website content changes over time. InsiteChat provides two ways to re-sync a crawled source so your chatbot stays current.

Manual sync

You can trigger a sync at any time on all paid plans:
1. Go to Sources
   In your dashboard, click Sources.

2. Select your website source
   Click the website source you want to update.

3. Sync now
   Click Sync now. InsiteChat re-crawls your site and updates the index.

Auto-sync

InsiteChat can re-sync your website automatically based on your plan:
Plan      Auto-sync frequency
Free      No auto-sync
Starter   Monthly
Growth    Weekly
Scale     Every 3 days
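
If you want to estimate when the next automatic sync is due, the plan frequencies can be written as intervals. This is a rough sketch: the next_auto_sync helper is hypothetical, "Monthly" is approximated as 30 days, and InsiteChat's actual scheduling may differ.

```python
from datetime import datetime, timedelta

# Intervals taken from the plan list above.
# Assumption: "Monthly" is treated as 30 days.
AUTO_SYNC_INTERVALS = {
    "Free": None,  # no auto-sync
    "Starter": timedelta(days=30),
    "Growth": timedelta(days=7),
    "Scale": timedelta(days=3),
}


def next_auto_sync(plan, last_sync):
    """Return the estimated next auto-sync time, or None if the plan has none."""
    interval = AUTO_SYNC_INTERVALS[plan]
    return None if interval is None else last_sync + interval


if __name__ == "__main__":
    print(next_auto_sync("Growth", datetime(2024, 1, 1)))
```
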
On every sync, InsiteChat re-processes only the pages that have changed since the last crawl. Unchanged pages are not re-indexed, which keeps syncs fast.
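
This page does not describe how InsiteChat detects changed pages. One common approach to incremental syncing, shown here purely as an illustration, is to store a hash of each page's text and re-index only pages whose hash differs from the previous crawl. The function names and sample data are hypothetical.

```python
import hashlib


def content_hash(text):
    """Return a SHA-256 fingerprint of a page's extracted text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pages_to_reindex(previous_hashes, current_pages):
    """Return URLs that are new or whose text changed since the last crawl.

    previous_hashes maps URL -> hash stored at the last crawl.
    current_pages maps URL -> text extracted during this crawl.
    """
    return [
        url
        for url, text in current_pages.items()
        if previous_hashes.get(url) != content_hash(text)
    ]


if __name__ == "__main__":
    last_crawl = {
        "/pricing": content_hash("Plans start at $10"),
        "/faq": content_hash("How do I sign up?"),
    }
    this_crawl = {
        "/pricing": "Plans start at $12",  # changed
        "/faq": "How do I sign up?",       # unchanged
        "/blog/new-post": "Hello",         # new page
    }
    print(pages_to_reindex(last_crawl, this_crawl))
```

In this example only /pricing and /blog/new-post would be re-indexed, and the unchanged /faq page is skipped.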