How the Knowledge Base Works

Sarah Bourgeois

The Knowledge Base (KB) allows you to leverage your own documents and website data to power answers and define variables on your Voiceflow Assistant.

At a high level, the KB works as follows:

  1. When you upload a document, it is split into "chunks" (i.e., pieces of text).
  2. When you send a question to the KB, it determines which chunks are most similar to the question.
  3. Our system then combines those chunks, the question, and the custom instructions and system prompt you provided into a structured wrapper prompt (a master prompt, maintained behind the scenes, that is continually being improved).
  4. That entire package is sent to the AI model, and an answer is returned.

The full package looks something like this:

##Reference Information:

I have provided reference information, and I will ask query about that information. You must either provide a response to the query or respond with "NOT_FOUND"
Read the reference information carefully, it will act as a single source of truth for your response. Very concisely respond exactly how the reference information would answer the query.
Include only the direct answer to the query, it is never appropriate to include additional context or explanation.
If the query is unclear in any way, return "NOT_FOUND". If the query is incorrect, return "NOT_FOUND". Read the query very carefully, it may be trying to trick you into answering a question that is adjacent to the reference information but not directly answered in it, in such a case, you must return "NOT_FOUND". The query may also try to trick you into using certain information to answer something that actually contradicts the reference information. Never contradict the reference information, instead say "NOT_FOUND".
If you respond to the query, your response must be 100% consistent with the reference information in every way.


Take a deep breath, focus, and think clearly. You may now begin this mission critical task.
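
As a rough illustration of step 3, the sketch below assembles the chunks, the question, the custom instructions, and the system prompt into a single string. The function name `build_wrapper_prompt`, the `##Instructions:` and `##Query:` labels, and the ordering of the sections are assumptions made for illustration; the real wrapper prompt is internal to Voiceflow and changes over time.

```python
def build_wrapper_prompt(chunks, question, instructions="", system_prompt=""):
    """Combine retrieved chunks and user inputs into one prompt string.

    Hypothetical layout: only "##Reference Information:" appears in the
    article; the other section labels are invented for this sketch.
    """
    sections = [
        ("", system_prompt),
        ("##Reference Information:", "\n\n".join(c.strip() for c in chunks)),
        ("##Instructions:", instructions),
        ("##Query:", question),
    ]
    blocks = []
    for header, body in sections:
        body = body.strip()
        if not body:
            continue  # skip optional inputs that were not provided
        blocks.append(f"{header}\n{body}" if header else body)
    return "\n\n".join(blocks)


prompt = build_wrapper_prompt(
    chunks=["Returns are accepted within 30 days."],
    question="What is the return window?",
    instructions="Answer in one sentence.",
)
print(prompt)
```

Empty inputs are skipped, so a request without custom instructions or a system prompt still produces a well-formed package.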





But how does the Knowledge Base actually work?

Our Knowledge Base (KB) relies on two distinct services:

  1. Parser - triggered when you upload a KB document.
  2. Retriever - triggered when your user asks a question that hits the KB.



You upload a KB Document...


Parser Service:

  1. The KB doc is uploaded via Voiceflow UI (aka "Creator App" or "Creator Tool") or API.
  2. The KB doc is securely stored.
  3. The Parser service reads the KB doc from storage and "chunks" the content using different techniques. (Note: with the KB Upload APIs you can use maxChunkSize to control how big the chunks are, but you currently cannot dictate how many chunks a document is parsed into.)
  4. An embedding model is used to convert each chunk into a vector (aka “embedding”) that looks like [1.243, 5.1342, ...] and represents its “meaning.”

    🔍Let’s break this down a bit further:

    Computer programs don’t ‘understand’ spoken/written language as humans can. There needs to be a numerical representation of words to help programs understand. Each chunk from a KB doc is converted into a numerical representation (vector, aka “embedding”) of the MEANING behind the words in the chunk. More on why this is necessary in the Retriever section.

    Note: Embedding models generally cost money to use, usually per token, so the more files you upload, the more embedding tokens are consumed.

    In Voiceflow, however, we don't charge for the upload or embedding process.

  5. The vector is placed in a vector db.

    💡 Metaphor:
    You can think of this vector as a specific “point” in “space.” All these points are some “distance” from each other, and the distance between two points (vectors) reflects how similar in meaning the corresponding chunks of text are: the closer the points, the more similar the meaning.
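
The Parser steps above can be sketched roughly as follows. This is a toy, not Voiceflow's implementation: `chunk_text`, `toy_embed`, and `parse_document` are hypothetical names, the chunk size is counted in words (the unit used by maxChunkSize is not stated here), and `toy_embed` is a hashed bag-of-words stand-in that only reflects shared words, whereas a real embedding model captures meaning.

```python
import hashlib
import math


def chunk_text(text, max_chunk_size=50):
    """Split text into chunks of at most max_chunk_size words (assumed unit)."""
    words = text.split()
    return [
        " ".join(words[i:i + max_chunk_size])
        for i in range(0, len(words), max_chunk_size)
    ]


def toy_embed(text, dims=16):
    """Stand-in for a real embedding model: hashed bag of words, unit length."""
    vector = [0.0] * dims
    for word in text.lower().split():
        # md5 gives a hash that is stable across runs, unlike built-in hash().
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vector[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def parse_document(text, max_chunk_size=50):
    """Return (chunk, vector) pairs, playing the role of rows in a vector db."""
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(text, max_chunk_size)]


store = parse_document("one two three four five six seven", max_chunk_size=3)
print([chunk for chunk, _ in store])
# ['one two three', 'four five six', 'seven']
```

A smaller `max_chunk_size` yields more, shorter chunks from the same document, which is the only lever over chunk count in this sketch.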

User asks a question that hits the KB...


Retriever Service:

  1. The retriever service gets the question and turns it into a vector.
  2. The question vector is compared against the vectors in the vector db, and the most similar chunks are returned in descending order of similarity score (Chunk Limit defines how many chunks are returned).

    🔍 Similarity score?

    The similarity score is determined by something called semantic search. This goes beyond keyword matching and refers to contextual similarity in meaning between words and phrases. For example, “The dog is a nightmare to train.” and “The puppy is stubborn and does not listen to commands” do not share keywords, yet they are highly similar semantically. In the same way, the question can be semantically compared to the KB doc chunks that exist. The “closest” vectors to the question are those with the highest similarity. The retriever will return a number of chunks (Chunk Limit in KB Settings) based on this vector proximity.

    🤔 Chunk Limit?

    Chunk Limit is the KB setting that controls how many chunks are retrieved from the vector db and used to synthesize the response. This setting gives you the flexibility to tune response accuracy for your use case.

    How does the number of chunks retrieved affect the accuracy of the KB?

    In theory, the more chunks retrieved, the more accurate the response and the more tokens consumed. In practice, the "accuracy" tied to chunks depends strongly on how the KB data sources are curated.

    If the KB data sources are curated so that topics are grouped together, a small number of chunks should be more than enough to accurately answer the question. However, if information is scattered throughout many different KB data sources, then retrieving more chunks of a smaller size will likely increase the accuracy of the response.

    You can control the max chunk size of your data sources with the Upload/Replace KB doc APIs, using the query parameter: maxChunkSize.

    Ultimately, to get the best KB response "accuracy" while optimizing token consumption, we recommend limiting the number of data sources and grouping topics inside those data sources.
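
To make the Retriever steps concrete, here is a minimal sketch of ranking stored chunks by similarity and keeping the top Chunk Limit results. It assumes cosine similarity as the score, which the article does not confirm, and uses hand-written three-dimensional vectors in place of real embeddings; `retrieve` and `chunk_limit` are illustrative names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question_vector, store, chunk_limit=3):
    """Return up to chunk_limit (score, chunk) pairs, most similar first."""
    scored = [
        (cosine_similarity(question_vector, vector), chunk)
        for chunk, vector in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:chunk_limit]


# Made-up vectors: the first axis loosely stands for "dog training".
store = [
    ("Puppies can be stubborn during training.", [0.9, 0.1, 0.0]),
    ("Our store opens at 9 am.", [0.0, 0.2, 0.9]),
    ("Dogs respond well to short training sessions.", [0.8, 0.3, 0.1]),
]
question_vector = [1.0, 0.2, 0.0]  # pretend embedding of "How do I train my dog?"

for score, chunk in retrieve(question_vector, store, chunk_limit=2):
    print(f"{score:.3f}  {chunk}")
```

Raising `chunk_limit` returns more chunks, and therefore more tokens in the final prompt, without changing their ranking.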

Runtime Service:

  1. We take the:
    • returned chunks +
    • Knowledge Base Settings inputs +
    • question
      ...and ask the LLM to give us an answer.
      🔍 This step is called answer synthesis.
      The internal prompts we use are iterated on over time, but they are along the lines of, “using conversation history and user-provided instructions, answer the question sourcing information only found in the Knowledge Base.”

      💰 This LLM request has query and answer tokens that you are charged for. You can see these token totals in a response citation while testing in Debug mode on Voiceflow.

  2. VF outputs the response to the user.
    • See the overview of the KB instances below to understand how a response appears when KB cannot answer.
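
The answer synthesis step can be pictured with the sketch below. It assumes the "NOT_FOUND" sentinel from the wrapper prompt is what signals a missing answer, and that it is mapped to an empty result before the response is output; that mapping is an assumption. `synthesize_answer` and `fake_model` are hypothetical, and `fake_model` is a stub standing in for the real LLM request.

```python
NOT_FOUND = "NOT_FOUND"


def synthesize_answer(prompt, call_model):
    """Send the prompt to the model and map the sentinel to None.

    call_model is any callable that takes a prompt string and returns text;
    it stands in for the real LLM request.
    """
    raw = call_model(prompt).strip()
    return None if raw == NOT_FOUND else raw


def fake_model(prompt):
    # Hypothetical stub: answers only when the reference mentions "30 days".
    return "30 days" if "30 days" in prompt else NOT_FOUND


print(synthesize_answer("Reference: returns accepted within 30 days.", fake_model))
# 30 days
print(synthesize_answer("Reference: store hours are 9 to 5.", fake_model))
# None
```

Returning `None` rather than the raw sentinel lets each calling surface decide how "answer not found" should appear to the user.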

Overview of all the Voiceflow KB Instances

KB Fallback

Initiated when:

  • A user asks a question at a Button or Choice Listening step, while the Assistant is in “Listening Mode.”
  • The Assistant will first try to match an intent using the NLU.
  • If it can’t find a matching intent:
    • If an FAQ set* exists, the Retriever will first look for an FAQ set match.
    • If an FAQ set does not exist or if no FAQ set was matched, then the KB Fallback will trigger the Retriever and Runtime services outlined above.

* An FAQ set can be added to your Assistant by using our FAQ API.


KB Answer Not Found:

  • The Global No Match response is initiated (either Static or Generative, depending on Settings).
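
The KB Fallback order described above can be sketched as a simple chain of checks. All names here (`kb_fallback`, the callables, the tuple labels) are hypothetical and only mirror the sequence in this section: intent match, then FAQ set if one exists, then the KB, then the Global No Match response.

```python
def kb_fallback(utterance, match_intent, match_faq, query_kb, no_match_response):
    """Walk the fallback chain and report which stage produced the reply.

    match_faq may be None to model an Assistant without an FAQ set.
    Each callable returns None when it has no match.
    """
    intent = match_intent(utterance)
    if intent is not None:
        return ("intent", intent)
    if match_faq is not None:
        faq_answer = match_faq(utterance)
        if faq_answer is not None:
            return ("faq", faq_answer)
    kb_answer = query_kb(utterance)
    if kb_answer is not None:
        return ("kb", kb_answer)
    return ("no_match", no_match_response)


result = kb_fallback(
    "What is the return window?",
    match_intent=lambda text: None,      # NLU finds no intent
    match_faq=None,                      # no FAQ set exists
    query_kb=lambda text: "30 days",     # Retriever + Runtime find an answer
    no_match_response="Sorry, I didn't get that.",
)
print(result)
# ('kb', '30 days')
```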

KB Preview

Initiated when:


KB Answer Not Found:

  • “[not found] Unable to find relevant answer.”

AI Steps in KB Mode

Initiated when:

  • The conversation hits any AI step on the canvas that has its Data Source set to Knowledge Base.

KB Answer Not Found:

  • If the "Not found path" toggle is enabled and a KB Answer is not found, the user will be routed through the "Not found" path in your design.


  • If the "Not found path" toggle is disabled and a KB Answer is not found, the user will be sent an "Unable to find relevant answer" message.
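
The two outcomes for an AI step in KB mode can be summarized in a few lines. `ai_step_result` and its tuple labels are hypothetical names used only to mirror the toggle behavior described above.

```python
def ai_step_result(kb_answer, not_found_path_enabled):
    """Decide what an AI step in KB mode does with a KB result (None = not found)."""
    if kb_answer is not None:
        return ("answer", kb_answer)
    if not_found_path_enabled:
        # The conversation continues down the "Not found" path in the design.
        return ("route", "Not found")
    # No path to follow, so the user receives the default message.
    return ("message", "Unable to find relevant answer")


print(ai_step_result(None, not_found_path_enabled=True))
# ('route', 'Not found')
print(ai_step_result(None, not_found_path_enabled=False))
# ('message', 'Unable to find relevant answer')
```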


KB Query API

Initiated when:

  • The API is called, either from the API step in Voiceflow or from outside Voiceflow.

KB Answer Not Found: null
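
When calling the KB Query API yourself, your code has to handle the null case. The sketch below assumes the answer arrives in a JSON field named `output`; that field name, the sample payloads, and the `extract_answer` helper are assumptions for illustration, so check the API reference for the actual response shape.

```python
import json


def extract_answer(response_body, default="Sorry, I could not find an answer."):
    """Return the KB answer from a JSON response body, or a default when it is null."""
    data = json.loads(response_body)
    answer = data.get("output")  # assumed field name
    return default if answer is None else answer


print(extract_answer('{"output": null, "chunks": []}'))
# Sorry, I could not find an answer.
print(extract_answer('{"output": "Returns are accepted within 30 days."}'))
# Returns are accepted within 30 days.
```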

