Can a chatbot learn from a website’s content?
by Vagelis H. 01/26/2021
We live in the age of AI, and expectations are high. Several customers have asked me about the extent of AI used by SmartBot360. A common question is whether a chatbot can become smarter as more people chat with it, which I cover in an earlier article. Another question, which is the focus of this article, is whether a chatbot can search for an answer in a Web site or other document collection (such as Salesforce documents).
The answer is that there are several techniques to achieve this, with different levels of sophistication, which lead to different user experiences. Below, we will examine some of these solutions.
1) Search for relevant documents
This is the simplest solution. The key idea is that if the user asks something that the chatbot cannot answer, for example, “can I bring my dog to the visit,” the chatbot can search the documents of a Web site and send the user to a relevant document, such as the page visit_policies.html.
The chatbot may recommend only the single most relevant page or a short list of results. The ranking can rely on simple keyword matching or use AI to search semantically for the most relevant document. The most recent techniques for accurate text matching are based on deep learning, specifically on mapping the query and each document to vector embeddings and computing the similarity between them.
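To make the keyword-matching end of that spectrum concrete, here is a minimal sketch that ranks pages by TF-IDF cosine similarity. It is an illustration only, not how SmartBot360 or any other product ranks documents; the page names and texts in PAGES and the function rank_documents are hypothetical. A semantic approach would keep the same structure but replace the TF-IDF vectors with embeddings produced by a learned model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(query, docs, top_k=5):
    """Return up to top_k (name, score) pairs, best match first.

    docs maps a page name to its text. Scores are TF-IDF cosine
    similarities; pages that share no term with the query are dropped.
    """
    doc_tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)

    # Document frequency: in how many pages does each term appear?
    df = Counter()
    for tokens in doc_tokens.values():
        df.update(set(tokens))

    def vectorize(tokens):
        # Term frequency times a smoothed inverse document frequency.
        tf = Counter(tokens)
        return {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
                for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = vectorize(tokenize(query))
    scored = [(cosine(q_vec, vectorize(tokens)), name)
              for name, tokens in doc_tokens.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:top_k] if score > 0]


# Hypothetical site content, made up for this example.
PAGES = {
    "visit_policies.html": "Visit policies: pets, including a dog or cat, "
                           "are not allowed in our offices.",
    "locations.html": "Our offices are in Riverside and downtown. Parking is free.",
    "billing.html": "Pay your bill online or by phone.",
}

for name, score in rank_documents("can I bring my dog to the visit", PAGES):
    print(name, round(score, 3))
```

Note that this sketch only finds the policies page because the hypothetical text happens to contain the word “dog.” A page that only says “pets” would be missed, which is exactly the gap that semantic search is meant to close.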
Another consideration is how to incorporate the context of the chat into the query. For example, if the user mentioned earlier that she is interested in the Riverside office, then the query should ask whether the Riverside office allows dogs. Most solutions out there do not support context.
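One simple way to picture this is to append the values collected earlier in the chat to the search query. The sketch below assumes the chatbot stores context as a dictionary of named values; the function build_contextual_query and the "office" key are hypothetical, and real systems may weight or rewrite the query in more sophisticated ways.

```python
def build_contextual_query(user_query, context):
    """Append context values that the query does not already mention."""
    lowered = user_query.lower()
    extras = [str(value) for value in context.values()
              if str(value).lower() not in lowered]
    return " ".join([user_query] + extras)


# Hypothetical context gathered earlier in the conversation.
context = {"office": "Riverside"}
print(build_contextual_query("can I bring my dog to the visit", context))
```

The expanded query can then be passed to whatever document search the chatbot uses, so that pages about the Riverside office rank higher.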
There are a few chatbots that employ this document search approach. For example, Anthem has a simple keyword-matching chatbot in its members portal. Typing “find doctor” forwards the user to the provider search page. SmartBot360 includes a Document Search box in its bot diagrams, which returns the top-5 most relevant documents given a query, as shown below. SmartBot360 supports several document sources, including Web pages. SmartBot360 also has options to improve the search accuracy by using the conversation context.
2) Search for the answer segment
The problem with the first solution is that it returns whole pages rather than the answer to the user’s question. Going back to the example of the user asking “can I bring my dog to the visit,” if the page visit_policies.html contains the sentence “For safety reasons, no pets are allowed in our offices.” then the chatbot could return this sentence instead of the whole page.
This is much more challenging than the first solution, but fortunately, there are recent advances in deep learning that bring this solution closer to reality. There is a recent competition set up by Stanford to promote research in this area. SmartBot360 is testing several solutions to extract the best text segment by customizing state-of-the-art techniques.
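The basic shape of segment extraction can be sketched as follows: split the page into sentences, score each one against the question, and return the best. This is a toy under loud assumptions, not one of the techniques SmartBot360 is testing. In particular, the hand-written SYNONYMS table is a hypothetical stand-in for the semantic matching that a deep learning model would provide, and the page text is made up.

```python
import re

# Hypothetical, hand-written synonym table. A real system would rely on
# a learned model to relate "dog" to "pets" instead of a lookup like this.
SYNONYMS = {
    "dog": {"pet", "pets"},
    "visit": {"office", "offices"},
}


def tokenize(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_sentence(question, page_text):
    """Return the sentence sharing the most (expanded) terms with the question."""
    terms = tokenize(question)
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())

    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", page_text)
                 if s.strip()]
    scored = [(len(expanded & tokenize(s)), s) for s in sentences]
    score, sentence = max(scored, key=lambda pair: pair[0], default=(0, None))
    return sentence if score > 0 else None


# Hypothetical content of a policies page.
PAGE = ("Our offices are open Monday through Friday. "
        "For safety reasons, no pets are allowed in our offices. "
        "Please arrive ten minutes early.")

print(best_sentence("can I bring my dog to the visit", PAGE))
```

Without the synonym table, the question and the pet policy sentence share no keyword at all, which is why plain keyword overlap is not enough for this task.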
3) Give the answer
The ultimate goal is to answer the question as a human would. Going back to the example of the user asking “can I bring my dog to the visit,” the answer should be “No, you cannot.”
This is called Question-Answering (QA) in the research literature. QA is the holy grail of search (including Web search), and even though many researchers and companies have been studying solutions, most of these solutions work only for limited types of questions. For example, take a look at Google AI’s research page on this.