AI assistants need access to high-quality corporate data to effectively automate routine business tasks. This requirement applies to data used for training, for fine-tuning, and for inference-time retrieval in Retrieval-Augmented Generation (RAG).
A large amount of work still remains, including data labeling, ETL (Extract, Transform, Load), software development, and analysis, all of which are highly customized to each specific business.
Sensitive data such as trade secrets or client data must not leak into areas where it is not needed. We need to implement fine-grained access policies for client-specific data and sensitive corporate data that are available to LLMs at inference time.
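One way to think about this is to enforce the access policy in the retrieval layer, before any chunk of text reaches the model. The sketch below is a minimal illustration, not a production design: the Chunk class, the labels, the keyword_score stand-in for a real vector search, and the example data are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    labels: frozenset = frozenset()  # access labels set at ingestion, e.g. {"support"}

def keyword_score(query: str, chunk: Chunk) -> int:
    # Trivial stand-in for a real vector similarity search.
    return len(set(query.lower().split()) & set(chunk.text.lower().split()))

def retrieve_for_user(query: str, user_groups: frozenset, index: list, k: int = 3) -> list:
    """Apply the access policy *before* ranking, so chunks the user may not
    see never reach the prompt sent to the LLM."""
    allowed = [c for c in index if c.labels <= user_groups]
    return sorted(allowed, key=lambda c: keyword_score(query, c), reverse=True)[:k]

# Usage: a user in the "support" group cannot retrieve chunks labelled "finance".
index = [
    Chunk("Refund policy: refunds are issued within 14 days.", frozenset({"support"})),
    Chunk("Q3 margin targets per client.", frozenset({"finance"})),
]
print(retrieve_for_user("refund policy", frozenset({"support"}), index))
```

The important design choice is that filtering happens before prompt assembly; relying on the model itself to withhold sensitive content it has already seen is not a reliable control.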
We also need to decide whether a single fine-tuned model is sufficient for our purposes, or whether it is necessary to train and/or fine-tune models specific to different areas of the business.
LLMs capture the statistical probability of words following other words. Statistically probable words are not the same as factual reality. Inference-time data augmentation, RAG, prompt engineering, and prompt chaining only reduce factual mistakes; they do not eliminate them.
We have seen examples of engagement time being used as a success measure. Unfortunately, this can simply measure the time customers or users wasted trying to accomplish their tasks with the AI assistant.
Instead, we should define a list of routine tasks we plan to automate, benchmark these tasks without an AI Assistant, and then perform the same tasks with the AI Assistant. We should measure not only the time users spend completing these tasks but also the quality of the results.
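As a rough sketch of what such a benchmark report could look like, the snippet below compares mean completion time and mean quality with and without an assistant. The task names, timings, and reviewer scores are illustrative placeholders, not measurements.

```python
from statistics import mean

# Illustrative records: (task, seconds to complete, reviewer quality score 0..1).
baseline = [("draft reply", 420, 0.95), ("summarise ticket", 300, 0.90)]
with_assistant = [("draft reply", 150, 0.85), ("summarise ticket", 90, 0.92)]

def report(label, runs):
    # Report both dimensions: saving time is not a win if quality collapses.
    print(f"{label}: mean time {mean(t for _, t, _ in runs):.0f}s, "
          f"mean quality {mean(q for _, _, q in runs):.2f}")

report("without assistant", baseline)
report("with assistant", with_assistant)
```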
When users interact with an AI Assistant, they can continuously correct mistakes in the generated content. To do so, they need access to the correct information source and the ability to recognize when the AI Assistant is making factual errors. We need to define an acceptable level of mistakes generated by an AI Assistant and include the cost of correcting those mistakes. We have seen use cases where customers performed self-service and nudged the AI Assistant into providing reasonable results. Headlines in which companies boasted about replacing a large number of customer support staff with an AI Assistant look premature. The more realistic scenario is one where 90% of all support cases can be fully self-served by an AI, while the remaining cases require some level of human intervention.
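A back-of-the-envelope model makes the point about correction cost concrete. All of the rates and per-case costs below are illustrative assumptions, not data; the only claim is that the cost of fixing the assistant's mistakes belongs in the calculation.

```python
# Blended cost per support case; every number here is an illustrative assumption.
cases = 1000
self_serve_rate = 0.90          # share of cases the assistant fully resolves
error_rate = 0.05               # share of self-served cases that later need correction
cost_ai = 0.50                  # cost per AI-handled case
cost_human = 8.00               # cost per human-handled case
cost_correction = 12.00         # cost to fix a case the assistant got wrong

ai_cases = cases * self_serve_rate
human_cases = cases - ai_cases
total = (ai_cases * cost_ai
         + human_cases * cost_human
         + ai_cases * error_rate * cost_correction)
print(f"blended cost per case: {total / cases:.2f}")
```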
Ladislav Urban, CEO of Dynocortex