In this article, Vincent Poulain d’Andecy, Research Department Manager at Yooz, revisits the early days at Yooz, which started using artificial intelligence technologies from the very beginning.
Thirty years ago, Yooz’s name was Informatics and Advanced Technologies. The company was already working with the computer science engineering school in Nîmes, France, EERIE, which is now part of the well-regarded Mines d’Alès National Higher School of Engineering.
Technology and collaboration with academic research are key aspects of Yooz’s strategy to stay at the cutting edge. They are also central to the vision of its founder, Didier Charpentier. As early as 1990, we were offering a technology that was very innovative at the time – neural networks – implemented for a public works company that wanted to make activity report entry easier for people not familiar with computer tools.
Neural networks offered new possibilities for automatic handwriting recognition. From there, we were able to create a software package for reading forms automatically. As a pioneer in the market, we then faced increasingly complex types of documents, including checks, health care reimbursement statements, order forms, and more. Scientific progress, notably in pattern recognition and Markov chain models for cursive handwriting, made it possible to handle these documents.
In the early 2000s, we moved beyond an approach based fundamentally on computer vision and headed further towards truly understanding documents. By using an expert system to analyze content, we could extract precise information contained in complex documents such as invoices.
We classified documents by applying machine learning to graphical and textual data. Still working in close collaboration with academic laboratories, we made significant progress integrating new technologies to improve our solutions’ performance, notably by introducing linguistic knowledge and developing adaptive approaches, deep learning, and incremental learning to adapt to ever-changing document workflows.
The road ahead is still long. Vast topics such as big data, the Cloud, and data center computing capacity not only open a new dimension for improving the performance of automated document understanding, but also offer opportunities to expand our scope of application into areas such as fraud detection and decision-based analysis.
Mechanization, automation, and robotization have always been used as tools to eliminate tedious tasks for people and enable them to spend more time on other activities where they have greater added value. Determining the type of information we need would be quite simple if there were only one way to write a symbol and one way to construct a document.
The whole challenge stems from the great variety of writing types, expressions, and page layouts. People adapt naturally. For example, they quickly understand a document date: “11/11/18” and “Signed in Paris on 11/11/18” represent the same information.
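As an illustration of the date example, here is a minimal Python sketch of how a machine could recover the same date from both phrasings. It is not Yooz’s implementation. The function name extract_date, the DD/MM/YY pattern, and the choice to read two-digit years as 20YY are all assumptions made for this example.

```python
import re
from datetime import date

# Hypothetical pattern: day, month and two-digit year separated by slashes.
DATE_PATTERN = re.compile(r"\b(\d{2})/(\d{2})/(\d{2})\b")


def extract_date(text):
    """Return the first DD/MM/YY date found in text, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    day, month, year = (int(group) for group in match.groups())
    try:
        # Assumption: two-digit years belong to the 2000s.
        return date(2000 + year, month, day)
    except ValueError:
        # The digits matched the pattern but do not form a real date.
        return None


bare = extract_date("11/11/18")
in_sentence = extract_date("Signed in Paris on 11/11/18")
print(bare == in_sentence)
```

A single fixed pattern like this handles only one way of writing a date. Covering month names, other separators, or other orderings means adding more rules by hand, which is exactly the variety problem described above.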
Enabling a machine to adapt to all possible variations calls for either machine learning from many examples, or artificial reasoning techniques that deduce information from knowledge models in expert systems. Artificial intelligence for accounting was therefore the natural choice for Yooz to automate document processing, as our vision is to provide high-performance solutions that can adapt to all types of documents.
A three-year-old child starts to decode symbols; a six-year-old starts to decode text; a ten-year-old starts to understand complex texts. The learning process is based on the repetition of observations: read, read, read, and read some more until memorization is achieved. At the same time, teachers also transmit rules regarding sentence construction, syntax, conjugation, and the semantics of notable words – such as “but”, “or”, and “thus” – that provide children with models for understanding text.
Artificial intelligence implements similar strategies. Machine learning leverages massive learning by example. Expert systems utilize an inference engine that applies knowledge and rules to observations in order to deduce new knowledge. Expert systems are generally effective because rules provide a model for document logic. On the other hand, they require a human expert to maintain the knowledge base.
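To make the idea of an inference engine concrete, here is a minimal sketch of forward chaining, one common way of applying rules to observations until nothing new can be deduced. The rule names and facts (has_total_amount, looks_like_invoice, and so on) are invented for illustration and do not come from an actual Yooz knowledge base.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be deduced.

    facts: iterable of strings (the observations)
    rules: list of (premises, conclusion) pairs, where premises is a set
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical knowledge base written by a human expert.
RULES = [
    ({"has_total_amount", "has_supplier_name"}, "looks_like_invoice"),
    ({"looks_like_invoice", "has_due_date"}, "payable_invoice"),
]

observations = {"has_total_amount", "has_supplier_name", "has_due_date"}
deduced = forward_chain(observations, RULES)
print(sorted(deduced - observations))
```

The second rule can only fire after the first one has added looks_like_invoice, which shows how deduced knowledge feeds further deductions. It also shows the maintenance cost: every new document logic requires someone to write another rule.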
Machine learning does not require that expertise. Today, deep learning neural networks are extremely effective, as a Cloud-based approach makes it possible to collect huge volumes of documents, and data centers have enough capacity to handle the data. A neural network is a mathematical, statistical model inspired by the organization of biological neurons, making it possible to approximate nearly any function, especially any recognition or decision-making function.
Learning consists of statistically calculating thousands of network parameters based on document examples. This learning phase is critical. However, without a sufficient volume of examples, such learning may not be possible.
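The sketch below shows, at the smallest possible scale, what calculating parameters from examples means. It trains a single artificial neuron (logistic regression) by gradient descent, so it has three parameters instead of thousands. The two features and the four labeled examples are invented for illustration; a real network would learn from far more data.

```python
import math


def train(examples, labels, epochs=500, learning_rate=0.5):
    """Fit the weights and bias of one neuron by stochastic gradient descent."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            prediction = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
            error = prediction - y
            # Move each parameter slightly in the direction that reduces the error.
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the neuron's output is above 0.5, else 0."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if z > 0 else 0


# Hypothetical features: [share of numeric characters, contains the word "total"].
examples = [[0.9, 1.0], [0.8, 1.0], [0.2, 0.0], [0.1, 0.0]]
labels = [1, 1, 0, 0]  # 1 = invoice, 0 = other document

weights, bias = train(examples, labels)
print([predict(weights, bias, x) for x in examples])
```

With only four examples the neuron can separate these two groups, but nothing guarantees that it generalizes. This is the volume problem mentioned above: the more parameters a model has, the more examples it needs.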
Other models exist to cover that case, such as incremental learning. This type of model has the benefit of being able to evolve and learn from fewer examples. It can be fed with examples on an ongoing basis in order to improve its performance continually, without a massive training phase.
It can be implemented rapidly to automate processing for document types that are encountered relatively frequently. The limitation is that it cannot match the performance of deep learning technology in cases where deep learning can be applied.
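As a rough illustration of incremental learning, the sketch below uses a nearest-centroid classifier whose class averages are updated one example at a time. This is a deliberately simple stand-in chosen for this example, not the incremental model used at Yooz. The class name, the features, and the document labels are all hypothetical.

```python
class IncrementalCentroidClassifier:
    """Keeps one running-mean feature vector per label."""

    def __init__(self):
        self.centroids = {}  # label -> mean feature vector
        self.counts = {}     # label -> number of examples seen

    def learn_one(self, features, label):
        """Update the model with a single example, without retraining."""
        if label not in self.centroids:
            self.centroids[label] = list(features)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        centroid = self.centroids[label]
        for i, value in enumerate(features):
            # Running mean: shift the centroid towards the new example.
            centroid[i] += (value - centroid[i]) / n

    def predict(self, features):
        """Return the label of the closest centroid, or None if untrained."""
        if not self.centroids:
            return None

        def squared_distance(label):
            centroid = self.centroids[label]
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=squared_distance)


classifier = IncrementalCentroidClassifier()
classifier.learn_one([0.9, 1.0], "invoice")
classifier.learn_one([0.1, 0.0], "letter")
print(classifier.predict([0.8, 0.9]))

# A new document type can be added later from a single example.
classifier.learn_one([0.5, 0.5], "order_form")
print(classifier.predict([0.5, 0.6]))
```

The model is usable after one example per class and accepts new classes at any time, which matches the benefit described above. Its decision rule is also very simple, which is one way to picture why such models tend to fall short of deep learning when enough data is available.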
These systems are all complementary to each other. Our logical choice, and our strength, is to combine all these technologies in our “robots”.