Project Holiday: why we’re working hard, just like Tesla, to let AI do all the work
When you’re working in the world of artificial intelligence (AI), Tesla is a company you can’t afford to ignore. Their work on computer vision is ground-breaking, and even at this early stage in the development of self-driving cars, the results are impressive.
At Shipamax, we use AI in a very different domain—the automatic recognition, classification and extraction of data from invoices and other documents. But the methodologies and aspirations of companies like Tesla are still very relevant. So, when we saw Andrej Karpathy, Tesla’s Director of AI, talking about “Operation Vacation”, it immediately struck a chord.
Operation Vacation is an inspirational concept for Tesla’s AI team. The idea is to completely automate the infrastructure and processes supporting the development of AI, so that even if the entire AI team goes on holiday, their models can continue to ingest and learn from new examples and get smarter over time.
We like the sound of that. And thus, on the other side of the pond, Shipamax’s own “Project Holiday” was born.
A brief digression on understanding AI
If you’re a Shipamax client, you don’t necessarily need to know anything about what AI is or how it works in order to benefit from our document automation solutions.
However, if you’re interested in learning a bit more about how our document processing workflows actually work, let’s explain a little about our approach to building AI, and why Project Holiday is such an important aspiration.
Our AI model has two main tasks.
- First, it needs to recognise and classify a scanned document—for example, is it an accounts payable invoice, a commercial invoice, a bill of lading, or something else?
- Second, it needs to locate and extract important data from each document, such as job reference numbers, line items, and amounts.
Unlike traditional approaches to document automation, we don’t require our clients to create templates to provide our system with explicit instructions on how to recognise and extract data from all the different types of document. Instead, we supply our AI with a large training data set containing example documents, annotated with the data that human experts have extracted from them.
The AI’s job is to learn the patterns that allow it, for any given document in the training data set, to produce the same results that the human expert produced. This is called the model training process, and when it’s complete, the result is a model that we can test to see how accurately it can classify and extract data from new documents.
The big advantage of this approach is that when it’s done well, the result is a model that is capable of generalisation. During training the model discovers patterns, and it can still make an accurate prediction when those patterns change slightly. For example:
- If the model learns a pattern associated with invoice numbers, it should also be able to identify invoice numbers that look different: the length, the position on the page, or the surrounding text can all vary.
- If the model learns to identify a company’s logo, it should still be able to identify that same logo when the image is noisy, e.g. grainy or pixelated.
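To make the idea of learning from annotated examples a little more concrete, here is a deliberately tiny sketch in Python. It is a toy word-counting classifier, not a description of how any production document model works; the function names, the labels and the four-example training set are all invented for illustration.

```python
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()

def train(examples):
    """Count how often each word appears under each human-assigned label."""
    model = defaultdict(Counter)
    for text, label in examples:
        model[label].update(tokenize(text))
    return model

def classify(model, text):
    """Pick the label whose training vocabulary overlaps most with the text."""
    words = tokenize(text)
    scores = {
        label: sum(counts[word] for word in words)
        for label, counts in model.items()
    }
    return max(scores, key=scores.get)

# Hypothetical annotated training set: (document text, label from a human expert).
TRAINING_SET = [
    ("invoice number 1001 amount due 250.00", "invoice"),
    ("invoice no 2002 total amount 90.10", "invoice"),
    ("bill of lading vessel port of loading", "bill_of_lading"),
    ("bill of lading shipper consignee vessel", "bill_of_lading"),
]

model = train(TRAINING_SET)

# A document whose exact wording never appeared in the training set.
unseen = "tax invoice ref 77-A total due 13.50"
print(classify(model, unseen))
```

Nobody wrote a rule saying that “total due” means invoice; the toy model picked that up from the examples, which is the essence of the template-free approach. Real models learn far richer patterns, including layout and visual features, but the principle is the same.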
How do AI models learn over time?
This ability to generalise is one of the greatest advantages of machine learning. When the model encounters a new document in a format that it has never seen before, it can use its knowledge of previous documents to infer what type of document it is and how to extract the relevant pieces of data from it, just as a human would. This makes the process much more robust and easier to maintain than a manual, template-based approach.
No AI model is perfect, but in principle, it should always be possible to make models more accurate through repeated cycles of training and testing. This is where things get tricky, because retraining models is a non-trivial task, and there are important trade-offs to consider.
Why retraining from scratch isn’t scalable
Probably the first option you’d think of is to retrain the model from scratch, using the entire original training data set plus examples of the new document type. However, this has two key drawbacks.
First, for any model of appreciable size, training is a time-consuming and expensive process, and the brute-force approach of retraining from scratch is neither economically viable nor fast enough to keep pace with changing requirements.
Second, retraining is likely to change a model’s behaviour. In learning how to handle documents from a new client, it may get worse at recognising other documents that it handled perfectly well before. In its most severe form, typically seen when a model is updated on new examples alone, this is known as “catastrophic forgetting”.
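One simple way to make this kind of forgetting visible is to compare the old and new models on a fixed set of documents with known answers, and flag anything the old model got right that the new one gets wrong. The sketch below assumes predictions are available as plain dictionaries keyed by document ID; the function name and the sample data are hypothetical.

```python
def find_regressions(old_predictions, new_predictions, expected):
    """Return IDs of documents the old model got right but the new one gets wrong."""
    return sorted(
        doc_id
        for doc_id, truth in expected.items()
        if old_predictions.get(doc_id) == truth
        and new_predictions.get(doc_id) != truth
    )

# Hypothetical held-out documents with the labels a human expert assigned.
expected = {
    "doc-1": "invoice",
    "doc-2": "bill_of_lading",
    "doc-3": "invoice",
}
# Predictions before and after retraining (invented for illustration).
old_predictions = {
    "doc-1": "invoice",
    "doc-2": "bill_of_lading",
    "doc-3": "bill_of_lading",
}
new_predictions = {
    "doc-1": "invoice",
    "doc-2": "invoice",
    "doc-3": "invoice",
}

regressions = find_regressions(old_predictions, new_predictions, expected)
print(regressions)  # → ['doc-2']
```

Here the retrained model has fixed doc-3 but broken doc-2, so its overall accuracy is unchanged even though its behaviour is not. That is why headline accuracy alone is not enough to judge a retrained model.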
Where Project Holiday fits in
What we’re aiming for with Project Holiday is that our AI should be able to learn perfectly from all the corrections our customers provide, without forgetting any of the things it has previously learnt. This will provide a huge quality-of-life improvement for our internal data science team, because it will dramatically reduce the need for their hands-on involvement in the model training process (and of course, they’ll finally all be able to go on holiday together).
More importantly, Project Holiday will unlock significant benefits for our clients too. One of the additional unique advantages of Shipamax’s approach to document automation is that we use a community-powered model. In essence, instead of building a separate model for each of our clients and training it on their data alone, we have one central model that learns from all the data.
This means that if your company receives a document that you’ve never seen before, there is a high likelihood that our AI has seen something similar from other customers and is therefore able to interpret it. So you benefit from the experience of the whole Shipamax community and start from a much higher baseline.
Once Project Holiday comes to fruition, the benefits of this community model will be force-multiplied. Every time any of our clients submits a correction, we can incrementally retrain our model and everyone will benefit:
- Your results will get even more accurate
- Document processing will be even faster
- And straight-through processing rates should soar
Turning aspiration into achievement
Are we there yet? No. Is Tesla there yet? Well, Elon Musk claims to work 120 hours a week, so I guess we can assume that Operation Vacation isn’t quite ready either.
But Shipamax’s data science team is committed to finding the best, smartest ways to turn leading-edge AI research into business value.
Our community model already learns from corrections, but we’re working on improving this so that it doesn’t forget what it already knows when it learns new things. We’re also trying out ways to help it learn from a greater proportion of corrections.
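For readers who like to see ideas in code, one widely discussed technique in the continual-learning literature is rehearsal (also called replay): each incremental update mixes the new corrections with a sample of older examples, so the model keeps being reminded of what it already knows. The sketch below is purely illustrative and is not a description of any production training pipeline; `build_replay_batch`, its parameters and the sample data are assumptions made for the example.

```python
import random

def build_replay_batch(new_corrections, past_examples, replay_size, seed=0):
    """Mix every new correction with a random sample of older examples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    k = min(replay_size, len(past_examples))
    replayed = rng.sample(past_examples, k)
    batch = list(new_corrections) + replayed
    rng.shuffle(batch)  # avoid feeding all new examples in one block
    return batch

# Hypothetical data: (document ID, corrected label) pairs.
past_examples = [(f"old-{i}", "invoice") for i in range(10)]
new_corrections = [("new-1", "bill_of_lading"), ("new-2", "bill_of_lading")]

batch = build_replay_batch(new_corrections, past_examples, replay_size=3)
print(len(batch))  # 2 new corrections + 3 replayed examples
```

The trade-off is in `replay_size`: replay too little and older knowledge fades, replay too much and each update becomes nearly as expensive as retraining from scratch.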
We are looking at automatic annotation of error examples and investing in state-of-the-art models with high generalisation ability. Continual learning, that is, training a model on new datasets while maintaining its accuracy on previous data, is a difficult problem and an active field of research that Shipamax is striving to master, so watch this space. And in the meantime, our existing community-powered model already puts Shipamax clients way ahead of the game with accurate, cost-effective document automation that saves time, reduces costs, and eliminates whole classes of errors.
If you’d like to learn more about how Shipamax can help your business harness state-of-the-art AI to automate your document processes, reach out to us today.