Organize your mailroom with Machine Learning and save the day

Modern mailrooms employ a slew of techniques to ensure all incoming communications are properly sorted, filed and sent to the right recipients, based on a range of metadata captured from this inbound correspondence. This works fine most of the time, but there are still occasions when these techniques fall short. Machine learning (ML) can save the day, and adding it to your system might be easier than you realize. We’ll run you through three scenarios where ML can extract and classify communications, and become your mailroom hero.

Scenario 1: defining your mailroom rules

Digital mailrooms usually sort documents and incoming client communications according to a set of rules. First, the business defines a set of document types, ranging from complaints and contracts to more complex files such as medical documents and European Accident Statements. Next, the business is asked to come up with typical phrases or words that might identify each document type, and rules are defined on the basis of this user input. This manual process is time-consuming, and it turns out that users struggle to come up with these typical words and phrases on the spot, especially since there’s so much variety in language. Moreover, as the number of document types increases, so does the difficulty of manually selecting keywords.
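
To make the limitation concrete, here is a minimal sketch of such a keyword-based rule set. The document types and phrases are hypothetical examples, not taken from a real mailroom configuration, and the scoring (counting phrase occurrences) is just one plausible way to apply the rules.

```python
# Hypothetical rules: each document type maps to phrases supplied by the business.
RULES = {
    "complaint": ["complaint", "not satisfied", "refund"],
    "contract": ["agreement", "hereby", "terms and conditions"],
    "accident_statement": ["accident", "collision", "vehicle"],
}


def classify_by_rules(text, rules=RULES):
    """Return the document type whose phrases occur most often, or None."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(phrase) for phrase in phrases)
        for doc_type, phrases in rules.items()
    }
    best_type = max(scores, key=scores.get)
    # No phrase matched at all: the document cannot be routed by these rules.
    return best_type if scores[best_type] > 0 else None


print(classify_by_rules("I am not satisfied and would like a refund."))
print(classify_by_rules("Please cancel my subscription."))
```

The first message matches two complaint phrases. The second matches none of the listed phrases and stays unrouted, which is the kind of gap that grows as wording varies and document types multiply.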

Machine learning can help by completely taking over this process. Classification algorithms sift through heaps of files to come up with initial categories and the characteristics that identify them. Business and IT can then label these documents and judge whether the sorting is accurate, so the algorithms can be trained further with additional data in a supervised learning process. Users validate the choices made by the AI, so the algorithm knows whether it’s on the right track or needs adjusting. Employing machine learning improves the accuracy of your classification and extraction processes over time, whereas manually maintained rules tend to lose accuracy as language and document types change.
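
The supervised loop described above can be sketched with a tiny text classifier. The article does not say which algorithm is used, so this example assumes a multinomial Naive Bayes model with add-one smoothing as a stand-in. The class name, the training sentences and the labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSorter:
    """Tiny multinomial Naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def learn(self, text, label):
        """Add one labelled document; also used for user-validated corrections."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                # Unseen words get a smoothed count of 1 instead of being skipped.
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Initial training on a handful of labelled (hypothetical) documents.
sorter = NaiveBayesSorter()
sorter.learn("i want a refund for this faulty product", "complaint")
sorter.learn("the service was terrible and i am unhappy", "complaint")
sorter.learn("signed agreement between both parties attached", "contract")
sorter.learn("please find the contract terms attached", "contract")
first_guess = sorter.predict("unhappy with the product")

# A user rejects a prediction and supplies the right label: the model learns it.
sorter.learn("my car was hit in a collision", "accident_statement")
second_guess = sorter.predict("collision with another car")
print(first_guess, second_guess)
```

Each validated or corrected document goes back in through `learn`, so a new document type such as the accident statement becomes recognizable without anyone writing keyword rules for it.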

Scenario 2: consolidating multiple mailrooms

When companies merge into one, administrative processes and mailrooms also need to be consolidated. Each of these digital mailrooms usually has a number of defined document types and a set of rules for labelling them. Lumping them all together would create excess, overlapping categories, and sifting and adjusting them manually is virtually impossible. For machine learning, however, this is just business as usual.

A confusion matrix, also called an error matrix, is well suited to reducing complexity and ambiguity in your mailroom. It tabulates the output of your classification and extraction algorithm against the correct labels, and the results can then be used to identify similar classifications, a lack of data and other possible issues. In essence, it gives you insights into your classification which you wouldn’t be able to gain using a manual process. Based on the conclusions you draw from the matrix, you can then fine-tune the sorting further, combine document types, and narrow down your categories, so you’re ultimately left with fewer, clearly distinct document types and a streamlined mailroom.
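
As an illustration, a confusion matrix can be built by counting pairs of correct and predicted document types. The sketch below uses made-up labels and results. The helper names `confusion_matrix` and `per_type_report` are hypothetical and not part of any product API.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) pairs; rows are true types, columns predicted."""
    return Counter(zip(true_labels, predicted_labels))


def per_type_report(matrix):
    """Return {doc_type: (support, recall)} computed from the matrix."""
    support = Counter()
    correct = Counter()
    for (true, predicted), count in matrix.items():
        support[true] += count
        if true == predicted:
            correct[true] += count
    return {t: (support[t], correct[t] / support[t]) for t in support}


# Hypothetical validation results: what users said vs. what the model chose.
true_labels = ["invoice", "invoice", "invoice", "bill", "bill", "bill", "claim"]
predicted = ["invoice", "bill", "invoice", "invoice", "bill", "invoice", "claim"]

matrix = confusion_matrix(true_labels, predicted)
labels = sorted(set(true_labels) | set(predicted))

# Print the matrix as a small table: one row per true document type.
print("true/pred".ljust(10) + "".join(label.ljust(9) for label in labels))
for true in labels:
    row = "".join(str(matrix[(true, pred)]).ljust(9) for pred in labels)
    print(true.ljust(10) + row)

report = per_type_report(matrix)
for doc_type, (support, recall) in sorted(report.items()):
    print(f"{doc_type}: {support} documents, recall {recall:.2f}")
```

In this toy data, "bill" and "invoice" are frequently mistaken for each other, which hints that the two types overlap, while "claim" rests on a single document, which hints at a lack of data.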

Scenario 3: fine-tuning sorting rules

A final scenario is a combination of the first two. As companies grow, products are added, and communications and terms change. Over time, your initial mailroom rules no longer suffice. They need constant adjustment, with new document types added to keep up with the changing business. This often leaves a company with more categories than it needs, and a lot of work every month to adjust the sorting criteria.

Classification and extraction with machine learning can kill two birds with one stone here. As mentioned earlier, a confusion matrix is used to analyze document classes and identify potential problems. It can then be used to narrow down the number of document types by looking for categories where there is no clear distinction. The classification models are self-learning and can be continuously updated behind the scenes with minimal effort to account for slight shifts in communications and products. Thanks to machine learning, you’re saving time, effort and ultimately costs while ensuring high sorting accuracy.
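
One way to look for categories without a clear distinction is to measure how often two types are mistaken for each other. The following sketch assumes a confusion matrix stored as a dictionary of counts. The function name, the 0.3 threshold and the numbers are hypothetical choices for illustration.

```python
def merge_candidates(matrix, threshold=0.3):
    """Return pairs of types whose mutual confusion rate is at or above threshold.

    matrix maps (true_type, predicted_type) -> number of documents.
    """
    # Total number of documents per true type.
    totals = {}
    for (true, _), count in matrix.items():
        totals[true] = totals.get(true, 0) + count

    types = sorted(totals)
    candidates = []
    for i, first in enumerate(types):
        for second in types[i + 1:]:
            # Documents of either type that were labelled as the other one.
            confused = matrix.get((first, second), 0) + matrix.get((second, first), 0)
            rate = confused / (totals[first] + totals[second])
            if rate >= threshold:
                candidates.append((first, second, round(rate, 2)))
    return candidates


# Hypothetical counts, e.g. gathered from a month of validated documents.
matrix = {
    ("quote", "quote"): 30, ("quote", "offer"): 18,
    ("offer", "offer"): 25, ("offer", "quote"): 22,
    ("complaint", "complaint"): 47, ("complaint", "quote"): 3,
}

print(merge_candidates(matrix))
```

Here "offer" and "quote" are confused in roughly 42% of their documents, so they are flagged as a pair worth reviewing for a merge, while "complaint" stays separate. The final decision to combine types remains with the business.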

Adding machine learning shouldn’t be a hassle

Many companies already have a digital mailroom, but often without artificial intelligence. At Docbyte, however, we believe that adding these classification and extraction algorithms doesn’t have to mean implementing an entirely new mailroom – a costly and time-consuming affair. So we’ve included the AI in our own mailroom software and packaged it in a module which you can easily add to your current solution using REST APIs. With our module, you’ll be adding innovative artificial intelligence techniques such as natural language processing to your mailroom, ensuring high accuracy, accelerating your business digitally, and more – all in the blink of an eye.

Want to find out what our classification and extraction module could mean for your organization? Reach out to our experts now. We’ll be more than happy to help!


