Organizations create and store large volumes of content but often know too little about what that content represents. Giving content meaning requires classifying it up front, and manual classification is time-consuming and expensive. Artificial Intelligence (AI) and Machine Learning (ML) capabilities can do this far faster and often more accurately. The added meaning also improves content discovery and analysis. ProcessMaker IDP leverages AI and ML technologies to uncover the true value of content. This article gives more insight into the unique cognitive capabilities of ProcessMaker IDP.
Post-OCR Correction
ProcessMaker IDP uses Optical Character Recognition (OCR) technology to extract text from documents. This makes documents searchable, and the extracted text can be used for further processing. If the OCR service cannot recognize words correctly, e.g., due to poor image quality or stains, our Post-OCR Correction service corrects them. It combines a content-based correction model with a dictionary to fix the identified mistakes, which boosts the quality of your OCR data.
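The dictionary half of such a correction step can be illustrated with a minimal sketch. This is not ProcessMaker IDP's implementation: the vocabulary, the function names `correct_token` and `correct_text`, and the similarity cutoff are all assumptions chosen for illustration, and the content-based correction model is left out entirely. The sketch uses the `difflib` module from the Python standard library to replace an unrecognized token with its closest dictionary word.

```python
import difflib

# Hypothetical vocabulary; a real dictionary would be far larger.
VOCABULARY = ["invoice", "total", "amount", "payment", "date", "number"]


def correct_token(token, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest dictionary word, or the token unchanged.

    Tokens without any letters (for example "2023") and tokens already
    in the vocabulary are left alone. Corrected tokens come back in
    lowercase because the vocabulary is lowercase.
    """
    lowered = token.lower()
    if lowered in vocabulary or not any(c.isalpha() for c in lowered):
        return token
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(text):
    """Correct every whitespace-separated token in a line of OCR output."""
    return " ".join(correct_token(token) for token in text.split())


corrected = correct_text("lnvoice tota1 amount 2023")
```

The cutoff controls how aggressive the correction is: a lower value repairs more damaged words but also raises the risk of replacing a valid word that simply is not in the dictionary.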
Big Data Clustering
Machine Learning by itself is not the holy grail since it only works if a proper training set is available. Creating the training set involves substantial human effort when working with high volumes of many different types of documents. ProcessMaker IDP uses clustering techniques to identify similar documents automatically. This speeds up the preparation of a training set and provides more accurate insight into the relations between documents.
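How grouping similar documents can reduce labelling effort is shown in the sketch below. It assumes a deliberately simple technique, greedy single-pass clustering on Jaccard similarity of word sets, which is not necessarily what ProcessMaker IDP uses. The function names, the sample documents, and the threshold of 0.3 are hypothetical.

```python
def tokenize(text):
    """Represent a document as the set of its lowercase words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_documents(documents, threshold=0.3):
    """Greedy single-pass clustering.

    Each document joins the first cluster whose first member is at
    least `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []
    for index, text in enumerate(documents):
        tokens = tokenize(text)
        for cluster in clusters:
            representative = tokenize(documents[cluster[0]])
            if jaccard(tokens, representative) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([index])
    return clusters


documents = [
    "invoice number total amount due",
    "invoice total amount payment due",
    "employment contract between employer and employee",
    "contract between employer and contractor",
]
clusters = cluster_documents(documents)
```

With clusters like these, a reviewer can label one group at a time instead of inspecting every document individually, which is where the time saving in training-set preparation comes from.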
Evolutionary Machine Learning
Proper Machine Learning requires a training set and a test set, a number of implemented classifiers, and finally an evaluation of the classifiers trained on the training data against the test set. To improve results, a data scientist normally has to test classifiers manually and compare the outcomes. With ProcessMaker IDP Evolutionary Machine Learning, the system dynamically determines the best-performing classifier instead of relying on fixed classifiers with a fixed training set.
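The manual compare-and-pick loop described above can be automated in a few lines. The sketch below is only an illustration of automated classifier selection on a held-out test set: it compares every candidate exhaustively rather than performing an evolutionary search, and it does not reflect how ProcessMaker IDP works internally. Both toy classifiers, all function names, and the sample data are hypothetical.

```python
from collections import Counter


def train_majority(training_set):
    """Baseline: always predict the most frequent training label."""
    top_label = Counter(label for _, label in training_set).most_common(1)[0][0]
    return lambda text: top_label


def train_keyword_overlap(training_set):
    """Predict the label whose training vocabulary shares most words."""
    vocabulary = {}
    for text, label in training_set:
        vocabulary.setdefault(label, set()).update(text.lower().split())

    def predict(text):
        tokens = set(text.lower().split())
        return max(vocabulary, key=lambda label: len(tokens & vocabulary[label]))

    return predict


def accuracy(predict, labelled_set):
    """Fraction of examples for which the prediction matches the label."""
    correct = sum(1 for text, label in labelled_set if predict(text) == label)
    return correct / len(labelled_set)


def select_best_classifier(candidates, training_set, test_set):
    """Train every candidate and return the best name plus all scores."""
    scores = {
        name: accuracy(train(training_set), test_set)
        for name, train in candidates.items()
    }
    return max(scores, key=scores.get), scores


training_set = [
    ("invoice total amount due", "invoice"),
    ("invoice payment number", "invoice"),
    ("invoice amount payment", "invoice"),
    ("contract between employer and employee", "contract"),
]
test_set = [
    ("total amount due on invoice", "invoice"),
    ("contract with employee", "contract"),
]
candidates = {
    "majority": train_majority,
    "keyword_overlap": train_keyword_overlap,
}
best, scores = select_best_classifier(candidates, training_set, test_set)
```

Because selection is driven purely by the measured score, new candidate classifiers or new training data can be added without changing the selection logic, which is the property a dynamic approach relies on.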


