Synthetic invoice that has simulated rips and stains

This shows a synthetic invoice that has been annotated with bounding boxes.

Mindtech, the company enabling 10x faster deployment of robust computer vision systems, is offering a dataset of 1000 synthetic invoices free of charge.

SHEFFIELD, SOUTH YORKSHIRE, UNITED KINGDOM, September 30, 2024 /EINPresswire/ -- Mindtech, the company enabling 10x faster deployment of robust computer vision systems through its synthetic data platform, today announced that it is offering a dataset of 1000 synthetic, fully annotated invoices, including several “distressed” documents, free of charge. The release coincides with the company exhibiting at the Intelligent Automation Conference Europe, part of TECHEX, on 1-2 October 2024 in Amsterdam.

The datapack, which was created using Mindtech's advanced intelligent synthetic data creation platform, Chameleon, consists of high-quality, photo-real images with fully accurate annotations, and will enable developers to train and test advanced document processing systems. Applications include advanced OCR and deep document understanding and analysis. These technologies serve a multitude of industries, including archiving and auditing. The dataset is available to customers via the Snowflake data warehouse and is supplied with a license allowing commercial exploitation. A summary of the datapack can be seen on YouTube.

Alongside the release of this datapack to the ML community, Mindtech has also released several other packs, covering multiple languages and document types, which can be found on Mindtech's website.

Chris Longstaff, COO at Mindtech, commented: “Previous invoice datasets, real or synthetic, have focused purely on ‘perfect’ documents, whereas for building robust systems, we need to have documents with defects such as creases, rips, tears and stains. Of equal importance is that these documents should be free of privacy and GDPR concerns. Mindtech's synthetic platform allows for creation of these real-world documents, rather than non-real-world examples that, unfortunately, fail to represent the real world. With this dataset, Mindtech are allowing data scientists and vision engineers access to a set of documents as would typically be found in the real world, but without concerns over GDPR and privacy.”

In addition to this launch, Mindtech is excited to exhibit at the Intelligent Automation Conference Europe, part of TECHEX, on 1-2 October 2024 in Amsterdam. This event offers industry leaders and peers a prime opportunity to explore the forefront of AI-driven innovation. Visitors are encouraged to stop by Stand 328 to discover how Mindtech's DataOps platforms revolutionise vision data analysis and facilitate synthetic data creation for AI training and testing. Whether your focus is on intelligent document processing or robotic vehicle simulation for warehouse automation, Mindtech provides solutions to inspire and advance your AI initiatives.

About Mindtech

Mindtech Global is the developer of intelligently engineered synthetic data, enabling better AI models through data analysis, visualisation, and curation. Mindtech's Data Ops Platform delivers a step change in the way AI vision systems are trained, helping computers understand and predict human interactions in applications ranging across retail, smart home, healthcare, and smart cities.

Mindtech is headquartered in the UK and is funded by investors including Mercia, Deeptech Labs, In-Q-Tel, Appen and Edge

Mindtech Synthetic Invoices DataPack

