Mindtech, the developer of the world’s leading synthetic data creation platform, has today announced the launch of Dolphin, an AI-enhanced, automated dataset analysis platform to help customers solve data issues.
The platform provides insights, statistics, and visualisations for datasets with real and synthetic images. It compares and contrasts datasets based on their semantic content such as object occurrence rates and sizes, and image characteristics such as average brightness or entropy. Dolphin also extracts image embeddings from labelled or unlabelled images to enable their visualisation within the latent network space, providing unique insights into the data fit.
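As an illustration of the kind of image characteristics mentioned above, the following is a minimal sketch of how average brightness and entropy could be computed and compared across two sets of images. It is not Dolphin's implementation or API: the function names (`mean_brightness`, `shannon_entropy`, `summarise`) are hypothetical, the images are assumed to be 8-bit greyscale NumPy arrays, and the "real" and "synthetic" sets are randomly generated placeholders.

```python
import numpy as np


def mean_brightness(image: np.ndarray) -> float:
    """Mean pixel intensity of an 8-bit greyscale image, in [0, 255]."""
    return float(image.mean())


def shannon_entropy(image: np.ndarray) -> float:
    """Shannon entropy, in bits, of the 256-bin intensity histogram."""
    counts = np.bincount(image.ravel(), minlength=256)
    probs = counts[counts > 0] / image.size  # drop empty bins before taking logs
    return float((probs * np.log2(1.0 / probs)).sum())


def summarise(images: list) -> dict:
    """Average the per-image statistics over a dataset (hypothetical helper)."""
    return {
        "mean_brightness": float(np.mean([mean_brightness(i) for i in images])),
        "mean_entropy": float(np.mean([shannon_entropy(i) for i in images])),
    }


# Placeholder data: noisy images stand in for a real dataset,
# flat grey images stand in for a synthetic one.
rng = np.random.default_rng(0)
real = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(4)]
synthetic = [np.full((64, 64), 128, dtype=np.uint8) for _ in range(4)]

print("real:     ", summarise(real))
print("synthetic:", summarise(synthetic))
```

A large gap between the two summaries, as in this contrived case, is the sort of signal that would suggest one dataset does not match the other's domain.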
Dolphin is designed to fit within existing MLOps workflows by supporting industry-standard and open dataset formats. It is fully containerised, allowing easy deployment on-premises or in the cloud, which optimises data access and minimises data movement and duplication. This flexibility and scalability enable the efficient processing of extremely large datasets.
Another benefit of the Mindtech approach is that users can analyse their datasets and provide key statistical and structural information to a third party without having to share the original sensitive data. For example, this information can guide the generation of synthetic data from the Mindtech Chameleon platform, filling diversity and bias gaps while maintaining the required domain matching.
“We have introduced Dolphin to solve a critical customer issue,” commented Steve Harris, CEO of Mindtech. “We believe that data-centric AI is the future. Users must be able to understand their application by analysing their real-world data and using the insights gained to create optimal training data. Dolphin is the world’s first and only platform to enable that vision.”