Clearbox AI is an award-winning tech startup specializing in synthetic data for privacy preservation and data augmentation. The SURE team, led by CEO Shalini Kurapati, CTO Luca Gilli, and Machine Learning Engineer Dario Brunelli, has an impressive track record in both R&D and commercial projects centered around trustworthy AI, with a focus on data-centric AI, privacy, and fairness in the AI lifecycle. In this interview, the team discusses their work on the SURE project, which aims to set a new standard for evaluating the privacy and utility of synthetic data. They also highlight how TrustChain has been key in supporting the project’s development and future potential.
Can you briefly present your team?
Clearbox AI is a highly innovative and award-winning tech startup that operates in the field of synthetic data for privacy preservation and data augmentation. Our SURE team is composed of Clearbox AI's CEO Shalini Kurapati, our CTO Luca Gilli, and our Machine Learning Engineer Dario Brunelli. Clearbox AI has a strong track record in both R&D and commercial projects in the field of Trustworthy AI, with a focus on data-centric AI, privacy, and fairness across the AI lifecycle.
How did you come up with this project idea and what benefits will it bring to the end users?
This library was born from the need for a unified and comprehensive framework for evaluating both the utility and the privacy of synthetic data. By making this library open-source, we aim to provide stakeholders with the tools required to measure and understand the implications of using synthetic data in diverse contexts. The primary objectives are enhancing user privacy, ensuring data utility for AI training, and facilitating regulatory compliance, particularly in alignment with the General Data Protection Regulation (GDPR).
How is TrustChain supporting your growth and what role does it play for the next steps in your development?
TrustChain is playing a pivotal role in this project's growth by actively supporting the development of the SURE library. By backing SURE, TrustChain is significantly enhancing its potential for widespread adoption and helping establish it as a standard for democratising access to secure and privacy-compliant data solutions, which sets new benchmarks for the industry in this emerging field.
As the first tool of its kind, SURE is poised to become a leading technology in synthetic data assessment. TrustChain’s support not only accelerates SURE’s development but also positions it to evolve into the state-of-the-art solution for evaluating synthetic data.
Why did you apply to the TrustChain call and has your vision changed since then?
Our goal was to develop an open-source tool that anyone can use to assess the security and privacy of synthetic data. TrustChain’s mission perfectly aligns with our vision for SURE, making it an ideal partnership to bring our project to life. Since applying, our commitment to this vision has only strengthened, as we see the growing importance of accessible, reliable tools in the rapidly evolving field of synthetic data.
What is the most valuable takeaway from the TrustChain project and why was the topic of the Open Call important to you/your team?
As a company, we mainly operate in the field of AI and synthetic data, with a focus on applications of synthetic data for privacy preservation. The TrustChain Open Call #2 was perfectly aligned with the scope of our project: developing an open-source software library for evaluating the privacy and utility performance of synthetic datasets.
For Clearbox AI's SURE project, the key takeaway from TrustChain Open Call #2 is the importance of building trust in AI by addressing privacy concerns in synthetic data training.
Did you establish collaboration with any of the TrustChain teams or plan for any kind of synergies? If yes, what is the biggest potential in such collaborations?
We have interacted with three Open Call #2 teams who used our module, and one of them proposed to continue the collaboration by exploring joint R&D initiatives in the healthcare AI data space.
What are your expectations regarding the TrustChain software ecosystem and its contribution to the NGI priority areas?
We would like to continue to be part of the TrustChain community and ecosystem to raise awareness and enhance adoption of privacy-enhancing technologies such as synthetic data, which are crucial but not yet very common in the overall TrustChain system.
What are the next steps for your team?
Our next steps involve continuously developing and enhancing the SURE library, with a strong focus on refining its features based on user feedback. We aim to build an active community of users and contributors who will collaborate with us in shaping and advancing this project, ensuring it meets the evolving needs of the synthetic data community.
We are also in the process of applying this know-how to commercial use cases in the field of data auditing, and of operationalizing it to align with upcoming regulations such as the AI Act.
What is the message you would give to new and potential applicants to TrustChain Open Calls?
Applying to TrustChain Open Calls was a fantastic experience, offering valuable collaboration opportunities and support throughout the process. We managed to build a highly impactful software technology while staying grounded in use-centricity. I highly encourage new and potential applicants to seize this chance to be part of an innovative and impactful initiative.