Mostly AI, a pioneer in structured synthetic data, has launched its synthetic text functionality, which gives Fortune 500 companies, including Databricks and Amazon Web Services (AMZN), access to a “vast amount of custom text” to train and fine-tune large language models (LLMs) without compromising user privacy, the report said.
The Mostly AI platform allows users to upload original text data, such as emails and customer support call transcripts, and choose an open-source language model from Hugging Face to generate the synthetic data. The original data is used to fine-tune the LLM on the Mostly AI platform, which then generates synthetic text that can be downloaded or stored in a database.
“Today, AI training is reaching a plateau as models deplete public data sources and produce diminishing returns,” said Tobias Hann, CEO of Mostly AI, in a statement. “To leverage high-quality proprietary data, which offers far greater value and potential than the remaining public data currently in use, global enterprises must take the leap and leverage both structured and unstructured synthetic data to secure upcoming generative AI solutions training and deployment.”