Tag: #learningcurve

  • The Learning Curve, Part 4: A New AI Model and an Evolving Language

    As Samsung continues to lead the way in providing top-notch mobile AI experiences, we take a look at Samsung Research centers across the globe to understand how Galaxy AI is empowering users to reach their full potential. With support for 16 languages, Galaxy AI allows more individuals to enhance their language skills, even without an internet connection, thanks to on-device translation in features like Live Translate, Interpreter, Note Assist, and Browsing Assist. But what exactly goes into developing AI language capabilities? In our previous visit to Vietnam, we explored the process of preparing data for training AI models. This time, we delve into how teams have made Galaxy AI a distinctive offering for both the Chinese mainland and Hong Kong.

    AI tools built on large language models (LLMs) have grown rapidly worldwide, and China is no exception. With Baidu’s ERNIE Bot and Meitu’s MiracleVision emerging as popular choices in China, Samsung R&D Institute China partnered with both companies to help build Galaxy AI features for the country.

    Samsung R&D Institute China in Guangzhou (SRC-G) and Beijing (SRC-B) focused on providing Mandarin speakers in China with a seamless Galaxy AI experience, even though the underlying technology may differ significantly. The team utilized the specialized resources of Chinese dialects from external partners and developed a distinctive Galaxy AI solution for China.

    “We have the advantage of blending global best practices with China’s local practices, as well as creating new features and constantly improving them through daily communication with Chinese consumers,” says Hairong Zhang, Software Innovation Group Leader at SRC-G. “With rich development experience from the Galaxy S24, I’m proud of how our team cooperated with local Chinese AI companies such as Baidu and Meitu to provide a solution that resonates in China.”

    At the beginning, the teams had to acclimate to each other’s working styles and iron out the initial kinks of information asymmetry. Daijun Zhang, Head of SRC-B, established a task force to ensure the project followed the development schedule and moved quickly toward its goals.

    Thanks to the Beijing team’s experience in generating large-scale models and successful collaboration with third-party partners, all the generative AI features were successfully launched in China. The result is a solution that has local relevance and market-specific features such as Touch to Search.

    Expanding on Chinese to Develop for the Cantonese Dialect

    Mandarin Chinese for mainland China was introduced to Galaxy AI when the Galaxy S24 was released in January 2024. However, the task for Samsung R&D Institute China was still incomplete. The team was given the responsibility of creating the AI model for Chinese in Hong Kong (Cantonese), a dialect that builds upon the previous work done for Mandarin but introduces a whole new set of language features to tackle.

    In developing for Cantonese, the China R&D team faced major cultural challenges that it needed to respond to in order to fully support localization for the market. The first cultural phenomenon is the two sets of systems for writing and speech. Hong Kong locals use grammar and expressions similar to Mandarin when writing but adopt a completely different colloquial grammar when communicating daily. Also, Cantonese has nine tones for pronunciation, whereas Mandarin has four.

    Another cultural phenomenon is that the Cantonese dialect itself evolves with the times. Add to that the fact that people often blend Cantonese and English in conversation, and it’s clear why creating test cases and validating language packs was complicated.

    “Cantonese is a very unique dialect that varies in different Cantonese-speaking regions,” says Jing Li, who leads the operation for testing the Cantonese AI solution. “Some of the slang, phrases, vocabulary and even the tones are varied from place to place. Therefore, we conducted a large amount of work in verifying the Hong Kong-specific data, as well as proofreading tens of thousands of relevant test cases.”

    With these complexities in mind, SRC-G and SRC-B worked together on three fronts: supporting deep code-mixing of Cantonese and English in speech recognition, handling both written and spoken expressions in machine translation, and reflecting current pronunciations in speech synthesis.

    Cultural Impact of Communication

    When Galaxy AI launched the Chinese (Hong Kong) language option, the customer feedback showed that the hard work of the Samsung R&D team was justified.

    For both the Chinese mainland and Hong Kong, Samsung’s Galaxy AI activities show the importance of a global brand having a local presence and expertise, as well as the power of open collaboration with other organizations. In Hong Kong, Cantonese is a key part of the cultural identity of those who live there. That’s why it was so important for the team to get the AI language model right.

    “Language and communication are crucial in every region and in all walks of life,” says Henry Wat, Head of Engineering Group at Samsung Electronics Hong Kong. “No matter the language, any tool that helps people communicate is invaluable. I believe our work is meaningful.”

    In the next episode of The Learning Curve, we will head to Brazil to see how a team works across cultures and borders to bring Galaxy AI to more people.

    Join Galaxy AI-Volution Squad Today!

    Purchase our latest Galaxy innovation

    To learn more about Galaxy AI, visit: https://www.samsung.com/my/galaxy-ai

  • The Learning Curve, Part 3: Mastering AI Data for Optimal Performance

    Samsung is at the forefront of developing cutting-edge mobile AI experiences. We are currently visiting Samsung Research centres around the world to explore how Galaxy AI is unlocking the full potential of its users. With support for 16 languages, Galaxy AI is now making it easier for individuals to enhance their language skills, even without an internet connection. This is made possible through on-device translation in various features like Live Translate, Interpreter, Note Assist, and Browsing Assist. During our recent trip to Jordan, we had the opportunity to delve into the intricacies of creating an AI model specifically designed for the Arabic language, which is known for its diverse range of dialects. For our next adventure, we’ll be heading to Vietnam to delve into the fascinating world of data preparation for training AI models.

    Can you explain the difference between a ghost, a grave, and a mother in Vietnamese? The language is spoken by 97 million people worldwide, yet it receives surprisingly little attention. The three words are “ma,” “mả,” and “má,” respectively, and the only way to tell them apart is by their tone. This highlights the challenge that AI models face when learning a language: they cannot grasp the context, emotions, and intentions of a conversation the way people do.

    Samsung R&D Institute Vietnam (SRV) utilised highly accurate data to enhance its AI model’s ability to accurately identify even the most nuanced variations in language.

    The accuracy of automatic speech recognition (ASR), neural machine translation (NMT), and text-to-speech (TTS) is directly influenced by the quality of data. These processes are crucial for Galaxy AI features like Live Translate, Interpreter, Chat Assist, and Browsing Assist, as they effectively overcome language barriers.

    A Storm of Challenges

    “Vietnamese is a language that is both intricate and varied, with a wide range of expressions that can be difficult to fully grasp,” explains Ngô Hồng Thái, the NMT lead at SRV. Developing Vietnamese as one of the 16 supported languages was quite challenging for Galaxy AI.

    “Creating an AI model for Vietnamese was quite a challenge,” he remarks, as he goes on to describe the obstacles encountered during the development process.

    Like other tonal languages, Vietnamese treats tone as an integral part of its linguistic structure, and it has six distinct tones. As the “ma,” “mả,” and “má” example above shows, slight variations in vocalisation can change the meaning of a word entirely. Thus, a careful and thorough approach was required.
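
    In writing, five of the six tones are marked with a diacritic, and the sixth (the level tone) is unmarked. The sketch below uses Unicode decomposition to separate a syllable's tone mark from its letters. The function name `split_tone` is made up for this illustration, and the code handles text only; it says nothing about how Galaxy AI processes tone in audio.

```python
import unicodedata

# Combining marks that encode the five marked Vietnamese tones.
TONE_MARKS = {
    "\u0300": "huyền",  # grave accent
    "\u0301": "sắc",    # acute accent
    "\u0303": "ngã",    # tilde
    "\u0309": "hỏi",    # hook above
    "\u0323": "nặng",   # dot below
}

def split_tone(syllable):
    """Return (syllable without its tone mark, tone name)."""
    tone = "ngang"  # the unmarked, level tone
    kept = []
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps vowel-quality marks such as the breve
    return unicodedata.normalize("NFC", "".join(kept)), tone

for word in ["ma", "mả", "má"]:
    print(word, split_tone(word))
```

    All three words reduce to the same letters, “ma”, and differ only in the tone that is returned, which is exactly the distinction a speech model has to hear rather than read.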

    “When words that sound similar are analysed, one word is made up of multiple short segments, or ‘frame sets’,” explains Bui Ngoc Tung, the ASR lead at SRV. The AI model is able to distinguish between short audio frames that last around 20 milliseconds. It can identify which words correspond to a specific sequence of frames. It is crucial to dedicate significant effort to the initial phases of the AI learning process.
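
    The idea of short audio frames can be sketched in a few lines. The example below cuts a list of audio samples into consecutive 20-millisecond frames. The 16 kHz sample rate, the function name `split_into_frames`, and the use of non-overlapping frames are assumptions for illustration; speech systems commonly use overlapping windows, and the source does not describe SRV's actual setup.

```python
def split_into_frames(samples, sample_rate_hz=16_000, frame_ms=20):
    """Cut samples into consecutive, non-overlapping frames of frame_ms each.

    A trailing partial frame is dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]

# One second of silence at the assumed 16 kHz sample rate.
one_second = [0.0] * 16_000
frames = split_into_frames(one_second)
print(len(frames), len(frames[0]))
```

    Under these assumptions, one second of audio yields 50 frames of 320 samples each, so even a short word spans many frames that the model has to map to a single result.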

    In addition, Vietnamese frequently includes homophones and homonyms. In conversation, people can rely on the surrounding context and nonverbal cues to tell apart words that sound or look alike but have different meanings. AI models, by contrast, must be trained to recognise and distinguish between similar tones and words.

    “This task is quite complex,” Thái explains. Recognising the subtle nuances of the Vietnamese language depends not only on the quantity of data but also on its accuracy.

    Thorough Preparation

    The process of refining the data involves three steps. First, the audio and text used for training the AI model are thoroughly reviewed and corrected. Next, the dataset undergoes random inspections to check its overall quality. Finally, the dataset is normalised and cleaned so that it is ready for training.
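
    The three steps can be sketched as a small text-only pipeline. This is a hedged illustration, not SRV's tooling: the record layout (`id` and `text` fields), the function names, the 10 percent sampling fraction, and the specific cleaning rules (Unicode NFC normalisation, whitespace collapsing, dropping empty transcripts) are all assumptions.

```python
import random
import unicodedata

def apply_corrections(records, corrections):
    """Step 1: replace transcripts that reviewers flagged as wrong."""
    return [{**r, "text": corrections.get(r["id"], r["text"])} for r in records]

def sample_for_inspection(records, fraction=0.1, seed=0):
    """Step 2: draw a reproducible random sample for a manual spot check."""
    k = max(1, round(len(records) * fraction))
    return random.Random(seed).sample(records, min(k, len(records)))

def normalise_text(text):
    """Step 3: apply NFC normalisation and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())

def prepare(records, corrections):
    """Run the three steps and return (cleaned records, spot-check sample)."""
    reviewed = apply_corrections(records, corrections)
    spot_check = sample_for_inspection(reviewed)
    cleaned = [
        {**r, "text": normalise_text(r["text"])}
        for r in reviewed
        if r["text"].strip()  # drop empty transcripts
    ]
    return cleaned, spot_check

records = [
    {"id": 1, "text": "  xin   chào "},
    {"id": 2, "text": "cam on"},   # reviewers flagged the missing diacritics
    {"id": 3, "text": "   "},
]
cleaned, spot_check = prepare(records, corrections={2: "cảm ơn"})
print([r["text"] for r in cleaned])
```

    Keeping the random sample seeded makes the spot check repeatable, which helps when the same batch has to be inspected again after fixes.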

    “We thoroughly performed a series of tests to check the accuracy of our dataset,” says Nguyen Manh Duy, TTS lead at SRV who oversees database creation. “We faced a number of unexpected problems including misspelled words in scripts and background noise or incorrect pronunciation during audio recordings. We spent significant time refining and improving our training data.”

    In addition to the unique linguistic challenges in Vietnamese, there is a lack of universally accessible data compared to more widely spoken languages. “This is another reason why the data refinement stage is so important,” he adds. “Since we had limited sources, every piece of data had to be fully reliable. There was no margin for error.”

    Moreover, the AI model for Vietnamese must consider both tonal and regional differences. To improve the AI model’s accuracy, the team collected vast amounts of data with Vietnam’s northern, central and southern accents — resulting in an enormous amount of information to refine and verify.

    Continued Improvement

    Developers at SRV completed the project after months of hard work, and Vietnamese became one of the first languages to be supported by Galaxy AI. Despite this success, the team is ceaselessly working to improve the Vietnamese Galaxy AI experience.

    “We’re continuing to enhance the AI model by incorporating user feedback about the relevance of words and phrases in Galaxy AI,” says Tran Tuan Minh, leader of the AI language development project at SRV. “We have just taken our first steps into a more open world — and we have so much more to explore together.”

    In the next episode of The Learning Curve, we will head to China to dig into how AI models are trained and fine-tuned.
