- Artūras Nakvosas, Technical Lead, NLP department, Neurotechnology

VILNIUS, LITHUANIA, August 27, 2024 /EINPresswire/ -- Neurotechnology, a provider of deep learning-based solutions and high-precision biometric identification technologies, today released its first open-source large language model (LLM) customized for the Lithuanian language.

Neurotechnology's LLMs are built on the transformer-based Llama 2 language model architectures with 7 billion and 13 billion parameters. To pretrain the models, the company's Natural Language Processing (NLP) team used a dataset of more than 14 billion tokens in Lithuanian. This allowed the team to inject the Lithuanian language into the Llama 2 language models.

“We are proud to contribute our LLM to the open-source community,” said Artūras Nakvosas, Technical Lead of the Natural Language Processing department at Neurotechnology. “By making it publicly available, we aim to encourage others to use it and expand the development of AI applications in Lithuanian.”

To accelerate the training process, Neurotechnology used NVIDIA H100 graphics processing units. Benchmarks conducted by the company showed that Neurotechnology's model outperforms the default Llama 2 in multiple areas, making it a reliable foundation for developing a wide range of AI applications in the Lithuanian language.

“The proposed open LLMs for the Lithuanian language are evaluated in multiple benchmarks. In addition, these models are fully transparent, which allows them to be efficiently utilized in both commercial and academic contexts,” said Dr. Povilas Daniušis, Machine Learning scientist at Neurotechnology.

Research papers with more extensive information and benchmarking results are available at the arXiv archive, while the open-source models and datasets are available on the Hugging Face platform.

Neurotechnology's open-source Lithuanian LLM is the first step toward advancing NLP technologies in the Baltic States. By sharing this tool, the company aims to encourage others and to continue its research on large language models across the Baltic, Scandinavian and Eastern European regions and languages.

NLP solutions

The company's extensive collection of NetGeist Natural Language Processing solutions is designed to provide organizations with capabilities for processing large volumes of textual data. NLP-based solutions include analyzing customer reviews to understand sentiment, converting text to human-like speech for applications like audiobooks or presentations, transcribing audio recordings from meetings, lectures or interviews, and summarizing lengthy documents for faster information gathering.

“Natural language processing technology holds immense potential for both the public and private sectors,” said Vytas Mulevičius, NLP team lead at Neurotechnology. “We have already developed textual data analysis platforms and created virtual chatbots, and now we're introducing an open-source model tailored to the needs of specific regions. With continuous research and development, we are planning to offer more solutions in the future.”

About Neurotechnology

Neurotechnology is a developer of high-precision algorithms and software based on deep neural networks and other Artificial Intelligence (AI) technologies. Launched in 1990 in Vilnius, Lithuania, the company offers AI-powered solutions in a range of fields, including biometrics, natural language processing (NLP), computer vision and brain-computer interface, as well as ultrasound technologies.
The company's NLP solutions are designed to enhance the capabilities of both public and private sectors, automating language-based operations.

