Українська правда

AI speaks: 10 chatbots and voice assistants for communication

In March 2023, Microsoft CEO Satya Nadella predicted in an interview with the Financial Times that new conversational artificial intelligence would develop more actively, as old voice assistants are "dumb as a rock." As we can all see, the tech world is now full of a new type of virtual assistant: chatbots. These AI-powered bots, such as the famous ChatGPT by OpenAI, can quickly improvise, answer questions, and communicate.

OpenAI recently completed its long-awaited funding round, raising $6.6 billion from investment firms and large technology companies. Among the investors are Microsoft, NVIDIA, and SoftBank. OpenAI is now valued at $157 billion. The hype around chatbots shows that Siri, Alexa, and other voice assistants that once generated a lot of enthusiasm are losing their leadership position in the AI race. Assistants and chatbots are based on different types of AI. Chatbots are powered by so-called large language models, which are systems trained to recognize and generate text based on huge amounts of data collected from the Internet. They can also suggest words to complete a sentence. The well-known assistants, by contrast, are primarily command-and-control systems that understand a finite list of questions and requests.

So we decided to figure out in which direction AI-powered conversational technologies are moving so rapidly, and made a selection of old and new AI services that you can talk to.

What is conversational AI?

Conversational AI refers to technology that allows machines to interact with humans in a conversational manner using natural language processing, machine learning, and other artificial intelligence techniques. Conversational AI systems are widely used in various applications, including chatbots, voice assistants, messaging platforms, and customer support services. If you dig into the history of artificial intelligence, you will find that one of the important goals of AI research was to enable computers to communicate in natural languages, such as English. An early success was scientist Daniel Bobrow's STUDENT program, which could solve high school algebra problems.

But the ELIZA program by scientist Joseph Weizenbaum, written in 1966, could conduct conversations that were so realistic that users sometimes made the mistake of thinking they were talking to a person rather than a program. This virtual interlocutor imitated a dialog with a psychotherapist, implementing the technique of active listening.

A conversation with Eliza

The program was named after Eliza Doolittle, the character in Bernard Shaw's play Pygmalion who was taught the language of "upper class people". The program basically just paraphrased the user's statements using a few grammatical rules. ELIZA was the first chatbot.
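To make the idea of "paraphrasing with a few grammatical rules" concrete, here is a minimal sketch in Python. It is not Weizenbaum's original script: the rules, the pronoun table, and the function names are illustrative assumptions chosen only to show the general technique of matching a pattern and reflecting the user's words back as a question.

```python
import re

# Hypothetical pronoun swaps used to turn the user's words around.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

# Hypothetical pattern/response rules; the real ELIZA script was far richer.
RULES = [
    (re.compile(r"i feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"i need (.*)", re.IGNORECASE), "Why do you need {0}?"),
]


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(statement):
    """Return a paraphrased question, or a neutral prompt if no rule matches."""
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."


print(respond("I feel sad about my job."))
print(respond("Hello"))
```

The sketch understands nothing about the conversation; it only rearranges the user's own words, which is roughly why such a simple program could still feel like an attentive listener.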

Around 2018, the term large language model (LLM) was coined. This is a language model consisting of a neural network with many parameters (from tens of millions to billions) trained on a large amount of unlabeled text using self-supervised or semi-supervised learning. These models perform well on a variety of tasks. Models like GPT-3, released by OpenAI in 2020, and Gato, released by DeepMind in 2022, have been described as important machine learning advances.

In 2023, Microsoft Research tested the large GPT-4 language model on a wide variety of tasks and concluded that "it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system." Then, in 2024, OpenAI demonstrated a new, advanced voice mode for ChatGPT capable of supporting human-like conversation, and Google started integrating its Gemini chatbot into mobile devices. It seems that in 2025, we will see these capabilities appear on more and more devices, allowing for more natural voice communication.

A selection of AI services to talk to

1) ChatGPT Voice is ChatGPT's voice function, based on a new text-to-speech model capable of generating human-like audio. ChatGPT has five voices to choose from, created with the help of professional voice actors. You can listen to them here.

Advanced Voice Mode from ChatGPT

In 2024, OpenAI launched Advanced Voice Mode for ChatGPT Plus and Teams users. According to the developers, Advanced Voice Mode is another step towards more human-like interaction with artificial intelligence. The feature, based on the latest GPT-4o model, allows casual conversations in real time. The developers presented five new voices: Arbor, Maple, Sol, Spruce, and Vale, which are available in both standard and enhanced voice modes. They join the previously available Breeze, Juniper, Cove, and Ember.

Advanced Voice Mode comes with enhanced accents in selected foreign languages, which should improve the clarity of the user experience. This includes changes to the speed and fluency of speech to make every conversation sound more natural. And the memory function allows AI to recall previous conversations and maintain context over time. Advanced Voice is not yet available in all regions.

What can ChatGPT Voice do in general? For example, it can teach a foreign language, help you prepare for a job interview, give advice, tell children fairy tales, brainstorm, imitate a dialog with a favorite character, conduct audio tours, and much more.

An interesting fact: in September, a Reddit user reported that ChatGPT had messaged him first. The user had told the chatbot about his transfer to high school, and after a while ChatGPT brought it up by asking how the first week of school went. A chatbot is supposed to respond to people's requests, not contact them on its own initiative. OpenAI said that the unusual situation turned out to be a bug, not a feature.

A Reddit user showed how ChatGPT spoke to him first

Journalists contacted OpenAI about this and the company confirmed such cases. It also reported that the bug had been fixed. "We solved the issue that made it seem like ChatGPT was starting new conversations. The problem occurred when the model tried to respond to a message that was not sent properly. Therefore, it either gave a general answer or relied on ChatGPT's memory," OpenAI said.

2) Gemini Live (ex-Google Bard) is an interactive interface for communicating with the latest version of Google's AI, which the company launched in October 2024. It offers natural communication: you can interrupt the AI, clarify information, and change the direction of the conversation in the middle of an answer. It's like talking to a real person who has access to a huge amount of knowledge and can process it instantly.

Photo from the Gemini Live presentation

To get access to Gemini Live, you need to go to the official Google AI website, sign in to your Google account, and select the Try Gemini option. Currently, the service is not available in all countries, but the list is constantly expanding. Gemini Live will also be available to all Android users in the Gemini app. It is essentially a replacement for Google Assistant. Google's conversational AI will support 40 new languages. Previously, English, German, Spanish, French, and Portuguese were supported, so we hope to see Ukrainian there soon.

Gemini Live will integrate with other Google apps and will be able to speak two languages simultaneously. It will also be able to easily pull an email from your Gmail inbox and make dictated changes to it, or schedule an event in your calendar and set a reminder. As befits an assistant, you can start a conversation with Gemini, for example, with the phrase Hey Google. After that, you can ask what's happening on the screen, what the YouTube video is about, or ask it to find a location on maps.

3) Copilot Voice is a new feature of Microsoft's Copilot artificial intelligence tool. It provides continuous voice communication with the AI on mobile devices and works similarly to Google's Gemini Live or ChatGPT's equivalent: you interact with Copilot using voice commands, and responses can be spoken aloud. Microsoft has recently released a number of new features for Copilot, including voice commands and responses, which are now available for Windows, iOS, Android, and the web version.

You can communicate with AI in the same way as with a human, asking questions, giving tasks, interrupting, and clarifying. You don't have to press the start button every time you want to start a conversation. The system will adapt, taking into account the previous dialog, contexts, and additional data. Copilot Voice is especially useful when multitasking, as tasks can be solved without entering text manually. Voice mode is already available in the United States, United Kingdom, Canada, Australia, and New Zealand.

4) Character AI Voice is a new feature that brings characters to life by allowing them to speak in realistic, expressive voices in one-on-one conversations. The AI-generated voices add a new dimension to the conversation, making it natural and realistic. At the moment, Character AI Voice supports English, but there are plans to expand it to other languages in the near future. The chatbot already supports Ukrainian as a text language.

The Character AI chatbot

A distinctive feature of Character AI is the ability to create AI characters, shape their personalities, set and customize specific parameters, and then publish them to the community for others to interact with. Many characters are based on fictional media sources or celebrities, while others are completely original, and some are created for specific purposes, such as helping with creativity or leading a text-based role-playing game. Users can chat with a single character or organize group chats in which several characters communicate with each other and/or the user at the same time. In May 2023, the app was monetized, and a premium subscription for $9.99 per month was introduced, which provides users with benefits such as priority chat access, faster response times, and early access to new features.

The service is suitable for pleasant communication, allows you to learn more about AI, improve skills (writing, research, language), and helps you get emotional support or create creative content. Pros: the possibility of creative expression; good stories; characters with different backgrounds; language practice; help in writing creative works; role-playing games and mental health support. Cons: takes time to train personal AI; knowledge gaps; excessive content filters; inconsistent narratives; limited character memory; problems with data usage and lack of emotional intelligence.

5) Pi is an artificial intelligence chatbot from Inflection AI. It provides users with a unique experience of emotional support. The developers claim that Pi not only knows how to maintain an interesting conversation, but also shows kindness, diplomacy, and humor in communication. They position Pi as a chatbot with more powerful emotional intelligence than ChatGPT.

This project is an AI startup from LinkedIn co-founder Reid Hoffman and DeepMind co-founder Mustafa Suleyman. The neural network responds not only with text but also with a generated voice. The interface is minimalistic: there are no settings except for the ability to choose a voice. For now, you can communicate with Pi only in English, but this will not always be the case, as Inflection AI is actively working on expanding its language capabilities. There is an app for both Android and iOS.

The Pi chatbot from Inflection AI

Unlike other language models, Pi is curious and eager to learn and adapt. This makes it better at using natural language. In addition, Pi is able to remember 100 conversations with a user across the different platforms they log in from. For example, if you ask Pi to help you plan a birthday party on WhatsApp, it will ask how the party went when you later start talking to it about another topic on Facebook.

“Many people feel like they just want to be heard and they just want a tool that reflects back what they said to demonstrate they have actually been heard. What we don’t want is for people to treat this as a romantic relationship. This is really a companion, a safe, personal AI. You have to remember it’s an AI and not a human,” said Mustafa Suleyman.

And we couldn't help but mention the well-known voice assistants from Apple, Amazon, and Samsung, as we wondered whether and how these companies were going to enter the AI race. It seems that they are going to.

6) Siri is one of the most famous AI personal assistants and a question-and-answer system adapted for iOS. The program communicates in natural language to answer questions and make recommendations. Siri adapts to each user individually, learning their characteristics over time.

Photo from Apple's Siri presentation

You can activate Siri with your voice or by pressing a button on your device. Like a secretary, the assistant will help you with a variety of tasks: making calls, sending messages, searching for information, selecting music, building routes, and reminding you of important dates. Siri can also just talk to you. You can ask her about her favorite hobbies, interests, and dreams, or ask her to tell a joke.

Recently, Bloomberg's Mark Gurman, who is known for his insights on Apple, described the company's plans for a robotic future. According to him, the push into robotics will lead Apple to develop its own AI technologies. One of the important elements of robotization is the creation of a personality. Although Siri is the digital assistant on current Apple devices, according to the insider, the company is working on another, human-like interface based on generative AI.

Gurman also noted that human-like AI is currently a rather distant prospect, as it will be expensive for Apple to develop and, in turn, expensive for customers. As for the updated Siri based on Apple Intelligence, it is likely to appear only in the spring of 2025 with the release of iOS 18.4. According to Gurman, some of Siri's AI features may appear in iOS 18.3, but it is not known which ones.

During WWDC 2024, Apple finally unveiled a set of AI-based features for iPhone, iPad, and Mac that the company calls Apple Intelligence, which will be deeply integrated into iOS 18, iPadOS 18, and the new macOS Sequoia. Read more about how Siri will get smarter here.

Pictured: devices with Amazon's Alexa assistant

7) Alexa is Amazon's voice assistant. It can create to-do lists, set alarms, play audiobooks, and stream podcasts. Other basic features include real-time information about traffic, news, weather, sports, and more. One of Alexa's best-known features is the wake word that allows users to activate it by voice. This distinguishes Alexa from devices that require a button press. The assistant is currently used on more than 100 million devices. In June 2024, it became known that Amazon was planning a significant update of Alexa to include conversational generative AI.

The project, known internally as Banyan, will be the first major update to the voice assistant since it was introduced in 2014 along with the Echo speaker line. The updated assistant will be called Remarkable Alexa. "We've already integrated generative AI into Alexa components and are working hard to bring it to scale - to the more than half a billion Alexa-enabled devices already in homes around the world - to provide even more proactive, personal, and reliable assistance to our customers," an Amazon spokeswoman said in a statement.

8) Bixby is a virtual assistant from Samsung Electronics, available on the Galaxy S8 smartphone and newer models. It was first introduced at the Samsung Galaxy Unpacked event in 2017.

Bixby has four main functions. Bixby Home is the main Bixby page, which learns user behavior and recommends content appropriate for different circumstances; to access it, you can swipe right from the home screen or press the Bixby button on the side of your device. Bixby Voice lets you control your phone and apps by voice. Bixby Vision recognizes images of objects, provides information related to recognized images, searches for products to buy, and offers translations. Bixby Reminder remembers requests made by the user and notifies the user about them according to the set time, place, and situation.

Photo from Samsung's presentation of the Bixby voice assistant

Bixby voice assistant features include voice control of devices; simple functions such as changing the desktop wallpaper on the phone or displaying video on a Samsung TV; compatibility with third-party applications and their management; searching for various information on the Internet; making payments via Bixby Pay, and more.

With the advent of chatbots such as ChatGPT, the functionality of the voice assistant has become clearly outdated. In April 2024, Samsung Electronics' Executive Vice President of Mobile Business Won Jun Choi said that the company needs to rethink Bixby and add generative artificial intelligence to the assistant.

"Bixby has become a key voice assistant for Samsung, not only for mobile devices, but also for TVs and digital devices that exist in the ecosystem. So until now, it was the main voice assistant. With the emergence of generative AI and LLM technology, I believe we need to redefine the role of Bixby so that it can be equipped with generative AI and become smarter in the future," Choi said.

9) Google Assistant is a smart personal assistant developed by Google and introduced in 2016. It has a long list of functions and capabilities, but at a basic level, it answers questions. Google Assistant is very useful when it comes to personal plans. If it has access to your Google account and other services, it can provide more than just general information. For example, you can ask if there are any events on the calendar, get the local weather forecast, send text messages, and more. It is also very useful for controlling smart home devices.

Photo from the Google Assistant presentation

In January 2024, Google announced important changes to its virtual assistant: seventeen "underutilized" Google Assistant features would be removed, such as the ability to use voice to send email or audio messages. In addition, Google allowed Android users to switch from Google Assistant to Gemini, and Gemini can now be set as the default AI assistant on mobile devices. It seems that Google is ready to completely remove Assistant and replace it with Gemini.

10) Moving away from some of the usual AI assistants, one interesting option is ELSA Speak. It's an app that helps you improve your English pronunciation with the help of AI, short dialogues, and personalized exercises. It is a great example of how these assistants can be used for educational purposes. Artificial intelligence technologies provide instant feedback to help users progress quickly. According to the company, the app has been downloaded more than 4.4 million times and has more than 3.6 million users in 101 countries.

It is obvious that chatbot and voice assistant technologies will converge in the future. AI experts believe that people will control chatbots using speech, and those who use Apple, Amazon, and Google products will be able to ask virtual assistants to help them with their work, not just simple tasks. "These products never worked in the past because we never had the ability to have a human-level dialog. Now we do," Aravind Srinivas, founder of Perplexity, a startup that offers a chatbot-based search engine, told The New York Times over a year ago. He seems to be right.

Bonus: In the context of the topic, I would like to mention an important case and draw your attention to the Replika voice AI service, which should not be used because it is backed by the Russian oligarchy. Molfar and AIN already reported on it in 2022. Replika's main focus is psychological support through a digital avatar. Its developer, Luka, was founded by two partners, Evgenia Kuyda and Philip Dudchuk. Both are from Russia, but they called their startup "American with Russian roots." Since the beginning of Russia's full-scale invasion of Ukraine, Replika users have noticed the service's strongly pro-Russian stance.

It is no coincidence that the company's name resembles Luka, a name often found in Russia: Luka Inc. is named after the son of Sergey Adoniev, a Russian oligarch and former co-owner of Yota and Megafon. Kuyda calls him her mentor and teacher. In addition, the "American startup with Russian roots" has an office in Moscow. The Luka.ai domain is also registered in Russia. In January 2023, 10 million users of the Replika service were reported. You can read more about the reputation of the "startup" at the link.
