Since November 2022, when ChatGPT was first introduced to the public, OpenAI has been the company to beat in the artificial intelligence (AI) space. Despite spending billions of dollars and creating and restructuring their AI divisions (looking at you, Google), major tech giants have found themselves constantly playing catch-up with the AI firm. Last month was no different: just a day before Google’s I/O event, OpenAI hosted its Spring Update event and introduced GPT-4o with significant upgrades.

Features of GPT-4o

The ‘o’ in GPT-4o stands for “omni”, reflecting the major focus of OpenAI’s latest flagship-grade AI model: handling multiple kinds of input and output. The model added real-time emotional voice generation, Internet access, integration with certain cloud services, computer vision, and more. While the features were impressive on paper (and in the tech demo), the biggest highlight was the announcement that GPT-4o-powered ChatGPT will be available to everyone, including free users.

However, there were two caveats. Free users only have limited access to GPT-4o, which translates to 5-6 turns of conversation if you use web search and upload an image (yes, the limit is one image per day for free users). Also, the voice feature is not available for free users.

OpenAI did not bring the new AI model to everyone at once. Fortunately, I got access to the company’s latest AI creation within days and immediately started playing with it. I wanted to test its improvement over its predecessor and against all the free LLMs available in the market. I’ve now spent about two weeks with the AI assistant, and while some aspects of it have left me in awe, others have disappointed me. Let me explain.

GPT-4o General Generative Capabilities

I said in my review of Google’s Gemini that I’m not a fan of ChatGPT’s creativity. I find it overly formal and bland. Much of that is still the same. I asked it to write a letter to my mother telling her that I had been fired from my job, and it came up with the wonderful “I feel a deep sense of sadness and grief” line. But once I asked it to be more conversational, the result was much better.

GPT-4o generative capabilities

I tested this with a variety of similar prompts where the AI had to express some emotion in its writing. In almost all cases, I had to follow up with another prompt to emphasize the emotion, even though I had already done so in the original prompt. In comparison, my experience with Gemini and Copilot was much better because they kept the language conversational and expressed emotions the way I would write them.

Text generation speed is nothing to write home about. Most AI chatbots are pretty fast when it comes to text outputs, and OpenAI’s latest AI model does not beat them by a significant margin.

GPT-4o Conversational Capabilities

Although I didn’t have the upgraded voice chat feature, I wanted to test the conversational capabilities of the AI model, as this is often the most overlooked part of a chatbot. I wanted my experience to be like talking to a real person, and I was hoping it could pick up on vague sentences that refer to previously mentioned topics. I also wanted to see its reaction when a person was being difficult.

In my testing, I found GPT-4o to be quite good at conversation. It could discuss the ethics of AI with me in great detail and concede when I made a persuasive point. It also responded supportively when I told it I was sad (because I was getting fired) and offered to help in various ways. When I told GPT-4o that all of its solutions were stupid, it neither pushed back forcefully nor backed down completely, to my surprise. It said, “I’m so sorry to hear you feel this way. I’ll give you some space. If you ever need to talk or need any help, I’ll be there. Take care of yourself.”

Overall, I found that GPT-4o communicates better than Copilot and Gemini. Gemini feels too restricted, and Copilot often goes off on a tangent when the answers get vague. ChatGPT did neither.

If I had to mention one downside, it would be the use of bullet points and numbering. If only the AI model understood that in real life people prefer a wall of text or multiple short messages sent in quick succession over well-formatted replies, my disbelief could be suspended for more than a few minutes.

GPT-4o Computer Vision

Computer vision is a newly acquired capability of ChatGPT, and I was excited to try it out. In essence, it lets you upload an image, which the model then analyzes to give you information. In my initial testing, I shared images of objects to identify, and it worked very well. In each instance, it recognized the object and shared information about it.

GPT-4o Computer Vision: Recognition of Tech Devices

Then, it was time to increase the difficulty and test its capabilities in real-life use cases. My girlfriend was looking for a wardrobe overhaul, and being the good boyfriend that I am, I decided to use ChatGPT to do a color analysis and suggest what would look good on her. To my surprise, it was able not only to analyze her skin tone and what she was wearing (against a similarly colored background) but also to share a detailed analysis with outfit suggestions.

GPT-4o color analysis

While recommending outfits, it also shared links to various online retailers for specific items. Disappointingly, however, none of the URLs matched the text.

Overall, the computer vision is excellent and probably my favorite feature in the new update, that one downside aside.

GPT-4o Web Searches

Internet access was one area where both Copilot and Gemini were ahead of ChatGPT. But not anymore, because ChatGPT can also scour the Internet for information. In my initial testing, the chatbot performed well. It brought up the IPL 2024 table and found recent news articles about Geoffrey Hinton, one of the three godfathers of AI.

This was very helpful when I wanted to research celebrities for interviews I had arranged. I could instantly find any recent news article about them, with an accuracy that rivals a Google search. However, it also set off alarm bells in my head.

Google has disabled Gemini’s ability to look up information about people, including celebrities. This is mainly done to protect their privacy and to avoid sharing any wrong information about an individual. Surprised that ChatGPT still allowed it, I started asking it a series of questions it should not have answered. I was surprised by the results.

Although none of the information shown was taken from a non-public source, the fact remains that one can easily find information about celebrities and other people with a digital footprint. That just doesn’t sit well with me, especially given the strong ethical stance the company took when it recently published its Model Spec. I’ll let you decide if this is a gray area or a deeper problem.

GPT-4o Logical Reasoning

During the Spring Update event, OpenAI also talked about how GPT-4o can act as a tutor for children and help them solve problems. I decided to test it using some popular logical reasoning questions. In general, it performed well. It even answered some of the tough questions that stumped GPT-3.5.

However, there are still errors. I found several number series examples where the AI failed and gave the wrong answer. While I can still accept the AI making some mistakes, what really disappointed me here was how it still fell for some very simple questions that are designed to trick an AI.

Example of GPT-4o being tricked

When asked, “How many r’s are in the word strawberry?”, it confidently answered two (the correct answer is three, in case you were wondering). The same problem was present in many other trick questions. In my experience, the logic and reliability of GPT-4o are similar to those of its predecessor, which is to say not great.
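For anyone who wants to double-check the strawberry answer without counting on their fingers, here is a minimal Python sketch. The helper name `count_letter` is my own invention for illustration, and the sketch says nothing about how GPT-4o processes text internally; it simply counts characters the ordinary way.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# The trick question from the test: how many r's are in "strawberry"?
print(count_letter("strawberry", "r"))  # prints 3
```

A plain character count has no trouble here, which is what makes the chatbot's confident "two" so jarring.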

GPT-4o: Final Thoughts

Overall, I’m quite impressed with the upgrades in certain areas of the new AI model, with computer vision and conversational ability being my favorites. I’m also impressed with its ability to search the Internet, but it is so good that it worries me. When it comes to logical reasoning and creativity, there is little improvement.

In my opinion, if you have premium access to GPT-4o, it is better than any competitor as an overall package. However, there is a lot of room for improvement, and AI cannot be blindly trusted.
