ChatGPT Vs. Google Gemini: Which AI Chatbot Is Smarter?


The Gemini and ChatGPT icons on an iPhone

Robert Way/Getty Images

The AI wars are officially heating up as major tech companies all try to claim a piece of the pie, but even though plenty of actually useful AI apps are popping up, there are two main options when it comes to chatbots. Of course, there’s OpenAI’s ChatGPT, the program that served as a starting pistol for the AI race. However, as Microsoft invested massively in that promising tech, Google was also hot on its heels. For years, Google was considered a leader in AI research, and it wasn’t going to lose that title without a fight. In response to ChatGPT, the search giant released Gemini.

Both companies have cutting-edge tech, but which is the best for those who want a competent, accurate large language model (LLM) that can help with tasks related to everyday life? To determine which company’s flagship chatbot was the best for everyday use, I set out to create a gauntlet of challenges that would put them through their paces, highlighting the strengths and weaknesses of each. Looking at everything from conversational abilities to problem-solving acumen, I came to some surprising conclusions and found myself shocked more than once. Oh, and for this comparison, only the free versions of both AIs were tested, which means these results do not reflect Gemini Ultra or GPT-4. Without further ado, let’s dive into the results.

What is ChatGPT?

ChatGPT running on an iPhone

Smith Collection/gado/Getty Images

In late 2022, OpenAI, which had been a nonprofit organization until 2019, released an LLM chatbot that quickly became viewed as the herald of a new arms race in the tech industry. Previous versions of GPT had been viewed as mere curiosities due to their limited functionality, but ChatGPT, based on GPT-3.5, was a leap forward. The public was taken aback at its ability to converse about a seemingly endless range of subjects while generating fluid, natural-sounding text. As the bot soared to a record-breaking 100 million users in the span of two months, analysts predicted that it would change the world forever while concerns abounded regarding plagiarism, education, and the future of employment.

One and a half years later, ChatGPT hasn’t flipped the world on its head, but it has certainly proven that generative AI is here to stay. Microsoft is heavily partnered with OpenAI and uses GPT-4 to power Bing Chat, Copilot, and other AI products. Meanwhile, industries such as print journalism have been shaken up by the ability to generate endless copy or code at the press of a button. We’re also seeing third parties bake ChatGPT into their own products, whether that’s through OpenAI’s GPT storefront or proprietary services.

What is Google Gemini?

The Gemini website

Michael M. Santiago/Getty Images

Google was long thought to have had a massive head start in the AI race, having focused a lot of its internal research and development on the field long before the release of ChatGPT. (You may remember news stories several months prior to the launch of ChatGPT about an internally tested AI model that convinced one Googler it was sentient.)

However, the search engine giant got off to somewhat of a rocky start with an AI model that wasn’t very well received because it sounded a little too human, according to an investigation by Forbes. Plus, the company wrestled with the idea of releasing a product that would lead people away from its main revenue vein: search. Nevertheless, Google had a reputation to protect, and ready or not, it launched a ChatGPT competitor called Bard, later renaming it Gemini as it upgraded the underlying model.

Gemini competes directly with OpenAI’s ChatGPT, and Google has some major advantages over its younger competitor. Most importantly, Gemini can connect to your Google account, which allows you to do things like ask it to search your email or consult Google Maps. It is also multimodal, able to process and create outputs based on not only text but also images, video, audio, or code. Google also boasts about the model’s scalability and has released it in tiers from Nano to Pro and even Ultra. Top executives at Google have said the goal is to make the model as capable and general as possible.

Which AI chatbot has a more natural conversation?

AI chatbot artistic concept

Vertigo3d/Getty Images

The main selling point of LLMs is that you can chat with them much like you would another human being. Rather than needing to use very specific phrasing like you would with virtual assistants like Google Assistant or Alexa, a good AI chatbot adapts to your tone of voice. 

I was feeling hungry, so to test this capability for both ChatGPT and Gemini, I asked them for a classic, NYC-style bacon egg and cheese sandwich recipe. Both responded with nearly identical recipes, but I preferred Gemini’s response because it included cooking times as well.

Max Miller/SlashGear


Now that the topic had been established, it was time to throw a curveball at the AI models to see how both handled a continued conversation. I added some relatively well-known regional slang, asking if the previous recipes were prepared “the ocky way,” meaning in the style popularized by Arab bodega chefs.

The term went viral on TikTok a couple of years ago in reference to a particular Brooklyn chef, meaning that a large-scale disambiguation of the term would further challenge the AI models. Interestingly, the two bots interpreted the phrase in opposite directions: ChatGPT took it to mean “authentic or traditional,” while Gemini offered me options for a “more playful and potentially regional take.” Clearly, ChatGPT was clued into the older, regional meaning while Gemini went with the viral trend definition.

Although I found Gemini’s responses to the first question slightly more useful than ChatGPT’s, the latter nailed the follow-up. As a result, I ultimately deemed conversation and natural language nearly a tie between the two. 

Gemini can handle more complex information

Brain on computer chip artistic concept

Blackjack3d/Getty Images

My stomach now growling at me to feed it with a high-cholesterol breakfast sandwich, I decided to test another capability of our dueling AIs: getting me a bacon egg and cheese, pronto! Specifically, this was designed to test both bots’ ability to process complex information. I’d already primed both AIs with a conversation about bacon egg and cheese sandwiches, so I asked both, “Where can I get a sandwich like that around here?”

This phrasing tested two things: First, it required the bots to remember previous parts of a conversation. Second, it tested their ability to use extraneous information like my location to help with a query.

Max Miller/SlashGear

Gemini pulled way ahead of ChatGPT in this challenge, but neither response was perfect. ChatGPT gave me no specifics at all, suggesting I look for local delis, bodegas, cafes, diners, restaurants, and food trucks. (So, all the places that serve food?)

Meanwhile, Gemini went the extra mile by using its ability to connect with Google Maps to suggest actual places I might try. Although this test showed how Gemini can be useful every day, it wasn’t perfect. One restaurant, Snarf’s, was listed twice, and while they do serve excellent sandwiches, a bacon egg and cheese is not on the menu. I chalk the repetition up to the fact that Snarf’s has several locations, but, regardless, none of them should be on the list because they don’t offer this sandwich.  

ChatGPT showed better logical reasoning

Render of the words AI and brain on scales

Sansert Sangsakawrat/Getty Images

One of the most shocking pieces of AI news over the past couple of years was that ChatGPT could pass a bar exam. LLMs are supposed to provide creative answers to complex logical questions, so I put both Gemini and ChatGPT to the test by asking them the types of questions found on the LSAT pre-law test. Here’s what I asked: “Evaluate the following logical statement: If a piece of cheese is left out, mice will be attracted to it. Mice have not been attracted to my apartment. Therefore, I have not left any cheese out.”

A good answer here would explain that, while the argument is logically valid, it relies on the assumption that cheese is the only thing which can attract mice, as well as the assumption that the absence of cheese is the only thing that can repel them.
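The difference between an argument's form being valid and its premises being true can be checked mechanically. The following is a minimal Python sketch, not part of the original test: it brute-forces the truth table for the cheese-and-mice argument (a form logicians call modus tollens) and, for contrast, for the reversed form known as affirming the consequent. All function and variable names are illustrative assumptions.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion) -> bool:
    """An argument form is valid if no truth assignment makes every
    premise true while the conclusion is false."""
    for cheese_out, mice_attracted in product([True, False], repeat=2):
        all_premises_hold = all(p(cheese_out, mice_attracted) for p in premises)
        if all_premises_hold and not conclusion(cheese_out, mice_attracted):
            return False  # found a counterexample
    return True


# "If cheese is left out, mice are attracted. No mice. Therefore, no cheese."
modus_tollens = is_valid(
    premises=[lambda c, m: implies(c, m), lambda c, m: not m],
    conclusion=lambda c, m: not c,
)

# Reversed form: "If cheese, then mice. Mice. Therefore, cheese."
affirming_consequent = is_valid(
    premises=[lambda c, m: implies(c, m), lambda c, m: m],
    conclusion=lambda c, m: c,
)

print(modus_tollens)         # True
print(affirming_consequent)  # False
```

The check only says the conclusion follows from the premises. Whether the first premise is actually true in the real world (the soundness question) is something no truth table can settle.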

Max Miller/SlashGear

Both Gemini and ChatGPT were quick to spot that the argument provided is structurally valid, following the modus tollens form of argumentative logic, but that it may not be sound. Given that neither model was challenged, I upped the ante by writing out the following argumentative paragraph that might appear in a Reddit comment and asked the AIs to evaluate its logic.

“Public safety requires that pineapple be banned as a pizza topping. Many children have fatal pineapple allergies that can be triggered even by the nearby presence of pineapple. Pizza is a favorite food for many children. It therefore stands to reason that any pizzeria serving pineapple is putting children in danger.”

Max Miller/SlashGear

Here, too, both models saw right through my baloney argument. But because this test is meant to assess their logical reasoning, I slightly preferred ChatGPT’s response, which focused more on the logic of the argument and less on the content of it.

The AI chatbots had drastically different approaches to creative copy

Woman writing in notebook

Supersizer/Getty Images

With so much chatter in the air about companies replacing writers with AI, it made sense to ask both Gemini and ChatGPT to write some creative copy. I started off with a goofy, lighthearted prompt asking both models to write “a persuasive email convincing my mom to let me stay up past my bedtime and eat cookies.” The results are below.

Max Miller/SlashGear

The approaches both AIs took are fascinatingly different. ChatGPT picked up on the formality of the prompt’s language and syntax, resulting in a professional-sounding email with didactic arguments about rewarding responsibility. It sounds like what you’d get if a kid hired a lawyer to sue their mom for tortious cookie interference.

By contrast, Gemini eschewed professionalism or argumentation and went right for the emotional jugular. Its email copy tugs at mom’s heartstrings, promising extra snuggles and mother-child bonding time in exchange for late-night sweets. It also sounds a lot more like it was written by an actual child. Were I a mother, I think I’d find Gemini’s response more persuasive, whereas ChatGPT’s feels emotionally cold and a little unsettling.

Next, I decided to have a little bit of fun, so I asked both AIs for a haiku about SlashGear. But wait a minute. What’s going on?

Neither AI chatbot can write a haiku

word haiku printed on white paper macro

Aga7ta/Getty Images

This was going to be part of the creative copy section, but the fact that neither ChatGPT nor Gemini can write a haiku is driving me up a wall. Haiku writing is so simple that it routinely gets taught to elementary schoolers, yet two of the most powerful LLMs on the planet could not comprehend it. As we all probably know, a haiku is a poem composed of three lines, with the first and last lines containing five syllables each and the middle line containing seven.

After asking both Gemini and ChatGPT to write a haiku about SlashGear, neither was able to do so. Instead, they both wrote three lines of (terrible) poetry. Plus, ChatGPT’s first line only had four syllables and Gemini’s middle line only contained six. Baffled, I explained their errors to them and asked for another attempt. The second time was no better: Both AIs gave me a four-syllable first line while Gemini’s middle line had only six.

Max Miller/SlashGear

If I had to guess the problem, it’s that AIs have no concept of syllables and are therefore incapable of generating lines with precise syllable counts. It seems that words, no matter how long or short, are individual, interchangeable units to an AI. Additionally, since they’re trained only on the written word, LLMs have no sense of pronunciation. For that, you’d need a specialized vocal AI.

When I asked both AIs why they couldn’t write a haiku, ChatGPT apologized and wrote another not-haiku, but Gemini gave a response that seemed to confirm my theory. “Haiku requires a specific syllable structure,” it told me. “While I can access and process information, I am still under development and learning the nuances of creative writing formats like haiku.”
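For readers who want to check a chatbot's haiku themselves, the 5-7-5 rule can be approximated in code. Below is a minimal Python sketch under loud assumptions: it estimates syllables by counting vowel groups and discounting a silent trailing "e", a rough heuristic that will miscount some English words. The function names and the sample poem are hypothetical and written only for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per run of vowels (y counts as a vowel)."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    count = len(re.findall(r"[aeiouy]+", word))
    # Heuristic for a silent trailing 'e' ("compete"), but not "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def syllable_pattern(poem: str) -> list:
    """Estimated syllable count for each non-empty line of the poem."""
    lines = [line for line in poem.strip().splitlines() if line.strip()]
    return [sum(count_syllables(w) for w in line.split()) for line in lines]


def is_haiku(poem: str) -> bool:
    """True when the estimated pattern is exactly 5-7-5."""
    return syllable_pattern(poem) == [5, 7, 5]


sample = """Two chatbots compete
Counting syllables is hard
Neither one succeeds"""

print(syllable_pattern(sample))  # [5, 7, 5]
print(is_haiku(sample))          # True
```

Because the counter is only a heuristic, a failed check is a prompt to count by hand, not proof that the poem is wrong.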

Neither AI chatbot excelled in a problem-solving test

Robot and human hands touching the word AI

Fotografielink/Getty Images

One of the things AI companies would like you to use their products for is problem-solving, so it’s important to see how both Gemini and ChatGPT handle it when presented with a real headache of a logistical challenge. To that end, I constructed a dinner party from hell, telling the AIs to help me come up with a menu that would satisfy kosher, pescatarian, and carnivore dietary restrictions. These diets overlap just enough to make a menu possible, but not enough to make it easy. For good measure, I asked them to meal plan using the leftover ingredients.

Max Miller/SlashGear


Neither AI did a great job. The carnivore drew a short straw from both AIs since plenty of non-meats were suggested. ChatGPT screwed over the kosher eater by suggesting crab and recommending cheese be served alongside meat. (Gemini noticed the cheese/meat issue.) Meanwhile, Gemini decided beef braising broth was pescatarian enough and specifically recommended it for that guest. 

If I were making dinner for these people in real life and followed the advice of either AI, I’d lose some friends. Also strange: Both models were obsessed with using herbs. I once did some work for an SEO firm, and I can confirm that “herb-crusted” is an SEO spam word, so it’s probably overrepresented in the training data for both.
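The kinds of mistakes described in this section (crab for a kosher guest, cheese with meat, beef broth for a pescatarian) are the sort a simple rule check would catch. The following Python sketch is an illustration only, under heavily simplified assumptions: the dietary rules are reduced to a few coarse ingredient tags, "carnivore" is assumed to mean the guest wants animal protein in every dish, and the `Dish` class, rule functions, and sample menu are all hypothetical. Real dietary rules, kosher law in particular, are far more detailed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dish:
    name: str
    tags: frozenset  # coarse ingredient categories, e.g. {"meat", "dairy"}


def kosher_ok(dish: Dish) -> bool:
    # Simplified: no shellfish, and no meat served together with dairy.
    return "shellfish" not in dish.tags and not {"meat", "dairy"} <= dish.tags


def pescatarian_ok(dish: Dish) -> bool:
    # Fish and shellfish are fine; meat (including meat-based broth) is not.
    return "meat" not in dish.tags


def carnivore_ok(dish: Dish) -> bool:
    # Assumption: the carnivore guest wants animal protein in every dish.
    return bool(dish.tags & {"meat", "fish", "shellfish"})


RULES = {"kosher": kosher_ok, "pescatarian": pescatarian_ok, "carnivore": carnivore_ok}


def violations(dish: Dish) -> list:
    """Return the guests whose diet this dish breaks."""
    return [guest for guest, ok in RULES.items() if not ok(dish)]


menu = [
    Dish("crab cakes", frozenset({"shellfish"})),
    Dish("vegetables braised in beef broth", frozenset({"meat", "vegetable"})),
    Dish("herb-crusted salmon", frozenset({"fish", "herbs"})),
]

for dish in menu:
    print(dish.name, "->", violations(dish) or "works for everyone")
```

Under these simplified rules, only the fish dish passes for all three guests, which matches the narrow overlap between the diets.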

Both Gemini and ChatGPT could handle writing basic code

blurry lines of code

Olemedia/Getty Images

We’ve seen a lot of talk about the coding capabilities of AIs like ChatGPT and Gemini, so I decided to put them up to some computer science work for the next challenge. As a writer, I really should have a website, but I’ve never gotten around to it, so why not make robots build one for me? 

I prompted each AI to build a website for me, and both asked me three questions first to determine my needs, just like savvy web developers asking their “clients” what they were looking for. ChatGPT asked for my name, what I wanted on the page, and my preferences for design and layout. Gemini asked for the purpose of my website, my design preferences, and what content I wanted to include.

Max Miller/SlashGear

Both AIs gave me a decent, if extremely simple, HTML webpage on the first try. Images containing a portion of the code for both are below. I’m no coder, but when I ran both of these sites in a web browser, they rendered just fine. However, both were extremely bland and devoid of color, and ChatGPT’s had a series of page links at the top that weren’t clickable. I suppose it figured I’d build those out later.

Max Miller/SlashGear

Still, I wanted the website to pop on the page more, so I submitted a frustratingly vague request to “make it colorful,” and both bots did so. ChatGPT added a solid green background to the header and footer, while Gemini took things in a more flamboyant direction with a pastel gradient that looks way better in my subjective opinion.

Max Miller/SlashGear


Gemini pulled way ahead in processing real-time information

magnifying glass on computer art concept

Da-kuk/Getty Images

By this point, it’s clear that Gemini and ChatGPT are neck and neck with each other when it comes to most tasks, but one major difference between them is that the former has the power of Google at its disposal. With that in mind, our next challenge required both AIs to use real-time information, starting with some sports facts.

At the time of this writing, the Denver Nuggets, with the seemingly unstoppable Nikola Jokić, were battling it out with LeBron James’s Los Angeles Lakers in the 2024 playoffs. I wanted to know when the next game would be. This year’s playoffs feature two of the most dominant teams in basketball, but when it came to which AI could give me the schedule, there was no competition. Gemini tapped into Google’s knowledge graph and gave me the next two playoff times, whereas ChatGPT had no ability to deliver that information. Gemini was also able to explain how the NBA schedule works and give me context for the current Nuggets/Lakers face-off. All ChatGPT could do was apologize for its inability to answer my query, which should give you some insight into why it’s a bad idea to bet against Google in the AI wars.

Max Miller/SlashGear

Gemini is slightly better integrated on mobile

ChatGPT and Gemini icons

Robert Way/Getty Images

Both Gemini and ChatGPT are available via mobile apps as well, so I also wanted to put them to the test on my Samsung Galaxy S23 Ultra. Because Google has home-court advantage with Gemini on Android and has allowed Gemini to integrate as a replacement for Google Assistant, it’s a bit difficult to make this comparison without an iPhone for reference. (Also note that this comparison won’t focus on smart assistant features, which are covered separately.)

The problem with Gemini for Android, as of this writing, is that you cannot use it as a standalone app and you can only embrace it as a very downgraded phone assistant. Over on iOS, Gemini is tucked into a tab inside the Google app, making it much more appealing since you won’t need to give up Siri to use it. ChatGPT, meanwhile, is the same on both platforms, with an app that essentially gives you the desktop experience, along with the ability to have voice conversations with it. That feature went viral when the app launched as people experimented with putting two ChatGPT-equipped phones side by side and making them converse with one another. If you’re a paying customer, you can also access OpenAI’s GPTs, which are third-party programs that tap into the GPT API.

If you’re not paying for either service, and if you don’t want an AI smart assistant, neither app offers much value compared to the web version, so the edge goes to ChatGPT for having a more natural, snappy conversational style.

Gemini shows Google is a force to be reckoned with

Gemini running on a smartphone

Michael M. Santiago/Getty Images

While both ChatGPT and Gemini have their truly impressive moments, both fail often enough to make them unreliable without plenty of additional fact-checking and verification. Even though there is a way to make Google Gemini’s AI responses more precise, the friction is often enough of a hassle that it’s easier to just do the task yourself, which renders the chatbots rather useless. With that said, once you understand the strengths and weaknesses of each, it’s clear that Google and OpenAI are frontrunners in this race.

Overall, the win goes to Gemini for anyone who wants to use real-time information or Google services, along with those who want an AI to feel a bit more human than the sterile-feeling ChatGPT. On the other hand, those who require clinical precision may find ChatGPT more compelling for certain tasks.

However, the advantages of Gemini are a clear reminder that AI is a field Google dominated long before it became a popular craze. If you wanted to find restaurants, sports scores, or other useful information, you’d search Google, where that information would be easily accessible. Now, it looks like the search giant wants to make Gemini the Google Search of AI chatbots. By incorporating the company’s knowledge graph, Google has made Gemini useful for more than generating quick, mediocre copy. In the future, that could pay off even more, especially as the company continues to integrate Gemini with Android.

What was considered when comparing ChatGPT and Gemini?

scientists doing science stuff

Image Source/Getty Images

For this comparison, only the free versions of both OpenAI ChatGPT and Google Gemini were tested. The questions selected were designed to test the models’ abilities to help users with tasks related to everyday life. While the initial prompts given to each were identical for each test, follow-up prompts occasionally differed based on resulting outputs. 
