DeepSeek V3 is SMARTER Than ChatGPT (Here is PROOF)
A real example that no one is talking about…
DeepSeek V3 just launched.
I decided to test it and compare its intelligence to ChatGPT (GPT-4o).
Spoiler: DeepSeek V3 is a lot smarter than GPT-4o. But the example I’m going to share below will shock you.
DeepSeek V3 scores much higher than other top LLMs on math benchmarks.
This is much more important than most people think.
A lot of people will think, “That’s cool, I guess, but I don’t really use LLMs for math. I use LLMs to write and explain things.”
The problem with that thinking is that we’re all using LLMs for “math” more than we think we are.
Math = Reasoning
If an LLM can’t do math, you can’t trust it to do basic logic or reasoning either.
If you ask an LLM to research several websites and synthesise the information, you need the LLM to reason correctly and avoid logical mistakes. Otherwise its output is worthless and could lead you to a wrong conclusion.
DeepSeek V3 vs GPT-4o — My Reasoning Challenge
While researching this article, I asked Grok 3 (one of my favorite LLMs) to come up with a good question I could use to compare DeepSeek V3 and GPT-4o on reasoning.
I thought the question Grok 3 came up with was too easy and that both LLMs would ace it, but I decided to try it anyway.
Here is the question:
You’re buying tickets for a concert with your friends.
The ticket site has two group deals:
Deal A: 4 tickets for $80.
Deal B: 3 tickets for $66.
You need at least 14 tickets total.
What’s the cheapest way to buy enough tickets, and how much will it cost? (You can mix and match deals, but you have to buy them as full deals, not single tickets.)
A 10-year-old should be able to solve this quite easily.
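The puzzle is small enough to brute-force, which makes it easy to check any answer an LLM gives. Here is a minimal Python sketch that tries every combination of the two deals and keeps the cheapest one that reaches 14 tickets. The function name `cheapest_mix` and the constant names are made up for this illustration; only the prices and ticket counts come from the question.

```python
from itertools import product

# Values taken from the question; the names are illustrative only.
DEAL_A = (4, 80)  # (tickets per deal, price in dollars)
DEAL_B = (3, 66)
NEEDED = 14


def cheapest_mix(needed=NEEDED, deal_a=DEAL_A, deal_b=DEAL_B):
    """Return (cost, count_a, count_b, tickets) for the cheapest valid mix."""
    # Buying more than ceil(needed / size) of a single deal is never useful,
    # because that many of one deal already covers the requirement alone.
    max_a = -(-needed // deal_a[0])  # ceiling division
    max_b = -(-needed // deal_b[0])

    best = None
    for a, b in product(range(max_a + 1), range(max_b + 1)):
        tickets = a * deal_a[0] + b * deal_b[0]
        if tickets < needed:
            continue  # not enough tickets, skip this combination
        cost = a * deal_a[1] + b * deal_b[1]
        candidate = (cost, a, b, tickets)
        if best is None or candidate < best:
            best = candidate
    return best


print(cheapest_mix())  # (292, 2, 2, 14)
```

The search lands on two of Deal A plus two of Deal B: exactly 14 tickets for $292. The tempting shortcut of buying only the cheaper-per-ticket Deal A needs four deals (16 tickets) and costs $320, so the mix wins because it avoids paying for spare tickets.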
I also made a video on this.
ChatGPT (GPT-4o) answer: