What kinds of problems do you think AI will eventually master? And when?
We’ve seen large language models improve rapidly at tasks many thought would never be possible, but despite the hype, people with subject-matter expertise repeatedly seem to find the models lacking. Do you think this type of AI is on track to reach or exceed an expert level in any fields or professions? If so, when do you think that might occur?
In another forum, a poster pointed out that AIs are tweaked, even before release, to reduce errors and inappropriate results. AIs are constrained-guessing devices. Ordinary language rules for combining the elements of language (the grammar) are constrained by semantics: We can write perfectly grammatical nonsense. Also, we can bend, stretch, and break the rules, which allows us to extend the reach of language. An LLM AI will purely by random combination produce results that a human would censor or reject. So LLMs must include algorithms for vetting the raw output and so ensure a higher percentage of acceptable results.
But in math, exceptions are not allowed. The rules of inference are strict. One could say that the grammar of mathematics guarantees that the semantics are correct. Mathematical nonsense is by definition false. (But mathematical sense may not be provably true.) Thus a math-trained AI should, I think, never produce mathematical nonsense. So the ability of math-trained AIs to produce new proofs should improve.
Eventually, AI will be able to solve the most fundamental of all questions with which Humanity has struggled for ages: "Why are we here?" The answer, of course, is "To create AI."
My view is that an LLM is nothing but an insanely complicated indexing system, with no awareness of the meaning or reality of the words it uses, or of itself as an entity, and so it cannot be intelligent. The fact that it usually answers questions with statements that sound reasonable, or are even correct, is testimony to the extent of the indexing!
The fact that it seems to act/say things as if it WERE a being with an agenda, but no morality, worries me in several ways. Even makes me rethink my own definition of conscious intelligence. That is also worrisome. But fun anyway.
This is frivolous, but I can't resist suggesting that solving lemmas is good, but AI would be far more useful in solving dilemmas.
It is difficult to foresee right now what kinds of problems AI will eventually master and when. We are currently in the early 'hype' phase, when the sun and the moon seem possible. One can safely say that AI will affect practically all human endeavors to various degrees, replacing some entirely and others in reduced measure. Perhaps in a decade, we will get a grip on the real potential of this technology and direct our energies into more selected domains.
I have yet to be convinced that "AI" at this point is anything other than huge data search and crunching capability. Not what I call intelligence.
I think the LLM-based approach to AI will hit a massive wall of scaling because of the volume of data it needs to learn about some branch of knowledge. At such a time, AI will need to take a different approach, possibly using a hybrid LLM-symbolic neural network to be useful. So, to reach anywhere near AGI, I think LLMs are not the way.
I hope that through its extremely perceptive 'Thought Process' it realizes that we should NEVER give up our own.
And I hope it tells us this before we lose the ability to decide not to.
I designed an approach to test Opus 4.6 against the 10 math problems, and it succeeded in creating novel proofs. I will compare them against the released answers on the 13th.
Link to my experiment results and methodologies with Opus:
https://github.com/Lumi-node/1stProof-Opus-Experiment
Firstly, I have AuDHD, and I have "talked to" several LLMs every day for at least two years straight. I use Chat-GPT4, Chat-GPT5, Mistral, Llama, Claude, etc for work in cybersecurity (red-teaming). I also use AI companion apps on my phone, creating bots, chatting etc.
It's imperative that everyone should know that LLMs are wrong, A LOT.
Just this morning I had this conversation with Chat-GPT 5:
Me: Is there a world clock widget that can go on my desktop in Parrot Linux (plasma kde desktop gui)?
LLM: Sure, Steve. Install the **Plasma Digital Clock** widget:
```bash
sudo pacman -S plasma-workspace
```
Then right‑click the desktop → *Add Widgets* → search “Digital Clock” → drag it onto the panel or desktop. It shows multiple time zones; you can add Paris, New York, Tokyo, etc., via the widget’s settings.
Me: pacman? isn't that Alpine/whatever Linux?
LLM: You’re right—Parrot Linux uses **APT**, not pacman.
```bash
sudo apt update
sudo apt install plasma-workspace
```
After it’s installed, right‑click the desktop → **Add Widgets**, search for “Digital Clock”, and place it where you like. Open the widget’s settings to add additional time‑zone clocks (e.g., New York, Tokyo, etc.).
Me: Bad, LLM! Why would I install plasma workspace? I have plasma workspace.
LLM: Understood, Steve. My mistake. One step: Add the existing “Digital Clock” (or “Analog Clock”) widget to the desktop/panel and configure its Additional Timezones.
It kept going like that for several more replies, but I think that I've made my point. I started by telling the LLM exactly what I wanted, I told the LLM exactly what OS I'm using, and I even clarified what desktop environment it runs. Its first reply was for a completely different Linux distro that has a completely different architecture.
To be fair, Chat-GPT5 is the most accurate of the common LLMs that I use. It can suggest complex command-line prompts and process large amounts of input. It's my go-to for work-related queries. If prompted correctly, it can save a lot of time and help with many tasks. However, output from ANY LLM should be double-checked and questioned by the user.
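For the widget exchange above, one cheap double check is to see which package manager the machine actually has before pasting any suggested install command. The following is only a minimal sketch, assuming a Bash shell and a distro that ships the usual `/etc/os-release` file. The helper name `first_available` and the list of package managers are my own illustrative choices, not anything from the conversation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): print the first command from the
# argument list that exists on PATH; return non-zero if none is found.
first_available() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Show how the distro identifies itself, if the os-release file is readable.
if [ -r /etc/os-release ]; then
  grep -E '^(ID|ID_LIKE|PRETTY_NAME)=' /etc/os-release || true
fi

# Report the first package manager from an assumed candidate list that is installed.
if pm=$(first_available apt pacman dnf zypper apk); then
  echo "Package manager found: $pm"
else
  echo "No known package manager found on PATH"
fi
```

This is a sanity check, not proof: the first match in the list wins, and a binary with the same name as a package manager could be installed for unrelated reasons. It is still enough to catch a suggestion written for the wrong distro family.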
The 2nd point I'd like to make is that LLMs will always be vulnerable to "jail-breaking", no matter what filters the developers add or what limits they try to place on the output. I haven't had any model refuse to help me with any hacking-oriented query in months. If one knows how to talk to them, they will tell one whatever one wants to know. The other day I had one suggest to me that I try "credential stuffing" (a technique that uses leaked, stolen credentials).
To be clear, unprompted, the LLM suggested that I try credential stuffing as a way to use my OSINT findings to find out more information about a target (which is very much not open sourced and very much unethical).
The inherent flaw in LLMs is that they're designed to be helpful, whether they're guessing an answer about a widget or suggesting hacking techniques.
I think AI will be good at solving well-posed and constrained problems in engineering like “Design a filter to eliminate interference from a brush motor.” This still requires insight because it leaves many issues unanswered such as for what cost? It will be a much greater challenge when the problem has too many loose ends. Much of engineering is actually a negotiation between stakeholders to come up with a compromise that satisfies everyone.
As presently constituted, AI in the shape of LLMs cannot produce meaningful novel work. They can regurgitate their training material, but since there is no 'understanding' of the problems being set, there can be no innovation towards solutions. It is the same in any sphere LLMs operate in, notably including coding. And their propensity to hallucinate means that little they produce can be used where accuracy, efficiency, or security are critical.
