ChatGPT’s capabilities are getting worse with age, new study claims

OpenAI’s artificial intelligence-powered chatbot ChatGPT appears to be getting worse over time, and researchers can’t seem to figure out why.

In a July 18 study, researchers from Stanford and UC Berkeley found ChatGPT’s newest models had become far less capable of providing accurate answers to an identical series of questions within the span of a few months.

The study’s authors couldn’t provide a clear answer as to why the AI chatbot’s capabilities had deteriorated.

To test how reliable the different models of ChatGPT were, three researchers, Lingjiao Chen, Matei Zaharia and James Zou, asked GPT-3.5 and GPT-4 models to solve a series of math problems, answer sensitive questions, write new lines of code and conduct spatial reasoning from prompts.

According to the research, in March GPT-4 was able to identify prime numbers with a 97.6% accuracy rate. In the same test conducted in June, GPT-4’s accuracy had plummeted to just 2.4%.

In contrast, the earlier GPT-3.5 model had improved on prime number identification within the same time frame.

Related: SEC’s Gary Gensler believes AI can strengthen its enforcement regime

When it came to generating new lines of code, the abilities of both models deteriorated significantly between March and June.

The study also found that ChatGPT’s responses to sensitive questions, with some examples showing a focus on ethnicity and gender, later became more concise in refusing to answer.

Earlier iterations of the chatbot offered in-depth reasoning for why it couldn’t answer certain sensitive questions. In June, however, the models simply apologized to the user and refused to answer.

“The behavior of the ‘same’ [large language model] service can change substantially in a relatively short amount of time,” the researchers wrote, noting the need for continuous monitoring of AI model quality.

The researchers recommended that users and businesses who rely on LLM services as a component of their workflows implement some form of monitoring analysis to ensure the chatbot remains up to speed.
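The kind of monitoring the researchers describe can be sketched in a few lines: periodically run a fixed benchmark of prompts with known answers against the service, compute accuracy, and flag a drop against a stored baseline. The sketch below is illustrative only; `ask_model` is a hypothetical stand-in (here a perfect local oracle) for a real call to an LLM provider’s API, and the prime-number task mirrors the one used in the study.

```python
def is_prime(n: int) -> bool:
    """Ground-truth checker for the prime-number benchmark."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def ask_model(prompt: str) -> str:
    # Hypothetical placeholder: a real monitor would call the
    # LLM provider's API here and return its raw answer.
    n = int(prompt.rsplit(" ", 1)[-1].rstrip("?"))
    return "yes" if is_prime(n) else "no"

def benchmark_accuracy(numbers: list[int]) -> float:
    """Run the fixed benchmark and return the fraction answered correctly."""
    correct = 0
    for n in numbers:
        answer = ask_model(f"Is this number prime: {n}?")
        expected = "yes" if is_prime(n) else "no"
        correct += answer == expected
    return correct / len(numbers)

def drift_detected(accuracy: float, baseline: float, tolerance: float = 0.05) -> bool:
    """Flag when accuracy falls more than `tolerance` below the stored baseline."""
    return accuracy < baseline - tolerance
```

Run on a schedule, `drift_detected(benchmark_accuracy(fixed_numbers), stored_baseline)` would have flagged the kind of slide the study reports, e.g. a drop from a 0.976 baseline to 0.024 accuracy.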

On June 6, OpenAI unveiled plans to create a team to help manage the risks that could emerge from a superintelligent AI system, something it expects to arrive within the decade.

AI Eye: AIs trained on AI content go MAD, is Threads a loss leader for AI data?