Some doctors are worse than ChatGPT technology at giving advice on eye problems, test used in study shows

The technology behind ChatGPT performed better at assessing eye problems and providing advice than non-specialist doctors, according to a new study.

A study conducted by the University of Cambridge found that GPT-4, the large language model (LLM) developed by OpenAI, performed almost as well as specialist ophthalmologists in a multiple-choice written test.

The AI model, known for generating text from the large amount of data it is trained on, was tested against doctors at different stages of their careers, including junior doctors without specialization, as well as trainee and expert ophthalmologists.

Each group was presented with dozens of scenarios in which patients have a specific eye problem and was asked to make a diagnosis or advise on treatment by choosing from one of four options.

Dr Arun Thirunavukarasu – the lead author of the study – at work

The test was based on written questions, taken from a manual used to test trainee ophthalmologists, on a range of eye problems including sensitivity to light, decreased vision, eye damage and itching.

The manual on which the questions are based is not publicly available. The researchers therefore believe it is unlikely that the large language model was trained on its content.

In the test, GPT-4 performed significantly better than junior doctors, whose level of specialization is comparable to that of general practitioners.

The model achieved similar scores to trainee and expert ophthalmologists, but was beaten by the top-performing experts.

The research was conducted last year using the latest large language models available.

The study also tested GPT-3.5, an earlier version of OpenAI’s model, Google’s PaLM2 and Meta’s LLaMA on the same set of questions. GPT-4 gave more accurate answers than any other model.

Researchers said large language models won’t replace doctors, but they could improve the healthcare system and reduce waiting lists by helping doctors provide care to more patients in the same amount of time.

Dr Arun Thirunavukarasu, lead author of the study, said: “If we had models that could provide care at a similar level to that provided by humans, it would help overcome the problems of NHS waiting lists.

“This requires testing to ensure this is a safe and effective model. But if it is, it could be revolutionary in the way care is delivered.”

He added: “Although the study does not immediately indicate the deployment of LLMs in clinical work, it gives the green light to start developing clinical tools based on LLMs, because the knowledge and reasoning of these models compare well with those of expert ophthalmologists.”