Google’s latest research has revealed that an artificial intelligence (AI) model designed to answer medical questions can approach the accuracy of doctors, while also exhibiting biases of its own.
The study examined the performance of a large language model, similar to ChatGPT, in responding to medical queries and multiple-choice questions.
The researchers discovered that the model incorporated biases that could worsen health disparities and lead to inaccurate responses.
However, Google also developed a specialized version of the model for medicine, which demonstrated improved accuracy and reduced bias, bringing its performance closer to that of a group of doctors who took part in the same study.
While the findings indicate that AI could help clinicians make decisions and access information more efficiently, further development is needed before these models can be used effectively.
The researchers believe that AI could help expand medical capacity by aiding doctors, but efforts to fine-tune the models are crucial to ensure reliable and unbiased responses.
An assessment by a panel of clinicians revealed that the unspecialized model provided answers aligned with the scientific consensus in only 61.9% of cases, while the medicine-focused model achieved a significantly higher rate of 92.6%, comparable to the 92.9% achieved by the clinicians' own answers.
These results highlight the potential of AI in the medical field, but also emphasize the importance of refining the models to improve accuracy and address biases.