Video: more accessible data can help busy doctors better understand the context of what they are looking at.
First of all, let’s ask the question you often hear around AI in healthcare: is AI going to “replace” doctors?
When this question is asked and we look into the future, what we tend to see is that AI is likely to function much more often as a form of ‘decision support’: the ‘human in the loop’ is still going to be an integral part of the work that gets done.
For more on this question in the realm of healthcare, we can listen to Polina Golland talking about why AI may be more of an assistive technology for clinicians than a replacement. (Golland is the Henry Ellis Warren Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. As background, she specifically mentions her collaboration with Stephen Horne, an emergency room doctor at Beth Israel Deaconess Medical Center.)
In some of the examples that Golland gives us, it’s not so much that the data is not there, or that the doctor is unable to analyze it. It’s that the doctor simply doesn’t have time!
“When the patient comes in, and they stay in the hospital for a few days, their chest x-ray is taken every 12 to 24 hours … to measure, the physician looks at it, makes a decision and moves on. So the patient who came in might have (a large number of) x-rays from (the) previous … visits, and that information is captured in the images. … for the next visit, the physician has absolutely no bandwidth to look at the images and get the information from previous visits, and make better predictions for this patient. So they’re making decisions on whether to admit the patient or not; these decisions are recorded, but the information that is in the images is not.”
To which the obvious question would be: why not just give the doctor more time to review?
In practice, that time simply isn’t there, which is where AI might come in: as Golland points out, it could extract the information from those earlier images and present it the way vital signs are presented, using statistical analysis to take the heavy lifting off the clinician.
Later on, though, Golland explains why it might be hard to use a neural network for this particular kind of evaluation.
Part of the reason, as she points out, is that clinicians are telling stories about groups of images, not just going through them by rote, one after another:
“They look at the image and they describe everything that they can see in the image, as well as what the answer might be to the question that the treating physician asked. So we have a collection of images, a quarter million images, with the corresponding radiology reports.”
That means the training data for this task doesn’t come in the form of clean labels the way it does for some other types of projects; the supervision is the free-text narrative of the report. (Throughout, she uses the example of pulmonary edema.)
You can also hear her talking about a “coupling” between what radiologists see and what AI captures. (Check out the part of the video where Golland goes into learning local representations.)
“Our goal is to understand this coupling, and to exploit it to build better predictive models.”
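To make the idea of a learned image-text “coupling” a bit more concrete, here is a minimal sketch (in Python/PyTorch) of one common way to couple the two modalities: a contrastive objective that pulls each image’s embedding toward the embedding of its own report and away from everyone else’s. To be clear, the encoders, dimensions, and loss below are illustrative assumptions, not the actual model from Golland’s lab.

```python
# A minimal, illustrative sketch of learning a joint image-text embedding with a
# contrastive objective. NOT the actual model from the talk; the architectures,
# dimensions, and loss here are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedder(nn.Module):
    def __init__(self, image_dim=2048, text_dim=768, shared_dim=256):
        super().__init__()
        # Project pre-extracted image and report features into one shared space.
        self.image_proj = nn.Linear(image_dim, shared_dim)
        self.text_proj = nn.Linear(text_dim, shared_dim)
        self.temperature = nn.Parameter(torch.tensor(0.07))

    def forward(self, image_feats, text_feats):
        img = F.normalize(self.image_proj(image_feats), dim=-1)
        txt = F.normalize(self.text_proj(text_feats), dim=-1)
        return img, txt

def contrastive_loss(img, txt, temperature):
    # Each x-ray is "coupled" to its own radiology report: matching pairs should
    # score higher than every mismatched pair in the batch, in both directions.
    logits = img @ txt.t() / temperature
    targets = torch.arange(img.size(0), device=img.device)
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random features standing in for real encoder outputs.
model = JointEmbedder()
image_feats = torch.randn(8, 2048)   # e.g. CNN features for 8 chest x-rays
text_feats = torch.randn(8, 768)     # e.g. language-model features for their reports
img, txt = model(image_feats, text_feats)
loss = contrastive_loss(img, txt, model.temperature)
loss.backward()
```

The intuition matches what she describes: matching x-rays and reports end up close together in a shared space, so information from one modality can help interpret the other.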
She points out that DALL-E is similar in that it matches image and text information, and she speaks about the human radiologist as a “biological learning system” this way:
(Image caption: What do humans and AIs have in common?)
“They look at the image, extract the concepts, and then write them down in a fairly limited language to describe for us to use as a guidance for this process,” she says.
As for the AI representation, Golland goes over how this might impact predictions and analysis (watch the video with this!):
“If you enter text, the method can take a look at every region in the image, and paint it with how similar the concept in that region is to the text that you entered: these (red) images correspond to very similar, (and) blue corresponds to not similar at all. So the red regions are the regions that the machine learning algorithm said correspond to the text that the user entered. The black bounding boxes are physicians putting ‘ground truth bounding boxes’ on the concepts that they view as corresponding to the text. So you can see that the alignment (is) pretty close … “
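For readers who want a feel for how a similarity map like that could be computed, here is a small, hedged sketch: embed the query text once, embed each local image region, and color every region by its cosine similarity to the text (red for similar, blue for not similar). The 16x16 patch grid and the random embeddings standing in for real encoder outputs are assumptions for illustration only.

```python
# Illustrative sketch of painting image regions by their similarity to a text query.
# The patch grid and random embeddings are stand-ins for trained encoder outputs;
# this is an assumption-laden toy, not the actual system from the talk.
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

shared_dim = 256
grid = 16  # the image is split into a 16x16 grid of local regions

# In a real system these would come from the trained image and text encoders.
region_embeddings = F.normalize(torch.randn(grid * grid, shared_dim), dim=-1)
text_embedding = F.normalize(torch.randn(shared_dim), dim=-1)

# Cosine similarity between the query text and every local region.
similarity = (region_embeddings @ text_embedding).reshape(grid, grid)

# Red = similar to the entered text, blue = not similar, as in the demo described above.
plt.imshow(similarity.numpy(), cmap="bwr", vmin=-1, vmax=1)
plt.colorbar(label="region-text similarity")
plt.title("Regions matching the query text (toy example)")
plt.show()
```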
And then she explains:
“Our dream of where we should take this next is: if a physician encounters a challenging case in the emergency room, they can basically point at the image and say, ‘what is that here?’ and the system would produce text that might describe that image, which would help the physician brainstorm, in some sense, what possible concepts could correspond to the tricky part of the image. This is similar to the systems that you might have heard about for code generation. … where the machine learning systems hypothesize possible snippets of code, where the programmers then can use them to actually write the programs. Here, we’re creating similar tools for the physicians to get to the bottom of a particularly tricky case.”
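As a purely hypothetical sketch of what that “point at the image and ask” workflow might look like on top of a shared embedding space, the snippet below ranks a bank of report sentences against the embedding of a pointed-at region and returns the best candidates. The sentence bank, dimensions, and random embeddings are all stand-ins; this is not the system described in the talk.

```python
# Hypothetical sketch: given an embedding of the image region the physician points
# at, retrieve the report sentences whose embeddings are most similar. All names,
# dimensions, and the random "sentence bank" are illustrative assumptions.
import torch
import torch.nn.functional as F

shared_dim = 256

# In a real system these would come from a corpus of radiology report sentences
# encoded by the trained text encoder; here they are random stand-ins.
report_sentences = [
    "mild pulmonary edema",
    "no acute cardiopulmonary process",
    "small left pleural effusion",
    "right lower lobe opacity, possibly pneumonia",
]
sentence_embeddings = F.normalize(torch.randn(len(report_sentences), shared_dim), dim=-1)

def describe_region(region_embedding, top_k=2):
    """Return the top-k candidate descriptions for a pointed-at image region."""
    region_embedding = F.normalize(region_embedding, dim=-1)
    scores = sentence_embeddings @ region_embedding
    best = scores.topk(top_k).indices
    return [(report_sentences[i], float(scores[i])) for i in best]

# Toy usage: a random vector stands in for the embedding of the tricky region.
for text, score in describe_region(torch.randn(shared_dim)):
    print(f"{score:+.2f}  {text}")
```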
That’s it in a nutshell. As Golland summarizes in closing:
“We have lots of generous funding from multiple sources, that has enabled us to build machine learning methods that take multimodal data, (for example, text and images,) and build representations that help us support physicians in making clinical decisions better, and ultimately, treat their patients better.”