Podcast: Artificial intelligence in medicine
Read Implementing machine learning in medicine.
Read Problems in the deployment of machine-learned models in health care.
Read Evaluation of machine learning solutions in medicine.
Transcript
Kirsten Patrick Artificial intelligence and machine learning have transformed our lives. The availability of massive amounts of data in the hands of big tech is making our lives easier, and also generating worries about a Big Brother society and Coded Bias — the title of a new Netflix documentary. The adoption of AI in medicine has perhaps lagged its adoption in other areas, and machine learning in healthcare has had mixed results. I'm Dr. Kirsten Patrick, interim editor-in-chief of CMAJ. Here with me today to discuss a series of three articles on the development, use, misuse, and evaluation of machine-learned models in medicine are two of many co-authors on the series, Muhammad Mamdani and Amol Verma. Dr. Mamdani is vice-president of data science and advanced analytics at Unity Health Toronto, director of the Temerty Center for Artificial Intelligence Education and Research in Medicine, and professor at the University of Toronto. Dr. Verma is a physician and scientist at St. Michael's Hospital and the University of Toronto, an AMS healthcare fellow in compassion and artificial intelligence, and a provincial clinical lead in health quality improvement with Ontario Health. But first, a short break.
Advertisement This episode is brought to you by Audi Canada. The Canadian Medical Association has partnered with Audi Canada to offer CMA members preferred incentives on select vehicle models. Purchase any new qualifying Audi model and receive an additional cash incentive based on the purchase type. Details of the incentive program can be found at audiprofessional.ca. Explore the full line of vehicles available to suit your lifestyle. The Audi driving experience is like no other.
This episode is brought to you by Dr. Bill. Dr. Bill makes billing on the go easy and pain free. Add a patient in as little as three seconds and submit a claim with just a few taps. Start your 45-day free trial today. Visit drbill.app/cmaj to get started.
To shingles, age isn't just a number. Do you have patients 50 or older? They're at higher risk of getting shingles. Don't wait. Talk about Shingrix with your patients over 50 today. Shingrix is indicated for the prevention of herpes zoster or shingles in adults 50 years of age or older. Consult the product monograph at gsk.ca/shingrix/pm for contraindications, warnings and precautions, adverse reactions, interactions, and dosing and administration information. To request a product monograph or to report an adverse event, please call 1-800-387-7374. Learn more at thinkshingrix.ca.
Kirsten Patrick Welcome to CMAJ's podcast, Muhammad, Amol. It's great to have you with us.
Amol Verma Thanks for having us.
Muhammad Mamdani Thank you. It's great to be here.
Kirsten Patrick So let's start with the basics. I can't imagine that in 2021 there are many listeners who have not heard of AI and machine learning. But let's get the definitions in so that we're all on the same page. What is artificial intelligence?
Muhammad Mamdani Sure. So artificial intelligence is basically a concept. It refers to the theory and development of computer systems able to perform tasks that normally require human intelligence: things like visual perception, speech recognition, decision making, translation between languages, those sorts of things. Now, machine learning is, I guess you could say, the math or the algorithms: the process of developing systems that learn from data to recognize patterns and make accurate predictions of future events. So with machine learning models, you're typically using math to learn about all sorts of patterns in the actual data.
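[Editor's note: for readers who want to see what "learning patterns from data" looks like concretely, here is a minimal, illustrative sketch in Python using scikit-learn. The data and features are entirely synthetic; nothing here comes from the projects discussed in this episode.]

```python
# A minimal, illustrative sketch of supervised machine learning:
# fit a model on past examples, then predict a future event.
# All data are synthetic; nothing is drawn from a real clinical source.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000
X = rng.normal(size=(n, 5))  # five made-up input features
# A made-up "true" pattern that the model must discover from the data.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)  # the "learning" step
pred = model.predict_proba(X_test)[:, 1]            # predicted probabilities
print(f"Held-out AUROC: {roc_auc_score(y_test, pred):.2f}")
```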
Kirsten Patrick Now, in the articles, you use the term "machine learned" when you're talking about algorithms and tools. Why did you choose to use machine learned rather than machine learning, which is the more commonly known term?
Muhammad Mamdani Yeah, that's a great question. I think some people prefer the term machine learned because it's at that point where you're thinking, okay, you know what, I think I've actually had the algorithms learn enough that they are learned now, so I can start looking at deployment issues. It's closer to the end of the process, where you're satisfied with the model.
Kirsten Patrick Muhammad, why did you want to write this series of articles?
Muhammad Mamdani I think we were seeing a lot of interest in artificial intelligence and machine learning models in medicine. And there's a lot of excitement and energy. And I think we're in a situation where healthcare is still using fax machines quite extensively. There are some challenges in terms of understanding the concepts of AI and machine learning, but also how to deploy them into what's traditionally a very conservative environment that has its challenges with data and with implementation. So we thought, there aren't a lot of papers out there that actually look at how you practically deploy AI or ML solutions into clinical practice. So we gathered experts from around the country and asked for their advice: what are some of the things we need to be thinking about, not only in terms of developing these sorts of solutions, but actually deploying them into practice in meaningful ways? We hope to fill that gap with this series.
Kirsten Patrick When you say you've gathered experts from across the country, where are the co-authors of the series from in Canada?
Muhammad Mamdani Sure. So they range. There are three large AI centers: Amii in Alberta, the Vector Institute in Ontario, and Mila out in Quebec. We collaborated with researchers from each of these centers to come up with the series, and we also had clinicians. It tends to be a very multidisciplinary undertaking. So we brought in clinicians and experts from these three large institutes to write the papers.
Kirsten Patrick Now, one of the articles in the series talks about problems that other AI tools have shown in the medical arena. Can you give us some examples of how machine learning tools have been used in medicine so far, and with what success?
Muhammad Mamdani So I think there are lots of examples of projects or solutions that have been successfully implemented, though I would say more outside healthcare than in it. If we look at the more general application of ML and AI first, and then I'll get into healthcare in a bit: we have examples like shopping on Amazon. I'm sure lots of people have shopped on Amazon. It's incredible how it knows what to recommend in terms of what to buy next. And it bases that on machine learning models, where the data on your previous purchase patterns is matched to people who are kind of like you, hundreds of thousands if not millions of people from all around the world, to say, people like you tend to buy X, and then it actually recommends it. There are all sorts of other applications, like the voice assistants on our smartphones, Google Assistant on Android and Siri on Apple, and Alexa, which Amazon has created, where you interact with these sorts of things. And we see chatbots quite a bit, more and more in medicine as well.

Specifically in healthcare, we have a few examples, some that are emerging, some that are actually being used. For example, if we look at the medical imaging diagnostic space, we have companies like Zebra Medical Vision, RapidAI, Hidoc. Just picking one of these I know, they've been deployed, I think, in over 1800 hospitals. They deal with detecting anomalies in medical images in areas like stroke, aneurysms, and pulmonary embolism. And they're increasingly being used because they can detect anomalies much faster, and in some cases more accurately, than many clinicians can. Another example we're seeing more and more of is virtual care apps, for example TELUS Health MyCare, formerly Babylon, which has been deployed in four provinces for doctor consultations. It's basically an app you can download on your phone, and it has a symptom checker driven by artificial intelligence. You can type in symptoms, and it'll sort through what it could possibly be, and say, you know what, I think you need to see a clinician, and there's a button you can push to schedule a virtual appointment. Now, the problem here is that we don't know how good the symptom checkers are, so there's a lot of apprehension around using them full force. But these are the sorts of AI applications that are making their way into healthcare.

There are also examples that haven't worked as well, and there are numerous examples like this. A pretty good one is a really neat algorithm created by researchers at Google to look at diabetic retinopathy. The concept was that AI could look at retinal images and detect diabetic retinopathy, and the accuracy was pretty fantastic: comparable to, if not better than, trained clinicians in the space. And there was a notion in Thailand that it would be great to use this algorithm, because they had a mass screening campaign for diabetic retinopathy and would love to gain efficiencies from AI. Now, when it was actually deployed in Thailand, it didn't end up working out so well, because, one could argue, there was an out-of-distribution problem. Basically what happened was that the algorithm was trained on very clean, pristine images in a lab-like setting. Of course, in the real world, you don't always get that. There's a lot of messiness. In some cases the images were taken under poor lighting conditions, so the algorithm was effectively dealing with a different dataset than the one it was trained on, and it had trouble translating into clinical practice. There were quite a few failures, people had to be rescheduled, and people were very upset. It didn't go over that well, and they had to close it down and regroup.
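[Editor's note: the Thailand story describes an out-of-distribution problem. As a hedged illustration of one simple safeguard, the Python sketch below compares a summary statistic of incoming images (mean brightness) against the training distribution. The numbers, threshold, and scenario are hypothetical, not drawn from Google's actual system.]

```python
# A simple, illustrative out-of-distribution check: compare per-image mean
# brightness in deployment against the training distribution. All values
# here are synthetic stand-ins; a real system would monitor many statistics.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_brightness = rng.normal(loc=0.65, scale=0.05, size=5000)  # clean lab images
deploy_brightness = rng.normal(loc=0.45, scale=0.15, size=200)  # field clinic images

stat, p_value = ks_2samp(train_brightness, deploy_brightness)
if p_value < 0.01:
    print(f"Possible distribution shift (KS={stat:.2f}, p={p_value:.1e}): "
          "route these images to human graders instead of auto-scoring.")
```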
Kirsten Patrick It sounds like you're talking about most applications being related to diagnosis, and not so much in other areas of medicine.
Amol Verma Yeah, I think what Muhammad was talking about is where a lot of the advances have been: in computer vision, in pattern recognition, especially vision-oriented problems, whether that's diabetic retinopathy, or radiographic images, or dermatology, things like that. There have been really great advances there in clinical medicine. I think what's been less well explored with machine learning is complex clinical decision making and providing decision support for that. For example, you could name anything, any clinical outcome, any condition, and someone out there is developing a machine learning tool today to try to predict that outcome. Right now, there are thousands and thousands of people working in this space and doing really innovative work; it's just that a lot of those prediction tools have not really broken their way into clinical practice. Some of the early efforts in this area, around personalizing cancer care using IBM Watson, for example, didn't perform particularly well in real clinical environments. And I think one of the challenges around that has been, okay, how do we actually get an algorithm to work in real practice and provide useful recommendations? That's something we've been working on at St. Michael's Hospital in Toronto. In some ways, that was the first motivation for a lot of this work: when we started thinking about actually implementing machine learning tools in real-time clinical environments, we realized there was no playbook for doing this.
Kirsten Patrick So that's where this series comes in. And when you talk about deployment in the first article, you use something that you've done at St. Michael's Hospital as a worked example. Would you like to tell us about that?
Amol Verma Yeah, absolutely. So my involvement in this space of machine learning in clinical medicine really started when I was a research fellow working with Muhammad several years ago, an embarrassingly long number of years ago now. We were sitting in Muhammad's office; he had just been named the head of data science and advanced analytics at St. Michael's Hospital, and there was a real energy and enthusiasm within the hospital around implementing advanced analytics in clinical care. And Muhammad asked me a question. He said, if you could use artificial intelligence to predict anything in clinical care, what would it be? That started us on this huge journey. We started talking to clinicians, beginning in general internal medicine, because that's the division I work in. We asked clinicians, we asked patients and caregivers, we asked our nursing colleagues and other allied health professionals this specific question: what can we predict, and how would that help us improve clinical care? What we landed on was predicting clinical deterioration: predicting whether patients are going to die or require intensive care in the next 48 hours. We developed a tool that could predict this in advance and deliver an early warning signal to clinicians to say, focus on this high-risk patient, and see if you can try to prevent their clinical deterioration or improve their care as they deteriorate. We call that tool CHARTwatch, and that was really the focus of the paper: everything we learned as we tried to implement CHARTwatch in a real clinical environment.
Kirsten Patrick So as a former anesthesiologist and intensive care specialist, I can back you up that this is the holy grail. I remember being on call and wondering so many times why I wasn't called earlier to somebody who was deteriorating. And I think that unless you've spent a lot of time with a lot of sick patients in clinical care, detecting who's deteriorating is actually really, really hard. We often don't know until patients are very sick, and it's even worse for kids. So this seems like a super exciting tool. What did you do?
Amol Verma Yeah, just to pick up briefly on that point, you're right. The literature tells us that unrecognized clinical deterioration is the number one root cause of unplanned transfer into an intensive care unit. And what you highlight is the challenge in clinical judgment, I would say. There's also a really important system problem, which is that obviously we can't have eyes on everyone at all times, or monitors on everyone at all times. So trying to identify who the high-risk patients are is, as you said, a holy grail in medicine. It's telling that, more often than not, we say a patient "crashed," right? They had something sudden and unpredicted; we use almost a synonym for an accident, a crash. But of course, that's not how clinical deterioration tends to happen. It often happens with early signals, which are difficult to detect. And that's really where we thought there was real opportunity: a computer can detect patterns early, can identify high-risk patients, and then convey that information to clinicians, so we can focus our attention on the highest-risk patients, monitor them more closely, and try to intervene earlier.

So that was the beginning of our journey. We thought about, okay, what are the elements needed for this early warning system? The first is, of course, you need a machine learning tool, so we assembled a team to develop it. But the tool itself is irrelevant unless you can deploy it in the clinical environment, so you need the right IT and informatics experts to actually get the tool up and running and communicating with clinicians. The other thing you need is some clinical response once you receive the alert: you need to say, okay, this is what you would do for high-risk patients. So we developed an implementation team to work through not only the IT implementation, but the clinical implementation: what does the clinical pathway look like? And the final thing is, once we launched something, we needed to understand whether it was helping or harming patients; very importantly, we wanted to make sure there was no harm associated with this. So we created an evaluation team that thought about how we were going to carefully monitor the tool and assess its effectiveness.

So we created these three parallel teams: a model development team, an implementation team, and an evaluation team. Their work overlapped considerably, so they worked together; we had similar membership across them, with different people focusing on each task but also working across teams, so that, for example, the clinical implementation could affect the way the model was trained. Just to give you a quick example of that: the very first model we developed predicted whether patients were likely to deteriorate at any point during the hospitalization. When we took that back to clinicians, they said, I don't know when this deterioration is going to happen. You're giving me information that this person is at high risk of deteriorating maybe seven days from now, and I don't know what to do with that; I'm not going to watch them closely for 7 or 14 or 30 days.

And so we retrained the model with that feedback to focus on an approximately 48-hour prediction window, because that was felt to be more clinically useful. So there was a lot of back and forth between the teams designing how the model was developed, how the implementation was done, and how the evaluation was done. That was what we landed on as a framework that worked really well for the team structure to actually implement something like this, acknowledging all of the multidisciplinary skills that are necessary, and making sure to elevate the voices of patients and their caregivers as we think about these patient-facing tools.
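[Editor's note: the retraining Dr. Verma describes amounts to changing how the outcome labels are constructed. The pandas sketch below shows, with hypothetical column names and data, how a 48-hour outcome window might be labelled; it is not CHARTwatch's actual code or schema.]

```python
# A sketch of labelling a 48-hour deterioration window from time-stamped data.
# Table, columns, and timestamps are hypothetical.
import pandas as pd

obs = pd.DataFrame({
    "patient_id": [1, 1, 2],
    "prediction_time": pd.to_datetime(
        ["2020-01-01 08:00", "2020-01-02 08:00", "2020-01-01 08:00"]),
    # Time of ICU transfer or death; NaT means no event occurred.
    "event_time": pd.to_datetime(
        ["2020-01-03 20:00", "2020-01-03 20:00", None]),
})

window = pd.Timedelta(hours=48)
# Positive label only if the event falls within 48 hours of the prediction.
obs["label_48h"] = (
    obs["event_time"].notna()
    & (obs["event_time"] > obs["prediction_time"])
    & (obs["event_time"] <= obs["prediction_time"] + window)
)
print(obs[["patient_id", "prediction_time", "label_48h"]])
```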
Kirsten Patrick And I think the story is that you were ready to implement this tool right around the time the pandemic started, right?
Amol Verma Yes, unfortunately, our initial launch was planned for spring 2020, which was exactly when the pandemic hit. So we had to put a pause on the initial launch, although I will say we then did launch in August 2020, in that liminal phase between wave 1 and wave 2 of the pandemic. And it actually turned out to be quite useful in our COVID unit, which was very gratifying to see.
Kirsten Patrick So practically speaking, how did you implement it?
Amol Verma Yeah. So when we had developed the model, there's actually an interesting little anecdote behind this, and I hope Muhammad doesn't mind me pulling the curtain back a little bit. We had developed this model that worked quite well, and we were all set to go with implementation. Our colleagues at the Vector Institute had hosted a large international machine learning for health conference, and Muhammad and I were sitting in the audience listening to one of the professors from the University of Michigan, Professor Jenna Wiens, speaking about her work implementing machine learning technologies. One of the things she commented on was the necessity of a silent testing period: turning on the models, having them run in the back end in real time, and monitoring their performance before pushing them out to the real world. We had kind of heard of this, but we weren't really planning on making it a big part of our intervention, and she talked about all of the horror stories of things that had gone wrong in the absence of a silent testing period. Muhammad and I texted each other during that session. Our launch was planned for the next week or so, and we said, boy, we really have to shut that down and put up a silent testing period. And thank goodness we did. So the first thing we started with was this silent testing period, where the models ran silently in the background. We detected quite a number of challenges related to the data and to the real-time IT environment, and we had to address those problems and get the models running right, in real time, before we were able to finally push towards implementation. Then we went through the pandemic, so it actually ended up being about a nine-month silent testing period where we worked out all the kinks. Then we were ready for a real launch. That involved several months of training of the clinicians to help them understand the tool: the physicians, the residents (St. Michael's is a teaching site), and also our nurse colleagues and the allied health interprofessional teams. Finally, we launched it in a phased approach. We have five general internal medicine teams typically running at St. Michael's Hospital; we started with two of those teams, just to work out the kinks, for about a month, and then expanded and rolled it out to all five teams. Then we included some of the other pieces of the intervention, including the palliative care service and the nursing components. So we did this staggered, phased rollout that helped us troubleshoot as we went. During that time, we had an implementation team that met weekly to review the model and identify issues to be addressed, and we used quality improvement methods, Plan-Do-Study-Act or PDSA cycles, to address concerns as they arose and to revise and refine the tool iteratively, such that over about a three-to-five-month period, we finally had a system we were really happy with that was working really well.
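[Editor's note: a "silent testing period" is straightforward to express in code. The sketch below is a minimal, hypothetical version: the model scores patients on live data and every prediction is logged for later audit, but no alert reaches clinicians until the silent flag is switched off. All names and thresholds are illustrative, not CHARTwatch's actual implementation.]

```python
# Minimal sketch of silent-mode deployment: score and log everything,
# suppress clinician-facing alerts until performance has been verified.
import logging
from datetime import datetime, timezone

logging.basicConfig(filename="silent_predictions.log", level=logging.INFO)

SILENT_MODE = True        # flip to False only after the silent period ends
ALERT_THRESHOLD = 0.8     # illustrative risk cut-off

def score_patient(features) -> float:
    """Stand-in for the real model's risk prediction."""
    return 0.87  # placeholder value

def notify_clinicians(patient_id: str, risk: float) -> None:
    print(f"ALERT: patient {patient_id} is high risk ({risk:.2f})")

def process(patient_id: str, features=None) -> None:
    risk = score_patient(features)
    # Always log, so the silent period produces an auditable record
    # of how the model would have behaved.
    logging.info("%s patient=%s risk=%.3f",
                 datetime.now(timezone.utc).isoformat(), patient_id, risk)
    if not SILENT_MODE and risk > ALERT_THRESHOLD:
        notify_clinicians(patient_id, risk)

process("A123")  # logged silently; no alert is sent while SILENT_MODE is True
```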
Kirsten Patrick It sounds like that speaker's intervention was pretty serendipitous. If you had gone ahead and launched without doing a silent testing period, and there had been problems, would that, do you think, have reduced trust in the tool?
Amol Verma Yeah, I think without question. Muhammad, do you want to reflect on what you thought was a bit of a bullet dodged there?
Muhammad Mamdani Yeah, absolutely. And I think, Amol, you've summarized it extremely well. The key issue for people around machine learning models is: how much do I actually believe what the models are telling me, enough to take action? Because we're often talking about a black-box situation, or so much depth that the average clinician won't really understand the mechanics of it, but needs to believe that this is something they can rely on and trust. So what does that look like? What does that mean? It means we need to be able to demonstrate that the model performs just as well as, if not better than, they do. So we actually did do a bit of that evaluation, with 3000 clinician predictions, to say, all right, are we actually going to gain any advantage from this algorithm over what our clinicians are suggesting? That leads to a bit of trust. But to Amol's point, it's also that silent deployment, working out the bugs and the kinks, so we're putting our best foot forward. Whenever you deploy these things, it's often not about being as good as, but being much, much better than, what we're used to, for people to really buy in. Amol, I don't know if you have any thoughts on that end.
Amol Verma Yeah, to the point about having some successes right after implementation that then engender further trust, I think that's really important. Right when it launched, we experienced a certain buzz and energy around the implementation. It's new, it's interesting, there are a lot of eyes on the system. And early on, there were a couple of instances, a couple of cases, where the system detected patients who subsequently deteriorated, and it generated a bit of mystique around the tool, people asking, how does it know what it knows? Or, oh, this really helped in this specific case. There were a couple of cases where the tool set off an alarm, clinicians went and reassessed patients, and detected occult infections like intra-abdominal infections; one patient had cholecystitis and another patient had diverticulitis. There was another case of a patient where the alarm went off, the resident did a full assessment, and the patient actually seemed okay; then, several hours later, the patient had an emergency respiratory decompensation, but they were able to manage it very easily because they had just done a thorough assessment. They quickly involved the intensive care team, and the patient was rushed to the unit but did not arrest on the ward. Their reflection was that that patient probably would have arrested in the absence of that early warning signal. So I think having a couple of early wins with the model performing really well was very important for our clinicians to trust and buy in. And, Kirsten, had we not had that silent period, the opposite could very easily have happened, right? All it takes is a couple of instances where the tool is suddenly not performing well, and people lose trust.
Kirsten Patrick So moving on to evaluation, the second article in the series makes the point that machine-learned models are not all that different from old-school prediction models, and that we need to worry about the same things, like the data that go into them and the model that's been created. But there are some additional things we may need to be concerned about with machine-learned models. Can you talk me through the ideal evaluation process for machine-learned models in medicine?
Muhammad Mamdani Sure. Evaluation of these sorts of technologies can be fairly complicated. That being said, there is a pretty good traditional framework in epidemiology and in medicine: we follow the evidence-based hierarchy of observational studies, randomized trials, and such. So you have all sorts of options, once you actually deploy the solution, for how to evaluate it. Do you do a randomized trial? That is certainly a possibility; we don't have a lot of published RCTs in this space, and I can think of just a handful. Of course, you have observational study designs as well. Do you deploy the solution and simply look back: how have we historically done, and what happens after deployment? You could then start employing control groups, to see what happened in groups where you didn't intervene. So there are all sorts of options, randomized trials, cohort studies, time-series-based approaches, to really evaluate these sorts of technologies. There's also a huge push towards mixed methods, incorporating qualitative research as well as quantitative research, because oftentimes the numbers don't give you the complete story. I would say the evaluation approach has to be practical, very pragmatic, but it also has to be reliable and valid, and there's often a trade-off. For example, with our early warning system tool in general internal medicine, we tried to estimate how many patients we would need for a randomized trial, and it quickly became apparent that we would need, I think the estimate was, over 20,000 patients. That was just not practical at all. So we had to step back and ask, what can we do in a reasonable way that would still be compelling? Where we landed was a couple of approaches: one being a time series analysis, which asks, in a fairly rigorous way, what happens prior to the intervention and how does it change after the intervention, with some control groups in there; and the other being a propensity-score-matched cohort study comparing the outcomes of patients who have the intervention versus those who don't. So we ended up landing on a couple of methods of evaluation.
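[Editor's note: as a hedged illustration of the propensity-score-matched cohort design Dr. Mamdani mentions, here is a generic Python sketch on synthetic data. It shows the technique in outline only and is not the team's actual analysis.]

```python
# Generic propensity-score matching on synthetic data:
# 1) model the probability of exposure, 2) match on that score,
# 3) compare outcomes between the matched groups.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 4))                              # baseline covariates
exposed = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))    # got the intervention
outcome = rng.binomial(1, 0.15 - 0.03 * exposed + 0.02 * (X[:, 1] > 0))

# Propensity score: probability of exposure given covariates.
ps = LogisticRegression().fit(X, exposed).predict_proba(X)[:, 1]

# Match each exposed patient to the unexposed patient with the nearest score.
t_idx = np.where(exposed == 1)[0]
c_idx = np.where(exposed == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[c_idx].reshape(-1, 1))
_, match = nn.kneighbors(ps[t_idx].reshape(-1, 1))
matched_controls = c_idx[match.ravel()]

diff = outcome[t_idx].mean() - outcome[matched_controls].mean()
print(f"Matched risk difference: {diff:+.3f}")
```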
Kirsten Patrick So I'm going to refer our listeners to the articles themselves to look at them as an amazing resource to see who goes into the planning team, who's part of the evaluation team, and how you implement all these things, according to the frameworks. I'd like to ask you, though, did you involve any patients in this work?
Amol Verma Yeah, we did. It was important for us, I think, both philosophically and scientifically. We know that involving patients and caregivers in designing clinical interventions can improve their quality and make you think of things you hadn't previously considered, and I kind of believe it's almost inherently the right thing to do; it has inherent value. We articulate this in some of the papers, as you mentioned, Kirsten. The way we've suggested it is that patients and caregivers should be involved in patient-facing tools. There may be some machine learning solutions that are purely algorithms designed to change the way you organize, say, a radiology unit, and maybe there are some technical applications that don't always involve that kind of stakeholder consultation, although I would say that for almost all of these applications, it's going to add value. What we did was involve several different people with lived experience of being patients, or caregivers of hospitalized adult medical patients, from our St. Michael's Hospital Patient and Family Advisor Group, and we involved them in a consultative way. Basically, at each stage of designing the tool, we brought several people in, told them about the intervention, and had specific questions for them, and we got their input on those specific questions. It turned out to be extremely valuable. I think what I take away most from those discussions was thinking about how you communicate this tool and the results of an early warning system to a patient. Does the patient have a right to know about all of the predictions? How should that be communicated? Who should communicate it? We had some really rich discussions around that, which I couldn't imagine having without those voices in the room. So I think it's crucially important to involve patients and caregivers and family members in these discussions.
Kirsten Patrick Amol, you've touched on ethics there. And there are a number of ethical issues, which are not necessarily covered in the series, and are probably the subject of another article. In popular discourse about AI, there have been calls for what some people call an FDA of AI: an oversight body that looks at the quality of these tools and approves them for use in medicine or in general society. What do you think about that concept?
Amol Verma Maybe I'll start and then flip this to Muhammad, who certainly thinks about these things at the system level all the time. First of all, there is the FDA, and it is approving and regulating AI technologies in the United States. So I don't think we need a separate FDA for AI; the FDA is doing this work, it has created some approaches to regulating AI, and it's working with experts in this field. I think it's really important to have oversight of these technologies. Like any medical device or tool, they need to be produced with rigor, because they really can affect clinical care in both helpful and harmful ways. So I do think we need some kind of oversight and ongoing maintenance of these solutions. A lot of the things we've been talking about, having different approaches, being really thoughtful about your implementation, the evaluation approach that Muhammad talked about, are actually not that specific to machine learning, right? So what is unique to machine learning tools? Well, one issue is that, because these tools integrate so many different inputs, it's less simple to say, oh, the patient scored this many points on this tool because of factors X, Y, and Z, because it's incorporating hundreds of inputs. You can't always know specifically why any one prediction was generated, for example. And that can make it challenging to ask, is this tool performing fairly across people from different socioeconomic and racial backgrounds? Or, is this tool performing as well today as it did two years ago, when it was developed? Has the tool's performance changed because the tool was taking some information about clinical practice into account, and that clinical practice changed? So how do you keep the machine learning tool current and fresh? You need to retrain it, and that raises its own host of problems. So I think the complexity of these tools raises really important practical issues around performance and around ethics, and oversight is essential. That's different from some of the more conventional technologies that we use.
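[Editor's note: the subgroup-fairness and drift questions Dr. Verma raises can be monitored quantitatively. The sketch below, on synthetic data with hypothetical group names and thresholds, tracks discrimination (AUROC) per subgroup against a stored baseline and flags degradation.]

```python
# Illustrative performance monitoring: compute AUROC per subgroup on recent
# live data and flag any drop beyond a tolerance from the validation baseline.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
baseline_auc = {"group_a": 0.82, "group_b": 0.81}  # hypothetical validation values
TOLERANCE = 0.05                                    # allowable AUROC drop

def recent_batch(group):
    """Stand-in for recent predictions and observed outcomes for one subgroup."""
    y = rng.binomial(1, 0.2, size=500)
    noise = 0.8 if group == "group_b" else 0.3      # simulate drift in group_b
    score = y + rng.normal(scale=noise, size=500)
    return y, score

for group, base in baseline_auc.items():
    y_true, y_score = recent_batch(group)
    auc = roc_auc_score(y_true, y_score)
    status = "OK" if auc >= base - TOLERANCE else "DEGRADED: investigate or retrain"
    print(f"{group}: baseline={base:.2f} current={auc:.2f} -> {status}")
```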
Muhammad Mamdani I think Amol raised some very good points, and some oversight, I agree, is needed. We can look at examples that are out there. For example, in the US there was a paper written, I believe last year or the year before, about algorithms used to manage care in organizations, to target specific people who may need more or less intensive care, and the algorithm was found to be incredibly biased according to race. If you actually deploy these algorithms at wide scale, you're just amplifying the bias we already have in practice, and actually harming people, not helping them. So there's a real worry that these algorithms not only have the potential to do good, but also the potential to do harm if they're misused. The problem is how we oversee or regulate them. What I do worry about is that people who don't really do this for a living, who don't really understand how you deploy these models or how they actually work, may be the ones making the rules about oversight. I would really caution that we need very informed ways of overseeing and regulating these sorts of solutions, because if we don't do it the right way, we will stop really good solutions from coming into practice. On the flip side, if we don't understand these well enough, we will have too-loose approaches to evaluation. So there's a balance: we need very informed people who have actually been through this process and really understand the mechanics, to have a rigorous enough process that we know these algorithms work well, and will work well in a variety of scenarios, but also not to make it so challenging that good tools never get into practice. If we over-regulate and put up too many barriers, that's going to hurt us, not help us. We really need to be able to explore, to learn how these solutions can help us best in healthcare. And that means being progressive and bold, and taking some risks.
Kirsten Patrick What do you see as the future of AI in the medical space?
Muhammad Mamdani We're really in the beginning phases in healthcare. The future of AI, to me, is very exciting. I think one of our biggest issues in healthcare right now is around data: how do we harness the incredible amounts of data that we have in an organized, disciplined manner? When we look at other industries, say the automobile industry, what they do is assimilate a whole bunch of data. In a self-driving car, for example, a car with a bunch of sensors, it'll bring all that data into one place, analyze it in a very disciplined way, and then take action. So your GPS knows to make a right turn; your sensors detect rain, so the car turns on the wipers; there's a red light, the camera says so, so the car stops. It does things in a very disciplined, intentional way. We don't really have that discipline yet in healthcare, though I think we're working towards it. So over the next few years, I see us becoming much more disciplined on the data end, because you need that to do AI and ML. Over the next 20 years, once we actually have more discipline in our data, I would hope to see much more proliferation of machine learning and AI algorithms to drive automation and clinical decision support. Virtual care is on the rise, and we're going to see more digital data there. To the point where, let's say 30 to 50 years from now, we may wake up and have portable biosensors in our homes, click on a video camera to see a clinician, and go through diagnostics at home. In some cases, maybe not even with a clinician: the AI will be able to diagnose reasonably accurately on its own, and clinicians will then deal with the more challenging, more involved cases where we really do need their help, cases the AI wouldn't be able to figure out on its own, rare conditions, for example, that it hasn't seen before. Clinicians can then focus on the real work of medicine rather than some of the more mundane tasks.
Kirsten Patrick Thank you, Muhammad and Amol for joining me today on the CMAJ podcast and drawing attention to your great series of articles that are published today in CMAJ.
Amol Verma Thanks so much for having me.
Muhammad Mamdani Thank you.
Kirsten Patrick I've been speaking to Muhammad Mamdani and Amol Verma, two co-authors on a series of articles on artificial intelligence and machine learning in medicine published in CMAJ. You can find links to all three articles in the show notes. I'm Dr. Kirsten Patrick, interim editor-in-chief of CMAJ. Thanks for listening.
Transcribed by https://otter.ai