Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): The challenges to put ML models in production (Healthcare)

June 10, 2021

The challenges to put ML models in production (Healthcare)

A very good thread. Summarizing the insights below.

Can you detect COVID-19 using Machine Learning? 🤔

You have an X-ray or CT scan and the task is to detect if the patient has COVID-19 or not. Sounds doable, right?

None of the 415 ML papers published on the subject in 2020 was usable. Not a single one!

Let's see why 👇 pic.twitter.com/Vrd91ZpXy3
— Vladimir Haltakov (@haltakov) June 9, 2021

Observations from the papers:

None of the 415 ML papers published on the subject in 2020 was usable. Not a single one!
2212 papers; 415 after initial screening; 62 chosen for detailed analysis; 0 with potential for clinical use
Many papers used very small datasets, often collected from a single hospital, which is not enough for a real evaluation
Some papers used a dataset that contained non-COVID images from children and COVID images from adults. These methods probably learned to distinguish children from adults
Models were trained and tested on the same data
Many papers failed to disclose the amount of data they were tested on, or important aspects of how their models work, leading to poor reproducibility and biased results
Many papers didn't even consult with radiologists
Rushing to publish results based on small, poor-quality datasets undermines the credibility of ML
At some point, people start figuring out how to fine-tune on the test set
Datasets are not diverse enough and not bias-free
The authors find that COVID-19 detectors often attend to the position of the shoulders rather than the lungs. Models can easily learn shortcuts instead of robust features
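Several of these pitfalls can be checked with a few lines of code before any model is trained. The sketch below is only an illustration and is not taken from the thread or the reviewed papers: the record fields (`hospital`, `age_group`, `label`, `pixels`) and the helper names are hypothetical, and the toy data is made up. It holds out an entire hospital for testing, looks for images that appear in both splits, and tabulates labels per age group to expose a confounder like the children-versus-adults case.

```python
import hashlib
from collections import Counter, defaultdict


def split_by_hospital(records, test_hospitals):
    """Hold out whole hospitals so test images come from sites unseen in training."""
    train = [r for r in records if r["hospital"] not in test_hospitals]
    test = [r for r in records if r["hospital"] in test_hospitals]
    return train, test


def leaked_images(train, test):
    """Return content hashes of images present in both the training and test sets."""
    train_hashes = {hashlib.sha256(r["pixels"]).hexdigest() for r in train}
    test_hashes = {hashlib.sha256(r["pixels"]).hexdigest() for r in test}
    return train_hashes & test_hashes


def label_by_group(records, key):
    """Count labels per value of a metadata field (e.g. age group) to spot confounders."""
    table = defaultdict(Counter)
    for r in records:
        table[r[key]][r["label"]] += 1
    return {group: dict(counts) for group, counts in table.items()}


# Toy, made-up records; "pixels" stands in for the raw image bytes.
records = [
    {"hospital": "A", "age_group": "adult", "label": "covid", "pixels": b"img-1"},
    {"hospital": "A", "age_group": "adult", "label": "covid", "pixels": b"img-2"},
    {"hospital": "B", "age_group": "child", "label": "normal", "pixels": b"img-3"},
    {"hospital": "B", "age_group": "child", "label": "normal", "pixels": b"img-4"},
    {"hospital": "C", "age_group": "adult", "label": "normal", "pixels": b"img-5"},
    # Same bytes as the first record: a duplicate that crosses the split.
    {"hospital": "C", "age_group": "adult", "label": "covid", "pixels": b"img-1"},
]

train, test = split_by_hospital(records, test_hospitals={"C"})
print("train/test sizes:", len(train), len(test))
print("images shared between train and test:", len(leaked_images(train, test)))
print("labels per age group:", label_by_group(records, "age_group"))
```

In this toy data every child image is labelled normal, so a model could score well just by recognizing children. That is the kind of shortcut the observations above warn about. Exact-hash matching only catches byte-identical duplicates; resized or re-encoded copies would need a different check.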

Take everything with a pinch of salt. Real-world data is not Kaggle data. Kaggle does not reflect the reality, the quality, or the challenges we find in real data.

Keep Exploring!!!

