Data Science and Machine Learning Developments in 2016



As we welcome 2017, it’s time to sum up the key developments in data science and machine learning from the past year before we turn our attention to the new one.

I compiled expert opinions (such as those here, here, and here), reviewed selected publications, and assembled a short list. For more, read the original full postings!

  • Popularity: From art and literature to science, business, and healthcare, new applications appeared almost daily.
  • Frustration with supervised learning: Supervised learning tells us about correlation, but not causality, and the most effective algorithms require millions of labeled samples. In response, reinforcement learning and adversarial learning techniques have grown in popularity.
  • Unsupervised Learning: Made strides in healthcare with this Nature publication, in which an unsupervised convolutional neural network was used to create a general-purpose patient representation that worked better than existing models.
  • Reinforcement Learning: Many advances were made on the empirical side, notably AlphaGo.
  • Adversarial training: Became very popular during the past year, made famous by Google’s encryption experiment.
  • Fairness: Algorithms discriminate – in good ways and in bad. 2016 saw the European Union’s approval of the General Data Protection Regulation (GDPR), mandating a human “right to explanation” for decisions made by algorithms.
  • Real-Life: Deep learning and ensemble modeling will continue to be used more in day-to-day applications, not just in data science competitions.
  • Natural language processing: Converting natural-language questions into SQL queries that return accurate (although not necessarily quick) answers made great strides.
  • Inclusion: There is still much work to do to include women and the LGBT community in technology sectors; both groups are less likely to attend key conferences and less likely to hold jobs in technology outside of service departments. At 2016’s OpenAI conference, significant strides were made through travel grants and a code of conduct intended to be maximally inclusive and encouraging to underrepresented groups.
  • Diversity isn’t just a matter of gender, either: Academicians, industry experts, artists, minority ethnic groups – 2016 saw the beginning of increasing involvement by groups of all cultural and professional backgrounds. But there is still more work to be done.
  • Data Science: Continues to emerge. 2016 saw further refinement of what it means to be a data scientist. It has become a better-defined and well-paid career choice.


Howard Chen
Associate Informatics Officer at Cleveland Clinic Imaging Institute
(Howard) Po-Hao Chen, MD MBA is the Associate Informatics Officer at the Cleveland Clinic Imaging Institute and a musculoskeletal radiology subspecialist. He has an interest in data-driven radiology, quality improvement, and innovation. Howard has an MD and MBA from Harvard University, and he finished training with fellowships in musculoskeletal radiology, nuclear medicine, and clinical imaging informatics in June 2018 from University of Pennsylvania.
