Active Learning for Natural Language Processing

More than 90% of machine learning applications improve with human feedback. For example, a model that classifying news articles into pre-defined topics has been trained on 1000s of examples where humans have manually annotated the topics. However, if there are tens of millions of news articles, it might not be feasible to manually annotate even 1% of them. If we only sample randomly, we will mostly get popular topics like “politics” that the machine learning model can already identify accurately. So, we need to be smarter about how we sample. This talk is about “Active Learning”, the process of deciding what raw data is the most optimal for human review, covering: Uncertainty Sampling; Diversity Sampling; and some advanced methods like Active Transfer Learning.

Robert Munro has worked as a leader at several Silicon Valley machine learning companies and also led AWS’s first Natural Language Processing and Machine Translation solutions. Robert is the author of Human-in-the-Loop Machine Learning, covering practical methods for Active Learning, Transfer Learning, and Annotation. Robert organizes Bay Area NLP, the world’s largest community of Language Technology professionals. Robert is also a disaster responder and is currently helping with the response to COVID-19.

The slides are available on this link.

Do you like our visualizations? Buy them for yourself!

Visit our shop on Society6 to get a printout of our vizs.

Subscribe to our newsletter

Get highlights on NLP, AI, and applied cognitive science straight into your inbox.

powered by TinyLetter

Posted In: