Highlights of the Month: March 2020
Keywords: Sub-Word Tokenization; Ligand-Based Virtual Screening; Meta-Learning; ELECTRA
Research Papers 🎓
Pre-trained Models for Natural Language Processing: A Survey
A comprehensive review of Pre-trained Models for NLP
A Primer in BERTology: What we know about how BERT works
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. [GitHub]. [Blog]
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Ligand-Based Virtual Screening:
Ligand-based virtual screening uses information about molecules with known activity to predict the activity of new molecules, on the assumption that similar molecules tend to have similar activities and properties. Searching for active molecules can then be viewed as similarity-based querying of a database of molecules with unknown activity, so the choice of molecular similarity measure is the cornerstone of successful virtual screening. The following papers benchmark different methods.
Open-source platform to benchmark fingerprints for ligand-based virtual screening
Heterogeneous Classifier Fusion for Ligand-Based Virtual Screening: Or, How Decision Making by Committee Can Be a Good Thing
Benchmarking Data Sets for the Evaluation of Virtual Ligand Screening Methods: Review and Perspectives
Ligand-Based Virtual Screening Using Graph Edit Distance as Molecular Similarity Measure
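The similarity-based querying described above can be sketched in a few lines. This is a toy illustration, not any of the benchmarked methods: fingerprints are represented as plain Python sets of "on" bit indices (in practice a cheminformatics toolkit such as RDKit would compute them from molecular structures), and the molecule names and threshold are made up for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two bit-vector fingerprints (sets of on-bits)."""
    if not fp_a and not fp_b:
        return 0.0
    inter = len(fp_a & fp_b)
    return inter / (len(fp_a) + len(fp_b) - inter)

def screen(query_fp, database, threshold=0.5):
    """Rank database molecules by similarity to a known-active query."""
    hits = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    return sorted((h for h in hits if h[1] >= threshold),
                  key=lambda h: h[1], reverse=True)

# Toy database: molecule name -> fingerprint (set of on-bit positions)
db = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {3, 4, 5, 6},
    "mol_C": {7, 8, 9},
}
query = {1, 2, 3, 5}
print(screen(query, db))  # → [('mol_A', 0.6)]
```

Only `mol_A` passes the threshold: it shares three of five total on-bits with the query (Tanimoto 0.6), while `mol_B` scores 1/3 and `mol_C` scores 0.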
Meta-Learning: Learning to Learn Fast
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks
Meta-Learning Initializations for Low-Resource Drug Discovery
Molecule Attention Transformer. [Code]
Breaking Down Structural Diversity for Comprehensive Prediction of Ion-Neutral Collision Cross Sections
- A diverse set of 7405 molecules with CCS values.
- Machine learning models to predict CCS.
A Deep Generative Model for Fragment-Based Molecule Generation
Software and Tools 💻
Flair: a very simple framework for state-of-the-art NLP.
A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including Flair embeddings, BERT embeddings, and ELMo embeddings.
PyTorch implementation of SimCLR
Create interactive textual heatmaps for Jupyter notebooks.
An Open Source Library for Quantum Machine Learning
Generative Teaching Networks:
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
Implementation of Generative Teaching Networks for PyTorch.
Articles and Blog Posts 📃
Transformers are Graph Neural Networks
From PyTorch to PyTorch Lightning — A gentle introduction
PyTorch Lightning was created for professional researchers and PhD students working on AI research.
The Ultimate Guide to using the Python regex module
A Deep Dive into the Wonderful World of Preprocessing in NLP
Explain machine learning concepts with (1) simple words and (2) real-world examples.
From PyTorch to JAX: towards neural net frameworks that purify stateful code
Introducing DIET: state-of-the-art architecture that outperforms fine-tuning BERT and is 6X faster to train
Dual Intent and Entity Transformer (DIET) is a multi-task transformer architecture that handles both intent classification and entity recognition together.
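The multi-task idea behind DIET is that one shared encoding feeds both a sequence-level intent classifier and a token-level entity tagger. Below is a minimal shape-level sketch of that pattern, not Rasa's actual implementation: the transformer encoder is replaced by a random linear projection, and all dimensions and names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, emb_dim, hid = 5, 16, 8
n_intents, n_entity_tags = 3, 4

tokens = rng.normal(size=(seq_len, emb_dim))  # toy token embeddings

# Shared "encoder" (stand-in for the transformer): one representation
# per token, used by both task heads below.
W_enc = rng.normal(size=(emb_dim, hid))
H = np.tanh(tokens @ W_enc)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Intent head: pool over tokens, then classify the whole utterance.
W_intent = rng.normal(size=(hid, n_intents))
intent_probs = softmax(H.mean(axis=0) @ W_intent)

# Entity head: one tag distribution per token.
W_entity = rng.normal(size=(hid, n_entity_tags))
entity_probs = softmax(H @ W_entity)

print(intent_probs.shape, entity_probs.shape)  # → (3,) (5, 4)
```

In DIET both heads are trained jointly, so the shared encoder's parameters receive gradients from the intent loss and the entity loss at the same time.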
Training RoBERTa from Scratch: The Missing Guide
How to generate text: using different decoding methods for language generation with Transformers
From Pre-trained Word Embeddings to Pre-trained Language Models: Focus on BERT
In meta-learning there is a meta-learner and a learner. The meta-learner (or agent) trains the learner (or model) on a training set that contains a large number of different tasks. In this stage, the model acquires prior experience and learns feature representations common to all the tasks. Then, whenever there is a new task to learn, the model, with its prior experience, is fine-tuned using the small amount of new training data brought by that task.
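The inner/outer loop described above can be sketched on a toy problem. This is a first-order MAML-style approximation (second derivatives dropped), not the full algorithm from the papers listed earlier: the "model" is a single weight w, each "task" is a different slope a in y = a·x, and all hyperparameters are made up for illustration.

```python
def loss_grad(w, a, xs):
    """MSE loss and gradient for the model y_hat = w*x on a task with slope a."""
    n = len(xs)
    loss = sum((w * x - a * x) ** 2 for x in xs) / n
    grad = sum(2 * (w * x - a * x) * x for x in xs) / n
    return loss, grad

def fomaml(tasks, w=0.0, inner_lr=0.1, outer_lr=0.05, steps=200):
    """Meta-learn an initialization w that adapts to any task in one step."""
    xs = [1.0, 2.0, 3.0]  # shared toy inputs
    for _ in range(steps):
        meta_grad = 0.0
        for a in tasks:
            _, g = loss_grad(w, a, xs)
            w_adapted = w - inner_lr * g          # learner: one inner-loop step
            _, g_post = loss_grad(w_adapted, a, xs)
            meta_grad += g_post                   # first-order meta-gradient
        w -= outer_lr * meta_grad / len(tasks)    # meta-learner update
    return w

w0 = fomaml(tasks=[1.0, 3.0])  # slopes 1 and 3: the learned init lands near 2
```

The learned initialization sits between the two task optima (w = 1 and w = 3), so a single fine-tuning step on either task's small dataset moves it close to that task's solution.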
Notable Mentions ✨
AI Curriculum. Open Deep Learning and Reinforcement Learning lectures from top Universities like Stanford University, MIT, UC Berkeley.
Huggingface’s official notebook tutorials
Today we're happy to release four new official notebook tutorials available in our documentation and in Colab thanks to @MorganFunto to get started with tokenizers and transformer models in just seconds! — Hugging Face (@huggingface), March 5, 2020