AI / Machine Learning

Meta-Semi Is an AI Algorithm That ‘Learns How to Learn Better’

Researchers from Tsinghua University are proposing an algorithm that would help deep learning AI models exploit data more efficiently, without compromising on accuracy.
Jun 6th, 2023 3:00am

As a subset of machine learning, deep learning is a form of artificial intelligence inspired by how the human brain works. Deep learning powers natural language processing (NLP), which underpins applications like voice search and intelligent assistants, as well as computer vision tasks such as image classification.

However, many deep learning models to date have relied on supervised training to some degree, which requires that humans manually identify and label data before it can be used to train an AI model, a process that takes considerable time and money.

Semi-supervised learning (SSL), which trains on both labeled and unlabeled data, is a potential solution, but existing SSL methods can be impractical in many real-world scenarios where labeled data is scarce.

Now, a team of researchers from Tsinghua University is proposing an algorithm that helps deep learning AI models exploit the available labeled data more efficiently, without compromising much on accuracy.

According to the team, their algorithm outperforms other semi-supervised learning methods, and it would allow deep learning models to be trained effectively with only a small sample of annotated data.

“We propose a meta-learning-based SSL algorithm, named Meta-Semi, to efficiently exploit the labeled data, while it requires tuning only one additional hyper-parameter to achieve impressive performance under various conditions,” wrote the team in their paper, which was recently published in the journal CAAI Artificial Intelligence Research. “The proposed algorithm is derived from a simple motivation: the network can be trained effectively with the correctly ‘pseudo-labeled’ unannotated samples.”

In machine learning, a hyperparameter is a parameter whose value controls the learning process itself, whereas ordinary parameters, such as a network’s weights, derive their values from training on the data.
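
To make that distinction concrete, here is a minimal toy sketch (the numbers are illustrative, not from the paper): the learning rate is a hyperparameter chosen by hand before training, while the weight is an ordinary parameter learned from the data.

```python
import numpy as np

# Toy example: fit y = w * x with gradient descent.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)  # true weight is 3.0

learning_rate = 0.1  # hyperparameter: chosen by hand before training
w = 0.0              # ordinary parameter: learned from the data

for _ in range(100):
    grad = np.mean(2 * (w * x - y) * x)  # gradient of the mean squared error
    w -= learning_rate * grad

print(f"learned w = {w:.3f}")  # close to 3.0 when the learning rate is well chosen
```

Set the learning rate too high and training diverges; too low and it crawls, which is why performance can hinge on getting such values right.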

The issue with other semi-supervised learning algorithms is that they introduce multiple tunable hyperparameters into the training process, and their final performance is contingent on those hyperparameters being set to the right values.

In real-life applications like medical image processing, hyperspectral image classification, network traffic recognition, and document recognition, searching for the optimal hyperparameter configuration is not always feasible.

In addition, the team’s use of a technique called “pseudo-labeling” gives the Meta-Semi algorithm an advantage. In pseudo-labeling, a model is initially trained with whatever labeled data is available.

The trained model then predicts labels for the unlabeled data, creating a set of pseudo-labeled samples. The model is then repeatedly retrained on the combined labeled and pseudo-labeled data, gradually improving its accuracy.
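
In generic form (this is a sketch with scikit-learn, not the team’s own code), that loop might look like the following; the synthetic dataset, logistic-regression model, and 0.95 confidence threshold are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative data: pretend only the first 50 samples have labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_lab, y_lab = X[:50], y[:50]
X_unlab = X[50:]  # labels treated as unavailable

# 1. Train on the small labeled set alone.
model = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)

for _ in range(5):  # a few rounds of self-training
    # 2. Predict pseudo-labels for the unlabeled pool.
    proba = model.predict_proba(X_unlab)
    pseudo = proba.argmax(axis=1)
    confident = proba.max(axis=1) > 0.95  # illustrative confidence threshold

    # 3. Retrain on labeled data plus the confident pseudo-labeled samples.
    X_train = np.vstack([X_lab, X_unlab[confident]])
    y_train = np.concatenate([y_lab, pseudo[confident]])
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
```

The weakness of this naive loop is that wrong pseudo-labels get fed back into training, which is exactly the failure mode Meta-Semi’s filtering step targets.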

In the case of the Meta-Semi model, the team’s process included filtering out samples whose pseudo-labels were erroneous or unreliable, and then training the model with the filtered dataset that contained the most reliable pseudo-labels.

As the team explains, this filtering step is part of a “meta-learning” paradigm, in which the correctly pseudo-labeled data is dynamically reweighted so that its distribution resembles that of the labeled data, thereby minimizing the loss on the labeled set.

“The idea of meta-learning is motivated by the goal of ‘learning to learn better’,” explained the researchers. “Meta-learning algorithms usually define a meta-optimization problem to extract information from the learning process.”
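
As a rough sketch of that meta-optimization idea, the code below performs one reweighting step in the spirit of what the paper describes: per-sample weights on a pseudo-labeled batch are chosen so that a tentative parameter update reduces the loss on the labeled batch. This is a heavily simplified stand-in, not the exact update rule derived in the Meta-Semi paper; the bare linear classifier, learning rate, and weight normalization are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def meta_reweight_step(W, x_pseudo, y_pseudo, x_labeled, y_labeled, lr=0.1):
    # Per-sample weights for the pseudo-labeled batch, initialized to zero.
    w = torch.zeros(len(x_pseudo), requires_grad=True)

    # Tentative one-step update of W under the weighted pseudo-label loss.
    per_sample = F.cross_entropy(x_pseudo @ W, y_pseudo, reduction="none")
    grad_W, = torch.autograd.grad((w * per_sample).sum(), W, create_graph=True)
    W_tent = W - lr * grad_W

    # Ask how each sample weight affects the loss on the *labeled* data.
    labeled_loss = F.cross_entropy(x_labeled @ W_tent, y_labeled)
    grad_w, = torch.autograd.grad(labeled_loss, w)

    # Keep only samples whose pseudo-labels would reduce the labeled loss.
    w_new = torch.clamp(-grad_w, min=0.0)
    if w_new.sum() > 0:
        w_new = w_new / w_new.sum()

    # Real update of W using the filtered, reweighted pseudo-labeled batch.
    per_sample = F.cross_entropy(x_pseudo @ W, y_pseudo, reduction="none")
    grad_final, = torch.autograd.grad((w_new * per_sample).sum(), W)
    return (W - lr * grad_final).detach().requires_grad_(True)

# Illustrative usage with random data: 64 features, 10 classes.
W = (0.01 * torch.randn(64, 10)).requires_grad_(True)
x_p, y_p = torch.randn(32, 64), torch.randint(0, 10, (32,))
x_l, y_l = torch.randn(16, 64), torch.randint(0, 10, (16,))
W = meta_reweight_step(W, x_p, y_p, x_l, y_l)
```

Samples whose pseudo-labels would increase the labeled loss receive zero weight, which mirrors the filtering step described above.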

With this approach, the team’s Meta-Semi algorithm consistently outperformed other state-of-the-art semi-supervised algorithms, notably even with fewer labeled samples and a larger number of classes.

In particular, Meta-Semi excelled on the challenging benchmarks posed by the image datasets CIFAR-10, CIFAR-100, STL-10, and SVHN, which are frequently used to train and evaluate AI models.

The team noted that Meta-Semi “converges to the stationary point of the loss function on labeled data under mild conditions,” and that it requires far less hyperparameter tuning while attaining state-of-the-art performance on the four aforementioned datasets.

The team is now working to refine Meta-Semi into an even more effective and powerful version of the algorithm, one that further reduces the required amount of labeled data, training time and hyperparameter tuning.
