New AI method increases prediction accuracy and reliability.

Abstract: Researchers developed a new method to improve uncertainty estimation in machine learning models, increasing prediction accuracy. Their method, IF-COMP, uses the minimum description length principle to provide more reliable confidence measures for AI decisions, which is important in high-stakes settings such as healthcare.

This scalable technique can be applied to larger models, helping non-experts determine the reliability of AI predictions. The results can lead to better decision-making in real-world applications.

Important facts:

  1. Improved accuracy: IF-COMP improves uncertainty estimation in AI predictions.
  2. Scalability: Applicable to large, complex models in critical settings such as healthcare.
  3. User friendly: Helps non-experts assess the reliability of AI predictions.

Source: MIT

Because machine learning models can make incorrect predictions, researchers often equip them with the ability to tell how confident they are about a particular decision. This is especially important in high-stakes settings, such as when models are used to help identify disease in medical images or filter job applications.

But quantifying a model's uncertainty is only useful if those uncertainty estimates are accurate. If a model says it is 49% confident that a clinical image shows pleural effusion, then it should be correct 49% of the time.
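
To make that notion of calibration concrete, the short Python sketch below (illustrative only, not code from the paper) measures the gap between a model's stated confidence and its observed accuracy, a standard quantity known as expected calibration error. The function name and example numbers are assumptions for the demonstration.

```python
# A minimal sketch (not the authors' code) of what "well-calibrated" means:
# among predictions made with ~49% confidence, about 49% should be correct.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Compare average confidence to accuracy within confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of points in the bin
    return ece

# Example with made-up numbers: a perfectly calibrated model would give ECE near 0.
conf = np.array([0.9, 0.8, 0.49, 0.49, 0.6])
hit = np.array([1, 1, 0, 1, 1])
print(expected_calibration_error(conf, hit))
```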

The researchers tested their system on these three tasks and found it to be faster and more accurate than other methods. Credit: Neuroscience News

MIT researchers have introduced a new method that can improve uncertainty estimation in machine learning models. Their method not only produces more accurate uncertainty estimates than other techniques, but does so more efficiently.

In addition, because the technique is scalable, it can be applied to large deep learning models that are increasingly being deployed in healthcare and other safety-critical situations.

This technique can give end users, many of whom lack machine learning expertise, better information they can use to decide whether to trust a model's predictions or whether the model should be deployed for a particular task.

“It's easy to see that these models perform really well in scenarios where they're great, and then assume they'll be just as good in other scenarios.

“This makes it particularly important to advance the kind of work that seeks to better calibrate the uncertainty of these models to ensure that they are consistent with human notions of uncertainty,” says lead author Nathan Ng, a graduate student at the University of Toronto who is a visiting student at MIT.

Ng co-authored the paper with Roger Grosse, an assistant professor of computer science at the University of Toronto, and senior author Marzyeh Ghassemi, an associate professor in the Department of Electrical Engineering and Computer Science and a member of the Institute for Medical Engineering and Science and the Laboratory for Information and Decision Systems. The research will be presented at the International Conference on Machine Learning.

Assessing uncertainty

Uncertainty quantification methods often require complex statistical calculations that don't scale well to machine learning models with millions of parameters. These methods also require users to make assumptions about the model and the data used to train it.

The MIT researchers took a different approach. They use what is known as the minimum description length principle (MDL), which does not require the assumptions that can undermine the accuracy of other methods. MDL is used to better quantify and calibrate the uncertainty for the test points the model is asked to label.

The technique the researchers developed, known as IF-COMP, makes MDL fast enough to be used with large deep learning models deployed in many real-world settings.

MDL involves considering all the possible labels a model could give a test point. If many alternative labels fit the point well, the model's confidence in the label it chose should decrease accordingly.

“One way to understand how confident a model is would be to give it some counterfactual information and see how likely it is to believe you,” says Ng.

For example, consider a model that says a clinical image shows pleural effusion. If the researchers tell the model the image shows an anomaly instead, and the model is willing to update its belief, then it should be less confident in its original decision.

With MDL, if a model is confident when it labels a data point, it should use a very short code to describe that point. If it is uncertain about its decision because the point could plausibly have many other labels, it uses a longer code to capture those possibilities.

The total amount of code used to label a data point is known as stochastic data complexity. If researchers ask the model how willing it is to update its belief about a data point given contrary evidence, the stochastic data complexity should decrease if the model is confident.
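
The idea can be illustrated with a toy sketch, which is not the paper's algorithm: each candidate label gets a hypothetical score for how well the model could fit it after being nudged to accept that label, the scores are normalized, and the logarithm of the normalizer acts as the extra code length. All numbers and names below are made up for illustration.

```python
# A toy sketch of the MDL intuition described above, not the authors' implementation:
# if only one label fits the point well, the code is short and the penalty is near 0;
# if several labels fit about equally well, the code gets longer and confidence drops.
import math

def toy_complexity_penalty(fit_after_adopting_label):
    """fit_after_adopting_label: candidate label -> probability the model assigns
    that label after being nudged to accept it (hypothetical numbers)."""
    normalizer = sum(fit_after_adopting_label.values())
    normalized = {y: p / normalizer for y, p in fit_after_adopting_label.items()}
    penalty_bits = math.log2(normalizer)  # extra code length, ~0 when one label dominates
    return normalized, penalty_bits

# One label clearly fits: short code, low complexity, high confidence.
print(toy_complexity_penalty({"effusion": 0.95, "edema": 0.04, "normal": 0.03}))

# Several labels fit about equally: long code, high complexity, low confidence.
print(toy_complexity_penalty({"effusion": 0.80, "edema": 0.75, "normal": 0.70}))
```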

But testing each data point using MDL would require a lot of computation.

Speeding up the process

With IF-COMP, the researchers developed an approximation technique that can accurately estimate stochastic data complexity using a special function known as an influence function. They also employed a statistical technique called temperature scaling, which improves the calibration of the model's outputs. This combination of influence functions and temperature scaling enables high-quality approximations of stochastic data complexity.
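
Temperature scaling on its own is a standard calibration step. The sketch below shows just that ingredient in isolation, using a simple grid search as an assumed fitting strategy; it does not implement the influence-function part of IF-COMP, and all variable names are illustrative.

```python
# A minimal sketch of temperature scaling alone: divide the model's logits by a
# single temperature T chosen on held-out data to minimize negative log-likelihood.
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    probs = softmax(logits / T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(val_logits, val_labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the single temperature T that best calibrates held-out predictions."""
    return min(grid, key=lambda T: nll(val_logits, val_labels, T))

# Overconfident (exaggerated) logits get softened by a fitted T > 1.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 3)) * 4.0
labels = rng.integers(0, 3, size=200)
print("fitted temperature:", fit_temperature(logits, labels))
```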

Finally, IF-COMP can effectively produce well-calibrated uncertainty quantifications that reflect the true confidence of the model. The technique can also determine if the model has mislabeled some data points or reveal which data points are outliers.
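
One plausible way such per-point scores could be used downstream is sketched below: points whose complexity score is unusually high are flagged for human review as possible outliers or mislabeled examples. The percentile threshold is an assumption for illustration, not a rule taken from the paper.

```python
# A hedged sketch of using per-point complexity scores for triage: flag the
# points with unusually high scores as candidate outliers or mislabeled data.
import numpy as np

def flag_suspicious(scores, percentile=95):
    """Return indices of points whose score exceeds a percentile cutoff."""
    scores = np.asarray(scores, dtype=float)
    cutoff = np.percentile(scores, percentile)
    return np.flatnonzero(scores > cutoff), cutoff

# Synthetic scores: most points are ordinary, the last five are high-complexity.
scores = np.concatenate([np.random.default_rng(1).normal(1.0, 0.2, 95),
                         [3.1, 2.8, 3.5, 2.9, 3.3]])
idx, cutoff = flag_suspicious(scores)
print(cutoff, idx)  # the five high-complexity points should be flagged
```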

The researchers tested their system on these three tasks and found it to be faster and more accurate than other methods.

“It's really important to be certain that a model is well-calibrated, and there is a growing need to detect when a specific prediction doesn't look quite right. Machine learning auditing tools are becoming more necessary as we use large amounts of unexamined data to make models that will be applied to human-facing problems,” says Ghassemi.

IF-COMP is model-agnostic, so it can provide accurate uncertainty quantification for many types of machine learning models. This enables it to be deployed in a wider range of real-world settings, ultimately helping more practitioners make better decisions.

“People need to understand that these systems are very fallible and that they can make things up as they go. A model may look like it's highly confident, but there are a ton of different things it is willing to believe given evidence to the contrary,” says Ng.

In the future, researchers are interested in applying their approach to larger language models and studying other potential use cases for the principle of minimum description length.

About this AI research news

Author: Melanie Grados
Source: MIT
Contact: Melanie Grados – MIT
Image: This image is credited to Neuroscience News.

Original research: closed access
“Measuring Stochastic Data Complexity with Boltzmann Influence Functions” by Nathan Ng et al. arXiv


Abstract

Measuring Stochastic Data Complexity with Boltzmann Influence Functions

Estimating the uncertainty of a model's prediction on a test point is an important part of ensuring reliability and calibration under distribution shifts.

A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point and decreases confidence in a prediction if other labels are also consistent with the model and the training data.

In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as to measure complexity in both labeled and unlabeled settings.

We empirically validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
