Big Data News Hubb

Bringing Models and Data Closer Together

by admin
February 9, 2023
in Big Data


We are excited to announce a new AutoML capability that makes it quick and easy to use Feature Store data to improve model outcomes. AutoML users can now join Feature Store tables to their AutoML data sets to improve model quality. As Machine Learning (ML) gets faster and easier, customers can apply this transformational technology to a growing variety of use cases and find more ways to grow revenue or reduce costs. We have already seen many customers use AutoML to solve critical business challenges: some use it to extend their ML expertise, while others use it to accelerate their outcomes. With today’s announcement, AutoML is now fully integrated with the Databricks Feature Store.

What is a Feature Store?

A feature store is a centralized data repository that enables data scientists to store, find, and share features. The feature store ensures that the same code used to compute feature values is used for both model training and inference. This creates a curated set of data that modelers can rely on both to train and to deploy their models. Many companies report significant accelerations in experimentation and deployment when using the Feature Store. For example, the Director of Data Engineering at Anheuser-Busch InBev said, “It [the Feature Store] has been instrumental in helping us quickly scale our data science capabilities as well as in uniting data engineers and analysts alike with a common source of feature engineering and data transformations.”
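To make the "same code for training and inference" idea concrete, here is a minimal sketch in plain Python. It is not the Feature Store API; the function `trip_features` and the features it computes are hypothetical and exist only to show one feature definition being reused on both paths.

```python
from datetime import datetime


def trip_features(pickup: datetime, dropoff: datetime, distance_km: float) -> dict:
    """Single definition of the feature logic (hypothetical features)."""
    duration_min = (dropoff - pickup).total_seconds() / 60.0
    avg_speed_kmh = distance_km / (duration_min / 60.0) if duration_min > 0 else 0.0
    return {
        "duration_min": duration_min,
        "avg_speed_kmh": avg_speed_kmh,
        "pickup_hour": pickup.hour,
    }


# Training path: features are computed in bulk from historical trips.
historical_trips = [
    (datetime(2023, 1, 5, 8, 0), datetime(2023, 1, 5, 8, 30), 12.0),
    (datetime(2023, 1, 5, 18, 15), datetime(2023, 1, 5, 18, 45), 6.0),
]
training_rows = [trip_features(*trip) for trip in historical_trips]

# Inference path: the same function is applied to one incoming request,
# so the model sees features computed exactly as they were in training.
serving_row = trip_features(
    datetime(2023, 1, 5, 8, 0), datetime(2023, 1, 5, 8, 30), 12.0
)
```

Because both paths call one function, a change to the feature logic cannot silently apply to training but not to serving, which is the kind of skew a feature store is meant to prevent.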

Getting started with a feature store is easy: any Delta table with a primary key and a timestamp can be used in the feature store. You can learn more about the Databricks Feature Store here: AWS, Azure, GCP.

How will this integration accelerate ML outcomes?

Databricks AutoML (AWS, Azure, GCP) was designed to help customers at all levels of technical expertise build and train ML models. AutoML not only provides a high-quality candidate model, but also gives the customer all of the model code in a notebook, in case the customer wants to tune the model’s performance further.

In the past, AutoML could train a model using a single table as the training set. Now, customers can improve model quality by augmenting their AutoML training data with data from their feature store, which makes it easier to train a more accurate model. AutoML models that use the Feature Store integration automatically capture feature lineage and add the new model to end-to-end lineage tracking. This lineage accelerates deployment and provides the tooling to help meet your MLOps and compliance needs.

How do I get started?

In the AutoML experiment page, select a cluster with Databricks Runtime 11.3 LTS ML or above. After selecting the problem type, data set and prediction target, you will see a button in the bottom left of the screen.

Selecting this button lets you choose the feature tables to join to your data set, as well as the lookup keys that will be used for the joins.
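Conceptually, joining a feature table on a lookup key works like a keyed left join. The sketch below illustrates that idea in plain Python; it is not how AutoML implements the join, and the helper `join_features`, the column names, and the sample values are all hypothetical.

```python
def join_features(rows, feature_table, lookup_key):
    """Left-join feature columns onto each training row by lookup_key."""
    # Index the feature table by its key for constant-time lookups.
    index = {feat[lookup_key]: feat for feat in feature_table}
    feature_cols = sorted(
        {col for feat in feature_table for col in feat if col != lookup_key}
    )
    joined = []
    for row in rows:
        match = index.get(row[lookup_key], {})
        # Rows without a matching key keep their data; new columns are None.
        extra = {col: match.get(col) for col in feature_cols}
        joined.append({**row, **extra})
    return joined


# Hypothetical training data and feature table keyed by pickup ZIP code.
training_rows = [
    {"pickup_zip": "10001", "trip_distance": 2.5, "fare_amount": 11.0},
    {"pickup_zip": "10003", "trip_distance": 1.1, "fare_amount": 6.5},
    {"pickup_zip": "99999", "trip_distance": 4.0, "fare_amount": 15.0},
]
pickup_features = [
    {"pickup_zip": "10001", "mean_fare_last_hour": 12.4},
    {"pickup_zip": "10003", "mean_fare_last_hour": 9.8},
]

augmented = join_features(training_rows, pickup_features, "pickup_zip")
```

Every training row survives the join, so the training set keeps its size; rows whose key is missing from the feature table simply get empty feature values.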

Once we have identified the tables that we want to join and their lookup keys, we can simply hit the “Start AutoML” button, and the service will start creating models with both your input data and the data added from your feature store tables. In this example, augmenting the NYC Yellow Taxi fares data with feature tables brings a 21% improvement in model fit (i.e., a decrease in RMSE from 3.991 to 3.142).
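The 21% figure follows directly from the two RMSE values quoted above. As a quick check, the relative improvement can be computed like this (the function name `relative_improvement` is just for illustration):

```python
def relative_improvement(baseline_rmse: float, new_rmse: float) -> float:
    """Fractional reduction in RMSE relative to the baseline (lower is better)."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# RMSE values quoted in the article for the NYC Yellow Taxi example.
improvement = relative_improvement(3.991, 3.142)
print(f"RMSE reduced by {improvement:.1%}")
```

The reduction is 0.849 out of 3.991, which rounds to the 21% reported.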

This integration is not limited to the AutoML UI: the AutoML API now also supports programmatically augmenting your training data with feature store tables. You can learn more about the API capabilities here (AWS, Azure, GCP).
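As a rough sketch of the programmatic route, the snippet below passes feature table lookups to `databricks.automl.regress`. The `feature_store_lookups` argument and its `table_name`, `lookup_key`, and `timestamp_lookup_key` fields reflect our reading of the Databricks AutoML Python API documentation and should be verified against the docs for your runtime. The table names, column names, and timeout are hypothetical, and the AutoML call only works on a Databricks cluster running an ML runtime.

```python
def build_feature_store_lookups():
    """Describe which feature tables to join and on which keys.

    Table and column names are hypothetical placeholders.
    """
    return [
        {
            "table_name": "feature_store.trip_pickup_features",
            "lookup_key": ["pickup_zip"],
            "timestamp_lookup_key": "rounded_pickup_datetime",
        },
        {
            "table_name": "feature_store.trip_dropoff_features",
            "lookup_key": ["dropoff_zip"],
            "timestamp_lookup_key": "rounded_dropoff_datetime",
        },
    ]


def run_automl_with_features(train_df):
    """Start an AutoML regression run augmented with feature tables."""
    # Imported here because the package is only available on Databricks.
    from databricks import automl

    return automl.regress(
        dataset=train_df,
        target_col="fare_amount",  # hypothetical prediction target
        feature_store_lookups=build_feature_store_lookups(),
        timeout_minutes=30,  # hypothetical time budget
    )
```

Keeping the lookup definitions in their own function makes them easy to review and reuse across experiments without starting a run.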

As we continue to invest in ways of making ML faster and simpler, we are excited to see how customers improve their workflows and look forward to finding more ways we can help teams achieve their ML objectives.



© 2022 Big Data News Hubb All rights reserved.
