Big Data News Hubb

Label Studio Survey Highlights Changing Investments and Technology Choices with the Shift from Model-Centric to Data-Centric AI 

by admin
December 10, 2022
in Big Data


Data science teams are shifting their focus from model development to dataset development in order to deliver Machine Learning (ML) and Artificial Intelligence (AI) initiatives that are more performant, differentiated and aligned with business goals. This and other findings are available in the first Label Studio Community Survey, where data scientists, ML engineers and researchers from the global open source community shared insights into the state of ML and AI. 

Label Studio is the most popular open source data labeling platform with more than 150,000 users worldwide, 95,000,000+ annotations created and over 11,000 stars on GitHub. Community members from more than 40 countries participated in the survey, and 75% of the survey respondents currently have ML/AI models in production with another 15% planning to have models in production soon. 

“We’re in the midst of a fundamental shift in how organizations approach ML and AI,” said Michael Malyuk, co-founder and CEO of Heartex, creators of Label Studio. “Model development was once the source of differentiated value, but as the results of this survey highlight, organizations now spend 50-80% of their time iterating on the dataset and quality of its labeling to train accurate models. We call this emerging practice dataset development.”

Successful ML and AI applications rely on models trained using high quality data. The 2022 Label Studio Community Survey explores the current state of the ML/AI ecosystem, with a focus on how teams are approaching data labeling, preparation and management as a key part of the pipeline. 

Key Findings in the Label Studio Community Survey

Machine Learning and AI are becoming increasingly strategic. 

  • 73% of respondents noted their organizations will make a higher level of investment in their ML/AI initiatives in the coming year.

Data poses the biggest challenge to putting ML/AI models into production.

  • 80% of respondents cited accurately labeled data as one of the biggest challenges to getting ML/AI models into production (the top response), while 46% cited a lack of data (the second most popular response).

Data science teams now spend the majority of their time on dataset preparation, management and iteration, known as dataset development.

  • 72% of respondents reported spending 50% or more of their time on data preparation, iteration and management, while more than one-third (34%) of respondents said they spend 75% or more of their time on the data.

Data preparation and labeling are becoming increasingly cross-functional. 

  • While most respondents have the traditional roles of data scientists and data engineers, the responsibility for data labeling is broad, requiring engagement across organizations from interns to executives and business leaders. Notably, 20% reported that a mix of roles held the data prep responsibility, including subject matter experts, who accounted for 5% of responses, and business analysts, who accounted for 3%. 

The Label Studio Community Survey also dives into popular technology choices, finding that ML/AI workloads are primarily hosted on cloud offerings, while HuggingFace is the most popular source for pre-trained models. More details can be found in the full report. 
