The ChatGPT revolution has spawned a tidal wave of AI assistants that do all sorts of things, like writing stories, generating images, and even making music. Data analysts and data engineers at small and midsize businesses who desire a natural language interface for cleaning, analyzing, and making predictions from data might want to check out Akkio, which calls its $50 AI assistant “GenBI.”
Akkio provides an all-in-one tool for data professionals to access, cleanse, visualize, and build machine learning models with tabular data. But unlike other business intelligence (BI) tools, Akkio uses a natural language interface, powered by GPT-4, to let users interact with their data using a series of questions.
“We call it generative BI, because it’s the intersection of generative AI and your data,” says Jon Reilly, the co-founder and co-CEO of the Cambridge, Massachusetts company.
A demo by Reilly shows how easy working with data via Akkio really is. Upon loading data into the product using the supplied ETL connectors (it supports Excel, Google Sheets, BigQuery, Snowflake, and a number of other sources), the cloud-based product automatically analyzes the data values and creates a histogram that shows the variance.
“So right away you get some early exploratory analysis on your data,” Reilly says. “You can see the shape of it. You can one-click clean it. You can standardize your date columns to ISO 8601 with a single click.”
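For readers curious what a one-click date cleanup involves, the following is a minimal sketch in plain Python, not Akkio's code. The list of input formats and the sample dates are assumptions made for illustration; a real cleaner would have to detect formats from the data itself.

```python
from datetime import datetime

# Hypothetical input formats, chosen for this example only.
KNOWN_FORMATS = ("%m/%d/%Y", "%d %b %Y", "%Y-%m-%d")


def to_iso8601(value: str) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognized date: {value!r}")


# Made-up listing dates written three different ways.
listing_dates = ["03/14/2023", "14 Mar 2023", "2023-03-14"]
normalized = [to_iso8601(d) for d in listing_dates]
print(normalized)
```

Trying formats in a fixed order is the simplest approach, but it silently picks the first match for ambiguous values such as 03/04/2023, which is one reason a production tool would need something smarter.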
The software supports a “Chat Data Prep” mode that helps users get their data ready for analysis. In Reilly’s demo, which was all about analyzing MLS data to determine a correlation between housing features and price, the user needs to create a new column with all the data about the location of the houses.
But instead of writing complex SQL statements, or slightly less complex Python statements, the Akkio user can simply tell the product what to do. Typing “combine all location info” and hitting the enter key gives Akkio all the direction it needs to return the aggregated column.
“You can do pretty much any data transformation you want just by typing in whatever it is that you want to have happen,” Reilly says. “It’ll figure out if streets, city, states, and country are the location-containing columns, and combine them into new column called ‘location.’”
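The transformation Reilly describes is roughly what a user would otherwise write by hand. Here is a small sketch of that manual version, assuming the location columns are named street, city, state, and country; the column names and sample rows are hypothetical, and this is not the code Akkio generates.

```python
# Hypothetical column names for the location-bearing fields.
LOCATION_COLUMNS = ("street", "city", "state", "country")


def add_location(rows, columns=LOCATION_COLUMNS):
    """Return new rows with a 'location' field joining the non-empty parts."""
    combined = []
    for row in rows:
        parts = [str(row[c]).strip() for c in columns if row.get(c)]
        # Copy the row so the caller's data is left untouched.
        combined.append({**row, "location": ", ".join(parts)})
    return combined


# Made-up listings; the second one has no street value.
listings = [
    {"street": "12 Elm St", "city": "Cambridge", "state": "MA",
     "country": "USA", "price": 950000},
    {"street": "", "city": "Austin", "state": "TX",
     "country": "USA", "price": 480000},
]
with_location = add_location(listings)
for row in with_location:
    print(row["location"])
```

The part the natural language interface takes over is the first line: deciding which columns count as location information.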
Removing outliers is as simple as typing “remove outliers from square footage lot.” The software chooses the 95th percentile as the appropriate boundary and promptly removes the 5% at the extremities, using Python code generated under the covers.
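The article does not show the Python that Akkio generates, so the following is only a sketch of one way a 95th-percentile cutoff could work. It assumes the rule drops rows above the cutoff (upper tail only) and uses linear interpolation between ranks; the column name lot_sqft and the sample values are invented.

```python
def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def drop_above_percentile(rows, column, pct=95):
    """Keep only rows whose value in `column` is at or below the cutoff."""
    cutoff = percentile([r[column] for r in rows], pct)
    return [r for r in rows if r[column] <= cutoff]


# Twenty made-up lot sizes; the last is an obvious data-entry outlier.
sizes = [4000, 4500, 5000, 5200, 5500, 6000, 6100, 6500, 7000, 7200,
         7500, 8000, 8200, 8500, 9000, 9500, 10000, 11000, 12000, 400000]
lots = [{"lot_sqft": s} for s in sizes]

trimmed = drop_above_percentile(lots, "lot_sqft")
print(len(lots), "->", len(trimmed))
```

A two-sided rule that trims both tails would be an equally plausible reading of the description; only the filter condition would change.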
Another Akkio mode, dubbed Chat Explorer, gives the user a natural language interface for working with the data. In the demo, Reilly asks the machine to show him the relationship between price and square footage, and the machine quickly spits out a chart with the requested data.
“That’s a live chart back to your live data source, so if your data source updates, it’ll update your chart,” Reilly says. “You can ask for any visualization. I think we support like 20 different chart types here.”
Because the large language model (LLM) underpinning the system, Microsoft’s GPT endpoint running in Azure, has such a firm grasp of the English language, users can ask some fairly ridiculous questions. For example, users can ask Akkio to give them five “interesting” charts, or to show houses that would be good for two college roommates. The product will parse the pertinent qualifiers and come up with an answer. (However, none of the customers’ actual data is seen by GPT. Only metadata is sent across the wire between the customers’ data, stored with Akkio in AWS, and the GPT endpoint in Azure, Reilly says.)
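What counts as metadata is not spelled out in the article. As an illustration of the general idea only, the sketch below assumes it means column names, value types, and a row count, and builds a request payload that contains no cell values. The function names and payload shape are hypothetical and do not reflect Akkio's actual protocol.

```python
import json


def describe_columns(rows):
    """Summarize column names and Python type names; no cell values are kept."""
    schema = {}
    for row in rows:
        for name, value in row.items():
            schema.setdefault(name, set()).add(type(value).__name__)
    return [{"name": n, "types": sorted(t)} for n, t in schema.items()]


def build_prompt(question, rows):
    """Bundle the user's question with schema-only metadata as JSON."""
    payload = {
        "question": question,
        "columns": describe_columns(rows),
        "row_count": len(rows),
    }
    return json.dumps(payload)


# Made-up listings standing in for customer data.
listings = [
    {"city": "Cambridge", "price": 950000, "sqft": 1800},
    {"city": "Austin", "price": 480000, "sqft": 2100},
]
prompt = build_prompt("show price versus square footage", listings)
print(prompt)
```

With a payload like this, the model can name the columns to plot, while the query itself runs where the data lives.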
Lastly, the company built an automated machine learning (AutoML) engine into the product, allowing users to dabble in data science from the comfort of the Akkio GUI. In Reilly’s demo, he instructed the machine to build a model to predict the price of a house based on features extracted from the previous step. After evaluating several models, it picked one that delivered an accuracy rate of 16%, which is reasonable for a small data set, he says.
“We have a standard 80/20 split here. We’re encoding each one of the columns with the proper encoder given the type of information that was in each one of those columns, and then we’ll bootstrap an ML model using 80% and then validate it on the remaining 20%,” Reilly says. “We’re primarily neural network-based and we do some decision trees and we’ll even try a linear regression to see if it performs better. It usually doesn’t win. Mostly neural networks.”
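The train-and-validate loop Reilly describes can be sketched in a few lines. This is an illustration under stated assumptions, not Akkio's engine: the data is synthetic, and the candidates are a mean baseline and a one-feature least-squares line rather than the neural networks and decision trees the product uses.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out the last test_fraction of them."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_mean(train):
    """Baseline: always predict the average training price."""
    mean_price = sum(y for _, y in train) / len(train)
    return lambda sqft: mean_price


def fit_linear(train):
    """Ordinary least squares on a single feature (square footage)."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda sqft: intercept + slope * sqft


def mean_abs_pct_error(model, test):
    """Average absolute error as a fraction of the true price."""
    return sum(abs(model(x) - y) / y for x, y in test) / len(test)


# Synthetic (sqft, price) pairs; not real MLS data.
rng = random.Random(42)
data = []
for _ in range(50):
    sqft = rng.uniform(800, 3500)
    data.append((sqft, 50_000 + 200 * sqft + rng.gauss(0, 20_000)))

train, test = train_test_split(data)
candidates = {"mean": fit_mean, "linear": fit_linear}
errors = {name: mean_abs_pct_error(fit(train), test)
          for name, fit in candidates.items()}
best_name = min(errors, key=errors.get)
print(best_name, errors)
```

Scoring every candidate on the same held-out 20% is what makes the comparison fair, since each model is judged on rows it never saw during fitting.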
Akkio isn’t the first AI assistant to provide a natural language interface for data cleaning, analysis, and ML operations. There have been many such tools unveiled by the industry giants over the past 10 months. In Akkio’s case, delivering all of those capabilities for a starting price of $50 per user per month shows that it’s quite serious about developing a volume business.
“We’re a tool for the rest of us,” Reilly says. “Our thesis is a business has data scientists and they’re usually working on the super complex, highest leverage problems. And then there’s the longtail of people working in business, operations, sales, marketing, finance, logistics, sometimes customer support, even HR. They’re working with data, they’re generating data in their systems, they’re trying to be more intelligent in their decision making, but they’re not necessarily super skilled in the state-of-the-art of data interactions.”