Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!
Nightmares turn to reality for this IT professional. Commentary by Carl Sverre, Senior Director, Engineering, Launch Pad at SingleStore
A ‘horror story’ we experienced happened very early in SingleStore’s journey. Someone at the company who was working on our test infrastructure wrote a script to replicate our analytics database from a legacy MySQL instance into SingleStore. We were excited to start eating our own dog food, and in our rush we didn’t audit the script enough before running it in production. Unfortunately, the script started with a drop database command which, rather than executing on the destination, ran on the source. Moments later, we realized that over a year’s worth of important test data had been eliminated. Needless to say, it was a bit of an intense moment. Fortunately, we had a backup and were able to recover most of the important data by the end of the day. I am proud to work at a company that embraces failure and the learning experiences that come with it. Now, years later, it’s exciting to see our product develop new features that can recover from this kind of horror story within seconds. Maybe, without this event in our past, we wouldn’t have prioritized our innovative separation-of-storage-and-compute design as much as we have.
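One common safeguard against exactly this mistake is to refuse destructive statements on any connection not explicitly marked as the destination. The sketch below illustrates the idea; the class, names, and statement list are assumptions for illustration, not SingleStore’s actual tooling.

```python
# Hypothetical safety guard for a replication script: refuse to run
# destructive statements against any connection that isn't explicitly
# marked as the destination. All names here are illustrative.

DESTRUCTIVE = ("DROP", "TRUNCATE", "DELETE")

class GuardedConnection:
    def __init__(self, name, role):
        self.name = name  # e.g. "legacy-mysql" or "singlestore"
        self.role = role  # "source" or "destination"

    def execute(self, sql):
        # Block destructive statements on anything but the destination.
        if self.role != "destination" and sql.strip().upper().startswith(DESTRUCTIVE):
            raise RuntimeError(
                f"Refusing destructive statement on {self.role} {self.name!r}: {sql}"
            )
        # ...hand the statement to the real database driver here...
        return f"OK ({self.name}): {sql}"

source = GuardedConnection("legacy-mysql", "source")
dest = GuardedConnection("singlestore", "destination")

dest.execute("DROP DATABASE analytics")    # allowed on the destination
# source.execute("DROP DATABASE analytics")  # would raise RuntimeError
```

A wrapper like this costs a few lines and turns a wrong-connection mistake into a loud error instead of a lost database.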
On-prem vs. the cloud; the ever-present database management debate. Commentary by Mathias Golombek, CTO, Exasol
Many organizations are undergoing a ‘tech refresh,’ and as the U.S. teeters on the edge of a recession, database management is an ever-present topic of discussion. Organizations continue to ask themselves — should we stay with our current infrastructure or pivot? Is this the time to jump to the cloud? There are specific benefits to on-premises and cloud infrastructure, and it’s important to note that an organization does not have to go ‘all in’ on one over the other. Ultimately, decision-makers need to ask three fundamental questions: How much flexibility does the IT infrastructure require? Can the cloud significantly reduce costs for the organization? And how dependent is the organization on the services of a particular cloud provider? The answers to these questions will vary over time. The key is finding an analytics database that works both on-premises and with your cloud operations – and at best, even in parallel and across different platforms. This type of flexibility allows CDOs to get the most out of company data, regardless of where the organization is in its specific cloud journey.
Why AI isn’t going to take over your DevOps role. Commentary by Mav Turner, Senior Vice President, Product, Tricentis
AI is augmenting people, not replacing them. While AI has changed the types of tasks testers and developers perform on a daily basis, AI augmentation lets DevOps teams fully leverage the unique strengths of their testers and developers, streamlining workflows and enabling teams to be more effective and innovative. In the context of test automation, AI and machine learning detect patterns to recognize bugs and other critical issues upfront, pointing out deviations for the exploratory tester to then investigate. This example shows how empowering AI augmentation is for the exploratory tester, freeing up their time to focus their skills and expertise on discovering the nuances of issues that AI is not capable of uncovering on its own. Furthermore, AI augmentation amplifies the work of developers, as better tooling helps testing processes move more quickly. The increase in tester productivity, in turn, enables developers to build, aggregate and connect components faster, which remains a critical function in DevOps.
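The “point out deviations for a human to investigate” idea can be sketched in a few lines: score each new test run against its history and surface outliers to the exploratory tester. Real AI-augmented test tools use far richer signals than this; the simple z-score rule, threshold, and data below are purely illustrative.

```python
# Illustrative deviation detector for test metrics: flag runs whose
# duration strays far from the historical mean so a human can investigate.
import statistics

def flag_deviations(history, latest, threshold=3.0):
    """Return (name, seconds) for runs deviating > threshold std-devs."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return [
        (name, secs) for name, secs in latest.items()
        if stdev and abs(secs - mean) / stdev > threshold
    ]

# Historical durations (seconds) and the latest run's per-test timings.
past_durations = [1.2, 1.1, 1.3, 1.2, 1.25, 1.15]
latest_run = {"test_login": 1.2, "test_checkout": 9.8}

flag_deviations(past_durations, latest_run)  # flags "test_checkout"
```

The point is the division of labor: the tool surfaces the anomaly, and the tester decides whether it is a bug, flaky infrastructure, or a legitimate change.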
AI Bill of Rights Impact on Ethical AI Practices and Implementation. Commentary by Sagar Shah, Client Partner at Fractal Analytics
The AI space has been plagued by numerous public missteps when it comes to ethics and a lack of good judgement. AI is an incredibly powerful tool, but for it ever to reach its full potential, it is essential that the technology be built and used in a way that is human-centric, ethical, and sustainable. Therefore, for many practitioners in the space the U.S. AI Bill of Rights is a welcomed bit of governance – even if it is overdue. That said, despite the positive progress that these guidelines and goals aim to accomplish as it pertains to key areas like protecting civil rights, privacy and accountability, until these well-meaning objectives are backed up with clear enforcement teeth, the space as a whole will continue to run the risk of reputational damage brought on by companies looking to exploit gray areas.
Five Steps to empower developers to operate securely in the cloud. Commentary by Josh Stella, vice president and chief architect at Snyk
Empowering developers to find and fix cloud misconfigurations when developing infrastructure as code is critical, but it’s equally important to give them the tools they need to design cloud architecture that’s inherently secure against today’s control plane compromise attacks. Here are five key steps for any organization:

1. Understand your cloud environment and SDLC. Security teams should embed engineers with application and DevOps teams to understand everything that’s running, how it’s configured, how it’s developed and deployed, and changes as they happen.

2. Prioritize secure design and prevent misconfiguration. Once a control plane compromise attack is underway, it’s generally too late to stop it. Bake security into the entire cloud SDLC to catch misconfigurations before they get deployed.

3. Empower developers with tools that guide them on security. Cloud security tooling should provide developers with useful, actionable feedback on security issues and how to remediate them quickly.

4. Adopt policy as code (PaC) for cloud security. PaC helps security teams scale their effort with the resources they have by empowering all cloud stakeholders to operate securely, without any ambiguity or disagreement about what the rules are and how they should be applied. It aligns all teams under a single source of truth for policy, eliminates human error in interpreting and applying policy, and enables security automation (evaluation, enforcement, etc.) at every stage of the SDLC.

5. Focus on measurement and process improvement. Cloud security is less about intrusion detection and monitoring networks for nefarious activity, and more about improving processes. Successful cloud teams continuously score the risk of their environment as well as the productivity of developers and security teams.
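The policy-as-code idea in the steps above can be sketched minimally: policies become plain, testable code that evaluates resource configurations, so the same rules can run in CI, at deploy time, and in production audits. Dedicated tools (e.g., Open Policy Agent with Rego) exist for this; the Python version below, with all rule and field names assumed, only shows the shape of the approach.

```python
# Minimal policy-as-code sketch: each policy is a function that inspects
# a resource config dict and returns a violation message or None.
# Resource fields and rule names are illustrative assumptions.

def no_public_storage(resource):
    """Storage buckets must not allow public read access."""
    if resource["type"] == "storage_bucket" and resource.get("public_read"):
        return f"{resource['name']}: public read access is forbidden"
    return None

def encryption_required(resource):
    """Storage buckets must be encrypted at rest."""
    if resource["type"] == "storage_bucket" and not resource.get("encrypted"):
        return f"{resource['name']}: encryption at rest is required"
    return None

POLICIES = [no_public_storage, encryption_required]

def evaluate(resources):
    """Return every policy violation found across all resource configs."""
    return [v for r in resources for p in POLICIES if (v := p(r))]

resources = [
    {"type": "storage_bucket", "name": "logs", "public_read": True, "encrypted": True},
    {"type": "storage_bucket", "name": "backups", "encrypted": False},
]
violations = evaluate(resources)  # one violation per bucket above
```

Because the rules are code, they can be version-controlled, reviewed, and run automatically at every SDLC stage — the single source of truth the commentary describes.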
The rise of intelligent and intuitive transcription fueled by AI. Commentary by James Hom, Chief Product Officer at SoundHound
We will soon see call centers and other phone-centric businesses start to leverage voice AI to bring new levels of meaning and structure to business conversations in real-time. This will involve AI technology that simultaneously structures and tags key topics and entities from which it can infer the speaker’s meaning and intent. So — for example — if a customer tells the agent: “I’d like to ship this item back”, the transcription system will understand and automatically tag this as a request for a return/exchange. In a scenario like this, predictive analytics might also be deployed to intelligently suggest responses and next best actions to agents across a range of industries. This kind of AI-fueled transcription will be key to improving the customer experience, reducing the time it takes to resolve customer service issues (even complex ones), and delivering better outcomes for businesses. As it stands, there is an evident need for smart, real-time voice AI transcription services that go beyond legacy solutions by accurately capturing, identifying and attributing meaning to conversations.
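The tagging step described above — mapping an utterance like “I’d like to ship this item back” to a return/exchange intent — can be illustrated with a toy rule-based classifier. A production voice-AI system would use trained language models rather than keyword rules; every phrase list and label below is an assumption made for illustration.

```python
# Toy intent tagger showing the shape of output a real-time transcription
# pipeline might attach to each conversational turn. Keyword rules stand
# in for the ML models a production system would actually use.

INTENT_RULES = {
    "return_exchange": ["ship this item back", "send it back", "return", "exchange"],
    "order_status": ["where is my order", "track", "delivery status"],
    "cancel": ["cancel my order", "cancel the subscription"],
}

def tag_intent(utterance):
    """Attach an inferred intent label to a transcribed utterance."""
    text = utterance.lower()
    for intent, phrases in INTENT_RULES.items():
        if any(phrase in text for phrase in phrases):
            return {"utterance": utterance, "intent": intent}
    return {"utterance": utterance, "intent": "unknown"}

tag_intent("I'd like to ship this item back")  # tagged as return_exchange
```

With turns structured and tagged this way, downstream analytics can suggest responses or next best actions to the agent in real time.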
Fundamentals of data mesh concepts and what approach (if any) is right for an organization. Commentary by Tiago Cardoso, product manager at Hyland Software
The fundamentals of the data mesh concept include efforts to decentralize data governance and ownership, allowing separate groups to be responsible for their own data while applying product-centric concepts. This enables data usage in a continuous-improvement fashion while providing federated governance and proper documentation. Organizations should look at how their data clusters in terms of context and technologies in order to empower the right players to build their data strategy on these data mesh concepts. Additionally, a data mesh architecture is beneficial for users who need to make informed decisions. The outcome of data mesh is a better fit between domain context and data availability and relevance. Data will have better quality, better context (metadata) and better documentation, and decision-making will become quicker and more efficient. It can even lead to automation – when AI is applied – freeing users for more complex decisions. Data mesh also benefits users concerned with compliance and regulations, as it is easier to provide clarity and the means for better governance within each domain. Data mesh is particularly effective in those cases, as cultural and organizational change is usually a requirement for adopting these concepts. Overall, it is exceedingly difficult for companies to improve governance, ownership, data security and access. Traditional siloed or centralized data concepts make this mission extremely complex, as the right people are usually not involved in the right tasks. By moving to a data mesh concept, a company that is willing to prioritize cultural and organizational change will be well-positioned to structure internal responsibilities according to data context and domain, empowering the right people to move these topics forward. Recently, some new companies have started to build global CDN networks and nodes with streams and processors. This will pave the way for near-real-time global meshes where processing is done locally while data transfer is optimized to scale with near-zero latency.
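The core split the commentary describes — domains own their data as products, while a thin central layer enforces only shared rules — can be made concrete with a small sketch. Every class, field, and policy here is an assumption chosen for illustration, not a reference implementation of data mesh.

```python
# Toy data-mesh illustration: each domain publishes a data product that
# carries its own ownership, schema, and documentation; a federated
# governance layer enforces only global rules, not domain decisions.
from dataclasses import dataclass

@dataclass
class DataProduct:
    domain: str          # owning team, e.g. "sales"
    name: str
    owner: str           # accountable contact within the domain
    schema: dict         # column -> type, maintained by the domain
    docs_url: str
    contains_pii: bool = False

class FederatedGovernance:
    """Central layer: checks global policy, leaves the rest to domains."""
    def register(self, product: DataProduct):
        # Global rule: every product must be documented and have an owner.
        if not product.docs_url or not product.owner:
            raise ValueError(f"{product.name}: missing docs or owner")
        return True

mesh = FederatedGovernance()
orders = DataProduct(
    domain="sales", name="orders_daily", owner="sales-data@example.com",
    schema={"order_id": "string", "amount": "decimal"},
    docs_url="https://wiki.example.com/orders_daily",
)
mesh.register(orders)  # accepted: documented and owned by its domain
```

The design choice mirrored here is that the central layer never inspects domain schemas or semantics — it only verifies that each product meets the shared governance bar.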
Hardware Dependence is Threatening AI Innovation. Commentary by Luis Ceze, CEO, OctoML
We’re in the midst of a storm. On one side we have a global chip shortage with no end in sight; on the other, specialized hardware players like Intel, NVIDIA, Arm and others are at war for AI dominance. To make matters worse, we’re living in a world of hardware dependence, where models have to be tuned and optimized according to specific hardware specifications and software frameworks. This is in contrast to most major software components, where portability has been the industry standard, and taken for granted, for over a decade. The problem is that billions of dollars in resources (time, talent, R&D) go into solving this dependence problem with little result. And this is threatening AI innovation, because only companies with massive resources can crack the code on this industry-wide problem. We as an industry must achieve hardware independence if we’re to fulfill the promise of AI. Achieving hardware independence will enable faster innovation, unlock hybrid options for model deployment and ultimately save practitioners time and energy.
The Broken Promise of AI. What went wrong between 2012-2022. Commentary by Lewis Wynne-Jones, VP Product, ThinkData Works
In 2012, DJ Patil wrote an article calling data science the “sexiest job of the 21st century” and sparked a global hiring boom of these new professionals, who were believed to be wizards who could spin gold out of raw data and deliver impactful insights to their business divisions overnight. The result of this hiring boom was that three years later Bloomberg would write that 2015 was a landmark year for Artificial Intelligence, starting a hype cycle that continues to this day. The reality, however, is that the AI Bloomberg noticed in 2015 was a direct result of the hiring boom of data scientists that started in 2012. These professionals were dropped unceremoniously into large enterprises that had not designed a robust data environment or even clearly articulated their data strategy. With the support of the organizations they worked for, they operated as pure scientists – experimenting, testing, modeling – furthering the evolution of the discipline of AI, but not its practical implications. Since then, organizations have started to wonder why their investments in data science and AI are not resulting in a return on investment. Largely, it is because for many of these organizations AI itself was the goal they aimed for. Not specific business transformations or improved processes, but merely developing intelligent technology. In this, they got what they wanted. Algorithms have been developed, and models have been run in test environments. But business transformation has not yet occurred in any automated, scalable way. This is because we have collectively ignored the reality that pure data science will increase our knowledge base but not our bottom line. In order to make good on the promise of AI, we need to realign data science to business priorities and develop data strategies less focused on hype and more on the problems we need to solve.
What IT spends will a CFO write a check for? Commentary by Chris Gladwin, Co-Founder and CEO of Ocient
The pandemic resulted in the unprecedented acceleration of digital transformation and a related boom in IT spend. As Gartner indicates, there is no going back. But that doesn’t mean all IT spend is considered equal. Security and compliance are top of mind for CIOs, as the growth of digital business practices is attracting more bad actors who present a greater risk to critical business infrastructures than ever before. The growth of Data & AI will also drive increased spend; however, CFOs are starting to question the high costs of legacy technologies and cloud providers in cases where operating at scale is tapping out their IT and departmental growth budgets. While investment in this area will continue, we expect to see a consolidation of spend in strategic areas and an accelerated adoption of new, disruptive solutions that offer increased cost advantages when running Data & AI workloads at scale.
Sign up for the free insideBIGDATA newsletter.
Join us on Twitter: https://twitter.com/InsideBigData1
Join us on LinkedIn: https://www.linkedin.com/company/insidebigdata/
Join us on Facebook: https://www.facebook.com/insideBIGDATANOW