The promise of artificial intelligence (AI) is undeniable, but its enormous potential comes with enormous responsibilities. Companies and organizations around the world face competing pressures: accelerating their use of AI while safeguarding against the problems that can result when the technology is not used properly.
As companies chart their AI path, we want to help them think about how to use this transformational technology responsibly, whether they rely on small-scale open source models or hyperscale proprietary large language models (LLMs). Our platform enables customers to carefully control their enterprise-wide data and AI development, allowing them to better manage risk, lessen instances of bias and address other potential issues.
While AI technology continues to rapidly evolve, we believe the future needs to be grounded in trust and transparency – the cornerstones of any lasting relationship. That is why we developed “Databricks’ Commitment to Responsible AI.” This statement includes the core principles that guide the vision for our technology and how we can continue to support companies as they adopt AI.
We want our statement to contribute to the healthy discussions around the responsible use of AI. It is an important topic that will shape AI’s direction and we look forward to continuing the conversations with our customers and partners, and with regulators, policymakers and other key stakeholders.
Databricks’ Commitment to Responsible AI
Artificial intelligence (AI) has been under development and in use for decades, but has recently entered a new phase that is significantly increasing its rate of adoption and impact. The latest advancements have the potential to transform entire industries – for example, by accelerating medical research, delivering personalized customer experiences and combating climate change, among many other breakthroughs.
As more companies, government agencies and other organizations adopt AI technology, questions around its responsible use must be addressed. Enterprises need to consider the enormous benefits that AI can deliver, as well as the significant risks and negative outcomes that can result when it is not developed and used with care.
The speed at which AI is evolving poses one of the biggest challenges facing enterprises: how to develop and enhance their AI testing and monitoring tools without a clear roadmap for where the technology goes next. Concerns around unethical uses, bias, hallucinations and other systemic problems will only become more pronounced as AI advances and new ways to apply the technology emerge.
Importantly, there are a number of industry best practices that have emerged to help enterprises anticipate and mitigate these problems. A leading example is the NIST AI Risk Management Framework, which offers a useful set of guidelines to evaluate and address AI risk.
The solutions to the questions posed by widespread use of AI will arise from a broad range of industry players. If the industry maintains a focus on transparency and trust, we are confident that we can work together to harness the best from AI, while minimizing the dangers.
Our Principles and How We Help Enterprises Responsibly Deploy AI
As a data and AI company focused on helping enterprise customers extract value from their data to solve the world’s toughest problems, we firmly believe that both the companies developing the technology and the companies and organizations using it need to act responsibly in how AI is deployed.
Our platform is designed to help enterprises better control, protect and understand their data. In their efforts to use AI responsibly, companies and organizations of all sizes rely on our data governance and machine learning tools to monitor and test their data sets and AI models. Within the secure environment our platform provides, these tools can make our customers’ data more explainable and less prone to bias, inaccuracy, incompleteness and other harmful errors, while enabling greater accountability and helping customers meet compliance standards.
While AI’s future is still being written, our commitment to helping enterprises use our technology to deploy AI responsibly will not change.
1. Good governance is essential.
- As an enterprise software company, we care deeply about how our customers use our technology. That is why our platform provides a suite of data and AI governance tools – including Unity Catalog (the governance framework on our platform), Lakehouse Monitoring and a number of features within MLflow (our tool to help customers manage the machine learning lifecycle) – that allow enterprises to create a best-in-class governance framework. The Databricks platform provides customers a range of features to ensure proper governance, including tools for quality control, data lineage tracking, monitoring, security, privacy and auditing.
- Customer-facing AI applications can have unique issues, which is why our platform helps enterprises follow responsible guidelines to address them, including human intervention where appropriate, transparency around the use of AI and reasonable efforts to avoid producing undesirable content.
- We deliver the tools and framework that customers need to anticipate and address potential issues as they seek to responsibly deploy AI to advance their business objectives. This focus builds on our Acceptable Use Policy, which helps ensure that our platform is not used for fraudulent, deceptive or illegal activities.
- We have also established an AI Advisory Committee to inform how we think about and use AI as the technology continues to move ahead.
2. AI should be democratized for all companies.
- We believe in simplifying AI and broadening its development and use, so every company and organization can access it. AI should not be controlled by just a few large players. With this in mind, the Databricks platform can be used to build and deploy custom models in addition to hyperscale large language models (LLMs).
- We believe democratizing AI will help keep costs low, allowing the broadest possible range of businesses, nonprofits and other organizations to adopt this fast-moving and disruptive technology.
3. Companies should own and control their data and models.
- Enterprises using AI technology should be able to maintain control of their proprietary data and model quality. We believe that customers should have the opportunity to build and deploy models that let them securely utilize their data without having to move or share it with a third party.
- Our Lakehouse architecture provides customers with extensive security protections, including data access controls and other monitoring and governance features within Unity Catalog and MLflow, among many other security measures (see our Security & Trust Center for more detail). We want enterprises to gain valuable insights from their data, while having full control and remaining fully compliant with global privacy and data protection regulations.
4. AI is only as good as the data it is trained on, and enterprises should be able to control and monitor their data to help reduce hallucinations, bias and other errors.
- Our platform has functionality to help address inclusivity, fairness, accuracy, transparency and accountability. For example, tools within MLflow allow customers to run, monitor and adjust their models. Other tools enable reproducibility and lineage tracking for both data and models. Our model testing features can also filter for problematic content, and numerous additional tools within Unity Catalog and other parts of the Lakehouse help our customers better manage risk, lessen instances of bias and address other potential issues.
- In June 2023, Databricks introduced Lakehouse Monitoring, our data and model monitoring suite that allows customers to monitor models in production, checking for data quality issues, bias and problems such as model and feature drift. This functionality gives enterprises the ability to apply intelligent automation to generate alerts, trigger retraining pipelines when needed and generate reports for audit purposes. Lakehouse Monitoring is fully integrated within Unity Catalog and is designed to work seamlessly with related features of MLflow.
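As a generic illustration of the kind of check a monitoring suite performs – this is a hedged sketch in plain Python, not the Lakehouse Monitoring API – the population stability index (PSI) is one common way to flag feature drift between a training baseline and production data:

```python
import math

def population_stability_index(expected, actual, bin_edges):
    """Compute PSI between a baseline ('expected') sample and a
    production ('actual') sample over shared half-open bins
    [edge[i], edge[i+1]). A PSI above roughly 0.2 is a common
    rule-of-thumb signal of significant feature drift."""
    def proportions(values):
        counts = [0] * (len(bin_edges) - 1)
        for v in values:
            for i in range(len(bin_edges) - 1):
                if bin_edges[i] <= v < bin_edges[i + 1]:
                    counts[i] += 1
                    break
        total = sum(counts) or 1
        # Smooth empty bins so the log term stays finite.
        return [max(c / total, 1e-6) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))
```

In a production pipeline, a check like this would run on a schedule against each monitored feature, with an alert or retraining job triggered when the score crosses a chosen threshold.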
- We also believe diversity of data and use cases are critical to reflecting the populations that our customers want to reach. The availability of varied data sources on Databricks Marketplace and safe data sharing functionality within Lakehouse Data Clean Rooms can help customers diversify their data.
5. Enterprises should limit AI’s environmental and financial costs to what is required to support their business objectives.
- Hyperscale LLMs are appropriate for certain use cases, which we fully support, but they require enormous compute and storage resources. Their financial and environmental costs should be weighed against the value they provide in light of the applicable circumstances.
- We believe that smaller-scale models can help democratize AI and greatly reduce the harmful environmental impacts and significant costs associated with creating and using hyperscale models when such large models are not needed.
- MLflow provides enterprises with the ability to monitor compute resources utilized by a model, enabling customers to assess its impact on their carbon footprint.
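To make the carbon point concrete, compute usage recorded during training (for example, GPU-hours logged as an MLflow metric) can be converted into a rough emissions estimate. The function below is a hypothetical back-of-the-envelope sketch; the wattage, data-center overhead (PUE) and grid carbon-intensity defaults are assumed placeholders, not measured values:

```python
def estimated_co2_kg(gpu_hours, gpu_watts=300, pue=1.5,
                     grid_kg_per_kwh=0.4):
    """Rough CO2 estimate from logged compute usage.

    gpu_watts:        assumed average draw per accelerator
    pue:              assumed data-center power usage effectiveness
    grid_kg_per_kwh:  assumed grid carbon intensity

    All defaults are illustrative placeholders.
    """
    kwh = gpu_hours * gpu_watts / 1000 * pue
    return kwh * grid_kg_per_kwh

# e.g. 100 GPU-hours under these assumptions -> 18.0 kg CO2
print(estimated_co2_kg(100))
```

Logging the inputs to such a calculation alongside model metrics makes the environmental cost of a training run visible in the same place as its accuracy, so the trade-off described above can be evaluated per project.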
6. Thoughtful regulation is needed to help ensure that AI is used responsibly.
Databricks plays an important role in helping enterprises think about and use AI. We look forward to continuing the conversations around governance, best practices and regulatory structures that will enable us to responsibly capitalize on AI’s enormous potential.
- AI enables many high-value use cases. However, AI technologies can be misused or misapplied, which is why we believe thoughtful regulation is necessary to align with best practices around the responsible development and use of AI.
- It is important that any regulation does not stifle innovation and democratization or extinguish the vibrant spirit of collaboration that is fostering technological advancements. Accordingly, we believe regulations and their mandated obligations should be proportionate, sensible and tailored to particular use cases and outcomes, rather than focused on underlying technical methodology. It is particularly important that open source AI not be unduly restricted, given its substantial benefits in furthering innovation and in keeping the cost of productivity-enhancing AI low for a wide range of businesses and uses.