In today’s world where the competition is fierce for talent, it says a lot when your country is selected for opening a major engineering centre. It says even more when that company is a global leader in bringing the power of AI and machine learning to the enterprise. So when I heard that Databricks was coming to Canada, I wanted to learn more.
Databricks is leading the charge for organizations to derive value out of AI and machine learning and is one of the fastest-growing SaaS companies in the world today. The next decade of innovation will combine the technology domains of cloud, data, and AI – Databricks is sitting at the intersection of all three.
Databricks was founded in 2013 and has thousands of global customers including Comcast, Shell, HP, Expedia, and Regeneron among many others across virtually every industry. Databricks is currently valued at over $6B with funding from leading investors like Andreessen Horowitz and NEA. To help bring the power of AI to the enterprise, Databricks also has hundreds of global partners that include Microsoft, Amazon, Tableau, Informatica, Capgemini and Booz Allen Hamilton.
Interestingly, you could say that Canada is actually embedded in the DNA of Databricks as the Co-founder and Chief Architect, Reynold Xin, is a University of Toronto alum. Reynold has a BASc in Engineering from the U of T and holds a Ph.D. from the University of California, Berkeley. Additionally, co-founder and chief technologist, Matei Zaharia, grew up in Toronto, went to the University of Waterloo and has a Ph.D. in Computer Science from the University of California, Berkeley. I connected with Reynold to gain further insight into the company, the Canada decision, and what the technical vision of the future may hold for AI and machine learning enabling organizations to make data-driven decisions – from improved health outcomes to superior operational efficiency.
Brian Clendenin: For those that may not know, what is Databricks?
Reynold Xin: “Databricks is a 6-year-old technology startup based in San Francisco. Our mission is to help data teams solve the world’s toughest problems, from security threat detection to cancer drug development. We do this by building and running the world’s best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions.
The founding team were the original creators of Apache Spark. We worked on research problems in big data and machine learning at UC Berkeley. As part of that, we had a very close collaborative relationship with Silicon Valley, and saw some of the earlier use cases and challenges with data. We created Databricks with the belief that data has the potential to help solve some of the world’s toughest problems.
Fast forward six years, and the company has evolved into a global organization with over 1,000 employees, and thousands of organizations entrust us with their most critical data infrastructure. Last year, we announced a $400 million Series F round of funding which valued the company at $6.2 billion USD.”
Brian: Why select Canada to open a global engineering centre?
Reynold: “Our ‘secret sauce’ is the people at Databricks. We want to find the most talented and motivated people and create success collectively. We started in the San Francisco Bay Area, which has the highest concentration of software engineers. But the demand for our platform is so large that we need to grow the team substantially.
As part of our quest for talent, we opened our European Development Center in Amsterdam three years ago. The Amsterdam office has become an integral part of the Databricks innovation factory. They have shipped some of the highest-impact features that made our customers’ lives so much better.
Earlier this year, we decided it’s time to repeat the success we had seen with Amsterdam, and set out to find our third engineering hub. This time, we started with the following criteria:
- High concentration of software engineers, so we can build the initial teams with just local talent.
- A major city that’s easy to get to from San Francisco.
- English speaking.
- Friendly immigration policies, so we can attract engineers from all over the world.
It wasn’t that difficult to narrow it down to Toronto, especially considering two of the founders have ties to Toronto. Matei grew up in Toronto, and I went to college at U of T.”
Brian: How do you envision Canadians will contribute to Databricks’ innovation and market leadership?
Reynold: “Throughout modern history, Canadians have played a critical part in the invention of new technologies, from medicine to more recently information technology. But at the same time, there’s also a large brain drain of Canadians going south to the United States, often for better pay or better work.
We want to create an awesome environment in Toronto so the most talented engineers can work on cutting-edge technologies that have massive real-life impacts. They should wake up every day eager to come into work, knowing that the technologies they are building have contributed to solving fundamental societal problems, such as reducing traffic congestion or curing cancer.
It is what they will be building that will define the next decade for Databricks, as part of our goal to enable every organization to leverage data and solve the toughest problems.
In Amsterdam, in addition to hiring a lot locally, we’ve also attracted some of the best engineers in other parts of the world and convinced them to move to the Netherlands. I think we will be able to help Toronto attract people of this calibre as well.”
Brian: You’ve mentioned that Databricks is at the intersection of cloud computing, big data, and machine learning. Will these technology domains be the big drivers of innovation over the next decade?
Reynold: “Absolutely, and Databricks is uniquely positioned at the intersection of these 3 megatrends. When we first started the company, we decided we wanted to build a cloud data platform that has diverse capabilities including machine learning. Most companies back then, and even now, are focusing on on-prem shrinkwrap software and on data warehousing, without any capabilities to do machine learning. Many investors we talked to were very skeptical about our approach: although big data was already “big”, the concepts of cloud computing and machine learning were nascent and the market was small.
In 2020, it’s clear all of them took off and became megatrends. Cloud computing enables the rapid delivery of software as a service and compute resources on demand. This can create massive cost savings for IT infrastructure, but the real reason I’m super excited about it is that it could shorten time-to-market for new applications our customers are developing from years to days.
As you know, the field of machine learning isn’t new, but what’s completely new is the abundance of data available at our fingertips to train and apply state-of-the-art models. These models in turn can help considerably enhance customer experiences and products, and help drive positive business outcomes. However, without computing power, without the ability to scale, processing big data or training machine learning models on big data becomes extremely challenging.
So it truly is the combination of the cloud, big data, and machine learning technologies that will drive massive innovations over the next decade. And that’s what we have been focusing on.”
Brian: What is the promise of AI and machine learning in the enterprise?
Reynold: “The promise of AI in the enterprise is massive. For the past three decades, data warehouses have become a standard component in any enterprise IT architecture. Those allow enterprises to look into the past, understanding how their businesses are doing. That’s obviously tremendously important and is phase 1 of the revolution.
We are on the verge of starting phase 2 with AI: look into (predict) the future.
Why is this important? Imagine what enterprises can do if they have a crystal ball into the future. To give you some examples: we have been working with Bechtel to reinvent the construction industry, leveraging AI to sequence the complex dependency graph in billion-dollar construction projects. We’ve worked with Regeneron in accelerating drug discovery, and Quby in helping homeowners reduce energy consumption.
However, few organizations have succeeded so far due to many challenges like infrastructure limitations, poor data quality, or difficulty hiring a qualified workforce in that space. We believe our technology can uniquely help solve many of the technical challenges, and we continue to add groundbreaking innovation to the platform based on customer needs. We partner with hundreds of ISVs and technology providers to allow customers to leverage their investments and, for example, connect their existing infrastructure to the Databricks platform. In addition, we have scaled and continue to scale as an organization, and our customer success and support organizations work very closely with thousands of customers worldwide to help their data teams innovate faster.”
Brian: What type of software engineering talent is optimal for Databricks?
Reynold: “We are hiring software engineers from all subareas of computer science, from cloud infrastructure, databases, distributed systems, and developer tooling to machine learning. Our engineers are recognized by their peers outside Databricks as the top engineers, but at the same time are extremely collaborative and “customer-obsessed”. That means they tend to care a lot more about the impact of what they have created on our customers, rather than the creation process itself. We also emphasize “own it” a lot as a cultural principle. People are here on a mission and they are willing to do whatever it takes to drive projects end to end. When something is not going well, they don’t spend energy blaming somebody else, but rather focus on finding a solution.”
Brian: What do you find most exciting about the future for Databricks?
Reynold: “Of course one of the most exciting parts is the growth of the company. We have become one of the fastest-growing SaaS companies ever created, and it will be terrific to see the next phases of growth.
What I find even more exciting than the growth itself is that I wake up every day learning about new use cases that our platform has enabled for our customers. We already discussed some very interesting ones that have already created a large impact, but I believe the best is yet to come. Perhaps one day we will indeed receive an email from a major pharmaceutical company or a university research lab saying that some data analysis and machine learning done on our platform has led to the creation of a new drug that cures cancer. We are really lucky that we are solving intellectually challenging technical problems every day, and those solutions are helping create a better world.”