Machine learning, a branch of artificial intelligence (AI), is a megatrend that is being embedded everywhere and is essential for enterprise survival and competitive advantage. It is easy to implement with the drag-and-drop tools widely available from vendors such as Microsoft (in Azure), and it is already entrenched in familiar, rapidly adopted environments such as Windows 10. For this reason, machine learning is a prominent theme in my upcoming keynote at the IFIP World Computer Congress Industry Prospect Conference. The implications are also spotlighted in my speeches to the United Nations General Assembly in July and to the United Nations Global e-Government Forum International Scientific Practical Conference (with follow-up discussions with ministers), and in the session I chair at the CIO CITY forum for EU CIOs of the Year. It’s that important.
In preparation for the World Congress, I talked with the leading light in the field, Pedro Domingos. He holds the Nobel Prize equivalent in data science and this year won the top prize in AI for the RDIS optimization approach, which on average performed tasks between 100,000 and 10 billion times more accurately than previous methods and has broad applications in all areas of science, engineering and business. I’m sharing my chat with him because of its profound impact on business, industry, government, science, education, media, and society. If there is one interview of mine from the last 30 years that you should listen to and read, it is this one with Domingos. To get you started, here is a quick ten-minute video from Pedro of sample applications in business, finance, e-commerce, information extraction, social networks, web search, debugging, computational biology, space exploration, robotics, … pretty much any area you can name.
First, some background on Pedro.
Pedro Domingos is a professor of Computer Science at the University of Washington in Seattle. He is a winner of the SIGKDD Innovation Award, the highest honor in data science. He is a Fellow of the Association for the Advancement of Artificial Intelligence and has received a Fulbright Scholarship, a Sloan Fellowship, the National Science Foundation’s CAREER Award, and numerous best paper awards.
He received his Ph.D. from the University of California at Irvine and is the author or co-author of over 200 technical publications. He has held visiting positions at Stanford, Carnegie Mellon, and MIT. He co-founded the International Machine Learning Society in 2001. His research spans a wide variety of topics in machine learning, artificial intelligence and data science, including scaling learning algorithms to big data, maximizing word of mouth in social networks, unifying logic and probability, and deep learning.
To listen to the interview, go to the non-profit ACM Learning Center podcasts or click on this MP3 file link. The Learning Center also provides additional text from the interview.
Here are extracts from the full interview.
Ibaraki: Let’s start with your most recent roles and research interests. You and Ph.D. student Abe Friesen contributed the paper that won the top prize at the 24th International Joint Conference on Artificial Intelligence, the world’s largest AI conference. Together you developed a new algorithm that others often describe as magical. Let’s explore this work. Can you define optimization in the context of AI and the broad class of nonconvex optimization problems?
Pedro: “Optimization is really what a lot of science, engineering and business problems boil down to at the end of the day. It’s the problem of finding the settings for the variables that give you the most of what you want. What our algorithm is basically doing is taking some ideas from AI and computer science and bringing them over to the problem of optimization with numeric variables. So you can do optimization with discrete variables, which is what people have traditionally done in AI, or you can do it with numeric variables which is what happens in most of engineering and increasingly in AI.”
Ibaraki: What are the roots of RDIS, the evolution of the research, and what is RDIS?
Pedro: “RDIS stands for Recursive Decomposition into Independent Subspaces and the idea is actually very intuitive, it’s based on how we human beings solve problems. We break them up into smaller sub-problems and then we break them up again until the problems that remain are simple enough so that you can just solve them one at a time outright. Then you can combine the solutions back again and then find the solution for the whole problem. As widespread as this is in computer science, the thing that’s amazing is that in continuous problems people don’t do that. So our motivation behind this was really to bring some of those ideas from AI and computer science over into continuous optimization.”
Ibaraki: Now let’s get to the magical part, what is the performance of RDIS?
Pedro: “The thing that is amazing about this is that when we do this we often end up getting exponential improvements in either the speed with which we can solve problems or the quality of the solution given the fixed time. The reason this happens is because we are doing this decomposition of the problems into small sub-problems (the individual problems are over a smaller number of variables), and as a result there is an exponentially smaller space to search. When you can do this (not always), you could potentially get really spectacular improvements.”
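To make the "exponentially smaller space" point concrete, here is a minimal Python sketch. It is not RDIS itself: the objective, the grid, and the function names are made up for illustration, and the sketch assumes the split into independent sub-problems is already known, whereas the algorithm Pedro describes has to find such splits recursively. It compares a brute-force search over both variables at once with a search that handles each variable separately.

```python
import itertools

# Hypothetical separable objective: x and y never interact,
# so each term can be minimized on its own.
def g(x):
    return (x - 3) ** 2

def h(y):
    return (y + 1) ** 2

def f(x, y):
    return g(x) + h(y)

# 21 candidate values per variable, from -5.0 to 5.0 in steps of 0.5.
grid = [i * 0.5 for i in range(-10, 11)]

def joint_search(grid):
    """Search every (x, y) pair: len(grid) ** 2 evaluations of f."""
    evaluations = 0
    best = None
    for x, y in itertools.product(grid, grid):
        evaluations += 1
        value = f(x, y)
        if best is None or value < best[0]:
            best = (value, x, y)
    return best, evaluations

def decomposed_search(grid):
    """Search x and y independently: 2 * len(grid) sub-problem evaluations."""
    best_x = min(grid, key=g)  # len(grid) evaluations of g
    best_y = min(grid, key=h)  # len(grid) evaluations of h
    evaluations = 2 * len(grid)
    return (f(best_x, best_y), best_x, best_y), evaluations

if __name__ == "__main__":
    print(joint_search(grid))
    print(decomposed_search(grid))
```

Both searches find the same minimum, but the joint search needs 441 evaluations while the decomposed one needs 42. With d independent variables and n candidate values each, the gap is n to the power d versus d times n, which is where the exponential saving comes from when a problem can be split this way.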
Ibaraki: Can you provide some specific examples where it can be applied and what this means to each domain of the application?
Pedro: “In principle this could be applied to any domain where continuous optimization is applied and the number of domains where that happens is really endless. Vision is one and another one is robotics. In business, for example, what is the best use to put your resources to or how much of different things to produce? In engineering for example, designing the shape of an airplane is a continuous optimization problem, the same thing for cars, electronic circuits, designing power plants, and figuring out what is the best configuration of components to do different things.”
Ibaraki: I can see now applications in things like economics and finance, epidemiology, genomics and so on. Do you see applications in those areas?
Pedro: “Definitely so. In economics there’s all these variables that you need to vary to get your best results. A very classic example in finance is you want to find the optimal sequence of trades that will maximize your profit, like the amount of money that you make at the end of the day. In epidemiology, one of the big things is to detect outbreaks early, but you can’t have sensors everywhere because that is not feasible, so you might want to optimize exactly where you put them so that the cost is minimal, but you get the earliest detection possible.”
Ibaraki: How do you see the work evolving and what are your specific next steps?
Pedro: “One of the things we want to do is explore more applications of this new algorithm. We are still finding out what it’s good for and what it’s not, but the other thing that we want to do is improve the algorithm. I think part of why we won this award is that this is not necessarily just one algorithm, it’s potentially a whole new direction in optimization, and whole new directions in optimization don’t come up every day. One of the things we want to do is come up with better methods for figuring out how to break the problem into sub-problems. Another example of something that we want to deal with is right now we are only dealing with the problems in the variables that they’re originally described in. For example, like the positions of the different amino acids in the proteins, I think that you can probably solve the problems a lot better if you transform those variables into a new set of variables. I have a feeling that if you combine ideas like that with this algorithm you will find much better ways to split the problem in terms of the right variables and as a result do much better.”
Ibaraki: There are a lot of interdisciplinary thoughts in this. Are you interfacing with many of the other departments at your university?
Pedro: “Definitely. One of the fascinating things about optimization is that it cuts across so many disciplines.”
Ibaraki: You mentioned that there are some areas where it is perhaps not so good. What are some of its limitations?
Pedro: “This algorithm is not a silver bullet; it’s not going to solve every problem. It’s a very good solution when the problem has this characteristic that it can be divided into smaller sub-problems.”
Ibaraki: You’ve come out with this remarkable book and have some really interesting ideas in there. Can you talk about your new book and some of the key points that the audience needs to pay attention to?
Pedro: “The book is called ‘The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World’. It’s not a technical book. There’s no equations there (well there’s a couple), there’s no pseudo-code; it’s really trying to explain the deep ideas behind machine learning to a general book-reading audience. The reason I decided to write this book is that I realized there’s a really urgent need for one because machine learning is not a small obscure field anymore that a few scientists worry about, it’s something that touches everybody’s life every day. Part of my goal was to demystify machine learning and give people a conceptual model of how learning works.”
Ibaraki: What will computers and robots look like by 2020?
Pedro: “There’s a lot of robots in factories and then there are very simple ones like the Roomba that can vacuum a floor, but I think we are reaching the point where robots can really take off because the computing power is there, the sensors are there, the components are inexpensive enough that this actually becomes feasible. Also AI – because the crucial question is building the brain of the robot – has progressed to the point where we can actually start doing these things, but I think a lot of the robots that we are going to see are going to be very different from what people imagine. A self-driving car is a robot, it’s just a robot in the shape of a car. Other robots are going to be very specifically designed for the particular thing that they do in the world so there’s going to be a Cambrian explosion of different kinds of shapes and sizes and types of robots. It’s going to be very exciting. The same with computers, I think we are going to see increasingly powerful computers on the one end. At the other end of the spectrum we are also going to see tinier and tinier computers embedded in everything until the world that we live in is going to have intelligence embedded in it everywhere.”
Ibaraki: Are you saying that AI, robots, and machine learning or deep learning, and the impact they are going to have, could be a major disruptor and a pivotal point in our history?
Pedro: “This gets back to when you asked me why do you work on machine learning? It’s that if you care about all these global problems, machine learning and all these technologies are all going to be a big part of solving them. In a way we are limited by our ingenuity, but I think we have so many tools at our disposal today that it behooves us to use them to solve all these problems, and I think we are going to see a lot of progress on them in the next 10 to 20 years. Of course not all of it is going to come from technology, but I think technology is a big part of it.”
Ibaraki: Can you talk further about recent advances in knowledge discovery and data mining?
Pedro: “Knowledge discovery and data mining didn’t exist as a field or barely existed 20 years ago, but now it’s this huge sprawling field that reaches everywhere. Even I, an old-timer, can’t really keep up with everything that goes on there anymore, but I can tell you what some of the main things are. One of the main things is that data streams have become one of the big things that people do, mining data continuously. Another very big one which again basically did not exist 20 years ago is mining networks, learning about networks (often social networks, but could also be other kinds of networks). Another area is that the world is full of unstructured data. We used to only mine databases of records, but these days we mine text, audio, video, and increasingly we mine combinations of them.”
Ibaraki: What are the big questions in machine learning?
Pedro: “One of the big questions is, ‘Is there such a thing as a universal learner?’. If there is such an algorithm the next question is, ‘What is it going to look like?’. There are similar questions in all the schools of thought, but I think at the end of the day (and this is very much the argument that I make in the book and I make to the community), each of these tribes is solving a real problem and they have some very brilliant solutions to them, but in the end to solve the machine learning problem it’s not enough to solve one of these problems, you have to solve all of them in the same algorithm. The biggest question for me is how we combine these pieces into what you might call the grand unified theory of machine learning; in the same way that there are grand unified theories in physics like the standard model or in biology like the central dogma, I think we should be looking for one in machine learning and in AI.”
Ibaraki: What are big questions in artificial intelligence overall?
Pedro: “In artificial intelligence the problem really boils down to the following thing: on the one hand we need powerful representations that we can encode knowledge in – if we want to build a really intelligent system as opposed to a system that does a very specialized thing we need very powerful representations. At the same time when the representation is powerful it also becomes intractable. Computing with it, doing inference with it becomes unbearably expensive so AI really boils down to this problem of how do you find representations that are expressive enough for what you want to do with them like controlling the robots or whatever or building a knowledge base on the web, but at the same time not be so expressive that they become intractable.”
Ibaraki: What are the big questions in scaling learning algorithms to big data?
Pedro: “One of them is how to parallelize learning algorithms. Another is how to learn on data streams, which we already talked about. Another arises because we are dealing with networks as opposed to isolated examples, and this creates a huge scaling problem. I think at the end of the day the most interesting scaling up problem is what algorithms can we design for this world of large scale data that are actually different from the ones that we had before?”
Ibaraki: Again in a broader sense, what are the big questions in deep learning?
Pedro: “The central question in deep learning is how can you discover a representation of the world in the internal layers of your network? This is a great problem but it is still far from solved, so what is preventing this from happening? Another problem is scaling up, you take advantage of a lot of data and all these things have to run fast enough that you can stream the data through and basically learn your model as the data streams through. Another question is how do you incorporate more of these symbolic types of learning and inference into a deep learning network? Again, if you believe in modeling the brain you know that the brain understands language and the brain can reason and plan, but today’s deep networks can’t do that yet. So one of the frontiers is how do you make them do that?”
Ibaraki: We’ve already talked about some of the applications of this. Can you elaborate further, for example on business, government, media, education and society?
Pedro: “Deep learning folks and machine learning folks in general are very ambitious. I don’t think there is ever a problem that they look at that they don’t think that they could ever apply machine learning to. What you are already seeing that you are going to see more of in the future are businesses where there is learning in every nook and cranny of the business. When you look at companies like Google and Amazon that is already the case. They use machine learning pretty much everywhere. What is true of business is also true of government, is also true of healthcare and is also true of education.”
Ibaraki: What will you do next?
Pedro: “My next three months will be largely taken up with promoting the book so there will be a lot of things to do, pieces to write, media, talks to give and so on and so forth. I’m also continuing to do research so I have several very exciting problems that me and my students are working on, some of them related to deep learning and some of them to unifying logic and probability. I think in research and a lot of other things it pays to be opportunistic. You should always have your eyes open for the opportunities that come up and if a new better thing to do appears, then run with it.”
Ibaraki: Pedro, with your demanding schedule, we are indeed fortunate to have you come in to do this interview. Thank you for sharing your substantial wisdom with our audience.