Mythbusters reloaded: 7 more myths about AI

By Sebastian Schaal

It has been over half a year since I published the article Mythbusters: 7 common myths about AI. On my quest to demystify AI, I have held many workshops and talks, in which I encountered strong enthusiasm for the topic but also numerous misconceptions. In this follow-up post, I have collected and debunked seven more frequently encountered myths.

Myth #8: AI is only for the tech elite

US tech giants and top universities used to drive most of the open-source contributions. However, other parts of the world, especially China, have caught up and are taking the lead in many fields.

When reading about AI and why it has not yet been widely adopted by the mainstream, authors mention the lack of high-quality data, suitable infrastructure, and talent. There is a prevalent belief that only a handful of experts worldwide are involved in state-of-the-art AI research and implementation, so your chances of employing one of these select few are low.

Whoever lets this statement discourage them underestimates the current dynamics of the field. As already stated in one of our former Mythbusters, open source plays a central role in AI: authors often publish the code base of their cutting-edge research on GitHub before they present the actual paper at a conference. This enables anyone who is comfortable digging into the code to use and build upon months of intense research. There is also a crucial distinction between applied AI and research endeavors: the latter focus on squeezing out the last percentage points of performance, for instance on a public challenge (e.g., reducing classification error from 3.6% to 3.1%), with little concern for scalability or real-world impact. With a practical mindset and some enthusiasm for implementation, you can integrate the latest AI advances into an application by using the commercial APIs various companies offer. To further hone your skills, there is also a great variety of online courses that help democratize the field of AI. Fast.ai, for example, offers a high-level wrapper that is easy for any developer to use and already integrates current deep learning best practices.

The lesson here: AI is not only for the tech elite! You do not necessarily need a Ph.D. to create value with AI, as you can make use of all the great open-source materials available.

Myth #9: AI is making humans obsolete

Instead of fighting the machine revolution, we should use intelligent systems to augment our intelligence.

One common negative sentiment towards AI is the idea of machines developing into a one-to-one replacement for humans, rendering us obsolete. This fear implies that artificial intelligence will very soon resemble human intelligence. However, besides the obvious biological differences, artificial and human intelligence differ fundamentally in their scope and capacity.

The AI systems of today are very good at scaling their current intelligence beyond the boundaries of a single unit, as no human ever could, but they struggle to reason about the unknown or to transfer their knowledge to new tasks. This leads us to believe that the AI revolution will end up having a similarly positive impact as the industrial revolution: back then, we shifted from physical to cognitive work, and now human labor will transition from repetitive to complex cognitive tasks. We cannot escape the fact that most professions will change, and that this will directly affect people. However, I genuinely believe that in the long term AI will be used to augment, rather than replace, human intelligence. In the short term, we think policymakers will need to support workers whose jobs are displaced and structurally prepare us for the new AI age.

The good news: it has never been this easy to acquire new skills, thanks to the emergence of massive open online courses (MOOCs), which further democratize learning. Feel free to check out our past blog posts on how AI will be your future best friend and how AI can shape the future of education.

Many people think of AI as a magical thing that automatically gets better and better. In most of the ML systems used today, the system has been trained on historical data, i.e., it has been shown many past examples and has built a general understanding from them. When it is running in production, it uses this knowledge to make judgments about new observations it has never seen before. And often, that is the end of it.

When you have the chance to check these judgments (e.g., predictions) against the (potentially future) reality, you get feedback about how well your model has done. Ideally, if you can collect a lot of this feedback, you can eventually re-train your model to improve on its earlier mistakes. However, in most cases, this does not happen automatically or on the fly but requires an engineer and some additional tuning to be effective.
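To make this concrete, here is a minimal sketch of that cycle in Python, using scikit-learn and synthetic placeholder data purely for illustration: train once on historical examples, predict in production, measure against later feedback, and only then retrain manually.

```python
# Minimal sketch of the typical (non-automatic) ML lifecycle.
# Data is synthetic and only serves as a placeholder.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# 1) Train once on historical examples (past observations and labels).
X_hist = rng.normal(size=(1000, 5))
y_hist = (X_hist[:, 0] + X_hist[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X_hist, y_hist)

# 2) In production, the frozen model judges observations it has never seen.
X_new = rng.normal(size=(200, 5))
predictions = model.predict(X_new)

# 3) If ground truth arrives later, we can measure how well the model did ...
y_truth = (X_new[:, 0] + X_new[:, 1] > 0).astype(int)
print("accuracy on new data:", (predictions == y_truth).mean())

# 4) ... but retraining does not happen on the fly: an engineer has to merge
#    the feedback into the training set, re-tune, and redeploy the model.
model = LogisticRegression().fit(np.vstack([X_hist, X_new]),
                                 np.concatenate([y_hist, y_truth]))
```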

The closest we get to this myth is the concept of Reinforcement Learning, where an agent learns by interacting with an environment and observing the consequences of its actions.
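As a rough illustration of that loop (a toy sketch, not how production RL systems are built), the following epsilon-greedy bandit acts, observes a reward, and updates its value estimates from the outcomes it experiences:

```python
# Toy sketch of the reinforcement learning loop: act, observe a reward, update.
# A two-armed bandit with made-up payout probabilities.
import random

random.seed(0)
payout_probs = [0.3, 0.6]      # hidden reward probability of each action
value_estimates = [0.0, 0.0]   # the agent's current belief about each action
counts = [0, 0]
epsilon = 0.1                  # exploration rate

for step in range(10_000):
    # Mostly exploit the best-known action, sometimes explore.
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        action = max(range(2), key=lambda a: value_estimates[a])

    # The environment responds with a reward: the consequence of the action.
    reward = 1.0 if random.random() < payout_probs[action] else 0.0

    # Update the estimate of the chosen action from the observed outcome.
    counts[action] += 1
    value_estimates[action] += (reward - value_estimates[action]) / counts[action]

print(value_estimates)  # approaches the true payout probabilities
```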

Myth #10: AI cannot be creative

The Starry Night by Vincent Van Gogh, while dreaming of animals.

Creativity is often considered an exclusively human privilege that a machine could never claim for itself.

In AI, this holds true for many supervised learning systems, where the model is used to predict an outcome based on an input, e.g., classifying an image. We also know that today’s AI systems draw their knowledge from their training data and/or experiences with the environment — so how can this be creative?

Without extending this to a more philosophical question, one can argue that as humans, we get our creative ideas from combining different sources of inspiration. Some generative models today follow a similar logic, even when doing a particular task.

For example, researchers have built systems like DeepBach, which creates new compositions in the style of Bach, or engines like AIVA, which composes original emotional soundtracks. Techniques like Generative Adversarial Networks (GANs) are used to imagine hyper-realistic faces or to create new street-scene images for autonomous driving. AlphaGo shocked the Go world with its move 37, a strategy no human had ever followed before. And lastly, Google's Deep Dream fostered a market for art generated by algorithms. The Deep Dream Generator offers tools for you to play around and create your own art pieces.

So, when we apply the definition of creativity as “a phenomenon where something new and valuable is formed”, even our current forms of AI are starting to make contributions.

Myth #11: AI models are biased

Since Turkish does not distinguish between male and female pronouns, the algorithm used to choose the "more likely" outcome given the dataset it was trained on. Google's gender-specific translation detects these cases and now offers alternative translations.

First, we have to remember the distinction between AI and Machine Learning from our earlier posts. A non-ML AI system, or expert system, is designed by a human who teaches the program how to act based on rules. If not quality-controlled, these subjective rules can carry the biases of their creator.

When moving to ML, there have been many cases in the media about biased systems, e.g., favoring white people when granting parole or reproducing common gender stereotypes in translation. However, we have to understand that it is not the model or its architecture that introduces the bias, but the data it has been trained on. Today's models are designed to internalize the concepts they discover, including all their shortcomings. When they are trained and evaluated on imperfect data generated by biased humans, they are incentivized to copy those biases.

Luckily, there are a lot of efforts to change this, from excluding personal data in the first place to detecting biases and counteracting them. In December 2018, Google introduced gender-specific translation, and we are excited for more to come.
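One simple building block of such bias detection is to compare a model's decisions across groups. The sketch below, with hypothetical predictions and group labels, computes the positive-decision rate per group as a crude demographic-parity check:

```python
# Minimal sketch of one bias check: compare positive-prediction rates across
# groups. Predictions and group labels are hypothetical placeholders.
import numpy as np

predictions = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])                # model decisions
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])  # protected attribute

for g in np.unique(group):
    rate = predictions[group == g].mean()
    print(f"positive-decision rate for group {g}: {rate:.2f}")

# A large gap between these rates is a warning sign that the model has picked
# up a bias from its training data and should be investigated, e.g., by
# re-sampling, re-weighting, or post-processing its decisions.
```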

Myth #12: AI models know causality

The birth-weight paradox: researchers observed that a mother's smoking during pregnancy seemed to benefit the health of her newborn if the baby happened to be born underweight. The causal diagram shows that other possible causes of an underweight baby, e.g., a birth defect, are more severe and therefore imply a higher mortality rate. Once this effect is excluded, our intuition about the harmfulness of a mother's smoking holds again.

From statistics, we know that correlation, i.e., two events occurring together, does not imply a causal relationship. Bayesian networks, pioneered by Judea Pearl in 1985, provide an essential framework for considering causal effects and were introduced as an alternative to the expert systems common at the time. Causal diagrams form the basis of such networks, encoding the influences of different events on each other as conditional probabilities. These allow us to understand what happens if we perform interventions (e.g., setting a sales price), i.e., to build a model that actually describes the causal relationships in the data.

The same Judea Pearl described current achievements in deep learning as "just curve fitting", stressing their shortcoming of not knowing causality. In fact, plain supervised learning as we use it today is about looking for complex patterns or correlations, fitting a function that best represents the data. While suited for many problems, this approach reaches its limits when designing interaction-based models, e.g., for dynamic pricing. Here, causal inference serves as a framework to structure and understand the problem. Within this framework, we can again rely on deep learning as a powerful curve-fitting tool, bringing the complete system a step closer to modeling causality. If you want to dig deeper into the math, I can highly recommend this post by Ferenc Huszár.
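To make the difference between conditioning and intervening tangible, here is a toy sketch with made-up probabilities for a three-variable causal diagram in which a confounder Z influences both X and Y: simply conditioning on X mixes in the effect of Z, while the backdoor adjustment answers the interventional question.

```python
# Toy sketch: correlation vs. intervention on a three-node causal diagram
#   Z -> X, Z -> Y, X -> Y   (Z is a confounder; all numbers are made up).
P_Z = {0: 0.7, 1: 0.3}                       # P(Z)
P_X_given_Z = {0: 0.2, 1: 0.8}               # P(X=1 | Z)
P_Y_given_XZ = {(0, 0): 0.1, (0, 1): 0.5,    # P(Y=1 | X, Z)
                (1, 0): 0.3, (1, 1): 0.7}

# Observational quantity P(Y=1 | X=1): conditioning on seeing X=1 also
# tells us something about the confounder Z.
num = sum(P_Y_given_XZ[(1, z)] * P_X_given_Z[z] * P_Z[z] for z in (0, 1))
den = sum(P_X_given_Z[z] * P_Z[z] for z in (0, 1))
p_obs = num / den

# Interventional quantity P(Y=1 | do(X=1)): force X=1 regardless of Z, then
# average over the natural distribution of Z (backdoor adjustment).
p_do = sum(P_Y_given_XZ[(1, z)] * P_Z[z] for z in (0, 1))

print(f"P(Y=1 | X=1)     = {p_obs:.3f}")   # correlation-based answer
print(f"P(Y=1 | do(X=1)) = {p_do:.3f}")    # causal answer
```

With these made-up numbers the two quantities differ noticeably, which is exactly the gap that pure curve fitting cannot see.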

Myth #13: AI for industry is like academia

The different workflows in academia and industry

For many companies, the answer to transforming their business into a more AI-driven organization is clear: just hire kick-ass people from research. However, the requirements of research are often very different from those of shipping a successful AI application. When trying to bring findings from academia into industry, we face new commercial challenges as well as additional technical ones. In this segment, I want to focus on the striking differences in the respective machine learning workflows.

In academia, one often starts with a fixed dataset and tweaks the model to achieve ever better performance on an agreed-upon metric (e.g., pushing the top-5 accuracy on ImageNet, a renowned image dataset). In industry, it works the other way around: most of the time, you know the performance criteria you have to hit to bring an application into production, and they are more diverse than in academia (e.g., including speed and explainability). The model is variable, but you are well advised to start with an open-source implementation suited to your problem. However, the biggest lever you have is the training data. Analyzing where your model still makes mistakes and collecting more data in those areas is the central component of your work.
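As a rough sketch of this data-centric loop (with entirely hypothetical slices, error rates, and thresholds): slice your evaluation set, find where the model fails most, and prioritize data collection there.

```python
# Minimal sketch of the data-centric industry loop: evaluate per slice,
# find the weakest slice, collect more data there. Numbers are hypothetical.
errors_per_slice = {      # fraction of wrong predictions per data slice
    "daylight": 0.02,
    "night": 0.11,
    "rain": 0.08,
    "snow": 0.17,
}
target_error = 0.05       # the performance criterion the product has to hit

failing = {s: e for s, e in errors_per_slice.items() if e > target_error}
if failing:
    worst_slice = max(failing, key=failing.get)
    print(f"collect and label more data for: {worst_slice}")
else:
    print("all slices meet the target; ready to ship")
```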

I can highly recommend Andrej Karpathy’s talk about the Software 2.0 stack and Rasmus Rothe’s blog post on bringing ML research into commercialization.

Myth #14: AI applications are all about the ML code

The machine learning code is just a small part of the software needed for a successful AI application

The many recent successes involving Machine Learning have further added to the hype around AI. However, the powerful quick wins you can achieve with the help of the ML toolbox often rely on many other working components.

The ML code itself is often just a tiny piece of the puzzle, embedded in surrounding infrastructure and software. Before reaching the ML model, the input data has to be measured or sensed, aggregated, and preprocessed. The outputs of the model have to be monitored, combined with other signals, and finally acted on. In parallel, we have to guarantee proper execution of these processes, including optimal usage of the available hardware.
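The following caricature (hypothetical placeholder functions, not a real API) illustrates the point: the model call is a single line, surrounded by validation, preprocessing, monitoring, and the final business decision.

```python
# Caricature of a production pipeline: the ML model call is a single step,
# surrounded by ingestion, preprocessing, monitoring, and acting on results.
# Every function here is a hypothetical placeholder.

def validate(raw):            # reject malformed inputs
    return [x for x in raw if x is not None]

def preprocess(values):       # e.g., aggregate and normalize the signal
    return sum(values) / max(len(values), 1)

def predict(feature):         # stand-in for the actual ML model
    return 1.0 if feature > 0.5 else 0.0

def monitor(feature, score):  # log inputs/outputs for later analysis
    print(f"monitoring: feature={feature:.2f}, score={score}")

def act_on(score):            # combine with business rules and act
    return "alert" if score == 1.0 else "ignore"

def handle_request(raw_sensor_data):
    feature = preprocess(validate(raw_sensor_data))  # before the model
    score = predict(feature)                         # the "ML code"
    monitor(feature, score)                          # after the model
    return act_on(score)

print(handle_request([0.9, None, 0.8, 0.7]))
```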

The massive ongoing maintenance costs in real-world ML systems are very well described in Google’s paper about their “hidden technical debt”. This includes the missing modularity of trained ML models, the strong dependency on the training data, the need for feedback loops and monitoring, and lastly the challenge of a constantly changing external world. In a recent blog post, we have shown how diverse the deep learning toolset is, and that model architecture is just one step in the lifecycle.

Conclusion

With AI becoming an increasingly public topic, I am sure that misconceptions like those above will keep coming up, perpetuating a distorted picture of this technology. I hope this post helped you gain some additional insights into the field and made it clear that AI is not magic, but applied math with clear limitations. We should neither be paralyzed by the fear of a general AI taking over the world, nor disappointed by the limitations of current state-of-the-art systems.

My former Stanford professor Andrew Ng coined the phrase “AI is the new electricity”, suggesting that AI will have an equally transformative effect on all parts of our lives as electricity once did. This idea leaves only one sensible course of action: we have to educate ourselves about AI and understand how we can incorporate it to solve our problems if we do not want to get left behind.

The original post can be found here.

About the author:

Sebastian Schaal is Founder at Luminovo, where he works on B2B deep learning projects and building tools to automate common deep learning workflows. He graduated in Electrical and Computer Engineering from TU Munich and is an alumnus of the CDTM. He obtained a second M.Sc. from Stanford University. 
