2019 was a year of consolidation for the field of artificial intelligence (AI), which stands in stark contrast to the significant breakthroughs of the year before.
In 2018, we witnessed NVIDIA’s use of generative adversarial networks (GANs) to generate hyper-realistic human faces, the development of a new model called BERT (Bidirectional Encoder Representations from Transformers) that makes state-of-the-art language processing more accessible, and AI that can learn in its own dream environment, among other innovations.
The changes throughout this year were more subtle but still important. As I look forward to a new decade and the opportunities that lie on the horizon for this fast-moving field, here are six trends that defined this past year and will have the most influence on the next.
These thoughts were developed through my research, conversations with savvy peers in the industry, and the work I have created with my team at Helixa. I have also drawn key learnings from several international conferences on applied AI and machine learning, including O’Reilly AI in London, ML for Enterprises in Rome, and ICLR.
2019 was the year of data privacy and AI responsibility – rightfully so.
To have an AI system we can trust, we need to ensure fairness, robustness, explainability, and transparency, among other things. Tools like Empathy Map, Stakeholder Map, and Inclusive Panda have come onto the scene to identify the ways consumers will ultimately be affected by the tools and applications we design.
With multiple philosophical foundations that differ quite a bit, it can be hard to decide which version of ethical and responsible practice to adopt. Going forward, the challenge for those who design and implement AI will be to build systems able to account for different points of view, weigh pros and cons based on criteria like population size, transparently communicate and collaborate with all of the involved actors, and react in a timely manner to accommodate changes in society's needs.
The Institute for Ethical AI & Machine Learning has a set of principles that should guide ethical decision-making in this space, and Design Ethically provides a great set of resources to augment those principles.
I dig further into ethics and responsibility in this article.
Federated learning and differential privacy are coming into their own to help us scale AI more rapidly.
Taken together, federated learning and differential privacy provide an opportunity to learn much more from consumer data while treating that data responsibly and maintaining individual privacy.
Federated learning is the use of machine learning to train algorithms on consumer devices without exchanging data or compromising privacy. This comic from Google AI explains it well. With differential privacy, we can share information and learnings from a data set that describes patterns of a group while withholding information about individuals.
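The server-side aggregation step at the heart of federated learning fits in a few lines: each device trains on its own data and sends back only model weights, which the server combines without ever seeing the raw data. Here is a toy illustration of that aggregation (the weighting-by-dataset-size scheme follows the standard FedAvg algorithm; the small arrays stand in for real model parameters, and none of this reflects any particular production system):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Server-side step of federated averaging (FedAvg): combine model
    weights trained locally on each device, weighted by the size of each
    device's local dataset. Raw training data never leaves the device."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three simulated devices, each holding locally trained weights (toy 2-parameter models).
weights = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [10, 10, 20]  # the third device has twice the data, so twice the influence

global_weights = federated_average(weights, sizes)
# global_weights is the size-weighted mean: [3.5, 4.5]
```

In a real deployment the communication, client selection, and local training loops add considerable complexity, but the privacy property comes from this simple structure: only parameters cross the network.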
We apply the principles of federated learning to our audience projection model at Helixa. If you want to learn more, I take a more in-depth look at both federated learning and differential privacy in this post.
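To make the differential-privacy idea concrete, the classic Laplace mechanism releases an aggregate statistic after adding noise calibrated to the query's sensitivity and a privacy budget epsilon, so the result reveals the group pattern but masks any single individual's contribution. The function below is a minimal sketch of that mechanism (the name `private_count` and its parameters are illustrative, not any specific library's API):

```python
import numpy as np

def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with differential privacy via the Laplace mechanism.
    Noise scale = sensitivity / epsilon: a smaller epsilon means more noise
    and stronger privacy; sensitivity is how much one individual can change
    the count (1 for a simple counting query)."""
    if rng is None:
        rng = np.random.default_rng()
    scale = sensitivity / epsilon
    return true_count + rng.laplace(loc=0.0, scale=scale)

# Example: 1,000 people match a segment; we publish a noisy count instead.
rng = np.random.default_rng(42)
noisy = private_count(1000, epsilon=0.5, rng=rng)
```

The published value will typically be within a few units of the truth, which is fine for describing a group of a thousand but useless for pinpointing whether any one person is in it.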
The tools and providers used by AI-focused companies are converging.
Though certain companies may include other tools, I’m mostly seeing stacks that include Python, Golang, Jupyter, TensorFlow, Docker, Kubernetes, MLFlow/Kubeflow or similar ML pipelines tools, Spark, and Ray.
Cloud providers are consolidating as well, down to a few major players: Amazon AWS, Google Cloud Platform, Microsoft Azure, and IBM. It's worth noting that the integration of serverless technology is premature, but still the most promising solution.
You can read more in this post I wrote about AI technology trends.
More AI tools are becoming industrial-grade for enterprise applications.
There were many instances this year where we saw existing tools progress into more robust executions. TensorFlow 2.0 was released. BERT models evolved into leaner versions such as DistilBERT, FastBERT, and the recently released ALBERT. Computer vision can be considered a solved problem for the majority of commercial applications that tolerate a small percentage of mistakes.
However, not every application has made this specific leap. Reinforcement learning is showing continuous signs of progress led by DeepMind and OpenAI, but most applications involve teaching agents to play games at a level that matches or surpasses human performance. There are some early enterprise applications for this technology, though, like Bonsai in the manufacturing industry and nascent opportunities in financial markets.
Off-the-shelf solutions are progressing but haven’t eliminated the need for data science teams.
Certain tasks, like computer vision, natural language, and conversational speech, have been solved by existing applications, and AutoML can help take care of traditional tasks with domain-specific models. Some may take this as a sign that their data science workflow is becoming redundant, but that is not the case at all.
Data science teams frame the problem correctly from the start and supervise every stage of the process. Off-the-shelf solutions are powerful, but still cannot recognize real business needs or respond with the right scientific methodology.
In fact, half of the projects I see fall outside the conventional machine learning solutions, and those teams end up relying on bespoke algorithms developed by highly skilled scientists and engineers. You can learn more about the importance of a strong data science team from my post here.
AI is making the news, but mostly in areas where it’s still catching up.
Self-driving cars might be the application of AI that is struggling the most. The optimistic attitudes around a new automotive revolution through autonomous vehicles have been scaled back to more realistic scenarios like electric cars. Tesla's promise to deliver "full self-driving" has been pushed back to the end of 2020 at the earliest. The limitations may be the hardware and sensors needed for high-speed scenarios, but we should still be thinking of full self-driving as less of a feature and more of an aspirational limit that we will asymptotically approach.
Also, as with any technological progress, AI also has negative perceptions due to its misuse by bad actors. When looking at a timeline for AI, years from now, 2019 may be remembered as the year when deepfakes exploded.
The community fight against deepfakes has just begun. Facebook, in partnership with Microsoft and academics from major universities, recently launched the Deepfake Detection Challenge. Twitter made a call for feedback from the public regarding synthetic and manipulated media. The California Governor just signed two laws to ban deepfakes, though it is not clear how they can practically be enforced.
In the last couple of years, we moved from questioning the practicality of AI to discussing the best business cases to start with. Now we are beginning to have the conversation about the best ways to maximize and monetize enterprise AI products.
There are plenty of commercially valuable tasks that AI is currently solving with great accuracy, efficiency, and reliability. Many of them are accessible simply by surveying the state of the art or using off-the-shelf AI services.
There are no longer doubts about the maturity of AI technology, so in 2020 I could see more of a focus on finding our own way in the jungle of different tools and algorithms. I hope to dive into my thoughts on 2020 and our path forward in an upcoming post, so stay tuned.
Gianmario is the Chief Scientist and Head of AI at Helixa. His experience covers a diverse portfolio of machine learning algorithms and data products across different industries. He is also co-author of the book "Python Deep Learning", contributor to the "Professional Manifesto for Data Science", and founder of the DataScienceMilan.org community. Read more of his work at Vademecum of Practical Data Science.