Artificial intelligence (AI) has become one of the most significant emerging technologies of the past few years. Some market estimates anticipate that AI will contribute 16 trillion dollars to the global GDP (gross domestic product) by 2030. While there has been accelerating interest in implementing AI as a technology, there has been concurrent growth in interest in implementing successful AI strategies. Some key elements of such strategies that have emerged include explainable AI, trusted AI, AI ethics, operationalizing AI, scaling sustainable AI operations, workforce development (training), and how to speed up all of this development.
The 2020 year of the pandemic has forced organizations to speed up their digital transformation and advanced technology adoption plans, essentially compressing several years of anticipated developments into several months. These accelerated developments cover a wide scope, including: technology-enabled remote work solutions, technology-enhanced health and safety programs, AI-powered implementations of “all of the above”, and sharpened focus and attention on their workforce: future of work in the age of AI, AI-assisted human work and process enhancements, and training initiatives (including data literacy and AI).
In the recent 2020 RELX Emerging Tech Study, results were presented from a survey of over 1000 U.S. senior executives across eight industries: agriculture, banking, exhibitions, government, healthcare, insurance, legal, and science/medical. The survey, carried out by RELX (a global provider of information-based analytics and decision tools for professional and business customers), focused on the state of interest, investment, and implementations in AI tech during the pandemic period. But it was not just a snapshot on the state of AI in 2020. The survey also had forward-looking questions, as well as historical comparisons and trends from results of similar “state of AI in the enterprise” surveys in 2018 and 2019.
Some of the most remarkable findings in the 3-year trending data include these changes from 2018 to 2020:
1) The percentage of senior executives who stated that AI technologies are being utilized in their business dramatically increased from 48% to 81%.
2) The percentage who are concerned about other countries being more advanced than the U.S. in AI technology and implementation increased from 70% to 82%.
3) The percentage who believe that government programs should assist in AI workforce development increased from 45% to 59%.
4) There was a slight increase (though still a minority) in those who believe that the government should leave the promotion of AI technologies to the private sector, growing from 30% to 36%.
5) There is a continued (and growing) strong belief that U.S. companies should invest in the future AI workforce through educational initiatives such as university partnerships, increasing from 92% to 95%.
6) The percentage who are offering training opportunities in AI technologies to their workforce increased significantly from 46% to 75%.
Some of the interesting findings specifically in the 2020 survey results include:
1) Across all sectors, a strong majority (86%) of survey respondents believe that ethical considerations are a strategic priority in the design and implementation of their AI systems, ranging from 77% in government and 80% in the legal sectors, up to 93% in banking and 94% in the insurance sectors, with healthcare and science/medical near the middle of that range.
2) The majority (68%) of respondents stated that they increased their investment in AI technologies, 48% of which invested in new AI technologies. The sectors with the greatest increases in investment were insurance, banking, and agriculture, followed closely by healthcare and science/medical.
3) 82% stated that AI technologies are most likely to be used to increase efficiencies and worker productivities.
4) Only 26% of respondents reported that AI technologies are being used to replace human labor.
Regarding the last point, this minority view regarding AI’s potential negative impact on employment is consistent with a 2019 worldwide survey of 19,000 employers, which found that 87% plan to increase or maintain the size of their staff as a result of automation and AI, and that just 9% of companies across the globe and 4% in the U.S. anticipate cutting jobs. Another report stated, “Rather, than being replaced, humans will be redeployed into higher-order jobs requiring more cognitive skills.”
(a) COVID-19 drove increased AI tech investment and adoption.
(b) The use of AI has increased across all sectors that were polled.
(c) Ethical AI is viewed as both a priority and a competitive advantage.
(d) International competition remains a concern for U.S. organizations.
(e) AI workforce training and development is a major component of AI strategy, though AI implementations consistently outpace training initiatives.
Vijay Raghavan, Executive Vice President and Chief Technology Officer for the Risk & Business Analytics division of RELX, has summarized the survey results very well in the following statements:
“Businesses’ response to COVID-19 has confirmed the view of US business leaders that artificial intelligence has the power to create smarter, more agile and profitable businesses.”
“Businesses face more complex challenges every day and AI technologies have become a mission-critical resource in adapting to, if not overcoming, these types of unforeseen obstacles and staying resilient.”
“Companies that do not dedicate the necessary resources to training existing employees on new AI technologies risk leaving growth opportunities on the table and using biased or otherwise flawed systems to make and enforce major decisions.”
Therefore, when we consider AI strategy, a global perspective is no less important than local organization-specific mission objectives. Understanding your competition, the marketplace, and both the current expectations and future needs of your stakeholders (customers, employees, citizens, and shareholders) is vital. Furthermore, non-technology considerations should be incorporated alongside technology implementation and operationalization requirements. Performance metrics and goals associated with AI governance, ethics, talent, and training must be on the same balance sheet as AI tools, techniques, and technologies. How to bring all of these pieces together in a successful AI strategy has become clearer with the results and insights revealed in the 2020 RELX Emerging Tech survey.
Definitions of terminology frequently seen and used in discussions of emerging digital technologies.
Additive Manufacturing: see 3D-Printing
AGI (Artificial General Intelligence): The intelligence of a machine that has the capacity to understand or learn any intellectual task that a human being can. It is a primary goal of some artificial intelligence research and a common topic in science fiction and future studies.
AI (Artificial Intelligence): Application of Machine Learning algorithms to robotics and machines (including bots), focused on taking actions based on sensory inputs (data). Examples: (1-3) All those applications shown in the definition of Machine Learning. (4) Credit Card Fraud Alerts. (5) Chatbots (Conversational AI). There is nothing “artificial” about the applications of AI, whose tangible benefits include Accelerated Intelligence, Actionable Intelligence (and Actionable Insights), Adaptable Intelligence, Amplified Intelligence, Applied Intelligence, Assisted Intelligence, and Augmented Intelligence.
Algorithm: A set of rules to follow to solve a problem or to decide on a particular action (e.g., the thermostat in your house, or your car engine alert light, or a disease diagnosis, or the compound interest growth formula, or assigning the final course grade for a student).
Analytics: The products of Machine Learning and Data Science (such as predictive analytics, health analytics, cyber analytics).
AR (Augmented Reality): A technology that superimposes a computer-generated image on a user’s view of the real world, thus providing a composite view. Examples: (1) Retail. (2) Disaster Response. (3) Machine maintenance. (4) Medical procedures. (5) Video games in your real world. (6) Clothes shopping & fitting (seeing the clothes on you without a dressing room). (7) Security (airports, shopping malls, entertainment & sport events).
Autonomous Vehicles: Self-driving (guided without a human), informed by data streaming from many sensors (cameras, radar, LIDAR), and makes decisions and actions based on computer vision algorithms (ML and AI models for people, things, traffic signs,…). Examples: Cars, Trucks, Taxis
BI (Business Intelligence): Technologies, applications and practices for the collection, integration, analysis, and presentation of business information. The purpose of Business Intelligence is to support better business decision-making.
Big Data: An expression that refers to the current era in which nearly everything is now being quantified and tracked (i.e., data-fied). This leads to the collection of data and information on nearly full-population samples of things, instead of “representative subsamples”. There have been many descriptions of the characteristics of “Big Data”, but the three dominant attributes are Volume, Velocity, and Variety — the “3 V’s” concept was first introduced by Doug Laney in 2001 here. Read more in this article: “Why Today’s Big Data is Not Yesterday’s Big Data“. Some consider the 2011 McKinsey & Company research report “Big Data: The Next Frontier for Innovation, Competition, and Productivity” as the trigger point when the world really started paying attention to the the volume and variety of data that organizations everywhere are collecting — the report stated, “The United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”
Blockchain: A system in which a permanent and verifiable record of transactions is maintained across several computers that are linked in a peer-to-peer network. It has many applications beyond its original uses for bitcoin and other cryptocurrencies. Blockchain in an example of Distributed Ledger Technology, in which independent computers (referred to as nodes) record, share and synchronize transactions in their respective electronic ledgers (instead of keeping data centralized as in a traditional ledger). Blockchain’s name refers to a chain (growing list) of records, called blocks, which are linked using cryptography, and are used to record transactions between two parties efficiently and in a verifiable and permanent way. In simplest terms, Blockchain is a distributed database existing on multiple computers at the same time. It grows as new sets of recordings, or ‘blocks’, are added to it, forming a chain. The database is not managed by any particular body; instead, everyone in the network gets a copy of the whole database. Old blocks are preserved forever and new blocks are added to the ledger irreversibly, making it impossible to manipulate by faking documents, transactions and other information. All blocks are encrypted in a special way, so everyone can have access to all the information but only a user who owns a special cryptographic key is able to add a new record to a particular chain.
Chatbots (see also Virtual Assistants): These typically are text-based user interfaces (often customer-facing for organizations) that are designed and programmed to reply to only a certain set of questions or statements. If the question asked is other than the learned set of responses by the customer, the chatbot will fail. Chatbots cannot hold long, continuing human interaction. Traditionally they are text-based but audio and pictures can also be used for interaction. They provide more like an FAQ (Frequently Asked Questions) type of an interaction. They cannot process language inputs generally.
Cloud: The cloud is a metaphor for a global network of remote servers that operates transparently to the user as a single computing ecosystem, commonly associated with Internet-based computing.
Cloud Computing: The practice of using a network of remote servers hosted on the Internet to store, manage, and process data, rather than a local server, local mainframe, or a personal computer.
Computer Vision: An interdisciplinary scientific field that focuses on how computers can be made to gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to automate tasks that the human visual system can do, including pattern detection, pattern recognition, pattern interpretation, and pattern classification.
Data Mining: Application of Machine Learning algorithms to large data collections, focused on pattern discovery and knowledge discovery in data. Pattern discovery includes clusters (class discovery), correlation (and trend) discovery, link (association) discovery, and anomaly detection (outlier detection, surprise discovery).
Data Science: Application of scientific method to discovery from data (including Statistics, Machine Learning, data visualization, exploratory data analysis, experimentation, and more).
Digital Transformation: Refers to the novel use of digital technology to solve traditional problems. These digital solutions enable — other than efficiency via automation — new types of innovation and creativity, rather than simply enhance and support traditional methods.
Digital Twins: A phrase used to describe a computerized (or digital) version of a real physical asset and/or process. The digital twin contains one or more sensors that collects data to represent real-time information about the physical asset. By bridging the physical and the virtual world, data is transmitted seamlessly allowing the virtual entity to exist simultaneously with the physical entity. Digital Twins are used in manufacturing, large-scale systems (such as maritime vessels, wind farms, space probes) and other complex systems. Digital Twins are virtual replicas of physical devices that data scientists and IT pros can use to run simulations before actual devices are built and deployed, and also while those devices are in operation. They represent a strong merging and optimization of numerous digital technologies such as IoT (IIoT), AI, Machine Learning, and Big Data Analytics.
Drone (UAV, UAS): An unmanned aerial vehicle (UAV) or uncrewed aerial vehicle (commonly known as a Drone) is an aircraft without a human pilot on board. UAVs are a component of an unmanned aircraft system (UAS); which include a UAV, a ground-based controller, and a system of communications between the two.
Dynamic Data-driven Application (Autonomous) Systems (DDDAS): A paradigm in which the computation and instrumentation aspects of an application system are dynamically integrated in a feed-back control loop, such that instrumentation data can be dynamically incorporated into the executing model of the application, and in reverse the executing model can control the instrumentation. Such approaches can enable more accurate and faster modeling and analysis of the characteristics and behaviors of a system and can exploit data in intelligent ways to convert them to new capabilities, including decision support systems with the accuracy of full scale modeling, efficient data collection, management, and data mining. See http://dddas.org/.
Edge Computing (and Edge Analytics): A distributed computing paradigm which brings computation to the data, closer to the location where it is needed, to improve response times in autonomous systems and to save bandwidth. Edge Analytics specifically refers to an approach to data collection and analysis in which an automated analytical computation is performed on data at a sensor, network switch or other device instead of waiting for the data to be sent back to a centralized data store. This is important in applications where the result of the analytic computation is needed as fast as possible (at the point of data collection), such as in autonomous vehicles or in digital manufacturing.
Industry 4.0: A reference to a new phase in the Industrial Revolution that focuses heavily on interconnectivity, automation, Machine Learning, and real-time data. Industry 4.0 is also sometimes referred to as IIoT (Industrial Internet of Things) or Smart Manufacturing, because it joins physical production and operations with smart digital technology, Machine Learning, and Big Data to create a more holistic and better connected ecosystem for companies that focus on manufacturing and supply chain management.
IoT (Internet of Things) and IIoT (Industrial IoT): Sensors embedded on devices and within things everywhere, measuring properties of things, and sharing that data over the Internet (over fast 5G), to fuel ML models and AI applications (including AR and VR) and to inform actions (robotics, autonomous vehicles, etc.). Examples: (1) Wearable health devices (Fitbit). (2) Connected cars. (3) Connected products. (4) Precision farming. (5) Industry 4.0
Knowledge Graphs (see also Linked Data): Knowledge graphs encode knowledge arranged in a network of nodes (entities) and links (edges) rather than tables of rows and columns. The graph can be used to link units of data (the nodes, including concepts and content), with a link (the edge) that explicitly specifies what type of relationship connects the nodes.
Linked Data (see also Knowledge Graphs): A data structure in which data items are interlinked with other data items that enables the entire data set to be more useful through semantic queries. The easiest and most powerful standard designed for Linked Data is RDF (Resource Description Framework).
Machine Learning (ML): Mathematical algorithms that learn from experience (i.e., pattern detection and pattern recognition in data). Examples: (1) Digit detection algorithm (used in automated Zip Code readers at Post Office. (2) Email Spam detection algorithm (used for Spam filtering). (3) Cancer detection algorithm (used in medical imaging diagnosis).
MR (Mixed Reality): Sometimes referred to as hybrid reality, is the merging of real and virtual worlds to produce new environments and visualizations where physical and digital objects co-exist and interact in real time. It means placing new imagery within a real space in such a way that the new imagery is able to interact, to an extent, with what is real in the physical world we know. The key characteristic of MR is that the synthetic content and the real-world content are able to react to each other in real time.
NLP (Natural Language Processing), NLG (NL Generation), NLU (NL Understanding): NLP a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human languages, in particular how to program computers to process and analyze large amounts of natural language data. NLG is a software process that transforms structured data into human-language content. It can be used to produce long form content for organizations to automate custom reports, as well as produce custom content for a web or mobile application, or produce the words that will be spoken by a Virtual (Voice-based) Assistant. NLU is a subtopic of Natural Language Processing in Artificial Intelligence that deals with algorithms that have language comprehension (understanding the meaning of the words, both their content and their context).
Quantum Computing: The area of study focused on developing computer technology based on the principles of quantum theory and quantum phenomena (such as superposition of states and entanglement). Qubits are the fundamental units of quantum computing — they are somewhat analogous to bits in a classical computer.
Robotics: A branch of AI concerned with creating devices that can move and react to sensory input (data). Examples: (1) Automated manufacturing assembly line. (2) Roomba (vacuums your house). (3) Warehouse / Logistics. (4) Prosthetics.
Statistics: the practice or science of collecting and analyzing numerical data, especially for the purpose of inferring proportions in a whole population from measurements of those properties within a representative subsample.
UAV (Unmanned Aerial Vehicle) and UAS (Unmanned Aircraft System): see Drones.
Virtual Assistants (see also Chatbots): A sophisticated voice-based interface in an interactive platform for user and customer interactions. Virtual assistants understand not only the language but also the meaning of what the user is saying. They can learn from their conversation instances, which can lead to an unpredictability in their behavior. Consequently, they can have extended adaptable human interaction. They can be set to perform slightly complicated tasks as well, such as order-taking and task fulfillment.
VR (Virtual Reality): Computer-generated simulation of a three-dimensional environment that can be interacted with in a seemingly real or physical way by a person using special electronic equipment, such as a helmet with a screen inside or gloves fitted with sensors. Examples: (1) Games. (2) Family adventures. (3) Training & Education. (4) Design. (5) Big Data Exploration.
XAI (eXplainable AI, Trusted AI): Artificial intelligence that is programmed to describe (explain) its purpose, rationale and decision-making process in a way that can be understood by the average person. This includes the specific criteria the program uses to arrive at a decision.
XPU: One of the many specialized CPUs for specific applications (similar to an ASIC), which may be real-time, data-intensive, data-specific, or at the edge (see Edge Analytics). For more information, refer to the article “Sensor Analytics at Micro Scale on the xPU“.
3D-Printing … moving on to 4D-printing: Additive Manufacturing — the action or process of making a physical object from a three-dimensional digital model, typically by laying down many thin layers of a material in succession. The terms “additive manufacturing” and “3D printing” both refer to creating an object by sequentially adding build material in successive cross-sections, one stacked upon another.
5G: Fifth-generation wireless, the latest iteration of cellular technology, engineered to greatly increase the speed and responsiveness of wireless networks. 5G will also enable a sharp increase in the amount of data transmitted over wireless systems due to more available bandwidth. Example applications: (1) High-definition and 3D video. (2) Gbit/sec Internet. (3) Broadband network access nearly everywhere. (4) IoT.
I encourage you to read a position paper that I wrote (along with a few astronomy colleagues) for the US National Academies of Science in 2009 that addressed the data science literacy requirements in astronomy. Though focused on the needs in astronomy workforce development for the coming decade, the paper also contains more general discussion of “data literacy for the masses” that is applicable to any and all disciplines, domains, and organizations: “Data Science For The Masses.”
I was recently asked five questions by Alex Woodie of Datanami for the article, “So You Want To Be A Data Scientist” that he was preparing. He used a few snippets from my full set of answers. The longer version of my answers provided additional advice. For aspiring data scientists of all ages, I provide here the full, unabridged version of my answers, which may help you even more to achieve your goal. (Note: I paraphrase Alex’s original questions in quotes below.)
1. “What is the number one piece of advice you give to aspiring data scientists?”
My number one piece of advice always is to follow your passions first. Know what you are good at and what you care about, and pursue that. So, you might be good at math, or programming, or data manipulation, or problem solving, or communications (data journalism), or whatever. You can do that flavor of data science within the context of any domain: scientific research, government, media communications, marketing, business, healthcare, finance, cybersecurity, law enforcement, manufacturing, transportation, or whatever. As a successful data scientist, your day can begin and end with you counting your blessings that you are living your dream by solving real-world problems with data. I saw a quote recently that summarizes this: “If you think your scarce data science skills could be better used elsewhere, be bold and make the move.” (Reference).
2. “What are the most important skills for an aspiring data scientist to acquire?”
There are many skills under the umbrella of data science, and we should not expect any one single person to be a master of them all. The best solution to the data science talent shortage is a team of data scientists. So I suggest…
I am frequently asked for suggestions regarding academic institutions, professional organizations, or MOOCs that provide Data Science training. The following list will be updated occasionally (LAST UPDATED: 2018 March 29) .