Feed aggregator

International Women's Day at O'Reilly

O'Reilly Radar - Fri, 2019/03/08 - 10:00

At O’Reilly, we seek to foster a culture that creates opportunity, rewards and recognizes accomplishments, and treats everyone with respect.

On this International Women’s Day, I want to call out the simple truth that actions speak louder than words: that respect and doing the right thing always matter regardless of gender, and that treating people with respect and equality is the constant responsibility of everyone. I'm proud to say that at O’Reilly our actions do speak for themselves, and we have a record of taking action to promote equality.

O’Reilly in a nutshell
  • 50% of the executive team are women.
  • 47% of our current workforce are women.
  • Of the 86 management positions in the company, 50% are held by women.
  • 45% of our promotions last year went to 31 women who assumed new positions of responsibility within the company.
  • We have a firm commitment to job and salary parity across all divisions, departments, and roles, evidenced by the fact that O'Reilly joined many other companies in signing the White House Equal Pay Pledge in 2016 (and we ensure we are always in compliance).
  • We've created event scholarship and diversity programs to provide opportunities for women and recognize people of all races, ethnicities, genders, ages, abilities, religions, sexual orientation, and military service.
  • We developed clear and specific anti-harassment and code of conduct policies, which are in force at all of our events (and widely used as a model by other tech events).
  • We are actively committed to increasing the diversity of our conference speakers, which helps highlight women and members of other under-represented groups as visible leaders in the tech industry. 37% of our keynote speakers in 2018 were women, up from 32% in 2017. We also donated $36,710 to organizations that support women in tech throughout 2018.

I’m incredibly proud of these statistics, but they cannot stand in isolation. We must continue to push the envelope and strive for diversity and inclusion, not just for women, but for everyone. And, while our internal efforts are solid, we still have a ways to go regarding diversity and inclusion in our own hiring practices. 

What we truly seek to accomplish at O’Reilly is to make sure we foster a culture that creates opportunity for everyone, rewards and recognizes accomplishments, and treats everyone with respect regardless of their gender. Now that's something to celebrate.

Happy International Women's Day!

Continue reading International Women's Day at O'Reilly.

Categories: Technology

Four short links: 8 March 2019

O'Reilly Radar - Fri, 2019/03/08 - 05:00

Ethics and Skepticism, Corporate Open Source, Detecting Fake Text, and DNA Computation

  1. Round Up: Ethics and Skepticism -- There are a whole lot of different ways to misunderstand or be duped by data. This is my round up of good links that illustrate some of the most common problems with relying on data. Real-world examples. (via Elizabeth Goodman)
  2. oss-contributors -- Adobe's s/w for measuring companies' contributions to GitHub, released as open source natch. The BigQuery table is also open. (via Fil Maj)
  3. -- inspect the visual footprint of automatically generated text. It enables a forensic analysis of how likely an automatic system generated text. (via @harvardnlp and Miles Brundage)
  4. Protocells use DNA Logic to Communicate and Compute (Bristol) -- a new approach called BIO-PC (Biomolecular Implementation Of Protocell communication) based on communities of semi-permeable capsules (proteinosomes) containing a diversity of DNA logic gates that together can be used for molecular sensing and computation.

Continue reading Four short links: 8 March 2019.

Categories: Technology

Lessons learned building natural language processing systems in health care

O'Reilly Radar - Thu, 2019/03/07 - 05:55

NLP systems in health care are hard—they require broad general and medical knowledge, must handle a large variety of inputs, and need to understand context.

We’re in an exciting decade for natural language processing (NLP). Computers will get as good as humans in complex tasks like reading comprehension, language translation, and creative writing. Language understanding benefits from every part of the fast-improving ABC of software: AI (freely available deep learning libraries like PyText and language models like BERT), big data (Hadoop, Spark, and Spark NLP), and cloud (GPUs on demand and NLP-as-a-service from all the major cloud providers).

In health care, several applications have already moved from science fiction to reality. AI systems passed the medical licensing exams in both China and England, doing better than average doctors. A new system diagnoses 55 pediatric conditions better than junior doctors. These systems are harder to build than some of the first computer vision deep learning applications (which study one image): they require broader general and medical knowledge, handle a bigger variety of inputs, and must understand context.

I’ve been lucky to be involved in building NLP systems in health care for the past seven years. The goal of this article is to share key lessons I learned along the way to help you build similar systems faster and better.

Meet the language of emergency room triage notes

Many people, me included, make the mistake of assuming that clinical notes in the U.S. are written in English. That happens because that’s how doctors will answer if you ask them what language they use. However, consider this example of three de-identified triage notes taken from emergency room visits:

Triage Notes

states started last night, upper abd, took alka seltzer approx 0500, no relief. nausea no vomiting

Since yesterday 10/10 "constant Tylenol 1 hr ago. +nausea. diaphoretic. Mid abd radiates to back

Generalized abd radiating to lower x 3 days accompanied by dark stools. Now with bloody stool this am. Denies dizzy, sob, fatigue.

Most people without a medical education do not understand the meaning of these typical sentences. Here are a few things to note:

  • None of these are grammatically correct English sentences.
  • None of them use the words “patient” or “pain.” They don’t have a subject.
  • They use a lot of jargon: 10/10 refers to the intensity of pain. “Generalized abd radiating to lower” refers to general abdominal (stomach) pain that radiates to the lower back.

ER doctors I’ve shown these notes to, though, consider them useful—they’re concise and focus on what matters. They would consider these as common and not “bad” examples of ER triage notes.

Yes, emergency rooms have their own language

As a philosopher or linguist, you might argue that this still does not constitute a “different language” in the typical sense of the word. However, if you’re a data scientist or NLP practitioner, there shouldn’t be any doubt that it is:

  • It has a different vocabulary. The Unified Medical Language System (UMLS) includes more than 200 vocabularies for English alone, covering more than three million terms. In contrast, the Oxford English Dictionary of 1989 had 171,476 words (although that should be roughly tripled to include derivatives that UMLS directly lists).
  • It has a different grammar. The text has its own definition of what sentences are and what parts of speech are. Statements like “+nausea” and “since yesterday 10/10” are grammatical structures that don’t exist anywhere else.
  • It has different semantics. “Sob” means “shortness of breath” (and not the other meaning you had in mind). “Denies” means the patient says they don’t have the symptom, although the clinician thinks they might.
  • It goes beyond jargon. Jargon refers to the 100-200 new words you learn in the first month after you join a new school or workplace. In contrast, understanding health care language takes people as long as it takes to master day-to-day Italian or Portuguese.
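As a rough illustration of the "different semantics" point, here is a minimal Python sketch that reads a "Denies ..." statement from the triage notes above. The two abbreviation meanings come from the text; the function name `read_denials` and the tiny lexicon are assumptions made for this example, and a real system would need a far larger, specialty-specific vocabulary.

```python
import re

# Abbreviation meanings taken from the text above. This lexicon is a tiny,
# illustrative stand-in for a real clinical vocabulary.
ABBREVIATIONS = {"sob": "shortness of breath", "abd": "abdominal"}


def read_denials(sentence):
    """Return the findings that a 'Denies ...' statement reports as denied.

    Hypothetical helper: it only handles statements that start with
    'denies' followed by a comma-separated list of findings.
    """
    match = re.match(r"\s*denies\s+(.*?)[.\s]*$", sentence, flags=re.IGNORECASE)
    if match is None:
        return []
    findings = [item.strip().lower() for item in match.group(1).split(",")]
    # Expand known abbreviations; leave unknown terms untouched.
    return [ABBREVIATIONS.get(finding, finding) for finding in findings if finding]


denied = read_denials("Denies dizzy, sob, fatigue.")
not_a_denial = read_denials("+nausea. diaphoretic.")
```

Here `denied` holds the three findings with "sob" expanded to "shortness of breath", while `not_a_denial` is empty. A general-purpose English model has no reason to know either rule.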

Lesson #1: Off-the-shelf NLP models don’t work

In practice, off-the-shelf NLP libraries and algorithms built for English will fail miserably on this “different language” in the health care industry. Not only will named entity recognition or entity resolution models fail, but even basic tasks such as tokenization, part of speech tagging, and sentence segmentation will fail for the majority of sentences.

If you don’t believe me, feel free to test it yourself with the six popular NLP cloud services and libraries listed below. All but Amazon Comprehend provide a web user interface so you can copy and paste sentences to see how the service would analyze them:

  1. Google Cloud Natural Language
  2. IBM Watson NLU
  3. Azure Text Analytics
  4. spaCy Named Entity Visualizer
  5. Amazon Comprehend (offline)
  6. Stanford Core NLP

In a test done during December 2018, the only medical term any of the six engines recognized was Tylenol (as a product), and only two of them recognized even that.
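To see why even sentence segmentation breaks down, consider a minimal sketch of a rule that works reasonably on edited English prose: split after sentence-ending punctuation when the next word starts with a capital letter. The function `naive_sentence_split` is an illustrative assumption, not the method used by any of the services listed above; the two inputs are the first two triage notes quoted earlier.

```python
import re


def naive_sentence_split(text):
    """Split after ., ! or ? when the next token starts with a capital letter."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)


note_1 = ("states started last night, upper abd, took alka seltzer "
          "approx 0500, no relief. nausea no vomiting")
note_2 = ('Since yesterday 10/10 "constant Tylenol 1 hr ago. +nausea. '
          "diaphoretic. Mid abd radiates to back")

pieces_1 = naive_sentence_split(note_1)  # stays in one piece: "nausea" is lowercase
pieces_2 = naive_sentence_split(note_2)  # only splits before "Mid"
```

The rule leaves the first note as a single "sentence" and cuts the second one only once, even though a clinician would read several separate statements in each. Statements such as "+nausea" and "diaphoretic" do not start with a capital letter, so a capitalization-based rule never sees them.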

Health care has hundreds of languages

The next mistake I made, like many others, was building models that “solve health care.” Amazon’s Comprehend Medical is now taking this approach with a universal medical-NLP-as-a-service. This assumes that health care is one language. In reality, every sub-specialty and form of communication is fundamentally different. Here’s a handful of de-identified examples:

Pathology (Surgical pathology, cancer):

Part #1 which is labeled "? metastatic tumor in jugular vein lymph node" consists of an elliptical fragment of light whitish-tan tissue which measures approximately 0.3 x 0.2 x 0.2 cm.

Radiology (MRI Cervical Spine):

C6-7: There is a diffuse disc osteophyte which results in flattening of the ventral thecal sac with a mild spinal canal stenosis and moderate to severe bilateral neural foraminal narrowing. OTHER FINDINGS: No paraspinal soft tissue mass.


Based on the outcome of the Phase I trial, the patient will receive permanent implantation of the stimulator. Specifically, this patient will receive a Spinal Cord Stimulator System, made by Boston Scientific Neuromodulation Corporation. This SCS System includes a re-chargeable battery within the implanted stimulator, allowing the physician and patient to control pain at the most optimal settings without compromising battery life compared to non-rechargeable SCS systems. The Boston Scientific SCS System is FDA-approved.

Postop (from "Objective" section of a SOAP note):

Vitals- Tmax: 99.8, BP- 128/82, P- 82, R-18 I/O- 3000ml NS IV / 200ml out via foley, 800ml on own, in past 24 hours

General- laying in bed, appears comfortable

Skin- Surgical incision margins have minimal erythema and are well approximated with staples, no dehiscence, no drainage. No signs of hematoma or seroma formation. No jaundice

Dental (Anesthetic, Specific Tooth):

Benzocaine was placed on the palate, adjacent to tooth 1. A total of 0 .00 carpules of Articaine, 4% with Epinephrine 1:100,000 was injected into the palate using a long, 25-gauge needle.

Medications (Dosage, route, frequency, duration, form)

aspirin is required 20 mg po daily for 2 times as tab

Need more examples? Take some time to learn about deciphering your lab reports. Or consider that medical students starting a specialty in dermatology need to master the aptly named Dermatology—learning the language. Even Identifying Patient Smoking Status from medical discharge records is complex enough to be an active area of academic research.

Then, there are many variants within each medical specialty. For example, deciding whether or not to approve a pre-authorization request for an MRI versus, say, an implantable spinal cord stimulator requires extracting completely different items from the pre-authorization forms. As another example, within pathology, different terms are used to discuss different types of cancer. This has a real-world impact: the company I work for is undertaking a project that requires training separate NLP models for extracting facts about lung, breast, and colon cancer from pathology reports.

Amazon’s Comprehend Medical has, so far, only focused on normalizing medication values (see that last “aspirin” example in the above table). The service also comes with standard medical named entity recognition—which doesn’t address any specific application’s needs. Please do not take my word for it—try it yourself on the examples above or on your own text. Such NLP services are mostly used nowadays as a means to attract customers into professional services engagements. Other companies like 3M and Nuance that sell “health care NLP” are more up front about this in their marketing.
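To make "normalizing medication values" concrete, here is a minimal sketch that pulls the drug, dosage, route, frequency, duration, and form out of the "aspirin" example above with a hand-written regular expression. The pattern, the short lists of route and frequency abbreviations, and the function name `parse_medication` are all assumptions for illustration; this is not how Comprehend Medical or any other product works.

```python
import re

# Hand-written pattern shaped around the single example sentence from the
# text. The route and frequency alternatives are a small illustrative subset.
MEDICATION_PATTERN = re.compile(
    r"(?P<drug>[a-z]+)\s+is\s+required\s+"
    r"(?P<dosage>\d+(?:\.\d+)?\s*mg)\s+"
    r"(?P<route>po|iv|im)\s+"
    r"(?P<frequency>daily|bid|tid)\s+"
    r"for\s+(?P<duration>\d+\s+\w+)\s+"
    r"as\s+(?P<form>\w+)",
    re.IGNORECASE,
)


def parse_medication(text):
    """Return the medication fields as a dict, or None if the text does not fit."""
    match = MEDICATION_PATTERN.search(text)
    return match.groupdict() if match else None


parsed = parse_medication("aspirin is required 20 mg po daily for 2 times as tab")
# A hypothetical rewording of the same order no longer matches the pattern.
reworded = parse_medication("ASA 20mg by mouth qd")
```

The first call fills in all six fields, while the reworded order returns `None`. That brittleness is the point: even for one narrow task, the extraction logic has to be trained or tuned on the way your own documents are written.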

Lesson #2: Build trainable NLP pipelines

If you need to build NLP systems in health care yourself, you’ll need to train NLP models that are specific to the application you’re building. This doesn’t mean you cannot reuse existing software—there is a lot you can reuse:


  • Medical terminologies
  • Medical embeddings
  • Neural network graphs
  • NLP pipeline APIs
  • Training & inference framework

To build:

  • What medications is this patient taking?
  • Does this patient require a chest CT scan?
  • What’s the right E/M billing code for this visit?
  • Has this patient been pregnant before?
  • Do they have known allergies?

When we built Spark NLP for Healthcare—an extension of the open source NLP library for Apache Spark—the goal was to provide as many reusable out-of-the-box components as possible. These include, for example, production-grade implementations of the state-of-the-art academic papers for clinical named entity recognition and de-identification, biomedical entity normalization, and assertion status (i.e., negation) detection. Using these implementations doesn’t require learning to use TensorFlow (or any other framework), since the deep learning framework is embedded in the library under easy-to-use Python, Java, and Scala APIs. The library itself is a native extension of Spark ML and reuses its Pipeline class for building, serializing, and extending NLP, ML, and DL flows.

Making this library perform in real-world projects taught us a lot about just how different “health care languages” are from human ones. Here are some of the things we had to build:

  • Deep learning-based sentence segmentation. While splitting sentences in Wikipedia articles often can be done just using regular expressions, handling multi-page clinical documents was a bigger challenge. In particular, the algorithms had to deal with headers and footers, lists, enumerations, call-outs, two-column documents, and other formatting.
  • Health care-specific part-of-speech tagging. Not only was a different model required, but additional parts of speech are used for health care models. This was done because it actually improves the accuracy of clinical named entity recognition.
  • Health care-specific entity normalization algorithms. Named entity recognition by itself is often useless in practice: annotating from “both eyes seem to be infected” that “eye” and “infection” are medical terms doesn’t help much. In contrast, marking the whole chunk of text as code 312132001 from the standard SNOMED-CT clinical terminology, while normalizing for the different ways to describe the same finding, is much more useful. It enables your application to base its business logic on this code, no matter how it was normalized or how, exactly, it was expressed in the free-form text it came from.

In short: the deeper we go into treating health care texts as different languages, the closer we get to matching and exceeding human accuracy on the same tasks.
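The entity normalization idea described above can be sketched as a lookup from cleaned-up phrases to a single code. This is a deliberately simple stand-in, assuming a hand-made synonym table: the code is the one quoted in the text, but the alternative phrasings, the function names, and the follow-up rule are hypothetical and are not taken from SNOMED-CT or from Spark NLP.

```python
import re

# The code is the one quoted in the text. The phrasings mapped to it are
# hypothetical examples, not entries from the SNOMED-CT terminology.
HYPOTHETICAL_SYNONYMS = {
    "both eyes seem to be infected": "312132001",
    "infection in both eyes": "312132001",
    "bilateral eye infection": "312132001",
}

# Hypothetical business rule keyed on the code, not on the wording.
CODES_NEEDING_FOLLOW_UP = {"312132001"}


def canonical(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())


def normalize_finding(text):
    """Map a free-text finding to a code, or None if it is not in the table."""
    return HYPOTHETICAL_SYNONYMS.get(canonical(text))


def needs_follow_up(text):
    return normalize_finding(text) in CODES_NEEDING_FOLLOW_UP


code_a = normalize_finding("Both eyes seem to be infected.")
code_b = normalize_finding("Bilateral  eye infection")
```

Both phrasings resolve to the same code, so `needs_follow_up` never has to look at the wording. A production normalizer replaces the lookup table with a trained model, but the contract stays the same: text in, terminology code out.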

Lesson #3: Start with labeling ground truth

So, how do you start your own project? How do you know how far off you are and whom to trust? One way is to start by building a labeled validation set. For example, if you are interested in automating ICD-10 coding from outpatient notes, have clinicians define a representative sample of such records, de-identify them, and have professional clinical coders label them (by assigning the correct codes). If you are interested in extracting critical events from radiology reports or missed safety events from in-patient notes, have clinicians define the sample and label them correctly first.

This will often uncover blockers you need to address before involving (and wasting the time of) your data science team. If you don’t have access to enough data, or can’t de-identify it at scale, then there’s no way to build a reliable model anyway. If clinicians cannot consistently agree on the correct labels in some cases, then the first problem to solve is to agree on clinical guidelines instead of involving data scientists to try to automate a disagreement. Finally, if you find you’re facing highly unbalanced classes (i.e., you are looking for something that happens to a handful of patients per year), it may be wise to change the definition of the problem before calling in the data scientists.
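One way to check whether clinicians "consistently agree" is to have two of them label the same records and compute an agreement score. The sketch below uses Cohen's kappa, a common measure of agreement beyond chance. The article does not name a specific metric, so treat this choice, the function name, and the toy labels as assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same records."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of records on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels from two hypothetical annotators on four records.
annotator_1 = ["yes", "yes", "no", "no"]
annotator_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

A value near 1 means the annotators agree far more often than chance would predict, and a value near 0 means they barely beat chance. A low score on your own sample is a signal to settle the clinical guidelines first.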

Once you have a representative, agreed-upon, and correctly labeled validation set, you can start testing existing libraries and cloud providers. Most likely, the first test will immediately uncover the gaps between each offering and your needs. The smartest teams we’ve worked with have set up one- or two-week test projects, in which the goal is to use a library or service to reach the maximum level of accuracy for your specific needs. Doing this enables you to evaluate how easily each service lets you train custom models and define the domain-specific features and pipeline steps that your solution requires, and how well it explains its results back to you.
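Scoring a candidate library or service against the labeled validation set can be as simple as comparing, record by record, the codes it predicts with the codes your coders assigned. The sketch below computes micro-averaged precision, recall, and F1 over sets of codes. The function name and the placeholder codes are assumptions, and your project may call for a different metric.

```python
def micro_precision_recall_f1(gold, predicted):
    """Micro-averaged precision, recall, and F1 over per-record code sets."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same records")
    true_pos = false_pos = false_neg = 0
    for gold_codes, predicted_codes in zip(gold, predicted):
        true_pos += len(gold_codes & predicted_codes)
        false_pos += len(predicted_codes - gold_codes)
        false_neg += len(gold_codes - predicted_codes)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Placeholder codes for two hypothetical records, not real ICD-10 codes.
gold_labels = [{"CODE_A", "CODE_B"}, {"CODE_C"}]
service_output = [{"CODE_A"}, {"CODE_C", "CODE_D"}]
precision, recall, f1 = micro_precision_recall_f1(gold_labels, service_output)
```

Running the same function on the output of every candidate gives you one comparable number per offering, measured on your own data.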

Such an approach can be a great education opportunity for your team. It tests both the packaged software and the support/consulting aspects of the services you’ll evaluate. It will show you how far you are from achieving a level of accuracy that’s in line with your business needs. Finally, this third lesson enables you to validate lessons #1 and #2 on your own, without taking my word for them.

Best of luck and success in your projects. Since this is health care we’re talking about, the world needs you to succeed!

Continue reading Lessons learned building natural language processing systems in health care.

Categories: Technology

Four short links: 7 March 2019

O'Reilly Radar - Thu, 2019/03/07 - 05:05

Privacy Exercise, Tragedy of the Commons, Program Repair, and Privacy != Safety

  1. Privacy Exercise (Twitter) -- Professor Kate Klonick gave her students a great exercise to teach them about deanonymization, "if you have nothing to hide," etc. (via BoingBoing)
  2. The Tragedy of the Tragedy of the Commons (Twitter) -- Matto Mildenberger succinctly points not only to the critiques of Hardin's "Tragedy of the Commons" idea, but also to the other ideas of the man himself. Now, lots of awful people have left noble ideas that outlive them. But in Hardin’s case, the intellectual legacy is largely built on top of his racist, flawed science that we still treat as gospel and uncritically assign in undergraduate courses year after year.
  3. SimFix -- Automatically fix programs by leveraging existing patches from other projects and similar code snippets from the faulty project. If automated program repair interests you, you should know about (via Hacker News)
  4. Zuck Thinks Encrypted Message Will Save Facebook -- no mention of misinformation, doxxing, or any of the other evils on Facebook's core platform. Instead, a move to end-to-end encrypted communication, where Facebook can't monitor the contents. Feels like The Problem was defined as "we have to police content" instead of the pain actually felt by users.

Continue reading Four short links: 7 March 2019.

Categories: Technology

Four short links: 6 March 2019

O'Reilly Radar - Wed, 2019/03/06 - 05:00

Reverse Engineering, Public Policy, New Editor, and Burnout

  1. Ghidra -- software reverse-engineering tool, rival for IDAPro. Open source, released by NSA.
  2. Cybersecurity in the Public Interest (Bruce Schneier) -- We need public-interest technologists in policy discussions. We need them on congressional staff, in federal agencies, at non-governmental organizations (NGOs), in academia, inside companies, and as part of the press. In our field, we need them to get involved in not only the Crypto Wars, but everywhere cybersecurity and policy touch each other: the vulnerability equities debate, election security, cryptocurrency policy, Internet of Things safety and security, big data, algorithmic fairness, adversarial machine learning, critical infrastructure, and national security.
  3. Kakoune -- a code editor that implements vi’s "keystrokes as a text editing language" model. As it’s also a modal editor, it is somewhat similar to the Vim editor (after which Kakoune was originally inspired). In the words of a Hacker News commenter, it's trying to ditch some of the historical ed/ex syntax and thought patterns that make vi weirdly inconsistent.
  4. Burnout Self Test -- This tool can help you check yourself for burnout. It helps you look at the way you feel about your job and your experiences at work, so you can get a feel for whether you are at risk of burnout.

Continue reading Four short links: 6 March 2019.

Categories: Technology

170+ live online training courses opened for March and April

O'Reilly Radar - Wed, 2019/03/06 - 04:00

Get hands-on training in machine learning, AWS, Kubernetes, Python, Java, and many other topics.

Learn new topics and refine your skills with more than 170 new live online training courses we opened up for March and April on the O'Reilly online learning platform.

AI and machine learning

Spotlight on Innovation: AI Trends with Roger Chen, March 13

Beginning Machine Learning with scikit-learn, April 2

Deep Learning for Machine Vision, April 4

Ingenious Game AI Development in Unity, April 11-12

Artificial Intelligence for Big Data, April 15-16

Intermediate Machine Learning with scikit-learn, April 16

Deep Learning with TensorFlow, April 17

AI for Product Managers, April 19

Getting Started with Machine Learning, April 22

A Practical Introduction to Machine Learning, April 22

Probabilistic Modeling With TensorFlow Probability, April 29

Deploying Machine Learning Models to Production: A Toolkit for Real-World Success, April 29-30

An Introduction to Amazon Machine Learning on AWS, April 29-30

Artificial Intelligence: AI For Business, May 1

Building Intelligent Bots in Python, May 7

Modern AI Programming with Python, May 16

Hands-On Chatbot and Conversational UI Development, June 20-21


An Introduction to Ethereum DApps, March 26

Certified Blockchain Solutions Architect (CBSA) Certification Crash Course, April 2

Blockchain Applications and Smart Contracts, April 2

Blockchain and Cryptocurrency Essentials, April 5


Better Business Writing, March 19

Giving a Powerful Presentation, March 25

Introduction to Digital Marketing, March 25

Introduction to Leadership Skills, March 26

Spotlight on Learning from Failure: Assessing Talent Beyond Technical Skills with Tony Tjan, March 26

Getting Unstuck, April 1

How to Be a Better Mentor, April 3

Empathy at Work, April 15

Applying Critical Thinking, April 15

Managing Team Conflict, April 16

Product Management for Enterprise Software, April 17

Salary Negotiation Fundamentals, April 18

Navigating and Succeeding During Rapid Organizational Change, April 18

Introduction to Leadership Skills, April 22

Building Your People Network, April 23

Actionable Insights in a Week: User Research for Everyone, April 23

Managing Your Manager, April 25

60 Minutes to Designing a Better PowerPoint Slide, April 25

Understanding Business Strategy, April 25

Introduction to Critical Thinking, April 29

Why Smart Leaders Fail, May 7

Thinking Like a Manager, May 10

Introduction to Time Management Skills, May 10

Data science and data tools

Practical Linux Command Line for Data Engineers and Analysts, March 13

Data Modelling with Qlik Sense, March 19-20

Foundational Data Science with R, March 26-27

Learning MongoDB: A Hands-on Guide, April 1

What You Need to Know About Data Science, April 1

Developing a Data Science Project, April 2

Analyzing and Visualizing Data with Microsoft Power BI, April 5

Programming with Data: Foundations of Python and Pandas, April 8

Mastering Pandas, April 10

Managing Enterprise Data Strategies with Hadoop, Spark, and Kafka, April 15

Real-Time Data Foundations: Flink, April 17

Data Pipelining with Luigi and Spark, April 17

Real-Time Data Foundations: Time Series Architectures, April 18

Building Dashboards with Power BI, April 18-19

Business Data Analytics Using Python, April 29

Intermediate SQL for Data Analysis, April 30

Visualization and Presentation of Data, April 30

Design and product management

Design Thinking for Non-Designers, April 23

Introduction to UI and UX Design, April 29


Java 8 Generics in 3 Hours, March 15

Scaling Python with Generators, March 25

Pythonic Object-Oriented Programming, March 26

Rust Programming: A Crash Course, March 27

Pythonic Design Patterns, March 27

Test-Driven Development in Python, March 28

Data Structures in Java, April 1

Concurrency in Python, April 1

Discovering Modern Java, April 2

Clean Code, April 2

Getting Started with PySpark, April 3

Working with Dataclasses in Python 3.7, April 3

OCA Java SE 8 Programmer Certification Crash Course, April 3-5

IoT Fundamentals, April 4-5

Python Data Handling: A Deeper Dive, April 5

Bash Shell Scripting in 4 Hours, April 15

Basic Android Development, April 15-16

Ground Zero Programming with JavaScript, April 16

SQL for Any IT Professional, April 16

Getting Started with Go, April 16-17

Design Patterns Boot Camp, April 17-18

Kotlin for Android, April 17-18

Design Patterns in Java, April 17-18

Getting Started with Node.js, April 18

Hands-on Augmented Reality for Game Developers, April 22-23

Functional Programming in Java, April 22-23

Next-Generation Java Testing with JUnit 5, April 24

Beyond Python Scripts: Logging, Modules, and Dependency Management, April 25

Learning Spring Boot 2, April 25-26

Beyond Python Scripts: Exceptions, Error Handling and Command-Line Interfaces, April 26

Developing Applications on Google Cloud Platform, April 29-30

Functional Design for Java 8, April 29-30

Python: The Next Level, April 29-30

Rethinking REST: A Hands-on Guide to GraphQL and Queryable APIs, April 30

Learn the Basics of Scala, April 30

Advanced SQL Series: Relational Division, May 2

Fundamentals of Clojure, May 6

Getting Started with Spring and Spring Boot, May 6-7

Getting Started with Python's Pytest, May 7

Java Testing with Mockito and the Hamcrest Matchers, May 9

Reactive Spring and Spring Boot, May 10

Getting Started with Python 3, May 13-14


Network Security Testing with Kali Linux, March 25

Getting Started with Cyber Investigations and Digital Forensics, April 1

Ethical Hacking and Information Security, April 2

Linux, Python, and Bash Scripting for Cybersecurity Professionals, April 5

Cyber Security Defense, April 10

Introduction to Digital Forensics and Incident Response (DFIR), April 12

Defensive Cybersecurity Fundamentals, April 15

Ethical Hacking Bootcamp with Hands-On Labs, April 15-17

Start your Security Certification Career Today, April 23

CISSP Crash Course, April 23-24

CISSP Certification Practice Questions and Exam Strategies, April 24

Introduction to Ethical Hacking and Penetration Testing, April 25-26

CompTIA Cybersecurity Analyst CySA+ CS0-001 Crash Course, April 25-26

Security for Machine Learning, April 29

Systems engineering and operations

Beginner's Guide to Writing AWS Lambda Functions in Python, April 1

AWS Account Setup Best Practices, April 1

Red Hat Certified Engineer (RHCE) Crash Course, April 1-4

Visualizing Software Architecture with the C4 Model, April 2

AWS Machine Learning Specialty Certification Crash Course, April 3-4

Introduction to Google Cloud Platform, April 3-4

AWS Access Management, April 4

Docker: Beyond the Basics (CI & CD), April 4-5

Cloud Computing on the Edge, April 9

Ansible for Managing Network Devices, April 11

AWS CloudFormation, April 11-12

Advanced Kubernetes in Practice, April 11-12

Istio on Kubernetes: Enter the Service Mesh, April 12

Microservice Fundamentals, April 15

Shaping and Communicating Architectural Decisions, April 15

Designing Serverless Architecture with AWS Lambda, April 15-16

Ansible in 4 Hours, April 16

Microservice Collaboration, April 16

Kubernetes Serverless with Knative, April 17

Software Architecture by Example, April 18

Developing Incremental Architecture, April 18-19

Automating with Ansible, April 19

Network Troubleshooting: Basic Theory and Process, April 19

9 Steps to Awesome with Kubernetes, April 19

JIRA 8 for Users and Managers, April 22-23

Architecture for Continuous Delivery, April 23

Serverless Architectures with Azure, April 23-24

AWS Certified Cloud Practitioner Crash Course, April 23-24

Google Cloud Platform Professional Cloud Architect Certification Crash Course, April 24-25

Kafka Fundamentals, April 24-25

Linux Foundation System Administrator (LFCS) Crash Course, April 24-25

Building Micro-Frontends, April 26

Building and Managing Kubernetes Applications, April 29

AWS Core Architecture Concepts, April 29-30

AWS Monitoring Strategies, April 29-30

Docker: Up and Running, April 30-May 1

AWS Security Fundamentals, May 6

Introduction to Docker Compose, May 6

Creating Serverless APIs with AWS Lambda and API Gateway, May 8

Introduction to Kubernetes, May 8-9

Amazon Web Services (AWS): Up and Running, May 9

Linux Filesystem Administration, May 13-14

Domain-Driven Design and Event-Driven Microservices, May 14-15

Microservice Fundamentals, May 28

Microservice Decomposition Patterns, May 29

Microservice Collaboration, June 26

Microservice Decomposition Patterns, June 27

Web programming

Developing Modern React Patterns, April 4

Introduction to Vue.js, April 16-17

Building Web Apps with Vue.js, April 18

CSS Layout Fundamentals: From Floats to Flexbox and CSS grid, April 23

Beginning Frontend Development with React, April 25-26

React Beyond the Basics: Master React's Advanced Concepts, May 9

Advanced React.JS, May 9

Continue reading 170+ live online training courses opened for March and April.

Categories: Technology

Four short links: 5 March 2019

O'Reilly Radar - Tue, 2019/03/05 - 05:05

CEO Personality, Adult Development, Big Idea Famine, and Neuron Imaging

  1. Strategic Decisions: Behavioral Differences Between CEOs and Others -- All subjects participated in three incentivized games—Prisoner’s Dilemma, Chicken, Battle-of-the-Sexes. Beliefs were elicited for each game. We report substantial and robust differences in both behavior and beliefs between the CEOs and the control group. The most striking results are that CEOs do not best respond to beliefs; they cooperate more, play less hawkish, and thereby earn much more than the control group. (via Marginal Revolution)
  2. Robert Kegan's Theory of Adult Development -- interesting set of stages: selfish/transaction; weak sense of self/strongly influenced by others; emotionally aware and strong sense of personal value and belief system; and able to continuously grow by adopting, adapting, and discarding mental models and "identities." Found it via YC Startup School's excellent How to Win lecture by Daniel Gross, which also included the mantra that "sleep is a nootropic."
  3. Big Idea Famine (Nicholas Negroponte) -- I believe that 30 years from now people will look back at the beginning of our century and wonder what we were doing and thinking about big, hard, long-term problems, particularly those of basic research. They will read books and articles written by us in which we congratulate ourselves about being innovative. The self-portraits we paint today show a disruptive and creative society, characterized by entrepreneurship, startups, and big company research advertised as moonshots. Our great-grandchildren are certain to read about our accomplishments, all the companies started, and all the money made. At the same time, they will experience the unfortunate knock-on effects of an historical (by then) famine of big thinking. (via Daniel G. Siegel)
  4. CaImAn -- an open source library for calcium imaging data analysis. [...] CaImAn is suitable for two-photon and one-photon imaging, and also enables real-time analysis on streaming data. [...] We demonstrate that CaImAn achieves near-human performance in detecting locations of active neurons.

Continue reading Four short links: 5 March 2019.

Categories: Technology

Four short links: 4 March 2019

O'Reilly Radar - Mon, 2019/03/04 - 04:50

Open Source Chat, Assistant Shim, Data Science, and Quantum Computing

  1. Zulip 2.0 Out -- open source chat, à la Slack. Better support for thread-like multiple conversations in a channel.
  2. Project Alias -- a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customization and privacy. Through a simple app, the user can train Alias to react on a custom wake-word/sound, and once trained, Alias can take control over your home assistant by activating it for you. There's no problem in technology that can't be solved by the addition of another layer of indirection. Interesting to see attempts to make third-party improvements to these things in our house that we have no control over except that which The Maker has given us.
  3. Python Data Science in Jupyter Notebooks -- full text.
  4. When Will Quantum Computing Have Real Commercial Value? -- Nobody really knows. The field sits behind a number of difficult science and engineering breakthroughs before we get to the equivalent of a UNIVAC, let alone a PDP-1, Altair 8800, or iPhone. My read is that the military's hopes and fears for quantum crypto and comms have buoyed VCs' hopes, so they're wading in before the fundamental research is done and are just hoping there'll be a breakthrough within the lifetime of their fund.

Continue reading Four short links: 4 March 2019.

Categories: Technology

Four short links: 1 March 2019

O'Reilly Radar - Fri, 2019/03/01 - 02:00

Voice Data, Lunar Library, IP Rights, Computational Photography

  1. Common Voice Data (Mozilla) -- largest dataset of human voices available for use, including 18 different languages, adding up to almost 1,400 hours of recorded voice data from more than 42,000 contributors. (via Mozilla blog)
  2. The Lunar Library -- The Arch Mission Foundation is nano-etching 30,000,000 pages' worth of "archives of human history and civilization, covering all subjects, cultures, nations, languages, genres, and time periods" onto 25 DVD-sized, 40-micron-thick discs that will be deposited on the surface of the moon in 2019 by the Beresheet lander. (that's from BoingBoing's coverage)
  3. IP Rights of Hackathons (OKFN) -- The tool is a set of documents. [...] The “default settings” of the documents are such that participants’ intellectual property rights stay with participants. [....] These settings are in the pre-participation document, and the post-participation document is for writing down an agreement among the contributing group members over their rights, permissions, terms, etc. so that they know what they can do with the output of their team.
  4. Spectre App -- Spectre lets you erase moving tourists from busy locations or capture light trails and water movements from the camera on your iPhone. Computational photography is awesome, but I feel like all these apps are implementing what will become features of The Camera App on your phone. Unless, that is, the UI for a camera app that can do everything is Too Much For Mortals and people actually want a half-dozen different camera apps.

Continue reading Four short links: 1 March 2019.

Categories: Technology

Why your attention is like a piece of contested territory

O'Reilly Radar - Thu, 2019/02/28 - 05:00

The O’Reilly Data Show Podcast: P.W. Singer on how social media has changed war, politics, and business.

In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book is essential reading for anyone interested in how social media has become an important new battlefield in a diverse set of domains and settings.

Continue reading Why your attention is like a piece of contested territory.

Categories: Technology

You created a machine learning application. Now make sure it’s secure.

O'Reilly Radar - Thu, 2019/02/28 - 04:00

The software industry has demonstrated, all too clearly, what happens when you don’t pay attention to security.

In a recent post, we described what it would take to build a sustainable machine learning practice. By “sustainable,” we mean projects that aren’t just proofs of concept or experiments. A sustainable practice means projects that are integral to an organization’s mission: projects by which an organization lives or dies. These projects are built and maintained by a stable team of engineers, and supported by a management team that understands what machine learning is, why it’s important, and what it’s capable of accomplishing. Finally, sustainable machine learning means that as many aspects of product development as possible are automated: not just building models, but cleaning data, building and managing data pipelines, testing, and much more. Machine learning will penetrate our organizations so deeply that it won’t be possible for humans to manage them unassisted.

Organizations throughout the world are waking up to the fact that security is essential to their software projects. Nobody wants to be the next Sony, the next Anthem, or the next Equifax. But while we know how to make traditional software more secure (even though we frequently don’t), machine learning presents a new set of problems. Any sustainable machine learning practice must address machine learning’s unique security issues. We didn’t do that for traditional software, and we’re paying the price now. Nobody wants to pay the price again. If we learn one thing from traditional software’s approach to security, it’s that we need to be ahead of the curve, not behind it. As Joanna Bryson writes, “Cyber security and AI are inseparable.”

The presence of machine learning in any organization won’t be a single application, a single model; it will be many applications, using many models—perhaps thousands of models, or tens of thousands, automatically generated and updated. Machine learning on low-power edge devices, ranging from phones to tiny sensors embedded in assembly lines, tools, appliances, and even furniture and building structures, increases the number of models that need to be monitored. And the advent of 5G mobile services, which significantly increases the network bandwidth to mobile devices, will make it much more attractive to put machine learning at the edge of the network. We anticipate billions of machines, each of which may be running dozens of models. At this scale, we can't assume that we can deal with security issues manually. We need tools to assist the humans responsible for security. We need to automate as much of the process as possible, but not too much, giving humans the final say.

In “Lessons learned turning machine learning models into real products and services,” David Talby writes that “the biggest mistake people make with regard to machine learning is thinking that the models are just like any other type of software.” Model development isn’t software development. Models are unique—the same model can’t be deployed twice; the accuracy of any model degrades as soon as it is put into production; and the gap between training data and live data, representing real users and their actions, is huge. In many respects, the task of modeling doesn’t get started until the model hits production, and starts to encounter real-world data.

Unfortunately, one characteristic that software development has in common with machine learning is a lack of attention to security. Security tends to be a low priority. It gets some lip service, but falls out of the picture when deadlines get tight. In software, that’s been institutionalized in the “move fast and break things” mindset. If you’re building fast, you’re not going to take the time to write sanitary code, let alone think about attack vectors. You might not “break things,” but you’re willing to build broken things; the benefits of delivering insecure products on time outweigh the downsides, as Daniel Miessler has written. You might be lucky; the vulnerabilities you create may never be discovered. But if security experts aren’t part of the development team from the beginning, if security is something to be added on at the last minute, you’re relying on luck, and that’s not a good position to be in. Machine learning is no different, except that the pressure of delivering a product on time is even greater, the issues aren’t as well understood, the attack surface is larger, the targets are more valuable, and companies building machine learning products haven’t yet engaged with the problems.

What kinds of attacks will machine learning systems see, and what will they have to defend against? All of the attacks we have been struggling with for years, plus a number of vulnerabilities that are specific to machine learning. Here’s a brief taxonomy of attacks against machine learning:

Poisoning, or injecting bad (“adversarial”) data into the training data. We’ve seen this many times in the wild. Microsoft’s Tay was an experimental chatbot that was quickly taught to spout racist and anti-semitic messages by the people who were chatting with it. By inserting racist content into the data stream, they effectively gained control over Tay’s behavior. The appearance of “fake news” in channels like YouTube, Facebook, Twitter, and even Google searches, was similar: once fake news was posted, users were attracted to it like flies, and the algorithms that made recommendations “learned” to recommend that content. danah boyd has argued that these incidents need to be treated as security issues, intentional and malicious corruption of the data feeding the application, not as isolated pranks or algorithmic errors.

Any machine learning system that constantly trains itself is vulnerable to poisoning. Such applications could range from customer service chatbots (can you imagine a call center bot behaving like Tay?) to recommendation engines (real estate redlining might be a consequence) or even to medical diagnosis (modifying recommended drug dosages). To defend against poisoning, you need strong control over the training data. Such control is difficult (if not impossible) to achieve. “Black hat SEO” to improve search engine rankings is nothing if not an early (and still very present) example of poisoning. Google can’t control the incoming data, which is everything that is on the web. Their only recourse is to tweak their search algorithms constantly and penalize abusers for their behavior. In the same vein, bots and troll armies have manipulated social media feeds to spread views ranging from opposition to vaccination to neo-Nazism.
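
To make the mechanism concrete, here is a toy sketch of poisoning; it is not drawn from any of the incidents above. A one-dimensional nearest-centroid classifier is retrained on data that includes attacker-supplied, deliberately mislabeled points. The data values and function names are assumptions made for illustration.

```python
from statistics import mean


def train_threshold(samples):
    """Fit a 1-D nearest-centroid classifier and return its decision threshold.

    samples: list of (value, label) pairs, where label is 0 or 1.
    """
    centroid_0 = mean(v for v, y in samples if y == 0)
    centroid_1 = mean(v for v, y in samples if y == 1)
    return (centroid_0 + centroid_1) / 2


def predict(threshold, value):
    """Classify a value as 1 if it lies above the threshold, else 0."""
    return 1 if value > threshold else 0


# Hypothetical clean training data.
clean = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
# Attacker-submitted points: high values deliberately labeled as class 0.
poison = [(9.0, 0), (10.0, 0), (11.0, 0)]

clean_threshold = train_threshold(clean)
poisoned_threshold = train_threshold(clean + poison)

print(clean_threshold, predict(clean_threshold, 6.5))
print(poisoned_threshold, predict(poisoned_threshold, 6.5))
```

Three mislabeled points are enough to drag the class-0 centroid upward, so an input that the clean model assigned to class 1 is now assigned to class 0. A system that retrains continuously on user-supplied data repeats this step every time it updates.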

Evasion, or crafting input that causes a machine learning system to misclassify it. Again, we’ve seen this both in the wild and in the lab. CV Dazzle uses makeup and hair styles as “camouflage against face recognition technology.” Other research projects have shown that it’s possible to defeat image classification by changing a single pixel in an image: a ship becomes a car, a horse becomes a frog. Or, just as with humans, image classifiers can miss an unexpected object that’s out of context: an elephant in the room, for example. It’s a mistake to think that computer vision systems “understand” what they see in ways that are similar to humans. They’re not aware of context, they don’t have expectations about what’s normal; they’re simply doing high-stakes pattern matching. Researchers have reported similar vulnerabilities in natural language processing, where changing a word, or even a letter, in a way that wouldn’t confuse human researchers causes machine learning to misunderstand a phrase.

Although these examples are often amusing, it’s worth thinking about real-world consequences: could someone use these tricks to manipulate the behavior of autonomous vehicles? Here’s how that could work: I put a mark on a stop sign—perhaps by sticking a fragment of a green sticky note at the top. Does that make an autonomous vehicle think the stop sign is a flying tomato, and if so, would the car stop? The alteration doesn’t have to make the sign “look like” a tomato to a human observer; it just has to push the image closer to the boundary where the model says “tomato.” Machine learning has neither the context nor the common sense to understand that tomatoes don’t appear in mid-air. Could a delivery drone be subverted to become a weapon by causing it to misunderstand its surroundings? Almost certainly. Don’t dismiss these examples as academic. A stop sign with a few pixels changed in the lab may not be different from a stop sign that has been used for target practice during hunting season.
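
The idea of pushing an input across a decision boundary can be sketched with a linear classifier. This is only an illustration under stated assumptions: the weights, bias, and three-element "image" are hypothetical, and real vision models are far more complex. The sketch nudges each feature by a small amount in the direction that lowers the model's score.

```python
def score(weights, bias, x):
    """Linear model score: positive means the model says 'stop sign'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    return "stop sign" if score(weights, bias, x) >= 0 else "other"


def sign(value):
    return 1 if value > 0 else -1 if value < 0 else 0


def evade(weights, x, epsilon):
    """Shift every feature by at most epsilon, against the sign of its weight."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model parameters and input.
weights = [0.5, -0.25, 1.0]
bias = -0.5
original = [0.5, 0.5, 0.5]

adversarial = evade(weights, original, epsilon=0.125)

print(classify(weights, bias, original), score(weights, bias, original))
print(classify(weights, bias, adversarial), score(weights, bias, adversarial))
```

No feature changes by more than 0.125, yet the label flips. The attacker needs only to know (or estimate) which direction lowers the score, not to make the input look like something else to a human.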

Impersonation attacks attempt to fool a model into misidentifying someone or something. The goal is frequently to gain unauthorized access to a system. For example, an attacker might want to trick a bank into misreading the amount written on a check. Fingerprints obtained from drinking glasses, or even high resolution photographs, can be used to fool fingerprint authentication. South Park trolled Alexa and Google Home users by using the words “Alexa” and “OK Google” repeatedly in an episode, triggering viewers’ devices; the devices weren’t able to distinguish between the show voices and real ones. The next generation of impersonation attacks will be “deep fake” videos that place words in the mouths of real people.

Inversion means using an API to gather information about a model, and using that information to attack it. Inversion can also mean using an API to obtain private information from a model, perhaps by retrieving data and de-anonymizing it. In “The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets,” the authors show that machine learning models tend to memorize all their training data, and that it’s possible to extract protected information from a model. Common approaches to protecting information don’t work; the model still incorporates secret information in ways that can be extracted. Differential privacy—the practice of carefully inserting extraneous data into a data set in ways that don’t change its statistical properties—has some promise, but with significant cost: the authors point out that training is much slower. Furthermore, the number of developers who understand and can implement differential privacy is small.

While this may sound like an academic concern, it’s not; writing a script to probe machine learning applications isn’t difficult. Furthermore, Michael Veale and others write that inversion attacks raise legal problems. Under the GDPR, if protected data is memorized by models, are those models subject to the same regulations as personal data? In that case, developers would have to remove personal data from models—not just the training data sets—on request; it would be very difficult to sell products that incorporated models, and even techniques like automated model generation could become problematic. Again, the authors point to differential privacy, but with the caution that few companies have the expertise to deploy models with differential privacy correctly.
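
For readers who have not seen differential privacy in code, here is a minimal sketch of one textbook building block, the Laplace mechanism applied to a counting query. It is not the training-time approach discussed in the paper cited above, and the records, predicate, and epsilon value are assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of matching records.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1 / epsilon, rng)


rng = random.Random(42)
ages = [23, 35, 41, 52, 67, 29, 48, 71]  # hypothetical records

noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

Smaller values of epsilon mean more noise and stronger privacy; larger values mean more accurate answers. Choosing epsilon, and accounting for how it is spent across many queries, is part of the expertise the authors say is in short supply.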

Other vulnerabilities, other attacks

This brief taxonomy of vulnerabilities doesn’t come close to listing all the problems that machine learning will face in the field. Many of these vulnerabilities are easily exploited. You can probe Amazon to find out what products are recommended along with your products, possibly finding out who your real competitors are, and discovering who to attack. You might even be able to reverse-engineer how Amazon makes recommendations and use that knowledge to influence the recommendations they make.

More complex attacks have been seen in the field. One involves placing fake reviews on an Amazon seller’s site, so that when the seller removes the reviews, Amazon bans the seller for review manipulation. Is this an attack against machine learning? The attacker tricks the human victim into violating Amazon’s rules. Ultimately, though, it’s the machine learning system that’s tricked into taking an incorrect action (banning the victim) that it could have prevented.

“Google bowling” means creating large numbers of links to a competitor’s website in hopes that Google’s ranking algorithm will penalize the competitor for purchasing bulk links. It’s similar to the fake review attack, except that it doesn’t require a human intermediary; it’s a direct attack against the algorithm that analyzes inbound links.

Advertising was one of the earliest adopters of machine learning, and one of the earliest victims. Click fraud is out of control, and the machine learning community is reluctant to talk about (or is unaware of) the issue—even though, as online advertising becomes ever more dependent on machine learning, fraudsters will learn how to attack models directly in their attempts to appear legitimate. If click data is unreliable, then models built from that data are unreliable, along with any results or recommendations generated by those models. And click fraud is similar to many attacks against recommendation systems and trend analysis. Once a “fake news” item has been planted, it’s simple to make it trend with some automated clicks. At that point, the recommendation takes over, generating recommendations which in turn generate further clicks. Anything automated is prone to attack, and automation allows those attacks to take place at scale.

The advent of autonomous vehicles, ranging from cars to drones, presents yet another set of threats. If the machine learning systems on an autonomous vehicle are vulnerable to attack, a car or truck could conceivably be used as a murder weapon. So could a drone—either a weaponized military drone or a consumer drone. The military already knows that drones are vulnerable; in 2011, Iran captured a U.S. drone, possibly by spoofing GPS signals. We expect to see attacks on “smart” consumer health devices and professional medical devices, many of which we know are already vulnerable.

Taking action

Merely scolding and thinking about possible attacks won’t help. What can be done to defend machine learning models? First, we can start with traditional software. The biggest problem with insecure software isn’t that we don’t understand security; it’s that software vendors, and software users, never take the basic steps they would need to defend themselves. It’s easy to feel defenseless before hyper-intelligent hackers, but the reality is that sites like Equifax become victims because they didn’t take basic precautions, such as installing software updates. So, what do machine learning developers need to do?

Security audits are a good starting point. What are the assets that you need to protect? Where are they, and how vulnerable are they? Who has access to those resources, and who actually needs that access? How can you minimize access to critical data? For example, a shipping system needs customer addresses, but it doesn’t need credit card information; a payment system needs credit card information, but not complete purchase histories. Can this data be stored and managed in separate, isolated databases? Beyond that, are basic safeguards in place, such as two-factor authentication? It’s easy to fault Equifax for not updating their software, but almost any software system depends on hundreds, if not thousands, of external libraries. What strategy do you have in place to ensure they’re updated, and that updates don't break working systems?

Like conventional software, machine learning systems should use monitoring systems that generate alerts to notify staff when something abnormal or suspicious occurs. Some of these monitoring systems are already using machine learning for anomaly detection—which means the monitoring software itself can be attacked.
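
A monitoring check can be very simple. The sketch below flags a metric that falls far outside its recent history using a z-score; the metric (the daily fraction of positive predictions), the threshold of 3, and the sample values are all assumptions for illustration, and a production system would need something more robust.

```python
from statistics import mean, stdev


def is_anomalous(history, value, z_threshold=3.0):
    """Return True if value is more than z_threshold standard deviations
    away from the mean of the historical observations."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical history: fraction of inputs the model classified as positive each day.
positive_rate_history = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10]

print(is_anomalous(positive_rate_history, 0.11))  # an ordinary day
print(is_anomalous(positive_rate_history, 0.30))  # a sudden shift worth an alert
```

A check like this can itself be gamed: an attacker who shifts the metric slowly enough will drag the baseline along with it, which is one reason monitoring alone is not a defense.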

Penetration testing is a common practice in the online world: your security staff (or, better, consultants) attack your site to discover its vulnerabilities. Attack simulation is an extension of penetration testing that shows you “how attackers actually achieve goals against your organization.” What are they looking for? How do they get to it? Can you gain control over a system by poisoning its inputs?

Tools for testing computer vision systems by generating "adversarial images" are already appearing, such as cleverhans and IBM’s ART. We are starting to see papers describing adversarial attacks against speech recognition systems. Adversarial input is a special case of a more general problem. Most machine learning developers assume their training data is similar to the data their systems will face in the real world. That’s an idealized best case. It’s easy to build a face identification system if all your faces are well-lit, well-focused, and have light-skinned subjects. A working system needs to handle all kinds of images, including images that are blurry, badly focused, poorly lighted—and have dark-skinned subjects.

Safety verification is a new area for AI research, still in its infancy. Safety verification asks questions like whether models can deliver consistent results, or whether small changes in the input lead to large changes in the output. If machine learning is at all like conventional software, we expect an escalating struggle between attackers and defenders; better defenses will lead to more sophisticated attacks, which will lead to a new generation of defenses. It will never be possible to say that a model is “verifiably safe.” But it is important to know that a model has been tested, and that it is reasonably well-behaved against all known attacks.
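
The question of whether small input changes cause large output changes can at least be tested empirically. The sketch below is random testing, not formal verification: it samples inputs near a given point and reports the largest output change it finds. The two models and the sampling parameters are hypothetical.

```python
import random


def max_output_change(model, x, epsilon, trials=200, seed=0):
    """Sample points within epsilon of x (per feature) and return the
    largest absolute change in the model's output that was observed."""
    rng = random.Random(seed)
    baseline = model(x)
    worst = 0.0
    for _ in range(trials):
        nearby = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        worst = max(worst, abs(model(nearby) - baseline))
    return worst


def smooth_model(x):
    """Hypothetical model whose output varies gently with its input."""
    return 0.5 * x[0] + 0.25 * x[1]


def brittle_model(x):
    """Hypothetical model with a hard decision boundary at x[0] = 0.5."""
    return 1.0 if x[0] > 0.5 else 0.0


point = [0.5, 0.5]
smooth_change = max_output_change(smooth_model, point, epsilon=0.01)
brittle_change = max_output_change(brittle_model, point, epsilon=0.01)
print(smooth_change, brittle_change)
```

A small result for the sampled points does not prove that no damaging perturbation exists; it only says that random search did not find one, which is consistent with the caution above about claiming a model is safe.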

Model explainability has become an important area of research in machine learning. Understanding why a model makes specific decisions is important for several reasons, not the least of which is that it makes people more comfortable with using machine learning. That “comfort” can be deceptive, of course. But being able to ask models why they made particular decisions will conceivably make it easier to see when they’ve been compromised. During development, explainability will make it possible to test how easy it is for an adversary to manipulate a model, in applications from image classification to credit scoring. In addition to knowing what a model does, explainability will tell us why, and help us build models that are more robust, less subject to manipulation; understanding why a model makes decisions should help us understand its limitations and weaknesses. At the same time, it’s conceivable that explainability will make it easier to discover weaknesses and attack vectors. If you want to poison the data flowing into a model, it can only help to know how the model responds to data.

In “Deep Automation in Machine Learning,” we talked about the importance of data lineage and provenance, and tools for tracking them. Lineage and provenance are important whether or not you’re developing the model yourself. While there are many cloud platforms to automate model building and even deployment, ultimately your organization is responsible for the model’s behavior. The downside of that responsibility includes everything from degraded profits to legal liability. If you don’t know where your data is coming from and how it has been modified, you have no basis for knowing whether your data has been corrupted, either through accident or malice.
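
One small piece of lineage tracking can be sketched as a log of content fingerprints, one per processing step, each pointing at its parent. This is an illustration only: the record layout, step names, and helper functions are hypothetical, and it is not a description of any particular lineage tool.

```python
import hashlib
import json


def fingerprint(records):
    """Return a SHA-256 digest of the records in a canonical JSON encoding."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_step(log, step, records):
    """Append a lineage entry that links this step to the previous one."""
    entry = {
        "step": step,
        "sha256": fingerprint(records),
        "parent": log[-1]["sha256"] if log else None,
    }
    log.append(entry)
    return entry


def verify(log, step_index, records):
    """Check that records still match what was logged at the given step."""
    return log[step_index]["sha256"] == fingerprint(records)


# Hypothetical data and pipeline.
raw = [{"id": 1, "dose_mg": 50}, {"id": 2, "dose_mg": 75}, {"id": 3, "dose_mg": 900}]
cleaned = [r for r in raw if r["dose_mg"] <= 100]

lineage = []
record_step(lineage, "ingest", raw)
record_step(lineage, "filter dose_mg <= 100", cleaned)

tampered = [{"id": 1, "dose_mg": 500}, {"id": 2, "dose_mg": 75}, {"id": 3, "dose_mg": 900}]
print(verify(lineage, 0, raw), verify(lineage, 0, tampered))
```

A fingerprint detects that data changed after it was logged; it says nothing about whether the data was trustworthy when it arrived, so it complements rather than replaces checks on the data's source.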

“Datasheets for Datasets” proposes a standard set of questions about a data set’s sources, how the data was collected, its biases, and other basic information. Given a specification that records a data set’s properties, it should be easy to test and detect sudden and unexpected changes. If an attacker corrupts your data, you should be able to detect that and correct it up front; if not up front, then later in an audit.

Datasheets are a good start, but they are only a beginning. Whatever tools we have for tracking data lineage and provenance need to be automated. There will be too many models and data sets to rely on manual tracking and audits.
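
As a sketch of what automated checking against a recorded specification might look like, the code below compares an incoming batch with a few stored properties. The specification fields (`value_range`, `mean`, `positive_rate`, and their tolerances) are hypothetical and are not part of the datasheets proposal, which is a set of documentation questions rather than a machine-readable format.

```python
from statistics import mean


def check_batch(spec, values, labels):
    """Compare a batch against recorded data set properties.

    Returns a list of human-readable problems; an empty list means
    the batch is consistent with the specification.
    """
    problems = []
    low, high = spec["value_range"]
    if any(v < low or v > high for v in values):
        problems.append("value outside recorded range")
    if abs(mean(values) - spec["mean"]) > spec["mean_tolerance"]:
        problems.append("mean shifted")
    positive_rate = sum(labels) / len(labels)
    if abs(positive_rate - spec["positive_rate"]) > spec["rate_tolerance"]:
        problems.append("label balance shifted")
    return problems


# Hypothetical specification recorded when the data set was created.
spec = {
    "value_range": (0.0, 100.0),
    "mean": 50.0,
    "mean_tolerance": 5.0,
    "positive_rate": 0.5,
    "rate_tolerance": 0.2,
}

good = check_batch(spec, [40.0, 50.0, 60.0, 50.0], [0, 1, 0, 1])
bad = check_batch(spec, [90.0, 95.0, 99.0, 120.0], [1, 1, 1, 1])
print(good)
print(bad)
```

Checks like these can run on every batch entering a pipeline, which is the kind of automation that manual audits cannot match once there are thousands of models and data sets.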

Balancing openness against tipping off adversaries

In certain domains, users and regulators will increasingly prefer machine learning services and products that can provide simple explanations for how automated decisions and recommendations are being made. But we’ve already seen that too much information can lead to certain parties gaming models (as in SEO). How much to disclose depends on the specific application, domain, and jurisdiction.

This balancing act is starting to come up in machine learning and related areas, where researchers (who tend to work in the open) are up against adversaries who prize unpublished vulnerabilities. The question of whether or not to “temporarily hold back” research results is a discussion that the digital media forensics community has been having. In a 2018 essay, Hany Farid noted: “Without necessarily advocating this as a solution for everyone, when students are not involved on a specific project, I have held back publication of new techniques for a year or so. This approach allows me to always have a few analyses that our adversaries are not aware of.”

Privacy and security are converging

Developers will also need to understand and use techniques for privacy-preserving machine learning, such as differential privacy, homomorphic encryption, secure multi-party computation, and federated learning. Differential privacy is one of the few techniques that protects user data against attempts to “invert” a model and extract private data from it. Homomorphic encryption allows systems to do computations directly on encrypted data, without the need for decryption. And federated learning allows individual nodes to compute parts of a model, and then send their portion back to be combined to build a complete model; individual users’ data doesn’t have to be transferred. Federated learning is already being used by Google to improve suggested completions for Android users. However, some of these techniques are slow (in some cases, extremely slow), and require specialized expertise that most companies don’t have. And you often will need a combination of these techniques to achieve privacy. It’s conceivable that future tools for automated model building will incorporate these techniques, minimizing the need for local expertise.
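
The combining step in federated learning can be sketched as a weighted average of locally computed model weights. This is a simplified illustration, assuming each node reports a weight vector and the number of examples it trained on; the node data is hypothetical, and real deployments add secure aggregation, compression, and many rounds of training.

```python
def federated_average(updates):
    """Combine per-node model weights into one model.

    updates: list of (weights, n_examples) pairs, one per node.
    Each node's weights count in proportion to its number of examples.
    Only weights are shared; the nodes' raw data never leaves them.
    """
    total_examples = sum(n for _, n in updates)
    dimensions = len(updates[0][0])
    return [
        sum(weights[i] * n for weights, n in updates) / total_examples
        for i in range(dimensions)
    ]


# Hypothetical updates from two devices.
node_updates = [
    ([1.0, 2.0], 10),  # device A trained on 10 local examples
    ([3.0, 4.0], 30),  # device B trained on 30 local examples
]

global_weights = federated_average(node_updates)
print(global_weights)
```

Keeping raw data on the device does not by itself guarantee privacy, since model updates can still leak information; that is one reason the techniques listed above are often combined.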

Live data

Machine learning applications increasingly interact with live data, complicating the task of building safe, reliable, and secure systems. An application as simple as a preference engine has to update itself constantly as its users make new choices. Some companies are introducing personalization and recommendation models that incorporate real-time user behavior. Disinformation campaigns occur in real time, so detecting disinformation requires knowledge bases that can be updated dynamically, along with detection and mitigation models that can also be updated in real time. Bad actors who create and propagate disinformation are constantly getting more sophisticated, making it harder to detect, particularly with text-based content. And recent developments in automatic text generation mean that the creation of “fake news” can be automated. Machine learning can detect potential misinformation, but at present, humans are needed to verify and reject misinformation. Machine learning can aid and support human action, but humans must remain in the loop.

Applications of reinforcement learning frequently interact with live data, and researchers are well aware of the need to build reinforcement learning applications that are safe and robust. For applications like autonomous driving, failures are catastrophic; but at the same time, the scarcity of failure makes it harder to train systems effectively.

Organization and culture

In traditional software development, we are finally learning that security experts have to be part of development teams from the beginning. Security needs to be part of the organization’s culture. The same is true for machine learning: from the beginning, it’s important to incorporate security experts and domain experts who understand how a system is likely to be abused. As Facebook’s former chief security officer Alex Stamos has said, “the actual responsibility [for security] has to be there when you’re making the big design decisions.” Every stage of a machine learning project must think about security: the initial design, building the data pipelines, collecting the data, creating the models, and deploying the system. Unfortunately, as Stamos notes, few teams are actually formed this way.

Whatever they might believe, most organizations are in the very early stages of adopting machine learning. The companies with capabilities equivalent to Google, Facebook, Amazon, or Microsoft are few and far between; at this point, most are still doing some early experiments and proofs of concept. Thought and effort haven’t gone into security. And maybe that’s fair; does a demo need to be secure?

Perhaps not, but it’s worth thinking carefully about history. Security is a problem in part because the inventors of modern computer networking didn’t think it was necessary. They were building the ARPAnet: an academic research net that would never go beyond a few hundred sites. Nobody anticipated the public internet. And yet, even on the proto-internet, we had the Morris worm in the '80s, and email spam in the '70s. One of the things we do with any technology is abuse it. By ignoring the reality of abuse, we entered a never-ending race; it’s impossible to win, impossible to quit, and easy to lose.

But even if we can give the internet’s early naivete a pass, there’s no question that we live in a world where security is a paramount concern. There is no question that applications of machine learning will touch (indeed, invade) people’s lives, frequently without their knowledge or consent. It is time to put a high priority on security for machine learning.

We believe that attacks against machine learning systems will become more frequent and sophisticated. That’s the nature of the security game: an attack is countered by a defense, which is countered in turn by a more sophisticated attack, in a game of endlessly increasing complexity. We’ve listed a few kinds of attacks, but keep in mind we’re in the early days. Our examples aren’t exhaustive, and there are certainly many vulnerabilities that nobody has yet thought of. These vulnerabilities will inevitably be discovered; cybercrime is a substantial international business, and the bad actors even include government organizations.

Meanwhile, the stakes are getting higher. We’ve only begun to pay the penalty for highly vulnerable networked devices—the Internet of Things (IoT)—and while the security community is aware of the problems, there are few signs that manufacturers are addressing the issues. IoT devices are only becoming more powerful, and 5G networking promises to extend high-bandwidth, low-latency connectivity to the edges of the network. We are already using machine learning in our phones; will machine learning extend to near-microscopic chips embedded in our walls? There are already voice activity detectors that can run on a microwatt; as someone on Twitter suggests, a future generation could possibly run on energy generated from sound waves. And there are already microphones where we least suspect them. Deploying insecure "smart devices" on this scale isn't a disaster waiting to happen; it's a disaster that's already happening.

We have derived a lot of value from machine learning, and we will continue to derive value from pushing it to the limits; but if security issues aren’t addressed, we will have to pay the consequences. The software industry has demonstrated, all too clearly, what happens when you don’t pay attention to security. As machine learning penetrates our lives, those stakes will inevitably become higher.

Related resources:

Continue reading You created a machine learning application. Now make sure it’s secure..

Categories: Technology

Four short links: 28 February 2019

O'Reilly Radar - Thu, 2019/02/28 - 02:00

Breakthrough Technologies, AI Habitat, Simplified Datomic, and Metrics

  1. Breakthrough Technologies 2019 (MIT TR) -- robot dexterity, new-wave nuclear power, predicting preemies, gut probe in a pill, custom cancer vaccines, the cow-free burger, CO2 catcher, ECG on your wrist, sanitation without sewers, AI assistants.
  2. AI Habitat (Facebook) -- enables training of embodied AI agents (virtual robots) in a highly photorealistic & efficient 3D simulator, before transferring the learned skills to reality.
  3. Asami -- In-memory graph store that implements the Naga storage protocol. This has a query API that looks very similar to a simplified Datomic.
  4. Metrics -- Metrics are lossily compressed logs. Traces are logs with parent child relationships between entries. The only reason we have three terms is because getting value from them has required different compromises to make them cost effective. --Clint Sharp. (via Simon Willison)

Continue reading Four short links: 28 February 2019.

Categories: Technology

Four short links: 27 February 2019

O'Reilly Radar - Wed, 2019/02/27 - 02:00

Universal Binaries, Front-End Training, ML Myths, and Recommended Books

  1. WASMer -- Universal Binaries Powered by WebAssembly. (open source)
  2. Frontend Workshop from HTML/CSS/JS to TypeScript/React/Redux -- Microsoft's training materials. Open sourced.
  3. Myths in Machine Learning -- TensorFlow is a Tensor manipulation library; Image datasets are representative of real images found in the wild; Machine Learning researchers do not use the test set for validation; Every datapoint is used in training a neural network; We need (batch) normalization to train very deep residual networks; Attention > Convolution; Saliency maps are robust ways to interpret neural networks.
  4. Books I Recommend (Jessie Frazelle) -- tight set of recommendations that overlap enough with my own reading that I'm already ordering the books that are new to me.

Continue reading Four short links: 27 February 2019.

Categories: Technology

3 reasons to add deep learning to your time series toolkit

O'Reilly Radar - Tue, 2019/02/26 - 04:00

The most promising area in the application of deep learning methods to time series forecasting is in the use of CNNs, LSTMs, and hybrid models.

The ability to accurately forecast a sequence into the future is critical in many industries: finance, supply chain, and manufacturing are just a few examples. Classical time series techniques have served this task for decades, but now deep learning methods—similar to those used in computer vision and automatic translation—have the potential to revolutionize time series forecasting as well.

Due to their applicability to many real-life problems—such as fraud detection, spam email filtering, finance, and medical diagnosis—and their ability to produce actionable results, deep learning neural networks have gained a lot of attention in recent years. Generally, deep learning methods have been developed and applied to univariate time series forecasting scenarios, where the time series consists of single observations recorded sequentially over equal time increments. For this reason, they have often performed worse than naïve and classical forecasting methods, such as exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA). This has led to a general misconception that deep learning models are inefficient in time series forecasting scenarios, and many data scientists wonder whether it’s really necessary to add another class of methods—such as convolutional neural networks or recurrent neural networks—to their time series toolkit.

In this post, I'll discuss some of the practical reasons why data scientists may still want to think about deep learning when they build time series forecasting solutions.

Deep learning neural networks: Some foundational concepts

The goal of machine learning is to find features to train a model that transforms input data (such as pictures, time series, or audio) to a given output (such as captions, price values, transcriptions). Deep learning is a subset of machine learning algorithms that learn to extract these features by representing input data as vectors and transforming them with a series of clever linear algebra operations into a given output.

Data scientists then evaluate whether the output is what they expected using an equation called a loss function. The goal of the process is to use the result of the loss function from each training input to guide the model to extract features that will result in a lower loss value on the next pass. This process has been used to cluster and classify large volumes of information: for example, millions of satellite images; thousands of video and audio recordings from YouTube; and historical, textual, and sentiment data from Twitter.
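To make the loss-guided training loop concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from any particular library: the model is a single weight w in y = w * x, the loss is mean squared error, and the names mse_loss and train_slope, the learning rate, and the toy data are all assumptions chosen for the example.

```python
def mse_loss(predictions, targets):
    """Mean squared error: the average squared gap between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def train_slope(inputs, targets, learning_rate=0.01, passes=200):
    """Fit y = w * x by gradient descent, recording the loss before each update."""
    w = 0.0
    history = []
    for _ in range(passes):
        predictions = [w * x for x in inputs]
        history.append(mse_loss(predictions, targets))
        # Gradient of the MSE with respect to w: 2/N * sum((w*x - y) * x)
        grad = 2 * sum((p - t) * x for p, t, x in zip(predictions, targets, inputs)) / len(inputs)
        # Step against the gradient so the next pass has a lower loss
        w -= learning_rate * grad
    return w, history


# Toy data generated by y = 2 * x (an assumption for this sketch)
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0, 4.0, 6.0, 8.0]
w, history = train_slope(inputs, targets)
print(round(w, 4), history[0], history[-1])
```

Real deep learning models have many weights and use automatic differentiation rather than a hand-written gradient, but the cycle is the same: predict, measure the loss, adjust, repeat.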

Deep learning neural networks have three main intrinsic capabilities:

  1. They can learn arbitrary mappings from inputs to outputs
  2. They support multiple inputs and outputs
  3. They can automatically extract patterns in input data that span long sequences

Thanks to these three characteristics, deep learning neural networks can offer a lot of help when data scientists deal with more complex but still very common problems, such as time series forecasting.

Here are three reasons data scientists should consider adding deep learning to their time series toolkits.

Reason #1: Deep learning neural networks are capable of automatically learning and extracting features from raw and imperfect data

Time series is a type of data that measures how things change over time. In time series, time isn’t just a metric, but a primary axis. This additional dimension represents both an opportunity and a constraint for time series data because it provides a source of additional information but makes time series problems more challenging, as specialized handling of the data is required. Moreover, this temporal structure can carry additional information, like trends and seasonality, that data scientists need to deal with in order to make their time series easier to model with any type of classical forecasting methods.

Neural networks can be useful for time series forecasting problems by eliminating the immediate need for massive feature engineering processes, data scaling procedures, and the need for making the data stationary by differencing.
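For readers who have not met differencing, the following sketch shows the classical preprocessing step referred to above: subtracting each observation from the one a fixed lag before it to remove a trend or a seasonal pattern. The helper names difference and invert_difference and the sample numbers are assumptions for illustration only.

```python
def difference(series, lag=1):
    """Return series[i] - series[i - lag] for every i where both values exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def invert_difference(first_values, diffs, lag=1):
    """Rebuild the original series from its first `lag` values and the differences."""
    restored = list(first_values)
    for d in diffs:
        restored.append(restored[-lag] + d)
    return restored


# A series with an accelerating upward trend (made-up numbers)
trend = [10, 12, 15, 19, 24]
once = difference(trend)          # [2, 3, 4, 5]: still trending
twice = difference(once)          # [1, 1, 1]: constant, so the trend is gone
print(once, twice)

# A series that repeats every 2 steps while drifting upward
seasonal = [1, 5, 2, 6, 3, 7]
print(difference(seasonal, lag=2))  # [1, 1, 1, 1]
```

With classical methods, the analyst has to choose the lag and the number of differencing passes, and then invert the transform on the forecasts. That is the manual work a neural network can often skip.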

In real-world time series scenarios—for example, weather forecasting, air quality and traffic flow forecasting, and forecasting scenarios based on streaming IoT devices like geo-sensors—irregular temporal structures, missing values, heavy noise, and complex interrelationships between multiple variables present limitations for classical forecasting methods. These techniques typically rely on clean, complete data sets in order to perform well: missing values, outliers, and other imperfect features are generally unsupported.

Even on cleaner, more artificial data sets, classical forecasting methods are based on the assumption that a linear relationship and a fixed temporal dependence exist among variables of a data set, and this assumption by default excludes the possibility of exploring more complex (and probably more interesting) relationships among variables. Data scientists must make subjective judgements when preparing data for classical analysis (like the lag period used to remove trends), which is time-consuming and introduces human biases to the process. By contrast, neural networks are robust to noise in input data and in the mapping function, and can even support learning and prediction in the presence of missing values.

Convolutional neural networks (CNNs) are a category of neural networks that have proven very effective in areas such as image recognition and classification. CNNs have been successful in identifying faces, objects, and traffic signs in addition to powering vision in robots and self-driving cars. CNNs derive their name from the “convolution” operator. The primary purpose of convolution in the case of CNNs is to extract features from the input image. Convolution preserves the spatial relationship between pixels by learning image features using small squares of input data. In other words, the model learns how to automatically extract the features from the raw data that are directly useful for the problem being addressed. This is called "representation learning" and the CNN achieves this in such a way that the features are extracted regardless of how they occur in the data, so-called "transform" or "distortion" invariance.

The ability of CNNs to learn and automatically extract features from raw input data can be applied to time series forecasting problems. A sequence of observations can be treated like a one-dimensional image that a CNN model can read and refine into the most relevant elements. This capability of CNNs has been demonstrated to great effect on time series classification tasks, such as indoor movement prediction using wireless sensor strength data to predict the location and motion of subjects within a building.
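The idea of reading a sequence as a one-dimensional image can be sketched with the three basic operations of a convolutional layer: convolution, a ReLU activation, and max pooling. This is a hand-rolled illustration in plain Python, not a deep learning framework API. The kernel here is set by hand to detect upward jumps, whereas in a real CNN the kernel weights are learned from data.

```python
def conv1d(sequence, kernel):
    """Slide the kernel along the sequence (no padding) and take dot products."""
    k = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(k))
        for i in range(len(sequence) - k + 1)
    ]


def relu(values):
    """Keep positive responses, zero out the rest."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Keep the strongest response in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# A made-up sensor reading that jumps up, stays high, then drops
series = [1, 1, 1, 5, 5, 5, 1, 1]
kernel = [-1, 1]                      # responds to the change between neighbors

feature_map = conv1d(series, kernel)  # [0, 0, 4, 0, 0, -4, 0]
activated = relu(feature_map)         # only the upward jump survives
pooled = max_pool(activated, 2)       # [0, 4, 0]
print(feature_map, activated, pooled)
```

The same kernel finds the jump wherever it occurs in the series, which is the position invariance described above.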

Reason #2: Deep learning supports multiple inputs and outputs

Real-world time series forecasting is challenging for several reasons, such as having multiple input variables, the requirement to predict multiple time steps, and the need to perform the same type of prediction for multiple physical sites. Deep learning algorithms can be applied to time series forecasting problems and offer benefits such as the ability to handle multiple input variables with noisy complex dependencies. Specifically, neural networks can be configured to support an arbitrary but fixed number of inputs and outputs in the mapping function. This means that neural networks can directly support multivariate inputs, providing direct support for multivariate forecasting. A univariate time series, as the name suggests, is a series with a single time-dependent variable. For example, suppose we want to predict the next energy consumption value in a specific location. In a univariate time series scenario, our data set will be based on two variables: time values and historical energy consumption observations.

A multivariate time series has more than one time-dependent variable. Each variable depends not only on its past values, but also has some dependency on other variables. This dependency is used for forecasting future values. Let’s consider the above example again. Now suppose our data set includes weather data, such as temperature values, dew point, wind speed, cloud cover percentage, etc., along with the energy consumption value for the past four years. In this case, there are multiple variables to be considered to optimally predict an energy consumption value. A series like this would fall under the category of a multivariate time series.

With neural networks, an arbitrary number of output values can be specified, offering direct support for more complex time series scenarios that require multivariate forecasting and even multi-step forecast methods. There are two main approaches to using deep learning methods to make multi-step forecasts: 1) direct, where a separate model is developed to forecast each forecast lead time; and 2) recursive, where a single model is developed to make one-step forecasts, and the model is used recursively where prior forecasts are used as input to forecast the subsequent lead time.

The recursive approach can make sense when forecasting a short contiguous block of lead times, whereas the direct approach may make more sense when forecasting discontiguous lead times. The direct approach may be more appropriate when we need to forecast a mixture of multiple contiguous and discontiguous lead times over a period of a few days; such is the case, for example, with air pollution forecasting problems or for anticipatory shipping forecasting, used to predict what customers want and then ship the products automatically.
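The difference between the direct and recursive strategies is easier to see in code. In this sketch a simple least-squares line stands in for the neural network, purely to keep the example short and dependency-free. The function names and the toy series are assumptions of mine.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return a, mean_y - a * mean_x


def fit_lead(series, lead):
    """Fit a model that maps the value at time t to the value at time t + lead."""
    return fit_line(series[:-lead], series[lead:])


def direct_forecast(series, horizon):
    """Direct strategy: one separately fitted model per lead time."""
    forecasts = []
    for lead in range(1, horizon + 1):
        a, b = fit_lead(series, lead)
        forecasts.append(a * series[-1] + b)
    return forecasts


def recursive_forecast(series, horizon):
    """Recursive strategy: one one-step model, fed its own previous forecast."""
    a, b = fit_lead(series, 1)
    forecasts, last = [], series[-1]
    for _ in range(horizon):
        last = a * last + b
        forecasts.append(last)
    return forecasts


series = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]   # noiseless linear trend
print(direct_forecast(series, 3))
print(recursive_forecast(series, 3))
```

On this noiseless trend the two strategies agree. On real data they generally differ: the recursive strategy compounds its own errors as the horizon grows, while the direct strategy needs one model per lead time but can skip lead times that are not of interest.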

Key to the use of deep learning algorithms for time series forecasting is the choice of multiple input data. We can think about three main sources of data that can be used as input and mapped to each forecast lead time for a target variable; they are: 1) univariate data, such as lag observations from the target variable that is being forecasted; 2) multivariate data, such as lag observations from other variables (for example, weather and targets in case of air pollution forecasting problems); 3) metadata, such as data about the date or time being forecast. Data can be drawn from across all chunks, providing a rich data set for learning a mapping from inputs to the target forecast lead time.
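The three sources of input data can be combined into one training row per forecast. Here is a minimal sketch, assuming hourly energy consumption as the target, temperature as the second variable, and hour of day as metadata. The function make_supervised and all sample values are hypothetical.

```python
def make_supervised(target, exogenous, hours, n_lags, lead_time):
    """Build (features, label) rows for forecasting `lead_time` steps ahead.

    Each feature row holds, in order: the last `n_lags` target values,
    the last `n_lags` exogenous values, and the hour being forecast.
    """
    rows = []
    for t in range(n_lags - 1, len(target) - lead_time):
        lag_target = target[t - n_lags + 1 : t + 1]    # 1) univariate lags
        lag_exog = exogenous[t - n_lags + 1 : t + 1]   # 2) multivariate lags
        meta = [hours[t + lead_time]]                  # 3) metadata about the forecast time
        rows.append((lag_target + lag_exog + meta, target[t + lead_time]))
    return rows


consumption = [10, 11, 13, 12, 15, 16]   # made-up energy readings
temperature = [20, 21, 19, 18, 22, 23]   # made-up weather readings
hour_of_day = [0, 1, 2, 3, 4, 5]

for features, label in make_supervised(consumption, temperature, hour_of_day,
                                       n_lags=2, lead_time=1):
    print(features, "->", label)
```

Each row is one example of the mapping the network has to learn. Changing lead_time produces the training set for a different forecast horizon.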

Reason #3: Deep learning networks are good at extracting patterns in input data that span over relatively long sequences

Deep learning is an active research area, and CNNs are not the only class of neural network architectures being used for time series and sequential data. Recurrent neural networks (RNNs) were created in the 1980s but have recently been gaining popularity, helped by the increased computational power of graphics processing units. They are especially useful with sequential data because each neuron or unit can use its internal memory to maintain information about the previous input. An RNN has loops that allow information to be carried across neurons while reading in input.

However, a simple recurrent network suffers from a fundamental problem of not being able to capture long-term dependencies in a sequence. This is a major reason why RNNs faded from practice for a while until some great results were achieved using a long short-term memory (LSTM) unit inside the neural network. Adding the LSTM to the network is like adding a memory unit that can remember context from the very beginning of the input.

LSTM neural networks are a particular type of RNN that have internal contextual state cells that act as long-term or short-term memory cells. The output of the LSTM network is modulated by the state of these cells. This is a very important property when we need the prediction of the neural network to depend on the historical context of inputs, rather than only on the very last input. They are a type of neural network that adds native support for input data comprised of sequences of observations. The addition of sequence is a new dimension to the function being approximated. Instead of mapping inputs to outputs alone, the network can learn a mapping function for the inputs over time to an output. The example of video processing can be very effective when we need to understand how LSTM networks work: in a movie, what happens in the current frame is heavily dependent on what was in the previous frame. Over a period of time, an LSTM network tries to learn what to keep and how much to keep from the past, and how much information to keep from the present state, which makes it powerful compared to other types of neural networks.
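To show how the gates decide what to keep from the past and what to take from the present, here is a sketch of a single LSTM step using the standard gate equations, reduced to scalars so it runs in plain Python. The weights are picked by hand to make the behavior visible. A trained network would learn them, and a real layer works on vectors and matrices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params maps gate name -> (w_x, w_h, bias)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)         # how much of the old cell state to keep
    i = gate("input", sigmoid)          # how much of the new candidate to write
    g = gate("candidate", math.tanh)    # proposed new content
    o = gate("output", sigmoid)         # how much of the cell state to expose
    c = f * c_prev + i * g              # updated long-term memory
    h = o * math.tanh(c)                # output, modulated by the cell state
    return h, c


# Hand-picked weights (assumptions): remember almost everything,
# and only write to memory when the input is large.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (20.0, 0.0, -10.0),
    "candidate": (5.0, 0.0, 0.0),
    "output": (0.0, 0.0, 10.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0, 0.0]:          # one event, then silence
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

The cell state stays close to 1 for every step after the initial event, so the output at the end of the sequence still reflects an input seen several steps earlier.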

This capability can be used in any time series forecasting context, where it can be extremely helpful to automatically learn the temporal dependence from the data. In the simplest case, the network is shown one observation at a time from a sequence and can learn which prior observations are important and how they are relevant to forecasting. The model both learns a mapping from inputs to outputs and learns what context from the input sequence is useful for mapping and can dynamically change this context as needed. Not surprisingly, this approach has been often used in the finance industry to build models that forecast exchange rates based on the idea that past behavior and price patterns may affect currency movements and can be used to predict future price behavior and patterns.

On the other hand, there are downsides that data scientists need to be careful about with neural network architectures. Large volumes of data are required, and models require hyper-parameter tuning and multiple optimization cycles.


Deep learning neural networks are powerful engines capable of learning arbitrary mappings from inputs to outputs, supporting multiple inputs and outputs, and automatically extracting patterns in input data that span long sequences of time. All these characteristics together make neural networks helpful tools when dealing with more complex time series forecasting problems that involve large amounts of data, multiple variables with complicated relationships, and even multi-step time series tasks. A lot of research has been invested into using neural networks for time series forecasting with modest results. Perhaps the most promising area in the application of deep learning methods to time series forecasting is in the use of CNNs, LSTMs, and hybrid models.

Useful resources

Recent improvements in tools and technologies have meant that techniques like deep learning are now being used to solve common problems, including forecasting, text mining, language understanding, and personalization. Below are some useful resources and presentations involving deep learning:

Continue reading 3 reasons to add deep learning to your time series toolkit.

Categories: Technology

Four short links: 26 February 2019

O'Reilly Radar - Tue, 2019/02/26 - 02:00

Cloud Act, Content Moderation, Conference Diversity, and Exposing Secrets

  1. US Cloud Act (Bloomberg) -- nations fight over corporations that are fighting over consumers. A few years ago it was hypothesised that corporations had the advantage in an age of globalisation, but there is still fight left in the nation.
  2. Facebook Content Moderators (The Verge) -- a horrific story about the low-paid traumatic work of moderating user-generated content. Now it's being done domestically (was, and may still be, primarily in Philippines), but with little concern to the damage done by watching murders and rapes every day. When I ask about the risks of contractors developing PTSD, a counselor I’ll call Logan tells me about a different psychological phenomenon: “post-traumatic growth,” an effect whereby some trauma victims emerge from the experience feeling stronger than before. The example he gives me is that of Malala Yousafzai, the women’s education activist, who was shot in the head as a teenager by the Taliban. Despicable.
  3. Do Better at Conference Diversity -- good to have a single place to point conference organisers who are, let's put it diplomatically, starting their diversity journey. This is a HOWTO for better events.
  4. Exposing Company Secrets Through Your API: Part 1 -- a lot of companies have private API routes that define their testing infrastructure. The client will request the current tests enabled for the account, and server will reply back, often with a huge list of all the tests the company is running. Below I explored how some of the most popular sites do it. (via Simon Willison)

Continue reading Four short links: 26 February 2019.

Categories: Technology

Four short links: 25 February 2019

O'Reilly Radar - Mon, 2019/02/25 - 02:00

Generalising, Combining, Deciding, and Working.

  1. Learning to Generalize from Sparse and Underspecified Rewards (Google) -- where an agent receives a complex input, such as a natural language instruction, and needs to generate a complex response, such as an action sequence, while only receiving binary success-failure feedback. Such success-failure rewards are often underspecified: they do not distinguish between purposeful and accidental success. Generalization from underspecified rewards hinges on discounting spurious trajectories that attain accidental success, while learning from sparse feedback requires effective exploration. [...] The MeRL approach outperforms our alternative reward learning technique based on Bayesian Optimization, and achieves the state-of-the-art on weakly-supervised semantic parsing. It improves previous work by 1.2% and 2.4% on WikiTableQuestions and WikiSQL datasets respectively. An important area of machine learning because most successes and failures don't come with a root cause analysis.
  2. Generating Combinations -- Gosper's Hack is a very elegant piece of code for generating combinations. I love hacks like this (this one first appeared in the classic MIT text, HAKMEM. Gosper is Bill Gosper who also invented the Game of Life glider gun among his many claims to fame).
  3. Manifold's Decision-Making Process -- there's nothing specific to Manifold here, this is just good advice about knowing who is making a decision and then involving people according to the consequence and irreversibility of the decision. Every organisation has to learn how to make decisions before its dysfunction grinds progress to a halt.
  4. Workism is Making Americans Miserable (The Atlantic) -- The economists of the early 20th century did not foresee that work might evolve from a means of material production to a means of identity production. They failed to anticipate that, for the poor and middle class, work would remain a necessity; but for the college-educated elite, it would morph into a kind of religion, promising identity, transcendence, and community. Call it workism. The punchline is great, and the journey there is hard to argue with: The vast majority of workers are happier when they spend more hours with family, friends, and partners, according to research. Work is not that.
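
For the curious, Gosper's Hack from item 2 can be sketched in a few lines of Python. Each k-element subset of n items is a bitmask with k bits set, and the hack steps from one mask to the next larger integer with the same number of set bits. The function names are my own, and this is a sketch of the well-known trick rather than the code from the linked article.

```python
def gosper_next(x):
    """Return the next larger integer with the same number of set bits (x > 0)."""
    lowest = x & -x                         # isolate the lowest set bit
    ripple = x + lowest                     # carry through the lowest run of 1s
    ones = ((x ^ ripple) >> 2) // lowest    # leftover 1s, shifted to the bottom
    return ripple | ones


def combinations_bitmask(n, k):
    """Yield every k-subset of range(n) as a tuple of indices, in mask order."""
    if k == 0:
        yield ()
        return
    if k > n:
        return
    x = (1 << k) - 1                        # smallest mask: the k lowest bits
    limit = 1 << n
    while x < limit:
        yield tuple(i for i in range(n) if x >> i & 1)
        x = gosper_next(x)


for combo in combinations_bitmask(4, 2):
    print(combo)
```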

Continue reading Four short links: 25 February 2019.

Categories: Technology

Four short links: 22 February 2019

O'Reilly Radar - Fri, 2019/02/22 - 05:05

Decentralized Comms, Multiple Screens, HTTP/S Troubleshooting, and Early Adopter

  1. Matrix -- an open standard for interoperable, decentralized, real-time communication over IP. (via LWN)
  2. RAMSES -- Rendering Architecture for Multi-Screen EnvironmentS: It implements a distributed system for rendering 3D content with a focus on bandwidth and resource efficiency.
  3. -- a shell script for http/https troubleshooting and profiling. It's also a simple wrapper script around several open source security tools.
  4. Early Adopter -- a Valentine's Day sci-fi short story by Kevin Bankston, and it's very good. (via Cory Doctorow)

Continue reading Four short links: 22 February 2019.

Categories: Technology

The evolution and expanding utility of Ray

O'Reilly Radar - Thu, 2019/02/21 - 10:05

There are growing numbers of users and contributors to the framework, as well as libraries for reinforcement learning, AutoML, and data science.

In a recent post, I listed some of the early use cases described in the first meetup dedicated to Ray—a distributed programming framework from UC Berkeley’s RISE Lab. A second meetup took place a few months later, and both events featured some of the first applications built with Ray. On the development front, the core API has stabilized and a lot of work has gone into improving Ray’s performance and stability. The project now has around 5,700 stars on GitHub and more than 100 contributors across many organizations.

At this stage of the project, how does one describe Ray to those who aren’t familiar with the project? The RISE Lab team describes Ray as a “general framework for programming your cluster or cloud.” To place the project into context, Ray and cloud functions (FaaS, serverless) currently sit somewhere in the middle between extremely flexible systems on one end and systems that are much more targeted and emphasize ease of use on the other. More precisely, users can currently avail themselves of extremely flexible cluster management and virtualization tools on one end (Docker, Kubernetes, Mesos, etc.), or domain-specific systems on the other end of the flexibility spectrum (Spark, Kafka, Flink, PyTorch, TensorFlow, Redshift, etc.).

How does this translate in practice? Ray’s support for both stateless and stateful computations, and its fine-grained control over scheduling, allow users to implement a variety of services and applications on top of it:

Figure 1. Examples of libraries that can be built on top of Ray. Source: Robert Nishihara and Philipp Moritz, used with permission.

Libraries on top of Ray are already appearing: RLlib (scalable reinforcement learning), Tune (a hyperparameter optimization framework), and a soon-to-be-released library for streaming are just a few examples. As I describe below, I expect more to follow soon.

That’s all well and good, but most developers, data scientists, and researchers aren’t necessarily looking to write libraries. They are likely to be interested in tools that already provide libraries they can use. If you are a data scientist or developer who uses Python, there are several reasons you should start looking into Ray:

  • Modin lets you scale your pandas workflows by changing one line of code. Given that many data scientists already love using pandas, Modin is an extremely simple way to scale and speed up existing code.
  • Scalable data science: It has not gone unnoticed that Ray provides a simple way for Python users to parallelize their code. At a recent UC Berkeley course—Data Science 100—co-taught by one of the creators of Jupyter, Ray was the focus of a lecture on distributed computing.
  • Reinforcement learning (RL): RL is one of those topics that data scientists are beginning to explore. But just as many machine learning users take advantage of existing libraries (e.g., scikit-learn), most RL users won’t be writing libraries and tools from scratch. The good news is that RLlib provides a unified API for different types of RL training, and all of RLlib’s algorithms are distributed. Thus, both RL users and RL researchers benefit from using RLlib.
Figure 2. Reinforcement learning and Ray. Source: Eric Liang, used with permission.
  • AutoML: In a recent post, we described tools for automating various stages of machine learning projects—with model building being an important component. Ray users can already take advantage of Tune, a scalable hyperparameter optimization framework. Hyperparameter tuning is a critical and common step in machine learning model exploration and building. There are also other AutoML projects that use Ray, and hopefully some will be released into open source in the near future.

What about actual usage? The growing number of project contributors, along with the rise of libraries and tools, has translated into additional use cases and production deployments of Ray. In a previous post, I listed a UC Berkeley research group investigating mixed-autonomy mobility and Ant Financial as organizations already using Ray (Ant Financial has several Ray use cases in production). Since then, I’ve heard of other industry use cases in various settings:

  • Financial services and industrial automation applications
  • Ray is being used in text mining, in the construction of knowledge graphs, and for graph queries
  • Companies exploring the use of Ray for real-time recommendation systems—which involves learning models against live data.

There is also a growing roster of research groups in AI, machine learning, robotics, and engineering that have adopted Ray. With Ray beginning to be available on cloud platforms (Amazon SageMaker RL includes Ray), I expect to hear of many more interesting case studies involving Ray.

In closing, this is a great time to explore Ray. The core API is stable, libraries are improving and expanding, more production deployments are emerging, and (most importantly) the community of users and contributors is growing.

Related content:

Continue reading The evolution and expanding utility of Ray.

Categories: Technology

Four short links: 21 February 2019

O'Reilly Radar - Thu, 2019/02/21 - 05:00

Internet of Shite, Parsing JSON, Remote-First, and Biased ML

  1. Nike Just Bricked Its Self-Lacing Shoes by Accident -- Android users are experiencing problems. The bug reports (left in app comments) are classic 21C This Is Not The Cyberpunk Future I Was Promised. The first software update for the shoe threw an error while updating, bricking the right shoe. [...] Also, app says left shoe is already connected to another device whenever I try to reinstall and start over.
  2. simdjson -- Parsing gigabytes of JSON per second.
  3. MobileJazz Company Handbook -- they're remote-first, and this talks about how they do it.
  4. Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice -- Deploying predictive policing systems in jurisdictions with extensive histories of unlawful police practices presents elevated risks that dirty data will lead to flawed, biased, and unlawful predictions which in turn risk perpetuating additional harm via feedback loops throughout the criminal justice system. Thus, for any jurisdiction where police have been found to engage in such practices, the use of predictive policing in any context must be treated with skepticism and mechanisms for the public to examine and reject such systems are imperative.

Continue reading Four short links: 21 February 2019.

Categories: Technology

Three surveys of AI adoption reveal key advice from more mature practices

O'Reilly Radar - Wed, 2019/02/20 - 12:30

An overview of emerging trends, known hurdles, and best practices in artificial intelligence.

Recently, O’Reilly Media published AI Adoption in the Enterprise: How Companies Are Planning and Prioritizing AI Projects in Practice, a report based on an industry survey. That was the third of three industry surveys conducted in 2018 to probe trends in artificial intelligence (AI), big data, and cloud adoption. The other two surveys were The State of Machine Learning Adoption in the Enterprise, released in July 2018, and Evolving Data Infrastructure, released in January 2019.

This article looks at those results in further detail, comparing high-level themes based on the three reports, plus related presentations at the Strata Data Conference and the AI Conference. These points would have been out of scope for any of the individual reports.

Exploring new markets by repurposing AI applications

Looking across industry sectors in AI Adoption in the Enterprise, we see how technology, health care, and retail tend to be the leaders in AI adoption, whereas the public sector (government) tends to be the laggards, along with education and manufacturing. Although that gap could be taken as commentary about the need for “data for social good,” it also points toward opportunities. Consider this: finance has enjoyed first-mover advantages in artificial intelligence adoption, as have the technology and retail sectors. After having matured in these practices, now we see financial services firms exploring opportunities that just a few years ago might have been considered niches. For example, at our recent AI Conference in London, two talks—Ashok Srivastava of Intuit and Johnny Ball of Fluidy—presented business applications for AI aimed at establishing safety nets for small businesses. Both teams applied anomaly detection techniques (for example, reused from aircraft engine monitoring) to spot when small businesses were likely to fail. That’s important since more than 50% of small businesses fail, mostly due to exactly those “anomalies”: cash flow problems and late payments.

Given how government and education trail as laggards in the AI space, could similar kinds of technology reuse apply there? For example, within the past few years, it’s become common practice in U.S. grade schools for teachers to provide detailed information online to parents about student assignments and grades. This data can be extremely helpful as early warning signals for at-risk students who might be failing school, although, quite frankly, few working parents can afford the time to track that much data. Moreover, few schools have resources to act on that data in aggregate. Even so, the anomaly detection used in small business cash-flow analysis is strikingly similar to what a homework “safety net” for students would need. Undoubtedly, there are areas within government (especially at the local level) that are understaffed due to budget constraints, where similar AI applications could lead to considerable public upside. As the enterprise adoption of AI continues to mature, we can hope that diffusion from the leaders to the laggards comes through similarly innovative acts of technology repurposing. The trick seems to be finding enough people with depth in both technical and business skills who can recognize business use cases for AI.

Differentiated tooling

Looking at the “Tools for Building AI Applications” section of AI Adoption in the Enterprise for trends about technology adoption, we see that frameworks such as Spark NLP, scikit-learn, and H2O are popular in finance, whereas Google Cloud ML Engine has a higher share within the health care industry. Compared with last year’s analysis, both Keras and PyTorch have picked up significant gains over the category leader TensorFlow. Also, while there has been debate in the industry about the relative merits of using Jupyter Notebooks in production, usage has been growing dramatically. We see from this survey’s results that support for notebooks (23%) now leads support for IDEs (17%).

The summary results about health care and life sciences create an interesting picture. Among all respondents from the health sector, 70% are using AI for R&D projects. Respondents from the health care sector also had significantly less trouble identifying appropriate use cases for AI, although hurdles for the sector seem to come later in the AI production lifecycle. In general, health care leads other verticals in how it checks for a broad range of AI-related risks, and this vertical makes more use of data visualization than others, as would be expected. It’s also gaining in use of reinforcement learning, which was not expected. Although we know of reinforcement learning production use cases in finance, we don’t have visibility into how reinforcement learning is used in health care. That could be a good topic for a subsequent survey.

Advice from the leaders

Admittedly, the survey for AI Adoption in the Enterprise drew from the initiated: 81% of respondents work for organizations that already use AI. We have much to learn from their collective experiences. For example, there’s a story unfolding in the contrast between mature practices and firms that are earlier in their journey toward AI adoption. Some of the key advice emerging from the mature organizations includes:

  • Work toward overcoming challenges related to company culture or not being able to recognize the business use cases.
  • Be mindful that the lack of data and lack of skilled people will pose ongoing challenges.
  • While hiring data scientists, complement by also hiring people who can identify business use cases for AI solutions.
  • Beyond just optimizing for business metrics, also check for model transparency and interpretability, fairness and bias, and that your AI systems are reliable and safe.
  • Explore use cases beyond deep learning: other solutions have gained significant traction, including human-in-the-loop, knowledge graphs, and reinforcement learning.
  • Look for value in applications of transfer learning, which is a nuanced technique the more advanced organizations recognize.
  • Your organization probably needs to invest more in infrastructure engineering than it thinks, perpetually.

This is a story about the relative mix of priorities as a team gains experience. That experience is often gained by learning from early mistakes. In other words, there’s quite a long list of potential issues and concerns that an organization might consider at the outset of AI adoption in enterprise. However, “Go invest in everything, all at once” is not much of a strategy. Advice from leaders at the more sophisticated AI practices tends to be: “Here are the N things we tried early and have learned not to prioritize as much.” We hope that these surveys offer helpful guidance that other organizations can follow.

This is also a story about how to pace investments and sequence large initiatives effectively. For example, you must address the more foundational pain points early—such as problems with company culture, or the lack of enough personnel who can identify the business uses—or those will become blockers for other AI initiatives down the road. Meanwhile, some investments must be ongoing, such as hiring appropriate talent and working to improve data sets. As an executive, don’t assume that one-shot initiatives will work as a panacea. These are ongoing challenges and you must budget for them as such.

Speaking of budget, firms are clearly taking the matter of AI adoption seriously, allocating significant amounts of their IT budgets for AI-related projects. Even if your firm isn’t, you can pretty much bet that the competition will be. Which side of that bet will pay off?

Heading toward a threshold point

Another issue that emerged from the surveys concerns messaging about AutoML. Adoption percentages for AutoML had been in single-digit territory in our earlier survey just two quarters ago. Now, we see many organizations making serious budget allocations toward integrating AutoML over the course of the next year. This is especially pronounced for the more mature practices: 86% will be integrating AutoML within the next year, nearly twice the share of evaluation-stage firms. That shift coincides almost precisely with cloud providers extending their AutoML offerings. For example, this was an important theme emphasized at Amazon’s recent re:Invent conference in Las Vegas. Both sides, demand and supply, are rolling the dice on AutoML in a big way.

Even so, there’s a risk that less-informed executives might interpret the growing uptake of AutoML as a signal that “AI capabilities are readily available off-the-shelf.” That’s anything but the case. The process of leveraging AI capabilities, even within the AutoML category, depends on multi-year transformations for organizations. That effort requires substantial capital investments and typically an extensive evolution of mindshare by the leadership. It’s not an impulse buy. Another important point to remember is that AutoML is only one portion of the automation that’s needed. See the recent Data Show Podcast interview “Building tools for enterprise data science” with Vitaly Gordon, VP of data science and engineering at Salesforce, about their TransmogrifAI open source project for machine learning workflows. It’s clear that automating the model building and model search step (the AutoML part) is just one piece of the puzzle.

We’ve also known, since studies published in 2017 and the analysis that followed, that a “digital divide” is growing in the enterprise between the leaders and the laggards in AI adoption. See the excellent “Notes from the frontier: Making AI work,” by Michael Chui at McKinsey Global Institute, plus the related report, AI adoption advances, but foundational barriers remain. What we observe now in Q4 2018 and Q1 2019 is that the mature practices are investing significantly, and based on lessons learned, they’re investing more wisely. However, most of the laggards aren’t even beginning to invest in crucial transformations that will require years. We cannot overstress how this demonstrates a growing divide between “haves” and “have nots” among enterprise organizations. At some threshold point relatively soon, the “have nots” might simply fall so many years behind their competitors that catching up is no longer worth the investment required.

Continue reading Three surveys of AI adoption reveal key advice from more mature practices.

Categories: Technology

