Feed aggregator

AI at scale at Coinbase

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Soups Ranjan describes the machine learning system that Coinbase built to detect potential fraud and fake identities.

Continue reading AI at scale at Coinbase.

Categories: Technology

How network professionals deal with attacks and disruptions

O'Reilly Radar - Thu, 2018/09/06 - 08:20

A new survey highlights concerns from network and cloud administrators, and reveals their coping strategies.

Reliability and response time are always pressing concerns for internet services. We’ve known for a long time that people on desktop or laptop computers have scant patience for slow websites, and the growing move to mobile devices makes the demands on internet services even worse.

O'Reilly Media and Oracle Dyn teamed up this year to survey network operators about where their resilience problems lie, what they're doing to avoid these problems, and what it's like to work in the field of network operations. The survey paid particular attention to issues of working in third-party cloud services, but covered a wide range of other problems in networking as well. Among the findings, we were surprised to hear the chief complaint of respondents in regard to resilience: their own ISP! We also uncovered a high concern over staff burnout, and disparities in the training of operators and in the handling of incidents.

This article starts with a brief description of the survey and who responded. Then we’ll get into the most interesting of the findings. We’ll end with some speculations about automation.

Who responded to the survey?

We collected answers to our survey from 621 network professionals with a variety of job descriptions (see Figure 1). Respondents work in many different industries, although IT services and software dominate. Organizational size was lopsided at the edges of the spectrum (Figure 2), with the largest fraction of respondents employing fewer than 200 people, and the next largest fraction having 10,000 or more. Small percentages represented in-between company sizes.

Figure 1. Industries where respondents work. Image: O'Reilly.

Figure 2. Number of employees in organizations where respondents work. Image: O'Reilly.

The majority of respondents taking the survey (459, or 74%) reported working in the cloud. These respondents were asked further questions about what resilience problems they experienced and what measures they took to deal with them (Figure 3). The most common uses for the cloud were apps/services (81% of cloud users) and development/testing (71%). Historically, we know that many organizations started their cloud use with development and testing before taking the leap of putting their production systems there. As one might expect, the cloud is also popular for web hosting, scaling compute power or storage, and backups.

Figure 3. How respondents use the cloud. Image: O'Reilly.

The biggest single problem: Your own ISP

The most surprising result from our survey may be the answer to what respondents said was the leading cause of network disruptions: ISP reliability issues (Figure 4), cited by 44% of respondents. If you add up the various malicious attacks reported, they collectively surpass ISP problems. But given the prevalence of ISP problems, it makes sense that some kind of redundancy is the most popular way to ensure resilience: 39% of respondents use load balancing and 33% use multiple failover sites. Smaller but still significant numbers run a second DNS service, use multiple cloud providers, and use multiple ISPs (Figure 5).
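As a rough illustration of the failover idea mentioned above, the following Python sketch picks the first site that passes a health check. The site names and the `pick_healthy_site` helper are hypothetical and are not taken from the survey; a real deployment would rely on its load balancer or DNS provider rather than hand-written code like this.

```python
from typing import Callable, Iterable, Optional


def pick_healthy_site(
    sites: Iterable[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first site that passes the health check, or None if all fail."""
    for site in sites:
        if is_healthy(site):
            return site
    return None


# Hypothetical sites in priority order; the primary is simulated as being down.
SITES = ["primary.example.net", "failover-a.example.net", "failover-b.example.net"]
DOWN = {"primary.example.net"}

chosen = pick_healthy_site(SITES, lambda site: site not in DOWN)
print(chosen)
```

The health check is passed in as a function so that the selection logic can be exercised without any network access.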

Figure 4. Disruptions experienced by cloud users. Image: O'Reilly.

Figure 5. How cloud users avoid disruptions. Image: O'Reilly.

The cloud has not overtaken the landscape

The large percentage of respondents who over-provision resources (19%) suggests that on-premises deployments are still common, even among sites that also use the cloud. Over-provisioning is a common practice in on-premises deployments, and is usually rendered unnecessary in the cloud by autoscaling. And it's worth noting that 29% of respondents rely entirely on the cloud provider for resilience (Figure 5, above). A variety of SLAs and incident response remedies are in use (Figure 6).

Figure 6. Use of SLAs and responses to incidents. Image: O'Reilly.

You’re not paranoid

Or you might as well be, because they’re out to get you. Attack vectors suggest that sites taking preemptive steps to mitigate the impact of attacks are justified. For example, among respondents indicating the use of DDoS mitigation tools, 51% (85 of 166 respondents, Figure 7) also indicated that their organization had actually experienced a DoS/DDoS attack in the last 12 months. This is nearly twice the share of all cloud users reporting DoS/DDoS attacks (121 of 459 respondents, or 26%).
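The comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the counts quoted in the text; the `share` helper and variable names are illustrative.

```python
def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Counts quoted in the text above.
mitigation_share = share(85, 166)   # DDoS-mitigation users who reported an attack
overall_share = share(121, 459)     # all cloud users who reported an attack

# How many times larger the first share is than the second.
ratio = mitigation_share / overall_share

print(f"{mitigation_share:.0f}% vs {overall_share:.0f}% (ratio {ratio:.2f})")
```

The ratio comes out a little under 2, which is what "nearly twice" refers to.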

Figure 7. Disruptions experienced, broken down by measures taken to reduce disruptions. Image: O'Reilly.

A few other correlations turn up between the disruptions experienced and the measures taken for resilience:

  • Using multiple failover sites that share traffic is a popular way to ensure resilience, but it is particularly popular among those who experience DoS attacks or IP route hijacks.
  • Companies that experience those attacks (especially DoS attacks) are more likely to have a secondary DNS service. So are firms having problems with ISP reliability.
  • Spam and phishing were cited as problems by a large number of sites in general (around one-third) but somewhat more by sites that perform black-holing or sink-holing.

Disruptions don’t happen often, but they have to be expected. Each organization was likely to experience just one or two disruptions, but as we explained at the beginning, even a single disruption can be a serious concern (just remember the famous Netflix Christmas outage). A lucky or highly disciplined 20% of respondents reported no disruptions or interruptions during the past year.

Neither organizational size nor industry type made a difference in the problems faced by organizations or the responses they implemented. Similarly, no particular worry dominated responses to the survey. Every concern was considered important by a substantial fraction of respondents, although none was cited by a majority. And all cloud users employed similar techniques to ensure resilience, regardless of their purpose for using the cloud.

Solutions are at hand—but not in use

It is interesting how many organizations fail to use certain preemptive tools that one might think are universal (Figure 8). For instance, among respondents who oversee the use of cloud services at their organization, more than one-third (37%) said they don’t use firewalls. Even more (42%) fail to use health monitoring. Similarly, DDoS mitigation is used by only 36% of respondents using the cloud. Perhaps the others don’t expect DDoS mitigation to be effective, or anticipate that their services can just wait out a DDoS attack. Or perhaps these respondents rely on the cloud provider to run these services, so some of the services might, in fact, be in place despite the responses we received.

Figure 8. Use of pre-emptive tools to avoid disruptions in the cloud. Image: O'Reilly.

This section wraps up our findings about outages, attacks, and technologies. Now we turn to the human element of network monitoring. Are there enough educated staff in this field? How are companies educating their staffs, and how does that affect responses to network disruptions?

Good network operators are at a premium

In 2015, network engineering was listed as one of the best career choices in computing. The article making that claim does not explain its criteria, but it apparently counts just the demand for such positions: 105,000 openings that year. Job conditions are a different matter, and our survey responses suggest that not all is well in the working conditions for network operators.

Respondents to the survey tend to move around. Fewer than 35% had been in their current jobs for more than five years. We didn't ask how long they had been working in total as network operators. But the greatest concerns respondents expressed about their staff were lack of experience (53%) and insufficient training (46%), both of which suggest that a lot of operators are fairly new to the field. Most organizations are willing to train new operators, either through formal training programs (38%) or informal mentoring (38%).

A substantial number of organizations (29%) hire only experienced people, a luxury that may elevate salaries throughout the field. One can expect people in this field to be highly employable, and therefore to move around a lot in search of more money or better work environments.

In this regard, it's significant that burnout was a major concern among respondents (46%), along with turnover (37%). This should not be surprising because in most institutions, network operations involve constant pressure. And no one likes those 3 a.m. wake-up calls.

Overall, respondents were concerned less about losing employees (turnover) and more about ensuring current employees remained satisfied in their jobs (i.e., that gaps in experience or training are addressed and that employees don’t burn out).

Sites handle outages and attacks differently

Many common forms of analysis and recovery are performed after an attack (Figure 9). The methods organizations use to respond to attacks or to ensure resilience are sometimes correlated with their efforts at skills development, and perhaps even influence staff burnout.

Figure 9. How organizations analyze and recover from disruptions. Image: O'Reilly.

Some examples of different responses include:

  • Among respondents who indicate that they rely completely on cloud service providers (135 respondents), only a little more than one-third indicated that their organization protects against burnout through either advanced training or career advancement opportunities (35% and 36%, respectively). These percentages are considerably lower than the use of training or other tools to prevent burnout by respondents who employ load balancing across sites or the use of multiple failover sites.
  • The approach used for training network ops teams (Figure 10) may influence how incidents are handled in postmortems. For example, just 37% of those who only hire experienced people indicated they document “lessons learned” to ensure preventable incidents don’t occur.
  • Among those who train their staff through internal programs, reliance on vendors, or employee "shadowing," around half also reported their postmortems documented lessons learned (48%, 52%, and 58%, respectively).
  • Rewards and recognition are the slightly preferred approach to addressing burnout (Figure 11). Notably, though, this approach does not appear to be considered enough for the majority. Of the 223 respondents who selected rewards and recognition (49% of total respondents), nearly two-thirds (65%, 145 respondents) also selected career advancement opportunities or advanced training to protect operations staff against burnout.

Figure 10. How network operators are trained to handle resilience. Image: O'Reilly.

Figure 11. How organizations attempt to prevent burnout. Image: O'Reilly.

Automation: Is there a better way?

The survey did not ask what forms of automation organizations use, but automation is probably lacking at many of them. Despite the current enthusiasm for DevOps in development, the 2017 Puppet "State of DevOps Report" found that about 40% of survey respondents still do manual configuration management and deployment.

Thus, we can surmise that a large number of network operators have to suffer through repetitive recovery tasks under high pressure, along with being tied to their pagers. Even among the 60% of organizations that do some automation, it is probably incomplete. As tools become more widespread and better understood—and especially as cloud providers make them simple to deploy—we can look forward to less burnout and turnover. In summary, given how widespread burnout appears to be, organizations should perhaps make automation a priority.


Overall, our survey suggests that organizations are surviving network outages or reliability problems pretty well. Concerns about resilience, and the measures taken to address them, are fairly consistent across industries, organizational size, and the attacks or failures encountered.

ISP failures are a major concern, vying in importance with malicious attacks. In response to both concerns, sites employ many forms of redundancy, ranging from over-provisioning resources to using multiple cloud providers and DNS services. On the other hand, many respondents are happy sticking their services in the cloud and allowing the cloud provider to deal with reliability.

Warning flags crop up in the treatment of network operators and their job satisfaction. The pressure to hire and retain operators can be addressed by more training or by reducing the need for such staff through automation. Measures that seem to be put in place for the benefit of junior engineers, such as documenting incidents, may end up being healthy for the organization as a whole.

Perhaps this article will encourage more managers of network operations to take a closer look and expand their use of tools for preventing and mitigating problems. Currently, according to the results of the survey, such practices are not as widespread as one would think. We also suggest that operators put more effort into, and be more consistent in, their postmortem handling of incidents, invest in training, and improve the jobs through modern automated practices to prevent burnout.

This post is a collaboration between O’Reilly and Oracle Dyn. See our statement of editorial independence.

Continue reading How network professionals deal with attacks and disruptions.

Categories: Technology

Four short links: 6 September 2018

O'Reilly Radar - Thu, 2018/09/06 - 03:55

BS in AI, Visual Exploration, Bad Predictions, and USB-C Development

  1. CMU's AI Bachelor's Degree -- ethics course mandatory, likewise seven humanities courses. Nice.
  2. GANlab -- interactive visualization of what's happening in a generative adversarial network, as well as an easy-to-read explanation.
  3. Errors, Insights, and Lessons of Famous AI Predictions -- These case studies illustrate several important principles, such as the general overconfidence of experts, the superiority of models over expert judgement, and the need for greater uncertainty in all types of predictions. The general reliability of expert judgement in AI timeline predictions is shown to be poor, a result that fits in with previous studies of expert competence.
  4. USB-C Explorer -- a development board with everything needed to start working with USB Type-C. It contains a USB-C port controller and Power Delivery PHY chip, a microcontroller, and several options for user interaction.

Continue reading Four short links: 6 September 2018.

Categories: Technology

Four short links: 5 September 2018

O'Reilly Radar - Wed, 2018/09/05 - 04:15

Atomic Receiver, Nerdery as AR, Open Access, and Journey Maps

  1. An Atomic Receiver for AM and FM Radio Communication -- lasers detect fluctuations in the outer shell of "Rydberg vapors" (a special form of Cesium) that are caused by radio waves. See also MIT Tech Review.
  2. Geology is Like AR for the Planet (Wired) -- looking at the planet through a geologic lens is something like strapping on an augmented-reality headset. It invites you, from your vantage point in the present, to summon up Earth’s deep past and far future—to see these parallel worlds with your own eyes, like digital overlays. All nerd-level expertise is awesome for this reason. Try going bar-hopping with a bar owner who can talk about fit-out costs, eyelines, liquor choices, branding, etc. Nothing is boring if you know enough about it. (via Dan Hon)
  3. Radical Open-Access Plan (Nature) -- Eleven research funders in Europe announce "Plan S" to make all scientific works free to read as soon as they are published.
  4. Journey Maps -- A journey map is a collection of customer research most recognizable by its timeline—a visual depiction of every touch point customers have with the product or business, laid out from left to right. [...] Seeing the journey visually helps reveal the emotional landscape of the customer, which helps the product, marketing, customer support, and analytics teams understand what users feel at each point and identify ways the team can improve the experience. Steps and advice on how to build them.

Continue reading Four short links: 5 September 2018.

Categories: Technology

Learn about data governance with these books, videos, and tutorials

O'Reilly Radar - Wed, 2018/09/05 - 03:00

This collection of data governance resources will get you up to speed on the basics and best practices.

Whether you’re just getting started with data governance or you have previous experience, you’ll find something useful on this list of data governance resources.

The items on this list were curated by O’Reilly’s editorial experts.

Continue reading Learn about data governance with these books, videos, and tutorials.

Categories: Technology

130+ live online training courses opened for September and October

O'Reilly Radar - Wed, 2018/09/05 - 03:00

Get hands-on training in machine learning, blockchain, Java, software architecture, leadership, and many other topics.

Learn new topics and refine your skills with more than 130 live online training courses we opened up for September and October on our learning platform.

Artificial intelligence and machine learning

High Performance TensorFlow in Production: Hands on with GPUs and Kubernetes, September 11-12

Deep Learning for Machine Vision, September 20

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, September 24-25

Deep Learning for Natural Language Processing (NLP), October 1

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, October 1-2

Artificial Intelligence for Big Data, October 1-2

Artificial Intelligence: AI For Business, October 2

Managed Machine Learning Systems and Internet of Things, October 4-5

Machine Learning with R, October 10-11

Machine Learning in Practice, October 12

Getting Started with Machine Learning, October 15

Blockchain

Blockchain Applications and Smart Contracts, October 11

Understanding Hyperledger Fabric Blockchain, October 18-19

Introducing Blockchain, October 31

Business

Employee Onboarding for Managers, September 6

Introduction to Employee Performance Management, September 18

Introduction to Leadership Skills, October 2

Employee Onboarding for Managers, October 4

Leadership Communication Skills for Managers, October 8

Managing Team Conflict, October 9

Negotiation Fundamentals, October 10

Applying Critical Thinking, October 15

Mastering Usability Testing, October 30

Performance Goals for Growth, October 31

Data science and data tools

Kafka Fundamentals, September 10-11

Building Distributed Pipelines for Data Science Using Kafka, Spark, and Cassandra, September 10-12

Introduction to DAX: Elevate your Data Models with Powerful Calculations, September 13

Programming with Data: Python and Pandas, September 17

Advanced SQL Series: Relational Division, September 19-20

Mastering Relational SQL Querying, September 19-20

SQL for any IT Professional, October 4

Julia 1.0 Essentials, October 8

Building Distributed Pipelines for Data Science Using Kafka, Spark, and Cassandra, October 10-12

Shiny R, October 17

Practicing Agile Data Science, October 19

Fundamental PostgreSQL, October 24-25

Hands-On Introduction to Apache Hadoop and Spark Programming, October 24-25

Introduction to DAX: Elevate your Data Models with Powerful Calculations, October 29

Programming

Java Full Throttle with Paul Deitel: A One-Day, Code-Intensive Java Standard Edition Presentation, September 11

Design Patterns in Java, September 18-19

Linux Under the Hood, September 20

Java 8 Generics in 3 Hours, September 21

Bash Shell Scripting in 3 Hours, September 26

What's New in Java, September 28

Getting Started with Computer Vision Using Go, October 1

Consumer Driven Contracts - A Hands-On Guide to Spring Cloud Contract, October 3

Functional Programming in Java, October 3-4

Reactive Programming with Java 8 Completable Futures, October 4

Beginning IoT with JavaScript, October 4-5

JavaScript The Hard Parts: Closures, October 5

Linux Filesystem Administration, October 8-9

Getting Started with Spring and Spring Boot, October 8-9

OCA Java SE 8 Programmer Certification Crash Course Java Certification, October 8-10

Reactive Spring and Spring Boot, October 10

Learn the Basics of Scala in 3 hours, October 10

Scala Fundamentals: From Core Concepts to Real Code in 5 Hours, October 11

Using Redux to Manage State in Complex React Applications, October 11

Clean Code, October 15

Basic Android Development, October 15-16

Object-Oriented GUI design in Java, October 16

Programming with Java 8 Lambdas and Streams, October 16

Design Patterns in Java GUI Development, October 17

Next-Generation Java Testing with JUnit 5, October 17

Fundamentals of Virtual Reality Technology and User Experience, October 17

Setting up Scala Projects, October 19

Python Programming Fundamentals, October 19

Getting Started with Java: From Core Concepts to Real Code in 4 Hours, October 22

Kotlin for Android, October 22-23

Scala: Beyond the Basics, October 22-23

Java Testing with Mockito and the Hamcrest Matchers, October 24

Mastering Go for UNIX administrators, UNIX developers and Web Developers, October 24-25

Object Oriented Programming in C# and .NET Core, October 26

Intermediate Git, October 29

Groovy Programming for Java Developers, October 30-31

Modern JavaScript, November 29

Security

Certified Ethical Hacker (CEH) Crash Course, September 24-25

CCNP R/S ROUTE (300-101) Crash Course, September 25-27

Introduction to Encryption, October 2

Introduction to Digital Forensics and Incident Response (DFIR), October 5

CISSP Crash Course, October 17-18

Cyber Security Fundamentals, October 22-23

Certified Ethical Hacker (CEH) Crash Course, October 25-26

Software architecture

Domain-Driven Design and Event-Driven Microservices, September 17-18

From Monolith to Microservices, September 19-20

Amazon Web Services: Architect Associate Certification - AWS Core Architecture Concepts, September 20-21

Information Architecture: Research and Design, September 25

Implementing Evolutionary Architectures, September 26-27

Microservices Architecture and Design, October 1-2

From Developer to Software Architect, October 2-3

From Monolith to Microservices, October 17-18

Architecture by Example, October 17-18

Amazon Web Services: AWS Design Fundamentals, October 17-18

Shaping and Communicating Architectural Decisions, October 22

Amazon Web Services: Architect Associate Certification - AWS Core Architecture Concepts, October 22-23

AWS Certified Solutions Architect Associate Crash Course, October 22-23

Implementing Evolutionary Architectures, October 24-25

Systems engineering and operations

AWS CloudFormation Deep Dive, September 20-21

Ansible in 3 Hours, September 21

Amazon Web Services Security Crash Course, September 25

AWS Certified Cloud Practitioner Crash Course, September 25-26

Docker: Beyond the Basics (CI/CD), September 26-27

Deploying Container-Based Microservices on AWS, October 1-2

9 Steps to Awesome with Kubernetes, October 3

Google Cloud Platform (GCP) for AWS Professionals, October 3

Google Cloud Certified Associate Cloud Engineer Crash Course, October 4-5

Learn Serverless Application Development with Webtask, October 8

Amazon Web Services: AWS Managed Services, October 8-9

CCNA Security Crash Course, October 9-10

Red Hat Certified Engineer (RHCE) Crash Course, October 9-12

Getting Started with Continuous Delivery (CD), October 11

Practical Kubernetes, October 11-12

Jenkins 2: Beyond the Basics, October 15

Introduction to Google Cloud Platform, October 15-16

Serverless Architectures with Azure, October 15-16

CCNA Routing and Switching 200-125 Crash Course, October 16, 18, 23, 25

Hands-On with Google Cloud AutoML, October 19

Introduction to Kubernetes, October 22-23

Red Hat Certified System Administrator (RHCSA) Crash Course, October 23-26

Practical Docker, October 24

Building and Managing Kubernetes Applications, October 25

AWS Monitoring Strategies, October 29

CCNP R/S SWITCH (300-115) Crash Course, October 29-31

Building a Cloud Roadmap, October 30

An Introduction to DevOps with AWS, October 30

Linux Foundation System Administrator (LFCS) Crash Course, October 30-31

Web programming

Using Redux to Manage State in Complex React Applications, September 13

Advanced Angular Applications with NgRx, September 24-25

First Steps with Angular, September 26

Getting Started with HTML and CSS, September 27

Rethinking REST: A Hands-On Guide to GraphQL and Queryable APIs, October 1

Better Angular Applications with Observables: QuickStart, October 5

Beginning API Development with Node.js, October 9-10

Component Driven Architecture in Angular, October 10

Angular Testing Quickstart, October 12

Advanced Angular Applications with NgRx, October 18-19

Developing Web Apps with Angular and TypeScript, October 29-31

Full Stack Development with MEAN, October 30-31

Continue reading 130+ live online training courses opened for September and October.

Categories: Technology

Four short links: 4 September 2018

O'Reilly Radar - Tue, 2018/09/04 - 04:50

New Hardware, Image Discovery, Interactive SQL, and Fooling Object Detection

  1. GATech Rogues Gallery -- acquire new and unique hardware (i.e., the aforementioned "rogues") from vendors, research labs, and startups, and make this hardware available to students, faculty, and industry collaborators within a managed data center environment. By exposing students and researchers to this set of unique hardware, we hope to foster cross-cutting discussions about hardware designs that will drive future performance improvements in computing long after the Moore's Law era of "cheap transistors" ends. (via Next Platform)
  2. The Art and Science of Image Discovery at Netflix -- really interesting breakdown of the process they go through to automatically identify good stills to use as ads for the video.
  3. Select Star SQL -- an interactive book that aims to be the best place on the internet for learning SQL. Nice. SQL and notebooks are a great idea, especially for education.
  4. The Elephant in the Room -- We showcase a family of common failures of state-of-the art object detectors. These are obtained by replacing image sub-regions by another sub-image that contains a trained object. We call this "object transplanting." Modifying an image in this manner is shown to have a non-local impact on object detection. Slight changes in object position can affect its identity according to an object detector as well as that of other objects in the image. We provide some analysis and suggest possible reasons for the reported phenomena.

Continue reading Four short links: 4 September 2018.

Categories: Technology

Highlights from JupyterCon in New York 2018

O'Reilly Radar - Fri, 2018/08/24 - 09:05

Watch keynotes covering Jupyter's role in business, data science, higher education, open source, journalism, and other domains, from JupyterCon in New York 2018.

People from across the Jupyter community are coming together in New York for JupyterCon. Below you'll find links to keynotes from the event.

All the cool kids are doing it, maybe we should too? Jupyter, gravitational waves, and the LIGO and Virgo Scientific Collaborations

Will Farr offers lessons about the many advantages and few disadvantages of using Jupyter for global scientific collaborations.

Jupyter trends in 2018

Paco Nathan shares a few unexpected things that emerged in Jupyter in 2018.

Sustaining wonder: Jupyter and the knowledge commons

Carol Willing shows how Jupyter's challenges can be addressed by embracing complexity and trusting others.

Jupyter in the enterprise

Luciano Resende explores some of the open source initiatives IBM is leading in the Jupyter ecosystem.

The reporter’s notebook

Mark Hansen explains how computation has forever changed the practice of journalism.

Why contribute to open source?

Julia Meinwald outlines effective ways to support the unseen labor maintaining a healthy open source ecosystem.

Machine learning and AI technologies and platforms at AWS

Dan Romuald Mbanga walks through the ecosystem around the machine learning platform and API services at AWS.

Democratizing data

Tracy Teal explains how to bring people to data and empower them to address their questions.

The future of data-driven discovery in the cloud

Ryan Abernathey makes the case for the large-scale migration of scientific data and research to the cloud.

Keynote by Michelle Ufford


Jupyter notebooks and the intersection of data science and data engineering

David Schaaf explains how data science and data engineering can work together to deliver results to decision makers.

Disease prediction using the world's largest clinical lab dataset


Data science as a catalyst for scientific discovery

Michelle Gill discusses how data science methods and tools can link information from different scientific fields and accelerate discovery.

Sea change: What happens when Jupyter becomes pervasive at a university?

Fernando Perez talks about UC Berkeley's transition into an environment where many undergraduates use Jupyter and the open data ecosystem as naturally as they use email.


Continue reading Highlights from JupyterCon in New York 2018.

Categories: Technology

Sustaining wonder: Jupyter and the knowledge commons

O'Reilly Radar - Fri, 2018/08/24 - 09:00

Carol Willing shows how Jupyter's challenges can be addressed by embracing complexity and trusting others.

Continue reading Sustaining wonder: Jupyter and the knowledge commons.

Categories: Technology

Jupyter trends in 2018

O'Reilly Radar - Fri, 2018/08/24 - 09:00

Paco Nathan shares a few unexpected things that emerged in Jupyter in 2018.

Continue reading Jupyter trends in 2018.

Categories: Technology

Jupyter in the enterprise

O'Reilly Radar - Fri, 2018/08/24 - 09:00

Luciano Resende explores some of the open source initiatives IBM is leading in the Jupyter ecosystem.

Continue reading Jupyter in the enterprise.

Categories: Technology

The reporter’s notebook

O'Reilly Radar - Fri, 2018/08/24 - 09:00

Mark Hansen explains how computation has forever changed the practice of journalism.

Continue reading The reporter’s notebook.

Categories: Technology

Why contribute to open source?

O'Reilly Radar - Fri, 2018/08/24 - 09:00

Julia Meinwald outlines effective ways to support the unseen labor maintaining a healthy open source ecosystem.

Continue reading Why contribute to open source?.

Categories: Technology

Machine learning and AI technologies and platforms at AWS

O'Reilly Radar - Fri, 2018/08/24 - 09:00

Dan Romuald Mbanga walks through the ecosystem around the machine learning platform and API services at AWS.

Continue reading Machine learning and AI technologies and platforms at AWS.

Categories: Technology

Four short links: 24 August 2018

O'Reilly Radar - Fri, 2018/08/24 - 04:00

Scheduling Notebooks, Telepresence Parasite, Bite-Size ML Tutorials, and AI Data Sheets

  1. Scheduling Notebooks -- we’re currently in the process of migrating all 10,000 of the scheduled jobs running on the Netflix Data Platform to use notebook-based execution.
  2. Fusion: A Collaborative Robotic Telepresence Parasite That Lives on Your Back -- I'm in favor of any telepresence system that lets me remotely punch people.
  3. 100 Days of ML Code -- tutorials, open sourced.
  4. Fact Sheet for AI (IBM) -- Fairness, safety, reliability, explainability, robustness, accountability—we all agree they are critical. Yet, to achieve trust in AI, making progress on these issues will not be enough; it must be accompanied by the ability to measure and communicate the performance levels of a system on each of these dimensions. One way to accomplish this would be to provide such information via SDoCs or factsheets for AI services.

Continue reading Four short links: 24 August 2018.

Categories: Technology

5 automation trends in software development, quantified

O'Reilly Radar - Thu, 2018/08/23 - 06:55

Lessons from hundreds of development practice assessments across the industry.

For more than 15 years, my colleagues and I at the Software Improvement Group (SIG) have been in the business of evaluating the quality of code, architecture, and development practices for our customers.

Recently, we dove into our assessment data to discover—and quantify—trends in software development, each time comparing 2016 to 2017. Below, we summarize our findings into five quantified trends, then discuss the developments that drive them. Spoiler alert: they are all about automation.

Trend 1: The percentage of teams that do not automate deployment shrank from 26% in 2016 to 11% in 2017

We measure the practice of deployment automation by looking at whether teams have a deployment process in place that is quick, repeatable, and (preferably) fully automated.

For example, a team that deploys each new release with a single push of a button would receive a perfect score on this practice, codified as fully applied. But a team that needs to go through a limited number of well-documented manual steps would be scored as partially applied.

Figure 1. More teams fully apply deployment automation (43%, up from 30%) and fewer teams do not apply deployment automation at all (11%, down from 26%). Image by Joost Visser.

Trend 2: Teams that fully apply continuous integration (CI) now outnumber those that don’t (41% versus 32% in 2017; was 33% versus 39% in 2016)

The trend for continuous integration (automatic compilation and testing after each change) lags behind the trend for deployment automation, but overall, it shows a similar improvement.

Figure 2. Full or partial adoption of continuous integration (68% in 2017) has improved significantly but still lags compared to deployment automation (89% in 2017). Image by Joost Visser.

Trend 3: The number of teams that apply continuous delivery (CD) is a small but growing minority (16% in 2017 versus 11% in 2016)

Especially for continuous delivery (automatic deployment after each change), the great majority of teams (and the organizations of which they are part) still have a long way to go. But their numbers are growing.

Figure 3. Most teams still do not apply continuous delivery, either fully (16%; was 11%) or partially (12%; was 6%). Image by Joost Visser.

Trend 4: More teams are enforcing full quality control (31% in 2017, up from 23% in 2016)

To assess code quality control, we observe whether a team works with clear coding standards, systematically reviews code (against these standards and against principles of good design), and performs automated code quality checks. Full adoption of code quality control has increased somewhat (31%, up from 23%), but 20% of teams are still producing code without adequate quality control in place.

Figure 4. Fewer teams are failing to enforce consistent code quality control (20%, down from 25%). Image by Joost Visser.

Trend 5: The number of teams that change their code without proper regression testing is declining but was still a staggering 41% in 2017 (down from 48% in 2016)

To assess testing practices, we observe whether teams have an automated regression test suite that is being executed consistently after each change. Full adoption of this practice is increasing (33%, up from 29%). But changing code without proper regression testing is still uncannily common (41%; was 48%).

Figure 5. Fewer teams fail to run automated tests at each commit (41%, down from 48%). Image by Joost Visser.

So what do all of these numbers mean?

End-to-end automation of the software development process has gone mainstream. While automation of individual programming tasks (compilation, testing, documentation, etc.) has been part and parcel of our discipline for many years, we now see that all modern software development teams are striving to automate as much of the software development process as possible.

Automation helps to increase development speed, limit knowledge dissipation, and build quality into every step. If your team isn’t automating, now is the time to start.
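The percentages quoted for the five trends can be tabulated to recover the bucket that the text mostly leaves implicit. What follows is a minimal sketch in Python, assuming each practice is scored in exactly three buckets (fully applied, partially applied, not applied) that sum to 100%. The dictionary layout and the function name `fill_missing` are illustrative only and are not part of SIG's assessment tooling.

```python
# Shares (percent of teams) quoted in the five trends.
# None marks the one bucket the text does not state directly.
REPORTED = {
    "deployment automation": {
        2016: {"full": 30, "partial": None, "none": 26},
        2017: {"full": 43, "partial": None, "none": 11},
    },
    "continuous integration": {
        2016: {"full": 33, "partial": None, "none": 39},
        2017: {"full": 41, "partial": None, "none": 32},
    },
    "continuous delivery": {
        2016: {"full": 11, "partial": 6, "none": None},
        2017: {"full": 16, "partial": 12, "none": None},
    },
    "code quality control": {
        2016: {"full": 23, "partial": None, "none": 25},
        2017: {"full": 31, "partial": None, "none": 20},
    },
    "regression testing": {
        2016: {"full": 29, "partial": None, "none": 48},
        2017: {"full": 33, "partial": None, "none": 41},
    },
}


def fill_missing(shares):
    """Derive the unstated bucket, ASSUMING the three buckets sum to 100."""
    known = sum(value for value in shares.values() if value is not None)
    return {key: (100 - known if value is None else value)
            for key, value in shares.items()}


for practice, years in REPORTED.items():
    for year, shares in sorted(years.items()):
        s = fill_missing(shares)
        adopted = s["full"] + s["partial"]
        print(f"{practice:24} {year}: full {s['full']}%, "
              f"partial {s['partial']}%, none {s['none']}%, "
              f"full or partial {adopted}%")
```

Under that assumption, the 2017 rows give 89% full or partial adoption for deployment automation and 68% for continuous integration, which agrees with the Figure 2 caption.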

If you’d like to learn where your team stands in terms of the above practices and trends, you can do a quick self-assessment by taking our survey.

This post is a collaboration between O'Reilly and SIG. See our statement of editorial independence.

Continue reading 5 automation trends in software development, quantified.

Categories: Technology

Design patterns for orchestrating collaborative groups

O'Reilly Radar - Thu, 2018/08/23 - 04:00

It’s only when you enable people to “do things” together that the real power of online social networks is unleashed.

I still remember the moment I saw a big piece of the future. It was mid-1999, and Dave Winer called to say there was something I had to see.

He showed me a web page. I don’t remember what the page contained except for one button. It said, "edit_this_page"—and, for me, nothing was ever the same again.

I clicked the button. Up popped a text box containing plain text and a small amount of HTML, the code that tells a browser how to display a given page. Inside the box I saw the words that had been on the page. I made a small change, clicked another button that said, “Save this page” and voilà, the page was saved with the changes....

Dave was a leader in a move that brought back to life the promise, too long unmet, that Tim Berners-Lee, inventor of the Web, had wanted from the start. Berners-Lee envisioned a read/write Web. But what had emerged in the 1990s was an essentially read-only Web on which you needed an account with an ISP to host your web site, special tools, and/or HTML expertise to create a decent site.

What Dave and the other early blog pioneers did was a breakthrough. They said the Web needed to be writeable, not just readable, and they were determined to make doing so dead simple.

Thus, the read/write Web was truly born again.



The first thing I ever posted on the Web was personal: a story. The next thing I produced and posted was collaborative: a magazine. This was 1994. We knew we wanted to engage newcomers more fully than the traditional letters to the editor allowed, and many of the letters (and email messages) we received at the time were submissions. People wanted to work with us and we wanted to work with them, and they came from all over the world. We managed for four years with no system in place besides a series of personal understandings, but any form of collaboration requires some form of orchestration, and our ad hoc approach didn’t scale.

In the earliest days of online social-networking applications (think SixDegrees and Friendster), there eventually came the “so what” problem: you could make an account, register your name, find people, connect to them, and then... what? There was no there there. You might be able to form groups and discuss things, but of course you could already do that through a lot of other interfaces (such as email lists and Usenet, for Pete’s sake), even if they weren’t explicitly noted as social.

No, it’s only when you begin enabling people to “do things” together that the real power of online social networks kicks in.

Today, it’s possible to orchestrate collaborative groups through a series of time-tested, well-proven design patterns. These patterns provide people with a shared space, give them a way to invite others, provide the means for managing tasks, employ version control, and look after people’s rights.

Wiki projects, such as the omnipresent Wikipedia; open source software development using tools such as SourceForge, CollabNet, and GitHub; Yahoo! Groups; and charismatically driven groups of people such as Ze Frank’s fanbase, which you can see in Figure 1-1, have all demonstrated the power that can be unleashed when you give people interfaces for working together on their shared concerns.

Figure 1-1. Ze Frank’s “If the earth were a sandwich” challenge recruited numerous participants into attempting to place slices of bread at antipodes, to turn the planet into a sandwich.

Manage Project

What

When people get together and form groups, they often discover a shared desire to accomplish something tangible or complex, frequently something with a real-world (offline) impact (see Figure 1-2).

Figure 1-2. You can use most social interfaces to organize projects by sheer force of effort, but it’s easier if you’ve got at least the fundamentals of project management available, such as tasks, calendars, file upload, and collaborative editing.

This pattern is also known as a “Workspace” pattern.

Use when

Use this pattern when you have enabled group formation and wish to host and support group project activities. If you don’t have the bandwidth (literally or figuratively) to support this, consider supporting third-party services.

How

  • Support your members’ ability to orchestrate projects by coordinating goals, tasks, and deadlines among multiple participants with varying degrees of commitment and availability.

  • Provide a workspace for connecting all the facets of the project (people, tasks, dates, collateral) and, if possible, offer a summarized dashboard view linking to more detailed inventories by facet. This makes asynchronous communication possible across disconnected geographies.

  • Provide a mechanism for the creator of the project or a participant to bring in collaborators with Send Invitation, and possibly to assign varying rights by individual or group, as shown in Figure 1-3.

    Figure 1-3. With Basecamp, you can add an entire company (team) to your project or invite individuals by adding them to an existing company.
  • Support task management with the ability to assign tasks, accept tasks, and distribute processes among multiple participants by breaking them down into individual tasks. Optionally support the ability to declare that one task is dependent on another and possibly calculate the critical path to the end goal.

  • Provide a calendar on which deadline and milestone dates can be scheduled and then verified.

  • Offer the ability to send messages to project participants, as well as reminders and notifications.

  • Provide a means for collaborative editing of documents or source code, including version control (Figure 1-4).

    Figure 1-4. At GitHub, I can make my own clone of the YUI 3.0 codebase, fork it, and then have it merged back into the main trunk.
  • Include a way for project participants to make and keep track of decisions.

  • Optionally provide an interface for project blogging or statuscasting so that project participants can report on their progress and anyone can see at a glance what has been happening lately on the project. Or, provide a timeline view on the dashboard to roll up all recent events in chronological order.
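The task-management item in the list above mentions declaring that one task depends on another and calculating the critical path to the end goal. Here is a minimal sketch of that calculation, assuming each task has a fixed duration and a set of prerequisite tasks. The task names, durations, and the function name `critical_path` are hypothetical and do not come from any of the products named in this pattern.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical project: task -> (duration in days, prerequisite tasks)
TASKS = {
    "draft": (3, set()),
    "design": (2, set()),
    "review": (1, {"draft", "design"}),
    "publish": (1, {"review"}),
}


def critical_path(tasks):
    """Return (total_days, task names) for the longest dependency chain."""
    graph = {name: deps for name, (_, deps) in tasks.items()}
    finish = {}    # earliest finish time of each task
    previous = {}  # predecessor on the longest chain into each task
    # static_order() yields every task after all of its prerequisites.
    for name in TopologicalSorter(graph).static_order():
        duration, deps = tasks[name]
        start = max((finish[d] for d in deps), default=0)
        previous[name] = max(deps, key=lambda d: finish[d]) if deps else None
        finish[name] = start + duration
    end = max(finish, key=finish.get)
    total = finish[end]
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = previous[node]
    return total, path[::-1]


print(critical_path(TASKS))
```

For the sample data the longest chain runs through the draft, the review, and publication. A circular dependency makes `TopologicalSorter` raise `CycleError`, which is a reasonable point at which to reject the offending dependency in the interface.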


Why

Giving your community members the tools to work together or comanage their own efforts increases the utility of your service and the culture of the social environment. However, your users can often do this effectively via email and phone and perhaps a filesharing system. Do you have anything more to offer? Do you need to?

Related patterns


Face-to-Face Meeting

Group Conversation

Open APIs

Send Invitation

Activity Streams

As seen on



Basecamp

Bugzilla

GitHub

Groove

SharePoint

Traction

Voting

What

To make decisions, the members or stakeholders of a group need a way to give their opinions, and project leaders need to know which options have the most support from the participating community, as illustrated in Figure 1-5.

Figure 1-5. Polls are one way to gather directed input from collaborators.

This pattern is also known as “Polls” or “Surveys.”

Use when

Use this pattern to collect the opinion of a group of people around a topic (with facets).

This pattern works best when groups are large enough that only a core subgroup is doing most of the collaboration, to provide a voice to the less fully engaged members of the group.

Voting can be used in an enterprise or workgroup context. It can integrate equally well in a consumer context, in which a group of people freely associating with one another need ways to discern their preferences and make collective decisions.


How

Provide a form by which a group moderator or participant can suggest a question or topic to be voted on, and then facilitate a series of possible votes (anything from “yes” or “no” to a multiple-choice option).

Optionally, provide configuration choices governing such issues as how long the voting will remain open, whether users can change their vote, whether votes are anonymous or open, whether a person is restricted to vote for a single choice or can vote for more than one, or whether a ranking of choices is preferred, as demonstrated in Figure 1-6.

Figure 1-6. Yahoo! Groups makes it easy to create an instant poll and invite the members of the group to vote in it, including the ability to change their vote up to a deadline.

Why

Voting and surveys provide a means of soliciting feedback about specific questions from a wider participating community.

Note that it’s possible for some users to game voting systems, especially if no fixed identity is required or authenticated before voting; voting can provide perverse incentives in much the way that a leaderboard can; and there are many competing voting algorithms out there, each with its own pros and cons.
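The configuration choices this pattern describes (a deadline, whether votes can be changed, and single versus multiple choice) can be sketched as a small data structure. This is an illustrative sketch only: the class name `Poll`, its fields, and the plain count-the-ballots tally are assumptions, and it is not a description of how Yahoo! Groups or any other product implements polls. Keying ballots by voter identity is one simple guard against the gaming mentioned above, since it limits each identified voter to one ballot.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Poll:
    question: str
    options: list
    closes_at: datetime
    allow_change: bool = True   # may a voter revise a ballot before the deadline?
    max_choices: int = 1        # 1 = single choice; more = pick several
    ballots: dict = field(default_factory=dict)  # voter id -> set of options

    def vote(self, voter_id, choices, now):
        choices = set(choices)
        if now >= self.closes_at:
            raise ValueError("poll is closed")
        if voter_id in self.ballots and not self.allow_change:
            raise ValueError("vote already cast and changes are disabled")
        if not choices or len(choices) > self.max_choices:
            raise ValueError(f"choose between 1 and {self.max_choices} options")
        if not choices <= set(self.options):
            raise ValueError("unknown option")
        self.ballots[voter_id] = choices  # replaces any earlier ballot

    def tally(self):
        """Counts per option; voter ids are not exposed in the result."""
        counts = Counter({option: 0 for option in self.options})
        for choices in self.ballots.values():
            counts.update(choices)
        return dict(counts)
```

A ranked-choice variant would need a different ballot shape and counting rule. As noted above, there are many competing algorithms, and this sketch deliberately uses the simplest one.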

Related patterns

Ratings (Stars or 1–5)

Reputation Influences Behavior

Thumbs Up/Down Ratings

Vote to Promote

As seen on

Evite

SurveyMonkey

Yahoo! Groups

Collaborative Editing

What

People like to be able to work together on documents, encyclopedias, and software codebases, as depicted in Figure 1-7.

Figure 1-7. With asynchronous editing, multiple people can work on the same document.

Use when

Use this pattern when you want your members to be able to work together to curate their collective wisdom or document their shared knowledge.

How

  • Provide a repository for hosting documents with version control. Give users a way to bring in additional collaborators with an invitation to participate, as in Figure 1-8.

    Figure 1-8. You can use the “Invite to Participate” pattern to invite collaborators to work together on a document.
  • Provide an Edit This Page link (see Edit This Page) directly on the document to be edited, or give users the means to upload incrementally updated versions of a stored document.

  • For direct editing, provide an edit box, similar to a blog or comment interface, such as that shown in Figure 1-9.

    Figure 1-9. It doesn’t get more meta than this: here I am editing this very pattern in the collaborative wiki where it lives outside of the book.
  • Optionally, give contributors mechanisms for tracking changes, whether via notifications or with RSS feeds.
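The list above asks for a repository with version control plus a way to invite collaborators. A minimal in-memory sketch of those two pieces follows, assuming whole-document revisions rather than diffs. The class and method names (`Document`, `invite`, `save`, `revert_to`) are hypothetical and are not taken from any of the wikis or editors named in this pattern.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int
    author: str
    text: str
    comment: str = ""


@dataclass
class Document:
    title: str
    collaborators: set = field(default_factory=set)
    revisions: list = field(default_factory=list)

    def invite(self, user):
        """Bring an additional collaborator into the document."""
        self.collaborators.add(user)

    def save(self, author, text, comment=""):
        """Store a new revision; earlier revisions are never overwritten."""
        if author not in self.collaborators:
            raise PermissionError(f"{author} has not been invited to edit")
        revision = Revision(len(self.revisions) + 1, author, text, comment)
        self.revisions.append(revision)
        return revision

    def current(self):
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, number, author):
        """Restore an earlier revision by saving it again as the newest one."""
        old = self.revisions[number - 1]
        return self.save(author, old.text, f"revert to revision {number}")
```

Because a revert is itself stored as a new revision, the history stays complete, and the revision list is also a natural source for the change notifications mentioned in the last item.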


Why

Collaborative editing is better suited to online (web or cloud) contexts than the alternative: sending documents via email to multiple participants and then orchestrating the proliferating, asynchronously updated copies of a document, with aspirational filenames ending in “finalFinalfinal,” as painfully demonstrated in Figure 1-10.

Figure 1-10. Collaborative editing does away with multiple copies of files, unreconciled changes, and email overload.

Related patterns


As seen on

Drupal

Google Docs

MediaWiki

SocialText

SubEthaEdit

TWiki

Wikipedia

Writeboard

Numerous FAQ documents that accompany active Usenet newsgroups


Also known as suggested edits, proposed changes, tracked changes (“track changes”), or pull requests.


People like to be able to ask for help with a document without always wanting to give direct editing control over the content itself. Other people feel more comfortable proposing changes and allowing the owner of the document to accept or reject the changes, as shown in Figure 1-11.

Figure 1-11. Medium enables readers (and invited reviewers prepublication) to add comments in the margin of any paragraph, which remain unpublished until the author has a chance to review and either make the comment public or delete it.

This same model applies outside of documents as well—for example, when a Git user makes a pull request, proposing that a change be added to the core.

Use when

Use this pattern when more fine-grained control over editing permissions or review flow will help people be more productive.


How

Define a role or a mode in which edits or comments are not accepted or made public automatically but are instead left in a pending state for review by the document owner, who can choose to accept or implement the suggestion, or reject or ignore it.

There are two forms of this pattern, which can be used individually or mixed together:

  • Pending edits (as with the Track Changes feature in Microsoft Word or the suggestion feature offered by Google Docs), in which a proposed edit is entered directly into the document and then either implemented or rejected.

  • Pending comments (as with Medium), in which a suggestion is made in the context around the document and not directly in the form of its content. The suggestion is then either made public or left hidden or removed.
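The pending-edits form described above amounts to a small state machine: a proposal starts as pending, and only the document owner can move it to accepted or rejected. Here is a minimal sketch, assuming suggestions are expressed as "replace this passage with that one". The names `Suggestion`, `OwnedDocument`, and `review` are hypothetical and do not reflect how Microsoft Word, Google Docs, Medium, or Git model this.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Suggestion:
    author: str
    old_text: str  # passage the suggestion would replace
    new_text: str
    status: Status = Status.PENDING


@dataclass
class OwnedDocument:
    owner: str
    text: str
    suggestions: list = field(default_factory=list)

    def suggest(self, author, old_text, new_text):
        """Record a proposal without touching the published text."""
        if old_text not in self.text:
            raise ValueError("passage not found in document")
        suggestion = Suggestion(author, old_text, new_text)
        self.suggestions.append(suggestion)
        return suggestion

    def review(self, reviewer, suggestion, accept):
        """Owner-only decision: apply the change or mark it rejected."""
        if reviewer != self.owner:
            raise PermissionError("only the owner can review suggestions")
        if suggestion.status is not Status.PENDING:
            raise ValueError("suggestion already reviewed")
        if not accept:
            suggestion.status = Status.REJECTED
            return
        if suggestion.old_text not in self.text:
            raise ValueError("passage has changed since the suggestion was made")
        self.text = self.text.replace(suggestion.old_text, suggestion.new_text, 1)
        suggestion.status = Status.ACCEPTED
```

The pending-comments form would follow the same shape, with the accepted state meaning "made public" rather than "applied to the text".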


Why

A layer of pending or proposed changes around a document makes room for discussion and deliberation and gives the owner of the document clarity about the final result.

As seen on
  • Medium

  • Google Docs

  • Git

Edit This Page

What

The more difficult it is to edit a shared document, the fewer people will bother to do so. Even forcing people to switch contexts (to an “editing mode”) will create a barrier to participation for a significant fraction of potential contributors (see Figure 1-12).

Figure 1-12. A button or link inviting the reader to edit this page encourages collaboration (and lowers the threshold for making improvements by reducing the friction involved in offering edits).

This pattern is also known as “Edit This,” “Universal Edit Button,” “Inline Editing,” “Read-Write Web,” or “Two-Way Web.”

Use when

Use this pattern in interfaces for editing shared or personal documents. You can use this for contexts in which universal editing, anonymous editing, or registered, authenticated, and privileged editing is permitted.

How

  • Provide a button or link on any editable content that links directly to an edit box for the content, preferably without even loading a new page, as illustrated in Figure 1-13.

    Figure 1-13. If you can display an edit box directly in the reader’s original context, the experience of making and saving an edit and then resuming reading is smoother than if the editing must be done in a separate context.
  • Optionally, when restricting editing only to privileged groups, hide the button from anyone who has not been authenticated as a contributor.

  • Consider providing a WYSIWYG editing environment. This will reduce one of the barriers to participation for the majority of people who are not comfortable using abbreviated markup languages to format and style text.
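The rule in the list above about hiding the button from readers who are not authenticated contributors can be sketched as a single decision function. The three policy names mirror the editing contexts this pattern mentions (universal, registered, privileged), but the enum, the function name `show_edit_button`, and its parameters are assumptions made for illustration.

```python
from enum import Enum


class EditPolicy(Enum):
    UNIVERSAL = "universal"    # anyone, including anonymous readers
    REGISTERED = "registered"  # any signed-in user
    PRIVILEGED = "privileged"  # only named contributors


def show_edit_button(policy, user=None, contributors=frozenset()):
    """Decide whether to render the edit control for this reader.

    `user` is None for an anonymous reader. This only controls what is
    displayed; the save handler should repeat the same check.
    """
    if policy is EditPolicy.UNIVERSAL:
        return True
    if policy is EditPolicy.REGISTERED:
        return user is not None
    return user is not None and user in contributors


print(show_edit_button(EditPolicy.PRIVILEGED, "ann", {"ann"}))
```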

Special cases

When trying to cultivate a culture of collaborative editing, community moderators might need to make an extra effort to recruit, campaign, and encourage contributions. By default, many people are passive, even when invited to edit content, because they are afraid to break something or give offense to a preceding editor. The interface should be as inviting as possible, but be prepared to challenge incumbent behavioral patterns.

Offer a “sandbox” area for beginners (see Figure 1-14), in which they can practice editing safely without worrying about damaging anything or exposing themselves to criticism.

Figure 1-14. Giving your collaborators a sandbox in which to practice their editing skills can ease the slope of the learning curve and take some of the fear out of inline editing.

Why

The great promise of the Web draws in part from its facilitation of two-way communication and collaboration across geographical and other boundaries. An interface element that invites the reader to become an author goes beyond the “second-class” forms of participation, such as giving feedback and ratings. The easier you make it to edit content, the more likely people will take the time to do so, and potentially spur one another on to build knowledge stores and other projects that otherwise might never have come into being.

As seen on

Wikipedia

Just about every wiki, everywhere

The Wiki Way

What

Collaborative editing can become bogged down in conversational mode. Moreover, when contributors become too attached to their own individual contributions, this can impede the development of the collaborative document (see Figure 1-15).

Figure 1-15. Many of the principles underpinning Ward Cunningham’s original wiki (created to house the Portland Pattern Repository) should be kept in mind when you’re trying to facilitate effective collaborative editing in a community setting.

Use when

Use this pattern when providing an interface for collaborative editing.


How

Encourage anonymous editing, use version control, and enable refactoring of document content by contributors.

Here are the original principles Ward Cunningham cited when recalling the design principles that underpinned the first wiki:


Open: Should a page be found to be incomplete or poorly organized, any reader can edit it as he sees fit.

Incremental: Pages can cite other pages, including pages that have not been written yet.

Organic: The structure and text content of the product are open to editing and evolution.

Mundane: A small number of (irregular) text conventions will provide access to the most useful page markup.

Universal: The mechanisms of editing and organizing are the same as those of writing; thus, any writer is automatically an editor and organizer.

Overt: The formatted (and printed) output will suggest the input required to reproduce it.

Unified: Page names will be drawn from a flat space so that no additional context is required to interpret them.

Precise: Pages will be titled with sufficient precision to avoid most name clashes, typically by forming noun phrases.

Tolerant: Interpretable (even if undesirable) behavior is preferred to error messages.

Observable: Activity within the product can be watched and reviewed by any other visitor.

Convergent: Duplication can be discouraged or removed by finding and citing similar or related content.

There are many wiki authors and implementers. Here are some additional principles that guide them, but were not of primary concern to me:


Trust: This is the most important thing in a wiki. Trust the people, trust the process, foster trust-building. Everyone controls and checks the content. Wiki relies on the assumption that most readers have good intentions (but assume that there are limitations to good faith).

Fun: Everybody can contribute, but nobody has to.

Sharing: The dissemination of information, knowledge, experience, ideas, views, and so on is paramount.


Why

The wiki approach has unleashed a torrent of creativity on the Web and seems to have captured in its principles the fundamental grain of digital, electronic, web-enabled collaboration.
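Two of the principles quoted earlier lend themselves to a short illustration: pages can cite pages that have not been written yet, and interpretable behavior is preferred to error messages. Below is a minimal sketch of a link renderer in that spirit. The `[[Page Name]]` link syntax, the `/wiki/` and `/edit/` URL paths, and the trailing question-mark link for unwritten pages are all assumptions chosen for the example, not a description of any particular wiki engine.

```python
import re
from html import escape
from urllib.parse import quote

LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")  # assumed syntax: [[Page Name]]


def render_links(text, existing_pages):
    """Link to pages that exist; offer to create the ones that do not."""
    def replace(match):
        name = match.group(1).strip()
        label, target = escape(name), quote(name)
        if name in existing_pages:
            return f'<a href="/wiki/{target}">{label}</a>'
        # An unwritten page is not an error: show its name plus a create link.
        return f'{label}<a href="/edit/{target}" title="create this page">?</a>'
    return LINK.sub(replace, text)


print(render_links("See [[Crowdsourcing]] and [[Voting]].", {"Crowdsourcing"}))
```

Because page names come from a single flat set, the lookup needs no surrounding context, which is the property the flat-namespace principle describes.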

Related patterns

Learn from Games

Passive Sharing

Chapter 17: Corporations Are People, My Friend

As seen on

WikiWikiWeb

Crowdsourcing

What

Some jobs are too big for the immediate group of engaged collaborators to manage by itself. The community will benefit if the interface provides a way to break a large project into smaller pieces and engage and give incentives to a wider group of people (or “crowd”) to tackle those smaller pieces, as shown in Figure 1-16.

Figure 1-16. Amazon’s Mechanical Turk plays matchmaker to people looking for distributed help in solving problems or answering questions, and other people willing to do work such as this for a fee.

Use when

Use this pattern when you want your active core community members to engage with the wider set of people participating in your social environment and get their help accomplishing ambitious projects that would not be possible with fewer people.

How

  • Provide a method for splitting up a project into individual tasks so that each task can be advertised individually. Also, provide a venue for announcing crowdsourced projects.

  • Give community members a way to “shop for,” review, and claim individual tasks for the project.

  • Provide an upload interface or submission form with which participants can contribute their completed work (assuming the work isn’t accomplished directly in your interface).

  • Keep track of tasks that have been claimed but not completed by their deadline so that they can be returned to the general pool and reassigned.

  • Ideally, offer a dashboard view for management of the project.

  • Where appropriate, incorporate a mechanism for compensation for the participants.
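The claim, submit, and return-to-pool cycle in the list above can be sketched as a small task pool. This is an illustrative sketch under stated assumptions: every claim gets the same fixed deadline window, tasks are held in memory, and the names `Task`, `TaskPool`, and `release_overdue` are hypothetical rather than drawn from Mechanical Turk or any other service named in this pattern.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Task:
    title: str
    claimed_by: Optional[str] = None
    due: Optional[datetime] = None
    submission: Optional[str] = None


class TaskPool:
    def __init__(self, titles, claim_window=timedelta(days=7)):
        self.tasks = [Task(title) for title in titles]
        self.claim_window = claim_window  # assumed: same window for every claim

    def open_tasks(self):
        """Tasks that members can still shop for and claim."""
        return [t for t in self.tasks if t.claimed_by is None]

    def claim(self, task, worker, now):
        if task.claimed_by is not None:
            raise ValueError("task already claimed")
        task.claimed_by, task.due = worker, now + self.claim_window

    def submit(self, task, worker, work):
        if task.claimed_by != worker:
            raise PermissionError("task is not claimed by this worker")
        task.submission = work

    def release_overdue(self, now):
        """Return claimed-but-unfinished tasks past their deadline to the pool."""
        released = []
        for t in self.tasks:
            if t.claimed_by and t.submission is None and now > t.due:
                t.claimed_by, t.due = None, None
                released.append(t)
        return released
```

Running `release_overdue` on a schedule keeps abandoned claims from blocking the project, and the same task list can feed a dashboard view.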


Why

Crowdsourcing breaks large jobs into pieces that can be tackled with a much lower commitment threshold, taking advantage of the loose ties in social networks.

As seen on

Amazon Mechanical Turk

Assignment Zero

The ESP Game

iStockphoto

SETI@home

Threadless

Further Reading
  1. “Berners-Lee on the read/write web.” BBC News, August 9, 2005.

  2. Cross Cultural Collaboration.

  3. Marjanovic, Olivera, Hala Skaf-Molli, Pascal Molli, and Claude Godart. “Deriving Process-driven Collaborative Editing Pattern from Collaborative Learning Flow Patterns.”

  4. Winer, Dave. Edit This Page.

  5. Edit This Page PHP.

  6. Paylancers blog.

  7. The Power of Many.

  8. Regulating Prominence: A Design Pattern for Co-Located Collaboration.

  9. Howe, Jeff. “The Rise of Crowdsourcing.” Wired, 14.06.

  10. Venners, Bill. “The Simplest Thing That Could Possibly Work.”

  11. Universal Edit Button.

  12. Wiki Design Principles.

  13. Udell, Jon. The Wiki Way.

  14. Wired Crowdsourcing blog.

Continue reading Design patterns for orchestrating collaborative groups.

Categories: Technology

Four short links: 23 August 2018

O'Reilly Radar - Thu, 2018/08/23 - 03:55

Visualizing Toxicity, Rubrics, Mozilla Fellows, and Open Source

  1. Visualizing Toxicity in Twitter Conversations -- The project started with an initial design discussion in which we all agreed it would be cool to somehow visualize Twitter conversations as natural-looking trees, where replies form branches and the more toxic the reply, the more withered the branch would look. At this point, I had no idea how I’d even approach rendering a withered tree, but it sounded like a fun experiment, so I said I’d look into it and do my best.
  2. Rubrics for Engineering Role -- I love rubrics and ladders, and this combination would make me very happy.
  3. Mozilla's New Openness, Science, and Tech Policy Fellows -- interesting mix of projects and people.
  4. The Commons Clause Will Destroy Open Source -- the Commons Clause doesn’t present a solution for supporting open source software. It presents a framework for turning open source software into proprietary software. My take: open source is most valuable when it's free. Limiting freedom (including freedom to sell) limits the usefulness of the software. Create more value than you capture!

Continue reading Four short links: 23 August 2018.

Categories: Technology

Four short links: 22 August 2018

O'Reilly Radar - Wed, 2018/08/22 - 04:15

Software Licenses, Crowdsourced Laws, USB Power Over Ethernet, and ML Fairness

  1. Commons Clause -- a condition added to existing open source software licenses to create a new, combined software license.
  2. vTaiwan (MIT TR) -- the simple but ingenious system Taiwan uses to crowdsource its laws.
  3. Power USB Devices Over Ethernet -- cute!
  4. Fairness Without Demographics (Paper a Day) -- After showing that representation disparity and disparity amplification are serious issues with the current status quo, the authors go on to introduce a method based on distributionally robust optimization (DRO) which can control the worst-case risk.

Continue reading Four short links: 22 August 2018.

Categories: Technology

