O'Reilly Radar

All of our Ideas and Learning material from all of our topics.
Updated: 1 week 3 hours ago

Four short links: 8 February 2019

Fri, 2019/02/08 - 04:55

Data Explorer, PDP-1 in FPGA, Google's Fuzzer, and Preventing Neophilia

  1. Blazer -- Explore your data with SQL. Easily create charts and dashboards, and share them with your team.
  2. FPG-1 -- PDP-1 FPGA implementation in Verilog, with CRT, Teletype, and Console. The PDP-1 was groundbreaking: serial number 0 was delivered to the BBN offices where Licklider would see it as a way forward to his timesharing vision. From The Dream Machine: "The PDP-1 was revolutionary," Fredkin declares, still marveling four decades later. "Today such things don't happen. Today a machine comes along and is slightly faster than its competitors. But here was a machine that was off the charts. Its price performance ratio was spectacularly better than anything that had come before."
  3. ClusterFuzz -- a scalable fuzzing infrastructure that finds security and stability issues in software. See Google's announcement of the open-sourcing of it.
  4. Questions for a New Technology -- They aren’t particularly subtle in their bias. They aren’t supposed to be. They also aren’t meant to be a series of boxes to be checked or hoops to be jumped through.

Continue reading Four short links: 8 February 2019.

Categories: Technology

Four short links: 7 February 2019

Thu, 2019/02/07 - 05:00

VR, Learning Robot, Bubble Sort, and Graph Neural Networks

  1. Hamlet in Virtual Reality -- context for WGBH's Hamlet 360. It's 360º video, so you can pick what you look at but not where you look at it from. Interesting work, and a reminder that we're still trying to figure out what kinds of stories these media lend themselves to, and how best to tell stories with them.
  2. Self-Taught Robot Figures Out What It Looks Like and What It Can Do -- To begin with, the robot had no idea what shape it was and behaved like an infant, moving randomly while attempting various tasks. Within about a day of intensive learning, the robot built up an internal picture of its structure and abilities. After 35 hours, the robot could grasp objects from specific locations and drop them in a receptacle with 100% accuracy. Paper is behind a paywall, though Sci-Hub has it.
  3. Bubble Sort: An Archaeological Algorithmic Analysis -- Text books, including books for general audiences, invariably mention bubble sort in discussions of elementary sorting algorithms. We trace the history of bubble sort, its popularity, and its endurance in the face of pedagogical assertions that code and algorithmic examples used in early courses should be of high quality and adhere to established best practices. This paper is more an historical analysis than a philosophical treatise for the exclusion of bubble sort from books and courses. However, sentiments for exclusion are supported by Knuth: "In short, the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems." Although bubble sort may not be a best practice sort, perhaps the weight of history is more than enough to compensate and provide for its longevity.
  4. Comprehensive Survey on Graph Neural Networks -- We propose a new taxonomy to divide the state-of-the-art graph neural networks into different categories. With a focus on graph convolutional networks, we review alternative architectures that have recently been developed; these learning paradigms include graph attention networks, graph autoencoders, graph generative networks, and graph spatial-temporal networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes and benchmarks of the existing algorithms on different learning tasks. Finally, we propose potential research directions in this fast-growing field.
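
For readers who have not seen it since an early course, the bubble sort discussed in item 3 can be sketched as below. This is a minimal illustration, not code from the paper; the function name `bubble_sort` and the early-exit check are choices made for this example.

```python
def bubble_sort(items):
    """Return a new list sorted in ascending order using bubble sort."""
    result = list(items)  # copy so the caller's list is not mutated
    # After each pass, the largest remaining element has "bubbled" to index `end`.
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                # Swap adjacent elements that are out of order.
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            # A pass with no swaps means the list is already sorted.
            break
    return result


print(bubble_sort([5, 2, 9, 1]))
```

The nested passes over adjacent pairs are what give the algorithm its quadratic worst-case running time, which is the usual argument against teaching it as a model sort.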

Continue reading Four short links: 7 February 2019.

Categories: Technology

Design after Agile: How to succeed by trying less

Wed, 2019/02/06 - 14:00

Stuart Halloway explains how to augment agility with principles for designing systems.

Continue reading Design after Agile: How to succeed by trying less.

Categories: Technology

Design and architecture: Special Dumpster Fire Unit

Wed, 2019/02/06 - 14:00

Matt Stine looks at the tricky situations that sometimes emerge from design and architecture.

Continue reading Design and architecture: Special Dumpster Fire Unit.

Categories: Technology

Roaming free: The power of reading beyond your field

Wed, 2019/02/06 - 14:00

Glenn Vanderburg talks about the importance of letting your attention roam, and he shares examples of how insights from other fields have inspired software practitioners.

Continue reading Roaming free: The power of reading beyond your field.

Categories: Technology

Four short links: 6 February 2019

Wed, 2019/02/06 - 07:20

Video Editing, Assembling Textbooks, Amazon Advertising, and Blocking Autoplay

  1. Flowblade -- a multitrack non-linear video editor released under GPL3 license.
  2. Automatically Assembling Textbooks from Wikipedia -- Adamti and co have a plan for determining the utility of their approach. They plan to produce a range of Wikibooks on subjects not yet covered by human-generated books. They will then monitor the page views and edits to these books to see how popular they become and how heavily they are edited, compared with human-generated books.
  3. Amazon Knows What You Buy. And It’s Building a Big Ad Business From It (NYT) -- I'm sure nothing bad can happen from this.
  4. Firefox 66 to Block Automatically Playing Audible Video and Audio (Mozilla) -- user-friendly behavior ftw.

Continue reading Four short links: 6 February 2019.

Categories: Technology

170+ live online training courses opened for March and April

Wed, 2019/02/06 - 04:00

Get hands-on training in machine learning, microservices, blockchain, Python, Java, and many other topics.

Learn new topics and refine your skills with more than 170 new live online training courses we opened up for March and April on the O'Reilly online learning platform.

AI and machine learning

Spotlight on Innovation: Succeeding with Machine Learning with Alex Jaimes, February 13

Hands-On Adversarial Machine Learning, February 25

Probabilistic Modeling With TensorFlow Probability, February 27

Deep Learning Fundamentals, March 5

An Introduction to Amazon Machine Learning on AWS, March 6-7

Natural Language Processing (NLP) from Scratch, March 11

Deep Reinforcement Learning, March 12

Sentiment Analysis for Chatbots in Python, March 13

Hands-on Machine Learning with Python: Classification and Regression, March 13

TensorFlow Extended: Data Validation and Transform, March 14

Hands-On Machine Learning with Python: Clustering, Dimension Reduction, and Time Series Analysis, March 14

Building a Robust Machine Learning Pipeline, March 14-15

Machine Learning in Practice, March 19

TensorFlow Extended: Model Build, Analysis, and Serving, March 20

Artificial Intelligence: An Overview of AI and Machine Learning, March 20

Machine Learning for IoT, March 20

Next Generation Decision Making: Pragmatic Artificial Intelligence, March 20-21

Getting Started with Machine Learning, March 21

Artificial Intelligence for Robotics, March 21-22

Beginning Machine Learning with PyTorch, March 25

Artificial Intelligence: Real-World Applications, March 28

Active Learning, April 9

Hands-On Adversarial Machine Learning, April 11

Practical Deep Learning with PyTorch, April 11-12

Blockchain

Introducing Blockchain, March 8

Building Smart Contracts on the Blockchain, March 21-22

IBM Blockchain Platform as a Service, March 25-26

Understanding Hyperledger Fabric Blockchain, March 28-29

Blockchain for Enterprise, April 1

Business

Innovative Teams, March 11

Fundamentals of Cognitive Biases, March 11

Artificial Intelligence: AI For Business, March 12

Business Strategy Fundamentals, March 13

The Power of Lean in Software Projects: Less Wasted Effort and More Product Results, March 14

Leadership Communication Skills for Managers, March 14

Emotional Intelligence in the Workplace, March 14

Thinking Like a Manager, March 14

Tools for the Digital Transformation, March 14-15

Introduction to Delegation Skills, March 21

Negotiation Fundamentals, March 22

Introduction to Critical Thinking, March 26

Your First 30 Days as a Manager, April 2

How to Give Great Presentations, April 5

Introduction to Strategic Thinking Skills, April 8

Data science and data tools

Business Data Analytics Using Python, February 27

Hands-on Introduction to Apache Hadoop and Spark Programming, March 5-6

Designing and Implementing Big Data Solutions with Azure, March 11-12

Time Series Forecasting, March 14

Cleaning Data at Scale, March 19

Practical Data Cleaning with Python, March 20-21

Building Distributed Pipelines for Data Science Using Kafka, Spark, and Cassandra, April 8-10

Real-Time Data Foundations: Kafka, April 9

Real-Time Data Foundations: Spark, April 10

Building Data APIs with GraphQL, April 11

Design and product management

From User Experience Designer to Digital Product Designer, March 1

Mastering UX Mapping, March 7-8

Writing User Stories, March 13

Product Roadmaps from the Ground Up, April 3

Programming

Design Patterns Boot Camp, February 19-20

Discovering Modern Java, March 1

Beginner’s Guide to Writing AWS Lambda Functions in Python, March 1

Building APIs with Django REST Framework, March 4

SQL for Any IT Professional, March 4

Spring Boot and Kotlin, March 5

Programming with Java Lambdas and Streams, March 5

Bootiful Testing, March 6

Learning Python 3 by Example, March 7

Getting Started with OpenShift, March 8

Setting Up Scala Projects, March 11

Getting Started with Pandas, March 11

Getting Started with Python 3, March 11-12

Java Full Throttle with Paul Deitel: A One-Day, Code-Intensive Java Standard Edition Presentation, March 12

Mastering Pandas, March 12

Scalable Concurrency with the Java Executor Framework, March 12

Getting Started with Python's Pytest, March 13

Python Programming Fundamentals, March 13

Mastering Python's Pytest, March 14

Kotlin Fundamentals, March 14

Quantitative Trading with Python, March 14

Advanced TDD (Test-Driven Development), March 15

Introduction to Python Programming, March 15

Bash Shell Scripting in 4 Hours, March 18

Java Testing with Mockito and the Hamcrest Matchers, March 19

Scala Core Programming: Methods, Classes, and Traits, March 19

Ansible in 4 Hours, March 19

Getting Started with PHP and MySQL, March 20

Mastering the Basics of Relational SQL Querying, March 20-21

Reactive Spring and Spring Boot, March 21

Automating with Ansible, March 22

Scala Core Programming: Sealed Traits, Collections, and Functions, March 25

Mastering SELinux, March 25

Intermediate Git, March 25

Scalable Programming with Java 8 Parallel Streams, March 27

Design Patterns Boot Camp, March 27-28

Mastering C# 8.0 and .NET Core 3.0, March 27-28

Rethinking REST: A Hands-On Guide to GraphQL and Queryable APIs, March 28

C# Programming: A Hands-On Guide, March 28

Web Application Programming in C# and ASP.NET Core with MVC and Entity Framework, March 28-29

Introduction to JavaScript Programming, April 2-3

Visualization in Python with Matplotlib, April 8

Python for Finance, April 8-9

Practical MQTT for the Internet of Things, April 8-9

Getting Started with Pandas, April 9

Getting Started with Python 3, April 9-10

Getting Started with React.js, April 10

What's New In Java, April 11

Fundamentals of Rust, April 11-12

Security

CompTIA PenTest+ Crash Course, March 5-6

Start Your Security Certification Career Today, March 8

Protecting Data Privacy in a Machine Learning World, March 11-12

Certified Ethical Hacker (CEH) Crash Course, March 12-13

CompTIA Security+ SY0-501 Crash Course, March 18-19

Intense Introduction to Hacking Web Applications, March 19

Cyber Security Fundamentals, March 26-27

CISSP Crash Course, March 26-27

CISSP Certification Practice Questions and Exam Strategies, March 27

AWS Certified Security - Specialty Crash Course, March 27-28

Systems engineering and operations

Software Architecture by Example, February 21

Red Hat Certified System Administrator (RHCSA) Crash Course, March 4-7

Creating Serverless APIs with AWS Lambda and API Gateway, March 5

Amazon Web Services (AWS): Up and Running, March 6

Docker Compose, March 6

Microservice Collaboration, March 7

Docker CI/CD, March 7

OpenStack for Cloud Architects, March 7-8

Red Hat RHEL 8 New Features, March 11

From Developer to Software Architect, March 11-12

Google Cloud Certified Associate Cloud Engineer Crash Course, March 11-12

AWS Certified Solutions Architect Associate Crash Course, March 11-12

9 Steps to Awesome with Kubernetes, March 12

IP Subnetting from Beginning to Mastery, March 12-13

Istio on Kubernetes: Enter the Service Mesh, March 14

How the Internet Really Works, March 15

Kubernetes Serverless with Knative, March 15

AWS Advanced Security with Config, GuardDuty, and Macie, March 18

Software Architecture by Example, March 18

Amazon Web Services: AWS Managed Services, March 18-19

Practical Kubernetes, March 18-19

AWS Certified SysOps Administrator (Associate) Crash Course, March 18-19

CCNA Routing and Switching 200-125 Crash Course, March 18-22

Managing Containers on Linux, March 19

Docker Images, March 19

Docker: Up and Running, March 19-20

Docker Containers, March 20

Implementing Evolutionary Architectures, March 20-21

Kubernetes in 4 Hours, March 21

AWS Security Fundamentals, March 21

Deploying Container-Based Microservices on AWS, March 21-22

Google Cloud Platform (GCP) for AWS Professionals, March 22

Architecture for Continuous Delivery, March 25

Docker for JVM projects, March 25

Implementing Azure for Enterprises, March 25-26

Building and Managing Kubernetes Applications, March 26

Cloud Computing Governance, March 26

Getting Started with Amazon Web Services (AWS), March 26-27

Microservices Caching Strategies, March 27

Cloud Complexity Management, March 28

Comparing Service-Based Architectures, March 28

Network DevOps, March 29

API Driven Architecture with Swagger and API Blueprint, March 29

Software Architecture for Developers, April 1

Implementing and Troubleshooting TCP/IP, April 2

Amazon Web Services (AWS) Technical Essentials, April 2

Building Applications with Apache Cassandra, April 3-4

Introduction to Kubernetes, April 3-4

CCNA Routing and Switching Crash Course, April 4-5

Architecting Secure IoT Applications with Azure Sphere, April 4-5

AWS Design Fundamentals, April 9-10

Microservices Architecture and Design, April 9-10

Practical Docker, April 10

Automation with AWS Serverless Technologies, April 10

Continue reading 170+ live online training courses opened for March and April.

Categories: Technology

The future of cloud-native programming

Tue, 2019/02/05 - 14:00

Tamar Eilam offers an overview of cloud-native programming and outlines a path toward the unification of the cloud programming model.

Continue reading The future of cloud-native programming.

Categories: Technology

Highlights from the O'Reilly Software Architecture Conference in New York 2019

Tue, 2019/02/05 - 14:00

Watch highlights from expert talks covering cloud-native programming, software architecture career advice, and more.

People from across the software architecture world came together in New York for the O'Reilly Software Architecture Conference. Below you'll find links to highlights from the event.

Architecting IT transformation

Gregor Hohpe explains how software architects can use what they know about technical systems to help refactor organizations.

Career advice for architects

Trisha Gee shares lessons she learned the hard way while managing her career as a developer, lead, and technical advocate.

From the trenches: An interview with Mark Richards

Neal Ford talks with Mark Richards about his career path and his work as a software architect.

The future of cloud-native programming

Tamar Eilam offers an overview of cloud-native programming and outlines a path toward the unification of the cloud programming model.

Design and architecture: Special Dumpster Fire Unit

Matt Stine looks at the tricky situations that sometimes emerge from design and architecture.

Design after Agile: How to succeed by trying less

Stuart Halloway explains how to augment agility with principles for designing systems.

Roaming free: The power of reading beyond your field

Glenn Vanderburg talks about the importance of letting your attention roam, and he shares examples of how insights from other fields have inspired software practitioners.

Continue reading Highlights from the O'Reilly Software Architecture Conference in New York 2019.

Categories: Technology

Career advice for architects

Tue, 2019/02/05 - 14:00

Trisha Gee shares lessons she learned the hard way while managing her career as a developer, lead, and technical advocate.

Continue reading Career advice for architects.

Categories: Technology

From the trenches: An interview with Mark Richards

Tue, 2019/02/05 - 14:00

Neal Ford talks with Mark Richards about his career path and his work as a software architect.

Continue reading From the trenches: An interview with Mark Richards.

Categories: Technology

Four short links: 5 February 2019

Tue, 2019/02/05 - 05:00

Creating the Future, LIDAR, Human-AI Design, and Command-line Course

  1. The Best Way to Predict the Future is to Create It. But Is It Already Too Late? (Alan Kay) -- Virtually everybody in the computing science has almost no sense of human history and context of where we are and where we are going. So, I think of much of the stuff that has been done as inverse vandalism. Inverse vandalism is making things just because you can. Every sentence is a cracker. (via Daniel G. Siegel)
  2. Trying to Make Powerful, Low-cost LIDAR (Ars Technica) -- a good intro to the tech and competition in the space.
  3. Guidelines for Human-AI Interaction -- Microsoft paper on design challenges in "smart" apps.
  4. MIT Hacker Tools -- lectures on the Unix tools that command-line natives use.

Continue reading Four short links: 5 February 2019.

Categories: Technology

3 emerging trends tech leaders should watch

Tue, 2019/02/05 - 04:00

Analysis of the O’Reilly online learning platform reveals a new approach to technical architecture, the rise of blockchain, and shifts in programming language adoption.

Keeping up with technology can be a daunting task for tech leaders. Each year, to make the task a little easier, we analyze behavior on the O’Reilly online learning platform, using the platform as a massive sensor that yields insights and identifies areas tech leaders should pay attention to, explore, and learn.

Our analysis includes the top search terms and the topics that garner the most usage on our learning platform.[1] This combination of search and usage data provides a holistic view; search data shows the areas where subscribers are exploring, and usage identifies topics where they’re actively engaged.

The signals from the O’Reilly online learning platform reveal:

  • Strong growth in cloud topics and Kubernetes, as well as interest in containers and decomposition (microservices), points toward the rise of a “Next Architecture.”
  • Interest in blockchain, which we first noted in 2017, continues. While the full potential of the blockchain gets sorted out, consider that if you’re not investigating blockchain, someone you compete with is.
  • Python, Java, and JavaScript—the “big three” languages on our learning platform—continue to dominate usage year after year. In addition, Rust and Go showed growing interest on the platform, suggesting that organizations are using languages that emphasize developer productivity while also embracing languages that tilt the balance toward performance and scaling.

Figure 1. The top search terms on the O’Reilly online learning platform in 2018 (left) and the rate of change for each term (right)

Figure 2. Topics on the O’Reilly online learning platform with the most usage in 2018 (left) and the rate of change for each topic (right)

The signs of a Next Architecture

The growth we’ve seen on our online learning platform in cloud topics, in orchestration and container-related terms such as Kubernetes and Docker, and in microservices is part of a larger trend in how organizations plan, code, test, and deploy applications that we call the Next Architecture. This architecture allows fast, flexible deployment, feature flexibility, efficient use of programmer resources, and rapid adaptation, including scaling, to unpredictable resource requirements. These are all goals businesses feel increasingly pressured to achieve to keep up with nimble competitors.

There are four aspects of the Next Architecture, each of which shows up in the platform’s search and usage data.

Figure 3. AWS, Kubernetes, Docker, and Microservices, each representing an important part of the Next Architecture, appear in the top search terms from the O’Reilly online learning platform

Decomposition

Organizations get a lot of benefits by breaking large and complex activities into small, loosely connected pieces. Through decomposition, these activities can be turned into standalone services that can be developed independently and linked together to create a more complex application. Microservices, the manifestation of decomposition, was the number 13 search term on our online learning platform in 2018.

Cloud

An organization needs the flexibility to adjust, scale, and innovate its digital presence—often across different time zones and geographies. The cloud supports these goals with compute instances that are fungible, coming and going as needed, and easy to replace automatically if failures are detected. The move toward decomposition (microservices) helps accelerate the trend toward the cloud by providing more impetus for quickly spinning up and managing services that support the need for dynamic, adaptable applications.

Cloud-related terms had a significant presence in the search and usage data. AWS, Amazon’s suite of cloud-based tools, was the number 4 search term, and it had 28% growth in year-over-year usage. Google Cloud (66% growth in usage over 2017) and Microsoft Azure (60% growth in usage) also increased. In addition, the topic “cloud migration” was up 40% in usage in 2018.
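
The year-over-year figures quoted in this analysis are percentage changes relative to the prior year. The sketch below shows that arithmetic. The function name `yoy_growth_pct` and the usage counts passed to it are hypothetical and chosen only to illustrate the calculation; the article does not publish the underlying counts.

```python
def yoy_growth_pct(previous, current):
    """Percentage change from `previous` to `current` (year over year)."""
    if previous <= 0:
        raise ValueError("previous-year value must be positive")
    # Multiply before dividing to keep the result exact for simple inputs.
    return (current - previous) * 100.0 / previous


# Hypothetical usage counts: 100 units last year, 128 units this year.
print(yoy_growth_pct(100, 128))
```

A change "from a small base" is worth reading with care: going from 5 units to 20 is a 300% increase, yet the absolute change is only 15 units.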

Containers

Containers provide a lightweight way to achieve the modularity favored by decomposition and the cloud. Docker, the number 7 search term in 2018, makes it easy to automate the deployment of the microservices that are created through decomposition.

Orchestration

The huge number of microservices running on containers—often in the hundreds or thousands—exceeds the capacity of humans to track and manage them. Orchestration tools, notably Kubernetes, fill the gap through rigorous specifications and automation. Kubernetes was the number 5 search term in 2018, jumping 11 spots, and usage growth was up a notable 160% year over year.
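
The idea of managing services through specifications and automation can be illustrated with a toy reconciliation step: compare a declared desired state with what is actually running and work out the corrective actions. This is a simplified sketch of the general pattern, not Kubernetes' API or implementation; the function `reconcile`, the service names, and the replica counts are all hypothetical.

```python
def reconcile(desired, running):
    """Return the actions needed to move `running` toward `desired`.

    Both arguments map a service name to a replica count.
    """
    actions = []
    # Look at every service mentioned in either the specification or the cluster.
    for name in sorted(set(desired) | set(running)):
        want = desired.get(name, 0)
        have = running.get(name, 0)
        if want > have:
            actions.append(("start", name, want - have))
        elif want < have:
            actions.append(("stop", name, have - want))
    return actions


# Hypothetical specification and observed state.
desired = {"checkout": 3, "search": 2}
running = {"checkout": 1, "search": 2, "legacy-report": 1}
print(reconcile(desired, running))
# prints [('start', 'checkout', 2), ('stop', 'legacy-report', 1)]
```

An orchestrator runs a step like this continuously, so operators edit the specification and leave individual containers alone.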

We’ll continue to explore the Next Architecture in the coming months.

Keep an eye on blockchain

Blockchain, which was one of the stars in our 2017 results, jumped seven spots in the top search terms (number 13), and it was up 36% in usage in 2018. Ethereum, a tool for implementing blockchains, was up 66% in year-over-year usage from a small base. Platform subscribers were likely exploring blockchain to assess its potential, developing an awareness of where blockchain may fit into their strategic plans or evaluating it as an existential threat, mostly in the areas of payments, supply chain logistics, and provenance.

Python, Java, and JavaScript continue their dominance

In 2018 we saw Python, Java, and JavaScript maintain the strong positions they’ve gained on our online learning platform over the years.

Python gets a boost, in part, from the increased interest in machine learning (ML). Many ML libraries, such as TensorFlow, are wrapped in Python libraries and promoted with Python interfaces. Ascendant ML tools also bolster interest in Python. For example, PyTorch, a library for computer vision and natural language processing, saw a 300% increase in year-over-year usage from a small base, and scikit-learn, another Python-based machine learning library, was up 39% in usage.

Many tools used in big data applications—notably the ones from the Apache Foundation, such as Spark and Kafka—feature Java interfaces. Thus, machine learning and big data may explain the popularity of both Python and Java. Java also remains a workhorse language for large-scale applications.

The JavaScript ecosystem of web frameworks and libraries saw less growth than Java and Python. However, usage trends show engagement with the popular JavaScript web frameworks. Angular was up 23% in usage, and React was up 39%, though search activity on both topics was flat. A third JavaScript framework, Vue, showed big usage growth, up 220% from a small base.

After JavaScript, one more language appears in our top searches. Go, the number 11 search term, jumped three spots in the top search results, and content usage was up 14%. Go sits conveniently between high-level, interpreted languages like Python and low-level, fast, compiled systems languages like C. It combines the syntactic ease of the high-level languages with compiler-driven performance, good concurrency support, an active and growing developer community, and the full support of Google. When performance matters, or when an app or service written in a high-level language needs a performance boost, Go is (sorry for the pun) the go-to language for an increasing number of developers.

Finally, the fastest usage growth we saw for any language between 2017 and 2018 was for Rust (up 44%). Rust is a systems language with near-C performance, safe, efficient memory management, native concurrency support, and a modern syntax. Developers are increasingly finding Rust a good fit when performance is or becomes a priority.

Other findings

There are a few more items from the analysis that are worth calling out.

  • Machine learning (ML), the number 10 search term, has been a leader on our learning platform for more than a year, as we showed in last year's trends. In 2018 we saw a change in the distribution of interest in ML topics within the search and usage results. There was less growth in exploratory topics and phrases like “machine learning” and “deep learning.” This was coupled with a shift toward more specific topics like “natural language processing” (up 22% in search and 11% in usage) and “reinforcement learning” (up 122% in search and 331% in usage from a small base). We attribute the shift to the maturation of the ML topic and a move beyond exploration toward more engaged implementation. This is a trend reinforced by the ML and artificial intelligence surveys we’ve run.
  • A 5% increase in usage for business-related material on the platform highlights the importance of tech for every facet of a business. It also aligns with the idea that all companies are now tech companies.
  • Security content went up 6% in usage in 2018, which is a good sign since we’ve noted in the past that security was underappreciated. Increased scrutiny from notable breaches may partly explain the increase. The development of distributed systems also presents new security challenges organizations must confront.
  • Web and mobile topics showed slight but noticeable declines in search and usage. We think the decline relates to maturity and a semantic transition. Organizations no longer pursue “web” and “mobile” computing; the web and mobile are now endemic enough that it’s all just “computing.”

Looking ahead

The rise of the Next Architecture, the maturation of blockchain, and emerging patterns in programming languages are areas of focus for us in the year ahead. We’ll continue to examine search and usage data on the platform, and we’ll also engage in research via conversations with our conference speakers and attendees, through perspectives from our community of practitioners and thought leaders, from media coverage, and from other sources. Ultimately, we want to see if these additional signals reinforce or challenge the findings from our platform data.

[1] This article is based on non-personally-identifiable information about the top search terms and topics on the O’Reilly online learning platform in 2018.

Continue reading 3 emerging trends tech leaders should watch.

Categories: Technology

Artificial intelligence and machine learning adoption in European enterprise

Mon, 2019/02/04 - 12:20

How companies in Europe are preparing for and adopting AI and ML technologies.

In a recent survey, we explored how companies were adjusting to the growing importance of machine learning and analytics, while also preparing for the explosion in the number of data sources. In practice this means developing a coherent strategy for integrating artificial intelligence (AI), big data, and cloud components, and specifically investing in foundational technologies needed to sustain the sensible use of data, analytics, and machine learning. (You can find full results from the survey in the free report “Evolving Data Infrastructure”.)

This survey drew from more than 3,200 respondents, including more than 1,000 respondents from Western and Eastern Europe. In this post, I’ll describe some of the key areas of interest and concern highlighted by respondents from Europe, while describing how some of these topics will be covered at the upcoming Strata Data conference in London (April 29 - May 2, 2019).

As interest in machine learning (ML) and AI grows, organizations are realizing that model building is but one aspect they need to plan for. Given the end-to-end nature of many data products and applications, sustaining ML and AI requires a host of tools and processes: collecting, cleaning, and harmonizing data; understanding what data is available and who has access to it; tracing changes made to data as it travels across a pipeline; and many other components. Our survey showed that companies are beginning to build some of the foundational pieces needed to sustain ML and AI within their organizations:

Solutions, including those for data governance, data lineage management, data integration and ETL, need to integrate with existing big data technologies used within companies. To that end, we also asked respondents what technologies (open source, managed services) they use for things like data storage, data management, and data processing. For example, the chart below lists popular (batch and streaming) data processing tools used by respondents based in Europe:

Many of the systems listed in the previous chart (Apache Spark, Kafka, Hadoop, etc.) have been in use at enterprises across the globe for several years. One of the newer systems is Apache Pulsar, a promising new messaging system that unifies queuing and streaming. Pulsar will be covered in a popular new tutorial at Strata Data London, “Architecture and Algorithms for End-to-End Streaming Data Processing”. More importantly, there will be many sessions on the foundational technologies needed for machine learning and AI:

Our survey also aligned with recent articles describing the strong demand for data scientists. As noted above, ML and AI involve more than model building. Just as one needs a suite of technologies to sustain success in ML and AI, one also needs a team with a broad range of skills that go beyond model building. Not only is ML quite different from traditional software engineering; as noted in a previous post, it is also changing the nature of software development itself. The chart below lists demand for data-related skills in Europe:

The data science and machine learning program for Strata Data London will cover tools and methodologies, case studies and best practices, deep dives into familiar data types (text, temporal data, graphs), and new automation tools for data and machine learning professionals:

At the 2018 Strata Data London, data privacy and GDPR were big topics. In fact, our 2018 conference happened the same week GDPR came online. A year later, companies are still navigating through GDPR while also preparing for a new set of regulations (including the California Consumer Privacy Act). At this year’s conference, we will continue to have tutorials and sessions on data privacy and data security, but we will also have sessions on techniques and tools for privacy-preserving analytics—the very tools needed to build analytic and AI products that respect user privacy:

We are beginning to see interesting industrial IoT applications and systems. There’s good reason to expect that streaming and real-time applications will explode in the years to come. Tools and infrastructure for collecting streaming data have improved and continue to get easier to use. 5G mobile services are just around the corner and will pave the way for many new machine-to-machine applications. Since 5G increases the network bandwidth to mobile devices, it potentially will make it much more attractive to put machine learning at the edge of the network. Coincidentally, we are also beginning to see specialized hardware for intelligence on edge devices.

The good news is that companies are beginning to build foundational technologies (described in Figure 1) that will be essential in a world where the number of machine learning models and AI applications explodes. The program at the Strata Data Conference in London will cover all these areas and more:

In an upcoming survey on the use of AI technologies (report forthcoming), we found that companies consider their inability to maintain a portfolio of use cases to be a major obstacle to AI adoption. At this year’s conference, we have presentations from leading companies detailing how they have successfully deployed data and machine learning technologies in real-world settings:

Continue reading Artificial intelligence and machine learning adoption in European enterprise.

Categories: Technology

Reinforcement learning for the birds

Mon, 2019/02/04 - 08:35

Much like human speech, bird song learning is social; perhaps we'll discover machine learning is social, too.

I just read a fascinating article about an experiment in bird psychology. We've known for a long time that bird songs aren't innate; they're learned. If you listen carefully to your backyard birds in the spring, you can hear the young birds learning their songs; you'll probably hear a few that can't get it right, and that gradually get better as summer progresses.

We've also known that bird songs (as distinct from other bird calls) are strictly a male behavior: they're part of the mating rituals. Getting its song right is an important step in a male bird's education. Female birds don't sing. They stay quiet and choose the mate whose song they like best. (For a fascinating discussion of birds, mating rituals, and aesthetics, see The Evolution of Beauty.)

The common-sense understanding of how birds learn their songs has been that it's simply imitation: the baby bird tries to sing like its father. But it's not that simple. The mother plays a crucial role. She gives tiny signals to her children that show them whether they're getting the song right. It's basically a birdie "thumbs up": fluffing her feathers or twitching a wing to show that she likes the song. We haven't noticed because a bird's mental processing is much faster than ours. We're too slow to see the minute twitches that the mother birds use to signal to their offspring. They only became evident when scientists used high-speed cameras and slowed down the video.

So, my first thought (well, actually, third or fourth thought) was, how does this relate to machine learning? And I realized this is essentially reinforcement learning: the mother is rewarding her offspring as it progresses toward a better song. In this context, "better" presumably means "more attractive to female birds," and the mother trains her male offspring to go forth and find mates. Get yourself good singing lessons, and you're all set.

Neural networks are about imitation. You get your tagged training set, see how well the model can imitate the tags in the training set, and when it's good enough, you try real-world data. That's gotten us a long way, but it has limitations: it requires a lot of training data, and lengthy training through many iterations. Bird learning, like reinforcement learning, is all about rewards, and that's fundamentally different. There's relatively little training data: just a mother that gives cues about whether the child is getting it right. (And even as I write this, I think, just a mother? Even considered purely as data, a mother probably represents more data than the largest training sets imaginable.) Is that terribly surprising? Human babies don't learn to speak on their own; they're constantly getting feedback from their parents.

Now we know that's how birds do it, too. And maybe that's how our machines will do it. Learning is social; perhaps we'll find out that, in the end, machine learning is also social.
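For readers who want to see the difference between imitating labels and learning from rewards, here is a toy sketch in Python. It is not a model of the bird experiment: the song "variants," the approval probabilities, and the names `train_song` and `mother_approves` are all made up for illustration. The learner is a simple epsilon-greedy bandit. It never sees the "correct" song; it only tries variants and keeps a running average of how often each one earns a thumbs up.

```python
import random


def train_song(n_variants=5, preferred=3, steps=2000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner that only ever sees approve/ignore signals."""
    rng = random.Random(seed)
    value = [0.0] * n_variants  # running estimate of approval rate per variant
    count = [0] * n_variants    # how many times each variant has been tried

    def mother_approves(variant):
        # Hypothetical reward signal: the preferred variant usually earns a
        # "thumbs up"; every other variant only occasionally does.
        p = 0.9 if variant == preferred else 0.2
        return 1.0 if rng.random() < p else 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random variant.
            variant = rng.randrange(n_variants)
        else:
            # Exploit: sing the variant with the best estimate so far.
            variant = max(range(n_variants), key=lambda v: value[v])
        reward = mother_approves(variant)
        count[variant] += 1
        # Incremental mean update of the approval estimate.
        value[variant] += (reward - value[variant]) / count[variant]

    return value, count


values, counts = train_song()
best = max(range(len(values)), key=lambda v: values[v])
print("favorite variant:", best)
print("times each variant was sung:", counts)
```

The only information flowing to the learner is the reward, which is the point of the analogy: there is no tagged training set, just a signal about whether the last attempt was any good.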

Continue reading Reinforcement learning for the birds.

Categories: Technology

Four short links: 4 February 2019

Mon, 2019/02/04 - 05:00

Information Theory, Event Sourcing, Sunsetting Software, and Social Perception

  1. A Mini-Introduction To Information Theory -- This article consists of a very short introduction to classical and quantum information theory. Basic properties of the classical Shannon entropy and the quantum von Neumann entropy are described, along with related concepts such as classical and quantum relative entropy, conditional entropy, and mutual information. A few more detailed topics are considered in the quantum case.
  2. Event Sourcing is Hard (Chris Kiehl) -- In practice, this manages to somehow simultaneously be both extremely coupled and yet excruciatingly opaque.
  3. Executing a Sunset (Etsy) -- In this blog post, we will explore how we sunset these products at Etsy. This process involves a host of stakeholders, including marketing, product, customer support, finance, and many other teams, but the focus of this blog post is on engineering and the actual execution of the sunset.
  4. Social Perception for Machines -- a lecture by CMU's Yaser Ajmal Sheikh. In this talk, I will describe our research arc over the past decade at CMU to make human signaling a perceptible channel of information for machines.

Continue reading Four short links: 4 February 2019.

Categories: Technology

Four short links: 1 February 2019

Fri, 2019/02/01 - 04:55

GPU Analytics, 8-Bit Coding, Evil HCI, and CGI for Websockets

  1. AresDB -- Uber’s GPU-powered open source, real-time analytics engine.
  2. 8 Bit Workshop -- Learn how classic game hardware worked. Write code and see it run instantly. In your browser.
  3. CHI4Evil -- In this workshop, we will explore the creative use of HCI methods and concepts such as design fiction or speculative design to help anticipate and reflect on the potential downsides of our technology design, research, and implementation. Call for papers. Channel your inner Black Mirror. (via BoingBoing)
  4. websocketd -- CGI for WebSockets.

Continue reading Four short links: 1 February 2019.

Categories: Technology

Using machine learning and analytics to attract and retain employees

Thu, 2019/01/31 - 07:00

The O’Reilly Data Show Podcast: Maryam Jahanshahi on building tools to help improve efficiency and fairness in how companies recruit.

In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of AI technologies. Many companies are exploring a variety of internal and external programs to train staff on new tools and processes. The other route is to hire new talent. But recent reports suggest that demand for data professionals is strong and competition for experienced talent is fierce. Jahanshahi and her team are building natural language and statistical tools that can help companies improve their ability to attract and retain talent across many key areas.

Continue reading Using machine learning and analytics to attract and retain employees.

Categories: Technology

Four short links: 31 January 2019

Thu, 2019/01/31 - 04:55

Locke the Thinkfluencer, Open Source Semiconductor Manufacturing, AR/VR, and IT's Recycling Shame

  1. Cory Doctorow at Grand Reopening of the Public Domain -- Locke was a thinkfluencer. No transcript yet, but audio ripped on the Internet Archive.
  2. Libre Silicon -- We develop a free and open source semiconductor manufacturing process standard and provide a quick, easy, and inexpensive way for manufacturing. No NDAs will be required anywhere to get started, making it possible to build the designs in your basement if you wish. We are aiming to revolutionize the market by breaking through the monopoly of proprietary closed-source manufacturers.
  3. Predicting Visual Discomfort with Stereo Displays -- In a third experiment, we measured phoria and the zone of clear single binocular vision, which are clinical measurements commonly associated with correcting refractive error. Those measurements predicted susceptibility to discomfort in the first two experiments. A simple predictor of whether and when you're going to puke with an AR/VR headset would be a wonderful thing. Perception of synthetic realities is weird: a friend told me about encountering a bug in a VR renderer that made him immediately (a) fall over, and (b) puke. Core dumped?
  4. A New Circular Vision for Electronics (World Economic Forum) -- getting coverage because it says: Each year, close to 50 million tonnes of electronic and electrical waste (e-waste) are produced, equivalent in weight to all commercial aircraft ever built; only 20% is formally recycled. If nothing is done, the amount of waste will more than double by 2050, to 120 million tonnes annually. [...] That same e-waste represents a huge opportunity. The material value alone is worth $62.5 billion (€55 billion), three times more than the annual output of the world’s silver mines and more than the GDP of most countries. There is 100 times more gold in a tonne of mobile phones than in a tonne of gold ore. (via Slashdot)

Continue reading Four short links: 31 January 2019.

Categories: Technology

Four short links: 30 January 2019

Wed, 2019/01/30 - 11:35

No Code, Enterprise Sales, Deep-Learning the Brain, and Computer Architecture

  1. The Rise of No Code -- As creating things on the internet becomes more accessible, more people will become makers. It’s no longer limited to the >1% of engineers who can code, resulting in an explosion of ideas from all kinds of people. We see “no code” projects on Product Hunt often. This is related to my ongoing interest in Ways In Which Programmers Are Automating Themselves Out of A Job. This might be bad for some low-complexity programmers in the short term, and good for society. Or it might be that the AI Apocalypse is triggered by someone's Glitch bot achieving sentience. Watch this space!
  2. My Losing Battle with Enterprise Sales (Luke Kanies) -- All that discounting you have to do for enterprise clients? It’s because procurement’s bonus is based on how much of a discount they force you to give. Absolutely everyone knows this is how it works, and that everyone knows this, so it’s just a game. I offer my product for a huge price, you try to force a discount, and then at the end we all compare notes to see how we did relative to market. Neither of us really wants to be too far out of spec; I want to keep my average prices the same, and you just want to be sure you aren’t paying too much. Luke tells all.
  3. Decoding Words from Brain Waves -- In each study, electrodes placed directly on the brain recorded neural activity while brain-surgery patients listened to speech or read words out loud. Then, researchers tried to figure out what the patients were hearing or saying. In each case, researchers were able to convert the brain's electrical activity into at least somewhat-intelligible sound files.
  4. A New Golden Age for Computer Architecture (ACM) -- the opportunities for future improvements in speed and energy efficiency will come from (the authors predict): compiler tech and domain-specific architectures. This is a very good overview of how we got here, by way of Moore's Law, Dennard's Law, and Amdahl's Law.

Continue reading Four short links: 30 January 2019.

Categories: Technology
