
O'Reilly Radar

All of our Ideas and Learning material from all of our topics.

Four short links: 16 November 2018

Fri, 2018/11/16 - 05:00

Illuminated Paper, Software Forge, Leak Checklist, and PC on ESP

  1. IllumiPaper -- illuminated elements built into regular paper, with implementation.
  2. sr.ht -- (pronounced "sir hat") a software forge like GitHub or GitLab, but with interesting strengths (e.g., very lightweight pages, and the CI system).
  3. Leak Mitigation Checklist -- If you just leaked sensitive information in public source code, read this document as part of your emergency procedure.
  4. Emulating an IBM PC on an ESP8266 -- an 8086 PC-XT emulation with 640K RAM, 80×25 CGA composite video, and a 1.44MB MS-DOS disk on an ESP12E without additional components. (via Alasdair Allan)

Continue reading Four short links: 16 November 2018.

Categories: Technology

Four short links: 15 November 2018

Thu, 2018/11/15 - 04:50

Punish Online Criminals, Fake Fingerprints, Implementing Identity, and Project Visbug

  1. USA Needs to Pursue Malicious Cyber Actors -- a report that argues that the United States currently lacks a comprehensive overarching strategic approach to identify, stop, and punish cyberattackers. (1) There is a burgeoning cybercrime wave. (2) There is a stunning cyber enforcement gap. (3) There is no comprehensive U.S. cyber enforcement strategy aimed at the human attacker. This is definitely a golden age of online crime.
  2. DeepMasterPrints: Generating MasterPrints for Dictionary Attacks via Latent Variable Evolution -- MasterPrints are real or synthetic fingerprints that can fortuitously match with a large number of fingerprints, thereby undermining the security afforded by fingerprint systems. Previous work by Roy et al. generated synthetic MasterPrints at the feature level. In this work, we generate complete image-level MasterPrints known as DeepMasterPrints, whose attack accuracy is found to be much superior to that of previous methods. (via Mikko Hypponen)
  3. The Tripartite Identity Pattern (Randy Farmer) -- The three components of user identity are: the account identifier, the login identifier, and the public identifier.
  4. Project VisBug -- edit/tweak existing webpages.

Continue reading Four short links: 15 November 2018.

Categories: Technology

Four short links: 14 November 2018

Wed, 2018/11/14 - 05:45

ML Risk, IGF Session, Feature Engineering, and Solving Snake

  1. Managing Risk in Machine Learning Projects (Ben Lorica) -- Considerations for a world where ML models are becoming mission critical.
  2. Transcripts of 2018 IGF -- Internet Governance Forum session transcripts.
  3. Featuretools -- open source Python framework for automated feature engineering.
  4. Solving Snake -- fun exploration of different algorithms you might use to play the Snake game.

Continue reading Four short links: 14 November 2018.

Categories: Technology

Four short links: 13 November 2018

Tue, 2018/11/13 - 05:05

Ways of Working, Too-Smart AI, Wi-Fi Vision, and Materials Science AI

  1. Internet-Era Ways of Working -- an elegant brief summary of how we do software in 2018, from Tom Loosemore's public.digital team.
  2. Examples of AI Gaming the System -- a list of examples of AIs learning more than was intended. Neural nets evolved to classify edible and poisonous mushrooms, took advantage of the data being presented in alternating order, and didn't actually learn any features of the input images. (via BoingBoing)
  3. Using Wi-Fi to “See” Behind Closed Doors is Easier than Anyone Thought (MIT TR) -- if all you are interested in is the movement of people. Humans also reflect and distort this Wi-Fi light. The distortion, and the way it moves, would be clearly visible through Wi-Fi eyes, even though the other details would be smeared. This crazy Wi-Fi vision would clearly reveal whether anybody was behind a wall and, if so, whether the person was moving. That’s the basis of Zhu and co’s Wi-Fi-based peeping tom. It looks for changes in an ordinary Wi-Fi signal that reveal the presence of humans.
  4. Learning Process-Structure-Property Relations -- clever research project that mines research literature to learn relationships about the physical properties and processes in materials science, then automatically generates a diagram for the particular constraints your project has. Code released as open source.

Continue reading Four short links: 13 November 2018.

Categories: Technology

Managing risk in machine learning

Tue, 2018/11/13 - 05:00

Considerations for a world where ML models are becoming mission critical.

In this post, I share slides and notes from a keynote I gave at the Strata Data Conference in New York last September. As the data community begins to deploy more machine learning (ML) models, I wanted to review some important considerations.

Let’s begin by looking at the state of adoption. We recently conducted a survey which garnered more than 11,000 respondents—our main goal was to ascertain how enterprises were using machine learning. One of the things we learned was that many companies are still in the early stages of deploying machine learning (ML):

As for what's holding companies back, a survey we conducted earlier this year found that companies cited a lack of skilled people, a "skills gap," as the main challenge to adoption.

Interest on the part of companies means the demand side for “machine learning talent” is healthy. Developers have taken notice and are beginning to learn about ML. In our own online training platform (which has more than 2.1 million users), we’re finding strong interest in machine learning topics. Below are the top search topics on our training platform:

Beyond “search,” note that we’re seeing strong growth in consumption of content related to ML across all formats—books, posts, video, and training.

Before I continue, it's important to emphasize that machine learning is much more than building models. You need to have the culture, processes, and infrastructure in place before you can deploy many models into products and services. At the recent Strata Data conference, we had a series of talks on these cultural, organizational, and engineering topics. Here's a list of a few clusters of relevant sessions:

Over the last 12-18 months, companies that use a lot of ML and employ teams of data scientists have been describing their internal data science platforms (see, for example, Uber, Netflix, Twitter, and Facebook). They share some of the features I list below, including support for multiple ML libraries and frameworks, notebooks, scheduling, and collaboration. Some go further, adding advanced capabilities such as a way for data scientists to share features used in ML models, tools that can automatically search through potential models, and even model deployment:

As you get beyond prototyping and you actually begin to deploy ML models, there are many challenges that will arise as those models begin to interact with real users or devices. David Talby summarized some of these key challenges in a recent post:

  • Your models may start degrading in accuracy
  • Models will need to be customized (for specific locations, cultural settings, domains, and applications)
  • Real modeling begins once in production

There are also many important considerations that go beyond optimizing a statistical or quantitative metric. For instance, there are certain areas—such as credit scoring or health care—that require a model to be explainable. In certain application domains (including autonomous vehicles or medical applications), safety and error estimates are paramount. As we deploy ML in many real-world contexts, optimizing statistical or business metrics alone will not suffice. The data science community has been increasingly engaged in two topics I want to cover in the rest of this post: privacy and fairness in machine learning.

Privacy and security

Given the growing attention to data privacy from users and regulators, there is a lot of interest in tools that let you build ML models while protecting data privacy. These tools rely on a handful of building blocks, and we are beginning to see working systems that combine several of them. Some of these tools are open source and are becoming available for use by the broader data community:

  • Federated learning is useful when you want to collaborate and build a centralized model without sharing private data. It’s used in production at Google, but we still need tools to make federated learning broadly accessible.
  • We’re starting to see tools that allow you to build models while guaranteeing differential privacy, one of the most popular and powerful definitions of privacy. At a high level, these methods inject random noise at different stages of the model-building process (a minimal sketch of the idea follows this list). These emerging tools aim to be accessible to data scientists who are already using libraries such as scikit-learn and TensorFlow. The hope is that data scientists will soon be able to routinely build differentially private models.
  • There’s a small and growing number of researchers and entrepreneurs who are investigating whether we can build or use machine learning models on encrypted data. This past year, we’ve seen open source libraries (HElib and Palisade) for fast homomorphic encryption, and we have startups that are building machine learning tools and services on top of those libraries. The main bottleneck here is speed: many researchers are actively investigating hardware and software tools that can speed up model inference (and perhaps even model building) on encrypted data.
  • Secure multi-party computation is another promising class of techniques used in this area.
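To make the noise-injection idea above concrete, here is a minimal sketch of the classic Laplace mechanism applied to a simple count query. The data, the predicate, and the epsilon value are purely illustrative; production tools for differentially private ML typically add calibrated noise inside the training loop (for example, to clipped gradients) rather than to a single released statistic.

import numpy as np

def dp_count(values, predicate, epsilon=0.5, rng=None):
    """Differentially private count via the Laplace mechanism.

    A single record changes a count by at most 1, so the L1 sensitivity
    is 1 and the noise scale is sensitivity / epsilon.
    """
    rng = rng or np.random.default_rng()
    true_count = sum(1 for v in values if predicate(v))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Illustrative (synthetic) data: ages of survey respondents.
ages = [23, 35, 41, 29, 52, 61, 38, 27]
print(dp_count(ages, lambda a: a >= 40, epsilon=0.5))

Smaller epsilon values add more noise and give a stronger privacy guarantee; the trade-off between accuracy and privacy is the central design choice in all of these tools.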
Fairness

Now let’s consider fairness. Over the last couple of years, many ML researchers and practitioners have started investigating and developing tools that can help ensure ML models are fair and just. Just the other day, I searched Google for recent news stories about AI, and I was surprised by the number of articles that touch on fairness.

For the rest of this section, let’s assume one is building a classifier and that certain variables are considered “protected attributes” (this can include things like age, ethnicity, gender, ...). It turns out that the ML research community has used numerous mathematical criteria to define what it means for a classifier to be fair. Fortunately, a recent survey paper from Stanford—A Critical Review of Fair Machine Learning—simplifies these criteria and groups them into the following types of measures:

  • Anti-classification means the omission of protected attributes and their proxies from the model or classifier.
  • Classification parity means that one or more of the standard performance measures (e.g., false positive and false negative rates, precision, recall) are the same across groups defined by the protected attributes (a short sketch of this check appears after this list).
  • Calibration: If an algorithm produces a “score,” that “score” should mean the same thing for different groups.
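As a concrete illustration of the classification parity measure above, the short sketch below compares false positive rates across groups defined by a protected attribute. The labels, predictions, and group assignments are made up for illustration; in practice you would compute these rates on a held-out set, and for whichever error measures matter in your domain.

import numpy as np

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), computed on 0/1 arrays."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    negatives = y_true == 0
    if negatives.sum() == 0:
        return float("nan")
    return float((y_pred[negatives] == 1).mean())

def parity_gap(y_true, y_pred, groups):
    """Per-group FPRs and the largest gap between any two groups."""
    rates = {}
    for g in set(groups):
        mask = np.asarray(groups) == g
        rates[g] = false_positive_rate(np.asarray(y_true)[mask],
                                       np.asarray(y_pred)[mask])
    return rates, max(rates.values()) - min(rates.values())

# Illustrative labels, predictions, and protected-group membership.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_pred = [0, 1, 1, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(parity_gap(y_true, y_pred, groups))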

However, as the authors from Stanford point out in their paper, each of the mathematical formulations described above suffers from limitations. With respect to fairness, there is no black box or checklist of procedures you can run your algorithm through to give it a clean bill of health. There is no one-size-fits-all procedure.

Because there’s no ironclad procedure, you will need a team of humans-in-the-loop. Notions of fairness are not only domain and context sensitive, but as researchers from UC Berkeley recently pointed out, there is a temporal dimension as well (“We advocate for a view toward long-term outcomes in the discussion of ‘fair’ machine learning”). What is needed are data scientists who can interrogate the data and understand the underlying distributions, working alongside domain experts who can evaluate models holistically.

Culture and organization

As we deploy more models, it’s becoming clear that we will need to think beyond optimizing statistical and business metrics. While I haven’t touched on them during this short post, it’s clear that reliability and safety are going to be extremely important moving forward. How do you build and organize your team in a world where ML models have to take many other important factors into consideration?

Fortunately there are members of our data community who have been thinking about these problems. The Future of Privacy Forum and Immuta recently released a report with some great suggestions on how one might approach machine learning projects with risk management in mind:

  • When you’re working on a machine learning project, you need to employ a mix of data engineers, data scientists, and domain experts.
  • One important change outlined in the report is the need for a set of data scientists who are independent from this model-building team. This team of “validators” can then be tasked with evaluating the ML model on things like explainability, privacy, and fairness.
Closing remarks

So, what skills will be needed in a world where ML models are becoming mission critical? As noted above, fairness audits will require a mix of data and domain experts. In fact, a recent analysis of job postings from NBER found that compared with other data analysis skills, machine learning skills tend to be bundled with domain knowledge.

But you’ll also need to supplement your data and domain experts with legal and security experts. Moving forward, we’ll need to have legal, compliance, and security people working more closely with data scientists and data engineers.

This shouldn’t come as a shock: we already invest in desktop security, web security, and mobile security. If machine learning is going to eat software, we will need to grapple with AI and ML security, too.

Related content:

Continue reading Managing risk in machine learning.

Categories: Technology

Four short links: 12 November 2018

Mon, 2018/11/12 - 06:15

Gov Open Source, Bruce Sterling, Robot Science, and Illustrated TLS 1.3

  1. FDA MyStudies App -- open source from government, designed to facilitate the input of real-world data directly by patients which can be linked to electronic health data supporting traditional clinical trials, pragmatic trials, observational studies, and registries.
  2. Bruce Sterling Interview -- on architecture, design, science fiction, futurism, and involuntary parks. (via Cory Doctorow)
  3. Inventing New Materials with AI (MIT TR) -- using machine learning to generate hypotheses for new materials, to be explored and tested by actual humans.
  4. The New Illustrated TLS Connection -- Every byte explained and reproduced. A revised edition in which we dissect the new manner of secure and authenticated data exchange, the TLS 1.3 cryptographic protocol.

Continue reading Four short links: 12 November 2018.

Categories: Technology

Four short links: 9 November 2018

Fri, 2018/11/09 - 06:00

Counting Computers, New Software, Unix History, and Tencent Framework

  1. How Many Computers Are In Your Computer? -- So, a desktop or smartphone can reasonably be expected to have anywhere from 15 to several thousand computers in the sense of a Turing-complete device which can be programmed and which is computationally powerful enough to run many programs from throughout computing history and which can be exploited by an adversary for surveillance, exfiltration, or attacks against the rest of the system. Which is why security folks sometimes sleep poorly at night.
  2. Some Notes on Running New Software in Production (Julia Evans) -- The playbook for understanding the software you run in production is pretty simple. Here it is: (1) Start using it in production in a non-critical capacity (by sending a small percentage of traffic to it, on a less critical service, etc); (2) Let that bake for a few weeks. (3) Run into problems. (4) Fix the problems. Go to step 3.
  3. Unix History (Rob Pike) -- know your past.
  4. Omi -- Tencent's next-generation web framework in 4KB JavaScript (Web Components + JSX + Proxy + Store + Path Updating).

Continue reading Four short links: 9 November 2018.

Categories: Technology

Lessons learned while helping enterprises adopt machine learning

Thu, 2018/11/08 - 05:20

The O’Reilly Data Show Podcast: Francesca Lazzeri and Jaya Mathew on digital transformation, culture and organization, and the team data science process.

In this episode of the Data Show, I spoke with Francesca Lazzeri, an AI and machine learning scientist at Microsoft, and her colleague Jaya Mathew, a senior data scientist at Microsoft. We conducted a couple of surveys this year—“How Companies Are Putting AI to Work Through Deep Learning” and “The State of Machine Learning Adoption in the Enterprise”—and we found that while many companies are still in the early stages of machine learning adoption, there’s considerable interest in moving forward with projects in the near future. Lazzeri and Mathew spend a considerable amount of time interacting with companies that are beginning to use machine learning and have experiences that span many different industries and applications. I wanted to learn some of the processes and tools they use when they assist companies in beginning their machine learning journeys.

Continue reading Lessons learned while helping enterprises adopt machine learning.

Categories: Technology

Four short links: 8 November 2018

Thu, 2018/11/08 - 04:55

Approximate Graph Pattern Mining, Ephemeral Containers, SaaS Metrics, and Edge Neural Networks

  1. ASAP: Fast, Approximate, Graph Pattern Mining at Scale (Usenix) -- we present A Swift Approximate Pattern-miner (ASAP), a system that enables both fast and scalable pattern mining. ASAP is motivated by one key observation: in many pattern mining tasks, it is often not necessary to output the exact answer [...] an approximate count is good enough. (via Morning Paper)
  2. Binci -- tackling the same problem space as Docker Compose, but aimed at ephemeral containers rather than long-running ones (e.g., for test/CI systems).
  3. Metrics for Investors (Andrew Chen) -- detailed take on the metrics through which investors view SaaS businesses.
  4. How to Fit Large Neural Networks on the Edge -- This blog explores a few techniques that can be used to fit neural networks in memory-constrained settings. Different techniques are used for the “training” and “inference” stages, and hence they are discussed separately.

Continue reading Four short links: 8 November 2018.

Categories: Technology

Four short links: 7 November 2018

Wed, 2018/11/07 - 05:10

Summarizing Text, Knowledge Database, AI Park, and Approximate Regexes

  1. Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting -- Inspired by how humans summarize long documents, we propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively (i.e., compresses and paraphrases) to generate a concise overall summary. We use a novel sentence-level policy gradient method to bridge the non-differentiable computation between these two neural networks in a hierarchical way, while maintaining language fluency. Source code available.
  2. KBPedia -- a comprehensive knowledge structure for promoting data interoperability and knowledge-based artificial intelligence, [which] combines seven "core" public knowledge bases—Wikipedia, Wikidata, schema.org, DBpedia, GeoNames, OpenCyc, and UMBEL—into an integrated whole. Now has a serious open source offering.
  3. Baidu Opens AI Park in Beijing -- autonomous buses, smart walkways that track people's steps using facial recognition, intelligent pavilions equipped with the company's conversational DuerOS system, and augmented reality Tai Chi lessons. It's theatre, but theatre sets perceptions. In this case, the perception that China is miles ahead of America in AI. It was the AR Tai Chi that caught my eye.
  4. TRE: A Regex Engine with Approximate Matching -- It supports approximate matching by calculating the Levenshtein distance (the number of insertions, deletions, or substitutions it would take to make the strings equal) as it searches for a match; a minimal illustration of that distance computation appears after this list.
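For readers who want to see the edit-distance computation that approximate matchers such as TRE build on, here is a minimal dynamic-programming Levenshtein distance in Python. TRE itself is a C library with a more sophisticated bounded-cost search; this sketch only illustrates the metric.

def levenshtein(a: str, b: str) -> int:
    """Minimum insertions, deletions, or substitutions to turn a into b
    (classic dynamic programming, O(len(a) * len(b)))."""
    prev = list(range(len(b) + 1))           # distance from "" to b[:j]
    for i, ca in enumerate(a, start=1):
        curr = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                 # delete from a
                curr[j - 1] + 1,             # insert into a
                prev[j - 1] + (ca != cb),    # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3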

Continue reading Four short links: 7 November 2018.

Categories: Technology

140 live online training courses opened for November, December, and January

Wed, 2018/11/07 - 04:00

Get hands-on training in deep learning, Python, Kubernetes, blockchain, security, and many other topics.

Learn new topics and refine your skills with 140 live online training courses we opened up for November, December, and January on our learning platform.

Artificial intelligence and machine learning

Artificial Intelligence for Big Data, November 28-29

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, December 3

Deep Learning for Machine Vision, December 4

Beginning Machine Learning with Scikit-Learn, December 5

Managed Machine Learning Systems and Internet of Things, December 5-6

Natural Language Processing (NLP) from Scratch, December 7

Machine Learning in Practice, December 7

Deep Learning with TensorFlow, December 12

Getting Started with Machine Learning, December 12

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, January 7-8

Artificial Intelligence: AI for Business, January 9

Managed Machine Learning Systems and Internet of Things, January 9-10

Applied Deep Learning for Coders with Apache MXNet, January 10-11

Artificial Intelligence: An Overview of AI and Machine Learning, January 15

Hands-On Machine Learning with Python: Classification and Regression, January 16

Hands-On Machine Learning with Python: Clustering, Dimension Reduction, and Time Series Analysis, January 17

Blockchain

Building Smart Contracts on the Blockchain, November 29-30

Introducing Blockchain, December 7

Understanding Hyperledger Fabric Blockchain, December 10-11

Blockchain Applications and Smart Contracts: Developing Decentralized, December 13

Business

Spotlight on Innovation: The Future Beyond Digital, Entering a New Era of Exploration and Collaboration, November 28

Negotiation Fundamentals, December 7

Applying Critical Thinking, December 10

How to Give Great Presentations, December 10

Performance Goals for Growth, December 12

Leadership Communication Skills for Managers, January 9

Introduction to Critical Thinking, January 10

Introduction to Delegation Skills, January 10

Why Smart Leaders Fail, January 15

Data science and data tools

Real-Time Data Foundations: Kafka, December 3

Real-Time Data Foundations: Spark, December 4

Getting Started with Pandas, December 5

Getting Started with Python 3, December 5-6

Mastering Pandas, December 6

Real-Time Data Foundations: Flink, December 7

Apache Hadoop, Spark, and Big Data Foundations, December 10

Real-Time Data Foundations: Time Series Architectures, December 10

Sentiment Analysis for Chatbots in Python, December 11

Hands-on Introduction to Apache Hadoop and Spark Programming, December 12-13

Building Intelligent Bots in Python, December 13

Intermediate Machine Learning with Scikit-Learn, December 17

Design

3ds Max and V-Ray: The Path Towards Photorealism, December 14

Programming

Java Full Throttle with Paul Deitel: A One-Day, Code-Intensive Java Standard Edition Presentation, November 15

Next-Generation Java Testing with JUnit 5, November 15

Mastering the Basics of Relational SQL Querying, November 19-20

Designing Bots and Conversational Apps for Work, November 29

Pythonic Object-Oriented Programming, December 3

Bash Shell Scripting in 3 Hours, December 3

Beyond Python Scripts: Logging, Modules, and Dependency Management, December 5

Linux Filesystem Administration, December 5-6

Beyond Python Scripts: Exceptions, Error Handling, and Command-Line Interfaces, December 6

Next Level Git - Master your Workflow, December 6

Programming with Java 8 Lambdas and Streams, December 6

Consumer Driven Contracts - A Hands-On Guide to Spring Cloud Contract, December 10

SQL for Any IT Professional, December 10

Linux Under the Hood, December 10

Linux Troubleshooting, December 11

Scalable Concurrency with the Java Executor Framework, December 11

Next Level Git - Master your Content, December 13

Linux Performance Optimization, December 13

Mastering Go for UNIX Administrators, UNIX Developers, and Web Developers, December 13-14

Getting Started with Java: From Core Concepts to Real Code in 4 Hours, December 17

Reactive Spring Boot, December 17

Scala Fundamentals: From Core Concepts to Real Code in 5 Hours, December 18

Programming with Data: Python and Pandas, December 18

Spring Boot and Kotlin, December 18

Julia 1.0 Essentials, December 18

Functional Design for Java 8, December 18-19

Java 8 Generics in 3 Hours, December 20

Python: The Next Level, January 7-8

Design Patterns Boot Camp, January 9-10

Learning Python 3 by Example, January 10

Modern JavaScript, January 14

Learn the Basics of Scala, January 14

Getting Started with Pandas, January 14

Introduction to JavaScript Programming, January 14-15

Getting Started with Python 3, January 14-15

Mastering Pandas, January 15

Scaling Python with Generators, January 15

Getting Started with Pytest, January 16

OCA Java SE 8 Programmer Certification Crash Course, January 16-18

Mastering Python's Pytest, January 17

Pythonic Design Patterns, January 18

Visualization in Python with Matplotlib, January 18

Security

Cybersecurity Offensive and Defensive Techniques in 3 Hours, December 7

Cyber Security Fundamentals, December 10-11

Certified Ethical Hacker (CEH) Crash Course, December 13-14

Intense Introduction to Hacking Web Applications, December 17

CCNA Security Crash Course, December 18-19

CompTIA PenTest+ Crash Course, December 18-19

CompTIA Security+ SY0-501 Crash Course, January 7-8

AWS Certified Security - Specialty Crash Course, January 7-8

AWS Advanced Security with Config, GuardDuty, and Macie, January 14

Ethical Hacking Bootcamp with Hands-on Labs, January 15-17

CompTIA Security+ SY0-501 Certification Practice Questions and Exam Strategies, January 16

Cyber Ops SECFND 210-250 Crash Course, January 16

CCNA Cyber Ops SECOPS 210-255 Crash Course, January 18

Software architecture

Developing Incremental Architecture, December 10

Implementing Evolutionary Architectures, December 13-14

Architecture for Continuous Delivery, December 17

Architecture by Example, December 17-18

Comparing Service-Based Architectures, December 18

Software Architecture for Developers, January 7

Systems engineering and operations

Automating with Ansible, December 3

An Introduction to DevOps with AWS, December 3

Red Hat Certified Engineer (RHCE) Crash Course, December 4-7

9 Steps to Awesome with Kubernetes, December 5

Ansible for Managing Network Devices, December 5

Amazon Web Services: AWS Managed Services, December 5-6

Network Troubleshooting Using the Half Split and OODA, December 6

Google Cloud Certified Associate Cloud Engineer Crash Course, December 6-7

Getting Started with Continuous Delivery (CD), December 10

AWS Monitoring Strategies, December 10

Practical Docker, December 11

Getting Started with Amazon Web Services (AWS), December 11-12

Amazon Web Services: Architect Associate Certification - AWS Core Architecture Concepts, December 11-12

CCNP R/S ROUTE (300-101) Crash Course, December 11-13

Ansible in 3 Hours, December 12

Amazon Web Services: AWS Design Fundamentals, December 13-14

Deploying Container-Based Microservices on AWS, December 13-14

Kubernetes in 3 Hours, December 14

Jenkins 2: Up and Running, December 17

CompTIA Cloud+ CV0-002 Exam Prep, December 17

CCNP R/S SWITCH (300-115) Crash Course, December 17-19

Google Cloud Platform (GCP) for AWS Professionals, December 18

AWS CloudFormation Deep Dive, January 7-8

Red Hat Certified System Administrator (RHCSA) Crash Course, January 7-10

Building a Cloud Roadmap, January 9

Implementing and Troubleshooting TCP/IP, January 9

Docker: Up and Running, January 9-10

Building Distributed Pipelines for Data Science Using Kafka, Spark, and Cassandra, January 9-11

Understanding AWS Cloud Compute Options, January 10-11

Istio on Kubernetes: Enter the Service Mesh, January 16

AWS Certified SysOps Administrator (Associate) Crash Course, January 16-17

Chaos Engineering: Planning and Running Your First Game Day, January 17

Visualizing Software Architecture with the C4 Model, January 18

Web programming

Hands-On Chatbot and Conversational UI Development, December 3-4

Building APIs with Django REST Framework, December 17

Developing Web Apps with Angular and TypeScript, December 17-19

Rethinking REST: A Hands-On Guide to GraphQL and Queryable APIs, December 18

Continue reading 140 live online training courses opened for November, December, and January.

Categories: Technology

Kubernetes' scheduling magic revealed

Tue, 2018/11/06 - 07:40

Understanding how the Kubernetes scheduler makes scheduling decisions is critical to ensure consistent performance and optimal resource utilization.

Kubernetes is an industry-changing technology that allows massive scale and simplicity for the orchestration of containers. Most of us happily push thousands of deployments and pods to Kubernetes every day. Have you ever wondered what sorcery is at play in Kubernetes to determine where all those pods will be created in the Kubernetes cluster? All of this is made possible by the kube-scheduler.

Understanding how the Kubernetes scheduler makes scheduling decisions is critical to ensuring consistent performance and optimal resource utilization. All scheduling in Kubernetes is based on a few key pieces of information. First, the scheduler uses information about the worker node to determine the node's total capacity. Running kubectl describe node <node> gives you everything you need to understand how the scheduler sees the world.

Capacity:
  cpu:                4
  ephemeral-storage:  103079200Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16427940Ki
  pods:               110
Allocatable:
  cpu:                3600m
  ephemeral-storage:  98127962034
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             14932524020
  pods:               110

Here we see what the scheduler sees as being the total capacity of the worker node as well as the allocatable capacity. The allocatable numbers factor in kubelet settings for Kubernetes and system reserved space. Allocatable represents the total space the scheduler has to work with for a given node.
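If you would rather read these numbers programmatically than from kubectl describe, a minimal sketch using the official Kubernetes Python client might look like the following. It assumes a working kubeconfig; the capacity and allocatable fields come straight from the Node status API.

from kubernetes import client, config

config.load_kube_config()            # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    name = node.metadata.name
    cap = node.status.capacity       # e.g., {"cpu": "4", "memory": "16427940Ki", ...}
    alloc = node.status.allocatable  # what the scheduler can actually hand out
    print(f"{name}: capacity cpu={cap['cpu']} mem={cap['memory']} | "
          f"allocatable cpu={alloc['cpu']} mem={alloc['memory']}")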

Next, we need to look at how we instruct the scheduler about our workload. It is important to note that Kubernetes does not consider actual CPU and memory utilization of a workload. It factors in only the resource descriptions provided by the developer or operator. Here is an example from a pod object definition:

resources:
  limits:
    cpu:    100m
    memory: 170Mi
  requests:
    cpu:    100m
    memory: 170Mi

These are the specifications provided at the container level. The developer must provide these resource requests and limits on a per-container basis, not per pod. What do these specifications mean? The limits are considered only by the kubelet and are not a factor during scheduling. In this example, the container's cgroup will be set to limit CPU utilization to 10% of a single CPU core, and if memory utilization exceeds 170MB, the process will be killed and restarted; there is no “soft” memory limit in Kubernetes' use of cgroups. The requests are used by the scheduler to determine the best worker node on which to place this workload. Note that the scheduler sums the resource requests of all containers in the pod to determine where to place it, while the kubelet enforces limits on a per-container basis.

We now have enough information to understand the basic resource-based scheduling logic that Kubernetes uses. When a new pod is created, the scheduler looks at the total resource requests of the pod and then attempts to find the worker node that has the most available resources. This is tracked by the scheduler for each node, as seen in kubectl describe node:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits   Memory Requests   Memory Limits
  ------------  ----------   ---------------   -------------
  1333m (37%)   2138m (59%)  1033593344 (6%)   1514539264 (10%)

You can investigate the exact details of the Kubernetes scheduler via the source code. There are two key passes in scheduling. On the first pass, the scheduler filters the set of nodes down to those capable of running a given pod, based on resource requests and other scheduling requirements. On the second pass, the scheduler weights the eligible nodes based on absolute and relative resource utilization of the nodes and other factors. The highest-weighted eligible node is selected to run the pod.
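The real kube-scheduler applies many predicates and priority functions, but the resource-based core described above can be approximated in a few lines. The sketch below is deliberately simplified: the quantity parsing and scoring rule are illustrative, not the scheduler's actual algorithm, and the node data is made up to resemble the kubectl output shown earlier.

def parse_cpu(q: str) -> float:
    """'100m' -> 0.1 cores, '2' -> 2.0 cores (simplified)."""
    return float(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_mem(q: str) -> int:
    """'170Mi' -> bytes (only Ki/Mi/Gi handled in this sketch)."""
    units = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}
    for suffix, mult in units.items():
        if q.endswith(suffix):
            return int(float(q[:-2]) * mult)
    return int(q)

def pod_requests(pod):
    """Scheduling works on the sum of all containers' requests."""
    cpu = sum(parse_cpu(c["requests"]["cpu"]) for c in pod["containers"])
    mem = sum(parse_mem(c["requests"]["memory"]) for c in pod["containers"])
    return cpu, mem

def schedule(pod, nodes):
    cpu_req, mem_req = pod_requests(pod)
    # Pass 1: filter to nodes with enough unreserved, allocatable room.
    feasible = [n for n in nodes
                if n["alloc_cpu"] - n["req_cpu"] >= cpu_req
                and n["alloc_mem"] - n["req_mem"] >= mem_req]
    if not feasible:
        return None  # the pod stays Pending
    # Pass 2: prefer the node left with the largest fraction of free resources.
    def score(n):
        return ((n["alloc_cpu"] - n["req_cpu"] - cpu_req) / n["alloc_cpu"]
                + (n["alloc_mem"] - n["req_mem"] - mem_req) / n["alloc_mem"])
    return max(feasible, key=score)["name"]

pod = {"containers": [{"requests": {"cpu": "100m", "memory": "170Mi"}}]}
nodes = [
    {"name": "node-a", "alloc_cpu": 3.6, "alloc_mem": 14932524020,
     "req_cpu": 1.333, "req_mem": 1033593344},
    {"name": "node-b", "alloc_cpu": 3.6, "alloc_mem": 14932524020,
     "req_cpu": 3.2, "req_mem": 12 * 2**30},
]
print(schedule(pod, nodes))  # node-a, since it has more room left over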

This post is part of a collaboration between O'Reilly and IBM. See our statement of editorial independence.

Continue reading Kubernetes' scheduling magic revealed.

Categories: Technology

Four short links: 6 November 2018

Tue, 2018/11/06 - 05:50

People Don't Change, Open Access, Event Database, and Apple Maps

  1. People Don't Change -- interesting and entertaining talk to remind you that modern people with their selfies and mobile phone obsessions aren't new special creatures unlike the people of the past. The first half is non-technical similarities, and the second half kicks into how the same human drives behind our tech obsessions can be found (with different tech) in the past. (via Daniel Siegel)
  2. Bill and Melinda Gates Foundation Endorses European Open-Access Plan (Nature) -- the Wellcome Trust, which funds over a billion pounds of research each year, will only permit publication in subscription journals if there's simultaneous release in PubMed Central. The Gates Foundation, which is already strongly pro-OA, is bringing its requirements in line with the new European Plan S. (via Slashdot)
  3. EventStore -- open source, functional database with complex event processing in JavaScript.
  4. Apple's New Maps -- fantastically detailed write-up of the new Apple Maps, coverage, visuals, omissions.

Continue reading Four short links: 6 November 2018.

Categories: Technology

Four short links: 5 November 2018

Mon, 2018/11/05 - 05:00

Probabilistic Model Checker, Notebooks to Docs, AWS 12-Factor Apps, and AI Physicist

  1. Stormchecker -- A modern model checker for probabilistic systems. Test your models of your distributed system.
  2. MonoCorpus -- a note-taking app for software and machine learning engineers meant to encourage learning, sharing, and easier development. Increase documentation for yourself and your team without slowing your velocity. Take notes as part of your process instead of dedicating time to writing them. An interesting use for notebooks.
  3. Odin -- Deploy your 12-factor-applications to AWS easily and securely with the Odin, an AWS Step Function based on the step framework that deploys services as auto-scaling groups (ASGs).
  4. Toward an AI Physicist for Unsupervised Learning -- We investigate opportunities and challenges for improving unsupervised machine learning using four common strategies with a long history in physics: divide-and-conquer, Occam's Razor, unification, and lifelong learning. Instead of using one model to learn everything, we propose a novel paradigm centered around the learning and manipulation of *theories*, which parsimoniously predict both aspects of the future (from past observations) and the domain in which these predictions are accurate. (see also MIT TR)

Continue reading Four short links: 5 November 2018.

Categories: Technology

What changes when we go offline-first?

Fri, 2018/11/02 - 13:00

Martin Kleppmann shows how recent computer science research is helping develop the abstractions and APIs for the next generation of applications.

Continue reading What changes when we go offline-first?.

Categories: Technology

The freedom of Kubernetes

Fri, 2018/11/02 - 13:00

Kris Nova looks at the new era of the cloud native space and the kernel that has made it all possible: Kubernetes.

Continue reading The freedom of Kubernetes.

Categories: Technology

The misinformation age

Fri, 2018/11/02 - 13:00

Jane Adams examines the ways data-driven recruiting fails to achieve intended results and perpetuates discriminatory hiring practices.

Continue reading The misinformation age.

Categories: Technology

Learning from the web of life

Fri, 2018/11/02 - 13:00

Claire Janisch looks at some of the best biomimicry opportunities inspired by nature’s software and wetware.

Continue reading Learning from the web of life.

Categories: Technology

Four short links: 2 November 2018

Fri, 2018/11/02 - 05:25

Colorizing Photos, Evolving Space Invaders, Is It Too Late?, and Decision-Making

  1. DeOldify -- Deep learning-based project for colorizing and restoring old images. Impressive, and open source.
  2. InvaderZ -- Space invaders, but the invaders evolve with a genetic algorithm.
  3. The Best Way to Predict the Future is to Create It. But Is It Already Too Late? -- Alan Kay lecture. If we've done things with technology that got us in a bit of a pickle, doing things with technology will probably only make that worse. When Alan Kay speaks, I listen.
  4. Farsighted -- new book by Steven Johnson, on powerful tools for honing the important skill of complex decision-making. Shades of Algorithms to Live By, but Johnson is a good writer and a good thinker, so this promises to be much more.

Continue reading Four short links: 2 November 2018.

Categories: Technology

Kubernetes: Good or evil? The ethics of data centers

Thu, 2018/11/01 - 10:00

Anne Currie says excessive and dirty energy use in data centers is one of the biggest ethical issues facing the tech industry.

Continue reading Kubernetes: Good or evil? The ethics of data centers.

Categories: Technology
