Feed aggregator

Tools for generating deep neural networks with efficient network architectures

O'Reilly Radar - Thu, 2018/12/06 - 06:05

The O’Reilly Data Show Podcast: Alex Wong on building human-in-the-loop automation solutions for enterprise machine learning.

In this episode of the Data Show, I spoke with Alex Wong, associate professor at the University of Waterloo, and co-founder of DarwinAI, a startup that uses AI to address foundational challenges with deep learning in the enterprise. As the use of machine learning and analytics becomes more widespread, we’re beginning to see tools that enable data scientists and data engineers to scale and tackle many more problems and maintain more systems. This includes automation tools for the many stages involved in data science, including data preparation, feature engineering, model selection, and hyperparameter tuning, as well as tools for data engineering and data operations.

Wong and his collaborators are building solutions for enterprises, including tools for generating efficient neural networks and for the performance analysis of networks deployed to edge devices.

Continue reading Tools for generating deep neural networks with efficient network architectures.

Categories: Technology

Four short links: 6 December 2018

O'Reilly Radar - Thu, 2018/12/06 - 05:15

Public Domain, Optimistic Sci-Fi, C64 Defrag, and Quantum Computing

  1. Re-Opening of the Public Domain (Creative Commons) -- after years of legal extension of copyright terms, 2019 will be the first year in which new materials fall into the American public domain, and Creative Commons is throwing a bash at the Internet Archive.
  2. Better Worlds (The Verge) -- starting on January 14th, we’ll be publishing Better Worlds: 10 original fiction stories, five animated adaptations, and five audio adaptations by a diverse roster of science fiction authors who take a more optimistic view of what lies ahead in ways both large and small, fantastical and everyday. Necessary! I heard a great interview with Tyler Cowen where he said, "you cannot live with pessimism, right? There’s also a notion that more optimism is a partially self-fulfilling prophecy. Believing pessimistic views might make them more likely to come about." It is a fallacy to conflate optimism with naivete.
  3. A Disk Defragmenter for the Commodore 64 -- I don't know what's more insane: watching a great 40x25 homage to the classic Windows defrag progress screen or reading the bonkers BASIC code behind it.
  4. Quantum Computing Progress and Prospects -- an introduction to the field, including the unique characteristics and constraints of the technology, and assesses the feasibility and implications of creating a functional quantum computer capable of addressing real-world problems. This report considers hardware and software requirements, quantum algorithms, drivers of advances in quantum computing and quantum devices, benchmarks associated with relevant use cases, the time and resources required, and how to assess the probability of success. Separate the hype from the reality and develop a sense of the probability of different possible evolutionary paths for the technology.

Continue reading Four short links: 6 December 2018.

Categories: Technology

Distributed systems: A quick and simple definition

O'Reilly Radar - Thu, 2018/12/06 - 04:00

Get a basic understanding of distributed systems and then go deeper with recommended resources.

The technology landscape has evolved into an always-on environment of mobile, social, and cloud applications where programs can be accessed and used across a multitude of devices.

These always-on and always-available expectations are handled by distributed systems, which manage the inevitable fluctuations and failures of complex computing behind the scenes.

“The increasing criticality of these systems means that it is necessary for these online systems to be built for redundancy, fault tolerance, and high availability,” writes Brendan Burns, distinguished engineer at Microsoft, in Designing Distributed Systems. “The confluence of these requirements has led to an order of magnitude increase in the number of distributed systems that need to be built.”

In Distributed Systems in One Lesson, developer relations leader and teacher Tim Berglund says a simple way to think about distributed systems is that they are a collection of independent computers that appears to its user as a single computer.

Virtually all modern software and applications built today are distributed systems of some sort, says Sam Newman, director at Sam Newman & Associates and author of Building Microservices. Even a monolithic application talking to a database is a distributed system, he says, “just a very simple one.”

While those simple systems can technically be considered distributed, when engineers refer to distributed systems they’re typically talking about massively complex systems made up of many moving parts communicating with one another, with all of it appearing to an end-user as a single product, says Nora Jones, a senior software engineer at Netflix.

Think anything from, well, Netflix, to an online store like Amazon, to an instant messaging platform like WhatsApp, to a customer relationship management application like Salesforce, to Google’s search application. These systems require login functionality, user profiles, recommendation engines, personalization, relational databases, object databases, content delivery networks, and numerous other components, all served up cohesively to the user.

Benefits of distributed systems

These days, it’s not so much a question of why a team would use a distributed system, but rather when they should shift in that direction and how distributed the system needs to be, experts say. 

Here are three inflection points—the need for scale, a more reliable system, and a more powerful system—when a technology team might consider using a distributed system.

Horizontal Scalability

Computing processes across a distributed system happen independently from one another, notes Berglund in Distributed Systems in One Lesson. This makes it easy to add nodes and functionality as needed. Distributed systems offer “the ability to massively scale computing power relatively inexpensively, enabling organizations to scale up their businesses to a global level in a way that was not possible even a decade ago,” write Chad Carson, cofounder of Pepperdata, and Sean Suchter, director of Istio at Google, in Effective Multi-Tenant Distributed Systems.


Reliability

Distributed systems create a reliable experience for end users because they rely on “hundreds or thousands of relatively inexpensive computers to communicate with one another and work together, creating the outward appearance of a single, high-powered computer,” write Carson and Suchter. In a single-machine environment, if that machine fails then so too does the entire system. When computation is spread across numerous machines, there can be a failure at one node that doesn’t take the whole system down, writes Cindy Sridharan, distributed systems engineer, in Distributed Systems Observability.
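The point about one failed node not taking the whole system down can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_node`, `call_with_failover`, and the node names are hypothetical stand-ins for real servers and a real client, not anything taken from the books quoted above.

```python
class NodeDown(Exception):
    """Raised when a simulated node cannot serve a request."""


def make_node(name, healthy=True):
    # Hypothetical stand-in for a remote server: returns a callable that
    # answers a request, or raises NodeDown when the node is unhealthy.
    def handle(request):
        if not healthy:
            raise NodeDown(name)
        return f"{name} handled {request}"
    return handle


def call_with_failover(nodes, request):
    # Try each replica in turn, so a single failed node does not fail
    # the request as long as one other replica can still answer.
    failed = []
    for node in nodes:
        try:
            return node(request)
        except NodeDown as exc:
            failed.append(str(exc))
    raise RuntimeError(f"all replicas failed: {failed}")


# node-a is down, but the caller still gets an answer from node-b.
nodes = [make_node("node-a", healthy=False), make_node("node-b"), make_node("node-c")]
result = call_with_failover(nodes, "GET /profile")
print(result)
```

With a single machine in the `nodes` list, the same failure would surface to the user as an error, which is the single-machine case the paragraph describes.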


Performance

In Designing Distributed Systems, Burns notes that a distributed system can handle tasks efficiently because workloads and requests are broken into pieces and spread over multiple computers. This work is completed in parallel and the results are returned and compiled back to a central location.
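The split, parallel work, and recombine pattern described above might look roughly like the following Python sketch. It is an assumption-laden toy: threads on one machine stand in for separate computers, and `split`, `worker`, and `scatter_gather` are hypothetical names chosen for illustration rather than anything from Burns's book.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    # Cut the workload into roughly equal contiguous pieces.
    if not items:
        return []
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def worker(chunk):
    # Stand-in for the work one machine would do on its piece.
    return sum(x * x for x in chunk)


def scatter_gather(items, parts=4):
    # Scatter: hand each piece to a worker running in parallel.
    chunks = split(items, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(worker, chunks))
    # Gather: compile the partial results back in one central place.
    return sum(partials)


total = scatter_gather(list(range(1, 101)))
print(total)  # sum of squares of 1..100, i.e. 338350
```

In a real distributed system the workers would be separate processes or hosts and the pieces would travel over a network, which is where the scheduling and latency concerns discussed later come in.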

The challenges of distributed systems

While the benefits of creating distributed systems can be great for scaling and reliability, distributed systems also introduce complexity when it comes to design, construction, and debugging. Presently, most distributed systems are one-off bespoke solutions, writes Burns in Designing Distributed Systems, making them difficult to troubleshoot when problems do arise.

Here are three of the most common challenges presented by distributed systems.


Scheduling

Because the workloads and jobs in a distributed system do not happen sequentially, there must be prioritization, note Carson and Suchter in Effective Multi-Tenant Distributed Systems:

One of the primary challenges in a distributed system is in scheduling jobs and their component processes. Computing power might be quite large, but it is always finite, and the distributed system must decide which jobs should be scheduled to run where and when, and the relative priority of those jobs. Even sophisticated distributed system schedulers have limitations that can lead to underutilization of cluster hardware, unpredictable job run times, or both.

Take Amazon, for example. Amazon technology teams need to understand which aspects of the online store need to be called upon first to create a smooth user experience. Should the search bar be called before the navigation bar? Think of the many ways both small and large that Amazon makes online shopping as useful as possible for its users.
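To make the idea of finite capacity and relative priority concrete, here is a minimal Python sketch of a priority-based scheduler. The `Scheduler` class, the job names, and the priority numbers are all hypothetical and chosen only for illustration; nothing here reflects how Amazon or any real cluster scheduler actually orders work.

```python
import heapq
import itertools


class Scheduler:
    def __init__(self, slots):
        self.slots = slots                 # finite capacity per scheduling round
        self._queue = []                   # min-heap of (priority, order, name)
        self._order = itertools.count()    # tie-breaker keeps submission order

    def submit(self, name, priority):
        # Lower number means more urgent (an assumption of this sketch).
        heapq.heappush(self._queue, (priority, next(self._order), name))

    def next_round(self):
        # Run as many of the most urgent jobs as the capacity allows.
        batch = []
        while self._queue and len(batch) < self.slots:
            _, _, name = heapq.heappop(self._queue)
            batch.append(name)
        return batch


sched = Scheduler(slots=2)
sched.submit("recommendations", priority=3)
sched.submit("search bar", priority=1)
sched.submit("navigation bar", priority=1)
sched.submit("ad banner", priority=5)

first = sched.next_round()
second = sched.next_round()
print(first)   # ['search bar', 'navigation bar']
print(second)  # ['recommendations', 'ad banner']
```

Even this toy shows the tradeoff the quote mentions: low-priority jobs wait an unpredictable number of rounds whenever more urgent work keeps arriving.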


Latency

With such a complex interchange between hardware computing, software calls, and communication between those pieces over networks, latency can become a problem for users.

“The more widely distributed your system, the more latency between the constituents of your system becomes an issue,” says Newman. “As the volume of calls over the networks increases, the more you’ll start to see transient partitions and potentially have to deal with them.”

Over time, this can lead to technology teams needing to make tradeoffs around availability, consistency, and latency, Newman says.

Performance monitoring and observability

Failure is inevitable, says Nora Jones, when it comes to distributed systems. How a technology team manages and plans for failure so a customer hardly notices it is key. When distributed systems become complex, observability into the technology stack to understand those failures is an enormous challenge.

Carson and Suchter illustrate this challenge in Effective Multi-Tenant Distributed Systems:

Truly useful monitoring for multi-tenant distributed systems must track hardware usage metrics at a sufficient level of granularity for each interesting process on each node. Gathering, processing, and presenting this data for large clusters is a significant challenge, in terms of both systems engineering (to process and store the data efficiently and in a scalable fashion) and the presentation-level logic and math (to present it usefully and accurately). Even for limited, node-level metrics, traditional monitoring systems do not scale well on large clusters of hundreds to thousands of nodes.

There are several approaches companies can use to detect those failure points, such as distributed tracing, chaos engineering, incident reviews, and understanding expectations of upstream and downstream dependencies. “There’s a lot of different tactics to achieve high quality and robustness, and they all fit into the category of having as much insight into the system as possible,” Jones says.

Learn more

Ready to go deeper into distributed systems? Check out these recommended resources from O’Reilly’s editors.

Distributed Systems Observability — Cindy Sridharan provides an overview of monitoring challenges and trade-offs that will help you choose the best observability strategy for your distributed system.

Designing Distributed Systems — Brendan Burns demonstrates how you can adapt existing software design patterns for designing and building reliable distributed applications.

The Distributed Systems Video Collection — This 12-video collection dives into best practices and the future of distributed systems.

Effective Multi-Tenant Distributed Systems — Chad Carson and Sean Suchter outline the performance challenges of running multi-tenant distributed computing environments, especially within a Hadoop context.

Distributed Systems in One Lesson — Using a series of examples taken from a fictional coffee shop business, Tim Berglund helps you explore five key areas of distributed systems.

Chaos Engineering — This report introduces you to Chaos Engineering, a method of experimenting on infrastructure that lets you expose weaknesses before they become problems.

Designing Data-Intensive Applications — Martin Kleppmann examines the pros and cons of various technologies for processing and storing data.

Continue reading Distributed systems: A quick and simple definition.

Categories: Technology

Four short links: 5 December 2018

O'Reilly Radar - Wed, 2018/12/05 - 05:00

NLP for Code, Monolith vs. Modular, Automatic Gender Recognition, and Budget Simulator

  1. code2vec -- a dedicated website for demonstrating the principles shown in the paper code2vec: Learning Distributed Representations of Code. An interesting start to using a productive NLP technique on code.
  2. Monolithic or Modular -- When monolithic adherents look at a modular project, they may think that it’s low quality or abandoned simply because commit count is low and rare, new features are not being added, and the project has no funding or community events. Interestingly, these same properties are what modular adherents will perceive as a good thing, likely to indicate that the module is complete. Monolithic adherents don’t believe a project could ever be “complete.”
  3. The Misgendering Machines: Trans/HCI Implications of Automatic Gender Recognition -- I show that AGR consistently operationalizes gender in a trans-exclusive way, and consequently carries disproportionate risk for trans people subject to it. In addition, I use the dearth of discussion of this in HCI papers that apply AGR to discuss how HCI operationalizes gender and the implications that this has for the field’s research. I conclude with recommendations for alternatives to AGR and some ideas for how HCI can work toward a more effective and trans-inclusive treatment of gender. (via Alvaro Videla)
  4. Occult Defence Agency Budgeting Simulator -- a hilarious exercise whose point is about what happens the year after you cut the budget, with parallels to UK fiscal policy left as an exercise for the (pixie-ravaged) reader. I've long held that simulations are a fantastic way to make a point. (via David Stark)

Continue reading Four short links: 5 December 2018.

Categories: Technology

120+ live online training courses opened for January and February

O'Reilly Radar - Wed, 2018/12/05 - 04:00

Get hands-on training in Kubernetes, machine learning, blockchain, Python, management, and many other topics.

Learn new topics and refine your skills with more than 120 new live online training courses we opened up for January and February on our online learning platform.

Artificial intelligence and machine learning

Getting Started with Chatbot Development with the Microsoft Bot Framework, January 7-8

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, January 7-8

Managed Machine Learning Systems and Internet of Things, January 9-10

Machine Learning in Practice, January 15

Deep Learning Fundamentals, January 17

Practical MQTT for the Internet of Things, January 17-18

Natural Language Processing (NLP) from Scratch, January 22

Getting Started with Machine Learning, January 24

Artificial Intelligence for Robotics, January 24-25

Machine Learning in Python and Jupyter for Beginners, January 30

Protecting Data Privacy in a Machine Learning World, January 31

Artificial Intelligence: Real-World Applications, January 31

Hands-On Chatbots and Conversational UI Development, February 4-5

Building a Deep Learning Model Using TensorFlow, February 7-8

A Practical Introduction to Machine Learning, February 13


Blockchain

Blockchain Applications and Smart Contracts, January 11

Introducing Blockchain, January 22

IBM Blockchain Platform as a Service, January 23-24

Certified Blockchain Solutions Architect (CBSA) Certification Crash Course, January 25

Building Smart Contracts on the Blockchain, January 31-February 1


Business

Spotlight on Innovation: AI Explained with James Cham, December 12

Building the Courage to Take Risks, January 8

Fundamentals of Cognitive Biases, January 14

Negotiation Fundamentals, January 17

Emotional Intelligence in the Workplace, January 22

Writing User Stories, January 23

Adaptive Project Management, January 24

Business Strategy Fundamentals, January 24

Introduction to Time Management Skills, January 25

Having Difficult Conversations, January 28

The Power of Lean in Software Projects: Less Wasted Effort and More Product Results, January 29

Giving a Powerful Presentation, January 30

Tools for the Digital Transformation, January 30-31

Managing Your Manager, January 31

Introduction to Critical Thinking, February 6

How to Give Great Presentations, February 7

Introduction to Strategic Thinking Skills, February 11

Your First 30 Days as a Manager, February 12

Data science and data tools

Apache Hadoop, Spark, and Big Data Foundations, January 15

Python Data Handling - A Deeper Dive, January 22

Practical Data Science with Python, January 22-23

Hands-On Introduction to Apache Hadoop and Spark Programming, January 23-24

Cleaning Data at Scale, January 24

Foundational Data Science with R, January 30-31

Introduction to DAX Using Power BI, February 1

Managing Enterprise Data Strategies with Hadoop, Spark, and Kafka, February 13


Programming

Reactive Spring Boot, January 7

Design Patterns in Java, January 7-8

Spring Boot and Kotlin, January 8

Ground Zero Programming with JavaScript, January 8

SOLID Principles of Object-Oriented and Agile Design, January 11

Fundamentals of Rust, January 14-15

Mastering C++ Game Development, January 14-15

Mastering SELinux, January 15

Java Full Throttle with Paul Deitel: A One-Day, Code-Intensive Java Standard Edition Presentation, January 15

Introduction to Android Application Development with Kotlin, January 17-18

Learn Linux in 3 Hours, January 18

Scala Core Programming: Methods, Classes Traits, January 22

Programming with Java Lambdas and Streams, January 22

Getting Started with Node.js, January 23

Mastering the Basics of Relational SQL Querying, January 23-24

Developing Modern React Patterns, January 24

Getting Started with Spring and Spring Boot, January 24-25

Building Data APIs with GraphQL, January 28

Getting Started with React.js, January 28

Functional Programming in Java, January 28-29

Julia 1.0 Essentials, January 30

Reactive Spring and Spring Boot, January 30

Advanced React.JS, February 6

React Beyond the Basics - Master React's Advanced Concepts, February 7

Advanced SQL Series: Relational Division, February 7

Scala: Beyond the Basics, February 7-8

Basic Android Development, February 7-8

Object Oriented Programming in C# and .NET Core, February 8

Developing Incremental Architecture, February 11-12

Beginning Frontend Development with React, February 11-12

Getting Started with Pandas, February 12

CSS Layout Fundamentals: From Floats to Flexbox and CSS Grid, February 12

Advanced SQL Series: Proximal and Linear Interpolations, February 12

Getting Started with Python 3, February 12-13

Mastering Pandas, February 13

Kotlin for Android, February 14-15

Fundamentals of IoT with JavaScript, February 14-15


Security

Introduction to Ethical Hacking and Penetration Testing, January 8-9

CompTIA Network+ Crash Course, January 16-18

Introduction to Encryption, January 22

AWS Security Fundamentals, January 28

CISSP Crash Course, January 29-30

Professional SQL Server High Availability and Disaster Recovery, January 29-30

CompTIA PenTest+ Crash Course, January 30-31

Security for Machine Learning, February 13

Systems engineering and operations

Hands-On Infrastructure as Code, December 11

Introduction to Kubernetes, January 3-4

Red Hat Certified System Administrator (RHCSA) Crash Course, January 7-10

Creating Serverless APIs with AWS Lambda and API Gateway, January 8

Amazon Web Services (AWS): Up and Running, January 11

Getting Started with OpenShift, January 11

Building a Deployment Pipeline with Jenkins 2, January 14-15

Microservices Architecture and Design, January 16-17

AWS Certified Solutions Architect Associate Crash Course, January 16-17

Google Cloud Platform (GCP) for AWS Professionals, January 18

Red Hat RHEL 8 New Feature, January 22

Rethinking REST: A Hands-On Guide to GraphQL and Queryable APIs, January 22

Docker: Beyond the Basics (CI & CD), January 22-23

Domain-Driven Design and Event-Driven Microservices, January 22-23

Chaos Engineering: Planning, Designing, and Running Automated Chaos Experiments, January 23

Building and Managing Kubernetes Applications, January 24

Continuous Deployment to Kubernetes, January 24-25

API Driven Architecture with Swagger and API Blueprint, January 25

DevOps Toolkit, January 28-29

End-to-End Containerization with Amazon ECS, January 28-30

Ansible in 4 Hours, January 29

CompTIA Cloud+ CV0-002 Exam Prep, January 29

Amazon Web Services: AWS Managed Services, January 29-30

CISSP Certification Practice Questions and Exam Strategies, January 30

AWS Monitoring Strategies, February 4-5

From Developer to Software Architect, February 6-7

Building Applications with Apache Cassandra, February 6-7

Moving from Server-Side to Client-Side with Angular, February 7-8

Docker: Up and Running, February 12-13

Web programming

Modern Web Development with TypeScript and Angular, January 22-23

Continue reading 120+ live online training courses opened for January and February.

Categories: Technology

Survey reveals the opportunities and realities of microservices

O'Reilly Radar - Tue, 2018/12/04 - 07:00

A new report explores how far companies have come with microservices.

Fads come and go in the technology world—anyone remember AJAX? When new, shiny things appear, architects often struggle to determine whether this is merely the latest fad or a genuine future direction.

Microservices are evolving from fad to trend. Several years ago, many companies experimented with microservices but had doubts about the operational complexity and engineering maturity required to achieve success. However, enough companies tamed the dragons to realize real benefits, making this architectural style the prevailing trend in many industries for both new application development and the migration target for many existing systems.

The O'Reilly Software Architecture Conference tracks microservices, and we periodically check in with practitioners to see how they’re being implemented in the real world. O’Reilly conducted a survey on microservices maturity in July 2018 that aimed to assess how far companies have come with microservices, what challenges they face, and some common best practices. The 866 responses were summarized and analyzed in our free report, The State of Microservices Maturity.

Insights from the report include:

  • Containers continue to rise in popularity for microservices: 69% of survey respondents use containers for microservices deployment.
  • Although Kubernetes enjoys great popularity in the press and at conferences, adoption is still below the 40% mark for our survey respondents.
  • More than 50% of respondents use continuous deployment, which speaks to overall engineering maturity in the industry.
  • 86% of respondents rate their microservices efforts at least partially successful.

For the full survey findings and analysis, download The State of Microservices Maturity.

Continue reading Survey reveals the opportunities and realities of microservices.

Categories: Technology

Four short links: 4 December 2018

O'Reilly Radar - Tue, 2018/12/04 - 04:35

Voice Technology, AI Summaries, Time Tracker, and Homomorphic Encryption

  1. Fifteen Unconventional Uses of Voice Technology (Nicole He) -- Students had half a semester to learn tools like the Web Speech API, Dialogflow, and Actions on Google, and then were tasked with making something...interesting. The in-class code examples we used are on GitHub. Here are 15 funny, subversive, and impressively weird final projects from the class.
  2. Summary of 2018's Most Important AI Papers -- To help you catch up, we’ve summarized 10 important AI research papers from 2018 to give you a broad overview of machine learning advancements this year. There are many more breakthrough papers worth reading as well, but we think this is a good list for you to start with.
  3. arbtt -- a time tracker that sits in the background. You write rules that tell it how to categorize your activity.
  4. Microsoft Simple Encrypted Arithmetic Library -- an easy-to-use but powerful homomorphic encryption library written in C++. It supports both the BFV and the CKKS encryption schemes. (via Microsoft Research Blog)

Continue reading Four short links: 4 December 2018.

Categories: Technology

Four short links: 3 December 2018

O'Reilly Radar - Mon, 2018/12/03 - 04:45

Amazon and OSS, Audio to Keystrokes, The New OS, and Software Sprawl

  1. Amazon is Competing with Its Customers -- What's more, Kreps said, Amazon has not contributed a single line of code to the Apache Kafka open source software and is not reselling Confluent's cloud tool. Sometimes Amazon contributes back, but increasingly often it seems like its software MO is exploitation not co-creation. This is what prompted the creation of various "open except if you resell it as a cloud service"-source licenses, like the Commons Clause.
  2. kbd-audio -- tools for capturing and analyzing keyboard input paired with microphone capture.
  3. Kubernetes is the OS That Matters (Matt Asay) -- provocative clickbait title, but the point is important: if single-machine apps are the exception, then the lowest layer of critical shared software is no longer the OS but instead the cluster manager.
  4. Software Sprawl, The Golden Path, and Scaling Teams with Agency (Charity Majors) -- good talk on how to recover from "we're using too many shiny tools, and it's hard to make progress because there's no common set of tools, so everyone's reinventing the wheel, and omg fire."

Continue reading Four short links: 3 December 2018.

Categories: Technology

Four short links: 30 November 2018

O'Reilly Radar - Fri, 2018/11/30 - 05:00

Advents are Coming, Open Source, Restricted Exports, and Misinformation Operations

  1. QEMU Advent Calendar -- An amazing QEMU disk image every day! It's that time of year again! See also Advent of Code.
  2. De Facto Closed Source -- You want to download thousands of lines of useful, but random, code from the internet, for free, run it in a production web server, or worse, your user’s machine, trust it with your paying users’ data and reap that sweet dough. We all do. But then you can’t be bothered to check the license, understand the software you are running, and still want to blame the people who make your business a possibility when mistakes happen, while giving them nothing for it? This is both incompetence and entitlement.
  3. U.S. Government Wonders What to Limit Exports Of -- The representative general categories of technology for which Commerce currently seeks to determine whether there are specific emerging technologies that are essential to the national security of the United States include: (1) Biotechnology, such as: (i) Nanobiology; (ii) Synthetic biology; (iv) Genomic and genetic engineering; or (v) Neurotech. (2) Artificial intelligence (AI) and machine learning technology, such as: (i) Neural networks and deep learning (e.g., brain modeling, time series prediction, classification); (ii) Evolution and genetic computation (e.g., genetic algorithms, genetic programming); (iii) Reinforcement learning; (iv) Computer vision (e.g., object recognition, image understanding); (v) Expert systems (e.g., decision support systems, teaching systems); (vi) Speech and audio processing (e.g., speech recognition and production); (vii) Natural language processing (e.g., machine translation); (viii) Planning (e.g., scheduling, game playing); (ix) Audio and video manipulation technologies (e.g., voice cloning, deepfakes); (x) AI cloud technologies; or (xi) AI chipsets. (3) Position, Navigation, and Timing (PNT) technology. (4) Microprocessor technology, such as: (i) Systems-on-Chip (SoC); or (ii) Stacked Memory on Chip. (5) Advanced computing technology, such as: (i) Memory-centric logic. (6) Data analytics technology, such as: (i) Visualization; (ii) Automated analysis algorithms; or (iii) Context-aware computing. (7) Quantum information and sensing technology, such as (i) Quantum computing; (ii) Quantum encryption; or (iii) Quantum sensing. (8) Logistics technology, such as: (i) Mobile electric power; (ii) Modeling and simulation; (iii) Total asset visibility; or (iv) Distribution-based Logistics Systems (DBLS). 
(9) Additive manufacturing (e.g., 3D printing); (10) Robotics such as: (i) Micro-drone and micro-robotic systems; (ii) Swarming technology; (iii) Self-assembling robots; (iv) Molecular robotics; (v) Robot compliers; or (vi) Smart Dust. (11) Brain-computer interfaces, such as (i) Neural-controlled interfaces; (ii) Mind-machine interfaces; (iii) Direct neural interfaces; or (iv) Brain-machine interfaces. (12) Hypersonics, such as: (i) Flight control algorithms; (ii) Propulsion technologies; (iii) Thermal protection systems; or (iv) Specialized materials (for structures, sensors, etc.). (13) Advanced Materials, such as: (i) Adaptive camouflage; (ii) Functional textiles (e.g., advanced fiber and fabric technology); or (iii) Biomaterials. (14) Advanced surveillance technologies, such as: Faceprint and voiceprint technologies. It's a great list of what's in the next Gartner Hype Cycle report.
  4. The Digital Maginot Line (Renee DiResta) -- We know this is coming, and yet we’re doing very little to get ahead of it. No one is responsible for getting ahead of it. [...] platforms aren’t incentivized to engage in the profoundly complex arms race against the worst actors when they can simply point to transparency reports showing that they caught a fair number of the mediocre actors. [...] The regulators, meanwhile, have to avoid the temptation of quick wins on meaningless tactical bills (like the Bot Law) and wrestle instead with the longer-term problems of incentivizing the platforms to take on the worst offenders (oversight), and of developing a modern-day information operations doctrine.

Continue reading Four short links: 30 November 2018.

Categories: Technology

Four short links: 29 November 2018

O'Reilly Radar - Thu, 2018/11/29 - 05:00

Security Sci-Fi, AWS Toys, Quantum Ledger, and Insecurity in Software in Hardware

  1. The Cliff Nest -- sci-fi story with computer security challenges built in.
  2. Amazon Textract -- OCR in the cloud, extracting not just text but also structured tables. Part of a big feature dump Amazon's done today, including recommendations, AWS on-prem, and a fully managed time series database.
  3. Quantum Ledger Database -- a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. Amazon QLDB tracks each and every application data change and maintains a complete and verifiable history of changes over time. Many of the advantages of a blockchain ledger without the distributed pains. Quantum in the sense of "minimum chunk of something," not "uses quantum computing."
  4. Sennheiser Headset Software Enabled MITM Attacks -- When users have been installing Sennheiser's HeadSetup software, little did they know the software was also installing a root certificate into the Trusted Root CA Certificate store. To make matters worse, the software was also installing an encrypted version of the certificate's private key that was not as secure as the developers may have thought. This is the price of using software to improve hardware.
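The "cryptographically verifiable transaction log" mentioned in item 3 can be illustrated with a simple hash chain in Python. This is a generic sketch of the idea only: it is not QLDB's actual implementation or API, and `entry_hash`, `append`, and `verify` are hypothetical helper names.

```python
import hashlib
import json


GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, data):
    # Each entry's hash covers its own data plus the previous entry's hash.
    payload = json.dumps({"prev": prev_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger, data):
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"data": data, "prev": prev_hash,
                   "hash": entry_hash(prev_hash, data)})


def verify(ledger):
    # Recompute every hash; any edit to earlier data breaks the chain.
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["data"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "alice", "balance": 10})
append(ledger, {"account": "alice", "balance": 7})
ok_before = verify(ledger)

ledger[0]["data"]["balance"] = 1000  # tamper with history
ok_after = verify(ledger)
print(ok_before, ok_after)  # True False
```

Because a single trusted owner appends to the chain, no consensus protocol between nodes is needed, which matches the "without the distributed pains" remark.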

Continue reading Four short links: 29 November 2018.

Categories: Technology

Four short links: 28 November 2018

O'Reilly Radar - Wed, 2018/11/28 - 05:00

FaaS, Space as a Service, Bot Yourself, and Facebook's RL Platform

  1. Firecracker -- Amazon's open source virtualization technology that is purpose-built for creating and managing secure, multitenant containers and functions-based services. Docker but for FaaS platforms. Best explanation: Firecracker is solving the problem of multitenant container density while maintaining the security boundary of a VM. If you’re entirely running first-party trusted workloads and are satisfied with them all sharing a single kernel and using Linux security features like cgroups, selinux, and seccomp, then Firecracker may not be the best answer. If you’re running workloads from customers similar to Lambda, desire stronger isolation than those technologies provide, or want defense in depth, then Firecracker makes a lot of sense. It can also make sense if you need to run a mix of different Linux kernel versions for your containers and don’t want to spend a whole bare-metal host on each one.
  2. Amazon Ground Station: Ingest and Process Data from Orbiting Satellites -- a sign that space is becoming more mainstream. Also interesting because they're doing a bunch of processing in EC2 rather than at the basestation. General-purpose computers often beat specialized ones.
  3. Me Bot -- A simple tool to make a bot that speaks like you, simply learning from your WhatsApp Chats. (via Hacker News)
  4. Horizon -- FB open sources reinforcement learning platform for large-scale products and services, built on PyTorch.

Continue reading Four short links: 28 November 2018.

Categories: Technology

Four short links: 27 November 2018

O'Reilly Radar - Tue, 2018/11/27 - 04:55

Open Source, Interactive Fiction, Evolving Images, and Closed Worlds

  1. Open Source is Not About You (Rich Hickey) -- As a user of something open source, you are not thereby entitled to anything at all. You are not entitled to contribute. You are not entitled to features. You are not entitled to the attention of others. You are not entitled to having value attached to your complaints. You are not entitled to this explanation. Tough love talk. See also this statement by the author of the event-stream NPM module, who passed maintenance onto someone who added malware to it. If it's not fun anymore, you get literally nothing from maintaining a popular package.
  2. Ganbreeder -- explore images created by generative adversarial networks.
  3. 2018 IFComp Winners -- interactive fiction is nextgen chatbot tech. Worth keeping up with to see how they stretch parsers and defy expectations of the genre.
  4. The Architecture of Closed Worlds (We Make Money Not Art) -- One of the most striking lessons of the book is that it is extremely difficult to create a miniaturized world without inheriting some of the problems of the surrounding world. No matter how much control was exerted on the synthetic habitats, no matter how ambitious the vision, the breadth of engineering and human ingeniosity, the results were marred by surprisingly mundane obstacles: gerbils outsmarting the machine, bacteria loss, fingernails and skin infiltrating collectors, or simply the difficulty of implementing behavioural changes. The physical version of online social networks that are shocked to discover their userbase includes pedophiles, racists, stalkers, murderers, nutters, and malicious folks.

Continue reading Four short links: 27 November 2018.

Categories: Technology

Four short links: 26 November 2018

O'Reilly Radar - Mon, 2018/11/26 - 05:35

Graphics Engine, Graph Library, Docker Tool, and Probabilistic Cognition

  1. Heaps -- a mature cross-platform graphics engine designed for high-performance games. It is designed to leverage modern GPUs that are commonly available on both desktop and mobile devices. 2D and 3D game framework, built on the Haxe language and toolkit.
  2. VivaGraphJS -- graph manipulation and rendering in JavaScript, designed to be extensible and to support different rendering engines and layout algorithms.
  3. dive -- tool for exploring each layer in a docker image.
  4. Probabilistic Models of Cognition -- This book explores the probabilistic approach to cognitive science, which models learning and reasoning as inference in complex probabilistic models. We examine how a broad range of empirical phenomena, including intuitive physics, concept learning, causal reasoning, social cognition, and language understanding, can be modeled using probabilistic programs (using the WebPPL language).

Continue reading Four short links: 26 November 2018.

Categories: Technology

Four short links: 23 November 2018

O'Reilly Radar - Fri, 2018/11/23 - 06:20

Chinese iPhone Users, Sci-Fi UI, MITM Framework, and HTTP/3

  1. Chinese iPhone Users are Poor -- The Shanghai-based firm also found that most iPhone users are unmarried females aged between 18 and 34, who graduated with just a high school certificate and earn a monthly income of below 3,000 yuan (HK$3,800). They are perceived to be part of a group known as the “invisible poor”—those who do not look as poor as their financial circumstances.
  2. eDEX-UI -- a fullscreen desktop application resembling a sci-fi computer interface, heavily inspired from DEX-UI and the TRON Legacy movie effects. It runs the shell of your choice in a real terminal and displays live information about your system. It was made to be used on large touchscreens but will work nicely on a regular desktop computer or perhaps a tablet PC or one of those funky 360° laptops with touchscreens.
  3. evilginx2 -- a man-in-the-middle attack framework used for phishing login credentials along with session cookies, which in turn allows one to bypass 2-factor authentication protection.
  4. Some Notes About HTTP/3 (Errata Security) -- QUIC is really more of a new version of TCP (TCP/2???) than a new version of HTTP (HTTP/3). It doesn't really change what HTTP/2 does so much as change how the transport works. Therefore, my comments below are focused on transport issues rather than HTTP issues.

Continue reading Four short links: 23 November 2018.

Categories: Technology

Four short links: 22 November 2018

O'Reilly Radar - Thu, 2018/11/22 - 06:50

XOXO Talks, Git Illustrated, Post-REST Services, and Learning Projects

  1. XOXO 2018 Videos -- playlist of talks from XOXO 2018. (via BoingBoing)
  2. Learn Git Branching -- visual!
  3. Post-REST (Tim Bray) -- musings on what might replace REST in different parts of the current world of web services.
  4. Projects -- list of practical projects that anyone can solve in any programming language, divided into categories according to what the project will exercise your knowledge of—e.g., Files, Data Structures, Threading, etc. Good for teachers looking for ideas.

Continue reading Four short links: 22 November 2018.

Categories: Technology

Building tools for enterprise data science

O'Reilly Radar - Wed, 2018/11/21 - 07:00

The O’Reilly Data Show Podcast: Vitaly Gordon on the rise of automation tools in data science.

In this episode of the Data Show, I spoke with Vitaly Gordon, VP of data science and engineering at Salesforce. As the use of machine learning becomes more widespread, we need tools that will allow data scientists to scale so they can tackle many more problems and help many more people. We need automation tools for the many stages involved in data science, including data preparation, feature engineering, model selection and hyperparameter tuning, as well as monitoring.

I wanted the perspective of someone who is already faced with having to support many models in production. The proliferation of models is still a theoretical consideration for many data science teams, but Gordon and his colleagues at Salesforce already support hundreds of thousands of customers who need custom models built on custom data. They recently took their learnings public and open sourced TransmogrifAI, a library for automated machine learning for structured data, which sits on top of Apache Spark.

Continue reading Building tools for enterprise data science.

Categories: Technology

Four short links: 21 November 2018

O'Reilly Radar - Wed, 2018/11/21 - 05:55

Black Mirror, Innovation Toolkits, Code-Generator for APIs, and Hardware Effects

  1. Black Mirror Brainstorms (Aaron Lewis) -- In light of the latest FB scandal, here's my proposal for replacing Design Sprints: "Black Mirror Brainstorms." A workshop in which you create a Black Mirror episode. The plot must revolve around misuse of your team's product. See Casey Fiesler's Black Mirror, Light Mirror, which I've linked to before on 4SL.
  2. Toolkit Navigator -- A compendium of toolkits for public sector innovation and transformation, curated by OPSI and our partners around the world.
  3. Conjure -- Palantir's open source simple but opinionated toolchain for defining APIs once and generating client/server interfaces in multiple languages. For more, read the blog post.
  4. Hardware Effects -- this repository demonstrates various hardware effects that can degrade application performance in surprising ways and that may be very hard to explain without knowledge of the low-level CPU and OS architecture. For each effect I try to create a proof of concept program that is as small as possible so it can be understood easily. How full stack ARE you?

Continue reading Four short links: 21 November 2018.

Categories: Technology

Four short links: 20 November 2018

O'Reilly Radar - Tue, 2018/11/20 - 05:15

East African ML Needs, Autonomy Corrections, Information Security, and UIs from Doodles

  1. Some Requests for Machine Learning Research from the East African Tech Scene -- Based on 46 in-depth interviews [...] a list of concrete machine learning research problems, progress on which would directly benefit tech ventures in East Africa. Example: Priors for autocorrect and low-literacy SMS use: SMS text contains many language misuses due to a combination of autocorrection and low literacy. E.g., “poultry farmer” becoming “poetry farmer.” Such mistakes are bound to occur in any written language corpus, but engineers working with rural populations in East Africa report that this is a prevalent issue for them, confounding the use of pretrained language models. This problem also exists to some degree in voice data with respect to English spoken in different accents. Priors over autocorrect substitution rules, or custom, per-dialect confusion matrices between phonetically similar words could potentially help. Expect much more work like this as AI/ML moves into non-WEIRD (Western Educated Industrialized Rich Democratic) nations.
  2. How the Media Gets Tesla Wrong -- a reminder that our convenient shorthand and once-over-lightly reading of the news gives a false and rosy picture of what's possible.
  3. Why Information Security is Hard: An Economic Perspective -- fascinating arguments! I particularly like the statistical argument: a lone attacker might find 10 bugs a year, a well-prepared defender might find 1,000 bugs a year, but if there are 100,000 available bugs for exploitation, then there's very low probability that the defender found and patched the same bugs that the attacker found...
  4. DoodleMaster -- sketches->UI via a CNN, a proof-of-concept.
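The statistical argument in item 3 can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that attacker and defender each find a uniformly random subset of the 100,000 bugs, independently of each other; real bug discovery is not uniform, so treat the numbers as a rough guide. The function names are hypothetical.

```python
from math import comb


def prob_all_attacker_bugs_patched(total: int, defender: int, attacker: int) -> float:
    """P(every bug the attacker found is among the defender's patched bugs)."""
    # Count defender subsets that contain the attacker's whole set.
    return comb(total - attacker, defender - attacker) / comb(total, defender)


def prob_no_attacker_bug_patched(total: int, defender: int, attacker: int) -> float:
    """P(none of the attacker's bugs were found and patched by the defender)."""
    # Count attacker subsets drawn entirely from the unpatched bugs.
    return comb(total - defender, attacker) / comb(total, attacker)


TOTAL, DEFENDER, ATTACKER = 100_000, 1_000, 10

# Expected number of attacker bugs the defender also found.
expected_overlap = ATTACKER * DEFENDER / TOTAL

print(expected_overlap)                                            # 0.1
print(prob_no_attacker_bug_patched(TOTAL, DEFENDER, ATTACKER))     # about 0.90
print(prob_all_attacker_bugs_patched(TOTAL, DEFENDER, ATTACKER))   # on the order of 1e-20
```

Under these assumptions the defender, despite finding 100 times as many bugs, leaves all ten of the attacker's bugs untouched roughly nine times out of ten.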

Continue reading Four short links: 20 November 2018.

Categories: Technology

