Feed aggregator

Four short links: 11 September 2018

O'Reilly Radar - Tue, 2018/09/11 - 05:10

Serverless, Predicting Personality, Broken Design, and Hamming Lectures

  1. Serverless Cold Start War -- hard numbers on the cold start time on different function-as-a-service providers.
  2. Eye Movements During Everyday Behavior Predict Personality Traits -- Using a state-of-the-art machine learning method and a rich set of features encoding different eye movement characteristics, we were able to reliably predict four of the big five personality traits (neuroticism, extraversion, agreeableness, conscientiousness) as well as perceptual curiosity, only from eye movements.
  3. Broken Product Design (We Make Money Not Art) -- Not only did he ask them to fabricate items that would be unusable but he also requested that each worker had full license to decide what the error, flaw, and glitch in the final product would be. Hutchison ended up with a collection of dysfunctional objects and prints of online exchanges with baffled factory managers.
  4. Learning to Learn (Richard Hamming) -- watch lectures in computer architecture, engineering, data, measurement, and quantum mechanics from a legend. (via Star Simpson)

Continue reading Four short links: 11 September 2018.

Categories: Technology

The ethics of data flow

O'Reilly Radar - Tue, 2018/09/11 - 04:00

If we’re going to think about the ethics of data and how it’s used, then we have to take into account how data flows.

Data, even “big data,” doesn’t stay in the same place: it wants to move. There’s a long history of language about moving data: we have had dataflow architectures, there's a great blog on visualization titled FlowingData, and Amazon Web Services has a service for moving data by the (literal) truckload. Although the scale and speed at which data moves have changed over the years, we’ve recognized the importance of flowing data ever since the earliest years of computing. If we’re going to think about the ethics of data and how it’s used, then, we can’t just think about the content of the data, or even its scale: we have to take into account how data flows.

In Privacy in Context, Helen Nissenbaum connects data’s mobility to privacy and ethics. For Nissenbaum, the important issue isn’t what data should be private or public, but how data and information flow: what happens to your data, and how it is used. Information flows are central to our expectations of privacy, and respecting those expectations is at the heart of data ethics. We give up our data all the time. It’s next to impossible to live in modern society without giving up data: we use credit cards to pay for groceries, we make reservations at restaurants, we fill prescriptions at pharmacies. And we usually have some sort of expectation that our data will be used. But those expectations include expectations about how the data will be used: who will have access to it, for what reason, and for what purposes.

Problems arise when those expectations are violated. As Nissenbaum writes, "What people care most about is not simply restricting the flow of information but ensuring that it flows appropriately." The infamous Target case, in which Target outed a pregnant teenager by sending ad circulars to her home, is a great example. We all buy things, and when we buy things, we know that data is used—to send bills and to manage inventory, if nothing else. In this case, the surprise was that Target used this customer's purchase history to identify her as pregnant, and send circulars advertising products for pregnant women and new mothers to her house. The problem isn't the collection of data, or even its use; the problem is that the advertising comes from, and produces, a different and unexpected data flow. The data that’s flowing isn’t just the feed to the marketing contractor. That ad circular, pushed into a mailbox (and read by the girl’s father) is another data flow, and one that’s not expected. To be even more precise: the problem isn’t even putting an ad circular in a mailbox, but that this data flow isn’t well defined. Once the circular goes in the mailbox, anyone can read it.

Facebook’s ongoing problems with the Cambridge Analytica case aren’t problems of data theft or intrusion; they’re problems of unexpected data flows. Customers who played the game "This is Your Digital Life" didn’t expect their data to be used in political marketing—to say nothing of their friends’ data, which was exposed even if they didn’t play. Facebook asked Cambridge Analytica to delete the data back in 2015, but apparently did nothing to determine whether the data was actually deleted, or shared further. Once data has started flowing, it is very difficult to stop it.

Data flows can be very complex. danah boyd, in the second chapter of It’s Complicated: The Social Lives of Networked Teens, describes the multiple contexts that teenagers use on social media, and their strategies for communicating within their groups in a public medium: in particular, their use of coded messages that are designed to be misunderstood by parents or others not in their group. They are creating strategies to control information flows that appear to be out of their control. Teens can’t prevent parents from seeing their Facebook feeds, but they can use a coded language to prevent their parents from understanding what they’re really saying.

Everyone who works with data knows that data becomes much more powerful when it is combined with data from other sources. Data that seems innocuous, like a grocery store purchase history, can be combined with geographic data, medical data, and other kinds of data to characterize users and their behavior with great precision. Knowing whether a person purchases cigarettes can be of great interest to an insurance company, as can knowing whether a cardiac patient is buying bacon. Increasing the police presence in some neighborhood areas inevitably leads to more arrests in those neighborhoods, creating the appearance of more crime. Data flows have complex topologies: multiple inputs, outputs, and feedback loops. The question isn’t just where your data goes and how it will be shared; it’s also what incoming data will be mixed with your data.

Nissenbaum argues that we shouldn’t be asking about absolute notions of what data should or shouldn’t be “private,” but about where the data can travel, our expectations about that travel, and what happens when data reaches its destination. That makes a lot of intuitive sense. A pharmacy or a grocery store collects a lot of data just to do business: again, it has to do billing, it has to manage stock. It has some control over how that data is remixed, shared, and commoditized. But it doesn't have control over how its partners ultimately use the data. It might be able to control what mailers its advertising agencies send out—but who's going to raise a red flag about an innocent circular advertising baby products? It can't control what an insurance company, or even a government agency, might do with that data: deny medical benefits? Send a social worker? In many cases, consumers won't even know that their privacy has been violated, let alone how or why; they'll just know that something has happened.

As developers, how can we understand and manage data flows according to our users' expectations? That's a complex question, in part because our desires and expectations as both users and developers are different from our users’, and we can’t assume that users understand how their data might be put to work. Furthermore, enumerating and evaluating all possible flows, together with the consequences of those flows, is certainly NP-hard.

But we can start asking the difficult questions, recognizing that we’re neither omniscient nor infallible. The problem facing us isn’t that mistakes will be made, because they certainly will; the problem is that more mistakes will be made, and more damage will be done, if we don’t start taking responsibility for data flows. What might that responsibility mean?

Principles for ethical data handling (and human experimentation in general) always stress "informed consent"; Nissenbaum’s discussion about context suggests that informed consent is less about usage than about data flow. The right question isn't, "can our partners make you offers about products you may be interested in?" but, "may we share your purchase data with other businesses?" (If so, what businesses?) Or perhaps, “may we combine your purchase data with other demographic data to predict your future purchases?” (If so, what other demographic data?)
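The idea that consent attaches to a flow rather than to a vague "usage" can be sketched in a few lines. This is only an illustration, not a description of any real system: the `ConsentRecord` class, the `may_flow` method, and the recipient and purpose strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

# A flow is identified by who receives the data and what they use it for.
Flow = Tuple[str, str]  # (recipient, purpose)


@dataclass
class ConsentRecord:
    """Hypothetical per-customer record of the flows they agreed to."""
    customer_id: str
    allowed_flows: Set[Flow] = field(default_factory=set)

    def grant(self, recipient: str, purpose: str) -> None:
        # Record consent for one specific recipient and purpose.
        self.allowed_flows.add((recipient, purpose))

    def may_flow(self, recipient: str, purpose: str) -> bool:
        # Anything not explicitly granted is refused.
        return (recipient, purpose) in self.allowed_flows


consent = ConsentRecord("customer-1")
consent.grant("billing", "send invoices")

billing_ok = consent.may_flow("billing", "send invoices")
marketing_ok = consent.may_flow("marketing-partner", "predict purchases")
print(billing_ok, marketing_ok)
```

The design choice worth noticing is the default: a flow nobody asked about is denied, so a new partner or a new purpose forces a new question to the customer.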

One way to prevent unexpected data flows is to delete the data before it has a chance to go anywhere. Deleted data is hard to abuse. A decade ago, data developers were saying "Save everything. Storage is cheap." We now understand that's naive. If data is collected for a purpose, it might be necessary to delete it when it has served its purpose—for example, most libraries delete records of the books a user has checked out after the books have been returned. Deleted data can’t be stolen, inadvertently shared, or demanded by a legal warrant. “Save everything” invites troublesome data flows.
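The library example can be sketched as a purge of records that have served their purpose. This is a minimal illustration under stated assumptions: the `Loan` record and the `purge_returned` helper are hypothetical, and a real system would also have to deal with backups and with copies already shared.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Loan:
    """Hypothetical record linking a patron to a borrowed book."""
    patron_id: str
    book_id: str
    returned_on: Optional[date] = None  # None while the book is still out


def purge_returned(loans: List[Loan]) -> List[Loan]:
    """Keep only open loans; a returned loan has served its purpose."""
    return [loan for loan in loans if loan.returned_on is None]


loans = [
    Loan("p1", "b1", returned_on=date(2018, 9, 1)),
    Loan("p1", "b2"),
    Loan("p2", "b3", returned_on=date(2018, 9, 5)),
]
remaining = purge_returned(loans)
print([(loan.patron_id, loan.book_id) for loan in remaining])
```

Only the live list is cleaned here. Nothing in this sketch reaches a backup or a partner's copy, which is where deletion gets hard in practice.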

But data deletion is easier said than done. The difficulty, as Facebook found out with Cambridge Analytica, is that asking someone to delete data doesn’t mean they will actually delete it. It isn’t easy to prove that data has been deleted; we don’t have auditing tools that are up to the task. In many cases, it’s not even clear what “deletion” means: does it mean that the data is removed from backups? Backups from which data is removed after-the-fact aren’t really backups; can they be trusted to restore the system to a known state? Reliable backups are an important (and infrequently discussed) part of ethical data handling, but they are also a path through which data can escape and continue to flow in the wild.

And deletion doesn’t always work in the users’ favor. Deleting data prematurely makes it difficult for a customer to appeal a decision; redress assumes we can reconstruct what happened to find an appropriate solution. Historically, it’s almost certainly true that more data has been deleted to preserve entrenched power than to preserve individual privacy. The ability to "undelete" is powerful, and shouldn't be underestimated. Data should be deleted as soon as it’s no longer needed, but no sooner—and determining when data really is no longer needed isn’t a trivial problem.

These aren’t problems to be solved in a short article. However, they are problems that we in the data community need to recognize and face. They won’t go away; they will become more serious and urgent as time goes on. How does data flow? What dams and levees can we create that will prevent data from flowing in unexpected or unwanted ways? And once we create those levees, what will happen when they break? That will inevitably be one of the most important stories of the next year.

Continue reading The ethics of data flow.

Categories: Technology

Four short links: 10 September 2018

O'Reilly Radar - Mon, 2018/09/10 - 04:25

Optoelectronics, Checked C, MagicScroll, Quantum AWS

  1. The Largest Cognitive Systems Will be Optoelectronic -- Electrons and photons offer complementary strengths for information processing. Photons are excellent for communication, while electrons are superior for computation and memory. Cognition requires distributed computation to be communicated across the system for information integration. We present reasoning from neuroscience, network theory, and device physics supporting the conjecture that large-scale cognitive systems will benefit from electronic devices performing synaptic, dendritic, and neuronal information processing operating in conjunction with photonic communication.
  2. Checked C -- This paper presents Checked C, an extension to C designed to support spatial safety, implemented in Clang and LLVM. Checked C’s design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code. Like past approaches to a safer C, Checked C employs a form of checked pointer whose accesses can be statically or dynamically verified. Performance evaluation on a set of standard benchmark programs shows overheads to be relatively low. More interestingly, Checked C introduces the notions of a checked region and bounds-safe interfaces. Here's the source.
  3. MagicScroll: A Rollable Display Device with Flexible Screen Real Estate and Gestural Input -- a rollable tablet with two concatenated flexible multitouch displays, actuated scrollwheels, and gestural input. When rolled up, MagicScroll can be used as a rolodex, smartphone, expressive messaging interface, or gestural controller. When extended, it provides full access to its 7.5-inch high-resolution multitouch display, providing the display functionality of a tablet device.
  4. Rigetti Launches Quantum Cloud Services (FastCompany) -- AWS-style cloud platform with a fast connection to 128-qubit computing. Grabbing land ahead of quantum computing actually being useful.

Continue reading Four short links: 10 September 2018.

Categories: Technology

Machine learning in the cloud

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Hagay Lupesko explores key trends in machine learning, the importance of designing models for scale, and the impact that machine learning innovation has had on startups and enterprises alike.

Continue reading Machine learning in the cloud.

Categories: Technology

Four success factors for building your AI business journey

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Manish Goyal shows you how to best unlock the value of enterprise AI.

Continue reading Four success factors for building your AI business journey.

Categories: Technology

AI and security: Lessons, challenges, and future directions

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Dawn Song explains how AI and deep learning can enable better security and how security can enable better AI.

Continue reading AI and security: Lessons, challenges, and future directions.

Categories: Technology

Connected arms

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Joseph Sirosh tells an intriguing story about AI-infused prosthetics that are able to see, grip, and feel.

Continue reading Connected arms.

Categories: Technology

Customized ML for the enterprise

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Levent Besik explains how enterprises can stay ahead of the game with customized machine learning.

Continue reading Customized ML for the enterprise.

Categories: Technology

The breadth of AI applications: The ongoing expansion

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Peter Norvig says one of the most exciting aspects of AI is the diversity of applications in fields far astray from the original breakthrough areas.

Continue reading The breadth of AI applications: The ongoing expansion.

Categories: Technology

Accelerating AI on Xeon through SW optimization

O'Reilly Radar - Fri, 2018/09/07 - 13:00

Huma Abidi discusses the importance of optimization to deep learning frameworks.

Continue reading Accelerating AI on Xeon through SW optimization.

Categories: Technology

A new golden age for computer architecture

O'Reilly Radar - Fri, 2018/09/07 - 13:00

David Patterson explains why he expects an outpouring of co-designed ML-specific chips and supercomputers.

Continue reading A new golden age for computer architecture.

Categories: Technology

10 talks to look for at the 2018 O'Reilly Software Architecture Conference in London

O'Reilly Radar - Fri, 2018/09/07 - 08:10

From chaos architecture to event streaming to leading teams, the O'Reilly Software Architecture Conference offers a unique depth and breadth of content.

We received more than 200 abstracts for talks for the 2018 O'Reilly Software Architecture Conference in London—on both expected and surprising topics. We continue to see strong interest in microservices and its related ecosystem, including topics like DevOps and tools like Kubernetes. The quality of the abstracts led to a stellar lineup of speakers, talks, and keynotes.

Two of the outstanding features of the O'Reilly Software Architecture Conference are the depth and breadth of our content. While most conferences have a single software architecture track, our whole conference revolves around software architecture. That means we can go much deeper, covering topics that would be too rarefied for other conferences. That also means we can spread out, tackling subjects critical to success as an architect (like soft skills) but too broad for most developer conferences. To showcase our depth and breadth, I've chosen a few sessions to highlight, illustrating the astounding variety of topics and perspectives on display this year.

Introduction to Chaos Architecture: Gaining from Learning Loops and System Weaknesses, by Russ Miles, ChaosIQ.io

Chaos engineering, along with its attendant architectural concerns, is a red-hot topic—we had a lot of interesting talks at the Software Architecture Conference in New York this year, too. In the talk linked here, Russ Miles asks the pointed question: "What happens when (not if) something in your system breaks? How will you handle it?" Pioneered by Netflix, chaos engineering represents a great example of continuing innovation in software engineering practices.

Technology Strategy Patterns for Architects, by Eben Hewitt, Sabre

This is cheating, as this talk and the previous one happen at the same time, but I can't not mention both! I had the privilege to review Hewitt's upcoming O'Reilly book by the same title. A great example of our breadth, this talk exposes the uninitiated to the seemingly arcane world and nomenclature of strategy consultants. Architects must often participate in these types of meetings, so understanding the building blocks and approaches of these consultants allows architects to participate and contribute. A highly recommended talk in an area almost never covered at technology conferences.

7 Years of DDD: Tackling Complexity in Large-Scale Marketing Systems, by Vladik Khononov, Naxex

We get a huge number of excellent case studies at the O'Reilly Software Architecture Conference, which we like because they give attendees insight into real-world problems and solutions; this one is an outstanding example. Many organizations think about embracing the domain-driven design (DDD) philosophy but wonder about the long game—where will we be seven years from now? This talk provides a useful perspective on the long-term implications, problems, and solutions inherent in fully embracing a technique like DDD.

Observable Microservices, by Maria Gomez, ThoughtWorks

Like peanut butter and chocolate, monitoring and microservices are a natural combination. This talk delves into the nuances of building microservices that incorporate sophisticated monitoring and fitness functions for a better understanding of the runtime characteristics of your system.

Beyond the Technical—Succeed at Leading a Software Architecture Team, by Maggie Carroll, Ausley.us

Many architects are shocked when they realize how many non-technical skills are required to be successful at their jobs, and team leadership tops the list. This talk covers critical skills and perspectives for architects to embrace to help team building.

Architecting for Data-Driven Reliability, by Yaniv Aknin, Google Cloud

Data is often ignored in architecture talks as a messy inconvenience, but the real world doesn't allow that luxury. As architectures become more distributed, data reliability becomes a critical concern. This talk covers how to design an architecture to ensure modern capabilities while still maintaining old-school reliability.

Event Streaming as a Source of Truth, by Benjamin Stopford

As teams build more sophisticated distributed systems, they sometimes move to event streams rather than databases as the source of truth. However, architects must deal with numerous issues and considerations when making such a fundamental shift. This talk covers the pros and cons, along with some best practices and warnings.

Distributed Systems Are a UX Problem, by Tyler Treat, Real Kinetic

No matter how distributed many architectures become, they still re-unify at a monolithic user interface. Many architects struggle to reconcile a highly distributed microservices architecture with a monolithic user interface; this talk covers many of the issues and offers some solutions.

Sundhed.dk's Journey From Monolith to GDPR-Compliant Microservices, by Tobias Uldall-Espersen and Thomas Krogsgaard Holme, Sundhed.dk

This journey is another great case study. Many attendees find themselves on the same journey of restructuring a monolithic application to microservices, and many have been recently blindsided by GDPR. This talk offers great real-world insight into how to migrate and mitigate architectural concerns simultaneously.

Using Continuous Delivery with Machine Learning to Tackle Fraud, by Sarah LeBlanc and Hany Elemary, ThoughtWorks

This talk is a great mashup of ideas that could only find a home at a software architecture conference! Yet another case study-based talk, it covers a real-world concern that many companies struggle with by combining state-of-the-art engineering practices, machine learning, and security, and tying it all together with architecture.

As the O'Reilly Software Architecture Conference has grown over the last couple of years, the quality of talks has continued to rise. The big problem is choosing between what’s on offer in every time slot.

Continue reading 10 talks to look for at the 2018 O'Reilly Software Architecture Conference in London.

Categories: Technology

Four short links: 7 September 2018

O'Reilly Radar - Fri, 2018/09/07 - 05:00

Quantifying Facebook, Deep Learning IDE, REPL + Debugger, and RPC Library

  1. Unveiling and Quantifying Facebook Exploitation of Sensitive Personal Data for Advertising Purposes -- This paper quantifies the portion of Facebook users in the European Union (EU) who were labeled with interests linked to potentially sensitive personal data in the period prior to when GDPR went into effect. The results of our study suggest that Facebook labels 73% of EU users with potential sensitive interests. This corresponds to 40% of the overall EU population. We also estimate that a malicious third party could unveil the identity of Facebook users who have been assigned a potentially sensitive interest at a cost as low as €0.015 per user. Finally, we propose and implement a web browser extension to inform Facebook users of the potentially sensitive interests Facebook has assigned them. (via Morning Paper)
  2. Subgraphs -- a deep learning IDE.
  3. REPLugger: REPL + Debugger -- My belief is that providing tools to augment programmer understanding is one of the most important interventions we can make. Me, too.
  4. brpc -- Baidu's RPC library, with 1,000,000+ instances (not counting clients) and thousands of kinds of services.

Continue reading Four short links: 7 September 2018.

Categories: Technology

Using machine learning in workload automation

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Akhilesh Tripathi shows you how to use machine learning to identify root causes of problems in minutes instead of hours or days.

Continue reading Using machine learning in workload automation.

Categories: Technology

China: AI superpower

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Kai-Fu Lee outlines the factors that enabled China's rapid ascension in AI.

Continue reading China: AI superpower.

Categories: Technology

AI foundations: What shapes the AI that’s shaping our world?

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Meredith Whittaker says the benefits of AI will only come if we have a clear-eyed perspective on its dark side.

Continue reading AI foundations: What shapes the AI that’s shaping our world?.

Categories: Technology

Fireside chat with Tim O'Reilly and Kai-Fu Lee

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Tim O'Reilly and Kai-Fu Lee discuss differences in how China and the U.S. approach AI and why AI might give humanity larger purpose.

Continue reading Fireside chat with Tim O'Reilly and Kai-Fu Lee.

Categories: Technology

Raising AI to benefit business and society

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Kishore Durg explains why deploying AI requires raising it to act as a responsible representative of the business and a contributing member of society.

Continue reading Raising AI to benefit business and society.

Categories: Technology

Highlights from the Artificial Intelligence Conference in San Francisco 2018

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Watch highlights from expert talks covering artificial intelligence, machine learning, security, and more.

People from across the AI world came together in San Francisco for the Artificial Intelligence Conference. Below you'll find links to highlights from the event.

China: AI superpower

Kai-Fu Lee outlines the factors that enabled China's rapid ascension in AI.

Fireside chat with Tim O'Reilly and Kai-Fu Lee

Tim O'Reilly and Kai-Fu Lee discuss differences in how China and the U.S. approach AI and why AI might give humanity larger purpose.

AI foundations: What shapes the AI that’s shaping our world?

Meredith Whittaker says the benefits of AI will only come if we have a clear-eyed perspective on its dark side.

Using machine learning in workload automation

Akhilesh Tripathi shows you how to use machine learning to identify root causes of problems in minutes instead of hours or days.

AI at scale at Coinbase

Soups Ranjan describes the machine learning system that Coinbase built to detect potential fraud and fake identities.

OpenAI and the path toward safe AGI

Greg Brockman discusses OpenAI's recent advancements and their implications for how we should plan for creating safe artificial general intelligence (AGI).

Raising AI to benefit business and society

Kishore Durg explains why deploying AI requires raising it to act as a responsible representative of the business and a contributing member of society.

Beyond hype: AI in the real world

Julie Shin Choi reviews real-world customer use cases that take AI from theory to reality.

Machine learning in the cloud

Hagay Lupesko explores key trends in machine learning, the importance of designing models for scale, and the impact that machine learning innovation has had on startups and enterprises alike.

Customized ML for the enterprise

Levent Besik explains how enterprises can stay ahead of the game with customized machine learning.

Accelerating AI on Xeon through SW optimization

Huma Abidi discusses the importance of optimization to deep learning frameworks.

AI and security: Lessons, challenges, and future directions

Dawn Song explains how AI and deep learning can enable better security and how security can enable better AI.

Four success factors for building your AI business journey

Manish Goyal shows you how to best unlock the value of enterprise AI.

The breadth of AI applications: The ongoing expansion

Peter Norvig says one of the most exciting aspects of AI is the diversity of applications in fields far astray from the original breakthrough areas.

Connected arms

Joseph Sirosh tells an intriguing story about AI-infused prosthetics that are able to see, grip, and feel.

A new golden age for computer architecture

David Patterson explains why he expects an outpouring of co-designed ML-specific chips and supercomputers.

Continue reading Highlights from the Artificial Intelligence Conference in San Francisco 2018.

Categories: Technology

Beyond hype: AI in the real world

O'Reilly Radar - Thu, 2018/09/06 - 13:00

Julie Shin Choi reviews real-world customer use cases that take AI from theory to reality.

Continue reading Beyond hype: AI in the real world.

Categories: Technology
