Insight, analysis, and research about emerging technologies from O'Reilly Media.
O'Reilly Radar tracks the technologies and people that will shape our world in the years to come. Each episode of O'Reilly Radar features an interview with an industry thought leader, with topics touching on everything from programming to data to experience design. We also take a step back from the breathless pace of the latest tech news to examine why new developments are important and what they might mean down the road.
O'Reilly Media spreads the knowledge of innovators. At O’Reilly, a big part of our business is paying attention to what’s new and interesting in the world of technology. We have a pretty good record of anticipating some of the big technology developments in recent history. For instance, we launched the first commercial Web site, GNN, in 1993; we organized the meeting at which the term “open source” was first adopted; we were early investors in Blogger, which helped launch the blogging ...
In this episode of the Data Show, I speak with Peter Bailis, founder and CEO of Sisu, a startup that is using machine learning to improve operational analytics. Bailis is also an assistant professor of computer science at Stanford University, where he conducts research into data-intensive systems and where he is co-founder of the DAWN Lab.…
In this episode of the Data Show, I speak with Arun Kejariwal of Facebook and Ira Cohen of Anodot (full disclosure: I’m an advisor to Anodot). This conversation stemmed from a recent online panel discussion we did, where we discussed time series data, and, specifically, anomaly detection and forecasting. Both Kejariwal (at Machine Zone, Twitter, an…
In this episode of the Data Show, I speak with Michael Mahoney, a member of RISELab, the International Computer Science Institute, and the Department of Statistics at UC Berkeley. A physicist by training, Mahoney has been at the forefront of many important problems in large-scale data analysis. On the theoretical side, his work spans algorithmic a…
In this episode of the Data Show, I speak with Kesha Williams, technical instructor at A Cloud Guru, a training company focused on cloud computing. As a full stack web developer, Williams became intrigued by machine learning and started teaching herself the ML tools on Amazon Web Services. Fast forward to today, Williams has built some well-regarde…
In this episode of the Data Show, I speak with Alex Ratner, project lead for Stanford’s Snorkel open source project; Ratner also recently garnered a faculty position at the University of Washington and is currently working on a company supporting and extending the Snorkel project. Snorkel is a framework for building and managing training data. Base…
In this interview, Tim Craig and fellow Googler Gustavo Franco, a site reliability engineer (SRE), discuss the wide range of events that qualify as “incidents;” the need for a conscious, robust, and well-defined process for understanding them; the role of training; and how to get buy-in from management so you can spread incident response training t…
In this episode of the Data Show, I speak with Cassie Kozyrkov, technical director and chief decision scientist at Google Cloud. She describes "decision intelligence" as an interdisciplinary field concerned with all aspects of decision-making, and which combines data science with the behavioral sciences. Most recently she has been focused on develo…
In this episode of the Data Show, I spoke with Roger Chen, co-founder and CEO of Computable Labs, a startup focused on building tools for the creation of data networks and data exchanges. Chen has also served as co-chair of O'Reilly's Artificial Intelligence Conference since its inception in 2016. This conversation took place the day after Chen and…
In this week's episode of the Data Show, we're featuring an interview Data Show host Ben Lorica participated in for the Software Engineering Daily Podcast, where he was interviewed by Jeff Meyerson. Their conversation mainly centered around data engineering, data architecture and infrastructure, and machine learning (ML).…
In this episode of the Data Show, I spoke with Nick Pentreath, principal engineer at IBM. Pentreath was an early and avid user of Apache Spark, and he subsequently became a Spark committer and PMC member. Most recently his focus has been on machine learning, particularly deep learning, and he is part of a group within IBM focused on building open s…
At Google’s 2019 Cloud Next conference, I sat down with Stephen Thorne, site reliability engineer on Google’s customer reliability engineering team and co-author of "The Site Reliability Workbook," to talk about how organizations, both large and small, can use SRE to reduce operational costs, improve reliability, and create productive cross-functio…
In this episode of the Data Show, I spoke with Dhruba Borthakur (co-founder and CTO) and Shruti Bhat (SVP of Marketing) of Rockset, a startup focused on building solutions for interactive data science and live applications. Borthakur was the founding engineer of HDFS and creator of RocksDB, while Bhat is an experienced product and marketing executi…
In this episode of the Data Show, I spoke with Jike Chong, chief data scientist at Acorns, a startup focused on building tools for micro-investing. Chong has extensive experience using analytics and machine learning in financial services, and he has experience building data science teams in the U.S. and in China. We had a great conversation spanning…
In this episode of the Data Show, I spoke with Jeff Jonas, CEO, founder and chief scientist of Senzing, a startup focused on making real-time entity resolution technologies broadly accessible. He was previously a fellow and chief scientist of context computing at IBM. Entity resolution (ER) refers to techniques and tools for identifying and linking…
In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping. As companies integrate machine learning into their products and systems, there are important foundational technologies that come into play. This shouldn’t come as a sho…
What Data Scientists and Data Engineers Can Do with Current Generation Serverless Technologies
In this episode of the Data Show, I spoke with Avner Braverman, co-founder and CEO of Binaris, a startup that aims to bring serverless to web-scale and enterprise applications. This conversation took place shortly after the release of a seminal paper from UC Berkeley (“Cloud Programming Simplified: A Berkeley View on Serverless Computing”), and thi…
In this episode of the Data Show, I spoke with Forough Poursabzi-Sangdeh, a postdoctoral researcher at Microsoft Research New York City. Poursabzi works in the interdisciplinary area of interpretable and interactive machine learning. As models and algorithms become more widespread, many important considerations are becoming active research areas: f…
In this episode of the Data Show, I spoke with Kartik Hosanagar, professor of technology and digital business, and professor of marketing at The Wharton School of the University of Pennsylvania. Hosanagar is also the author of a newly released book, "A Human’s Guide to Machine Intelligence," an interesting tour through the recent evolution of AI ap…
In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book is essential rea…
In this episode of the Data Show, I spoke with Siwei Lyu, associate professor of computer science at the University at Albany, State University of New York. Lyu is a leading expert in digital media forensics, a field of research into tools and techniques for analyzing the authenticity of media files. Over the past year, there have been many stories…
In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of AI technologies. M…
In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer and legal engineer at Immuta, a company building data management tools tuned for data science. Burt and cybersecurity pioneer Daniel Geer recently released a must-read white paper (“Flat Light”) that provides a great framework for how to think about information securi…
In this episode of the Data Show, I spoke with Haoyuan Li, CEO and founder of Alluxio, a startup commercializing the open source project with the same name (full disclosure: I’m an advisor to Alluxio). Our discussion focuses on the state of Alluxio (the open source project that has roots in UC Berkeley’s AMPLab), specifically emerging use cases her…
For the end-of-year holiday episode of the Data Show, I turned the tables on Data Show host Ben Lorica to talk about trends in big data, machine learning, and AI, and what to look for in 2019. Lorica also showcased some highlights from our upcoming Strata Data and Artificial Intelligence conferences.…
In this episode of the Data Show, I spoke with Alex Wong, associate professor at the University of Waterloo, and co-founder of DarwinAI, a startup that uses AI to address foundational challenges with deep learning in the enterprise. As the use of machine learning and analytics becomes more widespread, we’re beginning to see tools that enable data sc…
In this episode of the Data Show, I spoke with Vitaly Gordon, VP of data science and engineering at Salesforce. As the use of machine learning becomes more widespread, we need tools that will allow data scientists to scale so they can tackle many more problems and help many more people. We need automation tools for the many stages involved in data …
In this episode of the Data Show, I spoke with Francesca Lazzeri, an AI and machine learning scientist at Microsoft, and her colleague Jaya Mathew, a senior data scientist at Microsoft. We conducted a couple of surveys this year—“How Companies Are Putting AI to Work Through Deep Learning” and “The State of Machine Learning Adoption in the Enterpris…
In this episode of the Data Show, I spoke with Alon Kaufman, CEO and co-founder of Duality Technologies, a startup building tools that will allow companies to apply analytics and machine learning to encrypted data. In a recent talk, I described the importance of data, various methods for estimating the value of data, and emerging tools for incentiv…
In this episode of the Data Show, I spoke with Jacob Ward, a Berggruen Fellow at Stanford University. Ward has an extensive background in journalism, mainly covering topics in science and technology, at National Geographic, Al Jazeera, Discovery Channel, BBC, Popular Science, and many other outlets. Most recently, he’s become interested in the inte…
In this episode of the Data Show, I spoke with Sharad Goel, assistant professor at Stanford, and his student Sam Corbett-Davies. They recently wrote a survey paper, “A Critical Review of Fair Machine Learning,” where they carefully examined the standard statistical tools used to check for fairness in machine learning models. It turns out that each …
This episode of the O’Reilly Podcast features a conversation on serverless and Kubernetes with Kelsey Hightower, developer advocate for Google Cloud Platform at Google (and co-author of "Kubernetes: Up and Running"), and Chris Gaun, Kubernetes product marketing manager at Mesosphere.
In this episode of the Data Show, I spoke with Alan Nichol, co-founder and CTO of Rasa, a startup that builds open source tools to help developers and product teams build conversational applications. About 18 months ago, there was tremendous excitement and hype surrounding chatbots, and while things have quieted lately, companies and developers con…
In this episode of the Data Show, I spoke with Eric Jonas, a postdoc in the new Berkeley Center for Computational Imaging. Jonas is also affiliated with UC Berkeley’s RISE Lab. It was at a RISE Lab event that he first announced Pywren, a framework that lets data enthusiasts proficient with Python run existing code at massive scale on Amazon Web Ser…
In this episode of the O’Reilly Media Podcast, Rachel Roumeliotis, VP of content strategy at O’Reilly, sat down with Daniel Krook, IBM developer advocate. They discussed how developers across industries can participate in the Call for Code initiative, the benefits of the program, support from its charitable partners—United Nations Human Rights and …
In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Toda…
In this episode of the Data Show, I spoke with Chang Liu, applied research scientist at Georgian Partners. In a previous post, I highlighted early tools for privacy-preserving analytics, both for improving decision-making (business intelligence and analytics) and for enabling automation (machine learning). One of the tools I mentioned is an open so…
In this episode of the Data Show, I spoke with Andrew Feldman, founder and CEO of Cerebras Systems, a startup in the blossoming area of specialized hardware for machine learning. Since the release of AlexNet in 2012, we have seen an explosion in activity in machine learning, particularly in deep learning. A lot of the work to date happened primaril…
In a recent episode of the O’Reilly Media Podcast, we spoke with George Miranda about the importance of service mesh technology in creating reliable distributed systems. As discussed in the new report The Service Mesh: Resilient Service-to-Service Communication for Cloud Applications, service mesh technology has emerged as a popular tool for compan…
In this episode of the Data Show, I spoke with Aurélie Pols of Mind Your Privacy, one of my go-to resources when it comes to data privacy and data ethics. This interview took place at Strata Data London, a couple of days before the EU General Data Protection Regulation (GDPR) took effect. I wanted her perspective on this landmark regulation, as wel…
In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer at Immuta, and Steven Touw, co-founder and CTO of Immuta. Burt recently co-authored an upcoming white paper on managing risk in machine learning models, and I wanted to sit down with them to discuss some of the proposals they put forward to organizations that are depl…
In this episode of the Data Show, I spoke with Ashok Srivastava, senior vice president and chief data officer at Intuit. He has a strong science and engineering background, combined with years of applying machine learning and data science in industry. Prior to joining Intuit, he led the teams responsible for data and artificial intelligence product…
In this episode of the O’Reilly Podcast, I talk with Tammy Butow, a site reliability engineer at Gremlin, and Annie Lau, a software engineering manager at Trulia, about creating a culture of learning, how experimentation is important to business, and their careers in tech.
This episode of the Data Show marks our 100th episode. This podcast grew out of video interviews conducted at O’Reilly’s 2014 Foo Camp. We had a collection of friends who were key members of the data science and big data communities on hand, and we decided to record short conversations with them. We originally conceived of using those initial con…
In this episode of the O’Reilly Podcast, I talk with Cory Doctorow, who is a science fiction author, editor of Boing Boing, the former European director of the Electronic Frontier Foundation (EFF), and currently a special advisor for the EFF. Doctorow will be a keynote speaker at the O’Reilly Fluent Conference, July 11-14, 2018, in San Jose.…
In this episode of the O'Reilly Podcast, Fluent Conference Speaker Series chair and author Kyle Simpson sat down with Brian Holt, a senior cloud developer at Microsoft. Holt will be teaching a training course, "A Complete Introduction to React," and hosting a session, "10 KB or bust: The delicate power of webpack and Babel," at the O'Reilly Fluent Conf…
In this episode of the O’Reilly Media Podcast, I talk with JP Phillips, platform engineer at IBM Cloud. IBM is driving development in the container space, as shown through last year’s launch of Istio, an open cloud service that allows developers to connect, manage, and secure networks of different microservices. Istio, a joint collaboration between…
In this episode of the O’Reilly Podcast, I talk with Brendan Eich, the creator of JavaScript, co-founder of the Mozilla Project and Foundation, and CEO and founder of Brave Software. Eich will be a keynote speaker at the upcoming O’Reilly Fluent Conference, July 11-14, 2018, in San Jose.
In this episode of the Data Show, I spoke with Jason Dai, CTO of Big Data Technologies at Intel, and one of my co-chairs for the AI Conference in Beijing. I wanted to check in on the status of BigDL, specifically how companies have been using this deep learning library on top of Apache Spark, and discuss some newly added features. It turns out ther…
In this episode of the Data Show, I spoke with Jerry Overton, senior principal and distinguished technologist at DXC Technology. I wanted the perspective of someone who works across industries and with a variety of companies. I specifically wanted to explore the current state of data science and AI within companies and public sector agencies. As mu…
In this episode of the O’Reilly Podcast, I spoke with Stephen Gates of Oracle Dyn. Gates joined the Oracle Dyn Global Business Unit from Zenedge, the web application security company recently acquired by Oracle. Gates and I discussed how growing malicious bot activity impacts organizations.