The @WalmartLabs Blog

Read about how we’re innovating the way our customers shop.

  1. Why we chose OpenStack for Walmart Global eCommerce

    Posted by Bao Nguyen

    Written by Amandeep Juneja, senior director of cloud operations and engineering, @WalmartLabs

    When people think of Walmart and the nitty-gritty of how we do business, they usually think about merchandising, item placement, and inventory management, the hallmarks of running a global chain of retail stores. So Walmart’s decision to invest in cloud infrastructure might not be something you’d expect from a brick-and-mortar retailer with over $480B in revenue.

    So why did we decide to invest so heavily in the cloud?

    The answer is that Walmart has always relied on cutting-edge technology to fuel our growth. Walmart was a pioneer in opening up our inventory systems to vendors to reduce inventory costs and bring lower prices to our customers. We were the first company to connect our store network with satellite communication, long before the advent of the Internet, enabling us to reach consumers who until then had no access to discount stores.

    We’ve always sought to expand fast, adapt to changing consumer preferences, and keep the costs of operations low.

    Walmart is growing fast, and Walmart Global eCommerce is leading the charge. Our customers want to use our eCommerce platform from many different access points, not only from their home computers but also from mobile phones, tablets, and kiosks within Walmart retail stores—always expecting a seamless experience.

    With such rapid growth, we needed a technology stack that could scale to meet explosive demand, that was flexible enough to support applications that adapt to ever-changing user preferences, and that had enough big data smarts to predict what customers want and provide them with recommendations.

    For traditional businesses, growing in size means that economies of scale kick in, leading to lower per-unit costs. On the retail side, Walmart has always enjoyed these cost savings, which we’ve passed down to our customers in the form of our trademark “everyday low prices.”

    But when it comes to technology, things aren’t so simple. As a company’s technology footprint increases, the expansion can lead to “diseconomies of scale,” meaning the cost per transaction actually goes up. When that happens, having more users is actually bad for business.

    Such an inversion of the cost curve can result from bad infrastructure architecture: you find yourself locked into a certain system or application, where scaling vertically costs more than scaling horizontally. It can also be the result of bad application design, where maintaining the software and adding new features becomes a nightmare, raising the opportunity cost of delivering new products.

    That’s where cloud architecture comes in. Instead of expanding vertically by, say, buying big, powerful machines for ten times the cost, distributed computing means you can spread work across a large number of commodity machines and gather the results, providing the same power at a fraction of the cost of traditional data centers and infrastructures.

    A second benefit of the cloud is that distributed architecture provides a higher degree of resilience and reliability. A single machine can go down, but the odds that ten machines go down at once are much lower.
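
    To make that claim concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes machines fail independently, which real data centers only approximate (shared power or network can take out many machines together). The 1% per-machine outage probability is an illustrative number, not a Walmart measurement, and the function names are hypothetical.

```python
from math import comb


def prob_all_down(p_down: float, machines: int) -> float:
    """Chance that every machine is down at the same moment.

    Assumes each machine is down independently with probability p_down.
    """
    if not 0.0 <= p_down <= 1.0:
        raise ValueError("p_down must be between 0 and 1")
    if machines < 1:
        raise ValueError("machines must be at least 1")
    return p_down ** machines


def prob_at_least_up(p_down: float, machines: int, needed: int) -> float:
    """Chance that at least `needed` of `machines` are up (binomial model)."""
    if not 0 <= needed <= machines:
        raise ValueError("needed must be between 0 and machines")
    p_up = 1.0 - p_down
    return sum(
        comb(machines, up) * p_up ** up * p_down ** (machines - up)
        for up in range(needed, machines + 1)
    )


if __name__ == "__main__":
    p = 0.01  # illustrative assumption: each machine is down 1% of the time
    print(f"1 machine down:          {prob_all_down(p, 1):.2e}")
    print(f"all 10 machines down:    {prob_all_down(p, 10):.2e}")
    print(f"at least 8 of 10 are up: {prob_at_least_up(p, 10, 8):.6f}")
```

    Under these assumptions, one machine is unavailable 1% of the time, while all ten being unavailable together has a probability of about 1e-20. Correlated failures would raise that figure, which is why the independence assumption matters.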

    Last year, @WalmartLabs made a decision. In order to meet the challenges of eCommerce 3.0, we needed to overhaul our technology stack and the tech vision that goes with it. We decided to build an elastic cloud, running applications using a service-oriented architecture.

    We wanted to choose the platform that would best support our application developers, enabling them to rapidly build all kinds of applications, including mobile apps, web apps, and RESTful APIs for vendors. A platform that would empower product managers to iterate over new product ideas in an agile manner. A platform that would enable Walmart to respond to customer needs more efficiently.

    We chose OpenStack as our cloud platform, not only because it’s best of breed, but also because open source software comes with several big advantages:

    • Using open source means we avoid long-term lock-ins with any single private vendor.
    • More importantly, we know that Walmart Global eCommerce is growing into something unique. Using open source means we can modify and customize software to meet our needs.
    • Finally, OpenStack has a true community around it. It’s been used and supported by market leaders all over the Bay Area. Walmart wants to be part of that community. We have a team of very talented developers, and we plan to contribute aggressively to the open source community.

    In the nine months since we started building our OpenStack cloud, we’ve already built an OpenStack Compute layer with 100K cores and counting. Our next step is to bring in more block storage and venture into software-defined networking using OpenStack projects such as Cinder and Neutron. We’re currently building a multi-petabyte object store using Swift.

    A lot of people use OpenStack, but what makes Walmart’s OpenStack project so exciting is the scale of our investment. Over 140 million customers shop our stores and online in the US every week. Unlike other large installations, we’re using the OpenStack platform for real production loads. By the holidays last year, Walmart.com’s entire U.S. production traffic was on OpenStack compute.

    This is an incredibly exciting time. Around the world, eCommerce will only continue to grow, and Walmart is lucky to have the opportunity to contribute to the technologies that will make it happen.

  2. Data Science in Search @Labs: an Interview with Dr. Manas Pathak

    Posted by Bao Nguyen

    Data science is fast becoming one of the most ubiquitous and powerful fields today. Sought after across multiple industries to generate value from the growing preponderance of data, data scientists have become not only a valuable asset, but also a critical part of the team that any company must build and retain. @WalmartLabs is no exception to this trend; for Polaris, the @Labs team that builds the search engine and discovery experience powering Walmart.com, data science is an integral part of everything that we do – from building ranking algorithms and new features to assortment analytics and user behavior analysis. Dr. Manas Pathak, a staff software engineer on our Search Relevance team, is a testament to this and is responsible for several core features used by over a billion people to search and connect to products. Using his experiences here @Labs, Dr. Pathak recently authored a book, Beginning Data Science with R. He also took some time to answer some questions about his work here, his thoughts on data science, and what inspired him to write a book.

    You work on complex data science problems here in search at @WalmartLabs. Can you give us a brief overview of some of your work here and what kind of problems you’re solving?

    @WalmartLabs is one of the best places to do data science anywhere in the world. The biggest opportunity here is due to the gigantic amount of data. Even small insights obtained from this data often lead to huge business impact in absolute terms. This is especially true for the search team, where we use this data to build and optimize algorithms that help users find the products they are looking for.

    I have contributed to multiple core search relevance features, including click engagement modeling, where we build a statistical model that learns from past customer behavior to improve the ranking of search results. This feature is a major component of the search ranking algorithm powering Walmart.com and other eCommerce websites and has led to significant improvement in site-wide conversion rates and revenue. Another set of features I have worked on is left-hand navigation, where we determine the important attributes (categories and facets) for a given search query. I have created multiple models to rank these attributes in the most relevant order.

    To many people who have just heard of the term “data science”, it seems very much a new field, although we can perhaps more accurately describe it as an extension of statistics and computer science. It certainly has jumped into the limelight and its popularity seems to continue to reach new heights. How do you see data science evolving? What about the role and skills of the data scientist?

    Fundamentally, data science is the methodology of extracting useful insights from data. More than being an extension of any single discipline, data science is at the intersection of programming, statistics, and domain knowledge. For the most part, data science is not entirely new; techniques in these areas have been practiced for decades under different names. Only in the last few years have we seen a set of these techniques grouped under the name data science. As the field evolves, I foresee it being standardized in terms of both techniques and tools. I also foresee R being the dominant tool for data science, with its vast package system providing open source implementations for most data analysis techniques.

    With the growing popularity of data science, there is an acute shortage of skilled data science professionals in industry. An important area of standardization is the training of data scientists. Currently, most data scientists have a background in computer science, statistics, or one of the quantitative sciences such as physics or math. A few universities, such as Columbia and UC Berkeley, now offer professional master’s programs in data science. I foresee more universities offering a standardized data science curriculum at both the undergraduate and graduate levels.

    As the popularity of data science has risen, the field has attracted newcomers from various disciplines who want to either understand data science or become data scientists themselves. Coming from a strong technical background and having spent time across various data science teams and companies, how do you recommend beginners start learning data science?

    Because data science is an applied discipline, the only effective way to learn it is by doing. It is helpful for newcomers to get a good understanding of the different aspects of data science, but the biggest benefit comes from trying them out on their own datasets. For newcomers in industry, the best way to learn is to look for insights in the data from their business processes. Students can similarly analyze public datasets and also participate in open data analysis competitions such as the ones on Kaggle. A good starting point is to get hands-on experience with data science and R through real-world case studies.

    What was your main motivation in writing a book on data science?

    Over the years, I had gained a lot of experience applying data science to diverse datasets with the R programming language, especially here in Search @WalmartLabs. I found that most other books on R focused either on the features of the R programming language alone or on a specialized application area. Neither category is very useful for readers who do not already have a good understanding of data science concepts. My motivation for writing this book was to provide an intuitive understanding of data science techniques as well as the steps to carry them out with R. The goal is to help readers quickly get started with their own data science problems.

    In Beginning Data Science with R, you cite your purpose of striking a balance between the “how” and the “why” of various data science techniques. I think this is a fantastic idea but comes with rather difficult execution; many introductory books tend to be either completely intuitive or highly practical. Can you speak to how you aimed to accomplish this sought-after balance in the book?

    Maintaining this balance was one of the main challenges in writing this book. In my experience, most books on R are either too hard for beginners or too shallow for intermediate or expert readers. I wanted this book to be accessible to readers without a background in data science, so I spent a lot of time covering the basics and introducing the R programming language.

    For every data science methodology, the book introduces the motivation and gives an intuitive understanding before diving into the technical detail. A great way to cover both the “how” and the “why” of data science together is through case studies. In this book, I included data analysis of a real-world dataset with each chapter. Designing all of the features I worked on here at Search required applying many data science techniques that I have covered in my book.

    About Manas Pathak: Dr. Manas A. Pathak received a BTech degree in computer science from Visvesvaraya National Institute of Technology, Nagpur, India, in 2006, and MS and PhD degrees from the Language Technologies Institute at Carnegie Mellon University (CMU) in 2009 and 2012 respectively. His PhD thesis on “Privacy-Preserving Machine Learning for Speech Processing” was published as a monograph in the Springer best thesis series. His research received significant press coverage, including articles in the Economist and MIT Tech Review. He has many years of experience with data analysis using the R programming language. He is currently working as a staff software engineer in Search Relevance at @WalmartLabs’s Polaris team.

  3. Mobile Development Story: 2 Engineers Tapping Into Massive Scale

    Posted by Charles McBrian

    by Charles McBrian, Sr. Manager, Mobile Engineering @WalmartLabs

    As a mobile engineer and manager, I’m always on the lookout for engaging projects for our teams. Challenging projects that have a high signal-to-noise ratio. Lots of real work. Most importantly, I want projects that make a difference for our customers. Earlier this year, we found one of those projects: Walmart Android Photo (codename: Blixt). This is the story of how two engineers leveraged the massive scale of Walmart and made a huge impact.

    Planning Blixt

    Earlier this year, my partner on the business side came to me and asked if we could “sneak in” photo support for the Walmart mobile app. At the time, a lot of our people were working on mobile Pharmacy (a “mega-project” many of us were involved with as well). Finding volunteers was easy: photo was an interesting and relevant project. We put together a two-man team: a summer intern we turned into a node.js developer and a senior Android developer.

    Our first task was to align the team on what we were building. We started with “page one” specs that defined the basic spirit of what we wanted to make. We prototyped at the same time. We reviewed and iterated on each other’s specs, demoed prototypes, and set down a clear roadmap of what we could build, incorporating additional feedback from the larger team. From all of this, we created a pitch demo and a plan of what we should build, which we presented to our execs. The execs told us to go for it, but advised us to “stay small and scrappy.”

    Stealth project. Define it. Chip away at it. Be scrappy. Business has your back. Engineering is in the driver’s seat. Own it end to end (client and services). What was there not to love?

    Building Blixt

    Every week, more than 140 million Americans visit Walmart stores or Walmart’s web site. APIs that connect to our stores and our customers create massive leverage for us in engineering. For Blixt, we tapped into a service that allowed us to print photos at more than 3,400 stores. Using the service required that we write some additional services code that would transiently store photos and print them at the store. We called this Blixt Server.

    Blixt Server was written using the hapi framework in node.js. At its core, Blixt Server handles image uploading, printing, image cropping, and notifications. Uploads are stored in a replicated Riak database. Printing is handled through the previously mentioned store printing API. We’ve implemented server-side cropping/zooming using node.js GraphicsMagick / ImageMagick libraries. Notifications are done via Walmart e-mail services and Google Cloud Messaging (Android). We’ve got load balancers in front of a scalable cluster of VMs, a rolling-stage environment that allows us to rationally upgrade software, logging, and more.

    We started with the premise that we wanted to give users a way to choose which photos they liked and easily order prints from within the app, a workflow we dubbed “pick and print”. With that one directive, we gave our Android developer free rein to create the feature as he saw fit. When he was done, our UX folks were pleasantly surprised that the design was mostly there. It was truly amazing to see the improvements in our workflow and visual design once our designers got their hands on the feature.

    Blixt Client on Android is made up of a few basic components: local photo gallery, cart, cropping UI, uploader, order placement, and notifications. It was critical that the photo gallery perform well even with a high number of images on the device. We required the uploader to work in the background. The uploader also needed to be incredibly robust, with built-in backoff and retry as well as cached re-uploading. Print sizes could be set for all images at once or a la carte on a per-image basis. Prints could be cropped and zoomed. Users could specify the store location at which to pick up their order. Users would receive native push notifications when orders were confirmed and ready for pickup in-store. All of this talked to the Blixt services via RESTful interfaces that were jointly defined by our client and service engineers. I was happy to see tight collaboration on the API specification.
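
    The uploader’s backoff-and-retry behavior can be sketched roughly as follows. This is an illustrative Python sketch, not the Blixt client code (which runs on Android). `UploadError`, `upload_with_retry`, and the default delays are hypothetical names and values chosen for the example, and the randomized “jitter” is a common addition rather than something the original design is known to include.

```python
import random
import time
from typing import Callable


class UploadError(Exception):
    """Hypothetical transient failure raised by the upload callable."""


def upload_with_retry(
    upload: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> int:
    """Call `upload` until it succeeds; return the number of attempts used.

    After each failure the wait doubles (1s, 2s, 4s, ...) up to `max_delay`,
    scaled by a random factor in [0, 1) so that many clients do not retry
    in lockstep. The last failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            upload()
            return attempt + 1
        except UploadError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * jitter())
    raise ValueError("max_attempts must be at least 1")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_upload() -> None:
        # Simulated upload that fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise UploadError("simulated network drop")

    attempts = upload_with_retry(flaky_upload, base_delay=0.01)
    print(f"upload succeeded after {attempts} attempts")
```

    Passing `sleep` and `jitter` in as parameters keeps the sketch easy to test without real waiting. A production uploader would also persist pending uploads so they survive an app restart, which is roughly what “cached re-uploading” suggests.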

    Shipping Blixt — It Ain’t Ready Til It’s Ready

    One of the most fun and unique things about working for Walmart is the connection of our mobile applications to physical stores. Even us techies get to go to stores, talk to store employees (Associates), and try stuff out. I remember how much fun it was to do our first in-store, end-to-end smoke test. Watching the first prints get uploaded from an Android phone and printed through an in-store printer was a very gratifying experience for the team. We fixed a couple of bugs and talked to the folks staffing the Photo Center about their process and what we could do to make things better. More importantly, we got to be customers for a day. Standing in the shoes of our store Associates and customers was a great empathy-building moment.

    A few days before deploying to the Google Play store, we ran one last round of usability testing. As a team, we had used the product a zillion times and printed countless photos. Overall, we felt confident that we had built the right product. Good thing we ran that last usability test. Out of 12 subjects, 11 couldn’t figure out the “bulk order” screen, a fundamental part of the order flow. We asked our usability expert, the same woman who commissioned the tests, how bad it was. She showed us videos of the test subjects trying to get through the screen. Ugh. It was agonizing, especially given the timing.

    After the findings, our small team huddled and worked through a range of options. We settled on a plan of action that would delay our ship date by a week. As someone who cares deeply about what we ship, I was happy to see the team rally around the problem and fix it.

    So How Did We Do?

    Alright, enough cheerleading. How did we actually perform? Performance for Android alone, as of September 9, 2014:

    • Android is currently outperforming desktop web by 2x.

    • Growth is purely organic. No marketing. Soft launch.

    • Services uptime and client quality have been very good.

    • Financial metrics (private) have far exceeded expectations.

    With the addition of an iOS client (in development soon) and marketing support (fingers crossed), our numbers could soon rise to 6x to 10x the already amazingly high numbers we are seeing. When the holidays arrive (our busiest time), we’ve forecast 6x on top of that!

    Over the course of running the Blixt service for the past couple of months, the uptime has been really good. Bugs happen, and in every case development, QA, and dev ops have done an amazing job of keeping the system up for our customers. Whenever you ship a new production service, there is a period of time when you are watching it scale and looking to optimize the footprint. The initial demand for this feature has been 5x what we originally forecast. It’s been great to watch Blixt Server evolve and scale to meet this unexpected challenge. With the growth, we once again found some interesting issues that we could quickly debug and fix.

    For this article I’ve focused mostly on engineering. Given the nature of the project, it was possible for the core engineering team to deliver the bulk of the work. That being said, the Blixt team is standing on other people’s shoulders, who are in turn standing on the shoulders of our giant company. Mobile design, usability, quality, dev ops, and business folks gladly came in and contributed to this project. We have a truly amazing mobile team and we work for a truly amazing company. If it weren’t for the APIs that allow us to print to photo centers, there would be no Blixt. The fact that every week we have over 140 million customers, many of whom visit one of our 3,400+ photo centers, is just… mind-boggling. For the Blixt team, Walmart really delivered on its promise of scale. Even a really small team, a core team of 2 developers, can have an enormous impact.

    About the Author

    “Software engineering and making great mobile products are my passions. It still feels like it’s early days for mobile. It feels like the best is yet to come.” Lives in Belmont with his two boys (10 and 12), fiancée, and dog (Augie). Loves music and art.