Python @ BuzzFeed

    How to run a global media enterprise on an open-source, community-driven language.

    Our development philosophy at BuzzFeed rests on several time-tested, proven principles of modern application design. These principles include modular programming, standardization of interfaces, collaborative development, open-source code, and test-driven development.

    Among modern programming languages, Python stands out as a great example of a language that implements and supports these principles from the core on up. As a way of showing our appreciation to the amazingly diverse and talented community of Python developers worldwide, we would like to take a minute to discuss how BuzzFeed uses this language to support the day-to-day operations of a global media company.

    OOP and Modular Design

    The philosophy of “Don’t Repeat Yourself” is fundamental to good application design, and Python has some fantastic built-in tools which inherently promote this philosophy. At BuzzFeed we rely on these features every day to help us write DRY, reusable code:

    Everything Is an Object: In Python, everything is an object. Base types, user-defined types, exceptions, and even function definitions, class definitions, and imported modules are all objects. Python’s classes natively support both single and multiple inheritance, so a wide range of standard inheritance patterns is available out of the box.
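
    As a minimal sketch of what this means in practice, the snippet below checks that several very different kinds of values are all instances of object, and then treats a function like any other value. The names greet, Article, and handlers are made up for illustration.

```python
import math


def greet(name):
    """A plain function; the function itself is an object."""
    return f"Hello, {name}!"


class Article:
    """A user-defined class; the class itself is an object too."""


# Values of very different kinds: a base type, a function, a class,
# an exception instance, and an imported module.
things = [42, greet, Article, ValueError("oops"), math]

# Each one is an instance of `object`.
all_objects = all(isinstance(thing, object) for thing in things)

# Because functions are objects, they can carry attributes and be
# stored in data structures like any other value.
greet.call_count = 0
handlers = {"hello": greet}
message = handlers["hello"]("BuzzFeed")
```

    Storing functions in a dictionary, as handlers does here, is one common way this property helps keep dispatch code DRY.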

    Modules and Namespaces: Python goes out of its way to prevent you from polluting your global namespace. Since the concept of an importable module with its own namespace is fundamental to the language itself, it means that your code can be easily plugged into any existing project without the fear of namespace collisions.
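
    A small illustration using only the standard library: json and pickle each define a function named dumps, yet the two never collide because each name lives in its own module namespace. The payload dictionary is an invented example value.

```python
import json
import pickle

payload = {"site": "BuzzFeed", "posts": 3}

# Both modules define a function called `dumps`, but each lives in its
# own module namespace, so neither shadows the other.
as_json = json.dumps(payload)      # a str
as_pickle = pickle.dumps(payload)  # a bytes object

# The two names really do refer to different functions.
same_function = json.dumps is pickle.dumps

# Round-tripping through each module recovers the original data.
from_json = json.loads(as_json)
from_pickle = pickle.loads(as_pickle)
```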

    Packages: The Python distutils standard provides a fantastic set of core tools for bundling a group of modules into a package which can then be installed and imported into other Python projects. When used effectively, Python packages can be used to extend the DRY philosophy across an entire company or even (in conjunction with PyPI) the community at large.

    Standardization of Interfaces and Open-Source Software

    The Python community has produced a large number of excellent tools that solve many common programming tasks using simple, class-based interfaces. We rely heavily on these tools in our day-to-day development work at BuzzFeed.

    Web Development: For traditional server-side MVC web development, Django and Django REST Framework are hard to beat. These tools allow you to build performant, scalable, standards-compliant REST services with incredible ease.

    HTTP: Python’s comprehensive selenium bindings turn complicated scraping tasks into a manageable process. Likewise, the excellent requests package makes lower-level server-to-server calls a breeze. We also make heavy use of the retrying package in this context, for simple and idiomatic request retries.
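
    To show the general shape of the retry pattern, here is a rough sketch written with the standard library only. It is not the retrying package's API: the retry decorator, its parameters, and the flaky_fetch function are all hypothetical stand-ins, and the simulated failure takes the place of a real HTTP call.

```python
import functools
import time


def retry(max_attempts=3, wait_seconds=0.0, exceptions=(Exception,)):
    """Hypothetical decorator: call the function again if it raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Give up and re-raise once the last attempt has failed.
                    if attempt == max_attempts:
                        raise
                    time.sleep(wait_seconds)
        return wrapper
    return decorator


calls = {"count": 0}


@retry(max_attempts=3, exceptions=(ConnectionError,))
def flaky_fetch():
    """Stand-in for a server-to-server call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = flaky_fetch()
```

    Keeping the retry policy in a decorator means the function body only describes the request itself, which is the main appeal of this style.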

    Async: We mainly use Celery for simple offline and periodic tasks, often in combination with django-celery for easy integration into any Django project. We also use flower as a beautiful off-the-shelf monitoring solution and luigi for defining arbitrarily complex task pipelines.

    Event-driven applications: Some applications, such as high-throughput data pipes, require something more powerful than a standard REST service. For these types of applications, we rely on Tornado and pynsq. Relatedly, pynsq was developed by the talented team at Torando Labs, which was recently acquired by BuzzFeed.

    Test-Driven Development

    Python ships with a unit testing package, unittest, built into its standard library, which speaks volumes about the community’s dedication to best practices. Moreover, an incredible collection of additional testing tools has been added over time. Here are some of the most common ones we use at BuzzFeed:

    Test Runners: Like many Python shops, we have relied on the excellent nose package as our go-to choice for test runners. Recently we have also adopted pytest as an alternative. More than just a test runner, pytest could accurately be described as an entire testing framework.

    Object mocking: To keep test cases isolated, developers often discover the need to replace runtime objects and function calls with simple predefined “mock” objects. In Python, the gold standard for this functionality is the mock package.
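
    A brief sketch of the idea follows. Since Python 3.3 the mock package has also been available in the standard library as unittest.mock, which is what this example imports. The headline_for function and the client's get_article method are hypothetical, invented only to give the mock something to replace.

```python
from unittest import mock


def headline_for(client, article_id):
    """Hypothetical function under test: formats a title fetched by `client`."""
    article = client.get_article(article_id)
    return article["title"].upper()


# Replace the real client with a mock that returns a canned response.
fake_client = mock.Mock()
fake_client.get_article.return_value = {"title": "ten python tips"}

headline = headline_for(fake_client, 7)

# The mock records how it was called, so the test can verify the interaction.
fake_client.get_article.assert_called_once_with(7)
```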

    Stubbing out services: Similar to object mocking, your tests should ideally stub out calls to external APIs and other services. A nice way to achieve this is to make a live API call on the very first test run, then record the response and replay the recording in place of the API call in future test runs. At BuzzFeed we use VCR.py to achieve this.
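
    The record-then-replay idea can be sketched with the standard library alone. This is not VCR.py's API, which works at the HTTP layer and stores its recordings in files it calls cassettes. The replayable helper, the JSON file format, and fake_live_api are assumptions made purely for illustration.

```python
import json
import os
import tempfile


def replayable(cassette_path, live_call):
    """Hypothetical helper: record the first response, replay it afterwards."""
    if os.path.exists(cassette_path):
        # A recording exists, so skip the live call entirely.
        with open(cassette_path) as fh:
            return json.load(fh)
    response = live_call()
    with open(cassette_path, "w") as fh:
        json.dump(response, fh)
    return response


live_calls = []


def fake_live_api():
    """Stand-in for a real external API call."""
    live_calls.append(1)
    return {"status": 200, "body": "hello"}


with tempfile.TemporaryDirectory() as tmp:
    cassette = os.path.join(tmp, "cassette.json")
    first = replayable(cassette, fake_live_api)   # hits the "live" API
    second = replayable(cassette, fake_live_api)  # served from the recording
```

    Only the first call reaches the stand-in API, so later test runs stay fast and do not depend on the external service being available.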

    Auto-fixtures: Many developers can relate to the frustration of generating and maintaining large, complex fixture files simply to provide test data for an application. Wouldn’t it be so much easier and more robust if your tests could just parse your model definitions and generate randomized fixtures on the fly? As Python developers, we are in luck, because the packages model_mommy and factory_boy both provide that exact functionality.
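
    To make the idea concrete, here is a toy sketch that inspects a model's declared fields and fills them with random values. It uses a standard-library dataclass in place of a Django model and does not reflect the API of model_mommy or factory_boy. Post, random_value, and make are hypothetical names.

```python
import dataclasses
import random
import string


@dataclasses.dataclass
class Post:
    """Hypothetical model; a Django model would play this role in practice."""
    title: str
    views: int
    published: bool


def random_value(field_type, rng):
    """Produce a random value for the handful of types this sketch supports."""
    if field_type is str:
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    if field_type is int:
        return rng.randint(0, 1000)
    if field_type is bool:
        return rng.choice([True, False])
    raise TypeError(f"unsupported field type: {field_type!r}")


def make(model, rng=None, **overrides):
    """Build an instance by inspecting the model's declared fields."""
    rng = rng or random.Random()
    values = {
        f.name: random_value(f.type, rng) for f in dataclasses.fields(model)
    }
    values.update(overrides)  # explicit values win over random ones
    return model(**values)


post = make(Post, rng=random.Random(0), title="Pinned title")
```

    A test can pin only the fields it cares about, as with title here, and let everything else be generated.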

    We covered only the basics in this post, but Python also excels in more advanced areas such as mathematical programming, natural-language processing, and machine learning, all of which see regular usage at BuzzFeed. We hope to cover some of these topics in future posts, so stay tuned.

    P.S. If you’re a Python developer and would like to learn more about how we use Python at BuzzFeed, check out our Jobs page!