Vinay Pai: Jay Kreps Shares Lessons He Learned at LinkedIn

Jay Kreps, Principal Staff Engineer, shared lessons he learned during his past seven years at LinkedIn.

Scaling the Site: Started with a standard Oracle DB and separate systems for search and the social graph. Moved to a key-value store (Voldemort) and Hadoop, which augmented the Oracle DBs. They have now added Kafka on top of Hadoop and separated out newsfeed and Espresso.

A few simple, cheap primitives: Just as OpenGL has triangles as its basic primitive, think in terms of high-performance basic primitives such as a key-value store. He picked on an anti-pattern from alums with a centralized user table.
Ops first: Teams with the most operational focus spent the least amount of time on operations. Keep production designs simple and do them well, maybe at the expense of newer, fancier designs.
Do hard things later: Use asynchronous and offline/batch processing when possible instead of request-response patterns.
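
To make "do hard things later" concrete, here is a minimal Python sketch (mine, not from the talk). The request handler only records an event and returns; a background worker does the expensive aggregation afterwards. The in-process queue stands in for a durable log such as Kafka, and every name here (handle_profile_view, drain_events, view_counts) is hypothetical.

```python
import queue
import threading

# Hypothetical in-process stand-in for a durable log or message queue.
events = queue.Queue()
view_counts = {}  # derived data, updated off the request path


def handle_profile_view(viewer_id, profile_id):
    """Request path: record the event and return immediately."""
    events.put((viewer_id, profile_id))
    return "ok"


def drain_events():
    """Background path: aggregate events after the request has returned."""
    while True:
        item = events.get()
        if item is None:  # sentinel value: stop the worker
            break
        _viewer_id, profile_id = item
        view_counts[profile_id] = view_counts.get(profile_id, 0) + 1


def run_demo():
    """Serve three requests, then wait for the worker to catch up."""
    view_counts.clear()
    worker = threading.Thread(target=drain_events)
    worker.start()
    for viewer in ("a", "b", "c"):
        handle_profile_view(viewer, "jay")
    events.put(None)  # tell the worker there is nothing more to process
    worker.join()
    return dict(view_counts)
```

The handler never touches view_counts, so its latency does not depend on how expensive the aggregation becomes.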

Scaling the code base: How do you scale as you grow the size of the Engineering team? Started with a monolithic application and decomposed the application into services. Originally, this was not a very disciplined decomposition.

Services (may) scale development. Bad services are worse than no services. There were 300-400 services without appropriate tools to diagnose and debug them, and dependencies were not well understood. Treat the service as a product, with a team that owns the service and provides appropriate documentation and artifacts. Service layer evolution: Moved from Spring-RPC to REST + JSON.
The service contract is binary: Instead of just exposing existing code as a service, develop the service contract first and then write the appropriate code.
Isolation vs Utilization: Develop services with the right granularity. How many services should you have? As a rule of thumb, he recommends having a team of 5 develop each service to ensure that services have the right granularity. He recommends against a model where a single engineer is responsible for several services.
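
The contract-first advice above can be sketched in Python as follows. This is my own illustration under stated assumptions, not LinkedIn's code: Profile, ProfileService, and InMemoryProfileService are hypothetical names. The abstract class is the contract, written before any implementation exists; the in-memory class is one implementation written against it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical payload type that is part of the contract."""
    member_id: int
    headline: str


class ProfileService(ABC):
    """The contract: agreed on before any implementation is written."""

    @abstractmethod
    def get_profile(self, member_id: int) -> Profile:
        """Return the profile, or raise KeyError if the member is unknown."""


class InMemoryProfileService(ProfileService):
    """One implementation written against the contract (for illustration)."""

    def __init__(self):
        self._rows = {}

    def put(self, profile: Profile) -> None:
        self._rows[profile.member_id] = profile

    def get_profile(self, member_id: int) -> Profile:
        # A dict lookup raises KeyError for unknown members, as the contract says.
        return self._rows[member_id]
```

Callers depend only on ProfileService, so the backing implementation can change without touching them.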

Scaling software engineering: Cutting up the codebase.

Build your process: They moved from using wiki pages that track the lifecycle to tools and services that manage it: from review, to check-in, to build, to rollout.
Governance: Need a balance between central planning (communist) and a free-for-all where every team does its own thing (capitalist).
Treat code as property. Someone is responsible for the code: its functionality and hygiene. That eliminated a lot of bureaucracy.
But you need effective government. Don’t duplicate integration layers. Standardize where you can, for example monitoring.

Great stuff! I always enjoy learning about best practices and lessons from other companies. It's a great way to compare notes and also accelerate your own Engineering practices and technology.

Vinay Pai

Nov 5, 2014
