Q: Is C++ good as a first language to learn?
A: Not really. As an educator I’d recommend something as simple as possible - the most important first step is to learn to be an algorithmic thinker. The more you can get language complexity out of the way while tackling that task, the better. I’m personally a fan of Python for intro programming, although eventually something like Rust, Go, or C++ is important for back-end / systems programming.
Q: Did you have experience before applying to your first software development job?
A: My first programming jobs were back in the late 90s and were very web-development focused. I had already been programming on my own for maybe 5 years by that point.
Q: In academia, I have been pounding my fist for years regarding the importance and value of a Software Engineering undergraduate degree, in contrast to Computer Science. Why do you think we do not see lots of SE degree programs in the US? Do you also believe there is a need/opportunity for higher ed to provide authentic SE degrees?
A: If SE is programming, time, and teamwork, it’s inherently challenging to make it authentic in an undergraduate curriculum. It’s hard to have projects or lessons where time becomes the dominant factor when we only have a couple months. Working on a team of people with the same background / experience is also awkward and inauthentic. I think the material we can/should teach undergrads can be improved on these points, but it’s hard. You’ll get more (and more authentic) experience in the first month on the job than you will in a class.
Q: Seemingly offhand, our enterprise production manager has determined that we cannot afford the man-hours for simply upgrading our campus cluster’s operating system (currently CentOS 6.5) to support the installation of a container platform (Singularity). How do I/we go about assembling a business case for that upgrade?
A: I love citing security vulnerabilities for this. With Heartbleed, Spectre, Meltdown, Log4j, etc. all making the news, it isn’t too hard to show that even the most common tech has the potential to be vulnerable. Even if there isn’t a published vulnerability today, that’s no promise for the future. Really you’ve got three options: stay current (many small/cheap upgrades), upgrade when it’s an emergency (one large, risky, and expensive upgrade), or don’t upgrade (risking a lot). From a business perspective, avoiding risk AND having a known set of costs are both valuable.
Q: Regarding Hyrum’s Law, if an API consumer depends upon observable behavior not covered by the API specification/documentation, isn’t the onus on the consumer to make changes to their client code when that behavior changes?
A: Yes and no. In a monorepo or CI/CD world, proving that a given breakage is because of inappropriate use still takes effort. And more often than not the breakage isn’t limited to the team that violated that contract - all of their users are also affected. But even in a totally distributed model, changing anything that people believed was working is going to cause some grumbling and reputation cost. It’s best to try to mitigate it from the start.
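To make the question concrete, here is a minimal sketch of the situation, assuming a hypothetical API whose documentation promises only "returns the names of active users" and says nothing about ordering. All function names and data are invented for illustration.

```python
# Hypothetical API: the docs promise the active user names, in no particular order.
def active_users_v1(users):
    """Return the names of active users (order unspecified)."""
    # Implementation detail: v1 happens to sort its result.
    return sorted(name for name, active in users.items() if active)


def active_users_v2(users):
    """Same documented contract, new implementation without the sort."""
    return [name for name, active in users.items() if active]


def first_alphabetical_fragile(get_users, users):
    # Depends on observable but undocumented behavior: assumes sorted output.
    return get_users(users)[0]


def first_alphabetical_robust(get_users, users):
    # Depends only on the documented contract.
    return min(get_users(users))


users = {"zoe": True, "adam": True, "mia": False}
```

Swapping v1 for v2 honors the documentation, yet the fragile caller silently changes its answer from "adam" to "zoe". Working out that the caller, not the API, broke the contract is the effort the answer refers to.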
Q: What should I do first to become a software engineer?
A: Learn to program. Then practice reading code, technical communication, and fixing bugs in code you didn’t write.
Q: What would be a good way to update projects that have 3rd party dependencies which themselves depend on old language versions with deprecated features?
A: There isn’t a cheap/easy answer to that. Either try to drop those 3p dependencies, or try to upgrade them and contribute those patches upstream.
Q: Do software engineers of large scale systems have the luxury to try new things (language support, library update) for better performance? Might not be merged but exploring the possibilities of new implementations.
A: Oh, absolutely. A surprising amount of our infrastructure revolves around “How can we run experiments safely?” and “Is X better performance than Y?”
Q: From the perspective of applying these principles, what do you see as the primary difference between software and hardware engineering?
A: Hardware scares me, specifically because the release cycle is so much longer. We know that high-performing software teams have fast release times and can release several times a week or more. Getting new hardware versions on a daily basis sounds like a recipe for chaos - so everything we know about the software process needs to be applied to hardware just prior to the “send the design to the factory” step. The stakes are just a lot higher, but that also means attention to quality process is at least as important.
Q: There is a plethora of software quality metrics such as cyclomatic complexity, Halstead complexity measures, function points etc., yet it is very difficult to get a clear signal on software quality, since most of them are either not too informative or easily misleading. Which metrics do you recommend putting emphasis on?
A: I think the union of all of those tends to give some signal, but we still run the risk of streetlight effects. Those sorts of static metrics capture one form of complexity, but dynamic operations (like microservices and production environments) can’t generally be tracked by those. And those dynamic things tend to be harder/scarier. In the end, I think asking all of the devs/engineers on the project “Where do we have the most technical debt / unnecessary complexity?” is probably a better signal - it’s not quantitative/objective, but it does draw from reality.
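For a sense of what a static metric actually measures, here is a rough sketch that approximates cyclomatic complexity as one plus the number of decision points found in a Python syntax tree. The choice of node types is an assumption of this sketch, and real tools count more cases. Note that it only ever looks at source text, which is why it cannot say anything about runtime or production behavior.

```python
import ast

# Node types treated as decision points in this approximation (an assumption).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and two short-circuit branches.
            decisions += len(node.values) - 1
    return 1 + decisions


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2 == 0:
        return "negative even"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''
```

For SAMPLE the count is two `if` statements, one `and`, and one `for`, giving a score of 5.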
Q: How can the team that triggers the change and has to do the bulk of the work, work to make the necessary changes in systems and code bases they are not familiar with?
A: Local invariants and reasoning. Obviously you can’t replace an airplane with a minivan - the replacements have to be basically similar. And if you know “this is an airplane like X, but now it’s green and the cabin door is 500cm higher” it’s not really hard to figure out how to swap it out, even without knowing much about the local airport.
Q: What does he mean by Sub-Linear?
A: Assume you assign 1 person on a 10 person project to do one piece of the work. Now the work gets 10x larger and the team gets 10x larger - do you get 10 people to do that work? Or does it take 20? Or can you do it in 5? You want to do it in 5, but that requires automation, expertise, consistency, etc.
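One way to put numbers on that example is to assume a power-law model, people_after = people_before * growth ** k. That model is an assumption of this sketch and is not something stated in the talk. Under it, k = 1 is linear, k < 1 is sub-linear, and k > 1 is super-linear.

```python
import math


def scaling_exponent(people_before, people_after, growth):
    """Exponent k such that people_after == people_before * growth ** k."""
    return math.log(people_after / people_before) / math.log(growth)


def classify(k, tol=1e-9):
    """Label an exponent as linear, sub-linear, or super-linear."""
    if abs(k - 1.0) <= tol:
        return "linear"
    return "sub-linear" if k < 1.0 else "super-linear"


# The three outcomes from the answer, for work that grew 10x:
for people_after in (10, 20, 5):
    k = scaling_exponent(1, people_after, 10)
    print(people_after, "people:", classify(k))
```

Going from 1 person to 5 while the work grows 10x corresponds to an exponent of roughly 0.7 under this model.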
Q: How do you reduce dependence on “long lived dev branches”?
A: Feature flags, build from trunk, and commit in small pieces. (See the ending chapters in the Flamingo book or some of the flags/experiments/release chapters in the SRE books).
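A minimal sketch of the feature-flag part of that answer, assuming a hypothetical in-memory flag store and an invented checkout example. Real systems usually read flags from a configuration service. The point is that unfinished code can be committed to trunk in small pieces while staying switched off.

```python
# Hypothetical flag store; the flag name and values are made up for this sketch.
FLAGS = {"new_checkout_flow": False}


def is_enabled(flag_name, flags=FLAGS):
    # Unknown flags default to off, so half-finished code stays dark.
    return flags.get(flag_name, False)


def legacy_total(prices_cents):
    return sum(prices_cents)


def new_total(prices_cents):
    # Behavior still under development: 10% off orders of 100.00 or more.
    subtotal = sum(prices_cents)
    if subtotal >= 10000:
        return subtotal - subtotal // 10
    return subtotal


def checkout_total(prices_cents, flags=FLAGS):
    # Both code paths live in trunk; the flag decides which one runs.
    if is_enabled("new_checkout_flow", flags):
        return new_total(prices_cents)
    return legacy_total(prices_cents)
```

With the flag off, `checkout_total([6000, 5000])` returns 11000. Passing `{"new_checkout_flow": True}` as the flags returns 9900, with no branch merge involved.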
Q: On the slide for shifting left, why are “Unit Tests” listed as a post-submit test, rather than as a pre-submit test?
A: Should be both, really.
Q: If I understood correctly, you said that there is a (scientifically validated) publication showing that working in the trunk leads to better results than working in branches. Could you please provide the link to this publication?
A: My favorite citation for this (there are several) is the book “Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations” by Forsgren et al.
Q: What strategies do you find useful to encourage small fry organizations to embrace these principles? I often hear a justification “We are not Google.”
A: I think “We are not Google” is fair - most of the things that would kill us at scale would be an annoyance to others. (Like, deciding on a merge strategy in a company-wide weekly meeting.) In the book we tried to be as clear as we could about what tradeoffs are involved in adopting any given policy/practice/technology. I don’t really want to say, “Do it this way, we know best,” I want y’all to honestly evaluate it and pick what will be best in your context. But I don’t see that: so many shops seem very fixated on short-term cost-cutting and ignore the long-term productivity implications. Maybe start small, “We are not Google, we don’t have a massive build farm. Buy us faster machines, we spend many times more on payroll than on computers.”
Q: How did you become confident in your own abilities?
A: Wait, did I? I don’t feel confident. (Honestly, imposter syndrome is very significant in Google and across the industry.)
Q: What would you recommend a college student do to better their chances at getting a software engineering job?
A: Practice. I think it honestly takes 2-3x more hours of programming drill and practice to become a fluent programmer than are required in a CS program. It’s at least as complex as learning to read - and it takes the same sort of time commitment and commitment to practice.
Q: Can you show that last quote again!
A: It’s programming if clever is a compliment.
It’s software engineering if clever is an accusation.
Q: What is a canary?
A: Rather than release a new version of the software to all users all at once, we do gradual rollouts. These “canary” releases are then checked to see if they use similar resources, don’t crash, produce similar results, etc., so we get some ground-truth that the new potential release is “good.” (See the SRE book.)
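The check described above might be sketched as follows. The metric names, the 10% tolerance, and the rollout steps are all assumptions made up for illustration and do not describe any particular production system.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    crash_rate: float      # fraction of requests that crashed
    p50_latency_ms: float  # median request latency
    memory_mb: float       # resident memory


def canary_is_healthy(baseline, canary, tolerance=0.10):
    """True if no canary metric exceeds its baseline by more than `tolerance`."""
    for field in ("crash_rate", "p50_latency_ms", "memory_mb"):
        base = getattr(baseline, field)
        new = getattr(canary, field)
        if new > base * (1 + tolerance):
            return False
    return True


def next_rollout_percent(current, healthy, steps=(1, 5, 25, 100)):
    """Advance to the next rollout step when healthy; roll back to 0 otherwise."""
    if not healthy:
        return 0
    for step in steps:
        if step > current:
            return step
    return steps[-1]


baseline = Metrics(crash_rate=0.001, p50_latency_ms=120.0, memory_mb=512.0)
good = Metrics(crash_rate=0.001, p50_latency_ms=125.0, memory_mb=520.0)
bad = Metrics(crash_rate=0.010, p50_latency_ms=125.0, memory_mb=520.0)
```

Here `good` would let the rollout move from 1% to 5% of users, while `bad` (ten times the crash rate) sends it back to 0%.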
Q: How do you propagate these values across the corporation, with employee “churn”?
A: By writing things down and giving the same talks many, many times. It might be enough, but it’s definitely a challenge.
Q: With incremental development and Agile all the rage, how do you balance upfront design to take advantage of affordability of those phases versus the need to develop incrementally?
A: I don’t think every small incremental piece needs the same design attention, but we do need to design the big components first. Then those can be broken up in several Agile sprints (in parallel or series). I don’t see Agile as conflicting with any of this.
Q: Can you remind me, concisely, how expertise turns scale problems into a benefit? Or what you really said?
A: If you’ve got 10 people working on a project, having one of them be a superstar in (language, testing, design, graphics, whatever) is limited to what they can do directly + the influence they have on 9 people. But if you’ve got 100 people, that expert can influence 99 others - the balance starts to shift to having more potential impact through education and influence, but that depends on having scale to start with.
Q: What are the most interesting challenges you came across during your time at Google?
Q: GitHub Code Reviews seem to rely exclusively on branch-based Pull Requests. Do you have any suggestions on building good Code Review practices around trunk-based development?
A: Short-lived branches aren’t a problem - every commit to git or any other version control system is morally equivalent to a short-lived branch. The real concern is to ensure that everyone knows to commit to trunk, and to only depend on the version in trunk (not someone else’s work in flight).
Q: How’s the work-life balance at Google?
A: It varies, Google is a huge place. Down in the areas where I work it’s very good and our management is very supportive. I can’t speak with any authority about the rest of the company - it’s too big to be consistent.
Q: Having extensive experience in the field, do you have any tips for those who are just starting their careers?
A: Practice, practice, practice. Read, watch talks, and consciously practice.
Q: Thoughts on monorepo based development?
A: Wildly in favor. I can’t imagine scaling up to even 100 people without something like this. But it doesn’t have to specifically be one repo, just like a filesystem can be composed of multiple storage devices - it’s the usage model, not the implementation that matters.
Q: How do you think software development life cycle management will change as a result of the exponential increase in cybersecurity attacks?
A: I expect more reliance on property-based testing, fuzzing, dynamic analysis, and test coverage. In most domains I see those approaches as being the sweet spot in the space between formal methods proofs of correctness vs. ad hoc test case generation (or no tests).
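As a small illustration of property-based testing, here is a hand-rolled sketch using only the standard library: instead of a few hand-picked cases, it checks that a property (decode(encode(s)) == s) holds for many randomly generated inputs. The run-length codec is a made-up example. Dedicated libraries such as Hypothesis add input shrinking and smarter generation on top of this idea.

```python
import random


def run_length_encode(text):
    """Encode 'aaab' as [('a', 3), ('b', 1)]."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded


def run_length_decode(pairs):
    """Inverse of run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=500, seed=0):
    """Property: decoding an encoded string gives back the original string."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(trials):
        length = rng.randint(0, 30)
        # A tiny alphabet makes repeated runs (the interesting case) likely.
        s = "".join(rng.choice("ab \n") for _ in range(length))
        assert run_length_decode(run_length_encode(s)) == s, repr(s)
    return trials
```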
Q: How would you suggest addressing a need for change where you have highly entrenched features that are “bad” or negative for future customers but that existing customers absolutely depend on?
A: It depends a lot on context. Sometimes you can cut a final legacy release for those existing customers and move that legacy branch to be only maintenance or fully unsupported. Sometimes you can get those existing customers to explicitly opt-in to the legacy behavior, and then you can provide the new behavior as a default for new customers (which is much cleaner). In most/all cases you’ll have to find a way to have two versions of the behavior, either separated in time (distinct releases) or switched per user configuration, etc. It’s hard.
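The "switched per user configuration" option could look roughly like this. The customer record, the `legacy_rounding` setting, and the invoice example are hypothetical, invented only to show the shape: new customers get the new behavior by default, and existing customers explicitly opt in to the old one.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    # Hypothetical opt-in setting; off by default so new customers get new behavior.
    legacy_rounding: bool = False


def invoice_total_legacy(cents):
    # The entrenched behavior some customers rely on: truncate to whole dollars.
    return (cents // 100) * 100


def invoice_total_new(cents):
    # The new default behavior: keep exact cents.
    return cents


def invoice_total(customer, cents):
    # Both versions coexist; per-customer configuration selects one.
    if customer.legacy_rounding:
        return invoice_total_legacy(cents)
    return invoice_total_new(cents)
```

The cost is visible even in this toy version: both code paths have to be kept, tested, and maintained for as long as anyone is opted in.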
Q: A “people” question. Developers, like any other population, respond to incentives. How do you reward great developers? Classic problem: how do you differentiate programmers who cause and fix many bugs from those who seldom write buggy code in the first place? The latter folks often go unnoticed.
A: In theory this should be handled by having peers as part of performance reviews - if your peers are fed up with your hacks and buggy code, they’re gonna say something. In practice, I’m not sure we know exactly how to handle that rationally. You’re a lot more likely to get attention/funding/praise for heroically rescuing the company from an outage (that you may have contributed to) than for just consistently doing quiet, solid work.
Q: Should developers test and debug their code in a container-based environment?
A: I’m not sure “container” is the requirement, but that’s certainly one approach. The environment does need to be consistent across developers and production, and containers are a good way to get there. “It worked on my machine” is a pretty strong indicator that the software process isn’t quite as reliable as it should be.
Q: If you are working on projects which are expected to last years or decades would you rather use standard code versus proprietary functions, like it is an option in SQL (ISO vs. Microsoft, IBM, Oracle proprietary functions)?
A: It depends, but largely I think it’s cheaper to build a thing yourself than to migrate off of an interface that you’ve already been using for years. Building it yourself means you pay more up front and have more ongoing maintenance (and training costs), but you won’t have the same sorts of surprises when that vendor goes away or you lose license rights and have to scramble to migrate away. That balance can go either way depending on the timeframes and circumstances.
Q: Do you have advice/insight on applying automated fixes (e.g. clang-tidy fixes, clang-format fixes) across a large codebase? Would you touch old modules not touched in several years?
A: Yep, definitely. The SREs have an important phrase, “No Haunted Graveyards.” This means there can’t be things in your software environment that people are afraid to touch - those are going to be where the bugs come from, and ignoring it just exacerbates the problem.
Q: Do you feel the field of software engineering in academia is properly addressing problems encountered in the wild, given that most groups/professors/PhD students are NOT working on “multi-person multi-version projects” themselves?
A: Generally not. There’s certainly some good work there, but I do see a fair number of papers in software engineering conferences/journals that are more “anthropology of software engineering” than software engineering. “We studied 20 software engineers and observed the following behaviors” is often not the same topic as “What is the role of testing in a high-performing organization” or “Why is it easier to refactor a function than a class/type?”
Q: Sometimes you don’t know the time aspect of the software. How do you deal with code that started out with a short expected lifespan but turns into long-lived code? Do you have some inspection mechanism to handle that?
A: Nope. Try to overestimate? Being aware of the problem is a big step in the right direction, but it’s impossible to be perfect: that’d require us to accurately predict the future.
Q: There’s a strong tendency in many software cultures to favor flat organizational and communication structures that have a lot of democracy and broadcast-style communication, but does that negatively affect super-linear scaling and communication/synchronization costs? If so, is a deeper hierarchy in the organization the only solution?
A: Google seems to keep “discovering” that ~everything needs an owner or decider. The open discussion is still important, and most things deserve some level of consensus to avoid “because I said so,” but you can’t really have everyone have a veto on every topic. You can’t even really ask everyone to learn the details of everything. I don’t think hierarchy is inherently the answer, but “Everyone is in charge!” certainly fails as we grow. I suspect there’s a lot more (and better) information from business and management and organization thinkers - we’re not unique on this.