While working as an SDE2 at Google, I noticed a 0.3% webhook drop rate in the Platform team's payment notification service. This service was not my team’s responsibility, no ticket existed, and nobody had asked me to investigate. The drop caused delayed payment confirmations affecting merchant trust and revenue flow. I chose to investigate proactively despite the risk of stepping outside my scope and potential pushback from the Platform team.
Transcript
In this scenario, the candidate noticed a 0.3% webhook drop rate in a service outside their team with no ticket or request to investigate. They chose transparency despite risk, pulled logs, traced and fixed a race condition, and added alerts. The drop rate went to zero, recovering $8K weekly, and the fix pattern was adopted team-wide. Reflection highlighted the lack of shared webhook SLOs as an organizational gap. Key takeaways: explicit ownership proof, quantifying impact, and systemic insight in reflection.