vasanthk / how-web-works

What happens behind the scenes when we type www.google.com in a browser?

16,717 stars
1,748 forks
7 issues

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing vasanthk/how-web-works in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.

Repository Overview (README excerpt)

How Web Works

What happens behind the scenes when we type google.com in a browser?

**Table of Contents**

• Google's 'g' key is pressed
• When you hit 'Enter'
• Parse the URL
• Check HSTS list (deprecated)
• DNS lookup
• Opening of a socket + TLS handshake
• HTTP protocol
• HTTP Server Request Handle
• Server Response
• Behind the scenes of the Browser
• The browser's high level structure
• Rendering Engine
• The Main flow
• Parsing Basics
• DOM Tree
• Why is the DOM slow?
• Render Tree
• Render tree's relation to the DOM tree
• CSS Parsing
• Layout
• Painting
• Trivia
• The birth of the web

Google's 'g' key is pressed

When you just press "g", the browser receives the event and the entire auto-complete machinery kicks into high gear. Depending on your browser's algorithm and whether you are in private/incognito mode, various suggestions will be presented to you in the dropdown below the URL bar. Most of these algorithms prioritize results based on search history and bookmarks. You are going to type "google.com", so none of it matters, but a lot of code will run before you get there and the suggestions will be refined with each key press. It may even suggest "google.com" before you type it.

When you hit 'Enter'

To pick a zero point, let's choose the Enter key on the keyboard hitting the bottom of its range. At this point, an electrical circuit specific to the Enter key is closed (either directly or capacitively). This allows a small amount of current to flow into the logic circuitry of the keyboard, which scans the state of each key switch, debounces the electrical noise of the rapid intermittent closure of the switch, and converts it to a keycode integer, in this case 13. The keyboard controller then encodes the keycode for transport to the computer. This is now almost universally over a Universal Serial Bus (USB) or Bluetooth connection.
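The debouncing step described above can be illustrated with a small sketch. This is not how any particular keyboard controller is implemented; it assumes the controller sees the switch as a stream of boolean samples and only accepts a new state after a hypothetical `stable_count` consecutive samples agree. The function names are made up for illustration; the keycode 13 comes from the text.

```python
from typing import Iterable, Iterator, List

ENTER_KEYCODE = 13  # keycode for Enter, as stated in the text


def debounce(samples: Iterable[bool], stable_count: int = 3) -> Iterator[bool]:
    """Yield the debounced switch state each time it changes.

    The reported state flips only after `stable_count` consecutive raw
    samples disagree with it, so short bursts of contact noise are ignored.
    `stable_count` is an illustrative assumption, not a hardware constant.
    """
    state = False  # assume the key starts released
    run = 0        # consecutive samples that differ from `state`
    for sample in samples:
        if sample == state:
            run = 0
            continue
        run += 1
        if run >= stable_count:
            state = sample
            run = 0
            yield state


def key_events(samples: Iterable[bool], keycode: int = ENTER_KEYCODE,
               stable_count: int = 3) -> List[int]:
    """Return one keycode per debounced press; releases are ignored here."""
    return [keycode for pressed in debounce(samples, stable_count) if pressed]


if __name__ == "__main__":
    # A noisy press: the contact bounces before settling, then is released.
    noisy = [False, True, False, True, True, True, True,
             False, True, True, False, False, False]
    print(key_events(noisy))
```

With the noisy sample stream in the `__main__` block, the bounces are filtered out and a single press is reported.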
In the case of the USB keyboard:

• The generated keycode is stored in the keyboard's internal memory, in a register called the "endpoint".
• The host USB controller polls that "endpoint" every ~10ms, so it gets the keycode value stored in it.
• This value goes to the USB SIE (Serial Interface Engine) and is sent at a maximum speed of 1.5 Mb/s (USB 2.0).
• This serial signal is then decoded at the computer's host USB controller, and interpreted by the computer's Human Interface Device (HID) universal keyboard device driver.
• The value of the key is then passed into the operating system's hardware abstraction layer.

In the case of touch screen keyboards:

• When the user puts their finger on a modern capacitive touch screen, a tiny amount of current gets transferred to the finger. This completes the circuit through the electrostatic field of the conductive layer and creates a voltage drop at that point on the screen. The screen controller then raises an interrupt reporting the coordinates of the 'click'.
• Then the mobile OS notifies the currently focused application of a click event in one of its GUI elements (in this case, the buttons of the virtual keyboard application).
• The virtual keyboard can now raise a software interrupt for sending a 'key pressed' message back to the OS.
• This interrupt notifies the currently focused application of a 'key pressed' event.

Parse the URL

The browser now has the following information contained in the URL (Uniform Resource Locator):

• Protocol "http": Use 'Hyper Text Transfer Protocol'
• Resource "/": Retrieve main (index) page

When no protocol or valid domain name is given, the browser proceeds to feed the text given in the address box to the browser's default web search engine.

Check HSTS list (deprecated)

• ~The browser checks its "preloaded HSTS (HTTP Strict Transport Security)" list.
This is a list of websites that have requested to be contacted via HTTPS only.~
• ~If the website is in the list, the browser sends its request via HTTPS instead of HTTP. Otherwise, the initial request is sent via HTTP.~

Note: The website can still use the HSTS policy without being in the HSTS list. The first HTTP request to the website by a user will receive a response requesting that the user only send HTTPS requests. However, this single HTTP request could potentially leave the user vulnerable to a downgrade attack, which is why the HSTS list is included in modern web browsers.

Modern browsers request HTTPS first.

DNS lookup

The browser tries to figure out the IP address for the entered domain. The DNS lookup proceeds as follows:

• **Browser cache:** The browser caches DNS records for some time. Interestingly, the OS does not tell the browser the time-to-live for each DNS record, and so the browser caches them for a fixed duration (varies between browsers, 2 – 30 minutes).
• **OS cache:** If the browser cache does not contain the desired record, the browser makes a system call (gethostbyname in Windows). The OS has its own cache.
• **Router cache:** The request continues on to your router, which typically has its own DNS cache.
• **ISP DNS cache:** The next place checked is your ISP's DNS server, which naturally has its own cache.
• **Recursive search:** Your ISP's DNS server begins a recursive search, from the root nameserver, through the .com top-level nameserver, to Google's nameserver. Normally, the DNS server will have names of the .com nameservers in cache, and so a hit to the root nameserver will not be necessary.

Here is a diagram of what a recursive DNS search looks like:

One worrying thing about DNS is that an entire domain like wikipedia.org or facebook.com seems to map to a single IP address.
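The first two lookup steps, and the question of how many addresses a name really maps to, can be sketched in a few lines. This is a minimal illustration, not how any browser implements its cache: `CachingResolver`, `system_lookup`, and the 120-second lifetime are assumptions chosen for the example (the text only says the duration is fixed and varies between browsers). `socket.getaddrinfo` is the standard-library call that asks the OS resolver.

```python
import socket
import time
from typing import Callable, Dict, List, Tuple


def system_lookup(hostname: str) -> List[str]:
    """Ask the OS resolver for every address it returns for `hostname`."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


class CachingResolver:
    """Hypothetical browser-style cache with a fixed lifetime per entry."""

    def __init__(
        self,
        lookup: Callable[[str], List[str]] = system_lookup,
        ttl_seconds: float = 120.0,  # assumed value for illustration
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def resolve(self, hostname: str) -> List[str]:
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no system call needed
        addresses = self._lookup(hostname)  # cache miss: fall through to the OS
        self._cache[hostname] = (now + self._ttl, addresses)
        return addresses


if __name__ == "__main__":
    resolver = CachingResolver()
    # Requires a working resolver; the result may contain several addresses.
    print(resolver.resolve("localhost"))
```

The returned list can hold more than one address, so a name does not have to map to a single IP.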
Fortunately, there are ways of mitigating the bottleneck:

• **Round-robin DNS** is a solution where the DNS lookup returns multiple IP addresses, rather than just one. For example, facebook.com actually maps to four IP addresses.
• **Load-balancer** is a piece of hardware that listens on a particular IP address and forwards the requests to other servers. Major sites will typically use expensive high-performance load balancers.
• **Geog…