vahidk/EffectiveTensorflow
TensorFlow tutorials and best practices.
Repository Overview (README excerpt)
# Effective TensorFlow 2

## Table of Contents

### Part I: TensorFlow 2 Fundamentals

- TensorFlow 2 Basics
- Broadcasting the good and the ugly
- Take advantage of the overloaded operators
- Control flow operations: conditionals and loops
- Prototyping kernels and advanced visualization with Python ops
- Numerical stability in TensorFlow

---

_We updated the guide to follow the newly released TensorFlow 2.x API. If you want the original guide for TensorFlow 1.x, see the v1 branch._

_To install TensorFlow 2.0 (alpha), follow the instructions on the official website._

_We aim to gradually expand this series by adding new articles and to keep the content up to date with the latest releases of the TensorFlow API. If you have suggestions on how to improve this series or find the explanations ambiguous, feel free to create an issue, send patches, or reach out by email._

## Part I: TensorFlow 2 Fundamentals

### TensorFlow Basics

TensorFlow 2 underwent a massive redesign to make the API more accessible and easier to use. If you are familiar with NumPy, you will feel right at home when using TensorFlow 2. Unlike TensorFlow 1, which was purely symbolic, TensorFlow 2 hides its symbolic nature under the hood and behaves like any other imperative library, such as NumPy. It's important to note that the change is mostly an interface change: TensorFlow 2 can still take advantage of its symbolic machinery to do everything TensorFlow 1.x could (e.g. automatic differentiation and massively parallel computation on TPUs/GPUs).

Let's start with a simple example: multiplying two random matrices, first in NumPy and then in TensorFlow 2 (see the sketch after this section). Similar to NumPy, TensorFlow 2 performs the computation immediately and produces the result. The only difference is that TensorFlow stores the result in a tf.Tensor, which can easily be converted to NumPy by calling its tf.Tensor.numpy() member function.

To understand how powerful symbolic computation can be, let's look at another example. Assume that we have samples from a curve (say f(x) = 5x^2 + 3) and we want to estimate f(x) based on these samples. We define a parametric function g(x, w) = w0 x^2 + w1 x + w2, which is a function of the input x and latent parameters w; our goal is then to find the latent parameters such that g(x, w) ≈ f(x). This can be done by minimizing the loss function L(w) = Σ (f(x) - g(x, w))^2. Although there is a closed-form solution for this simple problem, we opt for a more general approach that can be applied to any arbitrary differentiable function: stochastic gradient descent. We simply compute the average gradient of L(w) with respect to w over a set of sample points and move in the opposite direction. The sketch after this section shows one way to do this in TensorFlow; running it should give parameter estimates close to the true values w ≈ [5, 0, 3].

Note that with the code written this way we are running TensorFlow in imperative mode (i.e. operations get executed instantly), which is not very efficient. TensorFlow 2 can also turn a given piece of Python code into a graph, which can then be optimized and efficiently parallelized on GPUs and TPUs. To get all of those benefits, we simply need to decorate the train_step function with the tf.function decorator. What's cool about tf.function is that it's also able to convert basic Python statements like while, for, and if into native TensorFlow ops.
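The code listings this section refers to are not included in the excerpt above, so here is a minimal sketch of the described steps, assuming standard TensorFlow 2 APIs (tf.matmul, tf.GradientTape, tf.function, tf.optimizers.SGD). The variable names, sample size, step count, and learning rate are illustrative choices, not the repository's original code.

```python
import numpy as np
import tensorflow as tf

# Matrix multiplication in NumPy.
x = np.random.normal(size=[10, 10])
y = np.random.normal(size=[10, 10])
z = np.dot(x, y)

# The same computation in TensorFlow 2: it also executes immediately.
x = tf.random.normal([10, 10])
y = tf.random.normal([10, 10])
z = tf.matmul(x, y)
print(z.numpy())  # convert the tf.Tensor result back to a NumPy array

# Fit g(x, w) = w0*x^2 + w1*x + w2 to samples of f(x) = 5x^2 + 3 with SGD.
w = tf.Variable(tf.random.normal([3, 1]))
optimizer = tf.optimizers.SGD(learning_rate=0.1)

@tf.function  # compile the Python function into an optimized TensorFlow graph
def train_step():
    x = tf.random.normal([100, 1])
    f = 5.0 * tf.square(x) + 3.0  # samples from the target curve
    with tf.GradientTape() as tape:
        # Polynomial features [x^2, x, 1] multiplied by the latent parameters w.
        g = tf.matmul(tf.concat([tf.square(x), x, tf.ones_like(x)], axis=1), w)
        loss = tf.reduce_mean(tf.square(f - g))  # L(w) = mean (f(x) - g(x, w))^2
    grads = tape.gradient(loss, [w])
    optimizer.apply_gradients(zip(grads, [w]))
    return loss

for _ in range(1000):
    train_step()

print(w.numpy())  # should be close to [[5.], [0.], [3.]]
```

Removing the @tf.function decorator runs the same loop eagerly, which is the imperative mode described above; keeping it lets TensorFlow trace and optimize the step as a graph.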
We will get to that later. This is just the tip of the iceberg of what TensorFlow can do. Many problems, such as optimizing large neural networks with millions of parameters, can be implemented efficiently in TensorFlow in just a few lines of code. TensorFlow takes care of scaling across multiple devices and threads, and supports a variety of platforms.

### Broadcasting the good and the ugly

TensorFlow supports broadcasting elementwise operations. Normally when you want to perform operations like addition and multiplication, you need to make sure that the shapes of the operands match; e.g. you can't add a tensor of shape [3, 2] to a tensor of shape [3, 4]. But there's a special case, and that's when you have a singular dimension. TensorFlow implicitly tiles the tensor across its singular dimensions to match the shape of the other operand, so it's valid to add a tensor of shape [3, 2] to a tensor of shape [3, 1].

Broadcasting allows us to perform implicit tiling, which makes the code shorter and more memory efficient, since we don't need to store the result of the tiling operation. One neat place this can be used is when combining features of varying length. To concatenate features of varying length, we commonly tile the input tensors, concatenate the result, and apply some nonlinearity. This is a common pattern across a variety of neural network architectures.

But this can be done more efficiently with broadcasting. We use the fact that f(m(x + y)) is equal to f(mx + my). So we can do the linear operations separately and use broadcasting to do implicit concatenation (see the sketch after this section). In fact this trick is pretty general and can be applied to tensors of arbitrary shape as long as broadcasting between the tensors is possible.

So far we have discussed the good part of broadcasting. But what's the ugly part, you may ask? Implicit assumptions almost always make debugging harder. Consider summing the elementwise addition of a tensor of shape [2, 1] and a tensor of shape [2] (see the second part of the sketch below). What do you think the value of c would be after evaluation? If you guessed 6, that's wrong; it's going to be 12. This is because when the ranks of two tensors don't match, TensorFlow automatically expands the lower-rank tensor with a leading dimension before the elementwise operation, so the result of the addition would be [[2, 3], [3, 4]], and reducing over all elements gives us 12. The way to avoid this problem is to be as explicit as possible: had we specified which dimension we wanted to reduce across, catching this bug would have been much easier.
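The code for both broadcasting patterns is missing from the excerpt, so here is a minimal sketch. The feature shapes, the layer width of 10, and the use of tf.keras.layers.Dense are illustrative assumptions; the values of a and b in the last part are chosen to be consistent with the [[2, 3], [3, 4]] result described in the text.

```python
import tensorflow as tf

# Illustrative shapes: a has 3 "steps" of width 5, b is one width-6 vector per example.
a = tf.random.uniform([5, 3, 5])
b = tf.random.uniform([5, 1, 6])

# Explicit tiling + concatenation + nonlinearity.
tiled_b = tf.tile(b, [1, 3, 1])                          # [5, 3, 6]
c = tf.concat([a, tiled_b], axis=2)                      # [5, 3, 11]
d = tf.keras.layers.Dense(10, activation=tf.nn.relu)(c)  # [5, 3, 10]

# Same family of functions via broadcasting: the linear map distributes over the
# concatenation, so apply the linear ops separately and let broadcasting do the tiling.
pa = tf.keras.layers.Dense(10, activation=None)(a)       # [5, 3, 10]
pb = tf.keras.layers.Dense(10, activation=None)(b)       # [5, 1, 10]
d = tf.nn.relu(pa + pb)                                  # pb is broadcast across the 3 steps

# The ugly part: implicit rank expansion.
a = tf.constant([[1.], [2.]])     # shape [2, 1]
b = tf.constant([1., 2.])         # shape [2]
c = tf.reduce_sum(a + b)          # a + b == [[2., 3.], [3., 4.]], so c == 12, not 6

# Being explicit about the reduction axis makes the mismatch easier to spot:
c = tf.reduce_sum(a + b, axis=0)  # [5., 7.] -- the shape [2] hints something is off
```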