Fourier Transforms FTW:

Joseph Fourier was a fairly prolific mathematician and scientist. He get credit for figuring out that our atmosphere helps keep in heat somehow (the greenhouse effect). But, for us, it’s this gem below that we are interested in.

Fourier Transform, standard form.

This is the Fourier Transform. To be fair, that GIF of a dancing dog uses different technology that doesn’t depend on Fourier’s discovery. But, that JPEG you took of your cat does. And that MPEG of that cute baby video. Oh, and that MP3 or AAC file of your favorite songs as well.

If we used a naïve way of storing picture information, we likely store 24 bits of information for each dot on a picture (or, if we are really fancy, 32 bits). For good quality pictures, that adds up. Quickly. To be fair, this naïve way is still used all the time. So called RAW images from a good camera basically do exactly that (they use some of those same tricks that GIF uses to make it smaller). Professional photographers uses gigabytes of storage for photo shoots. Making movies digitally? Terabytes a day. Thankfully, flash is cheap. And hard drives are too, by movie standards. But moving that stuff much around is still a problem. If that lolcat took up megabytes of space, the internet would be crushed under the weight of the laughs. Video? Forget it. What is needed is a trick to reduce the size of the image or images.

And Fourier Transforms hold the key. Basically, Fourier figured out that you can take any function and represent it as a infinite series of sine waves. This helped with some interesting math problems, but it turned out to be super useful in the age of electronics, as creating and adding up sine waves is something circuits are really good at doing. Any radio transmission is essentially just sine waves in the air. So, we can thank Fourier for WiFi  (and a ground breaking actress as well of course. More later).

The trick is that at a certain point, adding more sine waves doesn’t make an noticeable difference to the signal. So, you just store the parameters for those waves, and store those instead. Sum them back up and you get the approximation of the signal.

As it turns out, you can treat a picture as if it was a mathematical function, apply these techniques and bam, pictures are 1/10 the size they would be. If you can tolerate some loss of quality, you get them 1/20 of the size. So, pictures can now be kilobytes big.

You can do the same with music and movies. As it turns out, not everything is amendable to Fourier transformation, but stunningly, a mathematical discovery 300+ years old drives the web of pictures, movies and music. And lest you despair on the trivia of all, the same technology is at the heart of MRIs and CAT scans.

Posted in Uncategorized | Leave a comment

Turing’s Legacy: Cat Pictures?

Okay, we started out in the 1940s, with a set of mathematical concepts that led to the foundations of computer science. We moved from the transistor and bit to computers with devices. We noted how bits can represent information (of any sort), how CPUs are made to calculate and perform on those bits, and how that CPU uses things like memory to load and store those bites, and device to get new bits and present the results of calculation to users via a display, a printout, or by uploading new information to another computer.

Basically, the modern computer and internet is just a way of shoving bits around. Lots of them. And as Turing and others discovered, once you have bits and machines to calculate with them, you can manipulate information itself.

Thanks to the internet, I was exposed to this quote (paraphrased below):

What would somebody from the 80s be truly surprised by?

That I have a device smaller than a Walkman that allows me to access the largest library of information in the world. And I use it mainly to look at pictures of cats.

One could look at this and despair. But, thankfully, the modern computing revolution drives a lot of good things, and not just lolcats. And it is in the nature of modern computing that information is just information. For a computer, the set of bits that is used to research disease are no different than the bits that make up this picture: Image

It is us humans that determine the value of the bits we see. And here, those sneaky mathematicians managed to make an impact without even knowing it. Next time, let’s see how a French mathematician born in 1768 set the foundation for the modern lolcat. Oh, we will tip our hand to another mathematician later down the way.

Posted in Uncategorized | Leave a comment

Ins and Outs: The IO Subsystem

What people think of a computer is really just the borders of a complex machine. The keyboard, mouse, display, printers, network, and so on are on the edges of the machine. They are the gateway for getting and display information.

To handle the amazing number of devices that can be attached to a computer, computers come with an input output subsystem. If you’ve ever looked at a computer specification, you’ll see things like PCI-Express, SATA, USB, FireWire, Thunderbolt and more. Well, this is just a dizzying array of standardized ways of getting bits from devices into the computers memory hierarchy.

And these standard means of moving bits are complex. The USB (Universal Serial Bus) standard is thousands of pages long, covering everything from the physical layout of the connector (which you always seen to never get right on the first try) to how USB devices send and receive information (the protocol).

Each protocol has its own advantages and disadvantages, and newer, better ones are coming out all the time. But at the end, it’s all about moving bits from one place to another. Move bits to the network device, and you uploaded a video to YouTube. Send bits to your printer device, and that photo you liked is printed.

And the devices that are attached to computers are staggering. The engine of your car. The power grid. Modern manufacturing robotics. Cash registers at your store, the card reader you use to pay with, the scanners that are used to track inventory. The list goes on and on.

Every time you press a key on the keyboard, an complex set of interactions occurs just to get that character to that web page or Word document.

In a real sense, there is no one computer. Any useful computer is a collection of parts, all of them with their own computing power. An affordable inkjet printer has and uses more computing power that they use than full blown computers of twenty years ago.

The complexity of computing is made affordable by scale. The cost for developing devices is large, and it is only by selling large amounts of devices does computing even make sense. Your laptop or desktop at home is the product of billions of dollars of yearly research and development costs. Also, the modern internet runs on truly staggering infrastructure, with its own billion dollar figures.

If you’ve ever plugged in a device only to find that “Device Driver not Found” message, maybe it will help to think about the large amount of devices that do exist. Modern operating systems ship with thousands of special programs (drivers) that just manage the communication between the computer and the device. It’s a complex part of any operating system.

It is hard to express to a layman just how complex a system it is. To get an idea, let’s look through the lens of a common internet trope. The lolcat.

Posted in Uncategorized | Leave a comment

Memory and the Memory Managment Unit

A critical part of every computer is memory. If you go to buy a PC, you will often see how much main memory (4 to 8 Gigabytes is the norm) and hard drive space the machine has.  Main memory is implemented using Dynamic Random Access Memory. This is a set of chips on a small board that plugs into the computer. A hard drive is a larger device that plugs in using a standard interface.

These components form the bottom of what is called the memory hierarchy of a machine. In general, this hierarchy goes from fastest access (CPU registers) to slowest (Hard Drive). But, as the memory gets slower, it also can get bigger. A register can hold 128 bytes of information (at the most). Level 1 Caches hold about 32 kilobytes of information. Level 2 caches are bigger, holding two to eight megabytes of information. Main memory is from four gigabytes to sixteen gigabytes for most machines. And hard drives store one to three terabytes.

But, at each step, speed of access goes down. Accessing information on a hard drive is at least a million times slower than L1 cache. A hard drive is tens of thousands times slower than main memory, and main memory is tens of thousands times slower than cache.

One of the main functions of any operating system is to hide this latency (the time it takes to access information) as best it can. But, if you’ve ever booted up your machine, started to work only to hear the hard drive spin and wait, you can see the vast differences in speed that loading information from a hard drive can take.

At the heart of managing memory is the Memory Management Unit. This part of every CPU is responsible for moving information to and from the CPU into main memory. Before information can be used, it must be available in the main memory. To get information from a hard drive, a different part of system is used (this is called the IO subsystem. More on this later).

The tricky part about the MMU is that it gives every program that runs on a computer the illusion of having all the available memory for it’s use. In fact, it provides the illusion of more memory than a machine might have. To understand this, we have to understand the notion of bit size in architecture.

You may have heard about a CPU being 64 bits. Most desktop and laptop processors are 64 bit now, and many smartphone processors are now as well. Fundamentally, the bit size has a lot of impact on the CPU design, but what is useful for us is the impact on what is termed virtual memory.

Virtual memory is the illusion that a MMU provides to a program. For most programs, the operating system gives the program the illusion that it has two gigabytes (sometimes three) of main memory to use. This is because most programs are still targeted to the older 32 bit model for compatibility. 64 bit programs have the illusion that they have one terabyte of main memory. This is way more than any computer can actually provide. The amount of memory a program thinks it has is called the virtual address space.

Basically, all memory is addressable. An address is merely a set of bit (32 bits or 64 bits depending on the program) that points to where a set of information is. The MMU takes these virtual addresses and maps them to the physical memory. As needed, the MMU in cooperation with the operating system moves information to and from main memory to the hard drive. This operation is called paging. Also, the MMU moves information to and from the main memory to the CPU caches. This operation is called caching.

When a program is being run, it is moving millions of bytes of information back and forth from the CPU to memory as billions of instructions are being run. The instructions and data for every program running on a machine need to be loaded into memory. A special program, called the operating system, is responsible for the management and running of other programs and their memory (along with many other tasks). Computer have special hardware (the BIOS) that find and load the operating system from the hard drive.

The operating system uses a complex set of rules to determine what to page in and out of memory. When these rules don’t match how the computer is being used, the computer can slow down considerably. When a computer is not responding, it is often because it is performing a large set of paging operations. In fact, this often referred to as paging. For example, “My desktop is running really slow, it keeps paging all the time”.

Posted in Uncategorized | Leave a comment

Time Flies…

It’s easy to start a blog, not so easy to keep it up. Life changes, and circumstances get away from you, and before you know it, six months fly by, then eight. Now, it’s almost a year.

But, what is worth starting is worth restarting. So, we will see if this sticks for now.

Posted in Uncategorized | Leave a comment

CPU Building Blocks: The Arithmetic and Logic Unit

Well, it’s been a bit of a summer break. Almost two weeks, 12 days in fact. Your author has started a new position. But, for the few of you reading, I shall press on.

Computers are great at certain calculations. Want to know how many days there are between June 12, 1973 and August 27, 1975? A simple computer program will do it in no time flat.

As we now, bits are the fundamental building blocks of information. And we can use bits to represent numbers. The Arithmetic and Logic Unit (ALU) part of a CPU is what performs those basic operations that are required for computers to do their work.

The arithmetic part of the ALU does basic mathematical operations. Initially, ALU could only perform the most basic of operations on integers. Now, most ALUs can do numerous operations on integers well as real numbers (also called floating point numbers). An example of a real number is .232343×10^-4. These kind of numbers are used in all kinds of scientific calculations as well as complex games. In older computers, operations on floating point numbers required a completely separate chip.

The logic part of the ALU allows one to treat a number as its underlying bits, and perform various operations to create a new number from those bits. These are called bitwise operations. Many of these are critical to basic operations in graphics and text processing. Also, they can compare two integers and see if they are the same or different. This is an operation that is performed very often.

The ALU is a critical part of a CPU. The operations it enables are what all programs use to do their work. Again, the CPUs ability to execute billions of these instructions in a second is what enables complex programs to work.

Let’s go to a simple example. One things computers do a lot is compare two words to see if they are equal. If you wanted to search a file, you’d have to be able to do this simple operation.

Firstly, we have to figure out how to map letters into numbers (and therefore bits). Thankfully, we don’t have to do this work, it’s been done for us. One common representation is ASCII, which uses the numbers 0-255 to represent the common characters used in early computing. A more modern version is called Unicode, which uses numbers 0-65355 to represent all known glyphs in all languages (with room for more).

One common way to represent a word (or a collection of letters) is to have a series of numbers that end with a zero. For example, the word hello can be represented as the following:

104 101 108 108 111 0

ASCII tells us that 104 is a lower case h, 111 is a lower case o, and so on. So, how do we tell if two strings are the same? Well, we just have to take each number in the sequence, and if they are equal, move to the next number. Repeat until both numbers are zero, and we know that the strings are equal. If the numbers are different, then the strings are not the same. The ALU provides the comparison operations.

The next building block we will discuss is the Memory Management Unit. The MMU is responsible for interfacing the CPU to the main memory of the computer.

Posted in Uncategorized | 1 Comment

Clarifications on Turing Completeness

I have noted that some of you have been referred to this blog when performing a search on Turing Completeness. Now this blog is intended for general audiences, but this is worth a bit of a diversion.

Turing Completeness is a property of most programming languages. However, it means very little in terms of how easy using a language is or how expressive it is. Instead, it means that all programming languages essentially perform the same tasks; in other words, they are equally powerful.

In fact, it is easier to think of languages that are not complete. Basic finite automations are not complete, nor are context free grammars. When you study theoretical computer science, these abstractions are used to build up to the concept of the Turing machine by showing what they can’t compute.

An example of a computation language that does useful work but is not Turing complete are regular expressions. They do perform a set of calculations. For example, a regular expression can determine if a string is a valid phone number, but you can’t use regular expressions to perform general computing.

Basically, the key point of Turing completeness is that all the computing languages we use, from assembly to JavaScript conceptually do the same thing, and the theory of computing applies to them all.

Posted in Uncategorized | Leave a comment