Where is memory stored in a computer?

In-memory computing is about two things: making computing faster and scaling it to potentially support petabytes of in-memory data. In-memory computing leverages two key technologies: random-access memory (RAM) storage and parallelization. 

Speed: RAM Storage

The first key is that in-memory computing takes the data from your disk drives and moves it into RAM. The hard drive is by far the slowest part of your server. A typical hard drive is literally a spinning disk, like an old-fashioned turntable. It has many moving parts: the platter spins inside a sealed enclosure while an arm, like the arm of a turntable, physically scans across the disk to read your data. In addition, moving data from your disk to RAM for processing is time consuming, which adds more delay to the speed at which you can process data. Meanwhile, RAM is the second-fastest component in your server. Only the processor is faster.

With RAM, there are no moving parts. Memory is just a chip. In physical terms, an electrical signal reads the information stored in RAM, and it works at the speed of electricity, which is a substantial fraction of the speed of light. When you move data from a disk to RAM storage, access to that data becomes anywhere from five thousand to a million times faster.
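
To make those numbers concrete, here is a back-of-the-envelope check in Python. The latency figures are common ballpark values (roughly 100 nanoseconds for a RAM access, roughly 10 milliseconds for a disk seek), assumed for illustration rather than measured on any particular machine:

    # Illustrative ballpark latencies (assumptions, not measurements):
    ram_access_s = 100e-9   # ~100 nanoseconds per RAM access
    disk_seek_s = 10e-3     # ~10 milliseconds per disk seek

    speedup = disk_seek_s / ram_access_s
    print(f"RAM is roughly {speedup:,.0f}x faster than a disk seek")  # ~100,000x

A ratio of about 100,000 sits comfortably inside the five-thousand-to-a-million range quoted above.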

The human mind has a hard time grasping that kind of speed. We are talking about nanoseconds, microseconds, and milliseconds. A good analogy is that traditional computing is like a banana slug crawling through your garden at 0.007 miles per hour while in-memory computing is like an F-18 fighter jet traveling at 1,190 miles per hour, roughly twice the speed of sound at altitude. In other words, disk drives are really, really slow. And when you copy all of your data from disk and put it into RAM, computing becomes really, really fast.

You can look at it like a chef in a restaurant. The chef needs ingredients to cook his meals: that's your data. The ingredients might be in the chef's refrigerator or they might be ten miles down the road at the grocery store. The refrigerator is like RAM storage: The chef can instantly access the ingredients he needs. When he's done with the ingredients and the meal is finished, he puts the leftovers back in the refrigerator, all at the same time. The grocery store is like disk storage. The chef has to drive to the store to get the ingredients he needs. Worse, he has to pick them up one at a time. If he needs cheese, garlic, and pasta, he has to make one trip to the grocery store for the cheese, bring it back, and use it. Then he has to go through the whole process again for the garlic and the pasta. If that isn't enough, he has to drive the leftover ingredients back to the grocery store again, one by one, right after he's done using each of them.

But that's not all. Suppose you could make a disk drive that approached the speed of RAM, as flash drives do. The path that traditional computing uses to look for information on a disk (processor to RAM to controller to disk) would still make it much slower than in-memory computing.

To return to our example, let's say there are two chefs: one representing in-memory computing and the other traditional computing. The chef representing in-memory computing has his refrigerator right next to him and he also knows exactly where everything is on the shelves. Meanwhile, the chef representing traditional computing doesn't know where any of the ingredients are in the grocery store. He has to walk down all of the aisles until he finds the cheese. Then he has to walk down the same aisles again for the garlic, then the pasta, and so on. That's the difference in efficiency between RAM and disk storage.

RAM versus Flash

Flash storage was created to replace the disk drive. When it's used for that purpose, it is called a solid-state drive, or SSD. SSDs are made of silicon and are five to ten times faster than disk drives. However, both flash storage and disk drives are attached to the same controller in your computer. Even when you use flash, you still go through the same process of reading and writing as with a disk: the processor goes to RAM, RAM goes to the controller, and the controller retrieves the information from the drive.
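
One rough way to feel that path cost is a micro-benchmark like the sketch below. The file name and size are made up for illustration, and on a real machine the operating system's file cache will blur the results, so treat the output as directional rather than precise:

    import os
    import time

    PATH = "scratch.bin"             # hypothetical scratch file
    SIZE = 64 * 1024 * 1024          # 64 MB of random data

    with open(PATH, "wb") as f:      # put the data on disk first
        f.write(os.urandom(SIZE))

    start = time.perf_counter()
    with open(PATH, "rb") as f:      # processor -> RAM -> controller -> disk
        data = f.read()
    disk_s = time.perf_counter() - start

    start = time.perf_counter()
    copy = bytes(data)               # RAM-to-RAM copy: no controller in the path
    ram_s = time.perf_counter() - start

    print(f"disk read: {disk_s:.4f}s, ram copy: {ram_s:.4f}s")
    os.remove(PATH)                  # clean up the scratch file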

Flash accesses the information faster than disk, but it still uses the same slow process to get the data to the processor. Moreover, because of an inherent limitation in flash's physical design, it supports only a finite number of writes before it needs to be replaced. Modern RAM, on the other hand, has effectively unlimited write endurance and takes up less space than flash. Flash may be five to ten times faster than a standard disk drive, but RAM is up to a million times faster than the disk. Combined with the other benefits, there's no comparison.

Scale: Parallelization

RAM handles the speed of in-memory computing, but the scalability of the technology comes from parallelization. Parallelization came about in the early 2000s to solve a different problem: the inadequacy of 32-bit processors. By 2012, most servers had switched to 64-bit processors, which can handle far more data. But in 2003, 32-bit processors were common, and they were very limited: they couldn't address more than four gigabytes of RAM at a time. Even if you put more RAM in the computer, a 32-bit processor couldn't see it. But the demand for more RAM storage kept growing anyway.
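
That four-gigabyte ceiling falls straight out of the arithmetic of a 32-bit address: 32 bits can name at most 2^32 distinct bytes. A quick check:

    # A 32-bit processor can address at most 2**32 distinct bytes:
    print(2**32)          # 4294967296 bytes
    print(2**32 / 2**30)  # 4.0 gigabytes -- the ceiling described above

    # A 64-bit address raises that ceiling astronomically:
    print(2**64 / 2**60)  # 16.0 exbibytes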

The solution was to spread data in RAM across a lot of different computers. Once the data was broken down like this, a processor could address its share. The cluster of computers looked like one application running on one computer with lots of RAM. You split up the data and the tasks, you use the collective RAM for storage, and you use all the computers for processing. That was how you handled a heavy load in the 32-bit world, and it was called parallelization, or massively parallel processing (MPP).

When 64-bit processors were released, they could address a practically unlimited amount of RAM, and parallelization was no longer necessary for its original purpose. But in-memory computing found a different way to take advantage of it: scalability.

Even though 64-bit processors could handle far more data, it was still impossible for a single computer to support a billion users. But when you distributed the processing load across many computers, that kind of support became possible. Better still, if the number of users increased, all you had to do was add a few more computers to grow with them.

Picture a row of six computers. You could have thousands of computers, but we'll use six for this example. These computers are connected through a network, so we call them a cluster. Now imagine you have an application that will draw a lot of traffic: too much for one computer to store and serve all of the data. With parallelization, you take your application and break its data into pieces. Then you put one piece in computer 1, another piece in computer 2, and so on until the data is distributed optimally across the cluster. Your single application runs on the whole cluster. When the cluster gets a request for data, it knows where that data lives and processes it in RAM right there. The data doesn't move around the way it does in traditional computing.
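
Here is a minimal sketch of that routing idea in Python. The node names and the hash-and-modulo scheme are illustrative assumptions, not the algorithm of any particular product:

    import hashlib

    NODES = [f"computer-{i}" for i in range(1, 7)]  # the six-computer cluster

    def node_for(key: str) -> str:
        # Hash the key so records spread evenly, then map the hash to a node.
        digest = hashlib.sha256(key.encode()).digest()
        return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

    # The same key always routes to the same computer, so each record
    # stays put in one node's RAM instead of moving around.
    for user in ["alice", "bob", "carol"]:
        print(user, "->", node_for(user))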

Even better, you can replicate specific parts of your data on different computers in the same cluster. In our example, let's say the data on computer 6 is in high demand. You can add another computer to the cluster that carries the same data. That way, not only can you handle things faster, but if computer 6 goes down, the extra one just takes over and carries on as usual.
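
Continuing the sketch above, replication can be as simple as keeping a second copy of each key on a neighboring node and falling back to it when the primary is down. The next-node-in-the-ring rule is again an illustrative assumption:

    def nodes_for(key: str) -> list[str]:
        # Primary node plus a replica on the next node in the ring.
        digest = hashlib.sha256(key.encode()).digest()
        primary = int.from_bytes(digest[:8], "big") % len(NODES)
        return [NODES[primary], NODES[(primary + 1) % len(NODES)]]

    DOWN = {"computer-6"}  # pretend computer 6 has failed

    def route(key: str) -> str:
        # Serve from the first replica that is still alive.
        for node in nodes_for(key):
            if node not in DOWN:
                return node
        raise RuntimeError("all replicas are down")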

If you tried to scale up like this with a single computer, it would get more and more expensive, and at the end of the day it would still slow you down. With parallelization, in-memory computing lets you scale out with demand, near-linearly and with no practical limit.

Let's return to the chef analogy, where a computer processor is a chef and memory storage is the chef's stove. A customer comes in and orders an appetizer. The chef cooks the appetizer on his one stove right away and the customer is happy.

Now what happens when 20 customers order appetizers? The one chef with his one stove can't handle it. That 20th customer is going to wait three hours to get her appetizer. The solution is to bring in more chefs with more stoves, all of them trained to cook the appetizer the same way. The more customers you get, the more chefs and stoves you bring into the picture so that no one has to wait. And if one stove breaks, it's no big deal: plenty of other stoves in the kitchen can take its place.

The Internet has created a level of scale that would have been unheard of just 15 or 20 years ago. Parallelization gives in-memory computing the power to scale to fit the world.

Memory is the electronic holding place for the instructions and data a computer needs to access quickly. It's where information is stored for immediate use. Memory is one of the most basic components of a computer; without it, a computer could not function properly. Memory is also used by a computer's operating system, hardware and software.

There are technically two types of computer memory: primary and secondary. The term memory is used as a synonym for primary memory or as an abbreviation for a specific type of primary memory called random access memory (RAM). This type of memory is located on microchips that are physically close to a computer's microprocessor.

If a computer's central processing unit (CPU) could only use a secondary storage device, computers would be much slower. In general, the more memory (primary memory) a computing device has, the less frequently it must fetch instructions and data from slower (secondary) forms of storage.

[Image: how primary, secondary and cache memory relate to each other in terms of size and speed.]

Memory vs. storage

Memory and storage are easily conflated, but there are distinct and important differences. Put succinctly, memory is primary memory, while storage is secondary memory. Memory refers to the location of short-term data, while storage refers to the location of data kept on a long-term basis.

Memory is most often referred to as the primary storage on a computer, such as RAM, and it is also where information is processed. It holds data only for a short time because primary memory is volatile, meaning the contents are not retained when the computer is turned off.

The term storage refers to secondary memory and is where a computer's data is kept long term. An example of storage is a hard disk drive (HDD). Storage is nonvolatile, meaning the information is still there after the computer is turned off and back on. A running program may sit in a computer's primary memory while in use -- for fast retrieval of information -- but when that program is closed, it resides in secondary memory, or storage.

How much space is available in memory and storage differs as well. In general, a computer has far more storage space than memory. For example, a laptop may have 8 GB of RAM and 250 GB of storage. The difference exists because a computer does not need fast access to all of its stored information at once, so about 8 GB of fast memory for running programs suffices.

The terms memory and storage can be confusing because their usage today is not always consistent. For example, RAM can be referred to as primary storage -- and types of secondary storage can include flash memory. To avoid confusion, it can be easier to talk about memory in terms of whether it is volatile or nonvolatile -- and storage in terms of whether it is primary or secondary.

How does computer memory work?

When a program is opened, it is loaded from secondary memory into primary memory. Because there are different types of memory and storage, an example of this would be a program being moved from a solid-state drive (SSD) to RAM. Because primary memory is faster to access, the opened program can exchange data with the computer's processor at much higher speeds. Once loaded, its instructions and data can be fetched from primary memory immediately, without another trip to storage.

Memory is volatile, which means that data in memory is stored only temporarily. Once a computing device is turned off, data stored in volatile memory is lost. When a file is saved, it is sent to secondary memory for storage.
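
A tiny sketch of that load-edit-save cycle, where a hypothetical file called notes.txt stands in for secondary storage:

    # Save: the data goes to secondary storage and survives power-off.
    with open("notes.txt", "w") as f:
        f.write("first draft")

    # Load: the file's contents are copied into primary memory (RAM).
    with open("notes.txt") as f:
        text = f.read()

    text += " -- revised"         # this edit lives only in volatile RAM...

    with open("notes.txt", "w") as f:
        f.write(text)             # ...until it is saved back to storage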

There are multiple types of memory available to a computer, and it will operate differently depending on the type of primary memory used. In general, though, semiconductor-based memory is what the term most often describes. Semiconductor memory is made of integrated circuits built from silicon-based metal-oxide-semiconductor (MOS) transistors.

Types of computer memory

In general, memory can be divided into primary and secondary memory; moreover, there are numerous types of memory when discussing just primary memory. Some types of primary memory include the following:

  • Cache memory. This temporary storage area, known as a cache, is more readily available to the processor than the computer's main memory. It is also called CPU memory because it is typically integrated directly into the CPU chip or placed on a separate chip with a bus interconnect to the CPU.
  • RAM. The term is based on the fact that any storage location can be accessed directly by the processor.
  • Dynamic RAM. DRAM is a type of semiconductor memory that typically holds the data or program code a computer processor needs in order to function.
  • Static RAM. SRAM retains data bits in its memory for as long as power is supplied to it. Unlike DRAM, which stores bits in cells consisting of a capacitor and a transistor, SRAM does not have to be periodically refreshed.
  • Double Data Rate SDRAM. DDR SDRAM is SDRAM that transfers data on both the rising and falling edges of the clock signal, which can theoretically raise effective memory speeds to at least 200 MHz.
  • Double Data Rate 4 Synchronous Dynamic RAM. DDR4 RAM is a type of DRAM with a high-bandwidth interface and the successor to DDR2 and DDR3. DDR4 RAM allows for lower voltage requirements and higher module density, offers higher data transfer rates, and supports dual in-line memory modules (DIMMs) of up to 64 GB.
  • Rambus Dynamic RAM. DRDRAM is a memory subsystem that promised to transfer up to 1.6 billion bytes per second. The subsystem consists of RAM, the RAM controller, the bus that connects RAM to the microprocessor and devices in the computer that use it.
  • Read-only memory. ROM is a type of computer storage containing nonvolatile, permanent data that, normally, can only be read and not written to. ROM contains the programming that enables a computer to start up or regenerate each time it is turned on.
  • Programmable ROM. PROM is ROM that can be modified once by a user. It enables a user to tailor a microcode program using a special machine called a PROM programmer.
  • Erasable PROM. EPROM is PROM that can be erased and reused. Erasure is performed by shining an intense ultraviolet light through a window designed into the memory chip.
  • Electrically erasable PROM. EEPROM is a user-modifiable ROM that can be erased and reprogrammed repeatedly through the application of higher than normal electrical voltage. Unlike EPROM chips, EEPROMs do not need to be removed from the computer to be modified. However, an EEPROM chip must be erased and reprogrammed in its entirety, not selectively.
  • Virtual memory. A memory management technique in which secondary memory can be used as if it were part of main memory. Virtual memory uses hardware and software to let a computer compensate for physical memory shortages by temporarily transferring data from RAM to disk storage; a sketch of the idea follows this list.
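
As a small taste of that last idea, most operating systems expose memory-mapped files, which let a program treat bytes on disk as if they were bytes in RAM; the same basic trick underlies virtual memory's use of a swap or page file. A minimal sketch using Python's standard mmap module, with a hypothetical backing file:

    import mmap

    with open("backing.bin", "wb") as f:   # hypothetical backing file
        f.write(b"\x00" * 4096)            # one 4 KB page of zeros

    with open("backing.bin", "r+b") as f:
        mem = mmap.mmap(f.fileno(), 4096)  # map the file into the address space
        mem[0:5] = b"hello"                # write through memory, not write()
        print(bytes(mem[0:5]))             # b'hello'
        mem.close()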

Timeline of the history and evolution of computer memory

In the early 1940s, memory capacities were limited to a few bytes. One of the more significant advances of this period was acoustic delay line memory, which stored bits as sound waves traveling through mercury, with quartz crystals acting as transducers to read and write them. This approach could store a few hundred thousand bits. In the late 1940s, research into nonvolatile memory began, and magnetic-core memory -- which enabled the recall of data after a loss of power -- was created. By the 1950s, this technology had been improved and commercialized, and PROM followed in 1956. Magnetic-core memory became so widespread that it remained the main form of memory until semiconductor memory displaced it in the early 1970s.

The metal-oxide-semiconductor field-effect transistor (MOSFET) was invented in 1959, making it possible to use MOS transistors as the storage elements of memory cells. MOS memory was cheaper and needed less power than magnetic-core memory. Bipolar memory, which used bipolar transistors, came into use in the early 1960s.

In 1961, Bob Norman proposed the concept of solid-state memory on an integrated circuit (IC) chip, and IBM brought memory into the mainstream in 1965. However, users found solid-state memory too expensive at the time compared with other memory types. Other advances of the early to mid-1960s included the invention of bipolar SRAM, Toshiba's introduction of DRAM in 1965 and the commercial use of SRAM the same year. The single-transistor DRAM cell was developed in 1966, followed by a MOS semiconductor device used to create ROM in 1967. From 1968 into the early 1970s, N-type MOS (NMOS) memory also grew in popularity.

By the early 1970s, MOS-based memory had become much more widely used. In 1970, Intel released the first commercial DRAM IC chip, the Intel 1103. Erasable PROM (EPROM) was developed one year later, and EEPROM was invented in 1972.