The key difference between columns and rows is that a column arranges data vertically from top to bottom, while a row arranges data horizontally from left to right. Rows and columns are common concepts in computer science and data analytics. These terms are also used in common parlance to describe the arrangements of everyday things. The definition of a row is, “any set of objects or data that is aligned in a way that each one can be described as being to the left or right of another.” In contrast to a row, columns organize items in a top-to-bottom fashion. In contrast to a row where items are to the left or right of another, all items in a column are described as either above or below another. Here are the top five differences between rows and columns in terms of how they organize data and objects:
Sometimes a real-world example can drive a concept home better than a precise definition. Here are five examples of everyday objects people describe as being organized in a row:
Column examplesHere are five examples of how the term column is used in common parlance:
The concepts of rows and columns becomes especially powerful when the two concepts are combined. For example, the periodic table of elements uses both rows and columns to help scientists visualize the similarities between elements that sit on the same row, and the commonalities between elements that reside in the same column. Spreadsheet software, such as Microsoft Excel, Google Sheets, Apple Numbers and OpenOffice Calc, allow users to organize data into columns and rows. A common pattern in spreadsheet software is to put a different date in each column, and list all of the days of a week or a month from the top of the spreadsheet to the bottom. Then in each row expenses or sales are listed corresponding to the date of the row. Spreadsheet software is then capable of performing various addition and subtraction functions to provide more meaning to the data.
Relational databases also rely heavily on the idea of columns and rows. A combination of database columns and rows is knowns as a table. Each database table row has the same set of data fields. For example, each row of a person table would include a first name, last name, date of birth and email address. All of the fields in a table row are said to make up a complete “entity.” If you query every row in a table for a single field, such as the first name from the person table, the results return as a column of data.
Traditional databases, such as MySQL, PostgreSQL, DB2 and Oracle, were optimized for column-based access. This means data is accessed through a primary key, which is a unique value each row possesses. Relational databases are very efficient at column-based searches that extract unique fields of interest from a row. However, many social media websites have found that row-based access is more efficient than access based on a primary key column. This has led to an emergence of popular NoSQL databases such as MongoDB, Cassandra, HBase and Redis. While the concept of rows and columns is prolific in the world of math, data analytics, chemistry and computer science, the concept is very egalitarian and permeates our daily life as well. One can query rows and columns in a database, and deduce information from how elements are organized in columns and rows in a periodic table. At the same time, one can also admire a row of ducks as they swim by, and be humbled by the beautify of the Corinthian or Doric columns that hold up an ancient Greek building. 1Animated columns vs. rows differences and similarities Gif adapted from: Periodic table image: User:Double sharp, based on File:Simple Periodic Table Chart-en.svg by User:Offnfopt, CC BY-SA 4.0 via Wikimedia Commons Each object, i.e., a real-world individual of a class (for example, each customer who does business with our enterprise), is represented by a row of information in a database table. The row is defined in the relational model as a tuple that is constructed over a given scheme. Mathematically, the tuple is a function that assigns a constant value to each attribute of the scheme from its corresponding attribute domain. Notice that because the scheme is a set of attributes, we could show them in any order without changing the meaning of the data in the row (tuple). In formal notation, we could show the assignments explicitly, where the identifier t represents a tuple: tTJ = ‹first_name := 'Tom', last_name := 'Jewett', phone := '714-555-1212', street := '10200 Slater', zipcode := '92708'› In practice, when we create a table row in SQL, we are actually making the assignment of domain values to attributes, just as in the tuple definition. In SQL, you can omit the attribute names from the INSERT INTO statement, as long as you keep the comma-delimited list of values in exactly the same order as was used to create the table. This syntax is more prone to errors, so use sparingly. It is provided for completion. Removing the attribute names from the above statement, gives the SQL statement below. It is also possible to list only some of the attribute names, the ones we provide values for; in that case, the remaining attributes will be assigned NULL in such rows, assuming there is no constraint prohibiting NULLs in such columns. In the case of the customers table, it was created with NOT NULL constraint on the name and phone number columns, so non-NULL values for these must be inserted. On the other hand, there is no such constraint on the other columns, so NULL is allowed. The UPDATE statement is used to modify the value for one or more attributes in the specified rows of a table. When we change the data in a table row using SQL, we are also following the tuple definition of assigning domain values to attributes. The statement below will modify all rows in table customers that have a phone number of '714-555-1212' and change them to have the phone number specified in the SET clause. UPDATE customers SET phone = '714-555-2323' WHERE phone = '714-555-1212'; SQL statement to update a phone number in the table.The SQL DELETE statement is used to delete rows. When deleting rows, you can specify the rows to delete via an optional condition. The following figures provide two examples. The statement below will delete all rows in table customers that have a value of '90840' in the cZipCode column. DELETE FROM customers WHERE cZipCode = '90840'; SQL statement to delete rows from a table.The statement below will delete all rows in table customers; in this case, no criteria is specified in a WHERE clause, so all rows are deleted. Beware running such statements as you will lose all rows in the table. DELETE FROM customers; SQL statement to delete all rows from a table.A database table is simply a collection of zero or more rows. This follows from the relational model definition of a relation as a set of tuples over the same scheme. (The name “relational model” comes from the relation being the central object in this model.) The customers database table or relation. Other views of this diagram: Large image - Description (text)Knowing that the relation (table) is a set of tuples (rows) tells us more about this structure, as we saw with schemes and domains.
Insuring unique rowsSince each tuple in a relation must be unique, no two tuples can have exactly the same values for every one of their attributes, that is, there can be no duplicate tuples in a relation. Unfortunately, the same cannot be said about SQL tables. A SQL table is bag (i.e., a multiset) of rows, unless constraints are placed on the table to ensure there be no duplicate rows. Thus, to implement a relation as a SQL table, there must be some set of attributes in each relation whose values, taken together, guarantee uniqueness of each row. Any set of attributes that can do this is called a super key (SK). By the definition of a relation as a set of tuples, the set of all attributes must be a super key. If such a set were not a super key, it would allow two or more identical tuples in the relation which would violate the definition of a set. Since super keys are constraints on the data, they must be true for any relation (table) of a relation scheme, thus these super keys are shown in the relation scheme diagram. A super key is our first database constraint, we will learn more of them throughout. The set of all attributes in a relation scheme R is just one super key of that scheme, there can be and usually there are more. The other super keys are proper subsets of the relation scheme. Out of all these super keys, the database designer picks one to serve as the primary key (PK) of the relation. (Notice that the PK is a SK, but not all SKs are PKs, since only one is chosen as a PK!) The PK is sometimes also called a unique identifier for each row of the table. This is not an arbitrary choice—we’ll discuss it in detail on a later page. For our customers table, we’ll pick the set {first_name, last_name, phone}. We are likely to have at least two customers with the same first and last name, but it is very unlikely that they will both have the same phone number. In SQL, we specify the primary key of a table with a data constraint that lists the attributes that form the PK. We also give the constraint a name that is easy for us to remember later (as is done below using “customers_pk”). ALTER TABLE customers ADD CONSTRAINT customers_pk PRIMARY KEY (first_name, last_name, phone); SQL statement to alter a table by adding a PKSince every SQL table must have a PK in order for it to correctly represent a relation, it is best to avoid altering a table to add a PK constraint and instead specify such constraints at the time the table is created. In creating the customers table, you may have noticed the use of NOT NULL for some of the attributes. NULL is a special constant in database systems (as is the case in most programming languages) that means “this field doesn’t have any value assigned to it. ” It’s not the same as a zero length string or the number zero. Thus, NOT NULL is a database constraint that prevents rows from being inserted without a value for attributes with this constraint (similarly also prevents updates that would assign NULL to such attributes). The statement below recreates the table with the required PK constraint. If you’ve previously created the table, you can use the DROP TABLE statement to delete the table from the database, including of course all data that may be in the table. CREATE TABLE customers ( first_name VARCHAR(20) NOT NULL, last_name VARCHAR(20) NOT NULL, phone VARCHAR(20) NOT NULL, street VARCHAR(50), zipcode VARCHAR(5), CONSTRAINT customers_pk PRIMARY KEY (first_name, last_name, phone)); SQL statement to create Customers table, including its PK. |