Structured Query Language (SQL) is a standard computer language with a defined syntax and set of expressions for accessing and managing data in databases and other data processing technologies. The American National Standards Institute (ANSI) defines a standard for SQL. Most relational database management systems (RDBMSs) use that standard and have extended it, so SQL syntax differs slightly from one RDBMS to another. Query expressions in ArcGIS adhere to standard SQL expressions. The SQL syntax you use within an expression differs depending on the data source. Each data source has its own variant of SQL, referred to as an SQL dialect.
When using ArcGIS dialog boxes to construct a SQL expression, autocomplete is used to help you apply the correct syntax for the data source you're querying. As you type, a prompt appears, showing the field names, values, keywords, and operators supported by your data source. Review the following to help determine when ArcGIS SQL syntax is used or when the SQL syntax of the underlying RDBMS is used when creating an SQL expression.
Within ArcGIS Pro, the SQL expression dialog box is available in several locations.
A SQL expression contains a combination of one or more values, operators, and SQL functions that can be used to query or select a subset of features and table records within ArcGIS. All SQL queries are expressed using the keyword SELECT. SELECT * FROM forms the first part of the SQL expression and is automatically supplied for you on most ArcGIS dialog boxes; the asterisk (*) asks for all columns. For example, when you construct a query by writing SQL syntax, the SELECT statement that selects fields from a layer or table is supplied for you. The part of the SQL expression that comes after SELECT * FROM <Layer_name> is the WHERE clause. The WHERE clause is used to get records that meet specific criteria and is the part of the expression you must build. A basic SQL expression WHERE clause consists of a field name, an operator, and a value.
For example, STATE_NAME = 'Florida'. This expression contains a single clause and selects all features containing 'Florida' in the STATE_NAME field. Compound expressions join two or more such clauses with a logical operator.
For example, STATE_NAME = 'Florida' OR (STATE_NAME = 'South Carolina' AND POP2010 > 15000). This compound expression consists of multiple clauses connected by the logical operators AND and OR. It selects all features containing Florida in the STATE_NAME field, plus all features that contain South Carolina in the STATE_NAME field and also have a value greater than 15,000 in the POP2010 field. Optionally, parentheses () can be used to define the order of operations in compound expressions.
Because the SELECT * syntax is hard-coded, you always select whole rows and cannot restrict the SELECT to return only some of the columns in the corresponding table. For this reason, keywords such as DISTINCT, ORDER BY, and GROUP BY cannot be used in an SQL expression in ArcGIS except when using subqueries. To learn more, see Subqueries.
The following sections describe the elements of common SQL query expressions used in ArcGIS.
Strings must always be enclosed in single quotation marks in queries, for example: STATE_NAME = 'California'
Strings are case sensitive in expressions, except when run on geodatabases in Microsoft SQL Server. To make a case-insensitive search in other data sources, you can use an SQL function to convert all values to the same case. For file-based data sources such as file geodatabases or shapefiles, you can use the UPPER or LOWER function to set the case for a selection. For example, the following expression selects the state whose name is stored as 'Rhode Island' or 'RHODE ISLAND': UPPER(STATE_NAME) = 'RHODE ISLAND'
If the string contains a single quotation mark, you first need to use another single quotation mark as an escape character, for example: NAME = 'Alfie''s Trough'
Use the LIKE operator (instead of the = operator) to build a partial string search.
For example, this expression selects Mississippi and Missouri among United States state names: STATE_NAME LIKE 'Miss%'
The percent symbol (%) means that anything is acceptable in its place: one character, a hundred characters, or no character. Alternatively, if you want to search with a wildcard that represents one character, use an underscore (_). For example, this expression finds Catherine Smith and Katherine Smith: OWNER_NAME LIKE '_atherine Smith'
You can use greater than (>), less than (<), greater than or equal (>=), less than or equal (<=), not equal (<>), and BETWEEN operators to select string values based on sorting order. For example, this expression selects all the cities in a coverage with names starting with the letters M through Z: CITY_NAME >= 'M'
String functions can be used to format strings. For instance, the LEFT function returns a certain number of characters starting on the left of the string. In this example, the query returns all states starting with the letter A: LEFT(STATE_NAME,1) = 'A'
Refer to the documentation of your database management system (DBMS) for a list of supported functions.
You can use the NULL keyword to select features and records that have null values for the specified field. The NULL keyword is always preceded by IS or IS NOT. For example, to find cities whose 1996 population has not been entered, you can use the following: POPULATION IS NULL
Alternatively, to find cities whose 1996 population has been entered, you can use the following: POPULATION IS NOT NULL
The decimal point (.) is always used as the decimal delimiter, regardless of your locale or regional settings. The comma cannot be used as a decimal or thousands delimiter in an expression. You can query numbers using the equal (=), not equal (<>), greater than (>), less than (<), greater than or equal to (>=), less than or equal to (<=), and BETWEEN operators, for example: POPULATION >= 5000
Numeric functions can be used to format numbers.
For instance, the ROUND function rounds a number to a given number of decimals in a file geodatabase: ROUND(SQKM,0) = 500
Refer to your DBMS documentation for a list of supported numeric functions.
Geodatabase data sources store dates in a date-time field. However, shapefiles do not. Therefore, most of the query syntax listed below contains a reference to the time. In some cases, the time part of the query may be safely omitted if the field is known to contain only dates; in other cases, it needs to be stated, or the query will return a syntax error.
Searching date fields requires careful attention to the syntax required by your data source. If you build a date query in Clause mode of the Query Builder, the correct syntax will be automatically generated for you. Here is an example of a query that will return all records on or after January 1, 2011, for a file geodatabase data source: INCIDENT_DATE >= date '2011-01-01 00:00:00'
Dates are stored in the underlying database as a reference to December 30, 1899, at 00:00:00. This is valid for all the data sources listed here.
The purpose of this section is only to help you query dates, not time values. When a time that is not null is stored with the dates (for instance, January 12, 1999, 04:00:00), querying the date only will not return the record because when you pass only a date to a date-time field, it will fill the time with zeros and retrieve only the records where the time is 12:00:00 a.m.
The attribute table shows date and time in a user-friendly format, depending on your regional settings, rather than the underlying database's format. This is fine most of the time, but it also has a few drawbacks:
Keep in mind this will not return records where the time is not null. An alternative format for querying dates in Oracle follows: Datefield = TO_DATE('yyyy-mm-dd hh:mm:ss','YYYY-MM-DD HH24:MI:SS')
The second parameter 'YYYY-MM-DD HH24:MI:SS' describes the format used for querying. An actual query looks like this: Datefield = TO_DATE('2003-01-08 14:35:00','YYYY-MM-DD HH24:MI:SS')
You can use a shorter version: TO_DATE('2003-11-18','YYYY-MM-DD')
Again, this will not return records where the time is not null.
The hh:mm:ss part of the query can be omitted when the time is not set in the records. The following is an alternative format: Datefield = 'mm/dd/yyyy'
The hh:mm:ss part of the query cannot be omitted even if the time is equal to 00:00:00. You must specify the full time stamp when using equal-to queries or no records will be returned. You can successfully query with the following statements if the table you query contains date records with these exact time stamps (2007-05-29 00:00:00 or 2007-05-29 12:14:25):
select * from table where date = '2007-05-29 00:00:00';
or
select * from table where date = '2007-05-29 12:14:25';
If you use other operators, such as greater than, less than, greater than or equal to, or less than or equal to, you don't need to designate the time, but you can if you want to be that precise. Both of the following statements work:
select * from table where date < '2007-05-29';
select * from table where date < '2007-05-29 12:14:25';
File geodatabases support the use of a time in the date field, so this can be added to the expression: Datefield = date 'yyyy-mm-dd hh:mm:ss'
Shapefiles and coverages do not support the use of time in a date field. All SQL used by the file geodatabase is based on the SQL-92 standard.
Querying a date on the left part (first table) of a join only works with file-based data sources, such as file geodatabases, shapefiles, and DBF tables.
However, there is a possible workaround for working with data that is not file-based, such as enterprise data, as described below.
Querying a date on the left part of a join will be successful when using the limited version of SQL developed for file-based data sources. If you are not using such a data source, you can force the expression to use this format. This can be done by making sure the query expression involves fields from more than one join table. For example, if a feature class and a table (FC1 and Table1) are joined and are both from an enterprise geodatabase, the following expressions will fail or return no data:
FC1.date = date #01/12/2001#
FC1.date = date '01/12/2001'
To query successfully, you can create a query as follows: FC1.date = date '01/12/2001' and Table1.OBJECTID > 0
Since the query involves fields from both tables, the limited SQL version will be used. In this expression, Table1.OBJECTID is always > 0 for records that matched during join creation, so this expression is true for all rows that contain join matches.
To ensure that every record with FC1.date = date '01/12/2001' is selected, use the following query: FC1.date = date '01/12/2001' and (Table1.OBJECTID IS NOT NULL OR Table1.OBJECTID IS NULL)
This query will select all records with FC1.date = date '01/12/2001', whether or not there was a join match for each particular record.
Compound expressions can be built by combining expressions with the AND and OR operators.
For example, the following expression selects all the houses that have more than 1,500 square feet and a garage for three or more cars: AREA > 1500 AND GARAGE >= 3
When you use the OR operator, at least one of the two expressions separated by the OR operator must be true for the record to be selected, for example: RAINFALL < 20 OR SLOPE > 35
Use the NOT operator at the beginning of an expression to find features or records that don't match the specified expression, for example: NOT STATE_NAME = 'Colorado'
NOT expressions can be combined with AND and OR. For example, this expression selects all the New England states except Maine: SUB_REGION = 'New England' AND NOT STATE_NAME = 'Maine'
Calculations can be included in expressions using the arithmetic operators +, -, *, and /. Calculations can be between fields and numbers, for example: AREA >= PERIMETER * 100
Calculations can also be performed between fields. For example, to find the countries with a population density of less than or equal to 25 people per square mile, you can use this expression: POP1990 / AREA <= 25
Expressions are evaluated according to standard operator precedence rules. For example, the part of an expression enclosed in parentheses is evaluated before the part that isn't enclosed: HOUSEHOLDS > MALES * (POP90_SQMI + AREA)
You can add parentheses in SQL Edit mode by typing them, or use the Group and Ungroup commands in Clause mode to add or remove them.
A subquery is a query nested in another query and is supported by geodatabase data sources only. It can be used to apply predicate or aggregate functions or to compare data with values stored in another table. This can be done with the IN or ANY keyword. For example, this query selects only the countries that are not also listed in the indep_countries table: COUNTRY_NAME NOT IN (SELECT COUNTRY_NAME FROM indep_countries)
Shapefiles and other nongeodatabase file-based data sources do not support subqueries.
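The NOT IN subquery above can be tried outside ArcGIS. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in data source; SQLite is not one of the dialects described here, and the sample rows and the select_names helper are invented for illustration. Only the WHERE clause text comes from this topic.

```python
import sqlite3

# In-memory SQLite database standing in for a geodatabase (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (COUNTRY_NAME TEXT);
    CREATE TABLE indep_countries (COUNTRY_NAME TEXT);
    -- Invented sample rows.
    INSERT INTO countries VALUES ('Borduria'), ('Ruritania'), ('Syldavia');
    INSERT INTO indep_countries VALUES ('Ruritania'), ('Syldavia');
""")

def select_names(where_clause):
    """Hypothetical helper: the SELECT part is fixed, only the WHERE clause varies."""
    # ORDER BY is added here, outside the WHERE clause, only to make the output stable.
    sql = f"SELECT * FROM countries WHERE {where_clause} ORDER BY COUNTRY_NAME"
    return [row[0] for row in conn.execute(sql)]

# Countries that are not also listed in indep_countries.
not_independent = select_names(
    "COUNTRY_NAME NOT IN (SELECT COUNTRY_NAME FROM indep_countries)"
)
print(not_independent)
```

The helper mirrors how ArcGIS supplies SELECT * FROM for you, so the string passed in is the only part you would type into the dialog box.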
Subqueries that are performed on versioned enterprise feature classes and tables will not return features that are stored in the delta tables. File geodatabases provide the limited support for subqueries explained in this section, while enterprise geodatabases provide full support. For information on the full set of subquery capabilities of enterprise geodatabases, refer to your DBMS documentation.
This query returns the features with a GDP2006 greater than the GDP2005 of every feature contained in countries, that is, greater than the maximum GDP2005: GDP2006 > (SELECT MAX(GDP2005) FROM countries)
Subquery support in file geodatabases is limited to the following:
The following is the full list of query operators supported by file geodatabases, shapefiles, coverages, and other file-based data sources. They are also supported by enterprise geodatabases, although these data sources may require different syntax. In addition to the operators below, enterprise geodatabases support other capabilities. See your DBMS documentation for details. You use an arithmetic operator to add, subtract, multiply, and divide numeric values.
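As a rough illustration of arithmetic operators inside a WHERE clause, the sketch below again uses Python's sqlite3 module as a stand-in; the table contents and the select_names helper are assumptions made for this example, and SQLite's arithmetic rules (for instance, integer division truncates) may differ from your data source. The columns are declared REAL here to avoid that truncation.

```python
import sqlite3

# In-memory SQLite database standing in for a real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (NAME TEXT, POP1990 REAL, AREA REAL)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    # Invented rows: densities of 10, 50, and 25 people per square mile.
    [("A", 1000.0, 100.0), ("B", 5000.0, 100.0), ("C", 2500.0, 100.0)],
)

def select_names(where_clause):
    """Hypothetical helper: the SELECT part is fixed, only the WHERE clause varies."""
    sql = f"SELECT * FROM countries WHERE {where_clause} ORDER BY NAME"
    return [row[0] for row in conn.execute(sql)]

# Division between two fields, compared with a number.
low_density = select_names("POP1990 / AREA <= 25")

# Parentheses change what is divided: the addition now happens first.
grouped = select_names("POP1990 / (AREA + 100) <= 25")

print(low_density, grouped)
```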
You use comparison operators to compare one expression to another.
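The comparison and logical operators described in this topic can be exercised with a small sketch. It assumes Python's sqlite3 module as a stand-in data source, a made-up cities table, and a hypothetical select_cities helper; the WHERE clauses follow the patterns shown earlier, but results in your own data source may differ (for example, in string sort order or case sensitivity).

```python
import sqlite3

# In-memory SQLite database standing in for a real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (CITY_NAME TEXT, STATE_NAME TEXT, POPULATION INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [
        # Invented sample values; None is stored as NULL.
        ("Austin", "Texas", 950000),
        ("Boise", "Idaho", 230000),
        ("Macon", "Georgia", None),
        ("Miami", "Florida", 450000),
        ("Zebulon", "Georgia", 1200),
    ],
)

def select_cities(where_clause):
    """Hypothetical helper: the SELECT part is fixed, only the WHERE clause varies."""
    sql = f"SELECT * FROM cities WHERE {where_clause} ORDER BY CITY_NAME"
    return [row[0] for row in conn.execute(sql)]

at_least_5000 = select_cities("POPULATION >= 5000")      # NULL never satisfies >=
not_entered = select_cities("POPULATION IS NULL")        # NULL needs IS / IS NOT
m_through_z = select_cities("CITY_NAME >= 'M'")          # comparison by sort order
mid_sized = select_cities("POPULATION BETWEEN 1000 AND 300000")
georgia_not_macon = select_cities("STATE_NAME = 'Georgia' AND NOT CITY_NAME = 'Macon'")

print(at_least_5000, not_entered, m_through_z, mid_sized, georgia_not_macon)
```

Note that the record with a null population is returned only by the IS NULL test, which is why the NULL keyword has its own syntax rather than using the = operator.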
The following is the full list of functions supported by file geodatabases, shapefiles, coverages, and other file-based data sources. The functions are also supported by enterprise geodatabases, although these data sources may require different syntax or function names. In addition to the functions below, enterprise geodatabases support other capabilities. See your DBMS documentation for details.
Arguments denoted as string_exp can be the name of a column, a character string literal, or the result of another scalar function, where the underlying data type can be represented as a character type. Arguments denoted as character_exp are variable-length character strings. Arguments denoted as start or length can be a numeric literal or the result of another scalar function, where the underlying data type can be represented as a numeric type. These string functions are 1 based; that is, the first character in the string is character 1.
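To make the 1-based start argument concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in. SQLite's function names are an assumption relative to your data source: it provides UPPER and SUBSTR(string, start, length) but no LEFT function, so SUBSTR with a start of 1 plays that role here. The states table and the select_ids helper are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for a real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (ID INTEGER, STATE_NAME TEXT)")
conn.executemany(
    "INSERT INTO states VALUES (?, ?)",
    [(1, "Rhode Island"), (2, "RHODE ISLAND"), (3, "Alabama"), (4, "Alaska"), (5, "Ohio")],
)

def select_ids(where_clause):
    """Hypothetical helper: the SELECT part is fixed, only the WHERE clause varies."""
    sql = f"SELECT * FROM states WHERE {where_clause} ORDER BY ID"
    return [row[0] for row in conn.execute(sql)]

# Case-insensitive match by converting the stored value to uppercase.
rhode_island = select_ids("UPPER(STATE_NAME) = 'RHODE ISLAND'")

# Start position 1 is the first character, so this picks names starting with A.
starts_with_a = select_ids("SUBSTR(STATE_NAME, 1, 1) = 'A'")

# Six characters starting at position 7 of 'Rhode Island' spell 'Island'.
exact_island = select_ids("SUBSTR(STATE_NAME, 7, 6) = 'Island'")

print(rhode_island, starts_with_a, exact_island)
```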
All numeric functions return a numeric value. Arguments denoted as numeric_exp, float_exp, or integer_exp can be the name of a column, the result of another scalar function, or a numeric literal, where the underlying data type could be represented as a numeric type.
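A short sketch of numeric functions in a WHERE clause follows, again assuming Python's sqlite3 module as a stand-in data source. ROUND and ABS exist in SQLite, but function names and rounding behavior in your DBMS may differ; the parcels table, the CHANGE_KM field, and the select_ids helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (ID INTEGER, SQKM REAL, CHANGE_KM REAL)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    # Invented values; note the decimal point is always the decimal delimiter.
    [(1, 499.6, -3.5), (2, 500.4, 2.0), (3, 501.7, 4.25)],
)

def select_ids(where_clause):
    """Hypothetical helper: the SELECT part is fixed, only the WHERE clause varies."""
    sql = f"SELECT * FROM parcels WHERE {where_clause} ORDER BY ID"
    return [row[0] for row in conn.execute(sql)]

# Areas that round to 500 when rounded to zero decimals.
about_500 = select_ids("ROUND(SQKM, 0) = 500")

# Changes larger than 3 in either direction.
large_change = select_ids("ABS(CHANGE_KM) > 3")

print(about_500, large_change)
```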
The CAST() function converts a value or an expression from one data type to another specified data type. The syntax is as follows: CAST (expression AS data_type(length))
For example, in some scenarios, a string operation might be necessary, but if the data is stored in a number type field, the query wouldn't work. However, using the CAST() function, you can cast the number field to a string for a SQL operation. This code casts the number field SQLNUM as a text field, which can then be used in a text operation: CAST(SQLNUM AS CHARACTER(12))
The following table contains the keywords to use for data type conversions and can be specified in uppercase or lowercase.
CAST function examples
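The CAST example from this topic can be tried with the sketch below, which assumes Python's sqlite3 module as a stand-in data source. SQLite accepts the CHARACTER(12) type name but treats it simply as text and ignores the length, so this shows only the general idea, not the exact behavior of a file or enterprise geodatabase. The samples table and the select_ids helper are invented.

```python
import sqlite3

# In-memory SQLite database standing in for a real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ID INTEGER, SQLNUM INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, 1205), (2, 3412), (3, 120)],
)

def select_ids(where_clause):
    """Hypothetical helper: the SELECT part is fixed, only the WHERE clause varies."""
    sql = f"SELECT * FROM samples WHERE {where_clause} ORDER BY ID"
    return [row[0] for row in conn.execute(sql)]

# Cast the number field to text, then apply a string operation (LIKE) to it.
starts_with_12 = select_ids("CAST(SQLNUM AS CHARACTER(12)) LIKE '12%'")

# Confirm what storage type the cast produces in SQLite.
cast_type = conn.execute(
    "SELECT typeof(CAST(SQLNUM AS CHARACTER(12))) FROM samples"
).fetchone()[0]

print(starts_with_12, cast_type)
```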