$Revision: 6807 $
Copyright 2010-2024 Fred Toussi. Permission is granted to distribute this document without any alteration under the terms of the HSQLDB license. Additional permission is granted to the HSQL Development Group to distribute this document with or without alterations under the terms of the HSQLDB license.
2024-10-25
Table of Contents
HyperSQL data access and data change statements are compatible with the latest SQL:2023 Standard. There are a few extensions and some relaxation of rules, but these do not affect statements that are written to the Standard syntax. There is full support for classic SQL, as specified by SQL-92, and many enhancements added in later versions of the standard.
An SQL statement can be executed in two ways. One way is to use the
java.sql.Statement
interface. The Statement object
can be reused to execute completely different SQL statements.
Alternatively, a PreparedStatment
can be used to
execute an SQL statement repeatedly, and the statements can use
parameters. Using either form, if the SQL statement is a query expression,
a ResultSet
is returned.
In SQL, when a query expression (SELECT or similar SQL statement) is
executed, an ephemeral table is created. When this table is returned to
the application program, it is returned as a result set, which is accessed
row by row by a cursor. A JDBC ResultSet
represents
an SQL result set and its cursor.
The minimal definition of a cursor is a list of rows with a position that can move forward. Some cursors also allow the position to move backwards or jump to any position in the list.
An SQL cursor has several attributes. These attributes depend on the
query expression. Some of these attributes can be overridden by specifying
qualifiers in the SQL statement or by specifying values for the parameters
of the JDBC Statement
or
PreparedStatement
.
The columns of the result set are determined by the query
expression. The number of columns and the type and name characteristics
of each column are known when the query expression is compiled and
before its execution. This metadata information remains constant
regardless of changes to the contents of the tables used in the query
expression. The metadata for the JDBC ResultSet
is in the form of a ResultSetMetaData
object.
Various methods of the ResultSetMetaData
interface return different properties of each column of the
ResultSet
.
A result set may contain 0 or more rows. The rows are determined by the execution of the query expression.
The setMaxRows(int)
method of JDBC
Statement
allows limiting the number of rows
returned by the statement. This limit is conceptually applied after the
result has been built, and the excess rows are discarded.
A cursor is either scrollable or not. Scrollable cursors allow accessing rows by absolute or relative positioning. No-scroll cursors only allow moving to the next row. The cursor can be optionally declared with the SQL qualifiers SCROLL, or NO SCROLL. The JDBC statement parameter can be specified as: TYPE_FORWARD_ONLY and TYPE_SCROLL_INSENSITIVE. The JDBC type TYPE_SCROLL_SENSITIVE is not supported by HSQLDB.
The default is NO SCROLL or TYPE_FORWARD_ONLY.
When a JDBC ResultSet
is opened, it is
positioned before the first row. Using the
next()
method, the position is moved to the
first row. While the ResultSet
is positioned on a
row, various getter methods can be used to access the columns of the
row.
The result returned by some query expressions is updatable. HSQLDB supports core SQL updatability features, plus some enhancements from the SQL optional features.
A query expression is updatable if it is a SELECT from a single underlying base table (or updatable view) either directly or indirectly. A SELECT statement featuring DISTINCT or GROUP BY or FETCH, LIMIT, OFFSET is not updatable. In an updatable query expression, one or more columns are updatable. An updatable column is a column that can be traced directly to the underlying table. Therefore, columns that contain expressions are not updatable. Examples of updatable query expressions are given below. The view V is updatable when its query expression is updatable. The SELECT statement from this view is also updatable:
SELECT A, B FROM T WHERE C > 5 SELECT A, B FROM (SELECT * FROM T WHERE C > 10) AS TT WHERE TT.B <10 CREATE VIEW V(X,Y) AS SELECT A, B FROM T WHERE C > 0 AND B < 10 SELECT X FROM V WHERE Y = 5
If a cursor is declared with the SQL qualifier, FOR
UPDATE OF <column name list>
, then only the stated
columns in the result set become updatable. If any of the stated columns
is not actually updatable, then the cursor declaration will not
succeed.
If the SQL qualifier, FOR UPDATE is used, then all the updatable columns of the result set become updatable.
If a cursor is declared with FOR READ ONLY, then it is not updatable.
In HyperSQL, if FOR READ ONLY or FOR UPDATE is not used then all the updatable columns of the result set become updatable. This relaxes the SQL standard rule that in this case limits updatability to only simply updatable SELECT statements (where all columns are updatable).
In JDBC, CONCUR_READ_ONLY or CONCUR_UPDATABLE can be specified for
the Statement
parameter. CONCUR_UPDATABLE is
required if the returning ResultSet is to be updatable. If
CONCUR_READ_ONLY, which is the default, is used, then even an updatable
ResultSet becomes read-only.
When a ResultSet
is updatable, various
setter methods can be used to modify the column values. The names of the
setter methods begin with "update". After all the updates on a row are
done, the updateRow()
method must be called
only once to finalise the row update.
An updatable ResultSet
may or may not be
insertable-into. In an insertable ResultSet
, all
columns of the result are updatable and any column of the base table
that is not in the result must be a generated column or have a default
value.
In the ResultSet
object, a special
pseudo-row, called the insert row, is used to populate values for
insertion into the ResultSet
(and consequently,
into the base table). The setter methods must be used on all the
columns, followed by a call to
insertRow()
.
Individual rows from all updatable result sets can be deleted one
at a time. The deleteRow()
is called when the
ResultSet
is positioned on a row.
While using an updatable ResultSet to modify data, it is recommended not to change the same data using another ResultSet and not to execute SQL data change statements that modify the same data.
The sensitivity of the cursor relates to visibility of changes made to the data by the same transaction but without using the given cursor. While the result set is open, the same transaction may use statements such as INSERT or UPDATE, and change the data of the tables from which the result set data is derived. A cursor is SENSITIVE if it reflects those changes. It is INSENSITIVE if it ignores such changes. It is ASENSITIVE if behaviour is implementation dependent.
The SQL default is ASENSITIVE, i.e., implantation dependent.
In HyperSQL all cursors are INSENSITIVE. They do not reflect changes to the data made by other statements.
A cursor is holdable if the result set is not automatically closed when the current transaction is committed. Holdability can be specified in the cursor declaration using the SQL qualifiers WITH HOLD or WITHOUT HOLD.
In JDBC, holdability is specified using either of the following values for the Statement parameter: HOLD_CURSORS_OVER_COMMIT, or CLOSE_CURSORS_AT_COMMIT.
The SQL default is WITHOUT HOLD.
The JDBC default for HyperSQL result sets is WITH HOLD for read-only result sets and WITHOUT HOLD for updatable result sets.
If the holdability of a ResultSet
is
specified in a conflicting manner in the SQL statement and the JDBC
Statement
object, the JDBC setting takes
precedence.
The autocommit property of a connection is a feature of JDBC and
ODBC and is not part of the SQL Standard. In autocommit mode, all
transactional statements are followed by an implicit commit. In
autocommit mode, all ResultSet
objects are
read-only and holdable.
The JDBC settings, ResultSet.CONCUR_READONLY and ResultSet.CONCUR_UPDATABLE are the available alternatives for read-only or updatability. The default is ResultSet.CONCUR_READONLY.
The JDBC settings, ResultSet.TYPE_FORWARD_ONLY, ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.TYPE_SCROLL_SENSITIVE are the available alternatives for both scrollability (navigation) and sensitivity. HyperSQL does not support ResultSet.TYPE_SCROLL_SENSITIVE. The two other alternatives can be used for both updatable and read-only result sets.
The JDBC settings ResultSet.CLOSE_CURSORS_AT_COMMIT and ResultSet.HOLD_CURSORS_OVER_COMMIT are the alternatives for the lifetime of the result set. The default is ResultSet.CLOSE_CURSORS_AT_COMMIT. The other setting can only be used for read-only result sets.
Examples of creating statements for updatable result sets are given below:
Connection c = newConnection(); Statement st; c.setAutoCommit(false); st = c.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE); st = c.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
When a JDBC PreparedStatement
or
CallableStatement
is used with an SQL statement
that contains dynamic parameters, the data types of the parameters are
resolved and determined by the engine when the statement is prepared.
The SQL Standard has detailed rules to determine the data types and
imposes limits on the maximum length or precision of the parameter.
HyperSQL applies the standard rules with two exceptions for parameters
with String and BigDecimal Java types. HyperSQL ignores the limits when
the parameter value is set, and only enforces the necessary limits for
inserts and updates when the PreparedStatement
is
executed. In all other cases, parameter type limits are checked and
enforced when the parameter is set.
In the example below the setString()
calls do not raise an exception, but one of the
execute()
statements does.
// table definition: CREATE TABLE T (NAME VARCHAR(12), ...) Connection c = newConnection(); PreparedStatement st = c.prepareStatement("SELECT * FROM T WHERE NAME = ?"); // type of the parameter is VARCHAR(12), which limits length to 12 characters st.setString(1, "Eyjafjallajokull"); // string is longer than type, but no exception is raised here st.execute(); // executes with no exception and does not find any rows // but if an UPDATE is attempted, an exception is raised st = c.prepareStatement("UPDATE T SET NAME = ? WHERE ID = 10"); st.setString(1, "Eyjafjallajokull"); // string is longer than type, but no exception is raised here st.execute(); // exception is thrown when HyperSQL checks the value for update
JDBC parameters can be set with any compatible type, as supported
by the JDBC specification. For CLOB and BLOB types, you can use streams,
or create instances of BLOB or CLOB before assigning them to the
parameters. You can even use CLOB or BLOB objects returned from
connections to other RDBMS servers. The
Connection.createBlob()
and
createClob()
methods can be used to create the
new LOBs. For very large LOBs the stream methods are preferable as they
use less memory.
For array parameters, you can use a
java.sql.Array
object that contains the array
elements before assigning to JDBC parameters. The
Connection.createArrayOf(...)
method can be
used to create a new object, or you can use an Array returned from
connections to other RDBMS servers. You can also use Java arrays of
primitives to assign to the SQL array parameters.
The above also applies to the methods of
java.sql.ResultSet
that are used for setting
values in new and updated rows in updatable
ResultSet
objects.
The setObject() methods of
PreparedStatement
and
CallableStatement
also accept objects of the new
types introduced in Java 8 and listed below under JDBC Returned Values.
The new Java 8 method, getObject(int columnIndex,
Class<T> type)
, can be used to retrieve the value of
an OUT parameter from a CallableStatement
.
Data change statements, also called data manipulation statements
(DML) such as INSERT, UPDATE, MERGE can be called with different
executeUpdate()
methods of
java.sql.Statement
and
java.sql.PreparedStatement
. Some of these methods
allow you to specify how values for generated columns of the table are
returned. These methods are documented in the JavaDoc for
org.hsqldb.jdbc.JDBCStatement
and
org.hsqldb.jdbc.JDBCPreparedStatement
. HyperSQL
can return not just the generated columns, but any set of columns of the
table. You can use this to retrieve the columns values that may be
modified by a BEFORE TRIGGER on the table.
The JDBC CallableStatement
interface is
used to call Java or SQL procedures that have been defined in the
database. The SQL statement is in the form of CALL
procedureName ( ... )
with constant value arguments or with
parameter markers. Note that you must use a parameter marker for OUT and
INOUT arguments of the procedure you are calling. The OUT arguments
should not be set before executing the callable statement.
After executing the statement, you can retrieve the OUT and INOUT parameters with the appropriate getXXX() method.
Procedures can also return one or more result sets. You should
call the getResultSet()
and
getMoreResults()
methods to retrieve the result
sets one by one.
SQL functions can also return a table. You can call such functions the same way as procedures and retrieve the table as a ResultSet.
The methods of the JDBC ResultSet
interface
are used to return values and to convert value to different types as
supported by the JDBC specification. Methods of JDBC
CallableStatement
that have the same signature
are used to return values from procedure calls.
When a CLOB and BLOB object is returned from a ResultSet, no data
is transferred until the data is read by various methods of
java.sql.CLOB
and
java.sql.BLOB
. Data is streamed in large blocks
to avoid excessive memory use.
Array objects are returned as instances of
java.sql.Array
.
Methods added in Java 8 are supported and the behaviour of the
getObject(int columnIndex)
method for
retrieving TIMESTAMP WITH TIME ZONE has changed. In versions before
2.4.0, this method returned a java.sql.Timestamp
object. It now returns a java.time.OffsetDateTime
object which contains the time zone offset value as well as the other
fields of the TIMESTAMP.
A new method, getObject(int columnIndex,
Class<T> type)
is available. With this method, you
specify the required return type. The common types such as String,
Integer, byte[] are supported, as well as new types that can be used for
DATE, TIME, TIMESTAMP, and INTERVAL values. The table below shows which
Java classes are the most appropriate for the specified SQL TYPES. In
addition, you can use these Java types for values of SQL type that are
not fully matched. For example,
java.time.LocalDateTime
can be used to retrieve a
DATE value.
java.util.UUID = UUID |
java.time.LocalDate = DATE |
java.sql.Date = DATE |
java.time.LocalTime = TIME |
java.sql.Time = TIME |
java.time.LocalDateTime =
TIMESTAMP |
java.sql.Timestamp =
TIMESTAMP |
java.time.OffsetTime = TIME WITH
TIME ZONE |
java.time.OffsetDateTime =
TIMESTAMP WITH TIME ZONE |
java.time.Duration = INTERVAL
MONTH, INTERVAL YEAR, INTERVAL YEAR TO MONTH |
java.time.Period = INTERVAL
SECOND, INTERVAL MINUTE, INTERVAL HOUR, INTERVAL DAY, and their
range combinations |
java.sql.Array = all ARRAY |
The DECLARE CURSOR statement is used within an SQL PROCEDURE body. In the current version of HyperSQL, the cursor is used only to return a result set from the procedure. Therefore, the cursor must be declared WITH RETURN and can only be READ ONLY.
DECLARE CURSOR
declare cursor statement
<declare cursor> ::= DECLARE <cursor
name>
[ { SENSITIVE | INSENSITIVE | ASENSITIVE } ] [ { SCROLL |
NO SCROLL } ]
CURSOR [ { WITH HOLD | WITHOUT HOLD } ] [ { WITH RETURN |
WITHOUT RETURN } ]
FOR <query expression>
[ FOR { READ ONLY | UPDATE [ OF <column name list>
] } ]
The query expression is a SELECT statement or similar, and is
discussed in the rest of this chapter. In the example below a cursor is
declared for a SELECT statement. It is later opened to create the result
set. The cursor is specified WITHOUT HOLD, so the result set is not kept
after a commit. Use WITH HOLD to keep the result set. Note that you need
to declare the cursor WITH RETURN as it is returned by the
CallableStatement
.
DECLARE thiscursor SCROLL CURSOR WITHOUT HOLD WITH RETURN FOR SELECT * FROM INFORMATION_SCHEMA.TABLES; -- OPEN thiscursor;
The syntax elements that can be used in data access and data change statements are described in this section. The SQL Standard has a very extensive set of definitions for these elements. The BNF definitions given here are sometimes simplified.
Literals are used to express constant values. The general type of a literal is known by its format. The specific type is based on conventions.
unicode escape elements
unicode escape elements
<Unicode escape specifier> ::= [ UESCAPE
<quote><Unicode escape character><quote> ]
<Unicode escape value> ::= <Unicode 4 digit
escape value> | <Unicode 6 digit escape value> | <Unicode
character escape value>
<Unicode 4 digit escape value> ::= <Unicode
escape
character><hexit><hexit><hexit><hexit>
<Unicode 6 digit escape value> ::= <Unicode
escape character><plus sign>
<hexit><hexit><hexit><hexit><hexit><hexit>
<Unicode character escape value> ::= <Unicode
escape character><Unicode escape character>
<Unicode escape character> ::= a single
character other than a <hexit> (a-f, A-F, 0-9), <plus sign>,
<quote>, <double quote>, or <white
space>
character literal
character literal
<character string literal> ::= [
<introducer><character set specification> ] <quote> [
<character representation>... ] <quote> [ {
<separator> <quote> [ <character representation>... ]
<quote> }... ]
<introducer> ::=
<underscore>
<character representation> ::= <nonquote
character> | <quote symbol>
<nonquote character> ::= any character apart
from the quote symbol.
<quote symbol> ::=
<quote><quote>
<national character string literal> ::= N
<quote> [ <character representation>... ] <quote> [ {
<separator> <quote> [ <character representation>... ]
<quote> }... ]
<Unicode character string literal> ::= [
<introducer><character set specification> ]
U<ampersand><quote> [ <Unicode representation>... ]
<quote> [ { <separator> <quote> [ <Unicode
representation>... ] <quote> }... ] <Unicode escape
specifier>
<Unicode representation> ::= <character
representation> | <Unicode escape value>
The type of a character literal is CHARACTER. The length of the string literal is the character length of the type. If the quote character is used in a string, it is represented with two quote characters. Long literals can be divided into multiple quoted strings, separated with a space or end-of-line character.
Unicode literals start with U& and can contain ordinary characters and Unicode escapes. A Unicode escape begins with the backslash ( \ ) character and is followed by four hexadecimal characters which specify the character code. The Unicode escape character can be custom defined for a literal string by adding UESPACE as in one of the examples below.
Example of character literals are given below:
'a literal' ' string seperated' ' into parts' 'a string''s literal form with quote character' U&'Unicode string with Greek delta \0394 and phi \03a6 letters' U&'Unicode string with forward slash // as custom escape character'UESCAPE'/'
binary literal
binary literal
<binary string literal> ::= X <quote> [
<space>... ] [ { <hexit> [ <space>... ] <hexit>
[ <space>... ] }... ] <quote> [ { <separator>
<quote> [ <space>... ] [ { <hexit> [ <space>...
] <hexit> [ <space>... ] }... ] <quote> }...
]
<hexit> ::= <digit> | A | B | C | D | E |
F | a | b | c | d | e | f
The type of a binary literal is BINARY. The octet length of the binary literal is the length of the type. Case-insensitive hexadecimal characters are used in the binary string. Each pair of characters in the literal represents a byte in the binary string. Long literals can be divided into multiple quoted strings, separated with a space or end-of-line character.
X'1abACD34' 'Af'
bit literal
bit literal
<bit string literal> ::= B <quote> [
<bit> ... ] <quote> [ { <separator> <quote> [
<bit>... ] <quote> }... ]
<bit> ::= 0 | 1
The type of a binary literal is BIT. The bit length of the bit literal is the length of the type. Digits 0 and 1 are used to represent the bits. Long literals can be divided into multiple quoted strings, separated with a space or end-of-line character.
B'10001001' '00010'
numeric literal
numeric literal
<signed numeric literal> ::= [ <sign> ]
<unsigned numeric literal>
<unsigned numeric literal> ::= <exact numeric
literal> | <approximate numeric literal>
<exact numeric literal> ::= <unsigned
integer> [ <period> [ <unsigned integer> ] ] |
<period> <unsigned integer>
<sign> ::= <plus sign> | <minus
sign>
<approximate numeric literal> ::=
<mantissa> E <exponent>
<mantissa> ::= <exact numeric
literal>
<exponent> ::= <signed
integer>
<signed integer> ::= [ <sign> ]
<unsigned integer>
<unsigned integer> ::=
<digit>...
The type of an exact numeric literal without a decimal point is INTEGER, BIGINT, or DECIMAL, depending on the value of the literal (the smallest type that can represent the value is the type).
The type of an exact numeric literal with a decimal point is DECIMAL. The precision of a decimal literal is the total number of digits of the literal. The scale of the literal is the total number of digits to the right of the decimal point.
The type of an approximate numeric literal is DOUBLE. An approximate numeric literal always includes the mantissa and exponent, separated by E.
12 34.35 +12E-2
boolean literal
boolean literal
<boolean literal> ::= TRUE | FALSE |
UNKNOWN
The boolean literal is one of the specified keywords.
datetime and interval literal
datetime and interval literal
<datetime literal> ::= <date literal> |
<time literal> | <timestamp literal>
<date literal> ::= DATE <date
string>
<time literal> ::= TIME <time
string>
<timestamp literal> ::= TIMESTAMP <timestamp
string>
<date string> ::= <quote> <unquoted
date string> <quote>
<time string> ::= <quote> <unquoted
time string> <quote>
<timestamp string> ::= <quote>
<unquoted timestamp string> <quote>
<time zone interval> ::= <sign> <hours
value> <colon> <minutes value>
<date value> ::= <years value> <minus
sign> <months value> <minus sign> <days
value>
<time value> ::= <hours value>
<colon> <minutes value> <colon> <seconds
value>
<interval literal> ::= INTERVAL [ <sign> ]
<interval string> <interval qualifier>
<interval string> ::= <quote> <unquoted
interval string> <quote>
<unquoted date string> ::= <date
value>
<unquoted time string> ::= <time value> [
<time zone interval> ]
<unquoted timestamp string> ::= <unquoted
date string> <space> <unquoted time
string>
<unquoted interval string> ::= [ <sign> ]
{ <year-month literal> | <day-time literal>
}
<year-month literal> ::= <years value> [
<minus sign> <months value> ] | <months
value>
<day-time literal> ::= <day-time interval>
| <time interval>
<day-time interval> ::= <days value> [
<space> <hours value> [ <colon> <minutes value>
[ <colon> <seconds value> ] ] ]
<time interval> ::= <hours value> [
<colon> <minutes value> [ <colon> <seconds
value> ] ] | <minutes value> [ <colon> <seconds
value> ] | <seconds value>
<years value> ::= <datetime
value>
<months value> ::= <datetime
value>
<days value> ::= <datetime
value>
<hours value> ::= <datetime
value>
<minutes value> ::= <datetime
value>
<seconds value> ::= <seconds integer
value> [ <period> [ <seconds fraction> ]
]
<seconds integer value> ::= <unsigned
integer>
<seconds fraction> ::= <unsigned
integer>
<datetime value> ::= <unsigned
integer>
The type of a datetime or interval type is specified in the literal. The fractional second precision is the number of digits in the fractional part of the literal. Details are described in the SQL Language chapter
DATE '2008-08-08' TIME '20:08:08' TIMESTAMP '2008-08-08 20:08:08.235' INTERVAL '10' DAY INTERVAL -'08:08' MINUTE TO SECOND
References are identifier chains, which can be a single identifiers or identifiers chains composed of single identifiers chained together with the period symbol.
identifier chain
identifier chain
<identifier chain> ::= <identifier> [ {
<period> <identifier> }... ]
<basic identifier chain> ::= <identifier
chain>
A period-separated chain of identifiers. The identifiers in an identifier chain can refer to database objects in a hierarchy. The possible hierarchies are as follows. In each hierarchy, elements from the start or the end can be missing, but the order of elements cannot be changed.
catalog, schema, database object
catalog, schema, table, column
correlation name, column
Examples of identifier chain are given below:
SELECT MYCATALOG.MYSCHEMA.MYTABLE.MYCOL FROM MYCATALOG.MYSCHEMA.MYTABLE SELECT MYTABLE.MYCOL FROM MYSCHEMA.MYTABLE DROP TABLE MYCATALOG.MYSCHEMA.MYTABLE CASCADE ALTER SEQUENCE MYCATALOG.MYSCHEMA.MYSEQUENCE RESTART WITH 100
column reference
column reference
<column reference> ::= <basic identifier
chain> | MODULE <period> <qualified identifier>
<period> <column name>
Reference a column or a routine variable.
SQL parameter reference
SQL parameter reference
<SQL parameter reference> ::= <basic
identifier chain>
Reference an SQL routine parameter.
contextually typed value specification
contextually typed value specification
<contextually typed value specification> ::=
<null specification> | <default
specification>
<null specification> ::=
NULL
<default specification> ::=
DEFAULT
Specify a value whose data type or value is inferred from its context.
DEFAULT is used for assignments to table columns that have a default value, or to table columns that are generated either as an IDENTITY value or as an expression.
NULL can be used only in a context where the type of the value is known. For example, a NULL can be assigned to a column of the table in an INSERT or UPDATE statement, because the type of the column is known. But if NULL is used in a SELECT list, it must be used in a CAST statement.
Value expression is a general name for all expressions that return a value. Different types of expressions are allowed in different contexts.
value expression primary
value expression primary
<value expression primary> ::= <parenthesized
value expression> | <nonparenthesized value expression
primary>
<parenthesized value expression> ::= <left
paren> <value expression> <right
paren>
<nonparenthesized value expression primary> ::=
<unsigned value specification> | <column reference> |
<set function specification> | <scalar subquery> | <case
expression> | <cast specification> | <next value
expression> | <current value expression> | <routine
invocation>
Specify a value that is syntactically self-delimited.
value specification
value specification
<value specification> ::= <literal> |
<general value specification>
<unsigned value specification> ::= <unsigned
literal> | <general value specification>
<target specification> ::= <host parameter
specification> | <SQL parameter reference> | <column
reference> | <dynamic parameter
specification>
<simple target specification> ::= <host
parameter specification> | <SQL parameter reference> |
<column reference> | <embedded variable
name>
<host parameter specification> ::= <host
parameter name> [ <indicator parameter> ]
<dynamic parameter specification> ::=
<question mark>
Specify one or more values, host parameters, SQL parameters, dynamic parameters, or host variables.
row value expression
row value expression
<row value expression> ::= <row value special
case> | <explicit row value constructor>
<row value predicand> ::= <row value special
case> | <row value constructor predicand>
<row value special case> ::=
<nonparenthesized value expression primary>
<explicit row value constructor> ::= <left
paren> <row value constructor element> <comma> <row
value constructor element list> <right paren>
|
ROW <left paren> <row value constructor
element list> <right paren> | <row
subquery>
Specify a row consisting of one or more elements. A comma separated list of expressions, enclosed in brackets, with the optional keyword ROW. In SQL, a row containing a single element can often be used where a single value is expected.
set function specification
set function specification
<set function specification> ::= <aggregate
function> | <grouping operation>
<grouping operation> ::= GROUPING <left
paren> <column reference> [ { <comma> <column
reference> }... ] <right paren>
Specify an integer value formed by bits denoting the presence
of the column in the current row of the result of GROUPING SETS.
HyperSQL supports <grouping operation>
from
version 2.5.1.
COALESCE
coalesce expression
<coalesce expression> := COALESCE <left
paren> <value expression> { <comma> <value
expression> }... <right paren>
Replace null values with another value. The coalesce expression
has two or more instances of <value expression>. If the first
<value expression> evaluates to a non-null value, it is returned
as the result of the coalesce expression. If it is null, the next
<value expression>
is evaluated and if it
evaluates to a non-non value, it is returned, and so on.
The type of the return value of a COALESCE expression is the
aggregate type of the types of all the <value
expression>
instances. Therefore, any value returned is
implicitly cast to this type. HyperSQL also features built-in functions
with similar functionality.
NULLIF
nullif expression
<nullif expression> := NULLIF <left paren>
<value expression> <comma> <value expression>
<right paren>
Return NULL if two values are equal. If the result of the first
<value expression>
is not equal to the result
of the second, then it is returned, otherwise NULL is returned. The type
of the return value is the type of the first <value
expression>
.
SELECT i, NULLIF(n, 'not defined') FROM t
CASE
case specification
<case specification> ::= <simple case> |
<searched case>
<simple case> ::= CASE <case operand>
<simple when clause>... [ <else clause> ]
END
<searched case> ::= CASE <searched when
clause>... [ <else clause> ] END
<simple when clause> ::= WHEN <when operand
list> THEN <result>
<searched when clause> ::= WHEN <search
condition> THEN <result>
<else clause> ::= ELSE
<result>
<case operand> ::= <row value predicand> |
<overlaps predicate part 1>
<when operand list> ::= <when operand> [ {
<comma> <when operand> }... ]
<when operand> ::= <row value predicand> |
<comparison predicate part 2> | <between predicate part 2> |
<in predicate part 2> | <character like predicate part 2> |
<octet like predicate part 2> | <similar predicate part 2> |
<regex like predicate part 2> | <null predicate part 2> |
<quantified comparison predicate part 2> | <match predicate
part 2> | <overlaps predicate part 2> | <distinct predicate
part 2>
<result> ::= <result expression> |
NULL
<result expression> ::= <value
expression>
Specify a conditional value. The result of a case expression is always a value. All the values introduced with THEN must be of the same type or convertible to the same type. The WHEN clause of CASE can be used in two different forms. The first form starts with a variable and the WHEN clauses follow, either as possible values for the variable, or as conditions. The second form does not start with a variable and each WHEN is followed by a self-contained conditional expression which can use any variables.
Some simple examples of the CASE expression are given below. The first two examples return 'Britain', 'Germany', or 'Other country' depending on the value of dial code. The third example uses IN and smaller-than predicates.
CASE dialcode WHEN 44 THEN 'Britain' WHEN 49 THEN 'Germany' ELSE 'Other country' END CASE WHEN dialcode=44 THEN 'Britain' WHEN dialcode=49 THEN 'Germany' WHEN dialcode < 0 THEN 'bad dial code' ELSE 'Other country' END CASE dialcode WHEN IN (44,49,30) THEN 'Europe' WHEN IN (86,91,92) THEN 'Asia' WHEN < 0 THEN 'bad dial code' ELSE 'Other continent' END
The case statement can be far more complex and involve several conditions.
CAST
cast specification
<cast specification> ::= CAST <left paren>
<cast operand> AS <cast target> <right
paren>
<cast operand> ::= <value expression> |
<implicitly typed value specification>
<cast target> ::= <domain name> | <data
type>
Specify a data conversion. Data conversion takes place automatically among variants of a general type. For example, numeric values are freely converted from one type to another in expressions.
Explicit type conversion is necessary in two cases. One case is to determine the type of a NULL value. The other case is to force conversion for special purposes. Values of data types can be cast to a character type. The exception is BINARY and OTHER types. The result of the cast is the literal expression of the value. Conversely, a value of a character type can be converted to another type if the character value is a literal representation of the value in the target type. Special conversions are possible between numeric and interval types, which are described in the section covering interval types.
The examples below show examples of cast with their result:
CAST (NULL AS TIMESTAMP) CAST (' 199 ' AS INTEGER) = 199 CAST ('tRue ' AS BOOLEAN) = TRUE CAST (INTERVAL '2' DAY AS INTEGER) = 2 CAST ('1992-04-21' AS DATE) = DATE '1992-04-21'
NEXT VALUE FOR
next value expression
<next value expression> ::= NEXT VALUE FOR
<sequence generator name>
Return the next value of a sequence generator. This expression can be used as a select list element in queries, or in assignments to table columns in data change statements. If the expression is used more than once in a single row that is being evaluated, the same value is returned for each invocation. After evaluation of the particular row is complete, the sequence generator will return a different value from the old value. The new value is generated by the sequence generator by adding the increment to the last value it generated. In SQL syntax compatibility modes, variants of this expression in different SQL dialects are supported. In the example below the expression is used in an insert statement:
INSERT INTO MYTABLE(COL1, COL2) VALUES 2, NEXT VALUE FOR MYSEQUENCE
CURRENT VALUE FOR
current value expression
<current value expression> ::= CURRENT VALUE FOR
<sequence generator name>
Return the latest value that was returned by the NEXT VALUE FOR expression for a sequence generator in this session. In the example below, the value that was generated by the sequence for the first insert, is reused for the second insert:
INSERT INTO MYTABLE(COL1, COL2) VALUES 2, NEXT VALUE FOR MYSEQUENCE; INSERT INTO CHILDTABLE(COL1, COL2) VALUES 10, CURRENT VALUE FOR MYSEQUENCE;
value expression
value expression
<value expression> ::= <numeric value
expression> | <string value expression> | <datetime value
expression> | <interval value expression> | <boolean value
expression> | <row value expression>
An expression that returns a value. The value can be a single value, or a row consisting more than one value.
numeric value expression
numeric value expression
<numeric value expression> ::= <term> |
<numeric value expression> <plus sign> <term> |
<numeric value expression> <minus sign>
<term>
<term> ::= <factor> | <term>
<asterisk> <factor> | <term> <solidus>
<factor>
<factor> ::= [ <sign> ] <numeric
primary>
<numeric primary> ::= <value expression
primary> | <numeric value function>
Specify a numeric value. The BNF indicates that
<asterisk>
and
<solidus>
(the operators for multiplication and
division) have precedence over <minus sign>
and
<plus sign>
.
numeric value function
numeric value function
<numeric value function> ::= <position
expression> | <extract expression> | <length expression>
...
Specify a function yielding a value of type numeric. The supported numeric value functions are listed and described in the Built In Functions chapter.
string value expression
string value expression
<string value expression> ::= <string
concatenation> | <string factor>
<string factor> ::= <value expression
primary> | <string value function>
<string concatenation> ::= <string value
expression> <concatenation operator> <string
factor>
<concatenation operator> ::=
||
Specify a character string value, a binary string value, or a
bit string value. The BNF indicates that a string value expression can
be formed by concatenation of two or more <value expression
primary>
. The types of the <value expression
primary>
elements must be compatible, that is, all must be
string, or binary or bit string values.
character value function
string value function
<string value function> ::=
...
Specify a function that returns a character string or binary string. The supported character value functions are listed and described in the Built In Functions chapter.
datetime value expression
datetime value expression
<datetime value expression> ::= <datetime
term> | <interval value expression> <plus sign>
<datetime term> | <datetime value expression> <plus
sign> <interval term> | <datetime value expression>
<minus sign> <interval term>
<datetime term> ::= <datetime
factor>
<datetime factor> ::= <datetime primary> [
<time zone> ]
<datetime primary> ::= <value expression
primary> | <datetime value function>
<time zone> ::= AT <time zone
specifier>
<time zone specifier> ::= LOCAL | TIME ZONE {
<interval primary> | <time zone name> }
Specify a datetime value. Details are described in the SQL Language chapter.
datetime value function
datetime value function
<datetime value function> ::=
...
Specify a function that returns a datetime value. The supported datetime value functions are listed and described in the Built In Functions chapter.
interval term
interval value expression
<interval value expression> ::= <interval
term> | <interval value expression 1> <plus sign>
<interval term 1> | <interval value expression 1> <minus
sign> <interval term 1> | <left paren> <datetime value
expression> <minus sign> <datetime term> <right
paren> <interval qualifier>
<interval term> ::= <interval factor> |
<interval term 2> <asterisk> <factor> | <interval
term 2> <solidus> <factor> | <term>
<asterisk> <interval factor>
<interval factor> ::= [ <sign> ]
<interval primary>
<interval primary> ::= <value expression
primary> [ <interval qualifier> ] | <interval value
function>
<interval value expression 1> ::= <interval
value expression>
<interval term 1> ::= <interval
term>
<interval term 2> ::= <interval
term>
Specify an interval value. Details are described in the SQL Language chapter.
interval absolute value function
interval value function
<interval value function> ::= <interval
absolute value function>
<interval absolute value function> ::= ABS
<left paren> <interval value expression> <right
paren>
Specify a function that returns the absolute value of an interval. If the interval is negative, it is negated, otherwise the original value is returned.
boolean value expression
boolean value expression
<boolean value expression> ::= <boolean
term> | <boolean value expression> OR <boolean
term>
<boolean term> ::= <boolean factor> |
<boolean term> AND <boolean factor>
<boolean factor> ::= [ NOT ] <boolean
test>
<boolean test> ::= <boolean primary> [ IS
[ NOT ] <truth value> ]
<truth value> ::= TRUE | FALSE |
UNKNOWN
<boolean primary> ::= <predicate> |
<boolean predicand>
<boolean predicand> ::= <parenthesized
boolean value expression> | <nonparenthesized value expression
primary>
<parenthesized boolean value expression> ::=
<left paren> <boolean value expression> <right
paren>
Specify a boolean value.
Predicates are conditions and evaluate to a boolean value. Some
predicates have two sides. The left side of the predicate, the
<row value predicand>
, is the common element of
all two-sided predicates. This element is a generalisation of both
<value expression>
, which is a scalar, and of
<explicit row value constructor>
, which is a
row. The two sides of a predicate can be split in CASE expressions where
the <row value predicand>
is part of multiple
predicates.
In the following example, a column of the table is the left side of two predicates in a CASE expression.
SELECT CASE city WHEN 'Oslo' THEN 'Scandinavia' WHEN IN ('Dallas', 'Boston') THEN 'America' ELSE '?' END FROM customer
The number of fields in all <row value
predicand>
used in predicates must be the same and the
types of the fields in the same position must be compatible for
comparison. If either of these conditions does not hold, an exception is
raised. The number of fields in a row is called the
degree.
In many types of predicates (but not all of them), if the
<row value predicand>
evaluates to NULL, the
result of the predicate is UNKNOWN. If the <row value
predicand>
has more than one element, and one or more of
the fields evaluate to NULL, the result depends on the particular
predicate.
comparison predicand
comparison predicate
<comparison predicate> ::= <row value
predicand> <comp op> <row value
predicand>
<comp op> ::= <equals operator> | <not
equals operator> | <less than operator> | <greater than
operator> | <less than or equals operator> | <greater than
or equals operator>
Specify a comparison of two row values. If either
<row value predicand>
evaluates to NULL, the
result of <comparison predicate>
is UNKNOWN.
Otherwise, the result is TRUE, FALSE or UNKNOWN.
If the degree of <row value
predicand>
is larger than one, comparison is performed
between each field and the corresponding field in the other
<row value predicand>
from left to right, one
by one.
When comparing two elements, if either field is NULL then the result is UNKNOWN.
For <equals operator>
, if the result
of comparison is TRUE for all field, the result of the predicate is
TRUE. If the result of comparison is FALSE for one field, the result of
predicate is FALSE. Otherwise the result is UNKNOWN.
The <not equals operator>
is
translated to NOT (<row value predicand> = <row value
predicand>)
.
The <less than or equals operator>
is
translated to (<row value predicand> = <row value
predicand>) OR (<row value predicand> < <row value
predicand>)
. The <greater than or equals
operator>
is translated in a similar way.
For the <less than operator>
and
<greater than operator>
, if two fields at a
given position are equal, then comparison continues to the next field.
Otherwise, the result of the last performed comparison is returned as
the result of the predicate. This means that if the first field is NULL,
the result is always UNKNOWN.
The logic that governs NULL values and UNKNOWN result is as follows: Suppose the NULL values were substituted by arbitrary real values. If substitution cannot change the result of the predicate, then the result is TRUE or FALSE, based on the existing non-NULL values, otherwise the result of the predicate is UNKNOWN.
The examples of comparison given below use literals, but the literals actually represent the result of evaluation of some expression.
((1, 2, 3, 4) = (1, 2, 3, 4)) IS TRUE ((1, 2, 3, 4) = (1, 2, 3, 5)) IS FALSE ((1, 2, 3, 4) < (1, 2, 3, 4)) IS FALSE ((1, 2, 3, 4) < (1, 2, 3, 5)) IS TRUE ((NULL, 1, NULL) = (NULL, 1, NULL)) IS UNKNOWN ((NULL, 1, NULL) = (NULL, 2, NULL)) IS FALSE ((NULL, 1, NULL) <> (NULL, 2, NULL)) IS TRUE ((NULL, 1, 2) <all operators> (NULL, 1, 2)) IS UNKNOWN ((1, NULL, ...) < (1, 2, ...)) IS UNKNOWN ((1, NULL, ...) < (2, NULL, ...)) IS TRUE ((2, NULL, ...) < (1, NULL, ...)) IS FALSE
BETWEEN
between predicate
<between predicate> ::= <row value
predicand> <between predicate part 2>
<between predicate part 2> ::= [ NOT ] BETWEEN [
ASYMMETRIC | SYMMETRIC ] <row value predicand> AND <row value
predicand>
Specify a range comparison. The default is ASYMMETRIC. The
expression X BETWEEN Y AND Z
is equivalent to
(X >= Y AND X <= Z)
. Therefore, if Y > Z,
the BETWEEN expression is never true. The expression X BETWEEN
SYMMETRIC Y AND Z
is equivalent to (X >= Y AND X
<= Z) OR (X >= Z AND X <= Y)
. The expression
Z NOT BETWEEN ...
is equivalent to NOT (Z
BETWEEN ...)
. If any of the three <row value
predicand>
evaluates to NULL, the result is
UNKNOWN.
IN
in predicate
<in predicate> ::= <row value predicand> [
NOT ] IN <in predicate value>
<in predicate value> ::= <table subquery>
| <left paren> <in value list> <right paren>
| <left paren> UNNEST <left paren>
<array value expression> <right paren> <right
paren>
<in value list> ::= <row value expression>
[ { <comma> <row value expression> }...
]
Specify a quantified comparison. The expression X NOT
IN Y is
equivalent to NOT (X IN Y)
. The
( <in value list> )
is converted into a table
with one or more rows. The expression X IN Y
is
equivalent to X = ANY Y
, which is a
<quantified comparison predicate>
.
If the <table subquery>
returns no
rows, the result is FALSE. Otherwise the <row value
predicand>
is compared one by one with each row of the
<table subquery>
.
If the comparison is TRUE for at least one row, the result is TRUE. If the comparison is FALSE for all rows, the result is FALSE. Otherwise the result is UNKNOWN.
HyperSQL supports an extension to the SQL Standard to allow an
array to be used in the <in predicate value>. This is intended to
be used with prepared statements where a variable length array of values
can be used as the parameter value for each call. The example below
shows how this is used in SQL. The JDBC code must create a new
java.sql.Array
object that contains the values and
set the parameter with this array.
SELECT * FROM customer WHERE firstname IN ( UNNEST(?) ) Connection conn; PreparedStatement ps; // conn and ps are instantiated here Array arr = conn.createArrayOf("INTEGER", new Integer[] {1, 2, 3}); ps.setArray(1, arr); ResultSet rs = ps.executeQuery();
LIKE
like predicate
<like predicate> ::= <character like
predicate> | <octet like predicate>
<character like predicate> ::= <row value
predicand> [ NOT ] LIKE <character pattern> [ ESCAPE <escape
character> ]
<character pattern> ::= <character value
expression>
<escape character> ::= <character value
expression>
<octet like predicate> ::= <row value
predicand> [ NOT ] LIKE <octet pattern> [ ESCAPE <escape
octet> ]
<octet pattern> ::= <binary value
expression>
<escape octet> ::= <binary value
expression>
Specify a pattern-match comparison for character or binary
strings. The <row value predicand>
is always a
<string value expression>
of character or
binary type. The <character pattern>
or
<octet pattern>
is a <string value
expression>
in which the underscore and percent characters
have special meanings. The underscore means match any one character,
while the percent means match a sequence of zero or more characters. The
<escape character>
or <escape
octet>
is also a <string value
expression>
that evaluates to a string of exactly one
character length. If the underscore or the percent is required as normal
characters in the pattern, the specified <escape
character>
or <escape octet>
can
be used in the pattern before the underscore or the percent. The
<row value predicand>
is compared with the
<character pattern>
and the result of
comparison is returned. If any of the expressions in the predicate
evaluates to NULL, the result of the predicate is UNKNOWN. The
expression A NOT LIKE B
is equivalent to NOT
(A LIKE B)
. If the length of the escape is not 1 or it is used
in the pattern not immediately before an underscore or a percent
character, an exception is raised.
IS NULL
null predicate
<null predicate> ::= <row value predicand>
IS [ NOT ] NULL
Specify a test for a null value. The expression X IS
NOT NULL
is NOT equivalent to NOT (X IS
NULL)
if the degree of the <row value
predicand>
is larger than 1. The rules are: If all fields
are null, X IS NULL
is TRUE and X IS NOT
NULL
is FALSE. If only some fields are null, both X
IS NULL
and X IS NOT NULL
are FALSE. If all
fields are not null, X IS NULL
is FALSE and
X IS NOT NULL
is TRUE.
ALL and ANY
quantified comparison predicate
<quantified comparison predicate> ::= <row
value predicand> <comp op> <quantifier> <table
subquery>
<quantifier> ::= <all> |
<some>
<all> ::= ALL
<some> ::= SOME | ANY
Specify a quantified comparison. For a quantified comparison,
the <row value predicand>
is compared one by
one with each row of the <table sub
query>
.
If the <table subquery>
returns no
rows, then if ALL
is specified the result is TRUE,
but if SOME
or ANY
is specified
the result is FALSE.
If ALL
is specified, if the comparison is
TRUE for all rows, the result of the predicate is TRUE. If the
comparison is FALSE for at least one row, the result is FALSE. Otherwise
the result is UNKNOWN.
If SOME
or ANY
is
specified, if the comparison is TRUE for at least one row, the result is
TRUE. If the comparison is FALSE for all rows, the result is FALSE.
Otherwise the result is UNKNOWN. Note that the IN predicate is
equivalent to the SOME or ANY predicate using the <equals
operator>
.
In the examples below, the date of an invoice is compared to holidays in a given year. In the first example the invoice date must equal one of the holidays, in the second example it must be later than all holidays (later than the last holiday), in the third example it must be on or after some holiday (on or after the first holiday), and in the fourth example, it must be before all holidays (before the first holiday).
invoice_date = SOME (SELECT holiday_date FROM holidays) invoice_date > ALL (SELECT holiday_date FROM holidays) invoice_date >= ANY (SELECT holiday_date FROM holidays) invoice_date < ALL (SELECT holiday_date FROM holidays)
EXISTS
exists predicate
<exists predicate> ::= EXISTS <table
subquery>
Specify a test for a non-empty set. If the evaluation of
<table subquery>
results in one or more rows,
then the expression is TRUE, otherwise FALSE.
UNIQUE
unique predicate
<unique predicate> ::= UNIQUE <table
subquery>
Specify a test for the absence of duplicate rows. The result of
the test is either TRUE or FALSE (never UNKNOWN). The rows of the
<table subquery>
that contain one or more NULL
values are not considered for this test. If the rest of the rows are
distinct from each other, the result of the test is TRUE, otherwise it
is FALSE. The distinctness of rows X and Y is tested with the predicate
X IS DISTINCT FROM Y
.
MATCH
match predicate
<match predicate> ::= <row value
predicand> MATCH [ UNIQUE ] [ SIMPLE | PARTIAL | FULL ] <table
subquery>
Specify a test for matching rows. The default is MATCH SIMPLE without UNIQUE. The result of the test is either TRUE or FALSE (never UNKNOWN).
The interpretation of NULL values is different from other
predicates and quite counter-intuitive. If the <row value
predicand>
is NULL, or all of its fields are NULL, the
result is TRUE.
Otherwise, the <row value predicand>
is compared with each row of the <table
subquery>
.
If SIMPLE is specified, if some field of <row value
predicate>
is NULL, the result is TRUE. Otherwise if
<row value predicate>
is equal to one or more
rows of <table subquery>
the result is TRUE if
UNIQUE is not specified, or if UNIQUE is specified and only one row
matches. Otherwise the result is FALSE.
If PARTIAL is specified, if the non-null values
<row value predicate>
are equal to those in one
or more rows of <table subquery>
the result is
TRUE if UNIQUE is not specified, or if UNIQUE is specified and only one
row matches. Otherwise the result is FALSE.
If FULL is specified, if some field of <row value
predicate>
is NULL, the result is FALSE. Otherwise if
<row value predicate>
is equal to one or more
rows of <table subquery>
the result is TRUE if
UNIQUE is not specified, or if UNIQUE is specified and only one row
matches.
Note that MATCH can also be used in FOREIGN KEY constraint definitions. The exact meaning is described in the Schemas and Database Objects chapter.
CONTAINS
contains predicate
<contains predicate> ::= PERIOD <row value
predicand> CONTAINS PERIOD <row value
predicand>
Specify a test for two datetime periods. Each <row
value predicand>
must have two fields and the fields
together represent a datetime period. So the predicates is always in the
form PERIOD (X1, X2) CONTAINS PERIOD (Y1, Y2)
. Fields
in each period are always a datetime value of the same type (DATE or
TIMESTAMP).
All datetime values are converted to TIMESTAMP WITH TIME ZONE. The second datetime value must be after the first, otherwise a data error is returned.
If the second period is fully within the first period, the result is TRUE. Otherwise it is false.
If any of the values is NULL, the result is UNDEFINED.
EQUALS
equals predicate
<equals predicate> ::= PERIOD <row value
predicand> EQUALS PERIOD <row value
predicand>
Specify a test for two datetime periods. The conversions and checks are applied the same way as with the CONTAINS predicate. If the two periods have the same begin and end datetime values the result is TRUE. Otherwise it is false.
If any of the values is NULL, the result is UNDEFINED.
IS DISTINCT
is distinct predicate
<distinct predicate> ::= <row value
predicand> IS [ NOT ] DISTINCT FROM <row value
predicand>
Specify a test of whether two row values are distinct. The
result of the test is either TRUE or FALSE (never UNKNOWN). The
degree the two <row value
predicand>
must be the same. Each field of the first
<row value predicand>
is compared to the field
of the second <row value predicand>
at the same
position. If one field is NULL and the other is not NULL, or if the
elements are NOT equal, then the result of the expression is TRUE. If no
comparison result is TRUE, then the result of the predicate is FALSE.
The expression X IS NOT DISTINCT FROM Y
is equivalent
to NOT (X IS DISTINCT FORM Y)
. The following check
returns true if startdate is not equal to enddate. It also returns true
if either startdate or enddate is NULL. It returns false in other
cases.
startdate IS DISTINCT FROM enddate
OVERLAPS
overlaps predicate
<overlaps predicate> ::= <row value
predicand> OVERLAPS <row value predicand>
<overlaps predicate> ::= PERIOD <row value
predicand> OVERLAPS PERIOD <row value
predicand>
The OVERLAPS predicate tests for an overlap between two datetime periods. This predicate has two forms. The one without the PERIOD keywords is more relaxed in terms of valid periods.
If there is there is any overlap between the two datetime periods, the result is TRUE. Otherwise it is false.
If any of the values is NULL, the result is UNDEFINED.
In the example below, the period is compared with a week long period ending yesterday.
(startdate, enddate) OVERLAPS (CURRENT_DATE - 7 DAY, CURRENT_DATE - 1 DAY)
PRECEDES
precedes predicate
<precedes predicate> ::= PERIOD <row value
predicand> [ IMMEDIATELY] PRECEDES PERIOD <row value
predicand>
Specify a test for two datetime periods. The conversions and checks are applied the same way as with the CONTAINS predicate. If the second period begins after the end of the first period, the result is TRUE. Otherwise it is false.
If IMMEDIATELY is specified, the second period must follow immediately after the end of the first period. This means the end of the first period is the same point of time as the start of the second period.
If any of the values is NULL, the result is UNDEFINED.
SUCCEEDS
succeeds predicate
<succeeds predicate> ::= PERIOD <row value
predicand> [ IMMEDIATELY ] SUCCEEDS PEDIOD <row value
predicand>
Specify a test for two datetime periods with similar syntax to PRECEDES. If the first period begins after the end of the second period, the result is TRUE. Otherwise it is false.
If IMMEDIATELY is specified, the first period must follow immediately after the end of the second period.
If any of the values is NULL, the result is UNKNOWN.
The example below shows a predicate that returns TRUE.
PERIOD (CURRENT_DATE - 7 DAY, CURRENT_DATE) IMMEDIATELY PRECEDES (CURRENT_DATE, CURRENT_DATE + 7 DAY)
aggregate function
aggregate function
<aggregate function> ::= COUNT <left
paren> <asterisk> <right paren> [ <filter clause> ]
| <general set function> [ <filter clause> ] | <array
aggregate function> [ <filter clause> ]
<general set function> ::= <set function
type> <left paren> [ <set quantifier> ] <value
expression> <right paren>
<set function type> ::= <computational
operation>
<computational operation> ::= AVG | MAX | MIN |
ANY_VALUE | SUM | EVERY | ANY | SOME | COUNT | STDDEV_POP | STDDEV_SAMP
| VAR_SAMP | VAR_POP | MEDIAN
<set quantifier> ::= DISTINCT |
ALL
<filter clause> ::= FILTER <left paren>
WHERE <search condition> <right paren>
<array aggregate function> ::= ARRAY_AGG
<left paren> [ <set quantifier> ] <value expression> [
<order by clause> ] <right paren>
<listagg set function> ::= LISTAGG <left
paren> [ <set quantifier> ] <value expression>
<comma> <separator> [ <fetch first clause> ] [
<listagg overflow clause> ] <right paren> <within group
specification>
<listagg overflow clause> ::= ON OVERFLOW
<overflow behavior>
<overflow behavior> ::= ERROR | TRUNCATE [
<listagg truncation filler> ] [ <listagg count indication>
]
<listagg truncation filler> ::= <character
string literal>
<listagg count indication> ::= WITH COUNT |
WITHOUT COUNT
<group concat function> ::= GROUP_CONCAT
<left paren> [ <set quantifier> ] <value expression> [
<order by clause> ] [ SEPARATOR <separator> ] <right
paren>
<string agg function> ::= STRING_AGG <left
paren> [ <set quantifier> ] <value expression> [
<order by clause> ] [ <comma> <separator> ] <right
paren> [ <within group specification> ]
<separator> ::= <character string
literal>
<within group specification> ::= WITHIN GROUP
<left paren> ORDER BY <sort specification list> <right
paren>
Specify a value computed from a collection of rows.
An aggregate function is used exclusively in a
<query specification>
and its use transforms a
normal query into an aggregate query returning a single row instead of
the multiple rows that the original query returns. For example,
SELECT acolumn <table expression>
is a query
that returns the value of acolumn
for all the rows
the satisfy the given condition. But SELECT MAX(acolumn)
<table expression>
returns only one row, containing the
largest value in that column. The query SELECT COUNT(*)
<table expression>
returns the count of rows, while
SELECT COUNT(acolumn) <table expression>
returns the count of rows where acolumn IS NOT
NULL
.
If the <table expression>
is a grouped
table (has a GROUP BY
clause), the aggregate function
returns the result of the COUNT
or
<computational operation>
for each group. In
this case the result has the same number of rows as the original grouped
query. For example, SELECT SUM(acolumn) <table
expression>
when <table
expression>
has a GROUP BY
clause,
returns the sum of values for acolumn
in each
group.
If all values are NULL, the aggregate function (except COUNT) returns NULL.
The SUM operations can be performed on numeric and interval expressions only. AVG and MEDIAN can be performed on numeric, interval or datetime expressions. AVG returns the average value, while SUM returns the sum of all values. MEDIAN returns the middle value in the sorted list of values.
MAX and MIN can be performed on all types of expressions and return the minimum or the maximum value.
ANY_VALUE returns a value from a single row of the original query. The filter condition can be used to limit the eligible values.
COUNT(*)
returns the count of all values,
including nulls, while COUNT(<value
expression>)
returns the count of non-NULL values. COUNT
with DISTINCT also accepts multiple arguments. In this usage the
distinct combinations of the arguments are counted. Examples
below:
SELECT COUNT(DISTINCT firstname, lastname) FROM customer SELECT COUNT(DISTINCT (firstname, lastname)) FROM customer
The EVERY, ANY and SOME operations can be performed on boolean expressions only. EVERY returns TRUE if all the values are TRUE, otherwise FALSE. ANY and SOME are the same operation and return TRUE if one of the values is TRUE, otherwise it returns FALSE.
The other operations perform the statistical functions STDDEV_POP, STDDEV_SAMP, VAR_SAMP, VAR_POP on numeric values. NULL values are ignored in calculations.
User-defined aggregate functions can be defined and used instead of the built-in aggregate functions. Syntax and examples are given in the SQL-Invoked Routines chapter.
The <filter clause>
allows you to add a
search condition. When the search condition evaluates to TRUE for a row,
the row is included in aggregation. Otherwise the row is not included.
In the example below a single query returns two different filtered
counts:
SELECT COUNT(ITEM) FILTER (WHERE GENDER = 'F') AS "FEMALE COUNT", COUNT(ITEM) FILTER (WHERE GENDER = 'M') AS "MALE COUNT" FROM PEOPLE
ARRAY_AGG is different from all other aggregate functions as it
does not ignore the NULL values. This set function returns an array that
contains all the values, for different rows, of the <value
expression>
. For example, if the <value
expression>
is a column reference, the SUM function adds
the values for all the row together, while the ARRAY_AGG function adds
the value for each row as a separate element of the array. ARRAY_AGG can
include an optional <order by clause>
which can
reference all the available columns of the query, not just the
<value expression>
that is used as the
ARRAY_AGG argument. The <order by clause>
can
have multiple elements (columns) and each element can include NULLS LAST
or DESC qualifiers. The elements of the returned array are sorted
according to the <order by clause>
,
LISTAGG, GROUP_CONCAT and STRING_AGG are equivalent specialised
functions derived from ARRAY_AGG. These functions compute the array in
the same way as ARRAY_AGG, remove all the NULL elements, then return a
string that is a concatenation of the elements of the array. If
<separator>
has been specified, it is used to
separate the elements of the array, otherwise the comma is used to
separate the elements.
LISTAGG is an SQL:2023 function and more flexible than the
other two functions. HyperSQL extends the Standard syntax by allowing an
optional <fetch first clause>
to limit the
number of elements. The <within group
specification>
determines the sort order. The optional
<listagg overflow clause>
allows an error to be
returned or truncates the returned string. The optional
<listagg truncation filler>
is appended to a
truncated result. The optional <listagg count
indication>
can specify that the count of truncated
elements is appended to a truncated result.
The example below shows a grouped query with LISTAGG, ARRAY_AGG and GROUP_CONCAT. The CUSTOMER table that is included in the DatabaseManagerSwing GUI app is the source of the data.
SELECT LASTNAME, ARRAY_AGG(FIRSTNAME ORDER BY FIRSTNAME) FROM Customer GROUP BY LASTNAME LASTNAME C2 --------- ---------------------------------------------------------- Steel ARRAY['John','John','Laura','Robert'] King ARRAY['George','George','James','Julia','Robert','Robert'] Sommer ARRAY['Janet','Robert'] SELECT LASTNAME, LISTAGG(DISTINCT FIRSTNAME, '; ' FETCH 3 ROWS ONLY ON OVERFLOW TRUNCATE WITH COUNT) WITHIN GROUP (ORDER BY FIRSTNAME DESC) FROM Customer GROUP BY LASTNAME LASTNAME C2 --------- ------------------------------------------------- Steel Robert; Laura; John King Robert; Julia; James; ...(1) Sommer Robert; Janet SELECT LASTNAME, GROUP_CONCAT(DISTINCT FIRSTNAME ORDER BY FIRSTNAME DESC SEPARATOR ' * ') FROM Customer GROUP BY LASTNAME LASTNAME C2 --------- ------------------------------------------------- Steel Robert * Laura * John King Robert * Julia * James * George Sommer Robert * Janet
search condition
search condition
<search condition> ::= <boolean value
expression>
Specify a condition that is TRUE, FALSE, or UNKNOWN. A search condition is often a predicate.
PATH
path specification
<path specification> ::= PATH <schema name
list>
<schema name list> ::= <schema name> [ {
<comma> <schema name> }... ]
Specify an order for searching for a user-defined SQL-invoked routine. This is not currently supported by HyperSQL.
routine invocation
routine invocation
<routine invocation> ::= <routine name>
<SQL argument list>
<routine name> ::= [ <schema name>
<period> ] <qualified identifier>
<SQL argument list> ::= <left paren> [
<SQL argument> [ { <comma> <SQL argument> }... ] ]
<right paren>
<SQL argument> ::= <value expression> |
<target specification>
Invoke an SQL-invoked routine. Examples are given in the SQL-Invoked Routines chapter.
COLLATE
collate clause
<collate clause> ::= COLLATE <collation
name>
Specify a collation for a column or for an ORDER BY expression. This collation is used for comparing the values of the column in different rows. Comparison can happen during the execution of SELECT, UPDATE or DELETE statements, when a UNIQUE constraint or index is defined on the column, or when the rows are sorted by an ORDER BY clause.
CONSTRAINT
constraint name definition
<constraint name definition> ::= CONSTRAINT
<constraint name>
<constraint characteristics> ::= <constraint
check time> [ [ NOT ] DEFERRABLE ] | [ NOT ] DEFERRABLE [
<constraint check time> ]
<constraint check time> ::= INITIALLY DEFERRED |
INITIALLY IMMEDIATE
Specify the name of a constraint and its characteristics. The deferrable characteristic is an optional element of CONSTRAINT definition, not yet supported by HyperSQL.
HyperSQL fully supports all of SQL-92 data access statements, plus most of the additions from SQL:2011. Due to time constraints, the current version of this Guide does not cover the subject fully. You are advised to consult an SQL book such as the O'Reilly title, "SQL and Relational Theory" by C. J. Date.
Database queries are data access statements. The most commonly used data access statement is the SELECT statement, but there are other statements that perform a similar role. Data access statements access tables and return result tables. The returned result tables are falsely called result sets, as they are not necessarily sets of rows, but multisets of rows.
Result tables are formed by performing the following operations on base tables and views. These operations are loosely based on Relational Algebra.
JOIN operations
SET and MULTISET operations
SELECTION
PROJECTION
COMPUTING
COLUMN NAMING
GROUPING and AGGREGATION
SELECTION AFTER GROUPING OR AGGREGATION
SET and MULTISET (COLLECTION) OPERATIONS
ORDERING
SLICING
Conceptually, the operations are performed one by one in the above order if they apply to the given data access statement. In the example below a simple select statement is made more complex by adding various operations.
CREATE TABLE atable (a INT, b INT, c INT, d INT, e INT, f INT); /* in the next SELECT, no join is performed and no further operation takes place */ SELECT * FROM atable /* in the next SELECT, selection is performed by the WHERE clause, with no further action */ SELECT * FROM atable WHERE a + b = c /* in the next SELECT, projection is performed after the other operations */ SELECT d, e, f FROM atable WHERE a + b = c /* in the next SELECT, computation is performed after projection */ SELECT (d + e) / f FROM atable WHERE a + b = c /* in the next two SELECT statements, column naming is performed in different ways*/ SELECT (a + e) / f AS calc, f AS div FROM atable WHERE a + b = c SELECT dcol, ecol, fcol FROM atable(acol, bcol, ccol, dcol, ecol, fcol) WHERE acol + bcol = ccol /* in the next SELECT, both grouping and aggregation is performed */ SELECT d, e, SUM(f) FROM atable GROUP BY d, e /* in the next SELECT, selection is performed after grouping and aggregation is performed */ SELECT d, e, SUM(f) FROM atable GROUP BY d, e HAVING SUM(f) > 10 /* in the next SELECT, a UNION is performed on two selects from the same table */ SELECT d, e, f FROM atable WHERE d = 3 UNION SELECT a, b, c FROM atable WHERE a = 30 /* in the next SELECT, ordering is performed */ SELECT (a + e) / f AS calc, f AS div FROM atable WHERE a + b = c ORDER BY calc DESC, div NULLS LAST /* in the next SELECT, slicing is performed after ordering */ SELECT * FROM atable WHERE a + b = c ORDER BY a FETCH 5 ROWS ONLY
The following sections discuss various types of tables and operations involved in data access statements.
The SELECT statement itself does not cover all types of data
access statements, which may combine multiple SELECT statements. The
<query specification>
is the most common data
access statement and begins with the SELECT keyword.
SELECT STATEMENT
select statement (general)
Users generally refer to the SELECT statement when they mean a
<query specification>
or <query
expression>
. If a statement begins with SELECT and has no
UNION or other set operations, then it is a <query
specification>
. Otherwise it is a <query
expression>
.
In data access statements, a table can be a database table (or
view) or an ephemeral table formed for the duration of the query. Some
types of table are <table primary>
and can
participate in joins without the use of extra parentheses. The BNF in
the Table Primary section below lists different types of <table
primary>:
Tables can also be formed by specifying the values that are contained in them:
<table value constructor> ::= VALUES <row
value expression list>
<row value expression list> ::= <table row
value expression> [ { <comma> <table row value
expression> }... ]
In the example below a table with two rows and 3 columns is constructed out of some values:
VALUES (12, 14, null), (10, 11, CURRENT_DATE)
When a table is used directly in a UNION or similar operation, the keyword TABLE is used with the name:
<explicit table> ::= TABLE <table or query
name>
In the examples below, all rows of the two tables are included in the union. The keyword TABLE is used in the first example. The two examples below are equivalent.
TABLE atable UNION TABLE anothertable SELECT * FROM atable UNION SELECT * FROM anothertable
A subquery is simply a query expression in brackets. A query expression is usually a complete SELECT statement and is discussed in the rest of this chapter. A scalar subquery returns one row with one column. A row subquery returns one row with one or more columns. A table subquery returns zero or more rows with one or more columns. The distinction between different forms of subquery is syntactic. Different forms are allowed in different contexts. If a scalar subquery or a row subquery return more than one row, an exception is raised. If a scalar or row subquery returns no row, it is usually treated as returning a NULL. Depending on the context, this has different consequences.
<scalar subquery> ::= <subquery>
<row subquery> ::= <subquery>
<table subquery> ::= <subquery>
<subquery> ::= <left paren> <query
expression> <right paren>
A query specification is also known as a SELECT statement. It is
the most common form of <derived table>
. A
<table expression>
is a base table, a view or
any form of allowed derived table. The SELECT statement performs
projection, naming, computing, or aggregation on the rows of the
<table expression>
.
<query specification> ::= SELECT [ DISTINCT |
ALL ] <select list> <table expression>
<select list> ::= <asterisk> | <select
sublist> [ { <comma> <select sublist> }... ]
<select sublist> ::= <derived column> |
<qualified asterisk>
<qualified asterisk> ::= <asterisked
identifier chain> <period> <asterisk>
<asterisked identifier chain> ::= <asterisked
identifier> [ { <period> <asterisked identifier> }... ]
<asterisked identifier> ::=
<identifier>
<derived column> ::= <value expression> [
<as clause> ]
<as clause> ::= [ AS ] <column name>
The qualifier DISTINCT or ALL apply to the results of the SELECT statement after all other operations have been performed. ALL simply returns the rows, while DISTINCT compares the rows and removes the duplicate ones.
Projection is performed by the <select
list>
.
A single <asterisk>
means all columns of
the <table expression>
are included, in the
same order as they appear in the <table
expression>
. An asterisk qualified by a table name means
all the columns of the qualifier table name are included. If an
unqualified asterisk is used, then no other items are allowed in the
<select list>
. When the <table
expression>
is the direct result of NATURAL or USING joins,
the use of <asterisk>
includes the columns used
for the join before the other columns. A qualified asterisk does not
cover the join columns.
A derived column is a <value expression>
,
optionally named with the <as clause>
. A
<value expression>
can be many things. Common
types include: the name of a column in the <table
expression>
; an expression based on different columns or
constant values; a function call; an aggregate function; a CASE WHEN
expression.
A table expression is part of the SELECT statement and consists of the FROM clause with optional other clauses that performs selection (of rows) and grouping from the table(s) in the FROM clause.
<table expression> ::= <from clause> [
<where clause> ] [ <group by clause> ] [ <having
clause> ]
<from clause> ::= FROM <table reference> [ {
<comma> <table reference> }... ]
<table reference> ::= <table primary> |
<joined table>
<table primary> ::= <table or query name>
[ <query system time period specification> ] [ [ AS ]
<correlation name> [ <left paren> <derived column
list> <right paren> ] ]
| <derived table> [ AS ] <correlation
name> [ <left paren> <derived column list> <right
paren> ]
| <lateral derived table> [ AS ] <correlation
name> [ <left paren> <derived column list> <right
paren> ]
| <collection derived table> [ AS ]
<correlation name> [ <left paren> <derived column
list> <right paren> ]
| <table function derived table> [ AS ]
<correlation name> [ <left paren> <derived column
list> <right paren> ]
| <parenthesized joined table> [ AS ]
<correlation name> [ <left paren> <derived column
list> <right paren> ]
<where clause> ::= WHERE <boolean value
expression>
<group by clause> ::= GROUP BY [ <set
quantifier> ] <grouping element> [ { <comma> <grouping
element> }... ]
<having clause> ::= HAVING <boolean value
expression>
<query system time period specification> ::= FOR
SYSTEM_TIME AS OF <point in time 1>
| FOR SYSTEM_TIME BETWEEN [ SYMMETRIC ] <point in
time 1> AND <point in time 2>
| FOR SYSTEM_TIME FROM <point in time 1> TO
<point in time 2>
The <from clause>
contains one or more
<table reference>
separated by commas. A table
reference is often a table or view name or a joined table.
The <where clause>
filters the rows of
the table in the <from clause> and removes the rows for which the
search condition is not TRUE.
Table primary refers to different forms of table reference in the FROM clause.
The simplest form of reference is simply a name. This is the name of a table, a view, a transition table in a trigger definition, or a query name specified in the WITH clause of a query expression.
<table or query name> ::= <table name> |
<transition table name> | <query name>
The <query system time period
specification>
can be used after the name of a
system-versioned table to query historic data in the table. Without
this clause, only the current rows of the table are returned and
historic rows are ignored. The first example below shows a list of
customers as of a year ago. The second example also shows any changes
made to the email column over the previous year.
SELECT firstname, lastname, email FROM customer FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP - 1 YEAR SELECT DISTINCT firstname, lastname, email FROM customer FOR SYSTEM_TIME FROM CURRENT_TIMESTAMP - 2 YEAR TO CURRENT_TIMESTAMP - 1 YEAR
derived table
A query expression that is enclosed in parentheses and returns
from zero to many rows is a <table subquery>
.
In a <derived table>
the query expression is
self contained and cannot reference the columns of other table
references. This is the traditional and most common form of use of a
<table subquery>
.
<derived table> ::= <table
subquery>
LATERAL
When the word LATERAL is used before a <table
subquery>
, it means the query expression can reference
the columns of other table references that go before it.
<lateral derived table> ::= LATERAL <table
subquery>
The use of <lateral derived table>
completely transforms the way a query is written. For example, the two
queries below are equivalent, but with different forms. The query with
LATERAL is evaluated separately for each row of the first table that
satisfies the WHERE condition. The example below uses the tables that
are created and populated in DatabaseManagerSwing with the "Insert
test data" menu option. The first query uses a scalar subquery to
compute the sum of invoice values for each customer. The second query
is equivalent and uses a join with a LATERAL table.
SELECT firstname, lastname, (SELECT SUM(total) FROM invoice WHERE customerid = customer.id) s FROM customer SELECT firstname, lastname, a.c FROM customer, LATERAL(SELECT SUM(total) FROM invoice WHERE customerid = customer.id) a (c)
UNNEST
UNNEST is similar to LATERAL, but instead of a query expression, one or more expressions that return an array are used. These expressions are converted into a table which has one column for each expression and contains the elements of the array. If WITH ORDINALITY is used, an extra column that contains the index of each element is added to this table. The number of rows in the table equals the length of the largest arrays. The smaller arrays are padded with NULL values. If an <array value expression> evaluates to NULL, an empty array is used in its place. The array expression can contain references to any column of the table references preceding the current table reference.
<collection derived table> ::= UNNEST <left
paren> <array value expression>, ... <right paren> [
WITH ORDINALITY ]
The <array value expression>
can be the
result of a function call. If the arguments of the function call are
values from the tables on the left of the UNNEST, then the function is
called for each row of table.
In the first example below, UNNEST is used with the built in-function SEQUENCE_ARRAY to build a table containing dates for the last seven days and their ordinal position. In the second example, a select statement returns costs for the last seven days . In the third example, the WITH clause turns the two selects into named subqueries which are used in a SELECT statement that uses a LEFT join.
SELECT * FROM UNNEST(SEQUENCE_ARRAY(CURRENT_DATE - 7 DAY, CURRENT_DATE - 1 DAY, 1 DAY)) WITH ORDINALITY AS T(D, I) D I ---------- - 2010-07-25 1 2010-07-26 2 2010-07-27 3 2010-07-28 4 2010-07-29 5 2010-07-30 6 2010-07-31 7 CREATE TABLE expenses (item_date DATE, cost DECIMAL(8,2)) -- SELECT item_date, SUM(cost) AS total FROM expenses WHERE item_date >= CURRENT_DATE - 7 DAY GROUP BY item_date ITEM_DATE TOTAL ---------- ------ 2010-07-27 100.12 2010-07-29 50.45 WITH costs(item_date, total) AS (SELECT item_date, SUM(cost) FROM expenses WHERE item_date >= CURRENT_DATE - 7 DAY GROUP BY item_date), dates(d, i) AS (SELECT * FROM UNNEST(SEQUENCE_ARRAY(CURRENT_DATE - 7 DAY, CURRENT_DATE - 1 DAY, 1 DAY)) WITH ORDINALITY) SELECT i, d, total FROM dates LEFT OUTER JOIN costs ON dates.d = costs.item_date I D TOTAL - ---------- ------ 1 2010-07-25 (null) 2 2010-07-26 (null) 3 2010-07-27 100.12 4 2010-07-28 (null) 5 2010-07-29 50.45 6 2010-07-30 (null) 7 2010-07-31 (null)
Table Function Derived Table
When TABLE is used in this context, the <collection
value expression>
must be the result of a function call
to a built-in function or user-defined function that returns an array
or a table. When the function returns an array, this array is
converted into a table, similar to the way UNNEST operates. When the
function returns a table, the result is a MULTISET and is used as
is.
<table function derived table> ::= TABLE <left
paren> <collection value expression> <right
paren>
A parenthesized joined table is simply a joined table contained in parentheses. Joined tables are discussed below.
<parenthesized joined table> ::= <left
paren> <parenthesized joined table> <right paren> |
<left paren> <joined table> <right
paren>
Joins are operators with two table as the operands, resulting in a third table, called joined table. All join operators are evaluated left to right, therefore, with multiple joins, the table resulting from the first join operator becomes an operand of the next join operator. Parentheses can be used to group sequences of joined tables and change the evaluation order. So if more than two tables are joined together with join operators, the end result is also a joined table. There are different types of join, each producing the result table in a different way.
<joined table> ::= <cross join> |
<qualified join> | <natural join>
<cross join> ::= <table reference> CROSS JOIN
<table factor>
<qualified join> ::= <table reference> | [
<join type> ] JOIN <table reference> <join
specification>
<natural join> ::= <table reference> NATURAL
[ <join type> ] JOIN <table factor>
<join specification> ::= <join condition> |
<named columns join>
<join condition> ::= ON <search
condition>
<named columns join> ::= USING <left paren>
<join column list> <right paren>
<join type> ::= INNER | <outer join type> [
OUTER ]
<outer join type> ::= LEFT | RIGHT |
FULL
<join column list> ::= <column name
list>
CROSS JOIN
The simplest form of join is CROSS JOIN. The CROSS JOIN of two tables is a table that has all the columns of the first table, followed by all the columns of the second table, in the original order. Each row of the first table is combined with each row of the second table to fill the rows of the new table. If the rows of each table form a set, then the rows of the CROSS JOIN table form the Cartesian product of the rows of the two table operands.
Conditions are not allowed as part of a cross join, which is
simply A CROSS JOIN B
. Any conditions in a WHERE
clause are later applied to the table resulting from the cross
join.
Tables in the FROM CLAUSE separated with commas, are equivalent to
cross joins between the tables. Two joined tables separated with a
comma, such as A, B
, is equivalent to (A) CROSS JOIN
(B), which means the joined tables A and B are populated separately
before they are joined.
CROSS JOIN is not is not generally very useful, as it returns large result tables unless WHERE conditions are used.
UNION JOIN
The UNION JOIN has limited use in queries. The result table has
the same columns as that of CROSS JOIN. Each row of the first table is
extended to the right with nulls and added to the new table. Each row of
the second table is extended to the left with nulls and added to the new
table. The UNION JOIN is expressed as A UNION JOIN B
.
This should not be confused with A UNION B
, which is
a set operation. Union join is for special applications and is not
commonly used.
JOIN ... ON
The condition join is similar to CROSS JOIN, but a condition is
tested for each row of the new table and the row is created only if the
condition is true. This form of join is expressed as A JOIN B
ON (<search condition>)
.
Equijoin is a condition join in which the search condition is an equality condition between on or more pairs of columns from the two table. Equijoin is the most commonly used type of join.
SELECT a.*, b.* FROM a INNER JOIN b ON a.col_one = b.col_two
JOIN ... USING
NATURAL JOIN
Joins with USING or NATURAL keywords are similar to an equijoin
but they cannot be replaced simply with an equijoin. The new table is
formed with the specified or implied shared columns of the two tables,
followed by the rest of the columns from each table. In NATURAL JOIN,
the shared columns are all the column pairs that have the same name in
the first and second table. In JOIN USING, only columns names that are
specified by the USING clause are shared. The joins are expressed as
A NATURAL JOIN B
, and A JOIN B USING
(<comma separated column name list>)
.
The columns of the joined table are formed by the following procedures: In JOIN ... USING the shared columns are added to the joined table in the same order as they appear in the column name list. In NATURAL JOIN the shared columns are added to the joined table in the same order as they appear in the first table. In both forms of join, the non-shared columns of the first table are added in the order they appear in the first table, finally the non-shared columns of the second table are added in the order they appear in the second table.
The type of each shared column of the joined table is based on the type of the columns in the original tables. If the original types are not exactly the same, the type of the shared column is formed by type aggregation. Type aggregations selects a type that can represent values of both aggregated types. Simple type aggregation picks one of the types. For example, SMALLINT and INTEGER, results in INTEGER, or VARCHAR(10) and VARCHAR(20) results in VARCHAR(20). More complex type aggregation inherits properties from both types. For example DECIMAL(8) and DECIMAL (6,2) results in DECIMAL (8,2).
In the examples below, the rows are joined exactly the same way, but the first query contains a.col_two and b.col_two together with all the rest of the columns of both tables, while the second query returns only one copy of col_two.
SELECT * FROM a INNER JOIN b ON a.col_two = b.col_two SELECT * FROM a INNER JOIN b USING (col_two)
OUTER JOIN
LEFT, RIGHT and FULL OUTER JOIN
The three qualifiers can be added to all types of JOIN except
CROSS JOIN and UNION JOIN. First the new table is populated with the
rows from the original join. If LEFT is specified, all the rows from the
first table that did not make it into the new table are extended to the
right with nulls and added to the table. If RIGHT is specified, all the
rows from the second table that did not make it into the new table are
extended to the left with nulls and added to the table. If FULL is
specified, the addition of leftover rows is performed from both the
first and the second table. These forms are expressed by prefixing the
join specification with the given keyword. For example, A LEFT
OUTER JOIN B ON (<search condition>)
or A
NATURAL FULL OUTER JOIN B
or A FULL OUTER JOIN B
USING (<comma separated column name list>)
.
SELECT a.*, b.* FROM a LEFT OUTER JOIN b ON a.col_one = b.col_two
Despite the name, selection has nothing to do with the list of columns in a SELECT statement. In fact, it refers to the search condition used to limit the rows that from a result table (selection of rows, not columns). In SQL, simple selection is expressed with a WHERE condition appended to a single table or a joined table. In some cases, this method of selection is the only method available; for example in DELETE and UPDATE statements. But when it is possible to perform the selection with join conditions, this is the better method, as it results in a clearer expression of the query.
Projection is selection of the columns from a simple or joined table to form a result table. Explicit projection is performed in the SELECT statement by specifying the select column list. Some form of projection is also performed in JOIN ... USING and NATURAL JOIN.
The joined table has columns that are formed according to the rules mentioned above. But in many cases, not all the columns are necessary for the intended operation. If the statement is in the form, SELECT * FROM <joined table>, then all the columns of <joined table> are returned. But normally, the columns to be returned are specified after the SELECT keyword, separated from each other with commas.
In the select list, it is possible to use expressions that reference any columns of <joined table>. Each of these expressions forms a computed column. It is computed for each row of the result table, using the values of the columns of the <joined table> for that row.
Naming is used to hide the original names of tables or table columns and to replace them with new names in the scope of the query. Naming is also used for defining names for computed columns.
Without explicit naming, the name of a column is a predefined name. If the column is a column of a table, or is a named parameter, the name is of the table column or parameter is used. Otherwise it is generated by the database engine. HyperSQL generates column names such as C1, C2, etc. As generated naming is implementation defined according to the SQL Standard, it is better to explicitly name the computed and derived columns in your applications.
Naming in Joined Table
Naming is performed by adding a new name after a table's real name
and by adding a list of column names after the new table name. Both
table naming and column naming are optional, but table naming is
required for column naming. The expression A [AS] X (<comma
separated column name list>)
means table A is used in the
query expression as table X and its columns are named as in the given
list. The original name A, or its original column names, are not visible
in the scope of the query. The BNF is given below. The
<correlation name>
can be the same or different
from the name of the table. The <derived column
list>
is a comma separated list of column names. The degree
of this list must be equal to the degree of the table. The column names
in the list must be distinct. They can be the same or different from the
names of the table's columns.
<table or query name> [ [ AS ] <correlation
name> [ <left paren> <derived column list> <right
paren> ] ]
In the examples below, the columns of the original tables are named (a, b, c, d, e, f). The two queries are equivalent. In the second query, the table and its columns are renamed and the new names are used in the WHERE clauses:
CREATE TABLE atable (a INT, b INT, c INT, d INT, e INT, f INT); SELECT d, e, f FROM atable WHERE a + b = c SELECT x, y, z FROM atable AS t (u, v, w, x, y, z) WHERE u + v = w
Naming in Select List
Naming in the SELECT list logically takes place after naming in the joined table. The new names for columns are not visible in the immediate query expression or query expression. They become visible in the ORDER BY clause and in the result table that is returned to the user. Or if the query expression is used as a derived table in an enclosing query expression.
In the example below, the query is on the same table but with column renaming in the Select list. The new names are used in the ORDER BY clause:
SELECT x + y AS xysum, y + z AS yzsum FROM atable AS t (u, v, w, x, y, z) WHERE u + v = w ORDER BY xysum, yzsum
If the names xysum
or yzsum
are not used, the computed columns cannot be referenced in the ORDER BY
list.
Name Resolution
In a joined table, if a column name appears in tables on both sides then any reference to the name must use the table name in order to specify which table is being referred to.
Grouping Operations
Grouping results in the elimination of duplicate rows. A grouping operation is performed after the operations discussed above. A simple form of grouping is performed by the use of DISTINCT after SELECT. This eliminates all the duplicate rows (rows that have the same value in each of their columns when compared to another row). The other form of grouping is performed with the GROUP BY clause. This form is usually used together with aggregation.
GROUP BY
<group by clause> ::= GROUP BY [ <set
quantifier> ] <grouping element> [ { <comma> <grouping
element> }... ]
<grouping element> ::= <ordinary grouping
set> | <rollup list> | <cube list> | <grouping sets
specification> | <empty grouping set>
<ordinary grouping set> ::= <grouping column
reference> | <left paren> <grouping column reference
list> <right paren>
<grouping column reference list> ::=
<grouping column reference> [ { <comma> <grouping column
reference> }... ]
<grouping column reference> ::= <column
reference> [ <collate clause> ]
<rollup list> ::= ROLLUP <left paren>
<ordinary grouping set list> <right
paren>
<ordinary grouping set list> ::= <ordinary
grouping set> [ { <comma> <ordinary grouping set> }... ]
<cube list> ::= CUBE <left paren>
<ordinary grouping set list> <right paren>
<grouping sets specification> ::= GROUPING SETS
<left paren> <grouping set list> <right paren>
<grouping set list> ::= <grouping set> [ {
<comma> <grouping set> }... ]
<grouping set> ::= <ordinary grouping set>
| <rollup list> | <cube list> | <grouping sets
specification> | <empty grouping set>
<empty grouping set> ::= <left paren>
<right paren>
An ordinary <group by clause>
is a comma
separated list of columns of the table formed by the <from
clause>
or expressions based on the columns. This is the
most common usage and can be described as GROUP BY <column
reference> [ { <comma> <grouping column reference> }...
]
.
When a <group by clause>
is used, only
the columns used in the <group by clause>
or
expressions used there, can be used in the <select
list>
, together with any <aggregate
function>
on other columns. In other words, the column
names or expressions listed in the GROUP BY clause dictate what can be
used in the <select list>
. After the rows of
the table formed by the <from clause>
and the
<where clause>
are finalised, the grouping
operation groups together the rows that have the same values in the
columns of the <group by clause>
. Then any
<aggregate function>
in the <select
list>
is performed on each group, and for each group, a row
is formed that contains the values of the columns of the
<group by clause>
and the values returned from
each <aggregate function>
.
When the type of <column reference>
is
character string, the <collate clause>
can be
used to specify the collation used for grouping the rows. For example, a
collation that is not case sensitive can be used, or a collation for a
different language than the original collation of the column.
The first example below shows a simple GROUP BY, while in the second example, has a HAVING condition.
CREATE TABLE REVENUE(CHANNEL VARCHAR(20), YEAR INTEGER, COUNTRY VARCHAR(2), PROVINCE VARCHAR(20), SALES INTEGER); SELECT CHANNEL, YEAR, COUNTRY, SUM(SALES) FROM REVENUE GROUP BY CHANNEL, YEAR, COUNTRY; SELECT CHANNEL, YEAR, COUNTRY, SUM(SALES) FROM REVENUE GROUP BY CHANNEL, YEAR, COUNTRY HAVING SUM(SALES) > 50000;
An extended <group by clause>
may
comprise elements such as GROUPING SETS, ROLLUP, CUBE and the empty
grouping set. These syntax elements are expanded and then simplified
into a list of parenthesized column elements, which result in multiple
grouping operations. HyperSQL supports all the syntax listed above. The
example below uses ROLLUP for grouping.
SELECT CHANNEL, YEAR, COUNTRY, SUM(SALES) AS S FROM REVENUE GROUP BY ROLLUP(CHANNEL, YEAR, COUNTRY);
The ROLLUP is translated into 4 groupings: (channel, year, country), (channel, year), (channel), (). The result set will contain the rows as grouped by (channel, year, country), then rows as grouped by (channel, year) with the country column replaced by null, then rows as grouped by (channel) only, with year and country columns replaced by null, then a single row representing the () empty group with all three columns replaced by null.
CHANNEL YEAR COUNTRY S ------------ ------ ------- ------- INTERNET 2009 GB 25000 INTERNET 2009 US 275000 INTERNET 2010 GB 45000 INTERNET 2010 US 500000 DIRECT SALES 2009 GB 162000 DIRECT SALES 2009 US 1602500 DIRECT SALES 2010 GB 181000 DIRECT SALES 2010 US 1833000 INTERNET 2009 (null) 300000 INTERNET 2010 (null) 545000 DIRECT SALES 2009 (null) 1764500 DIRECT SALES 2010 (null) 2014000 INTERNET (null) (null) 845000 DIRECT SALES (null) (null) 3778500 (null) (null) (null) 4623500
If CUBE is used instead of ROLLUP, other permutations of the three columns are added to those produced by ROLLUP. These include (channel, country), (year, country), (year) and (country).
The optional <set quantifier> is either ALL or DISTINCT and defaults to ALL. When GROUPING SETS is used and multiple sets are specified and some duplicate groups are created as a result, use of DISTINCT eliminates the duplicate groups.
Note any ordering of the rows returned by GROUP BY in incidental. You need to use ORDER BY for the ordering you require.
HAVING
A <having clause>
filters the rows of the
table that is formed after applying the <group by
clause>
using its search condition. The search condition
must be an expression based on the expressions in the GROUP BY list or
the aggregate functions used.
DISTINCT
SELECT DISTINCT
When the keyword DISTINCT is used after SELECT, it works as a
shortcut replacement for a simple GROUP BY clause. The expressions in
the SELECT list are used directly as the <group by
clause>
. The following examples of SELECT DISTINCT and
SELECT with GROUP BY are equivalent.
SELECT DISTINCT d, e + f FROM atable WHERE a + b = c SELECT d, e + f FROM atable WHERE a + b = c GROUP BY d, e + f
Aggregation is an operation that computes a single value from the values of a column over several rows. The operation is performed with an aggregate function. The simplest form of aggregation is counting, performed by the COUNT function.
Other common aggregate functions return the maximum, minimum and average value among the values in different rows. Aggregate functions were discussed earlier in this chapter.
Set and Multiset Operations
While join operations generally result in laterally expanded tables, SET and COLLECTION operations are performed on two tables that have the same degree and result in a table of the same degree. The SET operations are UNION, INTERSECT and EXCEPT (difference). When each of these operations is performed on two tables, the collection of rows in each table and in the result is reduced to a set of rows, by eliminating duplicates. The set operations are then performed on the two tables, resulting in the new table which itself is a set of rows. Collection operations are similar but the tables are not reduced to sets before or after the operation and the result is not necessarily a set, but a collection of rows.
The set operations on two tables A and B are: A UNION
[DISTINCT] B
, A INTERSECT [DISTINCT] B
and
A EXCEPT [DISTINCT] B
. The result table is formed in
the following way: The UNION operation adds all the rows from A and B
into the new table, but avoids copying duplicate rows. The INTERSECT
operation copies only those rows from each table that also exist in the
other table, but avoids copying duplicate rows. The EXCEPT operation
copies those rows from the first table which do not exist in the second
table, but avoids copying duplicate rows.
The collection operations are similar to the set operations, but
can return duplicate rows. They are: A UNION ALL B
,
A INTERSECT ALL B
and A EXCEPT ALL
B
. The UNION ALL operation adds all the rows from A and B into
the new table. The INTERSECT operation copies only those rows from each
table that also exist in the other table. If n copies of a rows exists
in one table, and m copies in the other table, the number of copies in
the result table is the smaller of n and m. The EXCEPT operation copies
those rows from the first table which do not exist in the second table.
If n copies of a row exist in the first table and m copies in the second
table the number of copies in the result table is n-m, or if n < m,
then zero.
The optional WITH clause can be used in a query expression. The
WITH clause lists one or more named ephemeral tables that can be
referenced in the query expression body. The ephemeral tables are
created and populated before the rest of the query expression is
executed. Their contents do not change during the execution of the
<query expression body>
that follows the WITH
clause.
<with clause> ::= WITH [ RECURSIVE ] <with
list>
<with list> ::= <with list element> [ {
<comma> <with list element> }... ]
<with list element> ::= <query name> [
<left paren> <with column list> <right paren> ] AS
<left paren> <query expression> <right paren>
<with column list> ::= <column name
list>
An example of the use of the WITH clause is given above under
UNNEST. The <query expression>
in the WITH
clause is evaluated and its result table can be referenced in the body
of the main <query expression body>
using the
specified <query name>
.
When RECURSIVE is used, the <with column
list>
must be defined. The RECURSIVE keyword changes the
way the <with list>
is interpreted. The
<query expression>
contained in the
<with list element>
must be the UNION or UNION
ALL of two <query expression body> elements (VALUES or SELECT
statements). A working table is created and the left side SELECT of the
UNION is evaluated only once and its result is copied to the working
table. This result is also copied to the general result of the
<query expression>
. Iteration starts after this
step. In each iteration, the right side SELECT is evaluated. The
contents of the working table is used when the <query
name>
is referenced in the right side SELECT statement of
the UNION. The result of this SELECT is then added to the previous
general result of the <query expression>
with
UNION or UNION ALL. The working table is cleared and filled with the
latest result. These operations are repeated again and again, until the
latest result is empty and the general result of the <query
expression>
stops changing. The result of the
<with list element>
is now fully populated and
is later used in the execution of the <query expression
body>
that follows the WITH clause.
HyperSQL limits recursion to 256 rounds. If this is exceeded, an error is raised.
From version 2.6, HyperSQL extends recursive query processing by
allowing the use of RECURSIVE_TABLE
to reference the
current general result of the <query
expression>
. This table name can be used in subqueries with
an IN predicate in order to reduce and limit the new result created in
each iteration.
A trivial example of a recursive query is given below. Note the first column GEN. For example, if each row of the table represents a member of a family of dogs, together with its parent, the first column of the result indicates the calculated generation of each dog, ranging from first to fourth generation.
CREATE TABLE pptree (pid INT, id INT, name VARCHAR(10)); INSERT INTO pptree VALUES (NULL, 1, 'dizzi'),(1,2, 'fizzi'), (1,3, 'gizzi'),(2,4, 'kizzi'), (4,5, 'mizzi'),(3,6, 'pizzi'), (3,7, 'tizzi'); WITH RECURSIVE tree (gen, par, child, name) AS ( VALUES(1, CAST(null as int), 1, 'dizzi') UNION SELECT gen + 1, pid, id, name FROM pptree, tree WHERE pid = child ) SELECT * FROM tree; GEN PAR CHILD NAME --- ------ ----- ----- 1 (null) 1 dizzi 2 1 2 fizzi 2 1 3 gizzi 3 2 4 kizzi 3 3 6 pizzi 3 3 7 tizzi 4 4 5 mizzi
If recursive queries become complex, they also become very difficult to develop and debug. HyperSQL provides an alternative to this with user-defined SQL functions which return tables. Functions can perform any complex, repetitive task with better control, using loops, variables and, if necessary, recursion.
The query below computes the Fibonacci numbers up to 100 digits. The WHERE clause limits the iteration.
WITH RECURSIVE Fibonacci(N1, F1, N2, F2) AS ( VALUES(0, CAST(0 AS DECIMAL(100)), 1, CAST(1 AS DECIMAL(100)) UNION SELECT N1 + 1, F1 + F2, N1 +2, F1 + 2 * F2 FROM Fibonacci p WHERE N1 < 100 ) SELECT * FROM Fibonacci
A query expression consists of an optional WITH clause and a query expression body. The optional WITH clause lists one or more named ephemeral tables that can be referenced, just like the database tables in the query expression body.
<query expression> ::= [ <with clause> ]
<query expression body>
A query expression body refers to a table formed by using UNION and other set operations. The query expression body is evaluated from left to right and the INTERSECT operator has precedence over the UNION and EXCEPT operators. A simplified BNF is given below:
<query expression body> ::= <query term> |
<query expression body> UNION | EXCEPT [ ALL | DISTINCT ] [
<corresponding spec> ] <query term>
<query term> ::= <query primary> |
<query term> INTERSECT [ ALL | DISTINCT ] [ <corresponding
spec> ] <query term>
<query primary> ::= <simple table> |
<left paren> <query expression body> [ <order by
clause> ] [ <result offset clause> ] [ <fetch first
clause> ] <right paren>
<simple table> ::= <query specification> |
<table value constructor> | <explicit table> <explicit
table> ::= TABLE <table or query name>
<corresponding spec> ::= CORRESPONDING [ BY
<left paren> <column name list> <right paren>
]
A <query term>
and a <query
primary>
can be a SELECT statement, an
<explicit table>
, or a <table value
constructor>
.
The CORRESPONDING clause is optional. If it is not specified, then
the <query term>
and the <query
primary>
must have the same number of columns. If
CORRESPONDING is specified, the two sides need not have the same number
of columns. If no column list is used with CORRESPONDING, then all the
column names that are common in the tables on two sides are used in the
order in which they appear in the first table. If a columns list is
used, it allows you to select only some columns of the tables on the
left and right side to create the new table. In the example below the
columns named u and v from the two SELECT statements are used to create
the UNION table.
SELECT * FROM atable UNION CORRESPONDING BY (u, v) SELECT * FROM anothertable
The type of each column of the query expression is determined by combining the types of the corresponding columns from the two participating tables.
When the rows of the result table have been formed, it is possible to specify the order in which they are returned to the user. The ORDER BY clause is used to specify the columns used for ordering, and whether ascending or descending ordering is used. It can also specify whether NULL values are returned first or last.
SELECT x + y AS xysum, y + z AS yzsum FROM atable AS t (u, v, w, x, y, z) WHERE u + v = w ORDER BY xysum NULLS LAST, yzsum NULLS FIRST
The ORDER BY clause specifies one or more <value
expressions>
. The list of rows is sorted according to the
first <value expression>
. When some rows are
sorted equal then they are sorted according to the next
<value expression>
and so on.
<order by clause> ::= ORDER BY <sort
specification list>
<sort specification list> ::= <sort
specification> [ { <comma> <sort specification> }...
]
<sort specification> ::= <value expression> [
<collate clause> ] [ ASC | DESC ] [ NULLS FIRST | NULLS LAST
]
The defaults are ASC and NULLS FIRST. Two database properties SQL NULLS FIRST and SQL NULLS ORDER can be modified to change the default behaviour.
A collation is used for columns of an ORDER BY expression that are
of the type CHAR or VARCHAR. If a <collate
clause>
is not specified then the collation of the column,
or the default collation of the database is used.
The default collation for a database is ASCII, with lowercase letters sorted after all uppercase letters. The example below shows the effect of collation on an ordered list.
-- default collation collation for the database is ASCII SELECT id, lastname FROM customer ORDER BY lastname ID LASTNAME -- -------- 40 Clancy 36 King 35 White 6 king -- a language collation is used, it treats king and King as adjacent entries SELECT id, lastname FROM customer ORDER BY lastname COLLATE "English" ID LASTNAME -- -------- 40 Clancy 6 king 36 King 35 White
In the above example, if the LASTNAME column is
itself defined in the table definition with COLLATE
"English"
, then the COLLATE clause is not necessary in the
ORDER BY expression.
An ORDER BY operation can sometimes be optimised by the engine when it can use the same index for accessing the table data and ordering. Optimisation can happen both with DESC + NULLS LAST and ASC + NULLS FIRST.
sort specification list
sort specification list
<sort specification list> ::= <value
expression> [ASC | DESC] [NULLS FIRST | NULLS
LAST]
Specify a sort order. A sort operation is performed on the
result of a <query expression>
or
<query specification>
and sorts the result
according to one or more <value expression>
.
The <value expression>
is usually a single
column of the result, but in some cases it can be a column of the
<table expression>
that is not used in the
select list. The default is ASC and NULLS FIRST.
A different form of limiting the rows can be performed on the result table after it has been formed according to all the other operations (selection, grouping, ordering etc.). This is specified by the FETCH ... ROWS and OFFSET clauses of a SELECT statement. In this form, the specified OFFSET rows are removed from start of the table, then up to the specified FETCH rows are kept and the rest of the rows are discarded.
<result offset clause> ::= OFFSET <offset row
count> { ROW | ROWS }
<fetch first clause> ::= FETCH { FIRST | NEXT } [
<fetch first row count> ] { ROW | ROWS } ONLY [ USING INDEX
]
<limit clause> ::= LIMIT <fetch first row
count> [ USING INDEX ]
A slicing operation takes the result set that has been already processed and ordered. It then discards the specified number of rows from the start of the result set and returns the specified number of rows after the discarded rows. The <offset row count> and <fetch first row count> can be constants, dynamic variables, routine parameters, or routine variables. The type of the constants must be INTEGER.
SELECT a, b FROM atable WHERE d < 5 ORDER BY absum OFFSET 3 FETCH 2 ROWS ONLY SELECT a, b FROM atable WHERE d < 5 ORDER BY absum OFFSET 3 LIMIT 2 /* alternative keyword */
When the FETCH keyword is used, the specified number of rows must be at least 1, otherwise an error is returned. This behaviour is consistent with the SQL Standard. When the LIMIT keyword is used, the specified number of rows can be zero, which means return all rows (no LIMIT). In MySQL and PostgreSQL syntax modes, zero limit means no rows (empty result).
If there is an index on all the columns specified in the ORDER BY
clause, it is normally used for slicing. In some queries, an index on
another column may take precedence because it is used to process the
WHERE condition. It is possible to add USING INDEX
to
the end of the slicing clause to force the use of the index for ordering
and slicing, instead of the index for the WHERE condition.
A query expression, for example a SELECT statement, uses indexes for efficient data retrieval. The EXPLAIN PLAN statement lists the indexes used, together with other useful information about the query. EXPLAIN PLAN can also be used for data manipulation statements such as UPDATE.
EXPLAIN PLAN
explain plan
<explain plan> ::= EXPLAIN PLAN FOR <query
expression>
For example, EXPLAIN PLAN FOR SELECT * FROM REVENUE
WHERE COUNTRY = 'UK'
.
DELETE FROM
delete statement: searched
<delete statement: searched> ::= DELETE FROM
<target table> [ [ AS ] <correlation name> ] [ WHERE
<search condition> ][ LIMIT <fetch first row count>
]
Delete rows of a table. The search condition is a
<boolean value expression>
that is evaluated
for each row of the table. If the condition is true, the row is deleted.
If the condition is not specified, all the rows of the table are
deleted. In fact, an implicit SELECT is performed in the form of
SELECT * FROM <target table> [ WHERE <search
condition>]
and the selected rows are deleted. When used in
JDBC, the number of rows returned by the implicit SELECT is returned as
the update count.
If there are FOREIGN KEY constraints on other tables that reference the subject table, and the FOREIGN KEY constraints have referential actions, then rows from those other tables that reference the deleted rows are either deleted, or updated, according to the specified referential actions.
The LIMIT clause, or alternatively the ROWNUM() function in the WHERE clause, can be used to limit the number of rows that are deleted. This is useful when a very large number of rows needs to be deleted. In this situation, you can perform the operation is chunks and commit after each chunk to reduce memory usage and the total time of the operation.
In the second example below the rows that have the maximum value for column A are deleted;
DELETE FROM T WHERE C > 5 DELETE FROM T AS TT WHERE TT.A = (SELECT MAX(A) FROM T)
TRUNCATE TABLE
truncate table statement
<truncate table statement> ::= TRUNCATE TABLE
<target table> [ <identity column restart option> ] [
<truncate options> ]
<identity column restart option> ::= CONTINUE
IDENTITY | RESTART IDENTITY
<truncate options> ::= AND COMMIT [ NO CHECK
]
<truncate table versioning statement> ::=
TRUNCATE TABLE <target table> VERSIONING TO { TIMESTAMP'YYYY-MM-DD
HH:MM:SS' | CURRENT_TIMESTAMP }
Delete all rows of a table without firing its triggers. This
statement can only be used on base tables (not views). If the table is
referenced in a FOREIGN KEY constraint defined on another table, the
statement causes an exception. Triggers defined on the table are not
executed with this statement. The default for <identity
column restart option>
is CONTINUE
IDENTITY
. This means no change to the IDENTITY sequence of the
table. If RESTART IDENTITY
is specified, then the
sequence is reset to its start value.
TRUNCATE is faster than ordinary DELETE. The TRUNCATE statement is an SQL Standard data change statement; therefore it is performed under transaction control and can be rolled back if the connection is not in the auto-commit mode.
HyperSQL also supports the optional AND COMMIT and NO CHECK options. If AND COMMIT is used, then the transaction is committed with the execution of the TRUNCATE statement. The action cannot be rolled back. If the additional NO CHECK option is also specified, then the TRUNCATE statement is executed even if the table is referenced in a FOREIGN KEY constraint defined on another, non-empty table. This form of TRUNCATE is faster than the default form and does not use much memory.
The <truncate table versioning
statement>
is for removing old history rows from a
system-versioned table. All history rows that expired before the given
timestamp are removed. No current row is removed.
TRUNCATE SCHEMA
truncate schema statement
<truncate schema statement> ::= TRUNCATE SCHEMA
<target schema> [ <identity column restart option> ] AND
COMMIT [ NO CHECK ]
Performs the equivalent of a TRUNCATE TABLE ... AND COMMIT on all the table in the schema. If the additional NO CHECK option is also specified, then the TRUNCATE statement is executed even if any of the tables in the schema is referenced in a FOREIGN KEY constraint defined on a non-empty table in a different schema.
If RESTART IDENTITY is specified, all table IDENTITY sequences and all SEQUENCE objects in the schema are reset to their start values.
Use of this statement requires schema ownership or administrative privileges.
INSERT INTO
insert statement
<insert statement> ::= INSERT INTO <target
table> [ [ AS ] <correlation name> ] <insert columns and
source>
<insert columns and source> ::= <from
subquery> | <from constructor> | <from
default>
<from subquery> ::= [ <left paren>
<insert column list> <right paren> ] [ <override
clause> ] <query expression>
<from constructor> ::= [ <left paren>
<insert column list> <right paren> ] [ <override
clause> ] <contextually typed table value
constructor>
<override clause> ::= OVERRIDING USER VALUE |
OVERRIDING SYSTEM VALUE
<from default> ::= DEFAULT
VALUES
<insert column list> ::= <column name
list>
Insert new rows in a table. An INSERT statement inserts one or more rows into the table.
The special form, INSERT INTO <target table>
DEFAULT VALUES
can be used with tables which have a default
value for each column.
With the other forms of INSERT, the optional
(<insert column list>)
specifies to which
columns of the table the new values are assigned.
In one form, the inserted values are from a <query
expression>
and all the rows that are returned by the
<query expression>
are inserted into the table.
If the <query expression>
returns no rows,
nothing is inserted.
In the other form, a comma separated list of values called
<contextually typed table value constructor>
is
used to insert one or more rows into the table. This list is
contextually typed, because the keywords NULL and DEFAULT can be used
for the values that are assigned to each column of the table. In this
form, the keyword DEFAULT means the default value of the column and can
be used only if the target column has a default value or is an IDENTITY
or GENERATED column of the table.
The <override clause>
must be used
when a value is explicitly assigned to a column that has been defined as
GENERATED ALWAYS AS IDENTITY. The OVERRIDING SYSTEM VALUE clause must be
used to override the sequence value with the user-supplied values. For
columns defined as GENERATED BY DEFAULT AS IDENTITY, there is no need to
use OVERRIDING when the user provides values to be used for the insert.
The OVERRIDING USER VALUE clause can be used with all types of GENERATED
columns and means the values provided by the user are simply ignored and
new values generated by the system are used instead. Two examples of
table definition are given below.
CREATE TABLE t1 (id INTEGER GENERATED ALWAYS AS IDENTITY(START WITH 100), name VARCHAR(20) PRIMARY KEY) CREATE TABLE t2 (id INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 1) PRIMARY KEY, name VARCHAR(20))
In both examples below, the value inserted for the id column is 14. In the first example, it is necessary to use OVERRIDING SYSTEM VALUE when inserting into the id column of table t1 because the column has been defined as GENERATED ALWAYS. In the second example, no OVERRIDING clause is required for the insert into table t2 as its id column is defined as GENERATED BY DEFAULT. In both examples, if there is an existing row with that value as primary key, a constraint violation exception is thrown.
INSERT INTO t1 (id, name) OVERRIDING SYSTEM VALUE VALUES ( 14, 'Test Value') INSERT INTO t2 (id, name) VALUES ( 14, 'Test Value')
In the examples below, OVERRIDING USER VALUE is used to let the system generate values for the id column. The generated values override the value 14 in the first example, and the existing values for the id column in the table in the second example.
INSERT INTO t1 (id, name) OVERRIDING USER VALUE VALUES ( 14, 'Another Test Value') INSERT INTO t1 (id, name) OVERRIDING USER VALUE (SELECT * FROM t1)
An array can be inserted into a column of the array type by using literals, by specifying a parameter in a prepared statement, or by an existing array returned by a query expression. The last example below inserts an array.
The rows that are inserted into the table are checked against all the constraints that have been declared on the table. The whole INSERT operation fails if any row fails to inserted due to constraint violation. Examples:
CREATE TABLE T (A INTEGER GENERATED BY DEFAULT AS IDENTITY, B INTEGER DEFAULT 2) INSERT INTO T DEFAULT VALUES /* all columns of T have DEFAULT clauses */ INSERT INTO T (SELECT * FROM Z) /* table Z has the same columns as table T */ INSERT INTO T (A,B) VALUES ((1,2),(3,NULL), (DEFAULT,6)) /* three rows are inserted into table T */ ALTER TABLE T ADD COLUMN D VARCHAR(10) ARRAY /* an ARRAY column is added */ INSERT INTO T VALUES DEFAULT, 3, ARRAY['hot','cold']
If the table contains an IDENTITY column, the value for this
column for the last row inserted by a session can be retrieved using a
call to the IDENTITY() function. This call returns the last value
inserted by the calling session. When the insert statement is executed
with a JDBC Statement
or
PreparedStatement
method, the
getGeneratedKeys()
method of
Statement
can be used to retrieve not only the
IDENTITY column, but also any GENERATED computed column, or any other
column. The getGeneratedKeys()
returns a
ResultSet with one or more columns. This contains one row per inserted
row, and can therefore return all the generated columns for a multi-row
insert.
There are three methods of specifying which generated keys should be returned. The first method does not specify the columns of the table. With this method, the returned ResultSet will have a column for each column of the table that is defined as GENERATED ... AS IDENTITY or GENERATED ... AS (<expression>). The two other methods require the user to specify which columns should be returned, either by column indexes, or by column names. With these methods, there is no restriction on which columns of the inserted values to be returned. This is especially useful when some columns have a default clause which is a function, or when there are BEFORE triggers on the table that may provide the inserted value for some of the columns.
In MySQL syntax compatibility mode, HyperSQL supports INSERT IGNORE, REPLACE and ON DUPLICATE KEY UPDATE variations of the INSERT statement.
UPDATE
update statement: searched
<update statement: searched> ::= UPDATE
<target table> [ [ AS ] <correlation name> ] SET <set
clause list> [ WHERE <search condition> ][ LIMIT <fetch
first row count> ]
Update rows of a table. An UPDATE statement selects rows from
the <target table>
using an implicit SELECT
statement formed in the following manner:
SELECT * FROM <target table> [ [ AS ]
<correlation name> ] [ WHERE <search condition>
]
Then it applies the SET <set clause
list>
expression to each selected row.
If the implicit SELECT returns no rows, no update takes place. When used in JDBC, the number of rows returned by the implicit SELECT is returned as the update count.
If there are FOREIGN KEY constraints on other tables that reference the subject table, and the FOREIGN KEY constraints have referential actions, then rows from those other tables that reference the updated rows are updated, according to the specified referential actions.
The rows that are updated are checked against all the constraints that have been declared on the table. The whole UPDATE operation fails if any row violates any constraint.
The LIMIT clause, or alternatively the ROWNUM() function in the WHERE clause, can be used to limit the number of rows that are updated. This is useful when a very large number of rows needs to be updated. In this situation, you can perform the operation is chunks and commit after each chunk to reduce memory usage and the total time of the operation.
set clause list
set clause list
<set clause list> ::= <set clause> [ {
<comma> <set clause> }... ]
<set clause> ::= <multiple column
assignment> | <set target> <equals operator> <update
source>
<multiple column assignment> ::= <set target
list> <equals operator> <assigned
row>
<set target list> ::= <left paren> <set
target> [ { <comma> <set target> }... ] <right
paren>
<assigned row> ::= <contextually typed row
value expression>
<set target> ::= <column
name>
<update source> ::= <value expression> |
<contextually typed value specification>
Specify a list of assignments. This is used in UPDATE, MERGE and SET statements to assign values to a scalar or row target.
Apart from setting a whole target to a value, a SET statement can set individual elements of an array to new values. The last example below shows this form of assignment to the array in the column named B.
In the examples given below, UPDATE statements with single and multiple assignments are shown. Note in the third example, a SELECT statement is used to provide the update values for columns A and C, while the update value for column B is given separately. The SELECT statement must return exactly one row . In this example the SELECT statement refers to the existing value for column C in its search condition.
UPDATE T SET A = 5 WHERE ... UPDATE T SET (A, B) = (1, NULL) WHERE ... UPDATE T SET (A, C) = (SELECT X, Y FROM U WHERE Z = C), B = 10 WHERE ... UPDATE T SET A = 3, B[3] = 'warm'
MERGE INTO
merge statement
<merge statement> ::= MERGE INTO <target
table> [ [ AS ] <merge correlation name> ] USING <table
reference> ON <search condition> <merge operation
specification>
<merge correlation name> ::= <correlation
name>
<merge operation specification> ::= <merge
when clause>...
<merge when clause> ::= <merge when matched
clause> | <merge when not matched clause>
<merge when matched clause> ::= WHEN MATCHED [
AND <search condition> ] THEN <merge update or delete
specification>
<merge update or delete specification> ::=
<merge update specification> | <merge delete
specification>
<merge when not matched clause> ::= WHEN NOT
MATCHED [ AND <search condition> ] THEN <merge insert
specification>
<merge update specification> ::= UPDATE SET
<set clause list>
<merge delete specification> ::= DELETE
<merge insert specification> ::= INSERT [
<left paren> <insert column list> <right paren> ] [
<override clause> ] VALUES <merge insert value
list>
<merge insert value list> ::= <left paren>
<merge insert value element> [ { <comma> <merge insert
value element> }... ] <right paren>
<merge insert value element> ::= <value
expression> | <contextually typed value
specification>
Update rows, delete rows or insert new rows into the
<target table>
. The MERGE statement uses a
second table, specified by <table reference>
,
to determine the rows to be updated or inserted. It is possible to use
the statement only to update rows, to delete rows or to insert rows, but
usually both update and insert are specified.
The <search condition>
matches each
row of the <table reference>
with each row of
the <target table>
. If the two rows match then
the UPDATE clause is used to update the matching row of the target
table. Those rows of <table reference>
that
have no matching rows are then used to insert new rows into the
<target table>
. Therefore, a MERGE statement
can update or delete between 0 and all the rows of the
<target table>
and can insert between 0 and the
number of the rows in <table reference>
into
the <target table>
. If any row in the
<target table>
matches more than one row in
<table reference>
a cardinality error is
raised. On the other hand, several rows in the <target
table>
can match a single row in <table
reference>
without any error. The constraints and
referential actions specified on the database tables are enforced the
same way as for an update, a delete and an insert statement.
The optional <search condition>
in
each WHEN clause can be used to filter (reduce) the rows for the
particular action.
HyperSQL allows only one UPDATE, INSERT or DELETE operation in a MERGE statement. If both UPDATE and DELETE are used, the operations are performed in the order they appear in the MERGE statement. If the search conditions of both operations apply to the same row, only the first operation is performed.
The MERGE statement can be used with only the WHEN NOT MATCHED clause as a conditional INSERT statement that inserts a row if no existing rows match a condition.
In the first example below, the table originally contains two
rows for different furniture. The <table
reference>
is the (VALUES(1, 'conference table'),
(14, 'sofa'), (5, 'coffee table'))
expression, which evaluates
to a table with 3 rows. When the x value for a row matches an existing
row, then the existing row is updated. When the x value does not match,
the row is inserted. Therefore one row of table t is updated from
'dining table' to 'conference table', and two rows are inserted into
table t.
The second example uses a SELECT statement as the source of the values for the MERGE.
In the third example, a new row in inserted into the table only
when the primary key for the new row does not exist. This example uses
parameters and should be executed as a JDBC
PreparedStatement
. The parameter is cast as
INTEGER because the MERGE statement does not determine the types of
values in the USING clause.
In the fourth example, existing rows that match are deleted.
CREATE TABLE t (id INT PRIMARY KEY, description VARCHAR(100)) INSERT INTO t VALUES (1, 'dining table'), (2, 'deck chair') MERGE INTO t USING (VALUES(1, 'conference table'), (14, 'sofa'), (5, 'coffee table')) AS vals(x,y) ON t.id = vals.x WHEN MATCHED THEN UPDATE SET t.description = vals.y WHEN NOT MATCHED THEN INSERT VALUES vals.x, vals.y MERGE INTO t USING (SELECT * FROM tt WHERE acol = 2) AS vals(x,y) ON t.id = vals.x WHEN MATCHED THEN UPDATE SET t.description = vals.y WHEN NOT MATCHED THEN INSERT VALUES vals.x, vals.y MERGE INTO t USING (VALUES(CAST(? AS INT))) AS vals(x) ON t.id = vals.x WHEN NOT MATCHED THEN INSERT VALUES vals.x, ? MERGE INTO t USING (SELECT * FROM tt WHERE acol = 2) AS vals(x,y) ON t.id = vals.x WHEN MATCHED THEN DELETE WHEN NOT MATCHED THEN INSERT VALUES vals.x, vals.y
HyperSQL supports some SQL statements, expressions, functions, and Java methods that report on the most recently executed statement.
The IDENTITY()
function returns the last inserted
identity value for the current session.
The GET DIAGNOSTICS
statement is supported to a
limited extent. The built-in function DIAGNOSTICS()
is
an alternative. These are normally used in SQL/PSM routines to check the
result of the last data update operation.
GET DIAGNOSTICS
get diagnostics statement
<get diagnostics statement> ::= GET DIAGNOSTICS
<simple target value specification> = ROW_COUNT
The <simple target value specification>
is
a session variable, or a routine variable or OUT parameter.
The keyword ROW_COUNT
specifies the row count
returned by the last executed statement. For INSERT, UPDATE, DELETE and
MERGE statements, this is the number of rows affected by the statement.
This is the same value as returned by JDBC
executeUpdate()
methods. For all other statements, zero
is returned.
The value of ROW_COUNT
is stored in the specified
target.
This statement is often used in CREATE PROCEDURE statements.
In future versions, more options will be supported for diagnostics values.
$Revision: 6787 $