Chapter 1. Running and Using HyperSQL

Fred Toussi

The HSQL Development Group

$Revision: 6787 $

Copyright 2002-2024 Fred Toussi. Permission is granted to distribute this document without any alteration under the terms of the HSQLDB license. Additional permission is granted to the HSQL Development Group to distribute this document with or without alterations under the terms of the HSQLDB license.

2024-10-25

Table of Contents

Introduction

The HSQLDB Jar

Running Database Access Tools

A HyperSQL Database

In-Process Access to Database Catalogs

Server Modes

HyperSQL HSQL Server
HyperSQL HTTP Server
HyperSQL HTTP Servlet
Connecting to a Database Server
Security Considerations
Using Multiple Databases

Accessing the Data

Closing the Database

Creating a New Database

Introduction

HyperSQL Database (HSQLDB) is a modern relational database system. Version 2.7.4 is the latest release of the all-new version 2 code. Written from ground up to follow the international ISO SQL:2023 standard, it supports the complete set of the classic features of the SQL Standard, together with optional features such as stored procedures and triggers.

HyperSQL version 2.7.4 is compatible with Java 11 or later and supports the Java module system. A version of the HSQLDB jar compiled with JDK 8 is also included in the download zip package. These jars are also available from Maven repositories.

HyperSQL is used for development, testing and deployment of database applications.

SQL Standard compliance is the most unique characteristic of HyperSQL.

There are several other distinctive features. HyperSQL can provide database access within the user's application process, within an application server, or as a separate server process. HyperSQL can run entirely in memory using a fast memory structure. HyperSQL can use disk persistence in a flexible way, with reliable crash-recovery. HyperSQL is the only open-source relational database management system with a high-performance dedicated lob storage system, suitable for gigabytes of lob data. It is also the only relational database that can create and access large comma delimited files as SQL tables. HyperSQL supports three live switchable transaction control models, including fully multi-threaded MVCC, and is suitable for high performance transaction processing applications. HyperSQL is also suitable for business intelligence, ETL and other applications that process large data sets. HyperSQL has a wide range of enterprise deployment options, such as XA transactions, connection pooling data sources and remote authentication.

New SQL syntax compatibility modes have been added to HyperSQL. These modes allow a high degree of compatibility with several other database systems which use non-standard SQL syntax.

HyperSQL is written in the Java programming language and runs in a Java virtual machine (JVM). It supports the JDBC interface for database access.

The ODBC driver for PostgreSQL can be used with HSQLDB.

This guide covers the database engine features, SQL syntax and different modes of operation. The JDBC interfaces, pooling and XA components are documented in the JavaDoc. Utilities such as SqlTool and DatabaseManagerSwing are covered in a separate Utilities Guide.

The HSQLDB Jar

The HSQLDB jar package, hsqldb.jar, is located in the /lib directory of the ZIP package and contains several components and programs.

Components of the HSQLDB jar package

HyperSQL RDBMS Engine (HSQLDB)
HyperSQL JDBC Driver
DatabaseManagerSwing GUI database access tool

The HyperSQL RDBMS and JDBC Driver provide the core functionality. DatabaseManagerSwing is a database access tool that can be used with any database engine that has a JDBC driver.

An additional jar, sqltool.jar, contains SqlTool, a command line database access tool that can also be used with other database engines.

Running Database Access Tools

The access tools are used for interactive user access to databases, including creation of a database, inserting or modifying data, or querying the database. All tools are run in the normal way for Java programs. In the following example the Swing version of the Database Manager is executed. The hsqldb.jar is located in the directory ../lib relative to the current directory.

 java -cp ../lib/hsqldb.jar org.hsqldb.util.DatabaseManagerSwing

If hsqldb.jar is in the current directory, the command would change to:

 java -cp hsqldb.jar org.hsqldb.util.DatabaseManagerSwing

Main class for the HSQLDB tools

org.hsqldb.util.DatabaseManagerSwing

When a tool is up and running, you can connect to a database (may be a new database) and use SQL commands to access and modify the data.

Tools can use command line arguments. You can add the command line argument --help to get a list of available arguments for these tools.

Double clicking the HSQLDB jar will start the DatabaseManagerSwing application.

A HyperSQL Database

Each HyperSQL database is called a catalog. There are three types of catalog depending on how the data is stored.

Types of catalog data

mem: stored entirely in RAM - without any persistence beyond the JVM process's life
file: stored in file system
res: stored in a Java resource, such as a Jar and always read-only

All-in-memory mem: catalogs can be used for test data or as sophisticated caches for an application. These databases do not have any files.

A file: catalog consists of between 2 to 6 files, all named the same but with different extensions, located in the same directory. For example, the database named "testdb" consists of the following files:

testdb.properties
testdb.script
testdb.log
testdb.data
testdb.backup
testdb.lobs

The properties file contains a few settings about the database. The script file contains the definition of tables and other database objects, plus the data for memory tables. The log file contains recent changes to the database. The data file contains the data for cached tables and the backup file is used to revert to the last known consistent state of the data file. All these files are essential and should never be deleted. For some catalogs, the testdb.data and testdb.backup files will not be present. In addition to those files, a HyperSQL database may link to any formatted text files, such as CSV lists, anywhere on the disk.

While the "testdb" catalog is open, a testdb.log file is used to write the changes made to data. This file is removed at a normal SHUTDOWN. Otherwise (with abnormal shutdown) this file is used at the next startup to redo the changes. A testdb.lck file is also used to record the fact that the database is open. This is deleted at a normal SHUTDOWN.

Note

	Note
When the engine closes the database at a shutdown, it creates temporary files with the extension `.new` which it then renames to those listed above. These files should not be deleted by the user. At the time of the next startup, all such files will be renamed or deleted by the database engine. In some circumstances, a `testdb.data.xxx.old` is created and deleted afterwards by the database engine. The user can delete these `testdb.data.xxx.old` files.

When the engine closes the database at a shutdown, it creates temporary files with the extension .new which it then renames to those listed above. These files should not be deleted by the user. At the time of the next startup, all such files will be renamed or deleted by the database engine. In some circumstances, a testdb.data.xxx.old is created and deleted afterwards by the database engine. The user can delete these testdb.data.xxx.old files.

A res: catalog consists of the files for a small, read-only database that can be stored inside a Java resource such as a ZIP or JAR archive and distributed as part of a Java application program.

In-Process Access to Database Catalogs

In general, JDBC is used for all access to databases. This is done by making a connection to the database, then using various methods of the java.sql.Connection object that is returned to access the data. Access to an in-process database is started from JDBC, with the database path specified in the connection URL. For example, if the file: database name is "testdb" and its files are located in the same directory as where the command to run your application was issued, the following code is used for the connection:

 Connection c = DriverManager.getConnection("jdbc:hsqldb:file:testdb", "SA", "");

The database file path format can be specified using forward slashes in Windows hosts as well as Linux hosts. So relative paths or paths that refer to the same directory on the same drive can be identical. For example if your database directory in Linux is /opt/db/ containing a database testdb (with files named testdb.*), then the database file path is /opt/db/testdb. If you create an identical directory structure on the C: drive of a Windows host, you can use the same URL in both Windows and Linux:

 Connection c = DriverManager.getConnection("jdbc:hsqldb:file:/opt/db/testdb", "SA", "");

When using relative paths, these paths will be taken relative to the directory in which the shell command to start the Java Virtual Machine was executed. Refer to the Javadoc for JDBCConnection for more details.

Paths and database names for file databases are treated as case-sensitive when the database is created or the first connection is made to the database. But if a second connection is made to an open database, using a path and name that differs only in case, then the connection is made to the existing open database. This measure is necessary because in Windows the two paths are equivalent.

A mem: database is specified by the mem: protocol. For mem: databases, the path is simply a name. Several mem: databases can exist at the same time and distinguished by their names. In the example below, the database is called "mymemdb":

 Connection c = DriverManager.getConnection("jdbc:hsqldb:mem:mymemdb", "SA", "");

A res: database, is specified by the res: protocol. As it is a Java resource, the database path is a Java URL (similar to the path to a class). In the example below, "resdb" is the root name of the database files, which exists in the directory "org/my/path" within the classpath (probably in a Jar). A Java resource is stored in a compressed format and is decompressed in memory when it is used. For this reason, a res: database should not contain large amounts of data and is always read-only.

 Connection c = DriverManager.getConnection("jdbc:hsqldb:res:org.my.path.resdb", "SA", "");

The first time in-process connection is made to a database, some general data structures are initialised and a helper thread is started. After this, creation of connections and calls to JDBC methods of the connections execute as if they are part of the Java application that is making the calls. When the SQL command "SHUTDOWN" is executed, the global structures and helper thread for the database are destroyed.

Note that only one Java process at a time can make in-process connections to a given file: database. However, if the file: database has been made read-only, or if connections are made to a res: database, then it is possible to make in-process connections from multiple Java processes.

Server Modes

For most applications, in-process access is faster, as the data is not converted and sent over the network. The main drawback is that it is not possible by default to connect to the database from outside your application. As a result you cannot check the contents of the database with external tools such as Database Manager while your application is running.

Server modes provide the maximum accessibility. The database engine runs in a JVM and opens one or more in-process catalogs. It listens for connections from programs on the same computer or other computers on the network. It translates these connections into in-process connections to the databases.

Several different programs can connect to the server and retrieve or update information. Applications programs (clients) connect to the server using the HyperSQL JDBC driver. In most server modes, the server can serve an unlimited number of databases that are specified at the time of running the server, or optionally, as a connection request is received.

A Sever mode is also the preferred mode of running the database during development. It allows you to query the database from a separate database access utility while your application is running.

There are three server modes, based on the protocol used for communications between the client and server. They are briefly discussed below. More details on servers is provided in the HyperSQL Network Listeners (Servers) chapter.

HyperSQL HSQL Server

This is the preferred way of running a database server and the fastest one. A proprietary communications protocol is used for this mode. A command similar to those used for running tools and described above is used for running the server. The following example of the command for starting the server starts the server with one (default) database with files named "mydb.*" and the public name of "xdb". The public name hides the file names from users.

 java -cp ../lib/hsqldb.jar org.hsqldb.server.Server --database.0 file:mydb --dbname.0 xdb

The command line argument --help can be used to get a list of available arguments. Connections are made using an hsql: URL.

 Connection c = DriverManager.getConnection("jdbc:hsqldb:hsql://localhost/xdb", "SA", "");

HyperSQL HTTP Server

This method of access is used when the computer hosting the database server is restricted to the HTTP protocol. The only reason for using this method of access is restrictions imposed by firewalls on the client or server machines and it should not be used where there are no such restrictions. The HyperSQL HTTP Server is a special web server that allows JDBC clients to connect via HTTP. The server can also act as a small general-purpose web server for static pages.

To run an HTTP server, replace the main class for the server in the example command line above with WebServer:

 java -cp ../lib/hsqldb.jar org.hsqldb.server.WebServer --database.0 file:mydb --dbname.0 xdb

The command line argument --help can be used to get a list of available arguments. Connections are made using an http: URL.

 Connection c = DriverManager.getConnection("jdbc:hsqldb:http://localhost/xdb", "SA", "");

HyperSQL HTTP Servlet

This method of access also uses the HTTP protocol. It is used when a servlet engine (or application server) such as Tomcat or Resin provides access to the database. The Servlet Mode cannot be started independently from the servlet engine. The Servlet class, in the HSQLDB jar, should be installed on the application server to provide the connection. The database file path is specified using an application server property. Refer to the source file src/org/hsqldb/server/Servlet.java to see the details.

Both HTTP Server and Servlet modes can be accessed using the JDBC driver at the client end. They do not provide a web front end to the database. The Servlet mode can serve multiple databases.

Please note that you do not normally use this mode if you are using the database engine in an application server. In this situation, connections to a catalog are usually made in-process, or using the hsql: protocol to an HSQL Server

Connecting to a Database Server

When a HyperSQL server is running, client programs can connect to it using the HSQLDB JDBC Driver contained in hsqldb.jar. Full information on how to connect to a server is provided in the Java Documentation for JDBCConnection (located in the /doc/apidocs directory of HSQLDB distribution). A common example is connection to the default port (9001) used for the hsql: protocol on the same machine:

Example 1.1. Java code to connect to the local hsql Server

 try {
     Class.forName("org.hsqldb.jdbc.JDBCDriver" );
 } catch (Exception e) {
     System.err.println("ERROR: failed to load HSQLDB JDBC driver.");
     e.printStackTrace();
     return;
 }

 Connection c = DriverManager.getConnection("jdbc:hsqldb:hsql://localhost/xdb", "SA", "");

If the HyperSQL HTTP server is used, the protocol is http: and the URL will be different:

Example 1.2. Java code to connect to the local http Server

 Connection c = DriverManager.getConnection("jdbc:hsqldb:http://localhost/xdb", "SA", "");

Note in the above connection URL, there is no mention of the database file, as this was specified when running the server. Instead, the public name defined for dbname.0 is used. Also, see the HyperSQL Network Listeners (Servers) chapter for the connection URL when there is more than one database per server instance.

Security Considerations

When a HyperSQL server is run, network access should be adequately protected. Source IP addresses may be restricted by use of our Access Control List feature, network filtering software, firewall software, or standalone firewalls. Only secure passwords should be used-- most importantly, the password for the default system user should be changed from the default empty string. If you are purposefully providing data to the public, then the wide-open public network connection should be used exclusively to access the public data via read-only accounts. (i.e., neither secure data nor privileged accounts should use this connection). These considerations also apply to HyperSQL servers run with the HTTP protocol.

HyperSQL provides two optional security mechanisms. The encrypted SSL protocol, and Access Control Lists. Both mechanisms can be specified when running the Server or WebServer. On the client, the URL to connect to an SSL server is slightly different:

Example 1.3. Java code to connect to the local secure SSL hsqls: and https: Servers

 Connection c = DriverManager.getConnection("jdbc:hsqldb:hsqls://localhost/xdb", "SA", "");
 Connection c = DriverManager.getConnection("jdbc:hsqldb:https://localhost/xdb", "SA", "");

The security features are discussed in detail in the HyperSQL Network Listeners (Servers) chapter.

Using Multiple Databases

A server can provide connections to more than one database. In the examples above, more than one set of database names can be specified on the command line. It is also possible to specify all the databases in a .properties file, instead of the command line. These capabilities are covered in the HyperSQL Network Listeners (Servers) chapter

Accessing the Data

As shown so far, a java.sql.Connection object is always used to access the database. But performance depends on the type of connection and how it is used.

Establishing a connection and closing it has some overheads, therefore it is not good practice to create a new connection to perform a small number of operations. A connection should be reused as much as possible and closed only when it is not going to be used again for a long while.

Reuse is more important for server connections. A server connection uses a TCP port for communications. Each time a connection is made, a port is allocated by the operating system and deallocated after the connection is closed. If many connections are made from a single client, the operating system may not be able to keep up and may refuse the connection attempt.

A java.sql.Connection object has some methods that return further java.sql.* objects. All these objects belong to the connection that returned them and are closed when the connection is closed. These objects, listed below, can be reused. But if they are not needed after performing the operations, they should be closed.

A java.sql.DatabaseMetaData object is used to get metadata for the database.

A java.sql.Statement object is used to execute queries and data change statements. A single java.sql.Statement can be reused to execute a different statement each time.

A java.sql.PreparedStatement object is used to execute a single statement repeatedly. The SQL statement usually contains parameters, which can be set to new values before each reuse. When a java.sql.PreparedStatement object is created, the engine keeps the compiled SQL statement for reuse, until the java.sql.PreparedStatement object is closed. As a result, repeated use of a java.sql.PreparedStatement is much faster than using a java.sql.Statement object.

A java.sql.CallableStatement object is used to execute an SQL CALL statement. The SQL CALL statement may contain parameters, which should be set to new values before each reuse. Similar to java.sql.PreparedStatement, the engine keeps the compiled SQL statement for reuse, until the java.sql.CallableStatement object is closed.

A java.sql.Connection object also has some methods for transaction control.

The commit() method performs a COMMIT while the rollback() method performs a ROLLBACK SQL statement.

The setSavepoint(String name) method performs a SAVEPOINT <name> SQL statement and returns a java.sql.Savepoint object. The rollback(Savepoint name) method performs a ROLLBACK TO SAVEPOINT <name> SQL statement.

The Javadoc for JDBCConnection, JDBCDriver, JDBCDatabaseMetadata, JDBCResultSet, JDBCStatement, JDBCPreparedStatement list all the supported JDBC methods together with information that is specific to HSQLDB.

Closing the Database

All databases running in different modes can be closed with the SHUTDOWN command, issued as an SQL statement.

When SHUTDOWN is issued, all active transactions are rolled back. The catalog files are then saved in a form that can be opened quickly the next time the catalog is opened.

A special form of closing the database is via the SHUTDOWN COMPACT command. This command rewrites the .data file that contains the information stored in CACHED tables and compacts it to its minimum size. This command should be issued periodically, especially when lots of inserts, updates, or deletes have been performed on the cached tables. Changes to the structure of the database, such as dropping or modifying populated CACHED tables or indexes also create large amounts of unused file space that can be reclaimed using this command.

Databases are not closed when the last connection to the database is explicitly closed via JDBC. A connection property, shutdown=true, can be specified on the first connection to the database (the connection that opens the database) to force a shutdown when the last connection closes.

Example 1.4. specifying a connection property to shutdown the database when the last connection is closed

 Connection c = DriverManager.getConnection(
         "jdbc:hsqldb:file:/opt/db/testdb;shutdown=true", "SA", "");

This feature is useful for running tests, where it may not be practical to shutdown the database after each test. But it is not recommended for application programs.

Creating a New Database

When a server instance is started, or when a connection is made to an in-process database, a new, empty database is created if no database exists at the given path.

With HyperSQL 2.0 the user name and password that are specified for the connection are used for the new database. Both the user name and password are case-sensitive. (The exception is the default SA user, which is not case-sensitive). If no user name or password is specified, the default SA user and an empty password are used.

This feature has a side effect that can confuse new users. If a mistake is made in specifying the path for connecting to an existing database, a connection is nevertheless established to a new database. For troubleshooting purposes, you can specify a connection property ifexists=true to allow connection to an existing database only and avoid creating a new database. In this case, if the database does not exist, the getConnection() method will throw an exception.

Example 1.5. specifying a connection property to disallow creating a new database

 Connection c = DriverManager.getConnection(
         "jdbc:hsqldb:file:/opt/db/testdb;ifexists=true", "SA", "");

A database has many optional properties, described in the System Management chapter. You can specify most of these properties on the URL or in the connection properties for the first connection that creates the database. See the Properties chapter.

	Chapter 1. Running and Using HyperSQL
Preface		Chapter 2. SQL Language