
Proposal for Derby: an Apache Database Sub-Project

Submitted July 20, 2004

Section 0 : Rationale

Databases are a core component of nearly all applications. But traditional database servers are expensive to install, configure and maintain, raising the total cost of ownership for building and deploying applications. The complexity of traditional database servers - whether commercial or open source - hinders the pace of innovation in application development.

What's needed is a database that is much easier to use - written in Java, with all of the advantages of Java's portability - and that doesn't sacrifice the benefits and features of a full-function relational database. The database needs to install when the application is installed, run when the application runs, stop when the application stops, and upgrade when the application upgrades. Unless the application developer wishes otherwise, the application user/administrator should not need to worry about the database underlying the application.

Besides providing all of the functionality of modern relational databases - complete SQL syntax, transaction commit and rollback, high concurrency, triggers, online backup, etc. - the database must adhere to appropriate standards for relational databases. Standards are key, as developers should not have to sacrifice their favorite development tools to use the database, nor should they incur undesirable expense if they later port to a different database.

Our goal is to produce a community of developers - with backgrounds in Database Engineering, SQL-92 and SQL-99 standards, data-centric applications, and XML-related technologies such as XML-Query - whose mission is to provide a standards-based, full-featured relational database with the aforementioned features. The produced software will be Apache Software Foundation (ASF) licensed.

The starting point for this project will be a contribution by IBM to the ASF of "Derby", a pure-Java, full-featured relational database based on a snapshot of the current IBM Cloudscape product. Derby can be used in several configurations:

  • As an embedded database engine that runs within the same JVM as the application, so the database starts when the application starts, stops when the application stops, and requires no skills or effort from the user/administrator of the application to administer.
  • As a networked database server that runs in a separate JVM and accepts network connections.
  • As a combination of these two configurations that provides simplicity for the application developer/administrator plus remote access to the database when desired.
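
As a rough illustration of how the configurations above differ from the application's point of view, the sketch below builds the JDBC connection URLs an application might use. The URL forms (`jdbc:derby:<db>` for the embedded engine, `jdbc:derby://<host>:<port>/<db>` for a network server, and the `create=true` and `shutdown=true` attributes) are those documented for later Apache Derby releases, not something this proposal specifies, and the client URL of the contributed snapshot may differ. The class `DerbyUrls`, its method names, and the database name `demoDB` are hypothetical.

```java
// A sketch under stated assumptions: DerbyUrls is a hypothetical helper,
// not part of Derby. It only builds URL strings and opens no connections.
final class DerbyUrls {
    // Default Network Server port in later Derby releases (assumption here).
    static final int DEFAULT_PORT = 1527;

    private DerbyUrls() {}

    /** Embedded engine: the database runs inside the application's JVM. */
    static String embedded(String dbName, boolean createIfMissing) {
        return "jdbc:derby:" + dbName + (createIfMissing ? ";create=true" : "");
    }

    /** Network server: the database runs in a separate JVM and is reached over TCP. */
    static String network(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName;
    }

    /** Shutdown request for one embedded database, issued when the application stops. */
    static String shutdown(String dbName) {
        return "jdbc:derby:" + dbName + ";shutdown=true";
    }

    public static void main(String[] args) {
        System.out.println(embedded("demoDB", true));
        System.out.println(network("localhost", DEFAULT_PORT, "demoDB"));
        System.out.println(shutdown("demoDB"));
    }
}
```

With a Derby JAR on the classpath, each URL would be passed to `java.sql.DriverManager.getConnection(url)`. The combined configuration corresponds to using the embedded URL inside the application while other JVMs use the network URL against the same database.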

Section 0.1 : Criteria

We feel that this project has an excellent chance for success as measured by the following aspects:

  • Meritocracy: The project will be meritocratic - the usual Apache meritocracy rules will apply.
  • Community: The user community for this project will be large. Within IBM, Cloudscape saw enthusiastic and widespread adoption by 70+ projects because of its functionality and ease of use. We believe these same advantages will benefit many users and developers and will result in a large community.
  • Core Developers: IBM is dedicating developers to the Derby project and is actively seeking additional developers from the community to join the project.
  • Alignment: As a full-featured SQL database with a JDBC API, the Derby package aligns well with other Apache projects:
    • Apache Geronimo - Derby fulfills the JDBC requirement for J2EE 1.3 and 1.4 and can be used successfully with the Geronimo project.
    • Apache Tomcat - Derby can be embedded in Tomcat for a self-contained web application with database functionality. (This has already been demonstrated.)
    • Apache DB - Cloudscape is currently listed as a supported back-end for Torque. Derby could be a supported back-end for OJB.
    • Other possible integration points for Derby include Apache Cocoon, multiple projects under Apache Jakarta, and integration with Apache Xerces.

Section 1 : Scope of the Project

There are multiple goals for this Apache project:

  • Promote a healthy open source community.
  • Support the breadth of platforms that run Java with a relational database that adheres rigorously to open standards.
  • Promote a standards-based, database-agnostic approach to application development.
  • Encourage innovation in application development and deployment by offering:
    • an easy-to-use, full-featured relational database for use during application development
    • an easy-to-deploy database for use in new applications
    • an easy-to-manage database for production users with modest scalability requirements
    • a standards-based relational database such that users could upgrade later to larger, more complex database server offerings if scalability requirements demand
    • an ongoing commitment to maintaining a migration path to other open standards databases

Section 2 : Initial Source

The initial code base for this project comes from the commercial product IBM Cloudscape. Development of the product began at Cloudscape Inc. in 1996. Informix Software purchased the Cloudscape company, along with the Cloudscape product, in 1999. In 2001, IBM purchased the database assets of Informix Software, including the Cloudscape product.

IBM plans to contribute the Derby code base, test cases, build files, and documentation to the ASF under the terms specified in the ASF Corporate Contributor License. Once at Apache, the project will be licensed under the ASF license.

Derby has the following features and benefits:

  • Zero administration: Derby can be deployed in unsupervised environments, and does not require database administration or resource management. This eliminates the need for a database administrator at each client installation site.
  • Multiple platform compatibility: Derby fully supports Sun Microsystems Java technology standards and runs on any standard JVM, version 1.3 or later. It supports J2SE and J2EE.
  • Transactions and concurrency: Derby is highly concurrent, and supports row-level locking and the read-uncommitted, read-committed, repeatable-read, and serializable isolation levels.
  • Typical relational database features
    • Fast query compilation
    • Bulk load
    • Multiple user support
    • Online backup
    • Crash recovery
    • Built-in performance diagnostics: Query statistics, locks, and space usage
  • Advanced security features
    • Signed JAR files
    • Optional LDAP and application-defined authentication
    • Configurable disk encryption, supporting both diverse encryption algorithms and encryption providers other than the default Java Cryptography Extension (JCE) provider
  • Internationalization and localization support: Derby supports Unicode and allows for localized error messages.
  • SQL features
    • SQL-92E compliance and support for key features of the SQL-99 standard
    • Cost-based optimizer that supports hash joins, sort avoidance, and row- or table-level locking based on percent of data selected
    • Triggers
    • Foreign key and check constraints
    • Multicolumn B-Tree indexes
    • Multithreaded connections
    • Hold cursors
    • Temporary tables
    • BLOB/CLOB data types
    • Support for complex SQL transactions
  • Java features
    • JDBC driver
    • Java procedures and functions
    • Storage of JAR files in the database, including signed JARs
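
To make the transaction feature above more concrete, the sketch below maps the four isolation levels named in the list onto the standard JDBC constants in `java.sql.Connection`, which is how an application would normally select one through a JDBC driver. The enum `IsolationLevel` and its methods are hypothetical names for illustration. Only the `Connection.TRANSACTION_*` constants and `Connection.setTransactionIsolation` come from the JDBC API itself.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Locale;

// A sketch under stated assumptions: IsolationLevel is a hypothetical helper,
// not part of Derby. It relies only on the standard java.sql API.
enum IsolationLevel {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE);

    final int jdbcConstant;

    IsolationLevel(int jdbcConstant) {
        this.jdbcConstant = jdbcConstant;
    }

    /** Parses a name written as in the feature list, e.g. "read-committed". */
    static IsolationLevel parse(String name) {
        return valueOf(name.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }

    /** Applies this level to an open JDBC connection. */
    void applyTo(Connection conn) throws SQLException {
        conn.setTransactionIsolation(jdbcConstant);
    }

    public static void main(String[] args) {
        // Print each level next to the JDBC constant it maps to.
        for (IsolationLevel level : values()) {
            System.out.println(level + " -> " + level.jdbcConstant);
        }
    }
}
```

Given an open connection `conn`, a call such as `IsolationLevel.parse("serializable").applyTo(conn)` would request the strictest level. Whether a driver honors a requested level can be checked with `DatabaseMetaData.supportsTransactionIsolationLevel`.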

Section 3 : ASF Resources to be Created

Section 3.1 : Mailing Lists

Section 3.2 : CVS Repositories

  • incubator-derby

Section 3.3 : Bugzilla or possibly Jira

  • TBD

Section 4 : Initial Set of Committers

All committers will sign and submit a Contributor License Agreement. To start the project in the incubator, the initial committers from IBM will include:

Section 5 : Apache Sponsoring Individuals

  • Apache DB project

Section 6 : Incubation Exit Criteria

We feel this project should exit the incubator and join the Apache DB project if the following goals are met.

Technical Goals:

  • One or more releases of the package during the incubation phase

Non-Technical Goals:

  • List presence and monitoring in wider Apache communities
  • Website cross reference to existing Apache literature with respect to rules and regulations
  • Initial integration plan and cooperation with Apache DB and Apache Jakarta projects
  • Additional contributors on the project