The theory of databases is central to computer science. Many formulations of databases have been given over the years, though most references seem to avoid rigorous mathematical definitions for any of them. In this paper, we define a new category of databases called the category of geometric databases. Among the formulations currently in use, geometric databases are most similar to relational databases.
Given data types DT, the Čech complex C(DT) provides a classifying object for databases on DT. It is a simplicial set together with a sheaf of data called the universal data bundle. A database on DT is a pair (X,O_X), where X is a simplicial set and O_X is a sheaf of data on X, equipped with a bundle map to the universal data bundle. Morphisms of databases (f,f^\sharp) : (X,O_X) --> (Y,O_Y) are defined in the familiar way, analogously to morphisms of ringed spaces.
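For reference, the model invoked here is the standard definition of a morphism of ringed spaces, recalled below. How it transfers to databases is not spelled out at this point: replacing topological spaces by simplicial sets, keeping the same direction for f^\sharp, and requiring compatibility with the maps to the universal data bundle are all assumptions of this sketch rather than statements made above.

```latex
% Standard definition of a morphism of ringed spaces (not the database definition itself).
\[
  (f, f^\sharp) \colon (X, \mathcal{O}_X) \longrightarrow (Y, \mathcal{O}_Y),
  \qquad
  f \colon X \to Y \ \text{continuous},
  \qquad
  f^\sharp \colon \mathcal{O}_Y \to f_* \mathcal{O}_X
  \ \text{a morphism of sheaves of rings on } Y.
\]
```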
We show that the category of databases on DT is closed under limits and colimits. We show that any relational database can be reformulated as a geometric database. We further show that the typical operations on relational databases (e.g. joins, projections, selections, and insertions) have analogues in our setting. For example, the join of two relational databases corresponds to the union of the associated geometric databases.
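To fix ideas on the relational side only, the following is a minimal sketch of a natural join, with relations modeled as lists of dictionaries. The function `natural_join` and the tables `employees` and `departments` are hypothetical illustrations, not constructions from this paper, and the geometric counterpart of the join is not reproduced here.

```python
from typing import Dict, List

# A row maps column names to values; a relation is a list of rows.
# This representation is an assumption made for illustration.
Row = Dict[str, object]


def natural_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Pair every left row with every right row that agrees on all shared columns."""
    joined: List[Row] = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[col] == r_row[col] for col in shared):
                # The joined row carries the union of both column sets.
                joined.append({**l_row, **r_row})
    return joined


# Hypothetical example tables.
employees = [
    {"emp": "Ann", "dept": "Math"},
    {"emp": "Bob", "dept": "CS"},
]
departments = [
    {"dept": "Math", "building": "Hall A"},
    {"dept": "CS", "building": "Hall B"},
]

result = natural_join(employees, departments)
```

Each row of `result` has the columns `emp`, `dept`, and `building`, that is, the union of the two column sets. This gives one informal way to read the statement that joins correspond to unions, though the precise geometric construction may differ.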
Aside from providing a beautiful geometric picture of data organization, our theory has several advantages over relational databases. Most notably, queries of a geometric database always yield new geometric databases, whereas the corresponding statement for relational databases does not hold.