October 11, 1943 in Newburyport, Mass.
Bachelor’s degree in Electrical Engineering (Princeton University, 1965); M.Sc. in Electrical Engineering (University of Michigan, Ann Arbor, 1967); Ph.D. in Computer Science & Engineering (University of Michigan, Ann Arbor, 1971).
Assistant Professor of Computer Science (University of California at Berkeley, 1971-1976), Associate Professor (1976-1982), Professor (1982-1993), Professor of the Graduate School (1994-1999); Senior Lecturer (Massachusetts Institute of Technology, 2001-2002), Adjunct Professor (2002-Present). Concurrently co-founded and held executive or advisory roles with companies including Relational Technology, Inc. (founded 1980, later Ingres Corporation), Illustra Corporation (founded 1992, later acquired by Informix where Stonebraker was Chief Technology Officer 1996-2000), Cohera Corporation (1997, acquired by PeopleSoft), StreamBase Systems (2003, acquired by Tibco in 2013), Vertica Systems (2005, acquired by HP), Goby (2008, acquired by Telenav in 2012), SciDB (2008), VoltDB (2009), and Tamr (2013).
ACM System Software Award (1992); ACM SIGMOD Innovation Award (1994); National Academy of Engineering (elected 1998); IEEE John von Neumann Medal (2005); Alan M. Turing Award (2014).
For fundamental contributions to the concepts and practices underlying modern database systems.
Michael Stonebraker’s contributions to the refinement and spread of database management technology are hard to overstate. He began work in this area as a young assistant professor at the University of California at Berkeley. After reading Edgar F. Codd’s seminal papers on the relational model, Stonebraker started work with a colleague, Eugene Wong, to develop an efficient and practical implementation. The result was INGRES, a name that reflected the project’s original intention to produce a geographically oriented system with graphical capabilities. The name officially stood for “Interactive Graphic and Retrieval System” but echoed that of a celebrated French painter.
A prototype of INGRES was working by 1974, but the project did not stop there. Over the next decade INGRES, and systems inspired by it, built a new commercial market of relational database systems. Today the relational database management system is one of computing’s most important and widely used technologies, having replaced filing cabinets as the standard way of storing and retrieving information.
Development of INGRES
Stonebraker led development of INGRES at Berkeley until 1985, supported by grant money and the labor of graduate and undergraduate students. Berkeley was particularly notable during this era as a place where theoretical research and system building came together with spectacular results. Further examples included the work on timesharing systems by Butler Lampson (winner 1992) and others, and the Berkeley Software Distribution (BSD) of the Unix operating system, which gave rise to a commonly used form of open source licensing. These cultures and practices anticipated much of what we now associate with the open source software movement. Stonebraker remembers that “we would recruit the smartest freshmen and sophomores we could find, give them wonderful equipment, and they would basically die writing code for us.”
Stonebraker’s work built on, and complemented, that of three other Turing award winners. Academic research into database management technology has had an unusually direct connection to the widely used industrial-strength systems underlying the websites, business applications, scientific breakthroughs, social media systems, and “big data” projects of the modern world. Charles W. Bachman (winner 1973) designed what is often called the first database management system in the early 1960s, and helped to define and popularize the concept of a database management system through his later work with the industry group CODASYL. Edgar F. Codd (winner 1981) developed an elegant and flexible way of storing and retrieving data, the relational model, which gradually eclipsed the network data model over the course of the 1980s. James Nicholas Gray (winner 1998) contributed to IBM’s System R, an influential experimental implementation of the relational model, and later pioneered robust, high performance methods for record locking and transaction processing.
Legacy of INGRES
INGRES and System R together helped to turn relational systems from a laboratory curiosity into the default choice for even the most demanding data processing applications. While the IBM prototype targeted the company’s multi-million dollar mainframes, INGRES was a Unix application suitable for relatively affordable minicomputers and was widely distributed to other universities where people used it, experimented with it, and extensively modified it.
INGRES brought a new kind of database technology to a new kind of computer. Database management systems were widely adopted by businesses from the early 1970s onwards as central hubs which managed the data used by many different application programs. These early commercial systems ran on mainframes and followed either Bachman’s network model or a more restrictive hierarchical approach favored by IBM. In the mainframe world these approaches remained dominant throughout the 1980s so that, for example, IBM first commercialized its work in the area as a niche product for “decision support” analytical applications rather than workaday operational systems.
During the 1970s, minicomputers became a cost-effective alternative to mainframes for an ever widening range of applications. Thanks to INGRES and its derivatives, relational technology became the default choice for minicomputer databases, as the new technology was widely applied to transaction processing applications (keeping routine records of things like address changes or account updates) as well as analytical work. The commercial database systems of the 1970s required their users to navigate through data structures at a relatively low level, making explicit decisions about how to index and link records when the database was created and navigating record by record through these structures when retrieving information. Relational database systems shifted to a more abstract and flexible view of data. Only when querying the database did users specify how data from different tables should be combined. This shifted much of the responsibility for efficiently organizing and retrieving data from the user to the database management software, pushing hard against the limits of affordable hardware.
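The shift described above, in which users say how tables should be combined only at query time, can be illustrated with a minimal sketch. It makes several assumptions that are not part of the INGRES story: it uses SQL rather than QUEL, Python's built-in sqlite3 module rather than INGRES itself, and hypothetical `customer` and `account` tables invented for the illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two independent tables.

    Nothing here declares how the tables will be traversed or linked;
    they merely share a value (the customer id).
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE account (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            balance INTEGER
        );
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO account VALUES (10, 1, 500), (11, 1, 250), (12, 2, 900);
        """
    )
    return conn


def balances_by_customer(conn):
    """State which rows belong together; the engine chooses the access path."""
    return conn.execute(
        """
        SELECT c.name, SUM(a.balance)
        FROM customer AS c
        JOIN account AS a ON a.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
```

Calling `balances_by_customer(build_demo_db())` yields one total per customer. The query names the relationship between the tables but never says which record to visit next, which is the work a navigational system left to the programmer.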
INGRES was a feat of virtuoso software engineering, prioritizing performance and reliability so that new features were added only once a way of implementing them efficiently had been discovered. By 1976 INGRES was rapidly executing queries written in its QUEL query language (roughly equivalent to the SEQUEL, later SQL, language introduced by IBM). It could be embedded in C programs or used interactively. Under the hood, INGRES implemented a variety of indexing and compression methods, automatically optimizing queries. The team had already begun to add support for transactions, so that related updates would occur together--or not at all--to enforce integrity constraints between related records in different tables, and to deal with the potential problems caused by simultaneous updates from different users. Additional features, such as crash recovery and efficient backup and restore capabilities, turned INGRES from a research project to an industrial-strength technology. This took a huge amount of additional work. As Stonebraker recalled, “We built an initial prototype, putting in the first 90% of the effort required to create a real system, and it more or less worked. I think that the thing that distinguished INGRES from the typical academic project, and in retrospect one of the smartest things we ever did, was to then put in the next 90% of the effort to make INGRES really work.”
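The idea that related updates should occur together or not at all can be sketched in a few lines. This is an illustration under stated assumptions, not a description of how INGRES implemented transactions: it uses SQL and Python's built-in sqlite3 module, and the `account` table, the `transfer` function, and the non-negative balance rule are all hypothetical.

```python
import sqlite3


def make_accounts():
    """Create a hypothetical account table with a simple integrity rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing unit."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK rule, so the earlier credit is undone too.
        return False


def balances(conn):
    """Return the current balances as a dictionary keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

In this sketch a transfer that would overdraw the source account fails on its second update, and the credit already applied by the first update is rolled back with it, leaving both balances as they were.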
Students trained on the INGRES project, and in many cases using the INGRES code itself as a starting point, produced most of the leading minicomputer database packages. These included Britton-Lee (an early supplier of specialized parallel processing database management systems), the NonStop SQL product offered by Tandem Computers, and Sybase (whose SQL Server was later licensed by Microsoft). In 1980 Stonebraker himself co-founded Relational Technology, Inc. to produce its own commercial version of INGRES. His involvement with the firm was primarily as a consultant, though he worked there full time for around six months. It was a significant player in the database software market over the next decade, making an initial public offering in 1988 before being acquired in 1990.
By this point Stonebraker was already immersed in the development and commercialization of a successor system. Postgres added many features missing from existing relational systems, including support for rules to maintain consistent relationships between tables, support for complex “object-relational” data types, the replication of data across servers, and procedural languages to embed code fragments within the database management system to be triggered when specified conditions occurred. Postgres was also used to experiment with other features of interest to database researchers. Techniques pioneered in Postgres were widely implemented, and in 1992 Stonebraker cofounded Illustra Information Technologies to market a commercial version. It was acquired in 1996 by Informix, which rebuilt its product line around the code.
Stonebraker retired from Berkeley in 1994, retaining a connection as a “Professor of the Graduate School.” In 1999 he moved to New Hampshire, soon taking up an adjunct appointment at MIT where he could focus on developing and commercializing new technologies without the obligation of regular faculty responsibilities. Since then he has cofounded a company every few years, focusing on the development of database management technologies specialized for particular areas such as data warehousing (Vertica), managing data streams captured by sensors (StreamBase Systems), and high-throughput transaction processing (VoltDB). However, one of his latest ventures, SciDB, which focuses on handling massive arrays of scientific data, departs from the relational model as well as from traditional general purpose implementation techniques.
As an eloquent and authoritative commentator on trends in database technology, Stonebraker has defended the enduring power of the relational model against efforts by the “NoSQL” movement to promote the superiority of “post-relational” approaches. At the same time, he has been critical of the assumption that “one size fits all” when implementing relational database management systems and that dominant general purpose systems, such as Oracle, can serve the needs of all users.
Stonebraker is the only Turing award winner to have engaged in serial entrepreneurship on anything like this scale, giving him a distinctive perspective on the academic world. The connection of theory to practice has often been controversial in database research, despite the foundational contribution of mathematical logic to modern database management systems. Stonebraker has been critical of the insularity of some researchers, noting that the attention given to such ideas as recursive querying or object-oriented databases suggests that “they are more interested in working on problems that are solvable, rather than problems that are important.” His “advice to theoreticians” was “go spend some time in the real world and work on problems that people want solved.” In contrast, “Knowing what I know now, I would never have started building INGRES, because it’s too hard…. So I think my advice to my younger self would be to suspend your disbelief and just do it anyway. The way you climb Mt. Everest is one step at a time…”
(Quotations from Stonebraker are taken from his interview with Marianne Winslett, published in ACM SIGMOD Record, Vol. 32, No. 2, June 2003 as "Michael Stonebraker Speaks Out.")
Author: Thomas Haigh