Fundamentally, Data Stores take structured data and preserve it across program invocations for later retrieval. Traditionally this was done using 'flat files', which were organised for the convenience of the program.
Flat files are one of the simplest forms of data storage. They are usually used when the data structure is fairly simple, typically hold one record per line, and are generally sorted on a single 'Key' attribute.
Most people will have encountered such files as ".CSV" (Comma-Separated Values) files and simple "XML" files. They are so easy to use that they remain in use for straightforward jobs such as a dictionary list or a timeline.
Unfortunately, flat files start to lose their attraction as soon as the data structure gets too big or complex, or when programs want to access the same data using a different 'Key' attribute, or when different programs each keep overlapping, and thus individually incomplete, sets of the data.
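The 'Key' problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file contents, the column names (`word`, `year`, `meaning`) and the helper functions are all hypothetical, invented for the example. It shows that a file sorted on one attribute can be searched quickly on that attribute, while a lookup on any other attribute has to read every record.

```python
import bisect
import csv
import io

# Hypothetical flat file: one record per line, sorted on the 'word' key.
FLAT_FILE = """word,year,meaning
apple,1000,a fruit
banana,1597,a fruit
cherry,1300,a fruit
"""


def load_records(text):
    """Parse the flat file into a list of dicts, preserving file order."""
    return list(csv.DictReader(io.StringIO(text)))


def find_by_key(records, word):
    """Binary search, valid only because the file is sorted on 'word'.

    The key list is rebuilt on each call for brevity; a real program
    would build it once when the file is loaded.
    """
    keys = [r["word"] for r in records]
    i = bisect.bisect_left(keys, word)
    if i < len(keys) and keys[i] == word:
        return records[i]
    return None


def find_by_year(records, year):
    """Lookup on a non-key attribute: every record must be examined."""
    return [r for r in records if r["year"] == str(year)]


records = load_records(FLAT_FILE)
by_key = find_by_key(records, "banana")
by_year = find_by_year(records, 1300)
```

A second program that wanted the same data ordered by `year` would typically end up keeping its own re-sorted copy of the file, which is where the overlapping sets of data come from.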
Both Databases and Flat Files have the following properties in common:
Flat files have a couple of additional properties, which make them less than ideal:
This causes the following problems:
Because of these problems, databases were developed, but in an ad-hoc fashion. Eventually two main types emerged, the Hierarchical database and the Network database, both built around interlinked files optimised for fast retrieval. Despite their ad-hoc nature, the advantages of these led to a general move towards database technologies and away from the previous file-oriented approach.
Because they were ad-hoc, no one really knew which optimisations were best. This led Edgar F. Codd of IBM to produce a series of papers between 1970 and 1974 that laid out a theoretical basis for thinking about databases, which has been at the centre of research ever since.
This Relational Database model had to introduce a number of new terms, as the existing ones were often not unambiguous or exact enough for the level of mathematical rigour required.
Because these terms were unfamiliar, yet some of them mapped loosely onto the common terms, many people misunderstand the model and then propose alternatives which, when analysed, look like hierarchical or network databases, complete with the problems which the relational model was developed to deal with.
Relational databases are about collecting lots of mainly stable facts in a minimally redundant manner; those facts are then used to generate the views of the data that the users need to see.
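As a rough sketch of that idea, the following uses Python's standard `sqlite3` module with an in-memory database. The schema (`customer`, `purchase`), the view name `purchase_report` and the sample rows are all hypothetical, chosen only for illustration. Each fact is stored once, and the view the user sees is generated from those facts on demand.

```python
import sqlite3

# In-memory database; all table, view and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each customer's city is recorded exactly once.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    -- A purchase refers to its customer instead of repeating the details.
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item        TEXT NOT NULL
    );
    -- The view is derived from the stored facts each time it is queried.
    CREATE VIEW purchase_report AS
        SELECT c.name, c.city, p.item
        FROM purchase AS p
        JOIN customer AS c ON c.customer_id = p.customer_id;
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ada', 'London')")
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?)",
    [(10, 1, "keyboard"), (11, 1, "monitor")],
)


def report(connection):
    """Return the rows of the user-facing view, ordered by item."""
    query = "SELECT name, city, item FROM purchase_report ORDER BY item"
    return connection.execute(query).fetchall()


before = report(conn)

# Changing the single stored fact changes every row the view generates.
conn.execute("UPDATE customer SET city = 'Paris' WHERE customer_id = 1")
after = report(conn)
```

Because the city is held in one place, a single update is reflected in every row of the report. A flat file with the city repeated on each purchase line would need every copy changed.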