Set I/O Architectures
Data diversity and storage access depend on how data is
physically represented on a computer.
Though the results of data processing are independent of computer representations,
the performance of processing operations varies dramatically
with the physical representation and organization of the data.
An ideal storage management environment would support
many varieties of physical data representations.
Set I/O architectures are very different from
record I/O architectures.
Sets are mathematical objects.
Records are physical objects.
Mathematical objects
are formally defined abstract definitions that are independent
of any physical representation.
Physical objects
are arbitrarily defined concrete definitions that are dependent on
currently existing physical representations.
Systems processing data as mathematical objects
rely on the properties of the abstract definitions,
not on the current state of the physical representations.
Record I/O
requires knowledge of how data is physically represented in storage.
Set I/O
requires knowledge of how data is mathematically represented in storage.
Set I/O implementations were first introduced in
1968.^{[AFIP]}^{[LLL]}
Though set accessing systems have been commercially available since
1971,^{[RDBMS]}
the performance advantages of set accessing I/O over record accessing I/O are little known.
The key I/O performance difference is that record accessing I/O
depends on physical representations of storage data,
while set accessing I/O depends on mathematical representations of storage data.
Storage-independent representation of data is key to I/O performance.
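The contrast between the two access styles can be illustrated with a minimal Python sketch. It is not taken from any set I/O product: the fixed-width layout, the `RECORD` format, and the helper names `pack_people`, `record_read_age`, and `as_set` are all assumptions made for illustration. The record-style reader needs the record size and field order to find a value, while the set-style reader hands the application a set of tuples and keeps the layout knowledge inside the storage layer.

```python
import struct

# Hypothetical fixed-width layout: 8-byte name, 4-byte little-endian unsigned age.
RECORD = struct.Struct("<8sI")

def pack_people(people):
    """Storage side: serialize (name, age) pairs into one bytes buffer."""
    return b"".join(RECORD.pack(name.encode("ascii"), age) for name, age in people)

def record_read_age(buf, index):
    """Record-style access: the caller supplies a record number and relies on
    the record size and field order to locate the age field."""
    _name, age = RECORD.unpack_from(buf, index * RECORD.size)
    return age

def as_set(buf):
    """Set-style access: the storage layer decodes the buffer and returns a
    set of (name, age) tuples; no offsets are exposed to the caller."""
    return frozenset(
        (name.rstrip(b"\x00").decode("ascii"), age)
        for name, age in RECORD.iter_unpack(buf)
    )

people = [("ann", 31), ("bob", 27), ("cy", 45)]
buf = pack_people(people)

# Record style: breaks if the field widths or field order ever change.
age_of_second = record_read_age(buf, 1)

# Set style: the query is phrased against the set, not against the bytes.
over_30 = {name for name, age in as_set(buf) if age > 30}
```

In this sketch only `as_set` would need to change if the byte layout changed; the query that builds `over_30` would stay as written.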
■
Set I/O architectures use a formal foundation
for a mathematical representation
and manipulation of system data.
Changes to the physical representation
and organization of data can be made at any
time, as long as mathematical integrity is
maintained.
■
In 1965 ARPA initiated research to
provide applications with
machine-independent access to stored
data.^{[ARPA]}
■
Since information
contained in data is independent of any representation,
and since mathematically well-defined objects
and operations on such objects
are also independent of representations,
ARPA directed the research to discover
a mathematical foundation for representing
and manipulating
data on a computer.^{[SETS?]}
■
All the properties of classical set theory,
except one, fit the criteria for modeling computer data
as mathematical objects.
ARPA research focused on
extending classical set theory to include the property of
structure,
giving birth to the concept of
extended sets.^{[XSET]}
■
Record I/O architectures specify physical
representations and organizations
of data that reflect specific
application processing requirements.
■
Set I/O architectures insulate applications from direct access
to storage by use of set operations.
Record I/O architectures bind applications to
storage by use of index structures.
■
Set I/O implementations have been commercially active since
1971.^{[RDBMS]}
Early implementations supported only data represented as labeled arrays.
XML documents were represented as extended sets beginning in
2001.^{[XML]}
By 2011 extended set theory
provided a mathematical foundation
capable of modeling any computer representation of
data.^{[XST]}^{[XOPS]}
■
Set I/O architectures are intended to provide applications with global access to data,
while local platforms focus on performance issues.
Developers can use
set I/O for universal data access
while allowing local implementations the freedom to provide
near-optimal performance.^{[MFCD]}^{[FM]}
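The points above about changing physical organization while preserving mathematical integrity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row-wise and column-wise layouts, and the helper names `set_from_rows`, `set_from_columns`, and `restrict`, are hypothetical and do not come from any cited implementation. Two different local layouts are decoded to the same set, so a set operation gives the same result on either one.

```python
# Two hypothetical local layouts holding the same data in different orders.
row_store = [("ann", "sales"), ("bob", "ops"), ("cy", "sales")]
column_store = {
    "name": ["cy", "ann", "bob"],
    "dept": ["sales", "sales", "ops"],
}

def set_from_rows(rows):
    """Decode a row-wise layout into a set of (name, dept) pairs."""
    return frozenset(rows)

def set_from_columns(cols):
    """Decode a column-wise layout into the same kind of set."""
    return frozenset(zip(cols["name"], cols["dept"]))

def restrict(relation, dept):
    """Set-level operation: keep the pairs whose second member equals dept."""
    return frozenset(pair for pair in relation if pair[1] == dept)

a = set_from_rows(row_store)
b = set_from_columns(column_store)

same_identity = a == b                                       # layouts differ, sets agree
same_result = restrict(a, "sales") == restrict(b, "sales")   # so do query results
```

An application written against `restrict` does not depend on which layout a local platform chooses, which is the freedom the paragraph above describes.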
Conclusion
The fundamental result of ARPA's research
exploring the feasibility of a
machine-independent data model was the discovery that
data could be represented as a mathematical object.
A formal modeling notation was developed to
represent and manipulate all computer data as
extended sets.^{[XSET]}
The evolution of this notation gave rise to XSP Technology.
Data as XSP Sets
Twelve RDM tables, R1 to R12, expressed by a single labeled set Ri, RDM.
A very simple XML structure expressed as a labeled set, XML.
Three extended relations expressed as labeled sets, Xrel1.
Two complex extended relations expressed as labeled sets, Xrel2.
The ability to represent and manipulate data as XSP sets is what
distinguishes Set I/O implementations from traditional
DBMS implementations.
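As a loose analogy only, and not XSP notation, the idea of a set extended with structure can be sketched in Python. The sketch assumes that an extended set pairs each element with a scope (a position or a label), and that a scope of `None` stands in for ordinary classical membership; the helper names `xset`, `xtuple`, `labeled_row`, and `elements_with_scope` are hypothetical.

```python
def xset(*pairs):
    """Build a toy extended set from (element, scope) pairs."""
    return frozenset(pairs)

def xtuple(*items):
    """Model an n-tuple as elements scoped by their 1-based position."""
    return frozenset((item, pos) for pos, item in enumerate(items, start=1))

def labeled_row(**fields):
    """Model a labeled row as values scoped by their field name."""
    return frozenset((value, label) for label, value in fields.items())

def elements_with_scope(x, scope):
    """Return the elements of x that carry the given scope."""
    return {element for element, s in x if s == scope}

# Position scopes make order part of the set's identity.
t1 = xtuple("a", "b")
t2 = xtuple("b", "a")

# Label scopes make field order irrelevant.
row1 = labeled_row(name="ann", age=31)
row2 = labeled_row(age=31, name="ann")

# A relation as a set of rows; None is this sketch's stand-in for "no scope".
relation = xset((row1, None), (labeled_row(name="bob", age=27), None))
```

Because every value here is an ordinary hashable set, Python's built-in union, intersection, and equality apply to tuples, rows, and relations alike, which is the property the sketch is meant to convey.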
More detailed information on implementations and I/O optimization strategies using
XSP Technology
can be found
at Extended Set Processing.
EPILOGUE
Data That Can't Be Accessed, Can't Be Processed.
If Data Can Be Accessed, It Has A Set Identity.
If Data Has A Set Identity, It Can Be Processed By Set Operations.
If Data Can Be Processed By Set Operations, Processes Are Limited Only By Imagination.
References
Copyright © 2018
INTEGRATED INFORMATION SYSTEMS
« Last modified on 07/03/2018 »
