XSP TECHNOLOGY
- SOURCE DATA TO USABLE INFORMATION IN MINIMAL TIME -
INTRODUCTION:
XSP (Extended Set Processing) Technology is a formal
systems development discipline based on manipulating
data by its intrinsic mathematical identity instead of by
some artificial representation.
XSP Technology is currently being applied
to improving the performance of information extraction applications;
to parallel information access from highly distributed,
disparate data sources; and
to the integration of XML (Extensible Markup Language)
and RDM (Relational Data Model) based systems.
XSP Technology
relies on the mathematical identity of data representations to provide
formal computer modeling of real world information,
with formal representations of real world items,
relationships, and behavior, through:
- Conceptual Data Representations: brain oriented representations for specifying real world information modeling applications,
- Machine Data Representations: byte oriented representations for executing conceptual data representation specified applications,
- Formal Mapping Mechanisms: for transforming objects of conceptual data representation to objects of machine data representation and for transforming objects of machine data representation to objects of conceptual data representation.
For any mapping mechanism to be formally defined, the source objects and target objects must be well defined. Further, for formal mappings to be modeled by operations and operands, all operand candidates must have a recognizable mathematical identity. The key contribution of XSP Technology is the use of extended set theory to capture the mathematical identity of any and all data representations, allowing formal modeling of XSP System implementations.
- RECOGNIZING DATA REPRESENTATIONS AS MATHEMATICAL OBJECTS -
IMPLEMENTATIONS:
First and foremost, the key concept of XSP technology is in recognizing data representations as mathematical objects instead of as physical objects. This includes all logical and physical data representations: past, present, and future.
XSP is a modeling technology for integrating logical functionality with physical performance.
There are records, files, B-trees, stacks, arrays, linked-lists, etc. that are used to represent data in a byte-oriented environment. These data representations can be poked at and pushed around as physical objects or they can be accessed and manipulated by their mathematical identity. There is no special extended set representation for implementations, just the standard tried and proven representations that we are all familiar with. The XSP difference is in recognizing these familiar data representations by their unfamiliar mathematical identities.
MODELING:
Most models impose a view of how things should behave. Few models mimic how things actually behave. Newton did not impose his views on how he thought a falling apple should behave. He captured the existing behavior of a falling apple with a formal notation. Similarly, the use of extended set theory (XST) is not to impose a particular modeling abstraction on the specification and manipulation of data representations, but rather to formally capture those properties of data representations that already exist. Thus, not only can any and all implementations be modeled with XST, all imposed abstract data models can also be modeled using XST.
EXTENDED SET THEORY (XST):
Extended set theory (XST) is
an axiomatic based extension to the foundations of
Classical set theory (CST)
providing a membership condition with a "property component".
In CST, the set membership condition is expressed as: "x is an element of Q or x is not an element of Q".
x ∈ Q is read "x is an element of Q" and is set-theoretically expressed by Q = { x }.
In XST, membership is expressed as: "x is a y-property element of Q or x is not a y-property element of Q".
x ∈_y Q is read "x is a y-element of Q" and is set-theoretically expressed by Q = { x^y }.
SELECTIVE SET RETRIEVAL vs. INDEXED RECORD ACCESS:
For over thirty years Indexed Record Access has been the de facto standard
for I/O transfers between RAM and storage. It has thus been assumed that
Indexed Record Access provides the best I/O performance for all information
access needs. This assumption is certainly true for Transaction Processing
applications where the fastest access to a single record is the application
requirement.
However, for Information Extraction applications, where access to
a small percentage of
many, many records is the application requirement, Indexed Record
Access actually degrades I/O performance.
It is generally assumed that there are only two I/O access alternatives:
Indexed Record Access or exhaustive searching.
This is true in structure-dependent (row-store or column-store) architectures,
but is not true in operation-centric architectures.
When data representations are recognized as mathematical objects
instead of as physical objects,
a third I/O data access alternative is available. Using extended set processing,
Selective Set Retrieval I/O interfaces rely on CPU processing power to
transform data between RAM and storage instead of relying on sophisticated
Indexed Record Access techniques to find individual records.
Data in storage can thus be reorganized for optimal informationally dense
I/O transfers during the execution of an application.
For Information Extraction applications
Selective Set Retrieval I/O interfaces can perform 20 to 40 times
faster than traditional Indexed Record Access I/O interfaces.
RDM, XML & XSP:
It is common for RDM advocates to tout the mathematical superiority
of the RDM because of its formal basis in set theory.
In point of fact, the RDM works in spite of set theory, not because of it.
It is also common for XML devotees to make no claim of any kind
that XML structures and operations
on these structures have any formal counterpart in set theory.
This happens to be true when using Classical set theory, but not true
when using extended set theory.
The RDM is, in principle, founded on Classical set theory (CST).
Since CST is a proper subset of XST, and since XST supports
n-tuple membership (which CST does not and which RDM requires)
all RDM data can be expressed and manipulated as modeled by XST.
Since every well formed XML document is, by its very nature, a
well defined extended set
(with attributes and tags portrayed as element properties),
all XML data and operations on XML data
can be faithfully modeled by XST. This common set-theoretic foundation
for XML and RDM data is leveraged by XSP Technology to support
implementations for integrated processing of XML and RDM data.
Example 1: Two structurally distinct RDM-Tables representing the same RDM-Relation with data content for people under domains Name, Age, and Spouse:
RDM-Table-1
Name    Age  Spouse
Alan    43   Mary
Mary    37   Alan
RDM-Table-2
Spouse  Age  Name
Alan    37   Mary
Mary    43   Alan
RDM-Relation = { { Alan<Name>, 43<Age>, Mary<Spouse>}, { Mary<Name>, 37<Age>, Alan<Spouse>} }
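Example 1 can be checked mechanically. The sketch below assumes each row is modeled as a set of (value, domain) pairs, mirroring the `Alan<Name>` notation above; the helper name `table_to_relation` is hypothetical and not part of any XSP interface.

```python
def table_to_relation(header, rows):
    """Map each row to a frozenset of (value, domain) pairs,
    then collect the rows into a frozenset (row order is discarded)."""
    return frozenset(frozenset(zip(row, header)) for row in rows)


# RDM-Table-1: columns Name, Age, Spouse
relation_1 = table_to_relation(
    ["Name", "Age", "Spouse"],
    [["Alan", 43, "Mary"], ["Mary", 37, "Alan"]],
)

# RDM-Table-2: columns Spouse, Age, Name (different column and row order)
relation_2 = table_to_relation(
    ["Spouse", "Age", "Name"],
    [["Alan", 37, "Mary"], ["Mary", 43, "Alan"]],
)

same_relation = relation_1 == relation_2
```

Because each value carries its domain as a scope, neither column order nor row order affects the result, so the two tables compare equal as relations.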
Example 2: XML and XST representations of the same data content: a person(p) with the name(n), an age(a), and a spouse(s).
XML_1:
<p>
<n>Alan</n>
<a> 43 </a>
<s>Mary</s>
</p>
<p>
<n>Mary</n>
<a> 37 </a>
<s>Alan</s>
</p>
XST_1 = { { Alan<1,n>, 43<2,a>, Mary<3,s>} <1,p> , { Mary<1,n>, 37<2,a>, Alan<3,s>} <2,p> }
Papers presented on this web-site will examine the formal foundations of RDM and XML systems in the light of XST to provide practical relevance of XSP Technology to the field of high performance software systems design, development, and use. Subsequent papers will show how XST operations can be used to manipulate high-level RDM and XML data representations supported, internally, by low-level mathematical identity preserving extended set representations.
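The correspondence between XML_1 and XST_1 in Example 2 can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration under stated assumptions, not the XSP implementation: each element becomes a member scoped by a (position, tag) pair, the function name `to_xst` is hypothetical, a wrapper `<root>` element is added only because the fragment has two top-level elements, attributes are ignored, and leaf values are kept as stripped text (so ages are the strings "43" and "37").

```python
import xml.etree.ElementTree as ET

XML_1 = """
<p><n>Alan</n><a> 43 </a><s>Mary</s></p>
<p><n>Mary</n><a> 37 </a><s>Alan</s></p>
"""


def to_xst(elements):
    """Turn a sequence of XML elements into a frozenset of
    (value, (position, tag)) pairs, recursing into nested elements."""
    members = set()
    for position, child in enumerate(elements, start=1):
        scope = (position, child.tag)
        if len(child):  # element has sub-elements
            value = to_xst(list(child))
        else:  # leaf element: keep its text content
            value = (child.text or "").strip()
        members.add((value, scope))
    return frozenset(members)


# Wrapper element added only so the two-element fragment parses as one document.
root = ET.fromstring(f"<root>{XML_1}</root>")
xst_1 = to_xst(list(root))
```

The resulting nested frozenset has the same shape as XST_1: two members scoped `(1, "p")` and `(2, "p")`, each containing three members scoped by position and tag.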
WHAT IS XSP?
A formal means for manipulating data as a mathematical object. (1 page PDF) - [8/08/05]

XSP TECHNOLOGY
A Foundation For Integrated Information Access Systems (1 page PDF) - [6/20/02]

FORMAL MODELING OF XSP SYSTEMS
XSP systems support an XST interface where all operations mapping the mathematical identity of data between conceptual and machine representations are explicitly expressed in terms of XST. (1 page PDF) - [3/13/07]

VLDB 1977 Invited paper abstract
Extended Set Theory: A General Model For Very Large, Distributed, Backend Information Systems (1 page PDF)

OPERATION-CENTRIC ARCHITECTURES
Operation-Centric architectures manipulate data representations as mathematical objects, while traditional Structure-Dependent architectures manipulate data representations as physical objects. (2 pages PDF) - [6/02/06]

RAPID INFORMATION ACCESS
Adding an Information Access Accelerator to Existing DBMSs (2 pages HTML) - [1/11/03]

RAPID RESPONSE TRANSACTION PROCESSING
Elapsed time comparisons of ten commercial systems. (2 pages PDF) - [1/11/03]

ADAPTIVE DATA RESTRUCTURING FUNCTIONS
A High Performance Alternative to Indexed Data Access Structures (1 page PDF) - [9/08/02]

MODELING DATA PROCESSING IMPLEMENTATIONS
Operation-Centric, Structure-Dependent, & XML Data Models (1 page PDF) - [4/03/04]

AXIOMATIC EXTENDED SET THEORY
A Focus on Extended Functions (abstract) (1 page PDF) - [6/11/02]

PEBBLE PILES & INDEX STRUCTURES
No intended offense to either ancient sheepherders or modern database researchers. (1 page PDF) - [8/08/05]

DATA WAREHOUSE OR INFORMATION BLACK HOLE?
A 1995 observation of the Gartner Group's concern with OLAP. (1 page PDF) - [3/06/96]

XML: PLAGUE or PANACEA?
Relational architectures provided a quantum jump in systems development by eliminating the structure-dependence between user view and system execution. Will XML development obliterate this advance or expand its scope of application? (1 page PDF) - [1/17/00]
XSP TECHNOLOGY Theory & Practice
Formal Modeling & Practical Implementation of XML & RDM Systems (6 pages PDF) - [8/27/07]

DATA REPRESENTATIONS AS MATHEMATICAL OBJECTS
Considering Content Compatibility of Relational & XML Data Representations (6 pages PDF) - [4/25/07]

FEASIBILITY OF AN OPERATION-CENTRIC ENVIRONMENT
for Processing XML Documents & Relational Data (4 pages PDF) - [1/26/01]

RDM-RELATIONS & XML-STRUCTURES AS SETS (3 pages PDF) - [7/09/02]
XSP: An Integration Technology for Systems Development and Evolution
Formal Specification for Unifying XML and Relational Systems (19 pages PDF) - [7/12/01]

SET-PROCESSING AT THE I/O LEVEL
A Performance Alternative to Traditional Index Structures (25 pages PDF) - [11/13/05]

Introduction To A MATHEMATICAL FOUNDATION FOR SYSTEMS DEVELOPMENT
A Hypermodel Syntax for Precision Modeling of Arbitrarily Complex Systems
NATO-ASI Series, Vol. F24, 1986 (39 pages PDF) - [10/08/86]

THE TROUBLE WITH SOFTWARE
A historical perspective by David L. R. Stein (co-founder of the Gartner Group) on XSP Technology from the initial ARPA funded research to current commercial interests in XSP based systems. (11 pages PDF) - [2003]
SUMMARY: AXIOMS FOR EXTENDED SET THEORY
XST (extended set theory) differs from CST (classical set theory) by assuming a ternary membership condition instead of a binary membership condition. The CST membership is based on the truth or falsity of "x is an element of a set A". XST membership is based on the truth or falsity of "x is a y-property element of a set A". The y qualifier is called the scope component of XST set membership. Since CST sets have no scopes they can be subsumed under XST by defining CST membership using "x is a null-property element of a set A". (2 pages PDF) - [8/13/07]

NOTES ON: ITEMS, SETS, NAMES, TUPLES, & KLASSES
Introduction to basic terms with a construction for natural numbers. (1 page PDF) - [8/12/07]

XST NOTES: Tuplesets, Tagged-Sets, Application, & Etc.
A succinct introduction to Tuplesets and their properties. The idea presented is that, armed with a well-behaved definition for n-tuple (unavailable in CST, but well-defined in XST), a definition for application can be developed prior to defining the concept of function and that such a definition can support the concept of self-application when applied to specifically defined sets. (The supplement contains a preview of definitions for functions and Kategories.) (12 pages PDF) - [12/04/05]
INFORMATION ACCESS ARCHITECTURES
Structure-Dependent & Operation-Centric Data Access Methods. (19 slides PPT) - [2002]

THE INFORMATION ACCESS ACCELERATOR
Ralph Stout's de-mystifying account of the relevance of XSP Technology to real world information access needs. (39 slides PPT) - [11/04/05] (34 pages PDF) - [11/04/05]

MANAGING DATA MATHEMATICALLY: Data As A Mathematical Object
August 2006 invited presentation given at Microsoft Research. (39 slides PPT) - [8/14/06]
SKOLEM: TWO REMARKS ON SET THEORY (The ordered n-tuples as sets)
Skolem concludes: "I shall not pursue these considerations here, but only emphasize that it is still a problem how the ordered n-tuple can be defined in the most suitable way." MATH. SCAND, 5 (1957) 40-46 (7 pages PDF)

A RELATIONAL MODEL OF DATA FOR LARGE SHARED DATA BANKS
The much misunderstood relationship between "set theory", "n-tuples", "n-ary relations", "relationships", and "arrays" was well articulated by Codd in this paper. CACM 13, No.6(June) 1970 (pages 379-380 PDF)

PROFESSIONAL PROFILE of D L Childs