What is DObjects?
DObjects is an easy-to-use and scalable
distributed data objects framework for
heterogeneous systems being developed at Emory
University. The framework has a number of unique
features and contributions that distinguish it
from current solutions.
First, the system employs a novel distributed and
decentralized architecture. It uses a
metacomputing platform as a resource sharing
substrate. The system consists of independent
nodes which form a virtual abstraction of a
single database management system. Each node can
either be responsible for retrieving data from
data sources through data adapters or simply
provide computational power which is used by
others. The system scales well from a few nodes
to thousands of nodes.
Second, the system provides convenient and
transparent querying mechanisms for data objects.
Front-end users build queries and get back an
array of data objects called persistent entities
as the response to the queries. The DObjects query
language strictly follows an object-oriented style
of data representation. When creating a query,
users can constrain the results in two ways: by
the attributes to populate (the system fills in
only the requested attributes of the resulting
persistent entities) and by conditions that the
returned data objects must satisfy. Data from
multiple heterogeneous sources are joined
transparently from the user's point of view. Each
persistent entity is a wrapper for data obtained
from a data source.
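The two kinds of constraint described above can be
sketched in a few lines of Java. This is only an
illustration of the idea, not the DObjects query
language or API: the class EntityQuerySketch, its
run method and the use of plain maps to stand in
for persistent entities are all assumptions made
for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration only: these names are not the DObjects API.
final class EntityQuerySketch {
    private final List<String> attributes;                  // attributes to populate
    private final Predicate<Map<String, Object>> condition; // filter on source records

    EntityQuerySketch(List<String> attributes,
                      Predicate<Map<String, Object>> condition) {
        this.attributes = attributes;
        this.condition = condition;
    }

    // Returns one "persistent entity" (here just a map) per matching record,
    // filled in with the requested attributes only.
    List<Map<String, Object>> run(List<Map<String, Object>> records) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> record : records) {
            if (!condition.test(record)) {
                continue; // record does not satisfy the query condition
            }
            Map<String, Object> entity = new LinkedHashMap<>();
            for (String name : attributes) {
                if (record.containsKey(name)) {
                    entity.put(name, record.get(name));
                }
            }
            result.add(entity);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = List.of(
            Map.<String, Object>of("name", "Ann", "age", 34, "city", "Atlanta"),
            Map.<String, Object>of("name", "Bob", "age", 19, "city", "Boston"));
        EntityQuerySketch query = new EntityQuerySketch(
            List.of("name", "city"),
            r -> (Integer) r.get("age") > 21);
        System.out.println(query.run(records));
    }
}
```

In this sketch the "age" attribute is used by the
condition but is not copied into the result,
because it was not among the requested attributes.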
Third, the system implements a distributed query
optimization scheme based on a cost space model
that optimizes both throughput and response time
of the system. Every time a query is executed, it
is deployed on some nodes of the system. The
deployment process is driven by various factors,
such as desired data placement, current load of
particular nodes and distance between nodes in
terms of network latency.
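To make the deployment factors above concrete,
here is a minimal Java sketch that picks a node by
combining data placement, load and latency into a
single cost. It is not the DObjects optimizer or
its cost space model: the weighted sum, the weight
values and all names (NodePlacementSketch,
Candidate, cheapest) are assumptions chosen for
illustration.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: not the DObjects optimizer or its cost model.
final class NodePlacementSketch {

    // One candidate node, as seen from the node that deploys the query.
    static final class Candidate {
        final String name;
        final double load;       // current load, normalized to [0, 1]
        final double latencyMs;  // network latency to the node
        final boolean holdsData; // true if the needed data is already there

        Candidate(String name, double load, double latencyMs, boolean holdsData) {
            this.name = name;
            this.load = load;
            this.latencyMs = latencyMs;
            this.holdsData = holdsData;
        }
    }

    // Assumed weights; a real system would tune or derive these.
    static final double LOAD_WEIGHT = 100.0;        // cost per unit of load
    static final double LATENCY_WEIGHT = 1.0;       // cost per millisecond
    static final double REMOTE_DATA_PENALTY = 50.0; // cost of shipping data

    // Lower cost means a more attractive node for this piece of the query.
    static double cost(Candidate c) {
        double dataCost = c.holdsData ? 0.0 : REMOTE_DATA_PENALTY;
        return LOAD_WEIGHT * c.load + LATENCY_WEIGHT * c.latencyMs + dataCost;
    }

    static Candidate cheapest(List<Candidate> candidates) {
        return candidates.stream()
            .min(Comparator.comparingDouble(NodePlacementSketch::cost))
            .orElseThrow(() -> new IllegalArgumentException("no candidate nodes"));
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
            new Candidate("A", 0.9, 5.0, true),    // close, but heavily loaded
            new Candidate("B", 0.2, 20.0, false),  // idle, but data is remote
            new Candidate("C", 0.1, 40.0, true));  // far away, idle, has data
        System.out.println(cheapest(candidates).name); // prints C
    }
}
```

With these assumed weights the costs are 95, 90
and 50, so the distant but idle node that already
holds the data is chosen.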
Finally, the system is a general-purpose
distributed data objects framework. It supports
several query types, such as ad hoc queries, batch
queries and continuous queries, for a variety of
applications.

The figure above presents our vision of a
deployed DObjects framework. The system has no
centralized services, which relieves system
administrators of the burden of maintaining
them. It uses the
metacomputing paradigm as a resource sharing
substrate. Each node in the system provides its
computational power that can be used by others
during query execution. In addition, nodes can
run data adapters which pull data from external
data sources and transform it into the uniform
format expected when building query responses.
Front-end users can connect to any system node;
however, while the physical connection is
established between a client and one of the
system nodes, the logical connection is
established between a client node and a virtual
database system consisting of all the
participating nodes.
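The role of a data adapter described above,
pulling data from an external source and
converting it to a uniform format, can be sketched
as follows. This is not the DObjects adapter
interface: DataAdapterSketch, DelimitedTextAdapter
and the choice of a delimited-text source are
assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only: not the DObjects adapter interface.
interface DataAdapterSketch {
    // Pulls records from one source and returns them in the uniform format.
    List<Map<String, Object>> fetch();
}

// Adapter for an assumed source that yields delimited text lines.
final class DelimitedTextAdapter implements DataAdapterSketch {
    private final List<String> lines;          // raw rows from the source
    private final char delimiter;              // column separator in each row
    private final List<String> attributeNames; // uniform attribute per column

    DelimitedTextAdapter(List<String> lines, char delimiter,
                         List<String> attributeNames) {
        this.lines = lines;
        this.delimiter = delimiter;
        this.attributeNames = attributeNames;
    }

    @Override
    public List<Map<String, Object>> fetch() {
        // Quote the delimiter so characters such as '|' are taken literally.
        String regex = Pattern.quote(String.valueOf(delimiter));
        List<Map<String, Object>> records = new ArrayList<>();
        for (String line : lines) {
            String[] cells = line.split(regex, -1);
            Map<String, Object> record = new LinkedHashMap<>();
            // Map each source column onto its uniform attribute name.
            for (int i = 0; i < attributeNames.size() && i < cells.length; i++) {
                record.put(attributeNames.get(i), cells[i].trim());
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        DataAdapterSketch adapter = new DelimitedTextAdapter(
            List.of("1; Ann", "2; Bob"), ';', List.of("id", "name"));
        System.out.println(adapter.fetch());
    }
}
```

Because every adapter in this sketch returns the
same record shape, code that builds query
responses does not need to know which kind of
source a record came from.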
What do I need to know to use the system?
In general, even though DObjects is a distributed
framework, you do not need to be an expert in
distributed systems to use it. It exposes a simple
end-user interface modeled on a classical
database interface that can be queried. The only
difference is that you have to use the query
language the system provides, which is not SQL
but a more object-oriented language.
What are the benefits of using this framework?
DObjects is a framework that facilitates the
integration of data from multiple heterogeneous
sources. In the system configuration you specify
how data from different sources should be mapped
to a global, uniform schema. After this step, you
can start the system and query for data. Even
though your connection is between the client and
one of the system nodes, you can use the whole
virtual database system: for example, you can use
the computational power of all the nodes, or
query and operate on data from any data source
uniformly.
System requirements
DObjects is written on top of the H2O Java
metacomputing framework, so it is platform
independent. It has been tested on Windows, Linux
and SunOS and ran well on each. You only need to
configure your firewall so that the distributed
components can communicate with each other.