Proactively Reconfigurable, Adaptive, Reliable Middleware
The MEAD
system aims to enhance distributed middleware
applications with new capabilities such as (i) transparent,
yet tunable, fault tolerance with configurable performance
and timing guarantees, (ii) proactive
dependability, (iii) resource-aware system adaptation to crash,
communication and timing faults, and (iv) scalable and fast
fault detection and fault recovery. As a part of the research
on MEAD, we are investigating failure prediction, zero-downtime
software upgrades, automated fingerpointing in distributed
systems, and resource-constrained (embedded) survivability.
PEOPLE
SOFTWARE
The open-source software releases and documentation for the
MEAD system are
available for download. The documentation
includes detailed instructions for installation, compilation
and runtime execution. Technical support for using the MEAD
system is also available through the
Support Request Form.
Empirical evaluations of MEAD (and its underlying group communication system,
Spread) can be found in our
publications, as well as at
the Lockheed Martin ATL QoS page.
PUBLICATIONS
- Living with Nondeterminism in Replicated Middleware Applications
J. G. Slember and P. Narasimhan,
ACM/IFIP/USENIX Conference on Middleware, Melbourne,
Australia, November 2006.
- Nondeterminism in ORBs: The perception and the reality
J. G. Slember and P. Narasimhan,
DEXA Workshop on High Availability of Distributed Systems,
Krakow, Poland, September 2006.
- Impact-Sensitive Framework for Dynamic Change-Management
T. A. Dumitras, D. Rosu, A. Dan and P. Narasimhan,
DSN Workshop on Architecting Dependable Systems, Philadelphia, PA,
USA, June 2006.
- Causes of Failure in Web Applications
S. Pertet and P. Narasimhan,
Carnegie Mellon University Parallel Data Lab Technical Report, CMU-PDL-05-109,
December 2005.
- Fault Tolerant Middleware and the Magical 1%
T. A. Dumitras and P. Narasimhan,
ACM/IFIP/USENIX Conference on Middleware, Grenoble, France,
November-December 2005.
- MEAD: Support for Real-Time Fault-Tolerant CORBA
P. Narasimhan, T. A. Dumitras, A. M. Paulos, S. M. Pertet, C. F. Reverte,
J. G. Slember and D. Srivastava,
Concurrency and Computation: Practice and Experience, vol. 17,
no. 12, 2005, pp. 1527-1545;
Copyright 2005 John Wiley and Sons
- Architecting and Implementing Versatile Dependability
T. A. Dumitras, D. Srivastava and P. Narasimhan,
Architecting Dependable Systems Vol. III, edited by
Cristina Gacek, Alexander Romanovsky and Rogerio de Lemos,
Springer-Verlag, 2005
- Handling Cascading Failures: The Case for
Topology-Aware Fault-Tolerance
S. Pertet and P. Narasimhan, DSN Workshop on Hot Topics in System
Dependability, Yokohama, Japan, June 2005.
- Proactive Problem Determination in
Transaction-Oriented Applications
S. Pertet, P. Narasimhan, A. Sailer and G. Kar, DSN Fast Abstract,
Yokohama, Japan, June 2005.
- Using Program Analysis to Identify and Compensate for Nondeterminism
in Fault-Tolerant, Replicated Systems
J. G. Slember and P. Narasimhan,
IEEE Symposium on Reliable Distributed Systems, Florianopolis,
Brazil, October 2004, pp. 251-263.
- Proactive Fault-Recovery in Distributed Systems
S. M. Pertet,
Master's Thesis, Department of Electrical & Computer Engineering,
Carnegie Mellon University, May 2004
- Automated Configuration of Fault Tolerance in Distributed Systems
C. F. Reverte,
Master's Thesis, Department of Electrical & Computer Engineering,
Carnegie Mellon University, May 2004
- An Architecture for Versatile Dependability
T. A. Dumitras and P. Narasimhan, DSN Workshop on
Architecting Dependable Systems, Florence, Italy, June 2004.
- Proactive Recovery in Distributed CORBA Applications
S. Pertet and P. Narasimhan, IEEE Conference on
Dependable Systems and Networks (DSN), Florence, Italy, June 2004, pp. 357-366.
- Experiences, Approaches and Challenges in Building Fault-Tolerant
CORBA Systems
P. Felber and P. Narasimhan, IEEE Transactions
on Computers, vol. 53, no. 5, May 2004, pp. 497-511
- Decentralized Resource Management and Fault Tolerance for
Distributed CORBA Applications
C. F. Reverte and P. Narasimhan, IEEE Workshop on Object-oriented
Real-time Dependable Systems, Capri Island, Italy, October 2003
- Estimating Fault-Detection and Fail-Over Times for
Nested Real-Time CORBA Applications
S. Ratanotayanon and P. Narasimhan, International Conference
on Parallel and Distributed Processing Techniques and Applications,
Las Vegas, NV, June 2003
- A Middleware for Dependable Distributed Real-Time Systems
(Raytheon Company Best Paper Award)
T. D. Bracewell and P. Narasimhan, Joint Systems and Software
Engineering Symposium, Falls Church, VA, April 2003
- Middleware for Embedded Adaptive
Dependability
P. Narasimhan, C. F. Reverte, S. Ratanotayanon
and G. S. Hartman, IEEE Workshop on Large Scale Real-Time
and Embedded Systems, Austin, TX, December 2002
- Trade-Offs Between Real-Time and
Fault Tolerance for Middleware Applications
P. Narasimhan, Workshop on Foundations of Middleware
Technologies, Irvine, CA, November 2002
PRESENTATIONS
- Living Realistically with Nondeterminism in Fault-Tolerant
Replicated Applications
J. Slember and P. Narasimhan, OMG Real-time Workshop, Washington D.C.,
July 2005
- Moving from Fault-Tolerant CORBA to Fault-Tolerant CCM
D. Srivastava, A. Paulos and P. Narasimhan,
TAO/CIAO Workshop, Washington D.C., July 2004
- Developing Fault-Tolerant CCM
T. D. Bracewell and P. Narasimhan, Workshop on CCM,
Nashville, TN, December 2003
- The MEAD Architecture for Real-Time Fault-Tolerant
Middleware
P. Narasimhan, S. Ratanotayanon and C. F. Reverte,
OMG Real-Time and Distributed Object Computing
Workshop, Washington D.C., July 2003
- MEAD: Real-time Fault-Tolerant Support for
Middleware
P. Narasimhan, OMG Real-Time Special Interest
Group Meeting on Fault Tolerance, Orlando, FL, March 2003
- Evolving the CORBA Standard to Support New Distributed Real-Time and
Embedded Systems
T. D. Bracewell and P. Narasimhan,
OMG Real-Time Special Interest
Group Meeting on Fault Tolerance, Orlando, FL, March 2003
- Providing Both Real-Time and Fault Tolerance for
CORBA Applications
P. Narasimhan, OMG Real-Time and Distributed Object Computing
Workshop, Arlington, VA, July 2002.
POSTERS
- Proactive Problem Determination: Determining the root
cause of performance degradation in e-Commerce systems
A. Sailer, G. Kar, S. Pertet and P. Narasimhan, IBM Academy's 3rd Proactive Problem Prediction,
Avoidance, and Diagnosis Conference, April 2005
- MEAD Overview: Architectural
overview of the components of the MEAD real-time fault-tolerant
middleware system, DARPA PCES-II PI Meeting, December 2003.
- Proactive Dependability: Concepts underlying MEAD's
proactive fault-recovery approach
- Fault-Tolerance Advisor: Concepts underlying MEAD's fault-tolerance
configuration/deployment advice for distributed applications
SPONSORS
- National Science Foundation - Integrated Real-Time and
Fault-Tolerance Support for Middleware Applications, Priya Narasimhan (PI)
- DARPA Programmable Composition of Embedded Systems (PCES-II) Program -
Middleware for Embedded Adaptive Dependability, Priya Narasimhan (PI)
- General Motors
- Siemens Corporate Research
WHY MEAD?
Mead, the legendary ambrosia of the Vikings, was believed to endow
its imbibers with immortality (i.e., dependability), reproductive
capabilities (i.e., replication), the wisdom for weaving poetry
(i.e., cross-cutting aspects of real-time and fault tolerance) and
a happy and long married life (i.e., partition-tolerance).
If you have any questions/comments on this webpage, please email priya@cs.cmu.edu.
Last modified: Mon Mar 27 19:15:42 EST 2006