CS7700

Data Intensive Distributed Computing

Fall 2006 - Paper Reading List

 

 

Background:

 

  1. T. Hey, A. Trefethen. "The Data Deluge: An e-Science Perspective", in Grid Computing - Making the Global Infrastructure a Reality, chapter 36, pp. 809-824. Wiley and Sons.
  2. W. E. Johnston, "High-Speed, Wide Area, Data Intensive Computing: A Ten Year Retrospective", 7th IEEE Symposium on High Performance Distributed Computing, July 29-31, 1998, Chicago, IL.
  3. I. Foster and C. Kesselman, "Computational Grids", in The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1999.
  4. A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets", Journal of Network and Computer Applications, 23:187-200, 2001.

 

Applications:

 

  1. J. Lee, B. Tierney, W. E. Johnston, "Data Intensive Distributed Computing: A Medical Application Example", HPCN Europe 1999, pp. 150-158.
  2. K. Holtman, "CMS Data Grid System Overview and Requirements", CMS Note 2001/037, CERN, July 2001.
  3. B. Spencer Jr., T.A. Finholt, I. Foster, C. Kesselman, et al., "NEESgrid: A Distributed Collaboratory for Advanced Earthquake Engineering Experiment and Simulation", 13th World Conference on Earthquake Engineering, August 2004.
  4. S. Barnard, R. Biswas, S. Saini, R. Van der Wijngaart, M. Yarrow, L. Zechter, I. Foster, O. Larsson, "Large-Scale Distributed Computational Fluid Dynamics on the Information Power Grid using Globus", Proceedings of Frontiers’99, 1999.

 

Grid Toolkits:

 

  1. I. Foster, C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit", International Journal of Supercomputer Applications, 11(2):115-128, 1997.
  2. D. Thain, T. Tannenbaum, and M. Livny, "Condor and the Grid", in Grid Computing: Making the Global Infrastructure a Reality, John Wiley, 2003.
  3. G. Allen, K. Davis, T. Goodale, A. Hutanu, et al., "The Grid Application Toolkit: Toward Generic and Easy Application Programming Interfaces for the Grid", Proceedings of the IEEE, Volume 93, Issue 3, pp. 534-550, March 2005.
  4. K. Seymour, A. Yarkhan, S. Agrawal, J. Dongarra, "NetSolve: Grid Enabling Scientific Computing Environments", Grid Computing and New Frontiers of High Performance Processing, Elsevier Press, Advances in Parallel Computing, 14, 2005.

 

Distributed Storage:

 

  1. B. Tierney, J. Lee, B. Crowley, M. Holding, J. Hylton, F. L. Drake, "A Network-Aware Distributed Storage Cache for Data Intensive Environments", in Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, pages 185-193, Redondo Beach, CA, August 1999.
  2. D. Teaff, R. W. Watson, and R. A. Coyne, "The Architecture of the High Performance Storage System (HPSS)", Proceedings of the Goddard Conference on Mass Storage and Technologies, College Park, MD, March 1995.
  3. A. Rajasekar, M. Wan, and R. Moore, "MySRB & SRB - Components of a Data Grid", the 11th International Symposium on High Performance Distributed Computing (HPDC-11), Edinburgh, Scotland, July 24-26, 2002.
  4. A. Shoshani, A. Sim, J. Gu, "Storage Resource Managers: Middleware Components for Grid Storage", Proceedings of the Nineteenth IEEE Symposium on Mass Storage, 2002.

 

Grid File Systems:

 

  1. F. Schmuck and R. Haskin, "GPFS: A Shared-Disk File System for Large Computing Clusters", in Proceedings of the 1st USENIX Conference on File and Storage Technologies, Monterey, CA, January 28-30, 2002.
  2. P. H. Carns, W. B. Ligon, R. B. Ross, and R. Thakur, "PVFS: A Parallel File System for Linux Clusters", Proceedings of the 4th Annual Linux Showcase and Conference, 2000.
  3. J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, et al., "OceanStore: An Architecture for Global-Scale Persistent Storage", in Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), November 2000.
  4. O. Tatebe, Y. Morita, S. Matsuoka, N. Soda, and S. Sekiguchi, "Grid Datafarm Architecture for Petascale Data Intensive Computing", Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), pp. 102-110, 2002.

 

Remote I/O:

 

  1. I. Foster, D. Kohr, R. Krishnaiyer, J. Mogill, "Remote I/O: Fast Access to Distant Storage", Proc. Workshop on I/O in Parallel and Distributed Systems (IOPADS), pp. 14-25, 1997.
  2. J. Lee, X. Ma, R. Ross, R. Thakur, and M. Winslett, "RFS: Efficient and Flexible Remote File Access for MPI-IO", Proceedings of the International Conference on Cluster Computing, 2004.

 

High Performance Data Transfers:

 

  1. B. Allcock, J. Bester, J. Bresnahan, et al., "Data Management and Transfer in High Performance Computational Grid Environments", Parallel Computing Journal, Vol. 28 (5), May 2002, pp. 749-771.
  2. S. Vazhkudai, J. M. Schopf, and I. Foster, "Predicting the Performance of Wide Area Data Transfers", Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), April 2002.
  3. E. He, J. Leigh, O. Yu, and T. A. DeFanti, "Reliable Blast UDP: Predictable High Performance Bulk Data Transfer", IEEE Cluster Computing Conference, Chicago, IL, 2002.
  4. T. Kelly, "Scalable TCP: Improving Performance in Highspeed Wide Area Networks", ACM SIGCOMM Computer Communication Review, 2003.

 

Data Staging and Replication:

 

  1. W. R. Elwasif, J. S. Plank, and R. Wolski, "Data Staging Effects in Wide Area Task Farming Applications", IEEE International Symposium on Cluster Computing and the Grid, Brisbane, Australia, May, 2001.
  2. D. Aksoy, M. J. Franklin, S. Zdonik, "Data Staging for On-Demand Broadcast", Proceedings of Very Large Databases (VLDB), 2001.
  3. H. Stockinger, A. Samar, B. Allcock, I. Foster, K. Holtman, and B. Tierney, "File and Object Replication in Data Grids", Proceedings of the Tenth International Symposium on High Performance Distributed Computing (HPDC-10), IEEE Press, August 2001.
  4. A. Chervenak, B. Schwartzkopf, H. Stockinger, et al., "Giggle: A Framework for Constructing Scalable Replica Location Services", Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, Baltimore, Maryland, 2002.

 

Traditional Scheduling:

 

  1. V. Hamscher, U. Schwiegelshohn, A. Streit, and R. Yahyapour, "Evaluation of Job Scheduling Strategies for Grid Computing", Grid Workshop at 7th International Conference on High Performance Computing (HiPC-2000), Bangalore, India, LNCS 1971, pp. 191 – 202.
  2. K. Ranganathan, and I. Foster, "Computation Scheduling and Data Replication Algorithms for Data Grids", Grid Resource Management: State of the Art and Future Trends, Kluwer Academic Publishers, 2003.
  3. F. D. Berman, R. Wolski, S. Figueira, J. Schopf, and G. Shao, "Application-level scheduling on distributed heterogeneous networks", Proceedings of the 1996 ACM/IEEE conference on Supercomputing, 1996.
  4. A. Alhusaini, V. K. Prasanna, and C.S. Raghavendra, "A Unified Resource Scheduling Framework for Heterogeneous Computing Environments", in Proceedings of the Heterogeneous Computing Workshop, pages 156-165, San Juan, PR, April 1999.

 

Data Management and Co-scheduling:

 

  1. T. Kosar and M. Livny, "A Framework for Reliable and Efficient Data Placement in Distributed Computing Systems", Journal of Parallel and Distributed Computing, Volume 65, Issue 10, October 2005.
  2. D. Thain, J. Basney, S.C. Son, and M. Livny, "The Kangaroo Approach to Data Movement on the Grid", Tenth IEEE Symposium on High Performance Distributed Computing (HPDC10), San Francisco, California, August 7-9, 2001.
  3. A. Romosan, D. Rotem, A. Shoshani, and D. Wright, "Co-Scheduling of Computation and Data on Computer Clusters", in Proceedings of SSDBM 2005, pp.103-112.
  4. J. Basney and M. Livny, "Improving Goodput by Co-scheduling CPU and Network Capacity", in Proceedings of International Conference on High Performance Distributed Computing (HPDC), 1999.

Visualization:

  1. W. Allcock, J. Bester, J. Bresnahan, I. Foster, J. Gawor, J. A. Insley, J. M. Link, and M. E. Papka, "GridMapper: A Tool for Visualizing the Behavior of Large-Scale Distributed Systems", 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11), pp. 179-187, Edinburgh, Scotland, July 24-26, 2002.
  2. N. Karonis, M. Papka, J. Binns, J. Bresnahan, J. Insley, D. Jones, and J. Link, "High-Resolution Remote Rendering of Large Datasets in a Collaborative Environment", Future Generation Computer Systems (FGCS), 2003.
  3. C. Zhang, J. Leigh, T. A. DeFanti, M. Mazzucco, and R. Grossman, "TeraScope: Distributed Visual Data Mining of Terascale Data Sets over Photonic Networks", Future Generation Computer Systems (FGCS), 2003.
  4. J. Leigh, T. DeFanti, R. Singh, F. Karayannis, "TeraVision: a High Resolution Graphics Streaming Device for Amplified Collaboration Environments", Future Generation Computer Systems (FGCS), 2003.

Workflow Management:

 

  1. P. Couvares, T. Kosar, A. Roy, J. Weber, and K. Wenger, "Workflow Management in Condor", to appear in Workflows for e-Science, Springer Press, 2006.
  2. B. Ludascher, I. Altintas, C. Berkley, D. Higgins, et al., "Scientific Workflow Management and the Kepler System", Concurrency and Computation: Practice & Experience, Special Issue on Scientific Workflows, 2005.
  3. I. Foster, J. Voeckler, M. Wilde, and Y. Zhao, "Chimera: A Virtual Data System for Representing, Querying and Automating Data Derivation", Proceedings of the 14th Conference on Scientific and Statistical Database Management, Edinburgh, Scotland, July 2002.
  4. E. Deelman, J. Blythe, Y. Gil, C. Kesselman, et al., "Pegasus: Mapping Scientific Workflows onto the Grid", Across Grids Conference, Nicosia, 2004.

 

Future Challenges:

 

  1. DOE-Office of Science, "The Data Management Challenge", Report from the DOE Office of Science Data-Management Workshops, March-May 2004.
  2. NSF, "Research Challenges in Distributed Computer Systems", NSF Report, 2005.

 

Reference Papers*: (*These papers will not be discussed in class, but they are good reference and background reading.)

 

  1. T. Kosar, "Data Placement in Widely Distributed Systems", Ph.D. Thesis, University of Wisconsin-Madison, August 2005.
  2. J.H. Saltzer, D.P. Reed, and D.D. Clark, "End-To-End Arguments in System Design", ACM Trans. on Computer Systems 2, 4, November 1984, pp. 277-288.
  3. J.M. Schopf and B. Nitzberg, "Grids: Top Ten Questions", Scientific Programming, special issue on Grid Computing, 10(2):103 - 111, August 2002.
  4. S. Venugopal, R. Buyya, and K. Ramamohanarao, "A Taxonomy of Data Grids for Distributed Data Sharing, Management, and Processing", ACM Computing Surveys (CSUR), 2006.
  5. J. Yu and R. Buyya, "A Taxonomy of Scientific Workflow Systems for Grid Computing", SIGMOD Record, 2005.