Grid Computing, IFSC, July 2005

Sergio Takeo Kofuji, Prof. Dr., EPUSP

Computational simulation: “a means of scientific discovery that employs a computer system to simulate a physical system according to laws derived from theory and experiment”.

Transcript

  • Grid Computing, IFSC, July 2005. Sergio Takeo Kofuji, Prof. Dr., EPUSP

  • Motivation. Sergio Takeo Kofuji

  • Three pillars of scientific understanding

    Theory; Experiment; Simulation (“theoretical experiments”)

  • Can simulation produce more than insight?

    “The purpose of computing is insight, not numbers.” R. W. Hamming (1961)

    “What changed were simulations that showed that the new ITER design will, in fact, be capable of achieving and sustaining burning plasma.” R. L. Orbach (2003, in Congressional testimony about why the U.S. is rejoining the International Thermonuclear Experimental Reactor (ITER) consortium)

    “The computer literally is providing a new window through which we can observe the natural world in exquisite detail.” J. S. Langer (1998)

  • Can simulation lead to scientific discovery?

    Images c/o R. Cheng (left), J. Bell (right), LBNL, and NERSC. 2003 SIAM/ACM Prize in CS&E (J. Bell & P. Colella).

    Instantaneous flame front imaged by density of inert marker. Instantaneous flame front imaged by fuel concentration.

  • The imperative of simulation

    Scientific Simulation. Environment: global climate, contaminant transport. Applied Physics: radiation transport, supernovae.

    Experiments controversial. Experiments dangerous. Experiments prohibited or impossible. Experiments difficult to instrument. Experiments expensive.

    In these, and many other areas, simulation is an important complement to experiment.

  • What would scientists do with 100-1000x? Example: predicting future climates

    Resolution: refine horizontal from 160 to 40 km; refine vertical from 105 to 15 km.

    New physics: atmospheric chemistry; carbon cycle (currently, carbon release is an external driver); dynamic terrestrial vegetation (nitrogen and sulfur cycles and land-use and land-cover changes).

    Improved representation of subgrid processes: clouds; atmospheric radiative transfer.
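The horizontal refinement quoted above (160 km to 40 km) can be turned into a rough cost estimate. The sketch below is illustrative arithmetic only. It assumes that cost grows with the number of grid columns and, optionally, that the time step shrinks in proportion to the grid spacing. Neither assumption comes from the slides, and the function names are hypothetical.

```python
def refinement_factor(coarse_km: float, fine_km: float) -> float:
    """Ratio by which the grid spacing shrinks along one dimension."""
    return coarse_km / fine_km


def relative_cost(horizontal_factor: float, time_step_scales: bool = True) -> float:
    """Relative cost of refining both horizontal dimensions by the same factor.

    Assumptions (not from the slides): cost is proportional to the number of
    grid columns (factor squared); if time_step_scales is True, the time step
    shrinks with the grid spacing, which adds one more power of the factor.
    """
    cost = horizontal_factor ** 2
    if time_step_scales:
        cost *= horizontal_factor
    return cost


h = refinement_factor(160.0, 40.0)
print(h)                                         # 4.0
print(relative_cost(h, time_step_scales=False))  # 16.0
print(relative_cost(h))                          # 64.0
```

Under these assumptions the horizontal refinement alone costs a factor of 16 to 64, before any vertical refinement or new physics is added.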

  • What would we do with 100-1000x more? Example: predict future climates

    Resolution of Kuroshio Current: simulations at various resolutions have demonstrated that, because equatorial meso-scale eddies have diameters of ~10-200 km, the grid spacing must be < 10 km to adequately resolve the eddy spectrum. This is illustrated in four images of the sea-surface temperature. Figure (a) shows a snapshot from satellite observations, while the three other figures are snapshots from simulations at resolutions of (b) 2°, (c) 0.28°, and (d) 0.1°.

  • What would scientists do with 100-1000x? Example: lattice QCD

    Currently available: 1 Tflop/s.

    Resources at the 100-200 Tflop/s level will: enable precise calculation of electromagnetic form factors characterizing the distribution of charge and current in the nucleon; make possible calculation of the quark structure of the nucleon; enable calculation of transitions to excited nucleon states.

    Pflop/s resources would: enable study of the gluon structure of the nucleon, in addition to its quark structure; allow precision calculation of the spectroscopy of strongly interacting particles with unconventional quantum numbers, guiding experimental searches for states with novel quark and gluon structure.

  • What would we do with 100-1000x more? Example: probe the structure of particles

    Constraints on the Standard Model parameters ρ and η. For the Standard Model to be correct, they must be restricted to the region of overlap of the solidly colored bands. The figure on the left shows the constraints as they exist today. The figure on the right shows the constraints as they would exist with no improvement in the experimental errors, but with lattice gauge theory uncertainties reduced to 3%.

  • Workflow

    Examples: Astronomy and Public Health

    Collect. Astronomy: telescope. Public Health: microscope, stethoscope, survey.

    COLLECT. Astronomy: National Virtual Observatory / COMPLETE. Public Health: CDC Wonder.

    Analyze. Astronomy: study the density structure of a star-forming glob of gas. Public Health: find a link between one factory's chlorine runoff & disease.

    ANALYZE. Astronomy: study the density structure of all star-forming gas in … Public Health: study the toxic effects of chlorine runoff in the U.S.

    Collaborate. Work with your student.

    COLLABORATE. Work with 20 people in 5 countries, in real time.

    Respond. Write a paper for a journal.

    RESPOND. Write a paper, the quantitative results of which are shared globally, digitally.

  • Workflow, a.k.a. The Scientific Method (in the Age of High-Speed Networks, Fast Processors, Mass Storage, and Miniature Devices)

    IIC contact: Matt Welsh, FAS

  • GlobalMMCS Web Service MCU Architecture

    Gateways convert to uniform XGSP messaging. High performance (RTP) and XML/SOAP and ..

    Use multiple media servers to scale to many codecs and many versions of audio/video mixing. NB scales as distributed Web Services.

    NaradaBrokering Web Services: Session Control; Audio Mixer; Video Mixer; Codec Conversion; Helix Real Streaming; PDA Conversion; H323/SIP Gateways; Thumbnails. Plus NaradaBrokering message servers and routers.

    Release May 15. As independent services, these can replicate as needed. Example of stream handling: needs a Grid Farm.
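The architecture above routes media and control traffic through message brokers, so that mixers, converters, and gateways can be added or replicated independently. The following is a minimal in-process sketch of topic-based publish/subscribe, the general pattern such brokers follow. It is not NaradaBrokering's API; the class, method, and topic names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str, bytes], None]


class TopicBroker:
    """Toy broker: delivers each published payload to every subscriber of a topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to all subscribers; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


received: List[Tuple[str, str, bytes]] = []
broker = TopicBroker()
# Two independent consumers of the same (hypothetical) audio stream.
broker.subscribe("session-1/audio", lambda t, p: received.append(("mixer", t, p)))
broker.subscribe("session-1/audio", lambda t, p: received.append(("archive", t, p)))

delivered = broker.publish("session-1/audio", b"frame-0")
unheard = broker.publish("session-1/video", b"frame-0")
print(delivered, unheard)  # 2 0
```

Because publishers address only a topic, a second mixer or a thumbnail service can subscribe without any change to the publisher, which is the property the slide relies on for replication.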

  • Integration of PDA, Cell Phone, and Desktop Grid Access

  • GRID

    Technical and Scientific Computing: Processing, Data, Visualization, Instrumentation; High-Performance Computers, High-Speed Networks; Restricted Access; Users: Specialists.

    Commercial Computing.

  • Introduction. Sergio Takeo Kofuji

  • Grid Levels

    Single Administrative Domain; Departmental; Campus; Corporate; Regional; State; National; Global

  • Application Grids

    Medical & Health Grid; Multimedia Digital Library; Sensor Grid; Pervasive Computing Grid; Physics, Biotechnology, and Environmental Grids; Collaboration Grid; Media (Film) Production and Distribution Grid; Tracking of Objects/Animals/People; Games

  • GRID - Types

    Processing (Virtual Supercomputer, Virtual Cluster, etc.); Data/Storage; Instrumentation and Sensor; Visualization; Collaboration

  • GRID - Physical Components

    Equipment: Vector/Parallel Supercomputers (NEC SX7, CRAY); MPPs; Clusters; (SMP) Servers; Desktops; Notebooks; Palmtops; Cell Phones; Sensors and Actuators; RFIDs

  • GRID - Trends

    Pervasive; Heterogeneous; Dynamic; Service-Oriented

  • Pervasive Access to the Grid: The Wall

  • Grids - Evolution

    1st Generation: computationally intensive, file access/transfer; bag of various heterogeneous protocols & toolkits; recognizes the Internet, ignores the Web; academic teams.

    2nd Generation: data intensive; knowledge intensive; service-based architecture; recognizes the Web and Web services; Global Grid Forum; industry participation.

  • Grid - Simplified Architecture

  • Open Grid Standards

    Chart: increased functionality and standardization over time, starting from custom solutions (time axis 1990 to 2010).

  • Grid and Web Services - Convergence

    Grid, Web, WSRF: the Grid and Web communities can move to a common base.

  • Service Orientation

    Built on the concepts of Service and Messages.

    A service is the logical manifestation of some physical or logical resources (like databases, programs, devices, humans, etc.) and/or some application logic that is exposed to the network and

    A message is a unit of communication for exchanging information. All communication between services is facilitated by the sending and receiving of messages.

    From Savas Parastatidis

  • Service

    Contract: describes the format of the messages exchanged; defines the message exchange patterns in which a service is prepared to participate.

    Policy: declaratively describes service interaction requirements, quality of service, security, etc.

    Focus on messages (message-orientation).

    From Savas Parastatidis
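The idea that a service is defined by its contract (the messages it accepts) rather than by its internals can be sketched in a few lines. This is a toy illustration under stated assumptions: the `Contract` and `Service` classes, the `submit_job` message type, and its fields are hypothetical. Real Web services would express the contract in WSDL and XML Schema rather than Python dictionaries.

```python
from typing import Any, Callable, Dict

Message = Dict[str, Any]


class Contract:
    """Message formats a service accepts: message type -> required field types."""

    def __init__(self, formats: Dict[str, Dict[str, type]]) -> None:
        self.formats = formats

    def validate(self, message: Message) -> None:
        kind = message.get("type")
        if kind not in self.formats:
            raise ValueError(f"unsupported message type: {kind!r}")
        for name, expected in self.formats[kind].items():
            if not isinstance(message.get(name), expected):
                raise ValueError(f"field {name!r} must be {expected.__name__}")


class Service:
    """Accepts only messages that satisfy its contract, then dispatches them."""

    def __init__(self, contract: Contract,
                 handlers: Dict[str, Callable[[Message], Message]]) -> None:
        self.contract = contract
        self.handlers = handlers

    def receive(self, message: Message) -> Message:
        self.contract.validate(message)
        return self.handlers[message["type"]](message)


def handle_submit(message: Message) -> Message:
    # The reply is itself a message; callers never see the service's internals.
    return {"type": "job_accepted", "executable": message["executable"]}


job_service = Service(
    Contract({"submit_job": {"executable": str, "cpus": int}}),
    {"submit_job": handle_submit},
)

reply = job_service.receive(
    {"type": "submit_job", "executable": "/bin/hostname", "cpus": 4}
)

try:
    job_service.receive({"type": "submit_job", "executable": "/bin/hostname"})
    rejected = False
except ValueError:
    rejected = True  # the "cpus" field required by the contract is missing
```

Policy (quality of service, security requirements) is deliberately left out of this sketch; it would sit alongside the contract as a second declarative description.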

  • Message Exchange between Services

    Service-orientation (and Web Services) helps architects achieve the following properties (but does not guarantee them): scalability, encapsulation, maintenance, re-use, composability, loose coupling, etc.

    From Savas Parastatidis

  • Service-orientation vs Resource-orientation

    Service-orientation; Resource-orientation; Object-orientation

    From Savas Parastatidis

  • Service. From Savas Parastatidis

  • Service-Oriented Application. From Savas Parastatidis

  • A Cluster-based Service-oriented Application. From Savas Parastatidis

  • An Intranet Service-oriented Application. From Savas Parastatidis

  • An Internet-scale Service-oriented Application. From Savas Parastatidis

  • Web Service

    Specifications for: Security; Orchestration; Reliability; Policies; Federation; Management; etc.

    From Savas Parastatidis

  • Peer-to-peer vs Grid

    P2P and Grid share a lot of common ideals. Both are services communicating by messages on shared resources.

    P2Ps tend to be more dynamic than Grids (Grid resources are usually quite static).

    P2P applications are long-lived (i.e. everyone on the network shares a similar goal of file sharing). Grid applications tend to be transient.

    P2Ps often tend to be very fault tolerant. Multiple redundancy tends to be built in.

    Lack of security is a significant difference between P2Ps and Grids.

    P2Ps don't support the idea of VOs effectively (but there is nothing to stop individuals organizing themselves).

  • Grid - Levels

    Desktop Cycle Aggregation: desktop only. United Devices, Entropia, Data Synapse.

    Cluster & Departmental Grids: single owner, platform, domain, file system, and location. SUN SGE, Platform LSF, PBS.

    Enterprise Grids: single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies. SUN SGE EE, Platform Multicluster.

    Global Grids: multiple enterprises, owners, platforms, domains, file systems, locations, and security policies. Legion, Avaki, Globus.

    Graph borrowed from A. Grimshaw
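The four levels above are distinguished mainly by how many enterprises and owners are involved. The sketch below encodes that rule of thumb as a small classifier. It is a simplification for illustration only: the `Deployment` fields and the decision order are assumptions, and real deployments also differ in platforms, file systems, and security policies.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical description of a grid deployment."""
    enterprises: int             # number of distinct organizations involved
    owners: int                  # number of distinct resource owners
    desktops_only: bool = False  # True if only idle desktop cycles are used


def grid_level(d: Deployment) -> str:
    """Map a deployment onto the four levels listed on the slide."""
    if d.enterprises > 1:
        return "Global Grid"
    if d.owners > 1:
        return "Enterprise Grid"
    if d.desktops_only:
        return "Desktop Cycle Aggregation"
    return "Cluster & Departmental Grid"


print(grid_level(Deployment(enterprises=1, owners=1, desktops_only=True)))
print(grid_level(Deployment(enterprises=1, owners=1)))
print(grid_level(Deployment(enterprises=1, owners=5)))
print(grid_level(Deployment(enterprises=3, owners=12)))
```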

  • OGSA, WSRF, GT4

  • Web Service - Architecture

  • WS Components/Architecture

    HTTP Server: Apache HTTP Server. Application Server: Apache Tomcat. SOAP Engine: Apache AXIS. Web Service: you write this.

    Software stack used by the GT4 WSRF implementation.

  • Web Service - Invocation

  • Web Service: Stateless

  • Web Service: Stateful
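The two slides above contrast stateless and stateful Web services. The sketch below illustrates the distinction under stated assumptions. The stateless operation depends only on its arguments. The stateful variant keeps its state in separate resources addressed by a key, which is the general pattern WSRF follows. Nothing here is GT4 or WSRF API; the `CounterService` class and its method names are hypothetical.

```python
import itertools
from typing import Dict


def add_stateless(a: int, b: int) -> int:
    """Stateless operation: the result depends only on the request."""
    return a + b


class CounterService:
    """Stateful variant: the service code holds no per-client state itself;
    each client's state lives in a resource identified by a key."""

    def __init__(self) -> None:
        self._resources: Dict[str, int] = {}
        self._ids = itertools.count(1)

    def create(self) -> str:
        """Create a new resource and return the key that addresses it."""
        key = f"counter-{next(self._ids)}"
        self._resources[key] = 0
        return key

    def add(self, key: str, amount: int) -> int:
        """Apply the operation to the resource named by key."""
        self._resources[key] += amount
        return self._resources[key]


service = CounterService()
first = service.create()
second = service.create()
service.add(first, 5)
print(add_stateless(5, 3))     # 8, regardless of earlier calls
print(service.add(first, 3))   # 8, because the first resource remembers the 5
print(service.add(second, 3))  # 3, the second resource is independent
```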

  • Access Grid

  • Access Grid Vision

    To create virtual spaces where distributed people can work together.

    Challenges: globally distributed participants; distributed resources (computational, storage, networks, and people); coordination, scheduling, trust; heterogeneous collaboration resources.

    Solution: deploy a set of collaboration resources to serve as the platform for building the rest.

  • Virtual Venues Client

    What can be done: sharing data; shared applications; integration with existing scheduling software.

    Applications: Distributed PowerPoint; Shared Web Browser; Whiteboard; Shared Desktop Tool; Shared Visualization Tools.

    Integrate legacy workflows: APS SER-CAT (beam line controls, data processing, data analysis, archiving/reporting); Fusion Collaboratory (collaborative control room); Atmospheric Science (shared data visualization).

  • The Virtual Venues Client

  • Wide Range of Client Platforms

    Supported hardware:

    - Advanced Node: tiled display, multiple video streams, localized audio
    - Room Node: shared display, multiple video streams, single audio stream
    - Desktop Node: desktop monitor, multiple video streams, single audio stream
    - Laptop Node: laptop display, single video stream, single audio stream
    - Minimal Node: compact display, single video stream, single audio stream

    Supported platforms: Windows XP/2000; Linux variants (RedHat, Slackware, Fedora, Debian, …); Mac OS X (in the future).

  • Access Grid Security

    Public Key Infrastructure (PKI) based security. Each user, server, and service has an identity cert.

    Communications use SSL (via Globus Toolkit). SSL provides mutual authentication (each pair of peers knows the identity of the other) and confidentiality.

    Authorization is difficult, so we make it easier.
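Mutual authentication means that both peers present a certificate and both verify the other's. The sketch below shows how that is configured with Python's standard `ssl` module. It is an illustration of the concept, not the Access Grid or Globus Toolkit implementation. The helper names are hypothetical, and the certificate paths are left as optional parameters because no real files are assumed.

```python
import ssl
from typing import Optional


def server_context(cert_file: Optional[str] = None,
                   key_file: Optional[str] = None,
                   ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Server side: present our certificate and require one from the client."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients without a valid cert
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that signs client certs
    return ctx


def client_context(cert_file: Optional[str] = None,
                   key_file: Optional[str] = None,
                   ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Client side: verify the server and present our own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


server = server_context()
client = client_context()
print(server.verify_mode == ssl.CERT_REQUIRED)  # True
print(client.verify_mode == ssl.CERT_REQUIRED)  # True
```

With both contexts set to `CERT_REQUIRED`, a handshake succeeds only if each side holds a certificate the other trusts. This gives authentication and confidentiality but says nothing about authorization, which is the harder problem the slide points to.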

  • Architectural Overview

  • History and Growth

    Chart: Access Grid Growth. Series: Certificates Issued; Cumulative Certificates Issued; Number of Nodes.

    Month     Certificates Issued    Cumulative Certificates Issued    Number of Nodes
    Jan-99      0                       0                                 2
    Jun-99      0                       0                                 6
    Jan-00      0                       0                                10
    Jun-00      0                       0                                27
    Nov-00      0                       0                                27
    Feb-01      0                       0                                45
    Nov-01      0                       0                                75
    Feb-02      0                       0                                85
    Mar-02      0                       0                                90
    Apr-02      0                       0                                95
    May-02      0                       0                               101
    Nov-02      0                       0                               150
    Jan-03      1                       1                               151
    Feb-03      8                       9                               159
    Mar-03     29                      38                               188
    Apr-03     26                      64                               214
    May-03     51                     115                               265
    Jun-03     70                     185                               335
    Jul-03     58                     243                               393
    Aug-03     88                     331                               481
    Sep-03    220                     551                               701
    Oct-03    117                     668                               818
    Nov-03    115                     783                               933
    Dec-03     53                     836                               986
    Jan-04     88                     924                              1073
    Feb-04    109                    1033                              1174
    Mar-04    126                    1159                              1271
    Apr-04    115                    1274                              1360
    May-04     86                    1360                              1395
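The growth chart plots certificates issued per month next to a running total. As a consistency check, the cumulative series can be recomputed from the monthly values. The numbers below are the January 2003 to May 2004 chart values as read from the spreadsheet residue in this transcript, so they carry whatever reading errors that extraction introduced.

```python
from itertools import accumulate

# Certificates issued per month, Jan-03 through May-04 (read from the chart data).
monthly = [1, 8, 29, 26, 51, 70, 58, 88, 220, 117, 115, 53, 88, 109, 126, 115, 86]

# Cumulative series as it appears in the chart data.
reported = [1, 9, 38, 64, 115, 185, 243, 331, 551, 668, 783, 836,
            924, 1033, 1159, 1274, 1360]

cumulative = list(accumulate(monthly))
print(cumulative == reported)  # True
print(cumulative[-1])          # 1360
```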

  • Crossroads

    Up to now we've been serving the AG community (ourselves) exclusively.

    The introduction of this technology can revolutionize science; deploying into new communities will change the social landscape.

    It's time for the Access Grid to grow up. So how are we going to get there?

    - Become a service organization?
    - Abandon the software and move on?
    - Adapt and evolve!

  • Application Deployments

    Integration with various user communities:

    - Existing AG Community: 1000 public + 300 private: 1000 users
    - Fusion Collaboration: > 1000 (@ 40 institutions)
    - ANL Advanced Photon Source CAT Teams: 4000 users
    - Center for Learning and Multimodal Communication
    - ABC Collaboration (Surgical Teaching)

    Introduces new requirements and refinements of existing requirements. Increases user base.

  • Toolkit Research Directions

    Deeper integration with Grid computing: compute resources, data resources.

    Publish/subscribe service models. Peer-to-peer applications and services.

    Investigate much more interesting node solutions.

    Extending security work: more authentication solutions; more authorization solutions.

    Engage new communities.

  • AGTk 2.2 to 3.0

    Network Services; Example Services; Node Management; User/Venue Services; Chat Improvements; Firewall Optimizations; Port to OS X; Community Service; Operators Interface; New Node Services; High Quality Video; Display (with layout); Camera Control; Interoperability; SOAP (ZSI, Apache); Language (Java, C/C++); Grid Data Integration

  • Roadmap II: AGTk 3.0 ???

    Advanced node configurations: support for more Active Spaces; better audio environments, high quality video; better media synchronization; integrated instruments (telescopes, beamlines, …). This will cost: bandwidth.

    Minimal node configurations: handhelds; set-top box configurations.

    Advanced collaboration services: stream processing, modifying; data mining from streams. This will cost: latency.

  • Roadmap II: AGTk 3.0 ???

    Grid compute resources; authentication flexibility; Certificate Authority services; application integration (for specific targets); workflow support; application hosting services.

  • [email protected]

    Want to position competitors as you go through this slide.

    Desktop Cycle Aggregation (United Devices, Entropia, Data Synapse): low-end, desktop cycle aggregation, limited traction in corporate America.

    Cluster & Resource Management (Platform LSF, PBS, SUN SGE): single owner, typically a dept/group; single domain.

    Campus/Enterprise Grid (SUN SGEE, Platform Multi-cluster): multiple owners, multiple file systems, multiple domains, VPN.

    Global Grid (Globus): multiple sites across domains, geographies, and WANs; Internet-wide.

    Show scheduling solutions; what's our scheduling software.

    Screenshot including rasmol