Trends and Evolutions in Data Storage



HELLO!

• Jefferson – Storage Specialist at Walmart.com
• Data Processing, Fatec de São Paulo
• Experience with high-criticality data storage systems
• SNIA Certificate

AGENDA

• Storage architecture overview
• Ceph overview
• Rados Gateway overview
• Rados Gateway architecture
• RBD
• CephFS
• Object Storage Multi-site

Storage Overview

BEFORE S3 AFTER S3

“ Before S3 ”

[Diagram: a traditional dual-controller storage array – RAID 5/RAID 6 data and parity blocks (R5A1…R5Dp, R6A1…R6Dq) striped across the disks, organized into Raid Group 0/1/2 for protection and carved into LUNs or volumes, each presented to a computer as a local disk]

Storage Features

• Snapshots
• Map to any
• Clone
• Volume copy
• Virtual storage
• Thin provisioning
• Dedup
• Replication

Two Types of Storage

[Diagram: dual-controller array with disk shelves serving both access types]

• Block: SAN over FC/iSCSI – the host sees /dev/sda or drive F:
• Unified: adds NAS over Ethernet – shared via CIFS (\\IP\share) or NFS (IP:/mount)

The block device is then formatted with a local filesystem (for example Ext4, with 4k blocks):

• Ext4
• XFS
• Btrfs
• ZFS
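A minimal sketch tying the two access models together (the device name, server IP and share paths below are hypothetical):

$ sudo mkfs.ext4 /dev/sdb                                        # block: the SAN LUN appears as a local disk
$ sudo mount /dev/sdb /mnt/block
$ sudo mount -t nfs 10.0.0.10:/export /mnt/nfs                   # file: NFS share exported by the array
$ sudo mount -t cifs //10.0.0.10/share /mnt/cifs -o user=guest   # file: CIFS/SMB share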

“ S3 ”

The Concept

March 2006 – Simple Storage Service

• Scalable
• No single point of failure
• Fast
• Cheap
• Simple

RBD · CephFS · MON · CRUSH · OSD · PG · radosgw

Sage A. Weil

Community: ceph-community@ceph.com
ceph-users: ceph-users@ceph.com

Ceph Versions

• Argonaut – July 3, 2012
• Bobtail (v0.56) – January 1, 2013
• Cuttlefish (v0.61) – May 7, 2013
• Dumpling (v0.67) – August 14, 2013
• Emperor (v0.72) – November 9, 2013
• Firefly (v0.80) – May 7, 2014
• Giant (v0.87) – October 29, 2014
• Hammer (v0.94) – April 7, 2015
• Infernalis (v9.2.0) – November 6, 2015
• Jewel (v10.2.0) – April 21, 2016

• Open source (LGPL license)
• Software-defined, distributed storage
• No single point of failure
• Massively scalable
• Self-healing
• Unified storage: object, block and file

Ceph

Ceph architecture

RADOS – the reliable distributed object store at the base, used by apps, hosts and clients

LIBRADOS – a library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)

RGW – a web services gateway for object storage, compatible with S3 and Swift

RBD – a reliable, fully distributed block device with cloud platform integration

CEPHFS – a distributed file system with POSIX semantics and scale-out metadata management

RADOS – Reliable Autonomic Distributed Object Store

• Replication
• Flat object namespace within each pool
• Strong consistency (CP system)
• Infrastructure aware, dynamic topology
• Hash-based placement (CRUSH)
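A quick way to see the flat per-pool object namespace is the rados CLI (the pool and object names here are just examples):

$ ceph osd pool create mypool 64           # create a pool with 64 placement groups
$ rados -p mypool put hello ./hello.txt    # store an object in the pool
$ rados -p mypool ls                       # flat listing of objects in the pool
$ rados -p mypool get hello /tmp/out.txt   # read the object back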

OSD (Object Storage Daemon)

• 3 to 10,000 OSDs per cluster
• One per disk (or per server)
• Serves stored objects to clients
• Peers intelligently to replicate data

Monitor node

• Maintains cluster membership and state
• Provides consensus for distributed decision-making
• Small, odd number of monitors
• Monitors do not serve stored objects to clients
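To check the monitors and their quorum on a running cluster (the exact output format depends on the Ceph release):

$ ceph mon stat        # monitor map epoch, members and current quorum
$ ceph quorum_status   # detailed quorum information in JSON
$ ceph -s              # overall cluster health, including the mon quorum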

Object Placement

Objects are stored in a pool; each pool is split into placement groups (PGs). CRUSH maps a PG onto a set of OSDs:

CRUSH(pg, cluster state, rule) = [A, B]

[Diagram: a pool containing many objects; each object consists of an ID, binary data and metadata]
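The placement computed by CRUSH can be inspected for any object name; a sketch with illustrative pool/object names (output shape varies by release):

$ ceph osd map mypool hello
osdmap e42 pool 'mypool' (1) object 'hello' -> pg 1.36e29f05 (1.5) -> up ([2,0], p2) acting ([2,0], p2)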

CrushMap

[Diagram: a CRUSH hierarchy – Data Center → Rack1/Rack2/Rack3 → Host1…Host6 → OSDs]

# ceph osd setcrushmap -i crushmap-filename

# begin crush map
# devices
device 1 osd.1
device 2 osd.2

host Host1 {
    id -1
    alg straw
    hash 0  # rjenkins1
    item osd.1 weight 3.500
}
host Host2 {
    id -2
    alg straw
    hash 0  # rjenkins1
    item osd.2 weight 3.500
}
rack Rack1 {
    id -3
    alg straw
    hash 0  # rjenkins1
    item Host1 weight 3.500
}
rack Rack2 {
    id -4
    alg straw
    hash 0  # rjenkins1
    item Host2 weight 3.500
}
datacenter DataCenter {
    id -5
    alg straw
    hash 0  # rjenkins1
    item Rack1 weight 3.500
    item Rack2 weight 3.500
}
rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 3
    step take DataCenter
    step chooseleaf firstn 0 type rack
    step emit
}
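In practice the map is usually pulled from the cluster, decompiled, edited and recompiled before being injected back (file names are arbitrary):

$ ceph osd getcrushmap -o crushmap.bin        # fetch the compiled map
$ crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
$ vi crushmap.txt                             # edit buckets / rules
$ crushtool -c crushmap.txt -o crushmap.new   # compile the edited map
$ ceph osd setcrushmap -i crushmap.new        # inject it into the cluster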

Rados Gateway overview

RGW – a web services gateway for object storage, compatible with S3 and Swift – sits on top of LIBRADOS and RADOS.

[Diagram: an application talks REST to RADOSGW/LIBRADOS, which talks to the RADOS cluster (monitors and OSDs) over sockets]

RGW Components

• Frontend
  • FastCGI – external web servers
  • Civetweb – embedded web server
• REST dialects
  • S3
  • Swift
  • Other APIs
• Execution layer – common layer for all dialects

$ radosgw-admin user create --display-name="johnny rotten" --uid=johnny

"access_key": "TCICW53D9BQ2VGC46I44"
"secret_key": "tfm9aHMI8X76L3UdgE+ZQaJag1vJQmE6HDb5Lbrz"

APIs: Java, Python, C++, Ruby, Perl, C#

HTTP REST:

DELETE/GET/PUT /{bucket} HTTP/1.1
Host: cname.domain.com
Authorization: AWS {access-key}:{hash-of-header-and-secret}

S3 CLOUD CLIENTS:

s3cmd, Cyberduck, s3fs

$ s3cmd ls s3://bucket_name/file.txt
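A minimal s3cmd session against an RGW endpoint might look like this (the endpoint, bucket and file names are hypothetical; the keys come from radosgw-admin above):

$ s3cmd --configure                    # point it at the RGW host and enter the access/secret keys
$ s3cmd mb s3://my-bucket              # create a bucket
$ s3cmd put file.txt s3://my-bucket/   # upload an object
$ s3cmd ls s3://my-bucket/             # list the bucket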

RBD

RBD – a reliable, fully distributed block device with cloud platform integration – sits on top of LIBRADOS and RADOS and is consumed by hosts.

RBD features

• Thinly provisioned
• Resizable images
• Image import/export
• Image copy or rename
• Read-only snapshots
• Revert to snapshots
• Ability to mount with Linux or QEMU/KVM clients
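A short sketch of the snapshot and export features above, using the rbd CLI (pool, image and snapshot names are illustrative):

$ rbd snap create rbd/myimage@snap1     # take a read-only snapshot
$ rbd snap ls rbd/myimage               # list snapshots of the image
$ rbd snap rollback rbd/myimage@snap1   # revert the image to the snapshot
$ rbd export rbd/myimage myimage.img    # export the image to a local file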

RBD connectors

[Diagram: a host reaches RADOS through the RBD kernel module and librados; a VM reaches RADOS through libvirt and the RBD driver]

$ rbd create --size 1024 POOL/IMAGE
$ rbd resize --size 2048 IMAGE                  (to increase)
$ rbd resize --size 2048 IMAGE --allow-shrink   (to decrease)

$ sudo apt-get install ceph-common
$ sudo modprobe rbd

ceph-authtool --print-key /etc/ceph/keyring.admin

echo "mon:6789 name=admin,secret=AQDVGc5P0LXzJhAA5C019tbdrgypFNXUpG2cqQ== rbd IMAGE" | sudo tee /sys/bus/rbd/add

$ sudo mkfs.xfs /dev/rbd0
$ sudo mount /dev/rbd0 /mnt/
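With ceph-common installed, the same mapping can also be done with rbd map (a sketch assuming an admin keyring in /etc/ceph):

$ sudo rbd map IMAGE --pool rbd --id admin   # creates /dev/rbd0
$ rbd showmapped                             # list mapped images
$ sudo rbd unmap /dev/rbd0                   # unmap when done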

CephFS

CEPHFS – a distributed file system with POSIX semantics and scale-out metadata management – sits on top of LIBRADOS and RADOS and is accessed by clients.

CephFS Overview

[Diagram: file lookup by inode (INO) via the Metadata Servers; each file is striped into objects (1…6), an object ID (OID) is derived from INO + object number (ONO), mapped to a placement group (PGID) and then to OSDs by CRUSH, over librados/RADOS]

• POSIX-compliant file system
• Linux kernel client
  • mount -t ceph 1.2.3.4:/ /mnt
• Export via NFS or Samba (CIFS)
• ceph-fuse
• Recursive directory stats
  • File size
  • File and directory count
  • Modification time
• libcephfs.so – your app, Samba, Hadoop, Ganesha (NFS)
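For the FUSE client mentioned above, a typical mount looks like this (monitor address and mount point are examples):

$ sudo ceph-fuse -m 1.2.3.4:6789 /mnt/ceph   # mount CephFS via FUSE
$ sudo umount /mnt/ceph                      # or: fusermount -u /mnt/ceph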

CephFS in practice:

$ ceph-deploy mds create myserver
$ ceph osd pool create fs_data 64
$ ceph osd pool create fs_metadata 64
$ ceph fs new myfs fs_metadata fs_data
$ mount -t ceph x.x.x.x:6789:/ /mnt/ceph
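With cephx enabled (the default), the kernel mount also needs credentials; a sketch assuming the admin key was saved to /etc/ceph/admin.secret:

$ sudo mount -t ceph x.x.x.x:6789:/ /mnt/ceph -o name=admin,secretfile=/etc/ceph/admin.secret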

Object Storage Multi-site

• Replication
• 3 million objects

[Diagram: SITE A (Ceph cluster + RADOSGW) replicating to SITE B (Ceph cluster + RADOSGW), both exposing S3]

• ~800 objects/s
• Active/standby
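A very rough sketch of how a Jewel-era RGW multi-site realm is laid out on the master side (realm, zonegroup, zone names and endpoints are hypothetical; the secondary site is configured by pulling this realm, and the gateways must be restarted afterwards):

$ radosgw-admin realm create --rgw-realm=gold --default
$ radosgw-admin zonegroup create --rgw-zonegroup=us --endpoints=http://site-a:80 --master --default
$ radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=site-a --endpoints=http://site-a:80 --master --default
$ radosgw-admin period update --commit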

THANKS! Any questions?

You can find me at: Jefferson · Jefferson22alcantara@gmail.com