INSPIRE ready SDI using docker

Jorge S. Mendes de Jesus

Joana Simões

Paul van Genuchten

@JMendesdeJesus @Doublebyte @pvangenuchten

Contents*:

● Introduction to Docker

○ Containers

○ DockerHub and repositories

○ Docker compose

○ Volume sharing and filesystem

○ Docker-machine

○ Resilience

● Pets vs Cattle

○ Pets versus Cattle model

○ No more: "But it works on my machine"

● Docker for your SDI

○ SDI ready containers

○ INSPIRE example containers

○ Geocat Live docker structure and experience

● INSPIRE validation challenge

*) Only 1 cat will be shown in this workshop, for more cats see https://delawen.github.io/slides/2017/keynote

Introduction to docker

Linux containers (LXC)

● The Linux kernel provides namespaces

● Control groups (cgroups)

● chroot

In a nutshell:

● We can have isolated 'containers' sharing the same kernel

● Cgroups allow control over resources (network, CPU, memory)

● Chroot on steroids

Yes you (finally) can….

LXC versus Virtual Machines

● Virtual Machines are an industry standard and are based on hypervisor technologies

● A hypervisor lets you run VMs directly on the hardware or on top of a host OS

● Totally independent OSes sharing the same hardware

LXC versus Virtual Machines

● Virtual Machines run full operating systems

● Linux containers are "containers/sections" running on the host OS and sharing its kernel and resources

● Imagine: a VM is a family house with its own garden, roof and plumbing. An LXC container is an apartment in a building, sharing heating, plumbing and roof

● Myth: LXC is lightweight virtualization. It is not: there is no hypervisor in LXC

LXC versus Virtual Machines

● LXC prevents problems between dev and production (we will explain why later)

Docker

● Docker is a Linux container ecosystem (early versions were built on LXC)

● Why an ecosystem?

● Docker provides containers but also tools for

○ Deployment to servers/cloud (docker-machine)

○ Orchestration of services (docker-compose)

○ Image repository (Docker Hub)

○ Clustering/Resilience (docker-swarm)

Docker containers

● A container can run the OS or service you want, for example Geoserver or Geonetwork

● Start by building an image based on a Dockerfile

● A Dockerfile is a recipe for building an image

● Then the image is run as an independent system

● We start with a base image and add content on top

● The final image is a stack of multiple layers
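
The steps above can be sketched as a short shell session. This is only an illustration: the base image tag `tomcat:8.5-jre8`, the file name `geonetwork.war` and the directory `gn-image` are assumptions, not the authors' actual Dockerfile.

```shell
# Write a minimal Dockerfile: each instruction adds one layer on top of the base image.
# Assumptions (not from the slides): base image tag, WAR file name, directory name.
mkdir -p gn-image
cat > gn-image/Dockerfile <<'EOF'
# Base layer: an official Tomcat image (assumed tag)
FROM tomcat:8.5-jre8
# Content layer: the application is added on top of the base
COPY geonetwork.war /usr/local/tomcat/webapps/geonetwork.war
# Document the port Tomcat listens on
EXPOSE 8080
EOF

# Show the recipe that docker would build from
cat gn-image/Dockerfile
```

With a `geonetwork.war` placed in `gn-image/`, `docker build -t sdi/geonetwork gn-image` would build the image (the tag is again hypothetical), and `docker history sdi/geonetwork` would list its stack of layers, including those inherited from the base image.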

Dockerfile - Generic Geonetwork

Dockerfile - Geonetwork + INSPIRE

● Starting from the previous image, we run a script that downloads and sets up the INSPIRE thesaurus. The new image supports INSPIRE functionality.
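
A derived image of this kind might look like the sketch below. The base tag `sdi/geonetwork` and the script name `inspire-setup.sh` are hypothetical placeholders; the slides do not show the real script, so the one written here only prints a message.

```shell
# Derived image: start FROM a generic Geonetwork image and add an INSPIRE layer.
# Assumptions (hypothetical): image tag sdi/geonetwork, script inspire-setup.sh.
mkdir -p gn-inspire
cat > gn-inspire/Dockerfile <<'EOF'
FROM sdi/geonetwork
# Copy the setup script into the image
COPY inspire-setup.sh /tmp/inspire-setup.sh
# Run it at build time so its results end up in a new layer
RUN sh /tmp/inspire-setup.sh
EOF

# Placeholder script: the real one would download and install the INSPIRE thesaurus
cat > gn-inspire/inspire-setup.sh <<'EOF'
#!/bin/sh
set -e
echo "download and install the INSPIRE thesaurus here"
EOF
chmod +x gn-inspire/inspire-setup.sh
```

Because the script runs during `docker build`, every container started from the resulting image already contains the thesaurus.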

Geonetwork + INSPIRE

What is there for INSPIRE in GN?

- Enable INSPIRE view

- Enable schematron files

- Download INSPIRE themes

● Geonetwork using INSPIRE thesaurus

Docker hub and repositories

● An image can be shared with other users

● Images and Dockerfiles are shared on Docker Hub

● Users can search images

Docker hub and repositories

● Running a Geoserver image:

> docker pull camptocamp/geoserver:2.9
> docker run -d -p 8000:8000 camptocamp/geoserver:2.9

Myth: With docker you can have things working in a couple of minutes

Reality: Pull/run takes minutes, but you normally need more than a default build.

Myth: Docker is a stable technology

Reality: Docker is under active development; things change and surprises happen

Myth: We don't need docker we can continue as before

Reality: You need docker, let's continue with the presentation

Docker compose and SDI

● What is an SDI? Well... multiple systems interacting

● Geographic data, metadata, web services, users, system admins, developers, organizations, INSPIRE, service providers... make an SDI

● Yes, running an SDI is hard and you need experience and organization

Docker compose and SDI

● Docker-compose orchestrates multiple containers, e.g. one container for geoserver and another for postgis

● You write a yml file describing how things are organized, e.g. geoserver runs on port 8080, geoserver waits for postgis to start, etc.

● By convention the file is named docker-compose.yml

● Compositions are normally shared on GitHub
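
A minimal composition along these lines might look as follows. It is a sketch only: the PostGIS image name, the password and the container port 8080 are assumptions for illustration.

```shell
# Write a minimal docker-compose.yml with two services.
# Assumptions: PostGIS image, password and geoserver container port are illustrative.
mkdir -p sdi-compose
cat > sdi-compose/docker-compose.yml <<'EOF'
version: "2"
services:
  postgis:
    image: mdillon/postgis:9.6      # assumed image; any PostGIS image would do
    environment:
      POSTGRES_PASSWORD: changeme   # placeholder, do not use in production
  geoserver:
    image: camptocamp/geoserver:2.9 # image used elsewhere in these slides
    ports:
      - "8080:8080"                 # host port 8080 -> container port 8080 (assumed)
    depends_on:
      - postgis                     # start the postgis container first
EOF

cat sdi-compose/docker-compose.yml
```

Running `docker-compose up` inside `sdi-compose/` would start both containers. Note that `depends_on` only controls start order; it does not wait until PostGIS is ready to accept connections.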

Docker compose and SDI

> git clone https://github.com/jorgejesus/inspire-geoserver && cd ./inspire-geoserver
> docker-compose build && docker-compose up

Docker compose and SDI

(schematron)

Volume sharing and filesystem

● Docker uses a union filesystem (normally AUFS)

● Imagine files/directories as layers that are overlaid to create a filesystem

● Changes are made in the last added layer and are lost if not committed

● Data written inside a container is lost when the container is removed (unlike a VM)

Volume sharing and filesystem

● With Docker you need to create a volume (somewhere to keep your data)

● From the geoserver example (the /usr/local/tomcat/webapps/ROOT/data folder is persisted in the host folder $HOME/geoserver/data):

> docker run -d -v $HOME/geoserver/data:/usr/local/tomcat/webapps/ROOT/data -p 8000:8000 camptocamp/geoserver:2.9

Volume sharing and filesystem

● Volumes don't necessarily have to be on host

● NFS, Amazon S3, HDFS (Hadoop), CephFS, ...

● There is a Docker volume plugin for the InterPlanetary File System (IPFS)

● It sounds cool and it is cool... but it is out of scope for this presentation

Your data is free to be anywhere, host or cloud services
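
As a sketch of the alternative to a host folder, the commands below use a named volume managed by docker. The volume name `geoserver_data` is an assumption; the image, port and data path are the ones from the geoserver example. The helper only prints the commands unless `DRY_RUN=0` is set, so the block is safe to run without a docker daemon.

```shell
# Dry-run helper: prints each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

named_volume_demo() {
  # Create a named volume managed by docker (the name is an assumption)
  run docker volume create geoserver_data
  # Mount it at the GeoServer data directory instead of a host folder
  run docker run -d -v geoserver_data:/usr/local/tomcat/webapps/ROOT/data \
      -p 8000:8000 camptocamp/geoserver:2.9
  # Show which driver backs the volume and where the data lives
  run docker volume inspect geoserver_data
}

named_volume_demo
```

A volume backed by a remote store would be created the same way with the `--driver` option of `docker volume create`, using whichever volume plugin is installed.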

Docker machine and cloud deployment

● What if docker-compose ran on a VM?

● Better still, what about running it on an Amazon instance or a droplet?

> docker-machine create -d digitalocean --digitalocean-access-token=<TOKEN> --digitalocean-size 4gb --digitalocean-region ams2 mySDI
> eval $(docker-machine env mySDI)

> docker-compose build && docker-compose up

● A Dockerfile builds the image for one container, docker-compose handles multiple containers and docker-machine handles deployment

Docker machine and cloud deployment

● Docker is becoming more than a tool: it is also a deployment methodology

Swarm mode and resilience

● Available since docker 1.12 and better integrated in the current version (v17)

● Integrated into docker-compose (better support in 1.16)

● We can run containers as clusters (on different servers)

● Scaling services (adding more containers)

● Load balancing between containers

● Mesh network (point to any IP of a container)
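
Scaling and restart behaviour can be declared in a version 3 compose file, roughly as sketched below. The replica count, the directory name and the container port 8080 are assumptions for illustration.

```shell
# Write a compose file (format version 3) that declares a scaled service.
# Assumptions: replica count, directory name and container port are illustrative.
mkdir -p sdi-swarm
cat > sdi-swarm/docker-compose.yml <<'EOF'
version: "3"
services:
  geoserver:
    image: camptocamp/geoserver:2.9
    ports:
      - "8080:8080"            # published on the swarm nodes
    deploy:
      replicas: 3              # run three containers of this service
      restart_policy:
        condition: on-failure  # replace containers that crash
EOF

cat sdi-swarm/docker-compose.yml
```

After `docker swarm init`, the command `docker stack deploy -c sdi-swarm/docker-compose.yml sdi` would start the service in the swarm, and `docker service scale sdi_geoserver=5` would change the number of containers. The `deploy` section is meant for `docker stack deploy`, not for a plain `docker-compose up`.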

Swarm mode and resilience

● RESILIENCE, RESILIENCE RESILIENCE

PAUSE, relax, questions

Pets versus Cattle

Pets:

● One or a few servers

● Properly named (myserver.org)

● You take care of the server if it is down

Cattle:

● Multiple servers

● Generic name (server02.org)

● Server down, kill it

Pets versus Cattle

Pets:

● Single point of failure

● Need load balancers

● "Personal configurations"

Cattle:

● Clustering

● Internal load balancing

● Generic configurations

SDI should be treated like cattle*

How do you treat your SDI?

*) we should respect all animals

No more: "But it works on my computer"

● Start your SDI on your local computer using docker

● All the misery of setting up the SDI, preparing INSPIRE and data happens only locally

● How many times have you developed something or installed an SDI locally, only to find that the production server is different?

● Problems and more problems, but once it works locally it is very likely to work in production

● Geocat experience: patience and time spent preparing docker will pay off later

More sheep before next section

Comments ???

SDI ready containers

http://geocontainers.org

● There is even a subsection concerning INSPIRE (a bit empty)

INSPIRE Geoserver

● A docker composition with Geoserver and PostGIS

● Geoserver App-schema and INSPIRE plugin

● Data and configuration loaded from the INSPIRE cookbook (OneGeology)

http://onegeology.org/docs/technical/GeoSciML_Cookbook_1.2.1.pdf

Docker compose and SDI - Live demo

http://10.10.10.2:8080/geoserver/wfs?request=GetFeature&service=wfs&version=2.0.0&typeName=gsmlgu:GeologicUnit&outputFormat=gml32&count=2

● App-schema + INSPIRE

> git clone https://github.com/jorgejesus/inspire-geoserver && cd ./inspire-geoserver
> docker-compose build && docker-compose up
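
Once the composition is up, the GetFeature request from the demo can be issued from the command line. This is a sketch: the address 10.10.10.2 is the one on the slide and is only reachable inside the demo network, so the script falls back to a message when the server does not answer.

```shell
# Build the WFS GetFeature URL from its parts (values taken from the demo slide).
base="http://10.10.10.2:8080/geoserver/wfs"
params="request=GetFeature&service=wfs&version=2.0.0"
params="${params}&typeName=gsmlgu:GeologicUnit&outputFormat=gml32&count=2"
url="${base}?${params}"
echo "$url"

# Fetch the response and show its first lines; give up after 5 seconds.
if response=$(curl --silent --fail --max-time 5 "$url"); then
  printf '%s\n' "$response" | head -n 20
else
  echo "demo server not reachable"
fi
```

The `count=2` parameter limits the response to two GeologicUnit features, which keeps the GML output short enough to inspect by eye.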

INSPIRE Geonetwork

● A docker build of the latest geonetwork

● A script adds the thesaurus, view and schematron

> git clone https://github.com/jorgejesus/inspire-geonetwork && cd ./inspire-geonetwork
> docker build -t geonetwork . && docker run --name geonetwork -p 8080:8080 geonetwork

Geocat Live - Docker structure and experience

http://geocat.net/live

● Geocat live is a service that deploys INSPIRE ready instances of Geonetwork

● Geocat live is a SaaS (Software as a Service)

● Geocat live deploys a complex docker-compose that creates an instance on a cloud environment with logging, monitoring and backup

● The system is built using a micro-services* approach (small REST services doing specific tasks)

*) Buzz word that actually works !!!

Geocat Live - Docker structure and experience

● If you are interested in having your SDI in the cloud and INSPIRE ready:

http://geocat.net/contact

The INSPIRE validation challenge

DTAP & INSPIRE

● As a data provider you want to test your INSPIRE setup in a test environment before it replaces the production copy

● Testing involves validating metadata, view and download services and the links between them

● The configuration (metadata-url in wms-capabilities, service-url in metadata) uses production URLs for links, which do not resolve to items from the test environment

● So complete validation in the test environment is impossible

The docker trick

● The idea is to deploy a fully configured SDI with (meta)data in a docker environment and include the INSPIRE validator (ETF)

● Configure the environment to route any traffic for the production URLs to the relevant docker containers

● The docker environment can run on your local computer and, if successful, easily be deployed externally

● Based on http://geonetwork-opensource.org/manuals/trunk/eng/users/tutorials/inspire/view-geoserver.html
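
One way to do the routing is with network aliases in docker-compose, so that the production host names resolve to containers inside the composition. This is a sketch under assumptions, not the authors' setup: the host names `metadata.example.org` and `services.example.org`, the `geonetwork` image name and the ETF port mapping are all hypothetical.

```shell
# Write a composition in which production host names resolve to local containers.
# Assumptions (hypothetical): host names, geonetwork image name, ETF port mapping.
mkdir -p inspire-test
cat > inspire-test/docker-compose.yml <<'EOF'
version: "2"
services:
  geonetwork:
    image: geonetwork            # assumed image name
    networks:
      default:
        aliases:
          - metadata.example.org # hypothetical production metadata host
  geoserver:
    image: camptocamp/geoserver:2.9
    networks:
      default:
        aliases:
          - services.example.org # hypothetical production service host
  etf:
    image: iide/etf-webapp       # validator image from Docker Hub
    ports:
      - "8090:8080"              # assumed container port
EOF

cat inspire-test/docker-compose.yml
```

With this layout the validator container resolves the production host names to the test containers, so the links in the metadata and capabilities documents can be followed. An alias only changes name resolution: if the production URLs use port 80, the containers would also have to answer on that port, for example through a reverse proxy.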

ESDIN Test Framework

● An INSPIRE validator framework developed in the scope of the ESDIN project

● Currently being improved to include INSPIRE data validation

● An ETF docker image is available at https://hub.docker.com/r/iide/etf-webapp/

INSPIRE view service with Geoserver/GeoNetwork

● Add data to database

● Set up Workspace

● Set up WMS Layer

● Add dataset metadata

● Add service metadata

● Update Geoserver with metadata links

● Validate service

GeoNetwork

● Import dataset metadata from http://metadata.bgs.ac.uk/geonetwork/srv/en/iso19139.xml?id=6678

● Import service metadata from http://metadata.bgs.ac.uk/geonetwork/srv/en/iso19139.xml?id=4051

● Change dataset reference

Geoserver

● On WMS service

○ Activate WMS service

○ Activate INSPIRE

○ Limit the SRSs (activate bounds for each SRS)

○ Add link to service metadata

● On layers

○ Add link to metadata

ETF validation

Other relevant images

deegree (https://hub.docker.com/r/martinvi/deegree)

Daobs (INSPIRE dashboard; https://hub.docker.com/r/titellus/docker_dashboard/)

LDProxy (https://hub.docker.com/r/iide/ldproxy)

Hale command line (https://hub.docker.com/r/wetransform/hale-cli/)

GDAL (https://hub.docker.com/r/geodata/gdal/)

Have we reached the end???