Big data
Uploaded by tiago-albineli-motta
![Page 1: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/1.jpg)
![Page 2: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/2.jpg)
Goal
![Page 3: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/3.jpg)
Content recommendation
![Page 4: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/4.jpg)
In 2010...
![Page 5: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/5.jpg)
Traditional architecture
![Page 6: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/6.jpg)
Handcrafted parallelism
page visited Papalog
page visited Papalog
page visited Papalog
![Page 7: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/7.jpg)
Handcrafted parallelism
page visited GloboSocial
page visited GloboSocial
![Page 8: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/8.jpg)
Machine Learning
![Page 9: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/9.jpg)
Industrial revolution
YARN
![Page 10: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/10.jpg)
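Abstraction: focus on value
import org.apache.spark.sql.functions.{col, max}
df.groupBy(col("user"), col("object"))
  .agg(max("scroll").as("scroll")) // deepest scroll per user/object pair
  .where(col("scroll") > 50) // keep pairs scrolled past 50%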
![Page 11: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/11.jpg)
Activity collection
page visited
time spent watching video
share
comment
time spent reading article
HorizonGateway
scroll percentage
![Page 12: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/12.jpg)
Iterative and incremental
![Page 13: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/13.jpg)
Results
![Page 14: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/14.jpg)
Globo Esporte
![Page 15: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/15.jpg)
BUG :(
Globo Esporte
![Page 16: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/16.jpg)
GShow
![Page 17: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/17.jpg)
75% higher conversion on mobile than other automatic offers
173% higher conversion on desktop than other automatic offers
GShow
![Page 18: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/18.jpg)
TechTudo
![Page 19: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/19.jpg)
TechTudo
+195%
in share of user retention in 2014
![Page 20: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/20.jpg)
TechTudo: Home
![Page 21: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/21.jpg)
TechTudo: Home
50% higher conversion on mobile than other automatic offers
32% higher conversion on desktop than other automatic offers
![Page 22: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/22.jpg)
GlobosatPlay
![Page 23: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/23.jpg)
GlobosatPlay
45% improvement in conversion
![Page 24: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/24.jpg)
Data Science
![Page 25: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/25.jpg)
@timotta
![Page 26: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/26.jpg)
Machine learning algorithms
![Page 27: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/27.jpg)
Content based
![Page 28: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/28.jpg)
User preferences
![Page 29: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/29.jpg)
TF-IDF
Importance of the term in the document
How uncommon the term is in the collection
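
The two factors above can be sketched in plain Scala. This is a minimal illustration, not the pipeline used in the talk: the object name `TfIdfSketch`, the tokenized-document representation, and the unsmoothed `log(N / df)` variant of IDF are all assumptions.

```scala
object TfIdfSketch {
  // Term frequency: share of the document's tokens that equal the term
  // (importance of the term in the document).
  def tf(term: String, doc: Seq[String]): Double =
    if (doc.isEmpty) 0.0 else doc.count(_ == term).toDouble / doc.size

  // Inverse document frequency: log(N / number of documents containing the term)
  // (how uncommon the term is in the collection).
  // Returns 0.0 for a term that appears in no document.
  def idf(term: String, corpus: Seq[Seq[String]]): Double = {
    val containing = corpus.count(_.contains(term))
    if (containing == 0) 0.0 else math.log(corpus.size.toDouble / containing)
  }

  def tfIdf(term: String, doc: Seq[String], corpus: Seq[Seq[String]]): Double =
    tf(term, doc) * idf(term, corpus)
}
```

A term that is frequent in one document but rare across the collection gets the highest weight.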
![Page 30: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/30.jpg)
Semantic entities
![Page 31: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/31.jpg)
Finding the right news article
+BBB
+Edredon
News C
News B
News A
User
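
One common way to match a user profile against articles in a content-based recommender is cosine similarity over term weights. The slide does not say which similarity measure was used, so the sketch below is an assumption, as are the names `ContentMatchSketch`, `cosine`, and `rank`.

```scala
object ContentMatchSketch {
  type Weights = Map[String, Double]

  // Cosine similarity between two sparse term-weight vectors.
  def cosine(a: Weights, b: Weights): Double = {
    val dot = a.iterator.map { case (term, w) => w * b.getOrElse(term, 0.0) }.sum
    val norm = (v: Weights) => math.sqrt(v.values.map(w => w * w).sum)
    val denom = norm(a) * norm(b)
    if (denom == 0.0) 0.0 else dot / denom
  }

  // Article ids ordered from most to least similar to the user profile.
  def rank(user: Weights, articles: Map[String, Weights]): Seq[String] =
    articles.toSeq.sortBy { case (_, weights) => -cosine(user, weights) }.map(_._1)
}
```

With a user profile weighted toward BBB and Edredon, an article tagged with both terms ranks ahead of one that shares only one of them.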
![Page 32: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/32.jpg)
User based
![Page 33: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/33.jpg)
Collaborative filtering
![Page 34: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/34.jpg)
Preference matrix
![Page 35: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/35.jpg)
Implicit preferences
Scroll percentage
Time the page is visible
![Page 36: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/36.jpg)
Implicit preference matrix
0.9 0.8
0.8
![Page 37: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/37.jpg)
Latent factors
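
In latent factor models for collaborative filtering, each user and each item is usually described by a short vector of learned factors, and a missing cell of the preference matrix is estimated from their inner product. The symbols below ($\mathbf{x}_u$, $\mathbf{y}_i$, $k$) are notation assumed here, not taken from the slides.

```latex
\hat{r}_{ui} = \mathbf{x}_u^{\top} \mathbf{y}_i = \sum_{f=1}^{k} x_{uf} \, y_{if}
```

Here $\hat{r}_{ui}$ is the predicted preference of user $u$ for item $i$, and $k$ is the number of latent factors.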
![Page 38: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/38.jpg)
N-dimensional prediction
Two dimensions: $f(x) = a + bx$
Three dimensions: $f(x) = a + b x_1 + c x_2$
N dimensions: $f(x) = a + b x_1 + c x_2 + \dots + n x_n$
![Page 39: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/39.jpg)
Validation
![Page 40: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/40.jpg)
Cross validation
![Page 41: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/41.jpg)
K-fold cross validation
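
A minimal sketch of k-fold splitting in plain Scala, assuming the data fits in memory and has already been shuffled. The name `kFold` and the round-robin fold assignment are illustrative choices, not the talk's implementation.

```scala
object KFoldSketch {
  // Split `data` into k folds; each fold serves once as the validation set
  // while the remaining folds form the training set.
  // Returns k pairs of (training, validation).
  def kFold[A](data: Seq[A], k: Int): Seq[(Seq[A], Seq[A])] = {
    require(k >= 2 && k <= data.size, "k must be between 2 and data.size")
    val indexed = data.zipWithIndex
    (0 until k).map { fold =>
      val (validation, training) = indexed.partition { case (_, i) => i % k == fold }
      (training.map(_._1), validation.map(_._1))
    }
  }
}
```

Each element lands in exactly one validation set, so averaging the metric over the k runs uses every observation for validation once.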
![Page 42: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/42.jpg)
Brute force
for {
  maxIter  <- Array(5, 10, 15, 20)
  feature  <- Array(10, 20, 30, 40)
  alpha    <- Array(0.01, 0.1, 0.0, 1.0, 10.0, 100.0)
  regParam <- Array(0.01, 0.1, 0.0, 1.0, 10.0, 100.0)
} {
  // train and evaluate one model per combination (4 * 4 * 6 * 6 = 576 runs)
}
![Page 43: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/43.jpg)
Root mean square error
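
Root mean square error is the square root of the average squared difference between predicted and observed values. A small sketch follows; the object and parameter names are assumptions.

```scala
object RmseSketch {
  // Root mean square error between predicted and observed values, paired by position.
  def rmse(predicted: Seq[Double], actual: Seq[Double]): Double = {
    require(predicted.nonEmpty && predicted.size == actual.size, "need equal, non-empty inputs")
    val squaredErrors = predicted.zip(actual).map { case (p, a) => (p - a) * (p - a) }
    math.sqrt(squaredErrors.sum / squaredErrors.size)
  }
}
```

Lower is better, and large individual errors are penalized more heavily than small ones because the differences are squared before averaging.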
![Page 44: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/44.jpg)
Precision and recall
Precision: how many of the recommended documents we got right
Recall: how many of the relevant documents we got right
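
Both measures can be computed from two sets: what was recommended and what was actually relevant. The sketch below assumes documents are identified by comparable ids; the object name and the convention of returning 0.0 for an empty denominator are choices made for this example.

```scala
object PrecisionRecallSketch {
  // Precision: share of recommended documents that are relevant.
  def precision[A](recommended: Set[A], relevant: Set[A]): Double =
    if (recommended.isEmpty) 0.0
    else (recommended & relevant).size.toDouble / recommended.size

  // Recall: share of relevant documents that were recommended.
  def recall[A](recommended: Set[A], relevant: Set[A]): Double =
    if (relevant.isEmpty) 0.0
    else (recommended & relevant).size.toDouble / relevant.size
}
```

Recommending more documents tends to raise recall and lower precision, which is why the two are usually reported together.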
![Page 45: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/45.jpg)
F-measure
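
The F-measure combines precision $P$ and recall $R$ into a single score. The slide does not state which weighting was used; the balanced form ($F_1$, the harmonic mean) is shown here as an assumption, followed by the general weighted form.

```latex
F_1 = \frac{2 \, P \, R}{P + R}
\qquad
F_\beta = \frac{(1 + \beta^2) \, P \, R}{\beta^2 P + R}
```

With $\beta = 1$ both formulas coincide; $\beta > 1$ weights recall more heavily and $\beta < 1$ weights precision more heavily.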
![Page 46: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/46.jpg)
Metrics per algorithm
![Page 47: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/47.jpg)
Based on A/B tests
![Page 48: Big data](https://reader036.fdocumentos.com/reader036/viewer/2022081520/58f31a7d1a28ab0c318b458b/html5/thumbnails/48.jpg)
@timotta