© 2013 IBM Corporation
IBM Research – Brazil
Sentiment Analysis and Data Can Change the Game #eitreinadores
Alan Braz – IBM Research
Alan Braz
Research Software Engineer, IBM Research – Brazil
2002–2005: BSc in Computer Science – UNICAMP
2005: IBM GBS – Web development intern (Java)
2005–2007: IBM GBS – Java programmer
2007–2010: IBM GBS – Technical lead and mid-level programmer
2009–2012: IBM GBS – Agile instructor and coach
2009–present: Metrocamp – Lecturer (Software Engineering, RUP, Agile)
2010–2012: IBM GBS – Software Architect
2009–2013: MSc in Software Engineering (Agile) – UNICAMP
2013–present: IBM Research
www.alanbraz.com.br
@alanbraz
[Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd. The PageRank Citation Ranking: Bringing Order to the Web (1999).]
Innovation and Comfort
Trial-and-error:
– start-ups
Science-based:
– scientific method (empirical)
– logical deduction (mathematics)
Diagram labels: INNOVATION, RADICAL INNOVATION
System U: Modeling People from Social Media
Social behaviors, e.g., when tweeting
Five Factor Model
• Openness
• Conscientiousness
• Extraversion
• Agreeableness
• Neuroticism
Ford’s 12 “Universal Needs”
• Structure
• Challenge
• Excitement
• Liberty
• Harmony
• Closeness
• Practicality
• Self-expression
• Curiosity
• Ideals
• Love
• Stability
Five Values
• Self-transcendence
• Conservation
• Self-enhancement
• Hedonism
• Openness-to-Change
The World is our Lab: 12 Labs Worldwide in 10 Countries
Map labels: China, Watson, Almaden, Austin, Japan, Israel, Switzerland, India, Ireland, Australia, Africa
IBM Research worldwide has 1600+ PhDs across a diversity of disciplines:
• Behavioral Science
• Chemistry
• Electrical Engineering
• Computer Science
• Materials Science
• Mathematical Science
• Physics
• Services Science
09/09/13
IBM Research – Brazil view from Rio de Janeiro Lab
Mission: To be known for our science and technology and vital to IBM, Brazil, our clients in the region and worldwide
IBM Research - Brazil
Natural resources modeling, analytics, and logistics.
Citizen engagement, and education.
Large service operations, human data analytics, and client experience.
Micro/nano-technologies aimed at addressing smarter planet challenges.
Rio de Janeiro
São Paulo
A team of world-class researchers in close connection to the other 12 IBM Research labs and to the world’s best scientific, academic, and development communities.
Rio de Janeiro Incident Management: Research technologies on fine-grained weather and flood modeling and prediction
Total precipitation forecast 48 hours ahead
High-resolution area: 90x90 km at 1 km resolution
Figure labels: Rio de Janeiro City, topography, flooding points, flooding prediction
Service Systems Research
Mission: to re-invent information-based services such as banking, telecom, e-commerce, IT services, online education, and media and entertainment.
Goal: a top research group in human-computer interaction, social computing, human data analytics, service operations research, and cloud computing platforms.
Service Systems disciplines:
• HCI, CSCW, Design
• Artificial Intelligence
• Knowledge and Data Mining
• Service Science
• Distributed Computing
• Operations Research
• Social Computing
Application areas (diagram labels):
• social business analytics: marketing, segmentation, market analytics
• client experience technology: CRM tools, workforce management, service quality
• IT service delivery operations: incident management, ITIL, staffing, automation
• cloud computing services platforms: business, analytics
Leader: Claudio Pinhanez
TEDxESPM - Claudio Pinhanez - IBM Research
http://www.youtube.com/watch?v=17W4ZhSB2Og
Project: Smart Map Services (in Smart Analytics Cloud)
Smart Map Services: translates application/data profile, SLA, and expected QoS into an optimal virtual cluster configuration using performance prediction models (cooperation with the team led by Isabelle Rouvellou).
Short Term Challenge: What mechanisms are required to handle inaccurate performance models when generating virtual cluster configurations?
Long Term Challenge: How can understanding user interaction with the Cloud and context-awareness optimize Cloud services and enhance user experience?
Marco Netto, Marcos Assunção, Renato Cunha
Smart Analytics Cloud
Project goal: demonstrate a Workload Optimized Cloud that can smartly map and execute Big Data/HPC workloads on a heterogeneous computing infrastructure.
Approach: align with and build on GTS’s Analytics Cloud to facilitate adoption in IBM SmartCloud offerings; ensure buy-in and close interlock with stakeholders in GTS (Offering/Architecture), STG (Platform Computing), SWG (Big Insight) teams.
Project: Workload Profile Analytics in Service Delivery
Goal: develop a new analytics tool for incident management in GTS-ITD to:
• characterize the performance of service delivery workload profiles;
• identify productivity/service quality issues (map transformation impacts, delivery problems, etc.).
Input: selected fields from ticket data (ticket report, start, and finish times) from:
• ISM Maximo (current prototype);
• Maximo Dispatcher (in development).
Victor Cavalcante, Claudio Pinhanez, Rogério de Paula, Ana Appel, Carolina Andrade, Nelson Nauata, Franklin Amorin
1st step: computes the log-log density map of the distribution of tickets according to SLA-normalized assignment and resolution times: the Workload Profile Chart, or WPC.
[Figure: Workload Profile Chart. Both axes are logarithmic, running from 0.01% to 10000% of the SLA; regions are marked "SLA OK", 1a, 1b, and 1c.]
2nd step: using the WPC translation chart, maps high concentrations of tickets into specific issues of the incident management system.
Output: list of productivity and service quality issues.
Example issue: a large number of false alarms or tickets of extremely easy resolution.
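The first step described above (placing each ticket on logarithmic axes by its SLA-normalized assignment and resolution times) can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the ticket tuple layout, and the choice of one bin per decade are all assumptions, and times are assumed to be strictly positive.

```python
import math
from collections import Counter

def wpc_bin(assign_time, resolve_time, sla_assign, sla_resolve):
    """Return the (x, y) decade bin of one ticket on a log-log chart.

    Each time is divided by its SLA target, so a ratio of 1.0 means 100% of
    the SLA. floor(log10(ratio)) picks the decade: 0 covers [100%, 1000%),
    -1 covers [10%, 100%), and so on. Times must be strictly positive.
    """
    x = math.floor(math.log10(assign_time / sla_assign))
    y = math.floor(math.log10(resolve_time / sla_resolve))
    return x, y

def workload_profile(tickets):
    """Count tickets per decade bin (hypothetical helper)."""
    return Counter(wpc_bin(*ticket) for ticket in tickets)

# Hypothetical tickets: (assignment time, resolution time, SLA assignment, SLA resolution), in hours.
tickets = [
    (0.5, 2.0, 4.0, 8.0),    # well within SLA on both axes
    (0.5, 3.0, 4.0, 8.0),    # same bin as the ticket above
    (8.0, 40.0, 4.0, 8.0),   # SLA exceeded on both axes
    (0.02, 0.04, 4.0, 8.0),  # resolved almost instantly
]
profile = workload_profile(tickets)
for (x, y), count in sorted(profile.items()):
    print(f"assignment decade {x}, resolution decade {y}: {count} ticket(s)")
```

A dense bin far below the SLA, such as the last ticket's, is the kind of concentration the second step would map to an issue like false alarms.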
Project: clientfaces – Communities of Account Knowledge and Expertise
Problem: dispersion and loss of knowledge about clients after SO delivery transformation from clusters to SLs (and GDF).
Approach: to create sustainable, attractive virtual communities to gather, integrate, and use knowledge and expertise related to each specific client.
Rogério de Paula, Vagner Santana, Silvia Bianchi, Claudio Pinhanez, Victor Cavalcante, Carina Missae, Alessandro Costa, Elaine Watanabe
Community content: Agenda, People, Status, Knowledge
Data sources and content:
• Bluepages
• ISM Maximo (tickets, changes)
• Account Subscription, User Preferences
• Q&A/Tips/Workarounds
• Headlines/Announcements/Changes
• Quickr
• DACMT
• Account ticket history
• Connections
ClientFaces platforms: web, mobile, command center
Screen elements: ticket status overview, client headlines, knowledge thumbnail, people thumbnail, client agenda, other thumbnails
Project: Social Media Behavior Simulation
Goal: to create a tool for companies to explore the impact and result of social media actions through simulation.
Applications: exploration of effort size and impact of marketing campaigns; determination of counter-information measures in viral media outbreaks.
Maira Gatti, Ana Appel, Claudio Pinhanez, Rogério de Paula, Cicero dos Santos, Alexandre Rademaker, Paulo Cavalin, Samuel Barbosa, Daniel Gribel
Simulation of Obama/Romney Twitter campaigns in the last month before the election
Obama’s network: 23,856,961 followers
Romney’s network: 1,675,792 followers
Sample, Sept 22 to Oct 29, 2012:
• Obama’s network: 5.6M tweets, 24,526 active users, 3,594 followers
• Romney’s network: 5.1M tweets, 28,145 active users, 5,498 followers
Ei!
IBM Research - Brazil: Claudio Pinhanez, Maira Gatti, Paulo Cavalin, Cicero Nogueira, Alexandre Rademaker, Alan Braz, Rogerio de Paula, Daniel Gribel
SWG: Tiago Gomes, Fabio Silva, Robson Romão
More at: http://www.eitreinadores.com.br/
Ei! 194 Million Brazilians Helping their National Team’s Coach
An app made specifically for one person: Luiz Felipe Scolari, coach of the Brazilian national soccer team.
Ei! is an app that identifies, filters and analyzes all the Twitter comments that Brazilians have made during the games.
With the touch of a button, Scolari will know what the country consensus is on:
At half time: which players the audience likes and hates, what changes should be made, which tactics should be explored, which player needs to be brought on…
After the game: his country’s perspective on the team, the players, and his performance as a coach.
HOW TO HEAR AND MAKE SENSE OF THE ONLINE CROWD IN REAL-TIME IN PORTUGUESE?
How to develop and deploy a solution in 2 months AND make sure it won’t fail?
Buzz Video
500,000 views goal. Exceeded: 1,208,466 views as of 19.06
Subscribers: 114 new subscribers (1900% compared to the previous week)
Ei! Buzz Video
The concept revolves around the passionate participation of Brazilians during a soccer event and their desire to express their opinions about their team and these important games. But for the first time, someone is listening: Ei!, our Social Sentiment app. This high-tech solution is symbolized in the form of an ear. In a number of scenes reflecting the diversity of the Brazilian population and the wide range of futebol fans and “wanna-be” coaches, we show people in different situations during a football event (the Confederations Cup is not mentioned explicitly), using their “ear” devices. What they are actually saying is not heard and is not relevant. What matters is showing the passion and enthusiasm of the entire Brazilian population during this event.
Voice over: Now everything you say, Brazil’s coach can hear. Meet Ei!, an IBM tool that analyzes all Twitter activity and provides insights into what millions of Brazilians think about our national team.
CTA: You can be an assistant coach too. eitreinadores.com.br
IBM logo/Ei chubby
To be seen on: http://youtu.be/h5S9Tm0Lvpk
Most Popular Posts on IBM Brasil Facebook
109 likes, 138 shares 7 comments, 278 clicks
77 likes, 56 shares 4 comments, 137 clicks
114 likes, 29 shares 1 comment, 115 clicks
133 likes, 139 shares 4 comments, 51 clicks
Would you like your opinion to be heard on the field? With IBM this is possible, watch the video and share it.
The first Social Sentiment analysis done by Ei! is on the IBM site. Check it out and see which result is the most surprising.
And our Ei! app reached Felipão!
Via his PR representative, we know that Felipão Scolari’s supporting team accessed the app. Yes, Felipão, every Brazilian is indeed a futebol coach.
Challenges
• Real-time issues
– Up to 5 million tweets per match
– Up to 20 thousand tweets per minute
• Texting vs. writing: casual language
• nao disse , Balotelli ia meter gol hoje , um golaço ainda , madero aquele negoo
• hora de colocar o Leandro né Felipão ? u.u
• vou ser repetitivo de novo , mas : na minha epoca de jovem torcedor da seleção brasileira , brasil nao tomava gol de p### de chile não viu
• jah to vendo o Brasil faze nois passa vergonha na copa ! ! ! pq meu g-zuis ...
• acho q o ronaldinho tem que ser totula
• Com todo o respeito , Luis Fabiano , popcorn men hahahahaha beijo para quem entendeu , pior piada ever ! Haha
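The casual language in the tweets above (missing accents, chat abbreviations, emoticons, spaced-out punctuation) usually has to be normalized before any classification. The sketch below shows one simple dictionary-based approach. It is an illustration only: the slang table, the emoticon set, and the function name are assumptions, not part of the Ei! tool.

```python
# Hypothetical slang table: chat spellings mapped to standard Portuguese.
SLANG = {
    "q": "que",
    "pq": "porque",
    "jah": "já",
    "nao": "não",
    "epoca": "época",
}

# Hypothetical emoticon list; tokens found here are dropped.
EMOTICONS = {"u.u"}

def normalize(tweet):
    """Lower-case a tweet, drop emoticons and pure punctuation, expand slang."""
    tokens = []
    for tok in tweet.lower().split():
        if tok in EMOTICONS:
            continue
        # Skip tokens with no letters or digits, e.g. "," "?" "!" "...".
        if not any(ch.isalnum() for ch in tok):
            continue
        tokens.append(SLANG.get(tok, tok))
    return tokens

print(normalize("hora de colocar o Leandro né Felipão ? u.u"))
print(normalize("acho q o ronaldinho tem que ser totula"))
```

A lookup table like this cannot repair misspellings such as "totula" or "madero"; those would need a spelling-correction step, which is outside this sketch.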
Key Big Data Challenge – Velocity
InfoSphere Streams: A Platform for Real-Time Analytics on Big Data
• Volume: terabytes per second, petabytes per day
• Variety: all kinds of data, all kinds of analytics
• Velocity: insights in microseconds
Capabilities: millions of events per second, microsecond latency, traditional and non-traditional data sources, real-time delivery, powerful analytics
Application areas: algorithmic trading, telco churn prediction, smart grid, cyber security, government/law enforcement, ICU monitoring, environment monitoring
Streams Runtime Illustrated
Commodity hardware: laptop, blades, or high-performance clusters (four x86 hosts shown)
Optimizing scheduler assigns jobs to hosts, and continually manages resource allocation
Operators shown: Meters, Company Filter, Usage Model, Usage Contract, Text Extract, Season Adjust, Daily Adjust, Temp Action
Streams Runtime Illustrated
Commodity hardware: laptop, blades, or high-performance clusters (five x86 hosts shown)
Optimizing scheduler assigns PEs to hosts, and continually manages resource allocation
Dynamically add hosts and jobs
New jobs work with existing jobs
Operators shown: Meters, Company Filter, Usage Model, Usage Contract, Temp Action, Text Extract, Degree History, Compare History, Store History, Season Adjust, Daily Adjust
Ei!: a Twitter-Monitoring Tool of Brazilian Soccer
IBM Brazil is analyzing all public Portuguese-language tweets about the Brazilian soccer team's performance in all of its games during the FIFA Confederations Cup.
The analysis is performed by the Ei! tool developed by IBM Research - Brazil specifically for the event.
Ei! is Built on FAMA: Real-Time Social Media Polarity Analysis Tool for Portuguese Language
FAMA is a social sentiment analysis tool for the Portuguese language developed by IBM Research - Brazil.
FAMA processes text related to topics of interest which appear in social media (Twitter, Facebook, ReclameFacil, etc.) or in private text repositories such as customer complaints or call center logs.
FAMA can determine polarity related to the topics of interest: positive, negative, or neutral.
FAMA can find the most commonly used terms and their co-occurrences with the topics of interest.
“FAMA”: named after Fama, the Roman goddess of gossip and rumor (Pheme in Greek mythology).
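The co-occurrence analysis mentioned above can be illustrated with a small counting sketch. It assumes tweets have already been tokenized and that topics of interest are given as a set; the function name and the sample tweets are hypothetical, and FAMA's actual method is not described in the slides.

```python
from collections import Counter

def cooccurrences(tweets, topics):
    """Count how often each term appears in the same tweet as a topic.

    tweets: iterable of token lists; topics: set of topic tokens.
    Returns a Counter keyed by (topic, term) pairs.
    """
    pairs = Counter()
    for tokens in tweets:
        present = set(tokens)  # count each term once per tweet
        for topic in topics & present:
            for term in present - {topic}:
                pairs[(topic, term)] += 1
    return pairs

# Hypothetical, already-normalized tweets.
tweets = [
    ["neymar", "golaço"],
    ["seleção", "neymar", "gol"],
    ["neymar", "gol"],
]
pairs = cooccurrences(tweets, {"neymar", "seleção"})
for (topic, term), count in pairs.most_common(3):
    print(f"{topic}:{term} {count}")
```

The `topic:term` output format mirrors pairs such as `Neymar:Gol` that appear on the co-occurrence slides later in the deck.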
FAMA: Real-Time Social Media Polarity Analysis in Portuguese
[Architecture diagram of FAMA. Components: JSONs, Stream Computing (InfoSphere Streams), Text Analytics, Text Classifier, learned database, classified database, dashboard user interface.]
FAMA Analysis of a Tweet: Example of Text Classification
vou ser repetitivo de novo , mas : na minha epoca de jovem torcedor da seleção brasileira , brasil nao tomava gol de p### de chile não viu
Segments:
• vou ser repetitivo de novo , mas : na minha epoca de jovem torcedor da seleção brasileira
• brasil nao tomava gol de p### de chile não viu
feature: bad word
verbs: vou, ser, tomava
nouns: epoca, brasil, gol, chile, seleção
adjectives: repetitivo, jovem, brasileira, palavrão
Lemmas:
• vou: ir (to go)
• ser: ser (to be)
• tomava: tomar (to suffer)
• p###: palavrão (bad word)
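The lemma substitutions shown on this slide can be expressed as a simple lookup, sketched below. Only the four mappings listed on the slide are included; the function name and the bad-word flag are illustrative assumptions, and a real system would rely on a full Portuguese lemmatizer rather than a hand-written table.

```python
# Mappings taken from the slide: surface form -> lemma.
LEMMAS = {
    "vou": "ir",
    "ser": "ser",
    "tomava": "tomar",
    "p###": "palavrão",  # masked swear word becomes a generic "bad word" token
}
BAD_WORD_LEMMA = "palavrão"

def extract_features(tokens):
    """Replace known surface forms by their lemma and flag bad words."""
    lemmas = [LEMMAS.get(tok, tok) for tok in tokens]
    return lemmas, BAD_WORD_LEMMA in lemmas

segment = "brasil nao tomava gol de p### de chile não viu".split()
features, has_bad_word = extract_features(segment)
print(features)
print("bad word feature:", has_bad_word)
```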
FAMA Offline System for Learning Polarity
[Diagram. Components: old games database, manually annotated corpus, Naïve Bayes Classifier, learned database. In development: WordNet, Synset Classifier.]
Construction of the Learned Database from Manual Analysis of Tweet Samples
The data for the learned database is created by manual inspection of tweets:
• about 2000 tweets from 4 friendly matches
• 15 different coders with different degrees of interest in and knowledge of soccer
• a tool is used to display, collect, and process the data
Algorithm for Offline Learning of the Polarity of Words from Tweets in the Learned Database
nao disse , Balotelli ia meter gol hoje , um golaço ainda , madero aquele negoo
unigrams (after normalization and lemmatization): nao, dizer, Balotelli, ir, meter, gol, hoje, ainda, maneiro, negro
Annotated class: Positive
Class c ∈ {Positive, Negative, Neutral}
Feature f = normalized words
m features in a tweet d
n_i(d) represents the count of feature f_i in d
Polarity of a word is determined by how often the word (after lemmatization) appears in manually annotated samples which were considered positive or negative.
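The learning step described above, estimating how often each normalized word appears in tweets annotated with each class, can be sketched as a count-based Naïve Bayes estimate. The add-one (Laplace) smoothing, the function name, and the toy corpus are assumptions made for illustration; the slides do not say how FAMA handles unseen words.

```python
from collections import Counter

CLASSES = ("Positive", "Negative", "Neutral")

def learn_word_polarity(annotated, alpha=1.0):
    """Estimate P(word | class) from manually annotated tweets.

    annotated: list of (features, label) pairs, where features is a list of
    normalized words. alpha is an assumed add-alpha smoothing constant.
    """
    counts = {c: Counter() for c in CLASSES}
    vocab = set()
    for features, label in annotated:
        counts[label].update(features)
        vocab.update(features)
    table = {}
    for c in CLASSES:
        total = sum(counts[c].values()) + alpha * len(vocab)
        table[c] = {w: (counts[c][w] + alpha) / total for w in vocab}
    return table

# Hypothetical annotated corpus.
corpus = [
    (["gol", "golaço"], "Positive"),
    (["gol", "vergonha"], "Negative"),
    (["jogo"], "Neutral"),
]
learned = learn_word_polarity(corpus)
for c in CLASSES:
    print(c, {w: round(p, 3) for w, p in sorted(learned[c].items())})
```

The resulting per-class word probabilities play the role of the "learned database" that later slides look up when scoring a new tweet.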
Analyzing a Tweet: Overall Polarity
vou ser repetitivo de novo , mas : na minha epoca de jovem torcedor da seleção brasileira , brasil nao tomava gol de p### de chile não viu
Stop words are removed; per-feature values come from the learned database.

feature      Positive   Negative   Neutral
ir           0.01%      0.05%      0.07%
ser          0.02%      0.02%      0.08%
repetitivo   0.01%      0.07%      0.02%
de novo      0.02%      0.05%      0.03%
mas          0.02%      0.04%      0.08%
epoca        0.01%      0.01%      0.02%
jovem        0.09%      0.04%      0.02%
torcedor     0.07%      0.04%      0.05%
seleção      0.09%      0.08%      0.05%
brasileira   0.08%      0.03%      0.07%
brasil       0.05%      0.07%      0.04%
não          0.04%      0.08%      0.03%
tomar        0.04%      0.06%      0.08%
gol          0.08%      0.04%      0.01%
palavrão     0.01%      0.09%      0.04%
chile        0.02%      0.03%      0.06%
não          0.04%      0.08%      0.03%
ver          0.04%      0.03%      0.07%
PROD         7.43E-64   4.68E-61   9.10E-62
POLARITY     0.1%       83.6%      16.3%
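The PROD and POLARITY rows can be reproduced from the per-feature values: multiply each column, then divide each product by the sum of the three. The sketch below does this in log space to avoid underflow on longer tweets. It assumes equal class priors, which is consistent with the numbers on the slide but is not stated there; the variable and function names are illustrative.

```python
import math

# (feature, Positive %, Negative %, Neutral %) copied from the slide's table.
ROWS = [
    ("ir", 0.01, 0.05, 0.07), ("ser", 0.02, 0.02, 0.08),
    ("repetitivo", 0.01, 0.07, 0.02), ("de novo", 0.02, 0.05, 0.03),
    ("mas", 0.02, 0.04, 0.08), ("epoca", 0.01, 0.01, 0.02),
    ("jovem", 0.09, 0.04, 0.02), ("torcedor", 0.07, 0.04, 0.05),
    ("seleção", 0.09, 0.08, 0.05), ("brasileira", 0.08, 0.03, 0.07),
    ("brasil", 0.05, 0.07, 0.04), ("não", 0.04, 0.08, 0.03),
    ("tomar", 0.04, 0.06, 0.08), ("gol", 0.08, 0.04, 0.01),
    ("palavrão", 0.01, 0.09, 0.04), ("chile", 0.02, 0.03, 0.06),
    ("não", 0.04, 0.08, 0.03), ("ver", 0.04, 0.03, 0.07),
]

def polarity(rows):
    """Return normalized (positive, negative, neutral) shares for one tweet."""
    # Sum of logs equals the log of the PROD row for each column.
    logs = [sum(math.log(row[i] / 100.0) for row in rows) for i in (1, 2, 3)]
    peak = max(logs)
    weights = [math.exp(v - peak) for v in logs]  # rescale before exponentiating
    total = sum(weights)
    return [w / total for w in weights]

pos, neg, neu = polarity(ROWS)
print(f"Positive {pos:.1%}  Negative {neg:.1%}  Neutral {neu:.1%}")
# prints: Positive 0.1%  Negative 83.6%  Neutral 16.3%
```

Checking by hand: 4.68E-61 / (7.43E-64 + 4.68E-61 + 9.10E-62) ≈ 0.836, which matches the Negative entry in the POLARITY row.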
Players and Main Topics - Positive
[Chart: timeline across the 1st half and 2nd half, with event markers 1x0, 2x0, 3x0, goal saved, penalty lost.]
Players and Main Topics - Negative
[Chart: timeline across the 1st half and 2nd half, with event markers 1x0, 2x0, 3x0, goal saved, penalty lost.]
1st Half - Words Mentioned Together - Positive
Brasil 3 x 0 Japão – 15/06 – 1st half – positive tweets
Neymar:Golaço, Seleção:Neymar, Seleção:Vamos, Seleção:Japão, Neymar:Gol, Seleção:Joga
2nd Half - Words Mentioned Together - Positive
Brasil 3 x 0 Japão – 15/06 – 2nd half – positive tweets (left) and negative tweets (right)
Seleção:volante, Seleção:melhor, Hulk:Homem, Oscar:Joga, Seleção:sair, Neymar:sair, Seleção:joga, Oscar:passe, Seleção:Gol, Seleção:Jô, Seleção:vamos, Seleção:mudar, Seleção:joga, Seleção:vitória
2nd Half - Words Mentioned Together - Negative
Paulinho:Gol, Seleção:chega, Hulk:Homem, Hulk:Sair, Seleção:2-0, Seleção:vamos, Oscar:Joga, Neymar:tirar, Hulk:tirar, Neymar:sair
Tweets per match, from Brasil 3 x 0 Japão (15/06) to Brasil 3 x 0 Espanha (30/06):
• Brasil 3 x 0 Japão – 15/06: 347,344 tweets
• Brasil 2 x 0 México – 19/06: 382,813 tweets
• Brasil 4 x 2 Itália – 22/06: 299,590 tweets
• Brasil 2 x 1 Uruguai – 26/06: 479,173 tweets
• Brasil 3 x 0 Espanha – 30/06: 1,564,635 tweets
BRL Technology for Main Marketing Campaign of IBM Brazil in 2013
YouTube commercial - 1.5M views
TV sponsorship - 22M total audience
above-expectation campaign results
Campaign goal: “Make the greatness of IBM real for Brazilians.” Rodrigo Kede, General Manager, IBM Brazil
Future Work: Removing Biases in Social Media Data
Digital access
Age
Class
Gender
Literacy
Geography
Optimism
Position on issues
Fans/opponents
...
IBM Research – Brazil
http://www.research.ibm.com/brazil/
Alan Braz - @alanbraz