Tweet Rises: Twitter Sentiment Analysis
Aleksander Bello
Archan Luhar
Alexandru Cioc (acioc@caltech.edu)
Louis O'Bryan (lobryan@caltech.edu)
Victor Duan (vduan@caltech.edu)

ABSTRACT
This paper describes the work of Tweet Rises, a California Institute of Technology CS 145 group. The work combines Twitter sentiment analysis with effective web-based visualization. The group worked with several forms of natural language processing and machine learning, and with two primary visualization methods.

Categories and Subject Descriptors
H.3.5 [Online Information Services]: Web-based services; H.5.3 [Group and Organization Interfaces]: Web-based interaction

General Terms
Visualization

Keywords
Twitter, sentiment analysis, 2D visualization

1. INTRODUCTION
How is the world feeling right now? That is a hard question, so to make it easier and narrow the scope to the quantifiable, we ask: how is Twitter feeling right now? We propose an application and underlying infrastructure to categorize Tweets based on emotional content, create a representative sample, and visualize the sentiments on a map in a web browser in real time.

2. FRONTEND
To make our work available to the widest audience, we decided to visualize our results within a web browser. A major challenge was first developing a working prototype and then iterating on it weekly. One of our most important priorities was to avoid flooding clients with too much information while still providing enough to make viewing the website meaningful. Our development primarily used the Google Maps API to achieve a fast and effective visualization. Google Maps allowed us to focus on the visualization itself and saved us the time of creating a scalable map of the United States.

Figure 1: Heatmap of Twitter sentiment.
Originally, we intended to visualize our results using a heat map, but we quickly discovered that a state map was easier to understand. Our primary concern with the heat map was that the Google Maps API heatmap automatically rescales based on the amount of data it receives. As a result, within seconds, data points from states and cities with small populations fade into obscurity, and the heat map effectively shows data only for cities like Los Angeles, San Francisco, and New York City.

We therefore focused on a different type of visualization, the state map. The idea came from surveying various forms of 2D visualization and seeing that the geographic electoral maps used during U.S. elections are particularly understandable. This form of visualization depicts an entire state with a solid color that indicates that state's political preference. We believed that, since people are already accustomed to reading these maps from news coverage, we could lower the barrier to understanding what our data depicts.

Figure 2: Example US electoral map.

In our own work, we therefore created an independent geometric shape for each state using the Google Maps API and colored each state with the average color of the Tweet sentiments it received. Negative emotions are depicted in red and positive emotions in blue.

Figure 3: State map of Twitter sentiment.

An issue that quickly arose was that, when averaging the color over every Tweet received, every state would eventually become the same shade of purple, reflecting roughly equal amounts of red and blue sentiment. We believe this makes sense: Twitter has a very wide audience, so whenever a wave of positive or negative sentiment appears, people often respond with the opposite sentiment, and over a prolonged period the two balance out. To alleviate this, we allow the user to specify how many Tweets to average over. Lowering the number to something more manageable, such as 10 Tweets, shows a more meaningful rise and fall of positive and negative sentiment.

The final touch to our frontend's visualization was a sidebar listing trending topics and their overall sentiments. Clicking on a topic lets users view the visualization for that single topic. This provides more meaningful information for users who want to gauge sentiment for a single topic rather than for the overall Tweet stream. The sidebar also lets users estimate, at a glance, what a topic's sentiment is.
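The windowed color averaging described earlier in this section can be illustrated with a short Python sketch. This is not the project's frontend code: the class name `StateColor`, the RGB endpoints, and the linear blend between red and blue are assumptions made for illustration, and the default window of 10 simply mirrors the example figure given in the text.

```python
from collections import deque

# Assumed endpoints: the paper only says negative is red and positive is blue.
NEGATIVE_RGB = (255, 0, 0)
POSITIVE_RGB = (0, 0, 255)


class StateColor:
    """Tracks the most recent `window` Tweet sentiments for one state (hypothetical helper)."""

    def __init__(self, window=10):
        # A deque with maxlen drops the oldest sentiment once the window is full.
        self.recent = deque(maxlen=window)

    def add(self, is_positive):
        """Record one classified Tweet for this state."""
        self.recent.append(bool(is_positive))

    def color(self):
        """Return the blended RGB color, or None if no Tweets have been seen."""
        if not self.recent:
            return None
        share_positive = sum(self.recent) / len(self.recent)
        # Linear blend from red (all negative) to blue (all positive).
        return tuple(
            round(neg + (pos - neg) * share_positive)
            for neg, pos in zip(NEGATIVE_RGB, POSITIVE_RGB)
        )
```

Because only the last `window` sentiments are kept, an old burst of negative Tweets stops influencing the color once enough newer Tweets arrive, which is what keeps states from converging to a uniform purple.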
The sidebar can also serve as an exploratory tool, since it gives users a chance to notice outliers (topics that are heavily weighted towards one sentiment) and then gives them easy access to the map for that topic. Overall, we believe our techniques provided an effective visualization of our Twitter sentiment data.

3. NATURAL LANGUAGE PROCESSING
The first step in analyzing the Tweets is natural language processing (NLP). The techniques used to classify the Tweets all rely on a bag-of-words approach. As such, the NLP portion of the project focused on developing an efficient way to get the best bag of words from any given Tweet. Initially, we eliminate obvious stop words that don't contribute to the content of a Tweet. This list includes words such as "a", "I", and "she". Afterwards, we define a word as a string of alphabetic characters with whitespace on both sides. Note that this ignores things such as numbers and emoticons.

Once the set of words has been computed for each Tweet in the training data, mutual information is used to determine the words that provide the most insight into the content of the Tweets. For our purposes we used about 1000 words. Each Tweet was then characterized by which of these 1000 words appeared in it. For example, for the Tweet "I am not happy. He is not happy" and the mutual information words "not" and "happy", the Tweet would be characterized as ["not", "happy"]. Note that the number of times a word appears is not taken into account. Once a Tweet has been characterized by the above steps, it is passed along to the machine learning portion of the classification.

4. MACHINE LEARNING
Our machine learning methods consisted of four algorithms: Naïve Bayes, stochastic gradient descent, support vector machines, and maximum entropy. Our first implementation was Naïve Bayes, due to its simplicity.
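As a concrete illustration of the Section 3 pipeline that feeds these classifiers, the following Python sketch tokenizes Tweets, ranks words by mutual information with the sentiment label, and characterizes a Tweet by word presence. It is a minimal sketch under stated assumptions, not the group's code: the stop-word list is a tiny illustrative subset, the regular expression approximates the whitespace-delimited word definition by extracting maximal alphabetic runs, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative subset only; the paper does not give its full stop-word list.
STOP_WORDS = {"a", "i", "she", "he", "is", "am", "the"}


def tokenize(tweet):
    """Return the set of lowercase alphabetic words that are not stop words."""
    words = re.findall(r"[a-z]+", tweet.lower())
    return {w for w in words if w not in STOP_WORDS}


def mutual_information(tweets, labels, word):
    """Mutual information (in bits) between the presence of `word` and the label."""
    n = len(tweets)
    joint = Counter((word in tokenize(t), y) for t, y in zip(tweets, labels))
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (p, _), c in joint.items() if p == present) / n
        p_y = sum(c for (_, l), c in joint.items() if l == label) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def select_vocabulary(tweets, labels, size):
    """Keep the `size` words with the highest mutual information (ties broken alphabetically)."""
    candidates = set().union(*(tokenize(t) for t in tweets))
    ranked = sorted(
        candidates, key=lambda w: (-mutual_information(tweets, labels, w), w)
    )
    return ranked[:size]


def characterize(tweet, vocabulary):
    """List the vocabulary words present in the Tweet; repeat counts are ignored."""
    present = tokenize(tweet)
    return [w for w in vocabulary if w in present]
```

With the vocabulary `["not", "happy"]`, calling `characterize("I am not happy. He is not happy", ["not", "happy"])` yields `["not", "happy"]`, matching the example in Section 3. The sketch recomputes tokens on every call for clarity; a real pipeline over many Tweets would cache them.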
Naïve Bayes predicts the classification of an observation by providing a particularly simple formula for the probability that an outcome $C$ is observed given that features $F_1, F_2, \ldots, F_n$ are present in the observation. These probabilities can be compared across outcomes to find the most likely one. Specifically, the model assumes that the feature variables $F_1, F_2, \ldots, F_n$ are conditionally independent given the outcome, and this assumption implies that

$$p(C \mid F_1, F_2, \ldots, F_n) = \frac{1}{Z}\, p(C) \prod_{i=1}^{n} p(F_i \mid C)$$

where $Z = p(F_1, F_2, \ldots, F_n)$ is the evidence for these features. In our case, the outcome $C$ was whether the Tweet had positive or negative sentiment, the features were the words determined by the mutual information algorithm, and the probabilities $p(F_i \mid C)$ were estimated from the training data based on how often each word appeared. The formula therefore gave a way to compare the likelihood of the two sentiments given the words in the Tweet.

Figure 4: Performance of Naive Bayes, Maximum entropy, and Support vector machines algorithms for up to 10,000 training Tweets.

Figure 5: Performance of Naive Bayes classifier for up to 1.6 million Tweets.

One problem with this approach was that if a word did not appear in both positive and negative sentiment Tweets, the probability $p(F_i \mid C)$ was zero for one of the outcomes. Although these terms were unlikely, they occurred in our calculation when we limited the number of training Tweets. We decided to simply leave these terms out of our calculation.

For the other three algorithms (stochastic gradient descent, support vector machines, and maximum entropy), we used the implementations in the Python scikit-learn package. Our work with these algorithms therefore mainly involved tuning the parameters of the scikit-learn functions. For example, for stochastic gradient descent we changed the loss function, the number of iterations, the learning rate, and whether or not to fit the intercept.

Using 10,000 training Tweets, all algorithms except stochastic gradient descent had an accuracy between 65% and 75% on the test data set. The performance of our stochastic gradient descent implementation was poor, so we left it out in the end. The support vector machines algorithm performed better than the other algorithms when the number of training Tweets was less than 10,000, achieving over 70% accuracy. Only our Naïve Bayes algorithm was able to process significantly larger training sets in a reasonable amount of time. We were able to train Naïve Bayes on 1.6 million Tweets, which gave the algorithm almost 80% accuracy, outperforming the others.
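The Naïve Bayes comparison described above, including the choice to leave out terms whose estimated probability is zero, can be sketched in Python as follows. This is an illustration rather than the group's implementation: the function names are hypothetical, the scores are computed in log space for numerical convenience, and only words present in the Tweet contribute, which is an assumption the paper does not spell out. The evidence term Z is omitted because it is the same for both outcomes.

```python
import math
from collections import Counter


def train(feature_lists, labels):
    """Estimate p(C) and p(F_i | C) from bag-of-words training data."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for words, label in zip(feature_lists, labels):
        # set() so that a word counts once per Tweet, as in the paper.
        word_counts[label].update(set(words))
    total = len(labels)
    priors = {c: n / total for c, n in class_counts.items()}
    likelihoods = {
        c: {w: k / class_counts[c] for w, k in word_counts[c].items()}
        for c in class_counts
    }
    return priors, likelihoods


def classify(words, priors, likelihoods):
    """Return the outcome with the larger log p(C) + sum of log p(F_i | C)."""
    scores = {c: math.log(p) for c, p in priors.items()}
    for w in set(words):
        probs = {c: likelihoods[c].get(w, 0.0) for c in priors}
        if min(probs.values()) == 0.0:
            # The word was never seen with some outcome: leave the term out.
            continue
        for c, p in probs.items():
            scores[c] += math.log(p)
    return max(scores, key=scores.get)
```

One consequence of dropping zero-probability terms is visible in the sketch: a word seen with only one sentiment in training contributes nothing to the decision, even though it may be strongly indicative. Additive smoothing is a common alternative, but it is not what the paper describes.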
The other algorithms may have performed better with the same amount of training data, but they already took significantly longer at only 20,000 Tweets, so this was impractical to test.

5. BACKEND
Before we can produce any results, we need the raw Tweets, i.e., their text and geolocation information. These are obtained through the Twitter 1% firehose API. A persistent connection between our backend and Twitter continuously streams new Tweets in real time. More specifically, we have two open streams with Twitter: one to get a 1% sample of all Tweets, and one to get a sample only for the specified trending topics. The trending topics are collected and updated by a script that runs periodically. This approach also allows for custom trending topics that we might like to add. All of these Tweets are stored in Redis, a simple in-memory database.

Figure 6: An overview of the backend infrastructure. The ellipses represent the NLP workers, while the rhomboids represent instances of the frontend servers.

There is one worker process for each stream, so that both streams can be consumed at the same time. It is worth noting that the official Twitter API documentation does not allow multiple streams. Moreover, these consumer workers have to be fast enough to keep up with the upstream Tweets; otherwise the connection will be dropped. To mitigate these two issues (and any other connection issues that might arise), several supervisor mechanisms are set up to restart these services.

After the raw Tweets are obtained, they need to be processed before they can be served to the frontend. In the processing step, the NLP/machine learning workers categorize the raw Tweets as positive or negative, extract the geolocation data, and store this information in a second database. This is the most compute-intensive part, but it is fully parallelizable: more workers can be spun up to consume from the raw Tweets queue in parallel. Unfortunately, Twitter does not support querying by both geolocation data and topic, so not all trending-topic Tweets have geolocation data attached. We do, however, store all of them, so that we have a more complete sentiment assessment of the trends.

In parallel with the Twitter consumer and sentiment categorizer, we have a node.js server that interacts with the user client. This server has two objectives: handle the requests for the website's static files, and send sentiment data points in real time via an open socket connection. We used several node.js libraries.
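Before the server details, the worker pattern from the processing stage above (several workers draining one raw Tweet queue and writing results to a second store) can be sketched in Python. This is only a stand-in under loud assumptions: an in-memory `queue.Queue` and a list replace the two Redis databases the paper uses, threads replace the separate worker processes, and `classify_stub` is a hypothetical placeholder for the real NLP/machine learning step.

```python
import queue
import threading


def classify_stub(text):
    """Hypothetical placeholder for the NLP/machine learning classifier."""
    return "positive" if "happy" in text.lower() else "negative"


def worker(raw_tweets, processed, lock):
    """Consume raw Tweets until a None sentinel arrives."""
    while True:
        tweet = raw_tweets.get()
        if tweet is None:
            break
        record = {
            "sentiment": classify_stub(tweet["text"]),
            # Topic-stream Tweets may carry no geolocation; store them anyway.
            "geo": tweet.get("geo"),
        }
        with lock:
            processed.append(record)


def process_all(tweets, num_workers=4):
    """Fan the given Tweets out to `num_workers` workers and collect the results."""
    raw_tweets = queue.Queue()   # stand-in for the raw Tweets queue
    processed = []               # stand-in for the second database
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(raw_tweets, processed, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for tweet in tweets:
        raw_tweets.put(tweet)
    for _ in threads:
        raw_tweets.put(None)     # one sentinel per worker
    for t in threads:
        t.join()
    return processed
```

Because each Tweet is handled independently, adding workers changes only `num_workers`; the results may arrive in any order, so nothing downstream should depend on ordering.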
Using node.js's built-in HTTP server, we mapped all HTTP port 80 requests to a static content folder containing HTML, CSS, images, and JavaScript. Using the third-party socket.io library, we structured the data point communication. On a new socket connection, the server sends all sentiment points less than ten minutes old. Then, for every current trending topic and hard-coded permanent topic, the server sends the last 24 hours' worth of points, limited to a maximum of 1000 points. Independent of the per-connection logic, the server sends all points gathered in the last ten seconds to all connected clients every ten seconds.

6. FINAL PRODUCT
The final result of our project is a real-time Twitter sentiment analysis tool. There are two modes: states map and heatmap. The states map collects the last five sentiments of Tweets from each state and displays the overall sentiment of each state. The heatmap plots individual points for each Tweet it receives, and the colors on the map represent the overall sentiment of that small area.

Additionally, there are topics to choose from on the side. Some of the topics are determined by what is currently trending on Twitter. Others are custom topics that we thought would be interesting for a new user to see upon first visiting the site. Due to the limitations of the API, we were unable to query for topic and location (Tweets with a location tag in the USA) simultaneously. As a result, it was difficult to get much data for topics that also had location data. The currently trending topics often had fairly little data, and the states are not all colored in. For the custom topics, we tried to hold on to the data over a longer period of time, giving us a chance to acquire more information on those topics and display a better map. See Figures 7 and 8 for the final state of the project.

7. FURTHER WORK
We believe further work could be done on our project's frontend. Since our state map is an effective tool, we could create sub-shapes for each state in order to show sentiments for specific counties. We could also continue work on our heat map. We switched to the state map because of inherent problems with the heat map, but we did not have time to return to the heat map and fix the problems we encountered. Thus, while our state map looks like a completed final product, the heat map remains in a rudimentary state. Lastly, we could try to speed up switching between topics in the frontend. There is currently a slowdown after a large number of points have been added, so optimization there would be worthwhile.

8. ACKNOWLEDGEMENTS
We would like to thank Professor Adam Wierman and Lingwen Gan for helpful advice and guidance throughout the project.
Figure 7: Final state map product, focusing on the OKC topic. At the time, the Oklahoma City Thunder were playing the San Antonio Spurs in the 2014 NBA Western Conference Finals.

Figure 8: Final state map product, focusing on the SATs topic.
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationNetpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models
Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models 1 Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models James B.
More informationThree Strategies for Open Source Deployment: Substitution, Innovation, and Knowledge Reuse
Three Strategies for Open Source Deployment: Substitution, Innovation, and Knowledge Reuse Jonathan P. Allen 1 1 University of San Francisco, 2130 Fulton St., CA 94117, USA, jpallen@usfca.edu Abstract.
More informationEdexcel GCSE. Statistics 1389 Paper 1H. June Mark Scheme. Statistics Edexcel GCSE
Edexcel GCSE Statistics 1389 Paper 1H June 2007 Mark Scheme Edexcel GCSE Statistics 1389 NOTES ON MARKING PRINCIPLES 1 Types of mark M marks: method marks A marks: accuracy marks B marks: unconditional
More informationSpecification of the Verity Learning Companion and Self-Assessment Tool
Specification of the Verity Learning Companion and Self-Assessment Tool Sergiu Dascalu* Daniela Saru** Ryan Simpson* Justin Bradley* Eva Sarwar* Joohoon Oh* * Department of Computer Science ** Dept. of
More information10.2. Behavior models