E-Business Centre of Excellence
Department of Industrial & Systems Engineering
Indian Institute of Technology Kharagpur, Kharagpur, INDIA

Short-term course on
"Web Data Analytics using R and Python"


September 6-10, 2016
Registration Latest by 22nd July, 2016

Introduction

The World Wide Web is probably the largest publicly accessible data source in the world. Its data come from sources such as web page content, hyperlinks and usage logs. With the advent of Web 2.0, web page content is no longer limited to what the website administrator creates; it also includes user-generated content such as blogs and social interaction data. Many websites also provide APIs for extracting this user-generated data. This course introduces the following topics through hands-on exercises in R and Python.

Contents: Types of web data; issues in collecting, cleaning, pre-processing and representing web data; introduction to R and Python
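
As a small illustration of collecting and cleaning web page content, the sketch below pulls hyperlinks and whitespace-normalised text out of an HTML fragment using Python's standard html.parser module. The class name LinkAndTextExtractor and the sample page are made up for this example and are not taken from the course material.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collect hyperlink targets and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the href attribute of every anchor tag that has one.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Collapse runs of whitespace and skip chunks that are empty.
        cleaned = " ".join(data.split())
        if cleaned:
            self.text_parts.append(cleaned)


# Hypothetical page used only for illustration.
page = """<html><body>
<h1>Course   page</h1>
<p>See the <a href="/about">about</a> and <a href="/register">registration</a> pages.</p>
</body></html>"""

extractor = LinkAndTextExtractor()
extractor.feed(page)
links = extractor.links
text = " ".join(extractor.text_parts)
```

A real page would usually also need script and style content filtered out before the text is analysed; that step is left out here to keep the sketch short.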

Recommender System

Recommender systems have become an integral part of most e-commerce websites. Such systems recommend products and services relevant to a user based on that user's activities and the activities of similar users. This is achieved by predicting the rating a specific user would give to an item the user has not yet rated. Our focus here is the collaborative filtering based recommender system.

Contents: A generic framework for understanding recommender systems, types of recommender systems, issues in recommender system design, similarity measures, collaborative filtering algorithm, performance measures, hands-on practice
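
To make the idea of predicting a rating for an unrated item concrete, here is a minimal sketch of user-based collaborative filtering with cosine similarity. The ratings dictionary, the user and item names, and the function names are all assumptions for illustration; the course may use a different similarity measure or algorithm variant.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"book": 5, "film": 3, "game": 4},
    "bob": {"book": 4, "film": 2, "game": 5, "song": 4},
    "carol": {"book": 1, "film": 5, "song": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    # None signals that no other user has rated the item.
    return num / den if den else None


prediction = predict_rating(ratings, "alice", "song")
```

Because alice's ratings are closer to bob's than to carol's, the prediction for "song" lands nearer bob's rating of 4 than carol's rating of 2.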

Web Log Analytics

HTTP requests made to a web server are recorded in web log files. These logs are a useful source of data for understanding users' navigation patterns on a website. Knowledge of these patterns can help a company in many ways, from redesigning the website to generating marketing insights. This module helps the learner analyse web logs and derive insights from them, specifically for improving site navigability.

Contents: Web log pre-processing (data collection, cleaning, session identification), web user behaviour as a Markov process, transition probability matrix, hands-on practice
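
The sketch below shows one way a transition probability matrix could be estimated once sessions have been identified: count page-to-page moves within each session, then normalise each row so that it sums to 1. The sessions list and page names are invented for the example, and the first-order Markov assumption (the next page depends only on the current page) is a simplification.

```python
from collections import defaultdict

# Hypothetical sessions: each is the ordered list of pages one visitor viewed.
sessions = [
    ["home", "products", "cart", "checkout"],
    ["home", "products", "home"],
    ["home", "about"],
    ["products", "cart", "products"],
]


def transition_matrix(sessions):
    """Estimate first-order Markov transition probabilities between pages."""
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        # Pair each page with the page that followed it in the same session.
        for src, dst in zip(pages, pages[1:]):
            counts[src][dst] += 1
    matrix = {}
    for src, dests in counts.items():
        total = sum(dests.values())
        matrix[src] = {dst: n / total for dst, n in dests.items()}
    return matrix


P = transition_matrix(sessions)
```

Here P["home"]["products"] is the estimated probability that a visitor on the home page moves next to the products page. Pages that are never left, such as "checkout" in this data, have no row.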

Analysing Web and User Generated Contents

While web content is created by the company, user-generated content arises from unidirectional and bidirectional interaction of users with the website. User blogs and data from social networking sites such as Facebook and Twitter are a few examples. Irrespective of the source, working with this text data requires skill in basic text analytics as well as advanced techniques in sentiment analysis and opinion mining. This module helps participants learn to analyse web and user-generated content in general, with application to sentiment analysis.

Contents: Overview of text analytics, tokenization, stemming, lemmatization, WordNet concepts, part-of-speech tagging, collection, cleaning and sentiment analysis of Twitter data, sentiment visualization using a sentiment word cloud, hands-on practice
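
As a rough illustration of cleaning, tokenization and lexicon-based sentiment scoring for short posts, consider the sketch below. The POSITIVE and NEGATIVE word sets are tiny made-up lexicons, the sample posts are invented, and stemming and lemmatization are omitted because they normally rely on a library such as NLTK.

```python
import re

# Tiny hypothetical sentiment lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def sentiment_score(text):
    """Number of positive tokens minus number of negative tokens."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


posts = [
    "Love the new phone, great camera! http://example.com @shop",
    "Terrible battery and slow updates",
]
scores = [sentiment_score(post) for post in posts]
```

A score above zero suggests positive sentiment and a score below zero negative sentiment. Simple word counting like this ignores negation and sarcasm, which is part of what the more advanced techniques address.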

Social Network Analysis

Social network analysis is the study of social entities, their interactions and relationships. From the network we can study the properties of its structure, and the role, position and prestige of each social actor. We can also find various kinds of sub-graphs, e.g., communities formed by groups of actors. Social network analysis is useful for the Web because the Web is essentially a virtual society, and thus a virtual social network, where each page can be regarded as a social actor and each hyperlink as a relationship. Many of the results from social network analysis can be adapted and extended for use in the Web context.

Contents: Types of social networks, random network model vs complex network model, degree distribution, clustering coefficients, explicit vs implicit social networks, directed vs undirected social networks, centrality, community detection, link prediction, opinion dynamics and stability, hands-on practice
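
Two of the measures listed above, degree centrality and the local clustering coefficient, can be sketched for a small undirected network as follows. The edge list and actor names are hypothetical, and the graph is stored as plain adjacency sets rather than with a dedicated network library.

```python
from itertools import combinations

# Hypothetical undirected network: each pair is a tie between two actors.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]


def build_graph(edges):
    """Adjacency sets for an undirected graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def clustering_coefficient(graph, node):
    """Fraction of pairs of the node's neighbours that are themselves linked."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


graph = build_graph(edges)
centrality = degree_centrality(graph)
clustering = {node: clustering_coefficient(graph, node) for node in graph}
```

In this toy network, actor "c" has the highest degree centrality because it is tied to three of the four other actors, while actor "a" has a clustering coefficient of 1 because its two neighbours are tied to each other.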

Contact
Department of Industrial & Systems Engineering
Indian Institute of Technology Kharagpur
Kharagpur, West Bengal, India - 721302
Tel: +91-9475555571

Campus Weather
During September
Max: 36°C     Min: 24°C     Rainfall: 12 mm