This GitHub profile contains repositories with the code for the respective chapters. See also: http://www.datascience-in-tourism.com/
Data Science has brought marvelous opportunities to many industries, and tourism is no exception. Although tourism is an interdisciplinary field that spans sociology, economics, geography, psychology, and communication sciences, tourism researchers have long been constrained by the classical repertoire of research methodologies. Among the widely applied quantitative and qualitative approaches, advancements over time have occurred mainly in quantitative methods. In an era of digitization, data comes in new unstructured forms alongside traditionally structured datasets, which has given rise to Big Data. Meanwhile, advancements in computing and the rapid development of algorithms have led to the emergence of advanced analytics that goes beyond conventional business intelligence to gain deeper insights and make predictions. Data Science is more than a set of methods and tools for elevating the typical ways of doing empirical research; it even allows researchers to find answers to previously unknown questions. However, Data Science has yet to be embraced by tourism scholars, potentially because the size, messiness, and unstructured nature of the data fuel confusion and uncertainty. At the same time, because Data Science has altered the epistemological foundations, the interplay between Data Science and theory deserves much attention.
Once researchers learn how to develop research questions that can be supported by theories, Data Science helps them better understand the data, uncover unknown relationships and patterns, and improve data visualization. In tourism, examples of Data Science applications include route optimization, real-time analysis, predictive analysis, personalization, customer sentiment analysis, alerting and monitoring systems, and much more. Nevertheless, adopting Data Science in tourism is not an easy task, as it requires an interdisciplinary understanding that bridges tourism and computer science, the discipline in which Data Science originated. Tourism researchers are often unaware of these emerging techniques and unfamiliar with their usage, contributions, advantages, pitfalls, and limitations.
This book is intended to serve as a starting point that connects Data Science to the tourism industry and is helpful for researchers and practitioners alike. It aims to present an overview of Data Science techniques relevant to tourism by offering a theoretical foundation for these concepts and a how-to approach that helps readers develop their research projects. Of course, this book cannot claim to cover the individual topics in their entirety. Rather, the aim is to provide readers with the knowledge necessary to choose a suitable method.
Roman Egger Salzburg University of Applied Sciences, Innovation and Management in Tourism
Roman Egger (Salzburg University of Applied Sciences, Innovation and Management in Tourism), Mike O'Connor (Booking.com), Liliya Lavitas (Tripadvisor), Holger Sicking (Austrian National Tourist Office), Jeroen Mulder (Air France-KLM)
Luisa Mich University of Trento
Roman Egger & Chung-En Yu Salzburg University of Applied Sciences, Innovation and Management in Tourism
Roman Egger & Chung-En Yu Salzburg University of Applied Sciences, Innovation and Management in Tourism
Roman Egger, Larissa Neuburger & Michelle Mattutzzi Salzburg University of Applied Sciences, Innovation and Management in Tourism; University of Florida; University of Groningen
Chapter 5: Web Mining & Data Crawling
Roman Egger, Markus Kroner & Andreas Stöckl Salzburg University of Applied Sciences; Legalcounsel.at; School of Informatics, Communications and Media, University of Applied Sciences Hagenberg
Roman Egger Salzburg University of Applied Sciences, Innovation and Management in Tourism
Chapter 7: Feature Engineering
Pablo Duboue Textualization Software Ltd.
Chapter 8: Clustering
Matthias Fuchs & Wolfram Höpken Department of Economics, Geography, Law and Tourism, Mid Sweden University; University of Applied Sciences Ravensburg-Weingarten
Chapter 9: Dimensionality Reduction
Nikolay Oskolkov Lund University and National Bioinformatics Infrastructure Sweden (NBIS)
Chapter 10: Classification
Ulrich Bodenhofer & Andreas Stöckl School of Informatics, Communications and Media, University of Applied Sciences Upper Austria
Chapter 11: Regression
Andreas Stöckl & Ulrich Bodenhofer School of Informatics, Communications and Media, University of Applied Sciences Upper Austria
Chapter 12: Hyperparameter Tuning
Pier Paolo Ippolito SAS Institute
Chapter 13: Model Evaluation
Ajda Pretnar Faculty of Computer and Information Science, University of Ljubljana
Chapter 14: Data Interpretability of ML-Models
Urszula Czerwinska no academic affiliation at present
Chapter 15: Introduction: Natural Language Processing
Roman Egger & Enes Gokce Salzburg University of Applied Sciences, Innovation and Management in Tourism; Pennsylvania State University
Chapter 16: Text Representation and Word Embeddings
Roman Egger Salzburg University of Applied Sciences, Innovation and Management in Tourism
Chapter 17: Sentiment Analysis
Andrei P. Kirilenko, Svetlana Stepchenkova & Luyu Wang University of Florida
Chapter 18: Topic Modeling
Roman Egger Salzburg University of Applied Sciences, Innovation and Management in Tourism
Chapter 19: Entity Matching
Ivan Bilan TrustYou
Chapter 20: Knowledge-Graphs
Mayank Kejriwal University of Southern California
Chapter 21: Social Network Analysis
Rodolfo Baggio Bocconi University
Chapter 22: Time Series Analysis
Irem Önder University of Massachusetts Amherst
Chapter 23: Agent-based Modeling
Jillian Student Wageningen University
Chapter 24: GIS Analysis
Andrei P. Kirilenko University of Florida
Chapter 25: Data Visualization
Johanna Schmidt VRVis
Roman Egger Salzburg University of Applied Sciences, Innovation and Management in Tourism
CODE IS UNDER GNU/GPL LICENSE