This edited volume focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, and social networks. It covers both methodological aspects and applications to a wide range of areas, including economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, the book's contributions describe basic features of the software behind the data analysis results, and include links to the corresponding code and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pula (Cagliari), Italy, October 8-10, 2015.
Part I: Big Data
  From Big Data to Information: Statistical Issues Through a Case Study
  Enhancing Big Data Exploration with Faceted Browsing
  Big Data Meet Pharmaceutical Industry: An Application on Social Media Data
  Electre Tri Machine Learning Approach to the Record Linkage

Part II: Social Networks
  Finite Sample Behavior of MLE in Network Autocorrelation Models
  Network Analysis Methods for Classification of Roles
  MCA-Based Community Detection

Part III: Exploratory Data Analysis
  Rank Properties for Centred Three-Way Arrays
  Principal Component Analysis of Complex Data and Application to Climatology
  Motivations and Expectations of Students' Mobility Abroad: A Mapping Technique
  Testing Circular Antipodal Symmetry Through Data Depths

Part IV: Statistical Modeling
  Multivariate Stochastic Downscaling for Semicontinuous Data
  Exploring Italian Students' Performances in the SNV Test: A Quantile Regression Perspective
  Estimating the Effect of Prenatal Care on Birth Outcomes

Part V: Clustering and Classification
  Clustering Upper Level Units in Multilevel Models for Ordinal Data
  Clustering Macroseismic Fields by Statistical Data Depth Functions
  Comparison of Cluster Analysis Approaches for Binary Data
  Classification Models as Tools of Bankruptcy Prediction-Polish Experience
  Quality of Classification Approaches for the Quantitative Analysis of International Conflict

Part VI: Time Series and Spatial Data
  P-Splines Based Clustering as a General Framework: Some Applications Using Different Clustering Algorithms
  Comparing Multistep Ahead Forecasting Functions for Time Series Clustering
  Comparing Spatial and Spatio-temporal FPCA to Impute Large Continuous Gaps in Space

Part VII: Finance and Economics
  A Graphical Tool for Copula Selection Based on Tail Dependence
  Bayesian Networks for Financial Market Signals Detection
  A Multilevel Heckman Model to Investigate Financial Assets Among Older People in Europe
  Bifurcation and Sunspots in Continuous Time Optimal Model with Externalities

Erratum to: Big Data Meet Pharmaceutical Industry: An Application on Social Media Data

Francesco Mola is full professor of Statistics at the Department of Business and Economics at the University of Cagliari. He received his Ph.D. in Computational Statistics and Data Analysis from the University of Naples Federico II. His research interests are in the field of multivariate data analysis and statistical learning, particularly data science and computational statistics. He has published more than sixty papers in international journals, encyclopedias, conference proceedings, and edited books.
Claudio Conversano is associate professor of Statistics at the Department of Business and Economics at the University of Cagliari. He received his Ph.D. in Computational Statistics and Data Analysis from the University of Naples Federico II. His research interests include nonparametric statistics, statistical learning and computational finance. He has published more than forty papers in international journals, encyclopedias, conference proceedings, and edited books.
Maurizio Vichi is full professor of Statistics and head of the Department of Statistical Sciences at the Sapienza University of Rome. He is president of the Federation of European National Statistical Societies (FENStatS), former president of the Italian Statistical Society, and of the International Federation of Classification Societies (IFCS). He is coordinating editor of the journal Advances in Data Analysis and Classification, editor of the international book series Classification, Data Analysis and Knowledge Organization, and the series Studies in Theoretical and Applied Statistics, published by Springer. He is a member of ESAC