Description
This comprehensive guide provides a step-by-step approach to data collection, cleaning, formatting, and storage, using Python and R.
About the Author
Jose Manuel Magallanes Reyes has degrees in Computer Science, Public Management and Psychology. He has a second Ph.D. (Computational Social Science) from George Mason University, where he is an Affiliated Researcher at the Center for Social Complexity (GMU-CSC). He is currently Associate Professor at the Department of Social Sciences at Pontificia Universidad Catolica del Peru, as well as a Senior Data Science Fellow (eScience Institute) and a Visiting Professor (Evans School) at the University of Washington. He is a Catalyst Fellow at the University of California, Berkeley Initiative for Transparency in the Social Sciences. His interdisciplinary work has been funded by the National Science Foundation via the CDI Program grant no. IIS-1125171 (granted by GMU-CSC), and via the EITM program (granted by Duke University in 2017). Since 2015, his work has been funded by the Washington Research Fund, the Alfred P. Sloan Foundation, and the Gordon and Betty Moore Foundation (granted by the eScience Institute).
Reviews
'Data science has now firmly moved from computer science and engineering to the disciplines of the social sciences, where scholars are harnessing the insightful power of ever larger and more complex data sets. This volume provides a clear introduction for social scientists and policy researchers into the use of R and Python, including best practice of working with data files, command files, and outputs. The step by step approach with real world examples will be of great value to students, scholars, and practitioners engaged in data analytic approaches to social problems.' Todd Landman, Pro-Vice Chancellor, Faculty of Social Sciences, University of Nottingham
'The irruption of big data and the need to comply with high standards of research reproducibility require social scientists and policy analysts to be conversant in data collection and management techniques. Unfortunately, even those with sophisticated methodological training often lack the necessary tools to take on these requirements. Magallanes's book at long last collects and organizes a large amount of information and useful advice on how to curate data for scientific analysis. Through agile narrative and compelling examples, he walks the reader through the use of open-source tools of data science such as R, Python, and Github. The book is an invaluable resource for students and scholars at different levels of proficiency, from neophytes to advanced users.' Guillermo Rosas, Washington University, St. Louis
'This new, practical, reader-friendly, how-to manual on computational social data analysis is both long overdue and a must-have for analysts and researchers. The range of problem-solving strategies and demonstrations is impressive. While eminently practical, Magallanes' contribution is also rigorous and true to its scientific aims, which will please both basic and applied scientists and practitioners.' Claudio Cioffi-Revilla, Director, Center for Social Complexity, George Mason University, Washington DC, and founding President, Computational Social Science Society of the Americas
'Magallanes' excellent book on data science for researchers and policy analysts is an accessible yet thorough introduction to data management and analyses in R and Python. It has a broad coverage of the techniques required to capture, clean, and process complex information. It is the perfect companion for sophisticated policy analysts and researchers that are ready to take advantage of the wealth of data that is available to skilled computer scientists.' Ernesto Calvo, University of Maryland
'It is rare indeed to pick up a new manuscript and immediately think how much you wish it had been written five years earlier, but I suspect many people will have that reaction to this book. This timely, thorough, and remarkably clear tutorial to both R and Python serves as a much needed on ramp to the data part of data science, and will undoubtedly soon grace the bookshelves of many social scientists - both students and their instructors. If you are intrigued by the possibilities of data science but concerned about the start up costs, look no farther: help has arrived.' Joshua Tucker, New York University
'If you need to develop new skills in R and Python but you don't know where to start, this is the book for you. With simple language, Magallanes shows you how to install the programs, retrieve data using APIs and scrape Internet sources, and how to get the data ready for modeling. This book is a gem.' Anibal Perez-Linan, University of Pittsburgh
'This book will be of great assistance to public policy and management scholars desiring a rigorous introduction to Data Science, particularly with regard to the intricacies of data management. The step-by-step approach will help teachers and students, in both undergraduate and graduate programs, become familiar with essential programming skills, particularly with respect to analyzing Big Data and making it available through Open Government initiatives. The author also provides a very helpful service in using both R and Python to show how to accomplish the same task, which allows readers to decide which of these languages will best serve their needs.' Craig W. Thomas, Evans School of Public Policy and Governance, University of Washington
Book Information
ISBN 9781107540255
Author Jose Manuel Magallanes Reyes
Format Paperback
Page Count 314
Imprint Cambridge University Press
Publisher Cambridge University Press
Weight (grams) 460
Dimensions (mm) 229 x 152 x 18