
Murach's Python for Data Analysis [Paperback]

  • Format: Paperback / softback, 600 pages, 235
  • Publication date: 01-Aug-2021
  • Publisher: Mike Murach & Associates Inc.
  • ISBN-10: 1943872767
  • ISBN-13: 9781943872763
  • Paperback
  • Price: 82,02 €
  • Book delivery takes 3-4 weeks if the book is in stock at the publisher's warehouse. If the publisher needs to print a new run, delivery may be delayed.
  • Delivery time: 4-6 weeks
Data is collected everywhere these days, in massive quantities. But data alone doesn’t do you much good. That’s why data analysis (making sense of the data) has become a must-have skill in the fields of business, science, and social science. But it’s a tough skill to acquire. The concepts are challenging, and too many books and online tutorials treat only parts of the total skillset needed. Now, though, Murach’s Python for Data Analysis draws all the essential skills together and presents them in a clear and example-packed way. So you’ll soon be applying your programming skills to complex data analysis problems, more easily than you ever thought possible.

In terms of content, this book gets you started the right way by using Pandas for data analysis and Seaborn for data visualization, with JupyterLab as your IDE. Then, it helps you master descriptive analysis by teaching you how to get, clean, prepare, and analyze data, including time-series data. Next, it gets you started with predictive analysis by showing you how to use linear regression models to predict unknown and future values. And to tie everything together, it gives you 4 real-world case studies that show you how to apply your new skills to political, environmental, social, and sports analysis. At the end, you’ll have a solid set of the professional skills that can lead to all sorts of new career opportunities.

Sound too good to be true? Download a sample chapter for free from the Murach website and see for yourself how this book can turn you into the data analyst that companies are looking for.

Section 1 Get off to a fast start
Chapter 1 Introduction to Python for data analysis
Introduction to data analysis
What data analysis is
The five phases of data analysis and visualization
The IDEs for Python data analysis
The Python skills that you need for data analysis
How to install and import the Python modules for data analysis
How to call and chain methods
The coding basics for Python data analysis
How to use JupyterLab as your IDE
How to start JupyterLab and work with a Notebook
How to edit and run the cells in a Notebook
How to use the Tab completion and tooltip features
How syntax and runtime errors work
How to use Markdown language
How to get reference information
Two more skills for working with JupyterLab
How to split the screen between two Notebooks
How to use Magic Commands
Introduction to the case studies
The Polling case study
The Forest Fires case study
The Social Survey case study
The Sports Analytics case study
Chapter 2 The Pandas essentials for data analysis
Introduction to the Pandas DataFrame
The DataFrame structure
Two ways to get data into a DataFrame
How to save and restore a DataFrame
How to examine the data
How to display the data in a DataFrame
How to use the attributes of a DataFrame
How to use the info(), nunique(), and describe() methods
How to access the columns and rows
How to access columns
How to access rows
How to access a subset of rows and columns
Another way to access a subset of rows and columns
How to work with the data
How to sort the data
How to use the statistical methods
How to use Python for column arithmetic
How to modify the string data in columns
How to shape the data
How to use indexes
How to pivot the data
How to melt the data
How to analyze the data
How to group the data
How to aggregate the data
How to plot the data
Chapter 3 The Pandas essentials for data visualization
Introduction to data visualization
The Python libraries for data visualization
Long vs. wide data for data visualization
How the Pandas plot() method works by default
The three basic parameters for the Pandas plot() method
How to create 8 types of plots
How to create a line plot or an area plot
How to create a scatter plot
How to create a bar plot
How to create a histogram or a density plot
How to create a box plot or a pie plot
How to enhance a plot
How to improve the appearance of a plot
How to work with subplots
How to use chaining to get the plots you want
Chapter 4 The Seaborn essentials for data visualization
Introduction to Seaborn
The Seaborn methods for plotting
The general methods vs. the specific methods
How to use the basic Seaborn parameters
How to use the Seaborn parameters for working with subplots
How to enhance and save plots
How to set the title, x label, and y label
How to set the ticks, x limits, and y limits
How to set the background style
How to work with subplots
How to save a plot
How to create relational plots
How to create a line plot
How to create a scatter plot
How to create categorical plots
How to create a bar plot
How to create a box plot
How to create distribution plots
How to create a histogram
How to create a KDE or ECDF plot
How to enhance a distribution plot
Other techniques for enhancing a plot
How to use other Axes methods to enhance a plot
How to annotate a plot
How to set the color palette
How to enhance a plot that has subplots
How to customize the titles for subplots
How to set the size of a specific plot
Section 2 The critical skills for success on the job
Chapter 5 How to get the data
How to find the data that you want to analyze
Common data sources
How to find and select the data that you want
How to import data into a DataFrame
How to import data directly into a DataFrame
How to download a file to disk before importing it
How to work with a zip file on disk
How to get database data into a DataFrame
How to run queries against a database
How to use a SQL query to import data into a DataFrame
How to work with a Stata file
How to get and explore the metadata of a Stata file
How to build DataFrames for the metadata and the data
How to work with a JSON file
How to download a JSON file to disk
How to open a JSON file in JupyterLab
How to drill down into the data
How to build a DataFrame for the data
Chapter 6 How to clean the data
Introduction to data cleaning
A general plan for cleaning the data
What the info() method can tell you
What the unique values can tell you
What the value counts can tell you
How to simplify the data
How to drop rows based on conditions
How to drop duplicate rows
How to drop columns
How to rename columns
How to find and fix missing values
How to find missing values
How to drop rows with missing values
How to fill missing values
How to fix data type problems
How to find dates and numbers that are imported as objects
How to convert date and time strings to the datetime data type
How to convert object columns to numeric data types
How to work with the category data type
How to replace invalid values and convert a column's data type
How to fix data problems when you import the data
How to find and fix outliers
How to find outliers
How to fix outliers
Chapter 7 How to prepare the data
How to add and modify columns
How to work with datetime columns
How to work with string columns
How to work with numeric columns
How to add a summary column to a DataFrame
How to apply functions and lambda expressions
How to apply functions to rows or columns
How to apply user-defined functions
How lambda expressions work with DataFrames
How to apply lambda expressions
How to work with indexes
How to set and remove an index
How to unstack indexed data
How to combine DataFrames
How to join DataFrames with an inner join
How to join DataFrames with a left or outer join
How to merge DataFrames
How to concatenate DataFrames
How to handle the SettingWithCopyWarning
What the warning is telling you
What to do when the warning is displayed
What to watch for when the warning isn't displayed
Chapter 8 How to analyze the data
How to create and plot long data
How to melt columns to create long data
How to plot melted columns
How to group and aggregate the data
How to group and apply a single aggregate method
How to work with a DataFrameGroupBy object
How to apply multiple aggregate methods
How to create and use pivot tables
How to use the pivot() method
How to use the pivot_table() method
How to work with bins
How to create bins of equal size
How to create bins with equal numbers of values
How to plot binned data
More skills for data analysis
How to select the rows with the largest values
How to calculate the percent change
How to rank rows
How to find other methods for analysis
Chapter 9 How to analyze time-series data
How to reindex time-series data
How to generate time periods
How to reindex with datetime indexes
How to reindex with a semi-month index
How a user-defined function can improve a datetime index
How reindexing with an improved index can improve plots
How to resample time-series data
How to use the resample() method
How to use the label and closed parameters when you downsample
How downsampling can improve plots
How to work with rolling windows
The concept of rolling windows
How to create rolling windows
How to plot rolling window data
How to work with running totals
How to create running totals
How to plot running totals
Section 3 An introduction to predictive analysis
Chapter 10 How to make predictions with a linear regression model
Introduction to predictive analysis
Types of predictive models
Introduction to regression analysis
How to find correlations between variables
The Housing dataset
How to identify correlations with a scatter plot
How to identify correlations with a grid of scatter plots
How to identify correlations with r-values
How to identify correlations with a heatmap
How to use Scikit-learn to work with a linear regression
A procedure for creating and using a regression model
The function and methods for linear regression models
How to create, validate, and use a linear regression model
How to plot the predicted data
How to plot the residuals
How to plot regression models with Seaborn
The lmplot() method and some of its parameters
How to plot a simple linear regression
How to plot a logistic regression
How to plot a polynomial regression
How to plot a lowess regression
How to use the residplot() method to plot the residuals
Chapter 11 How to make predictions with a multiple regression model
A simple regression model for a Cars dataset
The Cars dataset
How to create a simple regression model
How to plot the residuals of a simple regression
How to work with a multiple regression model
How to create a multiple regression model
How to plot the residuals of a multiple regression
How to work with categorical variables
How to identify categorical variables
How to review categorical variables
How to create dummy variables
How to rescale the data and check the correlations
How to create a multiple regression that includes dummy variables
How to improve a multiple regression model
How to select the independent variables
How to test different combinations of variables
How to use Scikit-learn to select the variables
How to select the right number of variables
Section 4 The case studies
Chapter 12 The Polling case study
Get and display the data
Import the modules that you will need
Get the data
Display the data
Clean the data
Examine the data
Drop columns and rows
Rename columns
Fix object types
Fix data
Take an early plot with Pandas
Save the DataFrame
Prepare the data
Add columns for grouping and filtering
Create a new DataFrame in long form
Take an early plot of the long data with Seaborn
Add monthly bins to the DataFrame
Add an average percent column for each month
Save the wide and long DataFrames
Analyze the data
Plot the national and swing state polls
Plot the voter types
Plot the last two months of polling
Plot the gap changes in selected states
More preparation and analysis
Prepare the gap data for the last week of polling
Plot the gap data for the last week of polling
Prepare the weekly gap data for the swing states
Plot the weekly gap data for the swing states
Chapter 13 The Forest Fires case study
Get the data
Download and unzip the SQLite database
Connect and query the database
Import the data into a DataFrame
Clean the data
Examine the data
Improve the readability of the data
Drop unnecessary rows
Drop duplicate rows
Convert dates to datetime objects
Check for missing contain dates
Prepare the data
Add fire_month and days_burning columns
Examine the contain_date and days_burning columns
Analyze the data
Analyze the data for California
Two more plots for California fires
Rank the states by total acres burned
Prepare a DataFrame for total acres burned by year within state
Prepare a DataFrame for the top 4 states
Plot the acres burned total by year for the top 4 states
Review the 20 largest fires in California
Use GeoPandas to plot the fires on a map
Use GeoPandas to plot the California map
Use GeoPandas or Seaborn to plot the California fires on a map
Plot the fires in the continental United States
Chapter 14 The Social Survey case study
Introduction to the Social Survey
Download and unzip the zip file for the data
Build a DataFrame for the metadata
The employment data
Use the codebook and read the data that you want
Prepare the data
Plot the data and reduce the number of categories
Plot the total counts of the responses
Convert the counts to percents and plot them
The work-life balance data
Search the codebook for small question sets
Read and review the work-life data
Plot the responses for the first question
Plot the responses for the second and third questions
How to expand the scope of the analysis
Use the codebook to find related columns
Use the codebook to find follow-up questions
Select the columns for an expanded DataFrame
Bin the data for a column
How to use a hypothesis to guide your analysis
Develop and test a first hypothesis
Develop and test a second hypothesis
Develop and test a third hypothesis
Chapter 15 The Sports Analytics case study
Get the data and build the DataFrame
Get the data
Build the DataFrame
Clean the data
Locate and drop unneeded rows
Locate and drop unneeded columns
Convert the game_date column to datetime data
Prepare the data
Add a column for the season
Add a column for the shot result
Add a column for points made for each shot
Add three summary columns
Plot the summary data
Plot the points per game by season
Plot the averages of shots, shots made, and points per game by season
Plot the shot locations
Plot the shot locations for two games
Plot the shot locations for two seasons
Plot the shot density for one season
Plot the shot density for two seasons
Appendix A How to set up Windows for this book
How to install and use Anaconda
How to install Anaconda
How to use the Anaconda Prompt
How to use the Anaconda Navigator
How to install and use the files for this book
How to install the files for this book
How to make sure Anaconda is installed correctly
How to download the large data files for this book
Appendix B How to set up macOS for this book
How to install and use Anaconda
How to install Anaconda
How to run conda commands
How to use the Anaconda Navigator
How to install and use the files for this book
How to install the files for this book
How to make sure Anaconda is installed correctly
How to download the large data files for this book