
Income, Inflation, Expenditure Analysis by US States

Welcome! The purpose of this analysis is to see how the US and its individual states are doing in terms of GDP, income, and expenditure. The analysis draws out insights about inflation and the spending behavior of states and their residents.

_P.S.:_ the data needed for this project is taken from the BEA’s website. While computing the filters and analysis, I compared all 50 states plus the District of Columbia (DC).

To accomplish the analysis, I went through several steps:

1. Importing packages

First, I imported the libraries and packages the project relies on. A sample of my code:

import pandas as pd

#pip install jupyter_plotly_dash

import zipfile
import matplotlib
matplotlib.style.use('ggplot')
import plotly
import plotly.offline as pyo                  # allows plotting charts offline
import plotly.graph_objects as go             # Plotly graph objects
import dash                                   # enables building interactive web-based applications (charts, graphs)
from jupyter_plotly_dash import JupyterDash   # enables running Dash from Jupyter
import dash_core_components as dcc            # to create graphs in the layout
import dash_html_components as html           # to generate and use HTML components
from dash.dependencies import Input, Output   # to make the Dash apps interactive

from IPython.display import display
plotly.offline.init_notebook_mode()

For the full code, click here

2. Importing and reading files

I downloaded the zip files from the BEA’s website, nine of them, each with around 50,000 rows and 30 columns. A sample of my code:

def extract_zipfile(file):                        # extract a zip archive and list its contents
    with zipfile.ZipFile(file, 'r') as zip_r:
        zip_r.extractall()
        extract = zip_r.namelist()
    return extract

file = 'SAINC Annual Personal Income and Employment By State.zip'
zip_read_api = extract_zipfile(file)
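After extraction, the individual BEA tables are read into pandas. Below is a minimal sketch of that reading step, assuming the archive contains CSV files and that the first name returned by extract_zipfile is one of them; the encoding choice is my assumption, since BEA exports are not always UTF-8. The variable name reuses income_state_employment_df_all, which appears in the filtering step later.

income_files = extract_zipfile(file)                  # list of file names inside the archive
# Assumption: the first entry is a CSV; BEA exports are often Latin-1 encoded.
income_state_employment_df_all = pd.read_csv(income_files[0],
                                             encoding='latin-1',
                                             low_memory=False)
print(income_state_employment_df_all.shape)           # roughly 50,000 rows x 30 columns per file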

For the full code, click here

3. Filtering and processing data

Specific insights and visualizations require filtering and processing the datasets, so I repeatedly dropped columns, accessed specific rows and columns, iterated over lists, columns, and rows, performed calculations, and sometimes built a new data frame from the data I already had. A sample of my code:

income_state_employment_df_all = income_state_employment_df_all.dropna()
----------------------------------

def filter(x):
    # drop BEA metadata columns that are not needed for the analysis
    x_drop = x.drop(['GeoFIPS', 'Region', 'Unit', 'TableName', 'LineCode',
                     'IndustryClassification'], axis=1)
    years = x_drop.iloc[:, 2:13]                     # the older year columns to discard
    x_filter = x_drop.drop(years.columns, axis=1)
    x_filter = x_filter.rename(columns={"GeoName": "States"})

    return x_filter
  (Note: this is just a sample.)
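To show how this helper is used, here is a hedged usage sketch applying it to the frame from the dropna step above; it assumes that frame still carries the BEA metadata columns (GeoFIPS, Region, and so on) that filter() drops:

# Keep only the state names plus the most recent year columns.
income_state_filtered = filter(income_state_employment_df_all)
income_state_filtered = income_state_filtered.set_index('States')
print(income_state_filtered.head())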

For the full code, click here

4. Visualization and Analysis

For ease of visualization and analysis, I divided this part of the project into five categories (a minimal plotting sketch follows the list):

  1. Analysis of GDP
  2. Analysis of Income
  3. Analysis of Personal Expenditure and Inflation
  4. Analysis of Relative Cost of Living
  5. Correlation between GDP and Relative Cost of Living
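As an illustration of the offline Plotly pattern used across these categories, below is a minimal sketch of a bar chart of GDP by state. The frame gdp_state_df and its column names are hypothetical placeholders, not the exact names from my notebook:

# Hypothetical frame: one row per state, one GDP column per year.
fig = go.Figure(
    go.Bar(x=gdp_state_df['States'],      # state names on the x-axis
           y=gdp_state_df['2020'])        # GDP for a single year on the y-axis
)
fig.update_layout(title='GDP by US State',
                  xaxis_title='State',
                  yaxis_title='GDP')
pyo.iplot(fig)                            # render offline inside the notebook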

5. Findings and Conclusions