{ "cells": [ { "cell_type": "markdown", "id": "6a1ea458", "metadata": {}, "source": [ "# US GDP Forecasting\n", "\n", "Author: Dron Mongia\n", "\n", "Course Project, UC Irvine, Math 10, S24\n", "\n", "I would like to post my notebook on the course’s website: Yes" ] }, { "cell_type": "markdown", "id": "ec8c04f0", "metadata": {}, "source": [ "# Motivation and Procedure" ] }, { "cell_type": "markdown", "id": "f4017e0c", "metadata": {}, "source": [ "GDP is one of the most widely used measures of a country's economic well-being, so GDP forecasts are highly sought after. Policymakers and legislators, for example, may want a rough forecast of GDP trends before passing a new bill or law. My procedure for this project is to first compile time series data from the FRED API for economic metrics closely related to GDP (GDP = Consumption + Investment + Govt. Spending + Net Exports). We will then conduct a series of exploratory tests to better understand the data, and finally we will use a variety of models (including machine learning methods) to see which one provides the most accurate forecast." 
] }, { "cell_type": "code", "execution_count": 1, "id": "0bd734f0", "metadata": {}, "outputs": [], "source": [ "import fredapi as fd\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import plotly.express as px" ] }, { "cell_type": "markdown", "id": "983f6d49", "metadata": {}, "source": [ "# Feature Creation" ] }, { "cell_type": "code", "execution_count": 2, "id": "36a3c2c6", "metadata": {}, "outputs": [], "source": [ "fred = fd.Fred(api_key = 'efdb939ff3f6b3129c564ef0aa34a91e') # generally this should be hidden, however for the purposes of this project I will keep it visible" ] }, { "cell_type": "code", "execution_count": 3, "id": "df42df34", "metadata": {}, "outputs": [], "source": [ "def gen_df(category, series):\n", " gen_ser = fred.get_series(series, frequency='q')\n", " return pd.DataFrame({'Date': gen_ser.index, category + ' : Billions of dollars': gen_ser.values})" ] }, { "cell_type": "code", "execution_count": 4, "id": "ad6c8814", "metadata": {}, "outputs": [], "source": [ "def merge_dataframes(dataframes, on_column):\n", " merged_df = dataframes[0]\n", "\n", " for df in dataframes[1:]:\n", " merged_df = pd.merge(merged_df, df, on=on_column)\n", "\n", " return merged_df" ] }, { "cell_type": "code", "execution_count": 5, "id": "9ff5a8c6", "metadata": {}, "outputs": [], "source": [ "dataframes_list = [\n", " gen_df('GDP', 'GDP'),\n", " gen_df('PCE', 'PCE'),\n", " gen_df('GPDI', 'GPDI'),\n", " gen_df('NETEXP', 'NETEXP'),\n", " gen_df('GovTotExp', 'W068RCQ027SBEA')\n", "]" ] }, { "cell_type": "code", "execution_count": 6, "id": "f8f0d003", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"|     | Date       | GDP : Billions of dollars | PCE : Billions of dollars | GPDI : Billions of dollars | NETEXP : Billions of dollars | GovTotExp : Billions of dollars |\n",
"|-----|------------|---------------------------|---------------------------|----------------------------|------------------------------|---------------------------------|\n",
"| 0   | 1960-01-01 | 542.648   | 326.4   | 96.476   | 2.858    | 144.233   |\n",
"| 1   | 1960-04-01 | 541.080   | 332.2   | 87.096   | 3.395    | 147.417   |\n",
"| 2   | 1960-07-01 | 545.604   | 332.1   | 86.377   | 4.682    | 150.459   |\n",
"| 3   | 1960-10-01 | 540.197   | 334.0   | 75.963   | 5.880    | 153.780   |\n",
"| 4   | 1961-01-01 | 545.018   | 334.5   | 78.378   | 5.902    | 157.254   |\n",
"| ... | ...        | ...       | ...     | ...      | ...      | ...       |\n",
"| 252 | 2023-01-01 | 26813.601 | 18269.6 | 4725.828 | -825.687 | 9326.383  |\n",
"| 253 | 2023-04-01 | 27063.012 | 18419.0 | 4780.290 | -806.093 | 9422.404  |\n",
"| 254 | 2023-07-01 | 27610.128 | 18679.5 | 4915.033 | -779.231 | 10007.677 |\n",
"| 255 | 2023-10-01 | 27956.998 | 18914.5 | 4954.426 | -783.734 | 9700.808  |\n",
"| 256 | 2024-01-01 | 28255.928 | 19164.2 | 5004.419 | -850.094 | 9924.229  |\n",
"\n",
"257 rows × 6 columns\n", "