This project analyzed historical stock price data for major companies and the S&P 500 index to understand price movements, correlations, and daily return patterns. I built an interactive dashboard using Python's data science stack to visualize both raw and normalized stock performance alongside risk metrics.
- Loaded the data with pd.read_csv() and explored the dataset structure with .info(), .describe(), and .head() to understand the time series format and identify the key stocks. Checked for missing values with .isnull().sum() and calculated basic statistics such as mean returns and standard deviation to assess data completeness and variability.
- Wrote a normalize() function that scales every stock price by its starting value, enabling fair comparison of relative performance across different price ranges.
- Wrote a daily_return() function that uses nested loops to compute percentage daily returns: ((current_price - previous_price) / previous_price) * 100 for each stock.
- Built show_plot() and interactive_plot() to create static matplotlib charts and interactive Plotly visualizations of raw prices, normalized prices, and daily returns.
- Computed the correlation matrix of daily returns with .corr() and visualized it as a Seaborn heatmap to identify relationships between stock movements.
- Used create_distplot() to analyze the statistical properties of daily returns.

import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns

# stocks_df is the DataFrame loaded with pd.read_csv(); its first column is 'Date'

# Static line chart of every stock column against Date
def show_plot(df, title):
    df.plot(x='Date', figsize=(12, 8), linewidth=3, title=title)
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.grid()
    plt.show()
# Plot the data (Figure 1)
show_plot(stocks_df, 'STOCKS DATA')
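# Normalized Stock Data (Figure 2)
def normalize(df):
    x = df.copy()
    for i in x.columns[1:]:
        # Divide each series by its first value so every stock starts at 1;
        # .iloc[0] selects the first row by position, whatever the index labels are
        x[i] = x[i] / x[i].iloc[0]
    return x
normalize(stocks_df)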
# Create Interactive chart of Stock Data (Figure 3)
def interactive_plot(df, title):
    fig = px.line(title=title)
    for i in df.columns[1:]:
        fig.add_scatter(x=df['Date'], y=df[i], name=i)
    fig.update_layout(
        xaxis_title="Date",
        yaxis_title="Price"
    )
    fig.show()
interactive_plot(stocks_df, 'STOCKS DATA')
# Create Interactive chart of Normalized Stock Data (Figure 4)
interactive_plot(normalize(stocks_df), 'NORMALIZED STOCKS DATA')
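# Calculate stocks daily returns
def daily_return(df):
    df_daily_return = df.copy()
    for i in df.columns[1:]:  # loop through columns
        for j in range(1, len(df)):  # loop through rows
            # .loc[row, column] writes to the copy directly; chained assignment
            # such as df_daily_return[i][j] = ... may not update it.
            # Assumes the default 0..n-1 integer index from pd.read_csv().
            df_daily_return.loc[j, i] = ((df[i][j] - df[i][j - 1]) / df[i][j - 1]) * 100
        # The first row has no previous day, so its return is set to 0
        df_daily_return.loc[0, i] = 0
    return df_daily_return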
# Get the daily returns (Figure 5)
stocks_daily_return = daily_return(stocks_df)
stocks_daily_return
interactive_plot(stocks_daily_return, 'Stocks Daily returns')
# Daily Return Correlation
cm = stocks_daily_return.drop(columns=['Date']).corr()
cm
# Heatmap showing correlations (Figure 6)
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, cmap='RdYlGn') # `annot=True` displays values on the heatmap
plt.show()
# Histogram of daily returns (Figure 7)
stocks_daily_return.hist(bins=50, figsize=(20, 10))
plt.show()
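The loading and inspection step described at the top has no code in the listing above. A minimal sketch of it might look like the following. The inline CSV, its tickers (AAA, BBB), and its prices are made-up placeholders so the snippet runs on its own; the real project would pass its own file path to pd.read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the project's CSV file: a Date column followed by
# one price column per ticker. The tickers and prices are invented.
csv_text = """Date,AAA,BBB
2020-01-01,100.0,50.0
2020-01-02,102.0,49.0
2020-01-03,,51.0
2020-01-06,105.0,52.0
"""

sample_df = pd.read_csv(io.StringIO(csv_text))

# Structure, summary statistics, and first rows of the table
sample_df.info()
print(sample_df.describe())
print(sample_df.head())

# Missing values per column; AAA has one gap in this made-up sample
missing_counts = sample_df.isnull().sum()
print(missing_counts)

# Mean and standard deviation of each price column (NaN values are skipped)
price_means = sample_df.drop(columns=['Date']).mean()
price_stds = sample_df.drop(columns=['Date']).std()
print(price_means)
print(price_stds)
```

A gap like the one in AAA would need handling (for example, dropping or filling the row) before computing returns, since any return that touches a missing price is itself undefined.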
The daily returns calculation initially produced incorrect values for the first row of each stock. Debugging showed that the first entry has no previous day from which to compute a return. This was solved by explicitly setting the first day's return to 0 after the loop, which gives the first row a defined value and leaves the returns computed for all later days unchanged.
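The same first-row behaviour shows up in a vectorized version of the calculation. The sketch below is an alternative to the nested loops, not the code used in the project. It relies on pandas' pct_change(), which leaves NaN in the first row because there is no previous price, and then sets that row to 0. The function name daily_return_vectorized and the sample prices are illustrative assumptions.

```python
import pandas as pd


def daily_return_vectorized(df):
    """Percentage daily returns for every column after the first (Date)."""
    out = df.copy()
    price_cols = df.columns[1:]
    # pct_change() computes (current - previous) / previous for each row;
    # the first row has no previous value, so it comes back as NaN.
    returns = df[price_cols].pct_change() * 100
    # Set only the undefined first-day return to 0, as the loop version does.
    returns.iloc[0] = 0
    out[price_cols] = returns
    return out


# Made-up prices for two hypothetical tickers
sample_df = pd.DataFrame({
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03'],
    'AAA': [100.0, 110.0, 99.0],
    'BBB': [50.0, 50.0, 55.0],
})

sample_returns = daily_return_vectorized(sample_df)
print(sample_returns)
```

Only the first row is overwritten here, rather than filling every NaN with 0, so that a gap in the price data further down would stay visible instead of being reported as a 0% return.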