Data Assignment 3

This assignment revolves around a common business task in data science: taking a large data set, drawing conclusions from it, and answering research questions your company may have.

We will look at a weather data set I harvested for Washington DC covering the last 25 years. You will answer a variety of weather questions based on the data. For each answer, you should have corresponding Python code to justify it. You may write one Python file per question or put everything in a single file; it is up to you.

As is common with many data sets, some of the data is missing, so you will have to decide how to handle missing values.
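As a rough illustration of two common options (dropping missing values or filling them in), here is a minimal sketch. It assumes missing cells show up as empty strings, which may not match the real file, and the `to_float` helper and the sample rows are made up for this example.

```python
import csv
import io

# Hypothetical sample rows; the real file's missing-value marker may differ.
SAMPLE = """date,actual_mean_temp,actual_precipitation
2014-7-1,81,0.00
2014-7-2,,0.14
2014-7-3,77,
"""

def to_float(text):
    """Return text as a float, or None when the cell is blank or not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
temps = [to_float(row["actual_mean_temp"]) for row in rows]

# Option 1: drop the missing values before computing a statistic.
known = [t for t in temps if t is not None]
mean_temp = sum(known) / len(known)

# Option 2: fill each missing value with the mean of the known values.
filled = [t if t is not None else mean_temp for t in temps]

print(temps, mean_temp, filled)
```

Whichever option you pick, say so in your answers, since dropping and filling can give different results.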

I have included one file, weather.py, to get you started and to show you one way to handle dates. The data you will process is in weather_data.csv. An example of reading it in is included in weather.py.
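For reference, one possible way to read rows and parse the date column with only the standard library is sketched below. This is not the contents of weather.py. The sample text stands in for weather_data.csv, and `parse_date` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime

# Stand-in for weather_data.csv; with the real file you would use
# open("weather_data.csv", newline="") instead of io.StringIO(SAMPLE).
SAMPLE = """date,actual_max_temp
2014-7-1,89
2014-12-25,48
"""

def parse_date(text):
    """Parse a YYYY-M-D string (month and day may be one digit) into a date."""
    return datetime.strptime(text, "%Y-%m-%d").date()

records = []
for row in csv.DictReader(io.StringIO(SAMPLE)):
    records.append((parse_date(row["date"]), float(row["actual_max_temp"])))

# Once parsed, the year, month, and day are available as attributes.
for day, high in records:
    print(day.year, day.month, day.day, high)
```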

Please answer the following questions:

———————————————————

1) Which 3-year period had the largest change in actual mean temperature?

2) Which month has the highest actual max temperature on average across the 25 years?

3) Which month, on average, had the largest difference between the actual low
and actual high temperature on a given day across all 25 years? (i.e., hard to dress for because
it is cold in the morning/night and hot during the day)

4) Which month is the rainiest on average, based on actual precipitation?

5) In the last 25 years, have we had more days with above-average
precipitation or more days below?

6) Is Washington DC on average getting warmer, colder, or staying the
same over the past 25 years?
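
Several of the questions above ask for a per-month average. A minimal sketch of that grouping step, using made-up (date, value) pairs rather than the real data, might look like this. The variable names are only illustrative.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (date, value) pairs standing in for one parsed column.
records = [
    (date(2014, 7, 1), 89.0),
    (date(2014, 7, 2), 91.0),
    (date(2015, 7, 1), 85.0),
    (date(2015, 1, 1), 40.0),
    (date(2015, 1, 2), None),  # missing value, skipped below
]

# Collect every known value under its calendar month (1 through 12).
by_month = defaultdict(list)
for day, value in records:
    if value is not None:
        by_month[day.month].append(value)

# Average each month's values, then pick the month with the largest mean.
monthly_mean = {month: sum(vals) / len(vals) for month, vals in by_month.items()}
top_month = max(monthly_mean, key=monthly_mean.get)

print(monthly_mean, top_month)
```

This pools all days of a given month across every year. Averaging each year's monthly mean first is another reasonable choice and can give a slightly different answer when some years have more missing days than others.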

Bonus:
+6
a) Has DC’s weather gotten more extreme over the last 25 years? Give
your reasoning, code, and proof, whether the answer is yes or no.

+4
b) What do you think the actual min and the actual max temperature
will be on Thanksgiving 2017 (November 23rd)?

+2
c) Tell me something interesting you found in the data that I might not know.

———————————————————

Description of Data

———————————————————

Column                  Description
date                    The date of the weather record, formatted YYYY-M-D
actual_mean_temp        The measured average temperature for that day
actual_min_temp         The measured minimum temperature for that day
actual_max_temp         The measured maximum temperature for that day
average_min_temp        The average minimum temperature on that day since 1880
average_max_temp        The average maximum temperature on that day since 1880
record_min_temp         The lowest ever temperature on that day since 1880
record_max_temp         The highest ever temperature on that day since 1880
record_min_temp_year    The year that the lowest ever temperature occurred
record_max_temp_year    The year that the highest ever temperature occurred
actual_precipitation    The measured amount of rain or snow for that day
average_precipitation   The average amount of rain or snow on that day since 1880
record_precipitation    The highest amount of rain or snow on that day since 1880
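
Before starting on the questions, it can help to confirm that the file's header matches the columns listed above. The sketch below is one way to do that. `EXPECTED_COLUMNS` and `missing_columns` are names invented for this example, and the sample header is deliberately incomplete to show what the check reports.

```python
import csv
import io

# Column names copied from the description table above.
EXPECTED_COLUMNS = [
    "date",
    "actual_mean_temp",
    "actual_min_temp",
    "actual_max_temp",
    "average_min_temp",
    "average_max_temp",
    "record_min_temp",
    "record_max_temp",
    "record_min_temp_year",
    "record_max_temp_year",
    "actual_precipitation",
    "average_precipitation",
    "record_precipitation",
]

def missing_columns(header):
    """Return the expected column names that do not appear in header."""
    return [name for name in EXPECTED_COLUMNS if name not in header]

# Hypothetical, deliberately short header line; with the real file, read
# the first row from open("weather_data.csv", newline="") instead.
sample_header = "date,actual_mean_temp,actual_min_temp\n"
header = next(csv.reader(io.StringIO(sample_header)))

print(missing_columns(header))
```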

Turn in your Python file(s) and a text file containing your answers to the questions.