Temperature Data Analysis Python

This project uses Python tools to conduct data analysis.

Temperature Data Analysis Assignment

Kushal Galrani (EE24B034)

find_temperature_extremes(filename, city_name):
●​ Iterates through every row in the csv file.
●​ If the city matches city_name and temperature entry is a valid number, that row is
considered for evaluation.
● The result dictionary, which is returned at the end, is updated whenever a new hottest or coldest temperature value is found.
● If no valid temperature entry is found, an empty dictionary is returned.
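The steps above can be sketched roughly as follows. This is only an illustration, not the submitted code: the column names "city" and "temperature", the result keys "hottest" and "coldest", and the helper name find_extremes are all assumptions, and the rows are passed in directly (as csv.DictReader would yield them) instead of a filename.

```python
import csv
import io
import math


def find_extremes(rows, city_name):
    """Return the hottest/coldest temperature for one city, or {} if none is valid."""
    result = {}
    for row in rows:
        if row.get("city") != city_name:
            continue
        try:
            temp = float(row.get("temperature", ""))
        except (TypeError, ValueError):
            continue  # temperature entry is not a valid number
        if math.isnan(temp):
            continue  # "nan" parses as a float but is not usable data
        if not result:
            result = {"hottest": temp, "coldest": temp}
        else:
            result["hottest"] = max(result["hottest"], temp)
            result["coldest"] = min(result["coldest"], temp)
    return result


# Hypothetical sample data; the real file layout may differ.
sample = io.StringIO(
    "date,city,temperature\n"
    "2000-01-01,Chennai,24.5\n"
    "2000-06-01,Chennai,n/a\n"
    "2000-07-01,Chennai,31.0\n"
    "2000-07-01,Oslo,16.0\n"
)
print(find_extremes(csv.DictReader(sample), "Chennai"))
# {'hottest': 31.0, 'coldest': 24.5}
```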

get_seasonal_averages(filename, city_name, season):


●​ Iterates through every row in the csv file.
●​ Two variables: sum and n (both initially zero) are used to calculate the average.
●​ The date entry is checked. If it matches the season, sum and n are updated.
●​ If no valid entry with the given season was found, it returns an empty dictionary.
●​ If the season argument is not valid, it returns an empty dictionary.
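A minimal sketch of this logic is given below. The report does not state the date format or which months belong to which season, so the ISO "YYYY-MM-DD" dates, the SEASON_MONTHS table, the column names and the shape of the returned dictionary are all assumptions made for illustration. The running total is called total here instead of sum, to avoid shadowing the built-in.

```python
# Assumed mapping of season names to month numbers (hypothetical).
SEASON_MONTHS = {
    "winter": {12, 1, 2},
    "spring": {3, 4, 5},
    "summer": {6, 7, 8},
    "autumn": {9, 10, 11},
}


def seasonal_average(rows, city_name, season):
    """Average temperature of one city in one season, or {} if unavailable."""
    months = SEASON_MONTHS.get(season.lower())
    if months is None:
        return {}  # the season argument is not valid
    total, n = 0.0, 0
    for row in rows:
        if row.get("city") != city_name:
            continue
        try:
            month = int(row["date"].split("-")[1])  # assumes "YYYY-MM-DD"
            temp = float(row["temperature"])
        except (KeyError, IndexError, ValueError, TypeError, AttributeError):
            continue  # malformed date or temperature
        if month in months:
            total += temp
            n += 1
    if n == 0:
        return {}  # no valid entry for this season
    return {"season": season, "avg_temp": total / n}


rows = [
    {"date": "2000-06-15", "city": "Chennai", "temperature": "30.0"},
    {"date": "2000-07-15", "city": "Chennai", "temperature": "32.0"},
    {"date": "2000-12-15", "city": "Chennai", "temperature": "24.0"},
]
print(seasonal_average(rows, "Chennai", "summer"))
# {'season': 'summer', 'avg_temp': 31.0}
```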

compare_decades(filename, city_name, decade1, decade2):


● The last digit of a decade argument is ignored, i.e., 1921 to 1929 are all treated as 1920.
● The two decades may be the same.
● To calculate the averages, the variables n1, n2, sum1 and sum2 are used (all initially zero).
● If no temperature entry is found for one or both of the decades, an empty dictionary is returned.
● The trend is returned as stable if the difference is less than threshold (a local variable).
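The decade handling might look like the sketch below. The threshold value of 0.1, the labels "warming" and "cooling", the use of the magnitude of the difference for the stability check, the column names and the keys of the returned dictionary are assumptions; only the "stable" label and the variable names n1, n2, sum1 and sum2 come from the description above.

```python
def decade_of(year):
    """Drop the last digit of a year: 1921..1929 -> 1920."""
    return (year // 10) * 10


def compare_decades(rows, city_name, decade1, decade2):
    threshold = 0.1  # hypothetical value; the report only says it is a local variable
    d1, d2 = decade_of(decade1), decade_of(decade2)
    sum1 = sum2 = 0.0
    n1 = n2 = 0
    for row in rows:
        if row.get("city") != city_name:
            continue
        try:
            year = int(row["date"][:4])  # assumes dates start with the year
            temp = float(row["temperature"])
        except (KeyError, ValueError, TypeError):
            continue
        decade = decade_of(year)
        # Two separate checks, so that identical decades are both counted.
        if decade == d1:
            sum1 += temp
            n1 += 1
        if decade == d2:
            sum2 += temp
            n2 += 1
    if n1 == 0 or n2 == 0:
        return {}
    difference = sum2 / n2 - sum1 / n1
    if abs(difference) < threshold:
        trend = "stable"
    elif difference > 0:
        trend = "warming"
    else:
        trend = "cooling"
    return {"difference": difference, "trend": trend}


rows = [
    {"date": "1925-01-01", "city": "Oslo", "temperature": "10.0"},
    {"date": "1931-01-01", "city": "Oslo", "temperature": "12.0"},
]
print(compare_decades(rows, "Oslo", 1921, 1939))
# {'difference': 2.0, 'trend': 'warming'}
```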

get_all_averages(filename):
●​ Helper function for find_similar_cities.
●​ Goes through every row and returns a dict mapping city_name to another dictionary of
the format: {"country_name": str, "avg_temp": float, "data_points": int}.
● city_averages starts off as an empty dictionary and is populated as new cities are found. While the rows are being read, avg_temp stores the sum of all the temperatures found for the city.
● At the end, those sums are converted to averages.
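The sum-then-divide approach described above can be sketched as follows. The per-city dictionary format is the one given in the description; the column names "city", "country" and "temperature" and the function name all_averages are assumptions.

```python
def all_averages(rows):
    """Map each city to {"country_name", "avg_temp", "data_points"}."""
    city_averages = {}
    for row in rows:
        try:
            city = row["city"]
            temp = float(row["temperature"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows without a usable temperature
        entry = city_averages.setdefault(
            city,
            {"country_name": row.get("country", ""), "avg_temp": 0.0, "data_points": 0},
        )
        entry["avg_temp"] += temp  # holds the running sum for now
        entry["data_points"] += 1
    # Convert the accumulated sums into averages.
    for entry in city_averages.values():
        entry["avg_temp"] /= entry["data_points"]
    return city_averages


rows = [
    {"city": "Oslo", "country": "Norway", "temperature": "4.0"},
    {"city": "Oslo", "country": "Norway", "temperature": "6.0"},
    {"city": "Chennai", "country": "India", "temperature": "30.0"},
]
print(all_averages(rows)["Oslo"])
# {'country_name': 'Norway', 'avg_temp': 5.0, 'data_points': 2}
```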

find_similar_cities(filename, target_city, tolerance):
●​ Calls get_all_averages and stores result in city_averages. If target_city is not present in
city_averages (not present in file), an empty dictionary is returned.
● If it is present, the function iterates through every city in city_averages. If the magnitude of the difference in temperature is less than tolerance, the city is added to the similar_cities list.
● The "difference" value stores only the magnitude of the difference.
●​ The sign of tolerance doesn't matter.
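A possible shape for this comparison is sketched below, taking a ready-made city_averages dictionary as input. Skipping the target city itself, the entry keys "city" and "difference", and the layout of the returned dictionary are assumptions; the report does not specify them.

```python
def similar_cities(city_averages, target_city, tolerance):
    """List cities whose average temperature is within tolerance of the target's."""
    if target_city not in city_averages:
        return {}
    tolerance = abs(tolerance)  # the sign of tolerance does not matter
    target_avg = city_averages[target_city]["avg_temp"]
    similar = []
    for city, info in city_averages.items():
        if city == target_city:
            continue  # assumption: the target is not listed as similar to itself
        difference = abs(info["avg_temp"] - target_avg)  # magnitude only
        if difference < tolerance:
            similar.append({"city": city, "difference": difference})
    return {"target_city": target_city, "similar_cities": similar}


averages = {
    "A": {"avg_temp": 20.0},
    "B": {"avg_temp": 21.0},
    "C": {"avg_temp": 25.0},
}
print(similar_cities(averages, "A", -2))
# {'target_city': 'A', 'similar_cities': [{'city': 'B', 'difference': 1.0}]}
```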

get_temperature_trends(filename, city_name, window_size):

CSV Format:
In the CSV file, the dates are given in ascending order. Data for some dates is missing, but the dates that are present for a given city are always in ascending order.

Moving average:
● Say window_size = 5 and data for a few years is missing, so that we have data only for 2000, 2001, 2003, 2004, 2007 and 2009.
● The first moving average is then the mean temperature of the years 2000, 2001, 2003, 2004 and 2007.
● The "year" of this moving average (to be put in the moving_averages dict) is (2000 + 2007) // 2 = 2003.
● The next moving average is the mean temperature of the years 2001, 2003, 2004, 2007 and 2009.
● Its year is (2001 + 2009) // 2 = 2005, and so on.
● Note: the temperature of a year is the average temperature of all the data points in that year.
● This yearly average is stored in the dict raw_annual_data.
● The moving average, along with the year of the moving average window as defined above, is stored in moving_averages.
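The worked example above can be reproduced with a short sketch. It assumes raw_annual_data maps each available year to its yearly average and that moving_averages is keyed by the window year; the temperature values are made up for illustration.

```python
def moving_averages(raw_annual_data, window_size):
    """Windowed means over the years that have data, keyed by the window's year."""
    if window_size < 1:
        return {}
    years = sorted(raw_annual_data)
    result = {}
    for i in range(len(years) - window_size + 1):
        window = years[i:i + window_size]
        mean = sum(raw_annual_data[y] for y in window) / window_size
        # The window's year is the floor of the midpoint of its first and last year.
        result[(window[0] + window[-1]) // 2] = mean
    return result


# Hypothetical yearly averages; 2002, 2005, 2006 and 2008 are missing.
raw_annual_data = {
    2000: 10.0, 2001: 11.0, 2003: 12.0,
    2004: 13.0, 2007: 14.0, 2009: 15.0,
}
print(moving_averages(raw_annual_data, 5))
# {2003: 12.0, 2005: 13.0}
```

The window slides over the years that actually have data, not over calendar years, which is why a five-entry window can span 2000 to 2007.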

Slope Analysis:
●​ The slope is taken between the first and last moving average (from moving_averages
dict)
●​ That is, slope = (last_moving_avg_temp - first_moving_avg_temp) /
(last_moving_avg_year - first_moving_avg_year)
●​ Note that slope here is a signed quantity.
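As a quick check of the formula, the sketch below computes the endpoint slope from a moving_averages dict keyed by year. Returning None when fewer than two moving averages exist is an assumption; the report does not say how that case is handled.

```python
def endpoint_slope(moving_averages):
    """Signed slope between the first and last moving average (temperature per year)."""
    years = sorted(moving_averages)
    if len(years) < 2:
        return None  # assumption: no slope can be defined from a single point
    first, last = years[0], years[-1]
    return (moving_averages[last] - moving_averages[first]) / (last - first)


print(endpoint_slope({2003: 12.0, 2005: 13.0}))
# 0.5
print(endpoint_slope({2003: 13.0, 2005: 12.0}))
# -0.5
```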

Warming/Cooling Trends:
● The moving averages from the moving_averages dict are used to analyse trends.
● The rate reported is the magnitude of the slope calculated from the two endpoints.
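One way to split the moving averages into warming and cooling stretches is sketched below. It is not the report's analyze_trends(): the function name, the keys of each trend record, and the choice to let a flat step continue the current trend are all assumptions. The rate of each stretch is the magnitude of the slope between its two endpoints.

```python
def find_trends(moving_averages):
    """Split a year -> moving average dict into consecutive warming/cooling stretches."""
    years = sorted(moving_averages)
    trends = []
    direction = None  # "warming" or "cooling" once a trend has started
    start = None      # year at which the current trend began

    def save(end):
        rate = abs((moving_averages[end] - moving_averages[start]) / (end - start))
        trends.append(
            {"type": direction, "start_year": start, "end_year": end, "rate": rate}
        )

    for prev, curr in zip(years, years[1:]):
        delta = moving_averages[curr] - moving_averages[prev]
        if delta == 0:
            continue  # assumption: a flat step continues the current trend
        step = "warming" if delta > 0 else "cooling"
        if step != direction:
            if direction is not None:
                save(prev)  # the trend changed: store the old one
            direction, start = step, prev
    if direction is not None:
        save(years[-1])  # the last trend must be saved explicitly
    return trends


for trend in find_trends({2000: 10.0, 2001: 11.0, 2002: 12.0, 2003: 11.0}):
    print(trend)
# {'type': 'warming', 'start_year': 2000, 'end_year': 2002, 'rate': 1.0}
# {'type': 'cooling', 'start_year': 2002, 'end_year': 2003, 'rate': 1.0}
```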

Implementation Details:
The program iterates through every row in the file.
Things that need to be done to "process" a row:
● Each year may have several entries in the file. Since the file is sorted in ascending order of date, all entries of the same year are adjacent. The variables current_year, temp_sum_current_year and n_months_current_year are used to calculate the average temperature of a particular year, which is the mean of all temperature entries of that year.
●​ current_year is the year whose average we are currently calculating.
temp_sum_current_year is the sum of all temperature entries found in the given year.
n_months_current_year is the number of entries of the year encountered so far.
●​ row_year contains the year of the current row in the csv file that is being processed.
When row_year is not equal to current_year, that means that all entries of current_year
have been processed and we can calculate the average temperature of the year. Once
we calculate the average temperature of a given year, we can save it and use it to
calculate the moving average. The inner function calculate_yearly_average() calculates
the average and saves it. After that, it calls calculate_moving_average(), another
inner function to calculate moving average.
● Note that since calculate_yearly_average() is only called when row_year differs from current_year, it has to be called once more after all the rows are processed, to ensure that the last yearly average gets processed.
●​ Similarly, calculate_moving_average() uses the variables temp_mv_sum, mvsum_years
and current_mv_year to calculate moving average. mvsum_years is a list of years whose
moving average we are currently calculating. At the end of this function, the inner
function analyze_trends() is called.
●​ Note that the suffix mv_year means that we are referring to the year of some moving
average window.
● analyze_trends() checks whether the temperature is currently warming or cooling, using the variables isWarming, isCooling, trend_start_mv_year and trend_end_mv_year. If the trend changes, it saves the old trend data. After all the rows are processed, and after the final call to calculate_yearly_average(), the current trend data must be saved explicitly, to make sure that the last trend gets saved.
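The row-by-row grouping and the final flush can be illustrated with a reduced sketch. It keeps the variable names current_year, temp_sum_current_year, n_months_current_year and row_year and the inner function calculate_yearly_average() from the description, but takes already-parsed (year, temperature) pairs as input and leaves out the moving-average and trend steps; those simplifications, and the outer function name, are assumptions.

```python
def yearly_averages(rows):
    """rows: (year, temperature) pairs in ascending order of year."""
    raw_annual_data = {}
    current_year = None
    temp_sum_current_year = 0.0
    n_months_current_year = 0

    def calculate_yearly_average():
        # Reads the enclosing variables at call time; does nothing before the first row.
        if n_months_current_year:
            raw_annual_data[current_year] = (
                temp_sum_current_year / n_months_current_year
            )

    for row_year, temp in rows:
        if row_year != current_year:
            # All entries of current_year have been seen, so close it off.
            calculate_yearly_average()
            current_year = row_year
            temp_sum_current_year = 0.0
            n_months_current_year = 0
        temp_sum_current_year += temp
        n_months_current_year += 1

    # The last year never sees a "different" row, so it is flushed explicitly.
    calculate_yearly_average()
    return raw_annual_data


print(yearly_averages([(2000, 10.0), (2000, 12.0), (2001, 20.0)]))
# {2000: 11.0, 2001: 20.0}
```

Without the final call after the loop, the average for 2001 would be missing from the result.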
