For my university assignment, I have to produce a CSV file with the distances between all the airports of the world. The problem is that my CSV file weighs 151 MB, and I want to reduce it as much as I can.

This is my CSV:

And this is my code:

    import numpy as np
    import pandas as pd

    # df is the airports DataFrame loaded earlier; distanceBetweenAirports
    # is defined elsewhere (not shown).

    # drop all features we don't need
    for attribute in df:
        if attribute not in ('NAME', 'COUNTRY', 'IATA', 'LAT', 'LNG'):
            df = df.drop(attribute, axis=1)

    # create a dictionary of airports, each airport has the following structure:
    # IATA : (NAME, COUNTRY, LAT, LNG)
    airport_dict = {}
    for airport in df.itertuples():
        airport_dict[airport[3]] = (airport[1], airport[2], airport[4], airport[5])

    # From tutorial 4 solution:
    airportcodes = list(airport_dict)
    airportdists = pd.DataFrame()
    for i, airport_code1 in enumerate(airportcodes):
        airport1 = airport_dict[airport_code1]
        dists = []
        for j, airport_code2 in enumerate(airportcodes):
            if j > i:
                airport2 = airport_dict[airport_code2]
                dists.append(distanceBetweenAirports(airport1[2], airport1[3], airport2[2], airport2[3]))
            else:
                # little edit: no need to calculate the distance twice,
                # all duplicates are set to 0 distance
                dists.append(0)
        airportdists[i] = dists
    airportdists.columns = airportcodes
    airportdists.index = airportcodes

    # set all 0 distance values to NaN
    airportdists = airportdists.replace(0, np.nan)
    airportdists.to_csv(r'../Project Data Files-20190322/distances.csv')

I also tried re-indexing it before saving:

    # remove all NaN values
    airportdists = airportdists.stack().reset_index()
    airportdists.columns = ['airport1', 'airport2', 'distance']

but the result is a dataframe with 3 columns and 17 million rows and a disk size of 419 MB, which is not an improvement. Can you help me shrink the size of my CSV? Thank you!

1 Answer

I built a similar application in the past; here is what I would do. It is difficult to shrink the file itself, but if your application needs, for example, the distances from one airport to all the others, I suggest creating 9541 files, one per airport. Each file holds the distances from that airport to every other airport and is named after the airport. That way, loading a single file is very fast.
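A minimal sketch of the one-file-per-airport idea, assuming `airport_dict` has the `IATA : (NAME, COUNTRY, LAT, LNG)` layout from the question. The helper names `haversine_km`, `write_per_airport_files` and `load_distances` are hypothetical, and the haversine function is only a stand-in for the question's `distanceBetweenAirports`, whose implementation is not shown. Rounding to one decimal place is also an assumption; it shortens each number in the file, so drop it if you need full precision.

```python
import math
from pathlib import Path

import pandas as pd


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km (stand-in for distanceBetweenAirports)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def write_per_airport_files(airport_dict, out_dir):
    """Write <IATA>.csv for each airport, listing distances to all the others."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for code1, (_, _, lat1, lng1) in airport_dict.items():
        rows = [
            # Assumption: 0.1 km precision is enough; fewer digits means fewer bytes.
            (code2, round(haversine_km(lat1, lng1, lat2, lng2), 1))
            for code2, (_, _, lat2, lng2) in airport_dict.items()
            if code2 != code1
        ]
        pd.DataFrame(rows, columns=["airport", "distance"]).to_csv(
            out_dir / f"{code1}.csv", index=False
        )


def load_distances(out_dir, code):
    """Load the distances from one airport as a Series indexed by IATA code."""
    # keep_default_na=False stops pandas from turning codes such as "NA" into NaN.
    df = pd.read_csv(Path(out_dir) / f"{code}.csv", keep_default_na=False)
    return df.set_index("airport")["distance"]
```

With this layout, `load_distances("distances", "LHR")["JFK"]` reads only the small file for `LHR`. Note that the total size on disk does not shrink, because every pair is stored twice; the gain is that each lookup loads a few thousand rows instead of the full matrix.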
