
Posts

Showing posts with the label data-processing

Writing Data to Excel Sheets with Python Pandas

Pandas, a Python library for data manipulation and analysis, integrates well with Microsoft Excel. Writing data to Excel sheets is a common task in data analysis because it exports your data into a widely accessible, editable format. This post explores the methods Pandas offers for writing data to Excel sheets, covering the syntax, usage, and best practices for each, with code examples and practical applications.

Methods for Writing Data to Excel Sheets

Pandas offers two primary ways to write data to Excel sheets:

to_excel(): Writes a DataFrame or Series to an Excel sheet. When given a filename, it creates the file, replacing any existing file of the same name.
ExcelWriter: Provides a more advanced interface with finer control over the writing process, such as writing several sheets to one workbook or appending to an existing file.

1. Using the to_excel() Method

The to_excel() method is the most straightforward way to write data to an Excel sheet. It takes a filename as i...
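As a rough illustration of the two approaches named above, here is a minimal sketch. It assumes an Excel engine such as openpyxl is installed; the file names, sheet names, and sample data are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data for the sketch.
df = pd.DataFrame({"Name": ["John", "Mary"], "Age": [25, 30]})
summary = df[["Age"]].describe()

# Write into a temporary directory so the sketch leaves no files in the project.
out_dir = tempfile.mkdtemp()
single_path = os.path.join(out_dir, "people.xlsx")
multi_path = os.path.join(out_dir, "report.xlsx")

# to_excel() with a path writes one sheet and replaces any existing file.
df.to_excel(single_path, sheet_name="People", index=False)

# ExcelWriter keeps the workbook open so several sheets can be written to it.
with pd.ExcelWriter(multi_path) as writer:
    df.to_excel(writer, sheet_name="People", index=False)
    summary.to_excel(writer, sheet_name="Summary")

# Read every sheet back (sheet_name=None returns a dict keyed by sheet name).
sheets = pd.read_excel(multi_path, sheet_name=None)
print(list(sheets))
```

Reading the workbook back is only there to confirm what was written; in a real script the two write calls are usually all that is needed.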

Converting Rows in Pandas DataFrames to Lists: A Comprehensive Guide

Pandas, a Python library for data manipulation and analysis, works with tabular data structures known as DataFrames: two-dimensional tables with labeled rows and columns. One common operation in data analysis is converting rows or columns of a DataFrame into lists for further processing or visualization. This post covers several methods for converting rows of a Pandas DataFrame to lists and the nuances and applications of each approach.

Method 1: Using the .tolist() Method

The simplest way to convert a row of a DataFrame to a list is to select the row and call .tolist() on it. The row is selected as a Series (for example with .iloc), and .tolist() turns that Series into a Python list.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    "Name": ["John", "Mary", "Bob"],
    "Age": [25, 30, 35]
})

# Convert the first row to a list
row_list = df.iloc[0].tolist()

# Print th...
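A short sketch of the row-to-list idea, using the same small DataFrame as the excerpt. The second conversion (all rows at once via to_numpy()) is an addition for comparison and may not be one of the methods the full post covers.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Mary", "Bob"], "Age": [25, 30, 35]})

# One row, selected by position, as a plain Python list.
first_row = df.iloc[0].tolist()

# Every row at once, as a list of lists.
all_rows = df.to_numpy().tolist()

print(first_row)
print(all_rows)
```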

Pandas Drop: Removing Columns from DataFrames

Pandas is a Python library for data manipulation and analysis. One of its most commonly used methods is drop(), which removes columns (or rows) from a DataFrame. This is useful for:

Removing unnecessary or irrelevant columns
Cleaning data by removing duplicate or erroneous columns
Preparing data for specific tasks or models

How to Use Pandas Drop

The drop() method takes a single label or a list of labels as its first argument. To drop columns rather than rows, pass axis=1 or use the columns keyword. It returns a new DataFrame without those columns; the original DataFrame is not modified unless you assign the result back to it. The following example removes a single column from a DataFrame:

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [20, 30, 40],
                   'city': ['New York', 'Boston', 'Chicago']})

# Remove the 'city' column
df = df.drop('city', axi...
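The behaviour described above can be sketched as follows. The variable names and the non-existent "zip" label are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [20, 30, 40],
    "city": ["New York", "Boston", "Chicago"],
})

# drop() returns a new DataFrame; df itself keeps all three columns.
without_city = df.drop(columns=["city"])

# Several columns can be removed in one call.
names_only = df.drop(columns=["age", "city"])

# errors="ignore" skips labels that do not exist instead of raising KeyError.
safe = df.drop(columns=["city", "zip"], errors="ignore")

print(list(without_city.columns))
print(list(df.columns))
```

Using the columns keyword avoids having to remember which axis number refers to columns.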

Pandas Concat: Combining DataFrames

Pandas is a Python library for data manipulation and analysis. One of its most useful features is the concat() function, which combines multiple DataFrames into a single DataFrame. This is useful for:

Merging data from different sources
Combining data from different time periods
Creating a single DataFrame from multiple smaller DataFrames

How to Use Pandas Concat

The concat() function takes a list of DataFrames as its first argument. By default it stacks them vertically, producing a single DataFrame with the combined rows. The DataFrames do not need identical columns: by default, a column missing from one input is filled with NaN for that input's rows. The following example combines two DataFrames:

import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [20, 30, 40]})
df2 = pd.DataFrame({'name': ['Dave', 'Ev...
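To show how concat() treats inputs whose columns differ, here is a minimal sketch. The two small frames are hypothetical and are not the ones from the truncated example.

```python
import pandas as pd

# Hypothetical inputs: they share "name" but otherwise differ in columns.
df1 = pd.DataFrame({"name": ["Alice", "Bob"], "age": [20, 30]})
df2 = pd.DataFrame({"name": ["Carol"], "city": ["Boston"]})

# Stack rows; ignore_index=True renumbers the result from 0.
# Columns missing from an input are filled with NaN.
stacked = pd.concat([df1, df2], ignore_index=True)

# join="inner" keeps only the columns shared by every input.
shared = pd.concat([df1, df2], ignore_index=True, join="inner")

print(stacked)
print(shared)
```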

Python Pandas Sorting Dataframe By Columns Which Contains Nan Values (Example)

Sorting a dataframe by columns that contain NaN values. The DataFrame "sort_values()" method takes a parameter called "na_position". Using this parameter, the rows containing NaN values can be pushed to either the top or the bottom.

Creating a new dataframe with a dictionary

# importing pandas
import pandas as pd
import numpy as np

# animal_data dictionary
animal_data = {
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Sound": ["Meow", "Woof", "Mooo"],
    "Rank": [1, 5, 3],
    "Jumping_height": [20, 10, np.nan],
}

# creating a dataframe using the animal_data dictionary
animal_df = pd.DataFrame(animal_data)

# printing animal_df
print("animal_df \n", animal_df)

animal_df
  Name  Speed Sound  Rank  Jumping_height
0  Cat     15  Meow     1            20.0
1  Dog     12  Woof     5            10.0
2  Cow     10  Mooo ...
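A small sketch of the na_position parameter in use. It reuses two columns of the excerpt's animal data; the variable names nan_last and nan_first are made up for the example.

```python
import numpy as np
import pandas as pd

animal_df = pd.DataFrame({
    "Name": ["Cat", "Dog", "Cow"],
    "Jumping_height": [20, 10, np.nan],
})

# na_position="last" (the default) puts NaN rows at the bottom.
nan_last = animal_df.sort_values("Jumping_height", na_position="last")

# na_position="first" pushes NaN rows to the top instead.
nan_first = animal_df.sort_values("Jumping_height", na_position="first")

print(nan_last)
print(nan_first)
```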

Python Pandas Sorting Dataframe In Ascending or Descending Order Based On Single or Multiple Columns (Example)

Sorting a pandas dataframe by single or multiple columns. The DataFrame "sort_values()" method can sort the dataframe based on one or more columns, with control over the sort priority and over ascending or descending order.

Creating a new dataframe with a dictionary

# importing pandas
import pandas as pd
import numpy as np

# animal_data dictionary
animal_data = {
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Sound": ["Meow", "Woof", "Mooo"],
    "Rank": [1, 5, 3],
    "Jumping_height": [20, 10, np.nan],
}

# creating a dataframe using the animal_data dictionary
animal_df = pd.DataFrame(animal_data)

# printing animal_df
print("animal_df \n", animal_df)

animal_df
  Name  Speed Sound  Rank  Jumping_height
0  Cat     15  Meow     1            20.0
1  Dog     12  Woof     5            10.0
2  Cow     10  Mooo     3 ...
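Sorting by one column and by several columns might look like the sketch below. The "Goat" row is a hypothetical addition so that two rows tie on Speed and the second sort key has something to decide.

```python
import pandas as pd

animal_df = pd.DataFrame({
    "Name": ["Cat", "Dog", "Cow", "Goat"],  # "Goat" row added for illustration
    "Speed": [15, 12, 10, 12],
    "Rank": [1, 5, 3, 2],
})

# Single column, descending.
by_speed_desc = animal_df.sort_values("Speed", ascending=False)

# Multiple columns: Speed descending first, then Rank ascending to break ties.
by_speed_then_rank = animal_df.sort_values(
    ["Speed", "Rank"], ascending=[False, True]
)

print(by_speed_then_rank)
```

The ascending list lines up with the column list, so each sort key gets its own direction.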

Python Pandas Find And Replace String Values With New Values In DataFrame Columns (Example)

Find and replace string values in a column of a pandas dataframe. String columns have a str.replace() method that can be used for finding and replacing values. Chain this method onto a string column through the .str accessor, and pass the value to search for and the new replacement value. Depending on the dataset, you may need to run search and replace on columns with mixed datatypes (int, etc.). To handle this, cast the column to string with astype and then chain the required string methods.

Creating a new dataframe with a dictionary

# importing pandas
import pandas as pd

# animal_sp_char_df - with special characters
animal_data_with_sp_char = {
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Goat", "Snake"],
    "Sound": ["#Meow###", "Wo##of", "Mo#oo", "Rwaar###", "##Baaa", "Skkk##sss"],
    "Mixed": [123, "#13", "53###", 321, "###456", ...
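A minimal sketch of the find-and-replace step, using the first three rows of the excerpt's data. Passing regex=False is a choice made here so that "#" is treated literally; the full post may do it differently.

```python
import pandas as pd

df = pd.DataFrame({
    "Sound": ["#Meow###", "Wo##of", "Mo#oo"],
    "Mixed": [123, "#13", "53###"],
})

# regex=False treats "#" as a literal character, not a pattern.
df["Sound"] = df["Sound"].str.replace("#", "", regex=False)

# Cast to str first; otherwise the integer cell would come back as NaN.
df["Mixed"] = df["Mixed"].astype(str).str.replace("#", "", regex=False)

print(df)
```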

Python Pandas Select Every Nth Row In DataFrame (Example)

Selecting every nth row from the dataframe. We can select every nth row of a pandas dataframe using the ".iloc" indexer. It supports slicing with a step, similar to list slicing. iloc is position based and starts from zero.

Creating a new dataframe with a dictionary

# importing pandas
import pandas as pd

# animal_data_ext dictionary
animal_data_ext = {
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Goat", "Snake"],
    "Sound": ["Meow", "Woof", "Mooo", "Rwaar", "Baaa", "Skkksss"],
}

# creating a dataframe using the animal_data_ext dictionary
animal_ext_df = pd.DataFrame(animal_data_ext)

# printing animal_ext_df
print("animal_ext_df \n", animal_ext_df)

animal_ext_df
    Name    Sound
0    Cat     Meow
1    Dog     Woof
2    Cow     Mooo
3  Tiger    Rwaar
4   Goat     Baaa
5  Snake  Skkksss

Selecting every nth row (includi...
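The step slicing can be sketched like this, with n = 2 chosen arbitrarily. The two variants differ only in whether the first row is included.

```python
import pandas as pd

animal_ext_df = pd.DataFrame({
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Goat", "Snake"],
    "Sound": ["Meow", "Woof", "Mooo", "Rwaar", "Baaa", "Skkksss"],
})

n = 2  # step size, chosen for illustration

# Start at position 0 and take every nth row: positions 0, 2, 4.
every_nth_from_first = animal_ext_df.iloc[::n]

# Start at position n - 1 so the nth, 2nth, ... rows are taken: positions 1, 3, 5.
every_nth_from_nth = animal_ext_df.iloc[n - 1::n]

print(every_nth_from_first)
print(every_nth_from_nth)
```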

Python Pandas Lower/ Upper Case values in DataFrame Columns (Examples)

Lowercase / uppercase all column cell contents. Strings have methods for lower-casing and upper-casing. A column is selected from the dataframe and str (string) operations are applied to it. This raises an error for columns with an int data type, however. We can either skip those columns using conditionals, or use "astype" to cast the column to string and then apply string methods without any error. Depending on the requirement, one might choose either to skip such columns or to handle all columns, including those with mixed-type contents.

Creating a new dataframe with a dictionary (which has alphanumeric cell contents)

# importing pandas
import pandas as pd

# animal_data_with_alpha_nums dictionary
animal_data_with_alpha_nums = {
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Sound": ["Meow", "Woof", "Mooo"],
    "Alpha_Num": [123, "aBc123XyZ", "123xYz"]
}

#creating a dataframe using the a...
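One way to combine the two options (skip numeric columns, cast everything else to string) is sketched below. The use of pd.api.types.is_numeric_dtype as the conditional is an assumption about how the skip could be written, not necessarily the post's approach.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Alpha_Num": [123, "aBc123XyZ", "123xYz"],
})

# A single string column can be upper-cased directly.
name_upper = df["Name"].str.upper()

# For all columns: skip purely numeric ones, cast the rest to str, then lower-case.
result = df.copy()
for col in result.columns:
    if not pd.api.types.is_numeric_dtype(result[col]):
        result[col] = result[col].astype(str).str.lower()

print(name_upper.tolist())
print(result)
```

Note that after astype(str) the integer 123 in the mixed column becomes the string "123".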

Python Pandas Iterate All Columns Get Column-Wise Unique Values From Dataframe (Example)

Generating the unique values of every column in a dataframe and storing them in a Python dictionary. Each dataframe column (a Series) has a "unique" method which returns the required unique values. The dataframe "columns" property is used to get the column names, which are iterated over for this purpose.

Creating a new dataframe with a dictionary (which has duplicate data)

# importing pandas
import pandas as pd

# animal_data_with_duplicates dictionary
animal_data_with_duplicates = {
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Cat"],
    "Speed": [15, 12, 10, 20, 15],
    "Sound": ["Meow", "Woof", "Mooo", "Roar", "Meow"],
}

#creating a dataframe using the animal_data_with_duplicates dictionary
animal_with_duplicates_df = pd.DataFrame(animal_data_with_duplicates)

#printing animal_with_duplicates_df
print("animal_with_duplicates_df \n", animal_with_...
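The column-wise collection described above can be sketched with a dictionary comprehension. The name unique_by_column is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Cat"],
    "Speed": [15, 12, 10, 20, 15],
    "Sound": ["Meow", "Woof", "Mooo", "Roar", "Meow"],
})

# For each column name, keep the unique values in order of first appearance.
unique_by_column = {col: list(df[col].unique()) for col in df.columns}

for col, values in unique_by_column.items():
    print(col, len(values))
```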

Python Pandas Get Column Unique Values From Dataframe (Example)

Generating the unique values of a column in a dataframe. A dataframe column (a Series) has a "unique" method which returns the required unique values.

Creating a new dataframe with a dictionary (which has duplicate data)

# importing pandas
import pandas as pd

# animal_data_with_duplicates dictionary
animal_data_with_duplicates = {
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Cat"],
    "Speed": [15, 12, 10, 20, 15],
    "Sound": ["Meow", "Woof", "Mooo", "Roar", "Meow"],
}

#creating a dataframe using the animal_data_with_duplicates dictionary
animal_with_duplicates_df = pd.DataFrame(animal_data_with_duplicates)

#printing animal_with_duplicates_df
print("animal_with_duplicates_df \n", animal_with_duplicates_df)

animal_with_duplicates_df
    Name  Speed Sound
0    Cat     15  Meow
1    Dog     12  Woof
2    Cow     10  Mooo
3  Tiger     20  Roar
4    Cat     15  Meow

#g...

Python Pandas Get Index List Of Dataframe (Example)

Generating an index list from a dataframe. A dataframe object has an "index" attribute which returns the index object. This can be converted into a list. The index list can be used for counting the current number of rows, for use with ".iloc", for shuffling, etc.

Creating a new dataframe with data

# importing pandas
import pandas as pd

# animal_data dictionary
animal_data = {
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Sound": ["Meow", "Woof", "Mooo"],
}

#creating a dataframe using the animal_data dictionary
animal_df = pd.DataFrame(animal_data)

#printing animal_df
print("animal_df \n", animal_df)

animal_df
  Name  Speed Sound
0  Cat     15  Meow
1  Dog     12  Woof
2  Cow     10  Mooo

# create a list containing the index values
indices_as_list = list(animal_df.index)
print("indices_as_list \n", indices_as_list)
print("\n length of indices_as_...

Python Pandas Get Column Names (Headers) List With Example

Generating a list of column names from a dataframe. A dataframe object has a "columns" attribute which returns the column labels. This can be converted into a list.

Creating a new dataframe with data

# importing pandas
import pandas as pd

# animal_data dictionary
animal_data = {
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Sound": ["Meow", "Woof", "Mooo"],
}

#creating a dataframe using the animal_data dictionary
animal_df = pd.DataFrame(animal_data)

#printing animal_df
print("animal_df \n", animal_df)

animal_df
  Name  Speed Sound
0  Cat     15  Meow
1  Dog     12  Woof
2  Cow     10  Mooo

# create a list containing the column names
column_names_as_list = list(animal_df.columns)
print("column_names_as_list \n", column_names_as_list)
print("\n length of column_names_as_list \n", len(column_names_as_list))

#or directly use the values a...

Python Pandas Create DataFrame Using Dictionary (Example)

Creating a pandas dataframe with dictionary data. A dictionary containing lists as values can be passed directly into the pd.DataFrame constructor to create a new dataframe from that data. The keys in the dictionary are used as the column headers and the values in each list are used as the row values.

# importing pandas
import pandas as pd

# animal_data dictionary
animal_data = {
    "Name": ["Cat", "Dog", "Cow"],
    "Speed": [15, 12, 10],
    "Sound": ["Meow", "Woof", "Mooo"],
}

#creating a dataframe using the animal_data dictionary
animal_df = pd.DataFrame(animal_data)

#printing animal_df
print("animal_df \n", animal_df)

animal_df
  Name  Speed Sound
0  Cat     15  Meow
1  Dog     12  Woof
2  Cow     10  Mooo

TypeDoc.json

TypeDoc makes building documentation very easy. It fits into most codebases with minimal setup. TypeDoc extracts and builds documentation from TypeScript types, interfaces, and so on, and also makes use of JavaScript doc comments.

Install dependencies (as a dev dependency):

npm i typedoc -D
yarn add typedoc -D

After installation, create a "typedoc.json" in the root path.

Find more resources here: https://typedoc.org/options/configuration/

Configuration file I have customised: https://github.com/808vita/nextjs-blog-with-auth/blob/deploy/typedoc.json

Repository GitHub link: Full repository github - 808vita -oofdev

Tried to open a very big CSV file in excel

Have you tried opening a very big CSV file with Excel? I tried to open a CSV file with 5 million rows of data using Microsoft Excel:

1,048,576 row limit. I was aware of this roughly 1 million row limit. Not all the data was loaded, and I even got a notification dialog stating this.
32,767 characters per cell. This I was not aware of. After opening a file which exceeded this limit, new rows were displayed and the sheet looked like a mess. But the file was properly formatted when opened with Notepad.
This one is obvious, but formulas and filters were very slow (expected, given the size of the data).

Why was I required to open a file with over 5 million rows in the first place? I was actively trying to learn machine learning and tried to build a dataset for supervised learning. I wanted to open the file to mark classes and values for training classification and regression models.

Workarounds I used include:

Filtered and removed currently unused rows. This cut the size by almost half.
Split the files into ...
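For files too large for Excel, one common alternative is to process the CSV in pieces with pandas. The sketch below is an assumption about how filtering and splitting could be done in code, not the author's actual workflow; the tiny generated file, column names, and threshold are stand-ins.

```python
import os
import tempfile

import pandas as pd

# Build a small stand-in CSV; the file in the post had about 5 million rows.
csv_path = os.path.join(tempfile.mkdtemp(), "big.csv")
pd.DataFrame({"id": range(10), "value": range(0, 100, 10)}).to_csv(
    csv_path, index=False
)

total_rows = 0
kept_parts = []

# chunksize makes read_csv yield DataFrames of at most that many rows,
# so the whole file never has to fit in memory at once.
for chunk in pd.read_csv(csv_path, chunksize=4):
    total_rows += len(chunk)
    # Keep only the rows of interest from each chunk (hypothetical filter).
    kept_parts.append(chunk[chunk["value"] >= 50])

filtered = pd.concat(kept_parts, ignore_index=True)
print(total_rows, len(filtered))
```

Each chunk could also be written to its own smaller file with to_csv() if the goal is to split the data for Excel.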

Python literal_eval - convert stringed list into python list

Python ast.literal_eval() is a very useful function which can evaluate the contents of a string as Python datatypes. If the string contains a valid Python literal, the corresponding object is created. One use case in which I personally used it was converting string values stored in CSV files. The rows contained lists stored as strings, and I wanted to run operations on them. These must be converted into list format first; ast.literal_eval() achieves that!

#python code
import ast

#string containing a list
stringed_list = "[1,2,3]"

#converting stringed list into python list
converted_list = ast.literal_eval(stringed_list)

print(type(stringed_list)) # <class 'str'>
print(type(converted_list)) # <class 'list'>

Even though this is a really convenient way to convert a stringed list back into a Python list, it is slow. This works great for moderately small CSV data files in which we can store scaled parameter lists, etc. as strings. When the size of the file increases, so does ...
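For the CSV use case mentioned above, the conversion can be applied while the file is parsed. This is a minimal sketch: the in-memory CSV and the "params" column name are hypothetical, and using read_csv's converters argument is one option among several.

```python
import ast
import io

import pandas as pd

# In-memory CSV standing in for a file whose "params" column stores lists as text.
csv_text = 'name,params\na,"[1, 2, 3]"\nb,"[4, 5]"\n'

# converters runs literal_eval on each cell of the column while parsing.
df = pd.read_csv(io.StringIO(csv_text), converters={"params": ast.literal_eval})

# literal_eval only accepts literals, so arbitrary expressions are rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True

print(df["params"].tolist())
print(rejected)
```

Because only literals are accepted, this is a safer choice than eval() for text read from a file.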
