Data analysis using pandas - Tips and Techniques

Pandas

pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool built on top of the Python programming language.

Installation

  • Download and install Anaconda. It ships with pandas preinstalled.
  • To install pandas manually instead, use pip, which installs the latest version by default.
    pip install pandas
    

Pandas Code Snippets

1. How to read data from a CSV file?
import pandas as pd

df = pd.read_csv('<csv filename>')
2. How to load data into a pandas DataFrame from a database using a SQL query?
import mysql.connector
from mysql.connector import Error
import pandas as pd

# Initialise first so the finally block is safe even if connect() raises
connection = None
try:
    connection = mysql.connector.connect(host='localhost',
                                         database='<database name>',
                                         user='<username>',
                                         password='<password>')
    if connection.is_connected():
        sql_statement = "select * from test;"
        # Run the query and load the result set into a DataFrame
        df = pd.read_sql(sql_statement, con=connection)
        print(df)

except Error as e:
    print("Error while connecting to MySQL", e)
finally:
    # Close the connection only if it was actually opened
    if connection is not None and connection.is_connected():
        connection.close()
        print("MySQL connection is closed")

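3. How to set a column as the index of a DataFrame?
# Use the given column as the row index, modifying df in place
df.set_index('<Column Name>', inplace=True)
# Clear any name attached to the columns axis
df.columns.name = None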
4. How to replace NaN values in a DataFrame?
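One common approach is `fillna`, which can take either a single value or a per-column mapping. `dropna` is an alternative when the incomplete rows can simply be discarded. The sketch below uses made-up sample data, and the column names `city` and `temp` are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values
df = pd.DataFrame({
    "city": ["Pune", None, "Delhi"],
    "temp": [31.0, np.nan, 28.0],
})

# Replace NaN in one column with a constant
temp_zero = df["temp"].fillna(0)

# Replace NaN per column: a placeholder for text, the column mean for numbers
filled = df.fillna({"city": "unknown", "temp": df["temp"].mean()})

# Or drop every row that contains at least one NaN
dropped = df.dropna()

print(filled)
```

Note that `fillna` and `dropna` return new objects by default, so `df` itself is left unchanged here.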
5. What are the different ways of merging DataFrames?
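The main options are `pd.merge` (SQL-style joins on key columns, controlled by the `how` parameter), `DataFrame.join` (joins on the index), and `pd.concat` (stacking frames along an axis). The sketch below uses two small made-up frames. The names `left` and `right` and their columns are only for illustration.

```python
import pandas as pd

# Hypothetical sample frames sharing an "id" key
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

# Inner join: only ids present in both frames
inner = pd.merge(left, right, on="id", how="inner")

# Left join: every row of left, NaN where right has no match
left_join = pd.merge(left, right, on="id", how="left")

# Outer join: union of ids from both frames
outer = pd.merge(left, right, on="id", how="outer")

# Index-based join: set the key as the index on both sides first
joined = left.set_index("id").join(right.set_index("id"), how="inner")

# Concatenation: stack rows of frames that share the same columns
stacked = pd.concat([left, left], ignore_index=True)

print(outer)
```

`how` also accepts `"right"` and `"cross"`. Which one to use depends on which side's rows must be preserved.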
6. How to handle a large amount of data in a DataFrame?
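A few techniques usually help: read only the columns you need (`usecols`), choose compact dtypes (`dtype`), and process the file in pieces with `chunksize` so the whole dataset never has to fit in memory at once. The sketch below is a minimal illustration, assuming a CSV source. The in-memory string stands in for a large file, and its columns are made up.

```python
import io
import pandas as pd

# Small in-memory stand-in for a large CSV file (hypothetical data)
csv_data = io.StringIO(
    "user_id,country,amount\n"
    "1,IN,10.5\n"
    "2,US,20.0\n"
    "3,IN,4.5\n"
    "4,DE,7.0\n"
    "5,US,3.0\n"
)

# With chunksize set, read_csv returns an iterator of DataFrames
reader = pd.read_csv(
    csv_data,
    usecols=["country", "amount"],                       # skip unused columns
    dtype={"country": "category", "amount": "float32"},  # compact dtypes
    chunksize=2,                                         # rows per chunk
)

total = 0.0
rows = 0
for chunk in reader:
    # Aggregate each chunk, then discard it before reading the next
    total += float(chunk["amount"].sum())
    rows += len(chunk)

print(rows, total)
```

For a real file, pass its path instead of `csv_data` and pick a much larger `chunksize`. `df.memory_usage(deep=True)` is a handy way to check how much a loaded frame actually occupies.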
