Tech Bites

Thursday, January 6, 2022

Create Environment with a default python version

# Create a Spark session

import pyspark

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("test").getOrCreate()

# Read a CSV file from a location, using the header row as column names and inferring the schema

df = spark.read.csv('archive/Case.csv', header=True, inferSchema=True)

# Show records

df.show()

# Rename a column

df = df.withColumnRenamed('Existing Column', 'new Column')

# Select columns (list as many column names as needed)

df.select('col1', 'col2', 'col3').show()

# Sort (ascending by default)

df.sort('col1', 'col2', 'col3').show()

# Specify asc or desc per column; f.desc() and f.asc() each take a single column

from pyspark.sql import functions as f

df.sort(f.desc('col1'), f.desc('col2'), f.desc('col3')).show()

Cast

Though we don’t face it in this dataset, there may be scenarios where PySpark reads a double as an integer or a string. In such cases, you can use the cast function to convert the column to the correct type.

Friday, March 16, 2018

sudo apt install curl

Install Anaconda Python using curl

curl -O https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh

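If apt reports "Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?" and no other apt or dpkg process is actually running, remove the stale lock and reconfigure: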

sudo rm /var/lib/dpkg/lock
sudo dpkg --configure -a
