Thursday, January 6, 2022

 Create Environment with a default python version


# Create the Spark session

import pyspark

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("test").getOrCreate()

# Read a CSV file from a location, using the header row as column names and inferring the schema

df = spark.read.csv('archive/Case.csv', header=True, inferSchema=True)

# Show records

df.show()

# Rename a column

df = df.withColumnRenamed('Existing Column', 'new Column')

# Select columns (list as many column names as needed)

df.select('col1', 'col2', 'col3').show()

# Sort (ascending by default; list as many column names as needed)

df.sort('col1', 'col2', 'col3').show()

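# Specify asc or desc per column

from pyspark.sql import functions as f

# f.desc() and f.asc() each take a single column, so wrap every column separately
df.sort(f.desc('col1'), f.desc('col2'), f.desc('col3')).show()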

Cast

Though we don’t face it in this dataset, there might be scenarios where PySpark reads a double as an integer or a string. In such cases, you can use the cast function to convert the column to the right type.
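A minimal sketch of how such a conversion could look. The helper name cast_columns and its dictionary argument are made up for this note; only withColumn and Column.cast are PySpark calls, and the type names ('double', 'int', 'string') are passed as strings.

```python
def cast_columns(df, type_by_column):
    """Return df with each listed column cast to the given Spark type name.

    df is expected to be a PySpark DataFrame; type_by_column maps a column
    name to a type name string such as 'double', 'int' or 'string'.
    """
    for column_name, type_name in type_by_column.items():
        # withColumn with an existing column name replaces that column
        df = df.withColumn(column_name, df[column_name].cast(type_name))
    return df
```

With the df loaded above, df = cast_columns(df, {'col1': 'double'}) would convert col1 to a double, and df.printSchema() can be used to check the resulting types.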

 

 

Friday, March 16, 2018

sudo apt install curl

Install Anaconda Python using curl

curl -O https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh


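If apt reports "Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?", first make sure no other apt or dpkg process is running, then remove the lock and reconfigure: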

sudo rm /var/lib/dpkg/lock
sudo dpkg --configure -a

Monday, February 8, 2016

Teradata architecture

Teradata has the following main architectural components:
1. Parsing Engine (PE)
2. BYNET
3. AMP (Access Module Processor)
4. Disks

Parsing Engine:
The Parsing Engine is a virtual processor (vproc). It has the following software components:
1. Session control
2. Parser
3. Optimiser
4. Dispatcher




 Whenever a SQL request is given to the parsing engine, session control verifies the session authorisation (user name and password) and, based on that, either processes or rejects the request.

 The parser checks the SQL request for proper syntax and evaluates it. It then checks the data dictionary to confirm that all the referenced objects exist and that the user has authority to access them.

 The optimiser develops the least expensive plan (in terms of time) to execute the request. To do this, it must know the system configuration, including the available AMPs and PEs. The optimiser enables Teradata to handle complex queries efficiently.

 The dispatcher controls the sequence in which the steps are executed and passes the steps to the BYNET. The BYNET is a messaging layer between the parsing engine and the access module processors. After the AMPs process the steps, the PE receives their responses over the BYNET. The dispatcher then builds a response message and sends it back to the user.

 Teradata uses a hashing algorithm to distribute rows across the AMPs. It generates a hash value from the index column, and that value determines which AMP receives the row. Rows with the same hash value (for example, rows with duplicate index values) are always sent to the same AMP. When the index values are well spread, the data is distributed evenly across all the AMPs. When the index is not chosen properly, the data is not distributed equally, so some AMPs hold much more data than others. This is called skewness.
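The effect of the index choice on distribution can be sketched in a few lines of Python. This is only an illustration: SHA-256 stands in for Teradata's own hashing algorithm, and the number of AMPs, the row counts and the sample index values are all made up for the example.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical AMP count, chosen only for this illustration


def amp_for(index_value, num_amps=NUM_AMPS):
    """Pick an AMP for an index value using a stand-in hash (not Teradata's real one)."""
    digest = hashlib.sha256(str(index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(index_values, num_amps=NUM_AMPS):
    """Count how many rows land on each AMP for the given index values."""
    counts = Counter(amp_for(value, num_amps) for value in index_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# Index with unique values, such as an ID column: rows spread across all AMPs
unique_ids = range(10_000)

# Index with only two distinct values: every row lands on at most two AMPs
flags = ["Y" if i % 10 else "N" for i in range(10_000)]

even = rows_per_amp(unique_ids)
skewed = rows_per_amp(flags)

print("unique index :", even)
print("two-value index:", skewed)
```

The second list shows the skew: rows that share an index value always hash to the same AMP, so an index with few distinct values leaves some AMPs nearly empty while others carry most of the rows.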

Thursday, November 5, 2015

SharePoint 2016 Features

Improved provisioning capabilities
Mobile and touch
App Launcher
New and improved controls
Simple and natural sharing
Large file support
Compliance tools

Thursday, September 3, 2015

"Upload files using Windows Explorer instead" not working in SharePoint 2013 (Office 365)

Step 1 : Close all your browsers, then browse to the library that you want to open in Explorer.
Step 2 : While logging in to Office 365, tick the "Keep me signed in" check box.

If step 2 fails:

Step 1 : Close all your browsers.
Step 2 : Restart the WebClient service.
Step 3 : Browse to the library that you want to open in Explorer.
Step 4 : While logging in to Office 365, tick the "Keep me signed in" check box.

Saturday, December 13, 2014

CSS Menu with SharePoint List Items

Step 1 :
Add the CSS below to render the multilevel menu.