Ultimate Developer Command Guide
Python, PySpark & SQL Reference
Essential Commands for Developers
Python Commands
Command | Function
    Example

print() | Outputs data to the console
    print("Welcome to Python!")  # Prints: Welcome to Python!

len() | Returns the length of an object
    my_list = [1, 2, 3]
    print(len(my_list))  # Output: 3

range() | Generates a sequence of numbers
    for i in range(3):
        print(i)  # Prints: 0, 1, 2 (one per line)

def | Defines a custom function
    def greet(name):
        return f"Hello, {name}"
    print(greet("Alice"))  # Prints: Hello, Alice

import | Imports a module or library
    import math
    print(math.pi)  # Prints: 3.141592653589793

[x for x in iterable] | Creates a list using a comprehension
    squares = [x**2 for x in [1, 2, 3]]
    print(squares)  # Prints: [1, 4, 9]

if/elif/else | Conditional logic
    x = 10
    if x > 5:
        print("Big")
    else:
        print("Small")
    # Prints: Big

for | Iterates over a sequence
    for fruit in ["apple", "banana"]:
        print(fruit)  # Prints: apple, banana (one per line)

while | Loops until the condition is false
    count = 0
    while count < 3:
        print(count)
        count += 1
    # Prints: 0, 1, 2 (one per line)

try/except | Handles exceptions
    try:
        print(1 / 0)
    except ZeroDivisionError:
        print("Cannot divide by zero")  # Prints: Cannot divide by zero

open() | Opens a file for reading or writing
    with open("example.txt", "w") as f:  # "example.txt" is a placeholder file name
        f.write("Hello")  # Creates the file containing the text "Hello"

list.append() | Adds an item to the end of a list
    my_list = []
    my_list.append(5)
    print(my_list)  # Prints: [5]

dict.get() | Retrieves a value from a dictionary
    my_dict = {"key": "value"}
    print(my_dict.get("key"))  # Prints: value
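The Python commands above are easier to judge in combination. The following is a minimal sketch, not a canonical pattern: the helper names (word_lengths, safe_divide, write_and_read) are hypothetical, and a temporary directory is assumed so that no real file is touched.

```python
import os
import tempfile


def word_lengths(words):
    """Map each word to its length using a comprehension and len()."""
    return {word: len(word) for word in words}


def safe_divide(a, b):
    """Return a / b, or None when b is zero (try/except)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


def write_and_read(text):
    """Write text to a file with open(), then read it back."""
    # A temporary directory keeps the example from touching real files.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "example.txt")  # placeholder file name
        with open(path, "w") as f:
            f.write(text)
        with open(path) as f:
            return f.read()


lengths = word_lengths(["apple", "banana"])
print(lengths.get("apple"))      # 5
print(lengths.get("cherry", 0))  # 0, the default returned for a missing key
print(safe_divide(1, 0))         # None
print(write_and_read("Hello"))   # Hello
```

Passing a second argument to dict.get() supplies a default for missing keys, which avoids a KeyError without needing a try/except block.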
PySpark Commands
Command/Function | Function
    Example

SparkSession.builder | Initializes a Spark session
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("MyApp").getOrCreate()

spark.read.csv() | Loads a CSV file into a DataFrame
    df = spark.read.csv("data.csv", header=True, inferSchema=True)  # "data.csv" is a placeholder path
    df.show()  # Displays the CSV data

df.show() | Displays the first n rows of a DataFrame
    df.show(3)  # Shows the first 3 rows

df.printSchema() | Displays the DataFrame schema
    df.printSchema()  # Shows column names and types

df.select() | Selects specific columns
    df.select("name", "age").show()  # Shows the name and age columns

df.filter() | Filters rows based on a condition
    df.filter(df.age > 25).show()  # Shows rows where age > 25

df.where() | Alias for filter
    df.where("salary > 50000").show()  # Shows rows where salary > 50000

df.groupBy().agg() | Groups data and applies an aggregation
    df.groupBy("department").agg({"salary": "avg"}).show()  # Shows average salary per department

df.join() | Joins two DataFrames
    df1.join(df2, df1.id == df2.id, "inner").show()  # Inner join on id

df.withColumn() | Adds or modifies a column
    df.withColumn("age_plus_10", df.age + 10).show()  # Adds a column with age + 10

df.withColumnRenamed() | Renames a column
    df.withColumnRenamed("old_name", "new_name").show()  # Renames the column

df.drop() | Drops the specified columns
    df.drop("salary").show()  # Drops the salary column

df.fillna() | Replaces null values
    df.fillna({"age": 0}).show()  # Replaces null ages with 0

df.dropDuplicates() | Removes duplicate rows
    df.dropDuplicates(["name"]).show()  # Keeps one row per distinct name

df.write.csv() | Saves a DataFrame as CSV
    df.write.csv("output.csv", mode="overwrite")  # "output.csv" is a placeholder path; Spark writes a directory of part files there

df.createOrReplaceTempView() | Registers a DataFrame as a temporary SQL view
    df.createOrReplaceTempView("temp_table")  # Creates the SQL view

spark.sql() | Runs a SQL query on a registered view
    spark.sql("SELECT name FROM temp_table WHERE age > 30").show()  # Runs the SQL query

Window.partitionBy() | Defines a window for ranking or aggregation
    from pyspark.sql.window import Window
    from pyspark.sql.functions import row_number
    w = Window.partitionBy("dept").orderBy("salary")
    df.withColumn("rank", row_number().over(w)).show()  # Adds a rank column
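The window entry is the least obvious of the PySpark commands, so here is a hedged sketch of it in isolation. The function name rank_salaries, the SAMPLE_ROWS data, and the local[1] master setting are all assumptions made for illustration. The imports sit inside the function only so that the definition loads on machines without PySpark; in a real job they would normally go at the top of the file.

```python
# Hypothetical sample data: (name, dept, salary)
SAMPLE_ROWS = [
    ("Alice", "Eng", 100),
    ("Bob", "Eng", 90),
    ("Cara", "Ops", 80),
]


def rank_salaries(rows):
    """Rank employees by salary within each department, highest first."""
    # Imported lazily so this function can be defined without PySpark installed.
    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    # local[1] runs Spark in-process on one core (an assumption for the demo).
    spark = (
        SparkSession.builder.master("local[1]").appName("RankDemo").getOrCreate()
    )
    df = spark.createDataFrame(rows, ["name", "dept", "salary"])

    # One window per department, ordered by descending salary.
    w = Window.partitionBy("dept").orderBy(F.col("salary").desc())
    ranked = df.withColumn("rank", F.row_number().over(w))

    # Collect to plain tuples so the result is easy to inspect.
    result = sorted((r["name"], r["dept"], r["rank"]) for r in ranked.collect())
    spark.stop()
    return result
```

Calling rank_salaries(SAMPLE_ROWS) on a machine with PySpark installed should give Alice rank 1 and Bob rank 2 in Eng, and Cara rank 1 in Ops. The descending order is a choice made here. The table entry orders by ascending salary, which would rank the lowest salary first.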
SQL Commands
Command | Function
    Example

SELECT | Retrieves data from a table
    SELECT name, age FROM employees;  -- Selects the name and age columns

WHERE | Filters rows based on a condition
    SELECT * FROM employees WHERE age > 30;  -- Employees older than 30

ORDER BY | Sorts the result set
    SELECT * FROM employees ORDER BY salary DESC;  -- Sorts by salary in descending order

GROUP BY | Groups rows for aggregation
    SELECT department, AVG(salary) FROM employees GROUP BY department;  -- Average salary per department

HAVING | Filters grouped results
    SELECT department, COUNT(*) FROM employees GROUP BY department HAVING COUNT(*) > 5;  -- Departments with more than 5 employees

JOIN | Combines rows from multiple tables
    SELECT e.name, d.dept_name FROM employees e JOIN departments d ON e.dept_id = d.id;  -- Inner join

LEFT JOIN | Includes all rows from the left table
    SELECT e.name, d.dept_name FROM employees e LEFT JOIN departments d ON e.dept_id = d.id;  -- Left join

LIMIT | Restricts the number of returned rows
    SELECT * FROM employees LIMIT 5;  -- Returns at most 5 rows

INSERT INTO | Adds new rows to a table
    INSERT INTO employees (name, age) VALUES ('Alice', 28);  -- Inserts a new employee

UPDATE | Modifies existing rows
    UPDATE employees SET salary = 60000 WHERE name = 'Alice';  -- Updates Alice's salary

DELETE | Removes rows from a table
    DELETE FROM employees WHERE age < 18;  -- Deletes rows where age < 18

CREATE TABLE | Creates a new table
    CREATE TABLE employees (id INT, name VARCHAR(50), age INT);  -- Creates the employees table

ALTER TABLE | Modifies table structure
    ALTER TABLE employees ADD COLUMN salary DECIMAL(10,2);  -- Adds a salary column

DROP TABLE | Deletes a table
    DROP TABLE employees;  -- Deletes the employees table
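To try the SQL commands without setting up a database server, one option is Python's built-in sqlite3 module with an in-memory database. The sketch below assumes SQLite's dialect, and the sample rows are made up for illustration. Other engines may differ in details such as column types or LIMIT support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# CREATE TABLE
cur.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, dept_name TEXT)")
cur.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, salary REAL, dept_id INTEGER)"
)

# INSERT INTO (hypothetical sample rows)
cur.executemany(
    "INSERT INTO departments (id, dept_name) VALUES (?, ?)",
    [(1, "Engineering"), (2, "Sales")],
)
cur.executemany(
    "INSERT INTO employees (name, age, salary, dept_id) VALUES (?, ?, ?, ?)",
    [
        ("Alice", 28, 55000, 1),
        ("Bob", 35, 70000, 1),
        ("Cara", 41, 48000, 2),
        ("Dan", 45, 52000, None),  # no department, to show LEFT JOIN
        ("Eve", 17, 20000, 2),
    ],
)

# UPDATE and DELETE
cur.execute("UPDATE employees SET salary = 60000 WHERE name = 'Alice'")
cur.execute("DELETE FROM employees WHERE age < 18")

# WHERE, ORDER BY, LIMIT
older = cur.execute(
    "SELECT name FROM employees WHERE age > 30 ORDER BY salary DESC LIMIT 5"
).fetchall()

# JOIN, GROUP BY, HAVING
avg_by_dept = cur.execute(
    "SELECT d.dept_name, AVG(e.salary) "
    "FROM employees e JOIN departments d ON e.dept_id = d.id "
    "GROUP BY d.dept_name HAVING COUNT(*) > 1"
).fetchall()

# LEFT JOIN keeps employees that have no matching department
with_depts = cur.execute(
    "SELECT e.name, d.dept_name "
    "FROM employees e LEFT JOIN departments d ON e.dept_id = d.id "
    "ORDER BY e.name"
).fetchall()

conn.close()

print(older)        # [('Bob',), ('Dan',), ('Cara',)]
print(avg_by_dept)  # [('Engineering', 65000.0)]
print(with_depts)
```

The `?` placeholders pass values separately from the SQL text, which is generally safer than building statements with string formatting.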
Cheat Sheet Summary
Comprehensive reference for Python, PySpark, and SQL development tasks.
Version 2.0 | Updated: August 2024