PySpark Quiz — Intermediate

📘 Intermediate — Aggregations, Joins, UDFs & Caching

1. How do you group by department and count employees?

2. Which function lets you apply custom logic to a column?

3. How do you cache a DataFrame?

4. Which join type returns only the rows that match in both DataFrames?

5. How do you apply the UDF convert_salary to the salary column?

6. How do you persist a DataFrame to both memory and disk?

7. How do you remove duplicate rows based on specific columns?

8. How do you rename a DataFrame column?

9. What is the difference between map and flatMap?

10. How do you perform a left join on the id column?
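
Question 9 comes down to the shape of the output, which is easier to see in a small example. The sketch below is plain Python and does not use Spark. The helpers map_like and flat_map_like are hypothetical names introduced here only to mimic what RDD.map and RDD.flatMap do to a collection.

```python
from itertools import chain


def map_like(func, items):
    """Return exactly one output element per input element (the map shape)."""
    return [func(item) for item in items]


def flat_map_like(func, items):
    """Apply func, which returns an iterable per element, then flatten one level
    (the flatMap shape)."""
    return list(chain.from_iterable(func(item) for item in items))


lines = ["a b", "c"]

# map keeps one result per line, so each result is itself a list of words.
mapped = map_like(str.split, lines)

# flatMap merges the per-line word lists into a single list of words.
flattened = flat_map_like(str.split, lines)

print(mapped)     # [['a', 'b'], ['c']]
print(flattened)  # ['a', 'b', 'c']
```

With an RDD, the corresponding calls would be rdd.map(lambda line: line.split()) and rdd.flatMap(lambda line: line.split()). The first keeps the element count equal to the input; the second can return more or fewer elements than it received.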
