IT & Software Online Course by Udemy
Learn how to use Python and Spark 3.0.1 (PySpark) for Data Engineering and Data Analytics – Beginner to intermediate
A training course in the Other IT & Software category
Apache Spark 3 for Data Engineering & Analytics with Python
The key objectives of this course are as follows:
- Learn the Spark Architecture
- Learn Spark Execution Concepts
- Learn Spark Transformations and Actions using the Structured API
- Learn Spark Transformations and Actions using the RDD (Resilient Distributed Datasets) API
- Learn how to set up your own local PySpark Environment
- Learn how to interpret the Spark Web UI
- Learn how to interpret DAG (Directed Acyclic Graph) for Spark Execution
- Learn the RDD (Resilient Distributed Datasets) API (Crash Course)
  - RDD Transformations
  - RDD Actions
- Learn the Spark DataFrame API (Structured APIs)
  - Create Schemas and Assign DataTypes
  - Read and Write Data using the DataFrame Reader and Writer
  - Read Semi-Structured Data such as JSON
  - Create and Add New Data Columns to the DataFrame using Expressions
  - Filter the DataFrame using the “Filter” and “Where” Transformations
  - Ensure that the DataFrame has unique rows
  - Detect and Drop Duplicates
  - Augment the DataFrame by Adding New Rows
  - Combine 2 or More DataFrames
  - Order the DataFrame by Specific Columns
  - Rename and Drop Columns from the DataFrame
  - Clean the DataFrame by Detecting and Removing Missing or Bad Data
  - Create User-Defined Spark Functions
  - Read and Write to/from Parquet Files
  - Partition the DataFrame and Write to Parquet Files
  - Aggregate the DataFrame using Spark SQL functions (count, countDistinct, Max, Min, Sum, SumDistinct, AVG)
  - Perform Aggregations with Grouping

The Python Spark projects that we are going to do together:

Sales Data
- Create a Spark Session
- Read a CSV file into a Spark DataFrame
- Learn to Infer a Schema
- Select data from the Spark DataFrame
- Produce analytics that show the topmost sales orders per Region and Country

Convert Fahrenheit to Degrees Centigrade
- Create a Spark Session
- Read and Parallelize data using the Spark Context into an RDD
- Create a Function to Convert Fahrenheit to Degrees Centigrade
- Use the Map Function to convert data contained within an RDD
- Filter temperatures greater than or equal to 13 degrees Celsius

XYZ Research
- Create a set of RDDs that hold Research Data
- Use the union transformation to combine RDDs
- Learn to use the subtract transformation to remove values from an RDD
- Use the RDD API to answer the following questions:
  - How many research projects were initiated in the first three years?
  - How many projects were completed in the first year?
  - How many projects were completed in the first two years?

Sales Analytics
- Create the Sales Analytics DataFrame from a set of CSV Files
- Prepare the DataFrame by applying a Structure
- Remove bad records from the DataFrame (Cleaning)
- Generate New Columns from the DataFrame
- Write a Partitioned DataFrame to a Parquet Directory
- Answer the following questions and create visualizations using Seaborn and Matplotlib:
  - What was the best month in sales?
  - What city sold the most products?
  - What time should the business display advertisements to maximize the likelihood of customers buying products?
  - What products are often sold together in the state “NY”?

Technology Spec
- Python
- Jupyter Notebook
- Jupyter Lab
- PySpark (Spark with Python)
- Pandas
- Matplotlib
- Seaborn
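As a rough illustration of the "Convert Fahrenheit to Degrees Centigrade" project outlined above, the steps might look something like the sketch below. This is not taken from the course materials: the function names, the app name, and the `local[*]` master setting are all assumptions chosen for the example. It parallelizes a list of readings into an RDD, applies the conversion with `map`, and keeps values of 13 degrees Celsius or more with `filter`.

```python
from typing import Iterable, List


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit reading to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def is_at_least_13c(temp_c: float) -> bool:
    """Keep only temperatures greater than or equal to 13 degrees Celsius."""
    return temp_c >= 13.0


def convert_with_spark(temps_f: Iterable[float]) -> List[float]:
    """Run the conversion and filter as RDD transformations on a local Spark session."""
    # Imported here so the plain-Python helpers above work without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")                # assumption: run locally on all cores
        .appName("fahrenheit-to-celsius")  # hypothetical app name
        .getOrCreate()
    )
    try:
        # Parallelize the readings into an RDD via the Spark Context.
        rdd = spark.sparkContext.parallelize(list(temps_f))
        # map and filter are lazy transformations; collect is the action that runs them.
        return rdd.map(fahrenheit_to_celsius).filter(is_at_least_13c).collect()
    finally:
        spark.stop()
```

With PySpark installed, calling `convert_with_spark([50.0, 59.0, 212.0])` (made-up sample readings) would drop the 50 °F reading, since it converts to 10 °C, and keep the other two. Keeping the conversion and the predicate as ordinary Python functions means they can be checked without starting a Spark session.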