Arcitura Certified Big Data Scientist

A Certified Big Data Scientist has demonstrated proficiency in the application of techniques, principles and processes required for exploring and analyzing large volumes of complex data with the goals of discovering novel insights, developing data products and communicating analytic results to drive decision-making.

Module: Fundamental Big Data
This module provides a high-level overview of essential Big Data topic areas. A basic understanding of Big Data from business and technology perspectives is provided, along with an overview of common benefits, challenges, and adoption issues. The course content is divided into a series of modular sections, each of which is accompanied by one or more hands-on exercises.

The following primary topics are covered:
– Understanding Big Data
– Fundamental Terminology & Concepts
– Big Data Business & Technology Drivers
– Traditional Enterprise Technologies Related to Big Data
– Characteristics of Data in Big Data Environments
– Dataset Types in Big Data Environments
– Fundamental Analysis and Analytics
– Machine Learning Types
– Business Intelligence & Big Data
– Data Visualization & Big Data
– Big Data Adoption & Planning Considerations

Module: Big Data Analysis & Technology Concepts
This module explores a range of the most relevant topics that pertain to contemporary analysis practices, technologies and tools for Big Data environments. The course content does not get into implementation or programming details, but instead keeps coverage at a conceptual level, focusing on topics that enable participants to develop a comprehensive understanding of the common analysis functions and features offered by Big Data solutions, as well as a high-level understanding of the back-end components that enable these functions.

The following primary topics are covered:
– Big Data Analysis Lifecycle (from business case evaluation to data analysis and visualization)
– A/B Testing, Correlation
– Regression, Heat Maps
– Time Series Analysis
– Traditional Enterprise
– Network Analysis
– Spatial Data Analysis
– Classification, Clustering
– Filtering (including collaborative filtering & content-based filtering)
– Sentiment Analysis, Text Analytics
– Processing Workloads, Clusters
– Cloud Computing & Big Data
– Foundational Big Data Technology Mechanisms

Module: Fundamental Big Data Analysis & Science
This module provides an in-depth overview of essential topic areas pertaining to data science and analysis techniques relevant and unique to Big Data, with an emphasis on how analysis and analytics need to be carried out individually and collectively in support of the distinct characteristics, requirements and challenges associated with Big Data datasets.

The following primary topics are covered:
– Data Science, Data Mining & Data Modeling
– Big Data Dataset Categories
– High-Volume, High-Velocity, High-Variety, High-Veracity, High-Value Datasets
– Exploratory Data Analysis (EDA)
– EDA Numerical Summaries, Rules and Data Reduction
– EDA analysis types, including Univariate, Bivariate and Multivariate
– Essential Statistics, including Variable Categories and Relevant Mathematics
– Statistical Analysis, including Descriptive, Inferential, Covariance, Hypothesis Testing, etc.
– Measures of Variation or Dispersion, Interquartile Range & Outliers, Z-Score, etc.
– Probability, Frequency, Statistical Estimators, Confidence Interval, etc.
– Data Munging and Machine Learning
– Variables and Basic Mathematical Notations
– Statistical Measures and Statistical Inference
– Confirmatory Data Analysis (CDA)
– CDA Hypothesis Testing, Null Hypothesis, Alternative Hypothesis, Statistical Significance, etc.
– Distributions and Data Processing Techniques
– Data Discretization, Binning and Clustering
– Visualization Techniques, including Bar Graph, Line Graph, Histogram, Frequency Polygons, etc.
– Prediction: Linear Regression, Mean Squared Error and Coefficient of Determination (R²), etc.
– Clustering: k-means, Cluster Distortion, Missing Feature Values, etc.
– Numerical Summaries

Module: Advanced Big Data Analysis & Science
This module delves into a range of advanced data analysis practices and analysis techniques that are explored within the context of Big Data. The course content focuses on topics that enable participants to develop a thorough understanding of statistical, modeling, and analysis techniques for data patterns, clusters and text analytics, as well as the identification of outliers and errors that affect the significance and accuracy of predictions made on Big Data datasets.

The following primary topics are covered:
– Modeling, Model Evaluation, Model Fitting and Model Overfitting
– Statistical Models, Model Evaluation Measures
– Cross-Validation, Bias-Variance, Confusion Matrix and F-Score
– Machine Learning Algorithms and Pattern Identification
– Association Rules and Apriori Algorithm
– Data Reduction, Dimensionality Feature Selection
– Feature Extraction, Data Discretization (Binning and Clustering)
– Advanced Statistical Techniques
– Parametric vs. Non-Parametric, Clustering vs. Non-Clustering
– Distance-Based, Supervised vs. Semi-Supervised
– Linear Regression and Logistic Regression for Big Data
– Classification Rules for Big Data
– Logistic Regression, Naïve Bayes, Laplace Smoothing, etc.
– Decision Trees for Big Data
– Tree Pruning, Feature Splitting, One Rule (1R) Algorithm
– Pattern Identification, Association Rules, Apriori Algorithm
– Time Series Analysis, Trend, Seasonality
– k-Nearest Neighbor (kNN), k-means
– Text Analytics for Big Data
– Bag of Words, Term Frequency, Inverse Document Frequency, Cosine Distance, etc.
– Outlier Detection for Big Data
– Statistical, Distance-Based, Supervised and Semi-Supervised Techniques

Module: Big Data Analysis & Science Lab
As a hands-on lab, this module incorporates a set of detailed exercises that require participants to solve various inter-related problems, with the goal of fostering a comprehensive understanding of how different data analysis techniques can be applied to solve problems in Big Data environments and used to make significant, relevant predictions that offer increased business value.

Suitable for:

  • Performance improvement and strategy specialists
  • BI and data warehouse architects
  • Designers and developers
  • BI project managers
  • Business and data analysts
  • Data quality & data governance professionals

Exams:

  • Fundamental Big Data (B90.01): 60 minutes
  • Big Data Analysis & Technology Concepts (B90.02): 60 minutes
  • Fundamental Big Data Analysis & Science (B90.04): 60 minutes
  • Advanced Big Data Analysis & Science (B90.05): 60 minutes
  • Big Data Analysis & Science Lab (B90.06): 120 minutes

Students can schedule any Pearson VUE exam via online proctoring as an alternative to visiting a Pearson VUE testing center. Certain security requirements must be met, and the student is supervised by a live proctor via webcam.

*Note: Students can also purchase an ESL extension when taking a Pearson VUE exam, which gives them an extra 30 minutes of test-taking time.

An electronic certificate and a digital certification badge from Acclaim will be sent to students passing the required exam(s).


  • 28 Jan – 1 Feb 2019
  • 18 – 22 Mar 2019
  • 13 – 17 May 2019
  • 1 – 5 Jul 2019
  • 23 – 27 Sep 2019
  • 4 – 8 Nov 2019
  • 16 – 20 Dec 2019

Duration: 5 Days
Price: $3,745.00 (After GST)