COURSE DESCRIPTION
This certificate course is designed to give participants hands-on skills and competencies in using data analytics software tools and technologies to derive business value from a range of institutional or organizational data. The explosion of social media and the computerization of every aspect of social and economic activity have resulted in large volumes of mostly unstructured data. In this course, data from social media, sales, e-mails, security agencies, financial institutions, and immigration will be analyzed to enhance strategic business decision-making.
COURSE OBJECTIVES
By the end of this course, a participant should be able to:
- Understand the technologies used to store, manipulate, and analyze big data with Hadoop
- Extract business-relevant and socially relevant information from analyzed data, using appropriate digital methods to interpret and share results
- Use R and Python, the basic tools for statistical analysis, to apply machine learning algorithms
- Apply the Spark ML library API appropriately in practical scenarios
CAREER PROSPECTS
- Hadoop / Big Data Developer
- Hadoop Administrator
- Big Data Engineer
- Big Data Architect
- Machine Learning Engineer
- Extract, Transform, and Load (ETL) Developer
- Data Scientist/Analyst/Architect
COURSE MODULES
The following modules will be covered:
1. APACHE SPARK:
- Apache Spark is a cluster computing framework designed for fast computation. It extends the MapReduce model popularized by Hadoop to efficiently support more types of computation, including interactive queries and stream processing. This module introduces participants to the basic concepts of Apache Spark and shows how to analyze data sets using Python.
2. BIG DATA TECHNOLOGIES:
- Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale from a single server to thousands of machines, each offering local computation and storage. This module introduces participants to the components of the Hadoop ecosystem.
3. R PROGRAMMING:
- This module is designed to equip participants with the data science principles, tools, and skills necessary to extract knowledge from data. Participants will also learn the concepts, techniques, and tools required to deal with various facets of data science practice, including data collection and integration, exploratory data analysis, predictive modelling, descriptive modelling, data product creation, evaluation, and effective communication.
4. PYTHON FOR DATA SCIENCE:
- This module is designed to equip participants with hands-on skills and competencies in the Python data science ecosystem, so that they can apply data analysis techniques, information visualization, machine learning, and inferential statistical analysis to gain new insights from data. The focus is on the use of several practical tools and libraries in the Python programming language.
MODE OF DELIVERY:
- A hybrid face-to-face/online instructor-led hands-on training
PRE-REQUISITES
- Certificate in Software Development (CSD)
- Foundation of Software Development (FSD)
- Basic knowledge of programming
- SQL, Java, Python
- Minimum of WASSCE