CMPF104: Data Cleaning and Preprocessing: Data Science and Data Analytics: Programming For Foundation In Engineering, Assignment, UNITEN, Malaysia
| University | Universiti Tenaga Nasional (UNITEN) |
| Subject | CMPF104: Programming For Foundation In Engineering |
Data Science and Data Analytics
Download the dataset from BRIGHTEN. If your student ID ends with an odd number, select the Concrete_Data_A dataset; if it ends with an even number, select the Concrete_Data_B dataset. Use Python attributes, functions, and libraries to solve the following problems.
a) Data Cleaning and Preprocessing:
- Use Pandas to load the dataset. Name the dataframe as concrete_df_XXX.
- Remove the ‘Number’ column using the .drop() function and display the first ten (10) rows of the data.
- Check for missing values using functions like .info() or .isnull().sum(), then handle them by dropping or replacing the empty cells.
- Convert the dataframe to an array using the to_numpy() function.
- Divide the data into two sets: 80% for training and 20% for testing. Name the sets train_data_XXX and test_data_XXX.
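The steps in part (a) could be sketched roughly as follows. This is only an illustration under stated assumptions: the real file would normally be loaded with `pd.read_csv`, but because the file name, format, and column names are not given here, the sketch builds a small synthetic dataframe with made-up columns (`Cement`, `Water`, `Strength`). Filling gaps with the column mean and splitting in row order are also assumptions; dropping rows or shuffling first would be equally valid choices.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the downloaded dataset. With the real file this
# would be something like pd.read_csv("Concrete_Data_A.csv") (name assumed).
rng = np.random.default_rng(0)
concrete_df_XXX = pd.DataFrame({
    "Number": np.arange(1, 21),
    "Cement": rng.uniform(100, 500, 20),    # made-up column
    "Water": rng.uniform(120, 250, 20),     # made-up column
    "Strength": rng.uniform(10, 80, 20),    # made-up column
})
concrete_df_XXX.loc[3, "Water"] = np.nan    # inject one empty cell for the demo

# Remove the 'Number' column and show the first ten rows
concrete_df_XXX = concrete_df_XXX.drop(columns=["Number"])
print(concrete_df_XXX.head(10))

# Check for missing values
concrete_df_XXX.info()
print(concrete_df_XXX.isnull().sum())

# Replace empty cells with the column mean (dropna() is the alternative)
concrete_df_XXX = concrete_df_XXX.fillna(concrete_df_XXX.mean())

# Convert the dataframe to a NumPy array
concrete_array = concrete_df_XXX.to_numpy()

# 80/20 split in row order: first 80% of rows for training, the rest for testing
split_index = int(0.8 * len(concrete_array))
train_data_XXX = concrete_array[:split_index]
test_data_XXX = concrete_array[split_index:]
print(train_data_XXX.shape, test_data_XXX.shape)
```

Replace `XXX` with whatever suffix the assignment requires, and swap the synthetic dataframe for the real file once it has been downloaded.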
b) Data Analysis:
- Calculate the correlation between the variables in the dataframe.
- Utilize NumPy and Pandas to calculate summary statistics of the data such as maximum, minimum, standard deviation, average, median and mode of each category.
- Use Pandas functions like .describe() for an overview of summary statistics and apply NumPy functions for specific calculations.
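One possible way to approach part (b) is sketched below. The dataframe is a hypothetical stand-in with made-up column names and values, not data from the real concrete dataset. The sketch uses `.corr()` and `.describe()` from Pandas, NumPy functions for the individual statistics, and Pandas `.mode()` for the mode, since NumPy has no built-in mode function.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in dataframe; values are made up for illustration only
concrete_df_XXX = pd.DataFrame({
    "Cement": [540.0, 332.5, 198.6, 266.0, 380.0, 266.0],
    "Water": [162.0, 228.0, 192.0, 228.0, 228.0, 192.0],
    "Strength": [79.99, 40.27, 44.30, 45.85, 43.70, 47.03],
})

# Pairwise (Pearson) correlation between all numeric columns
correlation = concrete_df_XXX.corr()
print(correlation)

# Quick overview: count, mean, std, min, quartiles, max
print(concrete_df_XXX.describe())

# Specific statistics with NumPy, computed per column (axis=0)
data = concrete_df_XXX.to_numpy()
summary = pd.DataFrame({
    "max": np.max(data, axis=0),
    "min": np.min(data, axis=0),
    # ddof=1 gives the sample standard deviation, matching .describe();
    # NumPy's default (ddof=0) is the population standard deviation
    "std": np.std(data, axis=0, ddof=1),
    "mean": np.mean(data, axis=0),
    "median": np.median(data, axis=0),
}, index=concrete_df_XXX.columns)

# .mode() can return several rows when values tie; keep the first (smallest)
summary["mode"] = concrete_df_XXX.mode().iloc[0]
print(summary)
```

For continuous measurements every value may be unique, in which case the mode is not very informative and `.mode()` simply returns all values in sorted order.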
c) Visualization:
- Use Matplotlib to create visualizations such as line plots for train and test data across all categories.
- Generate histogram plots and box plots for all variables.
- Ensure that the visualizations are clear, informative, and aesthetically pleasing.
- Customize your plots by adding titles, labels, and legends.
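A minimal sketch of the plots in part (c) might look like the following. It again assumes a synthetic dataframe with hypothetical column names and an in-order 80/20 split. The non-interactive `Agg` backend is selected only so the script runs without a display; remove that line and call `plt.show()` at the end to view the figures interactively.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open plot windows
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data; column names are assumptions
rng = np.random.default_rng(1)
columns = ["Cement", "Water", "Strength"]
concrete_df_XXX = pd.DataFrame(rng.uniform(10, 100, size=(50, 3)), columns=columns)
data = concrete_df_XXX.to_numpy()
split_index = int(0.8 * len(data))
train_data_XXX, test_data_XXX = data[:split_index], data[split_index:]

# Line plots: one subplot per column, train and test drawn on the same axes
fig_lines, axes = plt.subplots(len(columns), 1, figsize=(8, 8), sharex=True)
for i, (ax, name) in enumerate(zip(axes, columns)):
    ax.plot(np.arange(split_index), train_data_XXX[:, i], label="Train")
    ax.plot(np.arange(split_index, len(data)), test_data_XXX[:, i], label="Test")
    ax.set_title(f"{name}: train vs. test")
    ax.set_ylabel(name)
    ax.legend()
axes[-1].set_xlabel("Sample index")
fig_lines.tight_layout()

# Histograms: one per variable
fig_hist, hist_axes = plt.subplots(1, len(columns), figsize=(12, 4))
for ax, name in zip(hist_axes, columns):
    ax.hist(concrete_df_XXX[name], bins=10, edgecolor="black")
    ax.set_title(f"Histogram of {name}")
    ax.set_xlabel(name)
    ax.set_ylabel("Frequency")
fig_hist.tight_layout()

# Box plots: one box per column of the array
fig_box, box_ax = plt.subplots(figsize=(8, 4))
box_ax.boxplot(data)
box_ax.set_xticklabels(columns)
box_ax.set_title("Box plots of all variables")
box_ax.set_ylabel("Value")
fig_box.tight_layout()

# To keep the figures, call e.g. fig_lines.savefig("line_plots.png")
# (file name is illustrative) or plt.show() with an interactive backend.
```

Variables measured on very different scales may be easier to read with one box plot per subplot instead of a shared axis.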