CT127-3-2-PFDA Retail Data Analytics Assignment: Customer Ratings Case Study for R-Based Insight Generation
University | Asia Pacific University of Technology and Innovation (APU) |
Subject | CT127-3-2-Programming For Data Analysis |
1.0 COURSEWORK TITLE
Retail Transactional Data Analysis and Ratings Classification
THE COURSEWORK OVERVIEW
For the assignment, you are asked to explore the application of data analytics techniques to the dataset that is provided. You must study data problems related to the dataset, giving special consideration to the unique properties of the problem domain, and testing one or more techniques on it.
Your analysis needs to be thorough and go beyond the scope of what has been covered in this course. You should incorporate data exploration, manipulation, transformation, and visualization concepts with data analysis techniques in your solution. It is crucial to provide explanations and justifications for the chosen techniques.
You may also need to pre-process your data to get it into an appropriate format. The assignment should apply several techniques, organized under different criteria, with a detailed exploration of the commands used for each criterion. Outline the findings, analyze them, and justify them with an appropriate graph. A supporting document is also needed to present the graphs and code using R programming concepts.
2.0 OBJECTIVES OF THIS COURSEWORK
This assignment will help you to explore and analyze a set of data and reconstruct it into meaningful representations for decision-making.
3.0 TYPE
Group Assignment (2–4 members)
4.0 COURSEWORK DESCRIPTION
This retail dataset consists of consumer demographics and transactional/purchasing history. As a data analyst in the retail industry, you have been commissioned to conduct an in-depth analysis of consumers from various backgrounds and study their behavioural profile with the given dataset to identify the factors that measure customer satisfaction, represented in the form of product ratings by consumers, and provide recommendations to stakeholders.
Techniques
The dataset provided for this assignment consists of customer personal information (i.e., Name, Email, Phone, City, …), demographics (Age, Gender, Income, …) along with their purchasing behaviour (Total Purchases, Total Amount, Product Brand, Product Type, Feedback, Payment Method, …). In addition to the techniques (data exploration, manipulation, transformation, and visualization techniques) covered in the course to conduct analysis, you might consider exploring and implementing more advanced concepts to enhance the effectiveness of data retrieval, especially if it fits your requirements.
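As a starting point for the exploration techniques mentioned above, the following is a minimal base R sketch. It uses a small made-up data frame rather than the real dataset, and the column names (`Age`, `Gender`, `Income`, `Total_Amount`, `Ratings`) are assumptions that only echo the fields listed in the brief; the actual file would normally be loaded with `read.csv()` and may use different names.

```r
# Toy stand-in for the retail dataset. Column names and values are
# hypothetical and only echo the fields named in the brief.
retail <- data.frame(
  Age = c(23, 35, 41, 29, 52, 35),
  Gender = c("Female", "Male", "Female", "Male", "Female", "Male"),
  Income = c("Low", "Medium", "High", "Medium", "High", "Medium"),
  Total_Amount = c(120.5, 340.0, 910.2, 255.9, 780.0, 340.0),
  Ratings = c(2, 3, 5, 3, 4, 3),
  stringsAsFactors = FALSE
)

# Structure: column types and a preview of the values
str(retail)

# Summary statistics for the numeric columns
summary(retail[, c("Age", "Total_Amount", "Ratings")])

# Frequency table for a categorical column
income_counts <- table(retail$Income)
print(income_counts)

# Number of missing values in each column
missing_per_column <- colSums(is.na(retail))
print(missing_per_column)

# Mean rating within each income group
mean_rating_by_income <- tapply(retail$Ratings, retail$Income, mean)
print(mean_rating_by_income)
```

Checks like these help decide which columns need cleaning before any objective-specific analysis is attempted.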
5.0 GENERAL REQUIREMENTS
- This is a group assignment, with a maximum of 4 students in a group.
- You should state your hypotheses and objectives. Each member is responsible for one objective involving at least one independent variable and one dependent variable.
- The R program should compile and be executed without errors.
- Validation should be done for each entry from the users to avoid logical errors.
- Do not use third-party tools such as Excel or OpenRefine to pre-process or clean the data. Cleaning and pre-processing must be done in R using scripting.
- No duplication is allowed in the dataset.
- You should:
- Include good programming practices such as comments, variable naming conventions, and indentation.
- Perform additional research to further understand the information on the given dataset during evaluation of the data.
- The analysis should be meaningful and effective in providing the information for decision-making.
- Any additional features implemented must improve the retrieval effects.
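The requirements on validation, in-script cleaning, and duplicate removal could be approached along the following lines. This is only a sketch on invented sample data: the column names, the 1 to 5 rating scale, and the 0 to 120 age range are assumptions made for illustration, not rules taken from the dataset.

```r
# Hypothetical sample with the kinds of problems the requirements mention:
# an exact duplicate row, a missing value, and out-of-range entries.
raw <- data.frame(
  Customer_ID = c(101, 102, 102, 103, 104, 105),
  Age = c(25, 34, 34, -3, 47, NA),
  Ratings = c(4, 5, 5, 3, 9, 2),
  Payment_Method = c("Cash", "credit card", "credit card",
                     "Cash ", "PayPal", "Cash"),
  stringsAsFactors = FALSE
)

# 1. Standardise text so that "Cash" and "Cash " are treated as the same value
raw$Payment_Method <- tolower(trimws(raw$Payment_Method))

# 2. Remove exact duplicate rows
deduped <- raw[!duplicated(raw), ]

# 3. Validate each entry against an assumed rule
#    (age between 0 and 120, rating between 1 and 5)
valid_age <- !is.na(deduped$Age) & deduped$Age >= 0 & deduped$Age <= 120
valid_rating <- !is.na(deduped$Ratings) &
  deduped$Ratings >= 1 & deduped$Ratings <= 5

# 4. Keep the valid rows and set the others aside for review
clean <- deduped[valid_age & valid_rating, ]
rejected <- deduped[!(valid_age & valid_rating), ]

print(clean)
print(rejected)
```

Keeping the rejected rows in a separate data frame, rather than silently dropping them, makes it easier to justify each cleaning decision in the report.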
DELIVERABLES:
The complete RScript (source code) and report must be submitted to the APU Learning Management System (Moodle).
5.1 RScript (Program Code):
- Name the file under your group number.
- Start the first few lines of your program with all members' names and TP numbers. For example:
# Name1, TP000001
# Name2, TP000002
# Name3, TP000003
# Name4, TP000004
- For each objective, provide the student's name and ID and explain what you want to discover. For example:
Hypothesis 1: Customer segments with higher purchasing power (e.g., premium or frequent buyers) tend to give higher ratings compared to segments with lower purchasing power.
Objective 1: To evaluate the relationship between customer segments and level of ratings. NAME, TPXXXXXX
Analysis 1-1: Is there any correlation between different customer segments and level of ratings?
Analysis 1-2: Is customer segment a key predictor for ratings?
Analysis 1-3: What are the external factors, if any, that share a causal relationship with customer segment to influence purchase ratings?
- For each extra feature, provide an ID and explanation. For example:
# Extra feature 1
# comments about the extra feature
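One possible way to approach Analysis 1-1 from the example objective above is sketched here on invented data. The segment labels (`New`, `Regular`, `Premium`), their ordering, and the column names are assumptions. Because ratings are ordinal, the sketch uses a Spearman rank correlation and a Kruskal-Wallis test; these are one reasonable choice, not a technique prescribed by the brief.

```r
# Hypothetical ratings for three customer segments (made-up values)
survey <- data.frame(
  Segment = c("New", "New", "New", "New",
              "Regular", "Regular", "Regular", "Regular",
              "Premium", "Premium", "Premium", "Premium"),
  Ratings = c(2, 3, 2, 1, 3, 4, 3, 3, 5, 4, 5, 4)
)

# Treat segment as ordered by assumed purchasing power
survey$Segment <- factor(survey$Segment,
                         levels = c("New", "Regular", "Premium"),
                         ordered = TRUE)

# Descriptive step: median rating in each segment
median_by_segment <- tapply(survey$Ratings, survey$Segment, median)
print(median_by_segment)

# Spearman rank correlation between segment order and rating.
# exact = FALSE requests the approximate p-value, which is needed with ties.
spearman <- cor.test(as.integer(survey$Segment), survey$Ratings,
                     method = "spearman", exact = FALSE)
print(spearman)

# Kruskal-Wallis test: do rating distributions differ between segments?
kw <- kruskal.test(Ratings ~ Segment, data = survey)
print(kw)
```

A significant result here indicates association only. Questions about causal factors, such as Analysis 1-3, need additional variables and a separate justification.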
5.2 Documentation (report):
A 5,500-word report (maximum 55 pages), including appendices. The report should comprise the following content.
A) Cover Page:
All reports must be prepared with a front cover. A protective transparent plastic sheet can be placed in front of the report to protect the front cover. The front cover should be presented with the following details:
- Module
- Coursework Title
- Intake
- Students' names and IDs
- Date Assigned (the date the report was handed out).
- Date Completed (the date the report is due to be handed in).
B) Contents:
- Introduction
  - Data Description
  - Assumptions (if any)
  - Hypothesis and Objectives
- Data Preparation
  - Data import
  - Cleaning / pre-processing (if necessary)
  - Data Validation (if necessary)
- Data Analysis
  - Each objective (along with student name) must start on a separate page and contain:
    - Analytical technique(s), e.g. descriptive statistics
    - Justification of technique(s)
    - Screenshot of source code with output/plot
    - Outline of the findings based on the results obtained
  - The extra feature explanation must be on a separate page and contain:
    - Screenshot of source code with output/plot
    - Explanation of how adding this extra feature can improve the results
  - Interpret the results from each analysis
- Conclusion
  - Overall discussion on the findings from all objectives
  - Recommendation
  - Limitation and future direction
  - State the word count (at the end of page)
C) Workload Matrix
D) References
- You may source algorithms and information from the Internet or books. Proper referencing of the resources should be evident in the document.
- All references must use the APA (American Psychological Association) referencing style.
6.0 ASSIGNMENT ASSESSMENT CRITERIA
The assignment assessment consists of two major components: Analysis (70%) and Finding and Discussion (30%). Details of the division for each component are as follows:
Component | Criteria | Details | Weight |
---|---|---|---|
Analysis (70%) | Analysis techniques | Approaches used to process, interpret, and extract insights from data. Report content: methodology description. | 10% |
Analysis (70%) | Analysis methods | Specific process employed to carry out the analysis (transform raw data into meaningful insights), e.g. exploratory data analysis, hypothesis test, descriptive statistics. Report content: RScript code snippets and visualization screenshots with explanation. | 60% |
Finding and Discussion (30%) | Finding and discussion | Project introduction (description, assumption, hypothesis, objectives); conclusion (result finding, discussion, recommendation and future direction); structure of the report and references. | 30% |
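The analysis methods criterion asks for code snippets accompanied by visualizations. A minimal sketch of that pairing in base R is given below. The data are invented, `Product_Type` and `Ratings` are assumed column names, and the plot is written to a temporary PDF file only so that the example runs without an interactive session.

```r
# Hypothetical ratings for three product types (made-up values)
ratings <- data.frame(
  Product_Type = c("Clothing", "Clothing", "Clothing",
                   "Electronics", "Electronics", "Electronics",
                   "Grocery", "Grocery", "Grocery"),
  Ratings = c(4, 5, 3, 2, 3, 2, 4, 4, 5)
)

# Descriptive statistics: mean rating for each product type
mean_by_type <- aggregate(Ratings ~ Product_Type, data = ratings, FUN = mean)
print(mean_by_type)

# Visualization: distribution of ratings per product type, saved to a file
plot_file <- file.path(tempdir(), "ratings_by_product_type.pdf")
pdf(plot_file)
boxplot(Ratings ~ Product_Type, data = ratings,
        main = "Ratings by product type",
        xlab = "Product type", ylab = "Rating")
invisible(dev.off())
```

In RStudio the `pdf()` and `dev.off()` calls can be dropped so that the plot appears in the Plots pane, from which a screenshot can be taken for the report.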
7.0 DEVELOPMENT TOOLS
The program for this assignment should be written in RStudio.
8.0 ACADEMIC INTEGRITY
- You are expected to maintain the utmost level of academic integrity during the duration of the course.
- Plagiarism is a serious offence and will be dealt with according to APU and De Montfort University regulations on plagiarism. (20%)