Bachelor of ICT/BI
Data Science curriculum
Assignment description and subassignments
Preconditions, structure and assessment
Final project Data Science (30 EC)
Integral final assignment
This learning line covers many aspects of Data Science, from an introductory course that lays the groundwork through to reporting on and visualizing the results of a data analysis. To complete this learning line you will need to demonstrate the following learning outcomes:
1. Introduction to data science
The student demonstrates knowledge and understanding of the creation and use of (big) data. Also, the student has knowledge of the various applications in the field of data science. (LU1)
2. Creating value with data analytics
The student knows how big data can transform an organization. The student can choose the appropriate metrics to add value to an organization. (LU2)
3. Dealing with data
The student demonstrates knowledge and understanding of data integration and all its components. The student has knowledge of legal, privacy, security and ethical guidelines regarding the use and processing of data. (LU3)
4. Data analysis and forecasting
The student understands and applies methods and techniques necessary to design and perform statistical analysis for the purpose of a data study. The student also applies techniques to discover patterns in data. (LU4)
5. Data within the organization
The student knows what is required for the transformation to a data-driven organization and which platforms and technical tools can help realize it. (LU5)
6. Storytelling and data visualization
The student can apply various forms of expression and visualization models to make data insightful using images. The student also demonstrates the ability to tailor data visualization to the target audience. (LU6)
To successfully complete the Data Science curriculum, a passing grade is required for the integral final assignment; partial assignments cannot be used to complete individual courses.
General assignment description
You are going to write a report based on a data research project you are going to conduct. To begin, you’ll select a problem you want to address and draft research questions. You will think about value creation and how your research will add value to the organization. Next, you find appropriate dataset(s) that will help you find answers to your questions. After creating the research design, it is time to examine the data. You select appropriate calculations and perform them. You will present your results in table and graph form, after which you can draw conclusions. You will also write recommendations for the organization and address how the organization handles data research. Finally, you prepare a presentation for the employees, in which you inform them about your research and its results.
Deliverables (as individual documents totaling up to 50 MB, in Teams):
• Research Report
• PowerPoint presentation (5-10 slides)
You will use the case study of the fictional company Wire Solutions (see Edhub, documents). If you prefer to do your research within your own organization, you can ask the teacher for approval. Experience has shown that data research within your own organization is often difficult (for example, because of sensitive information that is not readily shared).
Sub-task 1. Research design.
You select a problem that Wire Solutions is facing and that you think you can address with data research. For example, an internal problem is the large number of IT failures that disrupt work processes and thus compromise quality and efficiency. An external problem, for example, is Wire Solutions’ desire and need to grow in the area of business transactions, without currently knowing exactly where that growth is possible. You are free to draw a different problem from the case study for your research.
You will describe the chosen problem and how you believe data research can play an important role in solving it. In doing so, you will address value creation: how does the data help Wire Solutions take advantage of its opportunities and meet its challenges, either internally or externally? You describe and argue the value-to-firm (V2F) and the value-to-customer (V2C), and can also choose to include the value-to-society (V2S) in your description.
Then you formulate your research question and elaborate on it in (at least) 2 sub-questions. Answering the sub-questions leads to an answer to the research question as a whole and thus to a solution for the stated (internal or external) problem. Be explicit about the added value that is created with the help of the data analysis to be carried out.
After identifying the problem and the research questions, you will search for the required dataset(s). You can use the data made available by CBS, the World Bank, PDOK, Kaggle and NOVI. Of course you may also use dataset(s) from other sources. Describe the dataset(s) you have selected, addressing:
• The type of data (elements)
• The size
• The 5Vs (Volume, Variety, Velocity, Veracity, Value)
• The origin (sources)
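As an illustration of how the type of data and the size could be summarized before you describe them in the report, the sketch below profiles a small CSV sample. The sample data, the column names and the helper functions `infer_type` and `profile` are hypothetical and are not part of the Wire Solutions case study; a real dataset would be read from a file instead.

```python
import csv
import io

# Hypothetical sample standing in for a real dataset file (not from the case study).
SAMPLE_CSV = """ticket_id,department,downtime_minutes,reported_on
1,Sales,42,2023-01-09
2,Finance,,2023-01-11
3,Sales,17.5,2023-02-02
"""

def infer_type(values):
    """Classify a column as 'integer', 'decimal' or 'text', ignoring empty cells."""
    filled = [v for v in values if v.strip()]
    if not filled:
        return "empty"
    for label, cast in (("integer", int), ("decimal", float)):
        try:
            for v in filled:
                cast(v)
            return label
        except ValueError:
            continue
    return "text"

def profile(csv_text):
    """Return the row count plus, per column, the inferred type and the number of missing cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = {}
    for name in reader.fieldnames:
        values = [(row[name] or "") for row in rows]
        columns[name] = {
            "type": infer_type(values),
            "missing": sum(1 for v in values if not v.strip()),
        }
    return {"rows": len(rows), "columns": columns}

summary = profile(SAMPLE_CSV)
print(summary)
```

A profile like this only covers the data types and the size. The 5Vs and the origin of the data still need to be described and argued in your own words.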
Also describe how you take into account the legal, privacy, security and ethical aspects of data research within your research. Describe and substantiate where you think the challenges and risks lie in these areas. Take into account current events and address developments in, for example, legislation and regulations concerning data research.
• Problem statement, including reasoned value creation through data research (incl. V2F and V2C);
• Research questions, developed into (at least) 2 sub-questions;
• Description of dataset(s);
• Description of legal, privacy, security and ethical aspects, including challenges and risks.
Sub-task 2. Data analysis and visualization.
After describing your research design and selecting the data set(s), you actually analyze the data. You formulate and argue the conditions you set for the data (for example missing data, age of the data, wrong data) and on this basis you clean up your data. Then you can get started with the analysis. For this you use several statistical calculations, at least the mean, the standard deviation, regression and correlation. Explain how you will use these calculations to draw your conclusions. If other statistical calculations are necessary to answer your research questions, then of course you carry these out as well (for example, to demonstrate causal relationships). You argue why additional calculations are or are not necessary in your research.
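The cleaning step and the four required calculations could look roughly like the sketch below. It is only an illustration under stated assumptions: the observations, the cleaning conditions (no missing values, no negative numbers) and the helper names `clean`, `pearson_r` and `least_squares` are hypothetical, and your own conditions and data will differ.

```python
import math
import statistics

# Hypothetical observations: (IT failures per month, lost working hours per month).
# None marks a missing value; the negative count is an obviously wrong entry.
raw = [(3, 12.0), (5, 19.5), (None, 8.0), (8, 31.0), (-1, 4.0), (6, 24.5), (4, None)]

def clean(records):
    """Keep only records that satisfy the stated conditions:
    no missing values and no negative numbers."""
    return [
        (x, y) for x, y in records
        if x is not None and y is not None and x >= 0 and y >= 0
    ]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = slope * x + intercept."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

data = clean(raw)
xs = [x for x, _ in data]
ys = [y for _, y in data]

mean_y = statistics.mean(ys)
stdev_y = statistics.stdev(ys)  # sample standard deviation (n - 1 in the denominator)
r = pearson_r(xs, ys)
slope, intercept = least_squares(xs, ys)

print(f"records kept: {len(data)} of {len(raw)}")
print(f"mean={mean_y:.2f} stdev={stdev_y:.2f} r={r:.3f}")
print(f"regression: y = {slope:.2f} * x + {intercept:.2f}")
```

Note that a high correlation on its own does not demonstrate a causal relationship, which is why additional calculations may be needed for some research questions.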
Then you choose which types of graphs you want to use to visualize the results of your calculations. You argue your choice and build the graphs. You will use at least 2 different types of graphs and build at least 4 graphs in total.
• Argued conditions to the data;
• Reasoned choice of statistical calculations;
• Performed calculations (in tabular form), including the algorithms;
• Reasoned choice of graphs;
• The graphs (at least 2 different types, at least 4 graphs in total).
Sub-task 3. Predictions and recommendations.
Based on your data analysis, you will answer the research questions. You argue the conclusions you have drawn based on the data. You will include the value of your correlation coefficient and the regression found. You also describe which improvements are possible in the data to realize even more value creation. You make predictions based on your conclusions: what can be expected in the future, based on these results? You then convert these predictions into recommendations to the management of Wire Solutions.
You will describe and argue whether or not Wire Solutions is ‘data-driven’ and provide advice on how to further leverage data research for the organization. You will also describe and argue which platforms and technical tools could be used to take the data ecosystem to the next level.
• Conclusions and recommendations;
• Description and argumentation regarding data-driven action and the data ecosystem;
Below you will find a number of preconditions which your research report must meet. These preconditions have a mandatory character.
• Keep spelling and grammatical errors to an absolute minimum, and have someone else proofread your work before turning it in;
• Ensure a reader-friendly layout;
• Make relevant use of images, charts, tables, etc.;
• Preferably convert your report to PDF (this does not apply to the PowerPoint presentation);
• Use the APA convention for in-text references and bibliography. See, e.g., https://www.scribbr.nl/category/apa-stijl/
• The size of the overall report is between 6,000 and 9,000 words (excluding table of contents, executive summary, introduction, bibliography, and any appendices);
• Deliver the report and PowerPoint presentation in Teams, as separate files totaling up to 50 MB.
Note regarding size/word count: this is not a goal in itself. You are encouraged to focus on making your report as concise as possible, but of course without compromising the achievement of the learning outcomes or the accessibility (readability) of the report.
For the final assignment, the following structure is recommended:
In any case, state on the title page:
• Student number
• Submission date
• Curriculum title and subtitle, if any
The summary, also called the management summary, is intended, for example, for managers who do not read the entire research report. The summary provides an overview of what is described in the research report. It also helps the reader understand what is described in the research report and place it in context.
Table of Contents
Include an organized table of contents with correct page numbering. Please also list any attachments here.
The introduction, like the table of contents, acts as a road map for the reader; here you explain what the reader can expect from your research report. You describe the topic and the question of the data research. You also discuss your research design.
Here you work out the sub-tasks. It is recommended that you work out the assignments logically and clearly. Explain step by step how you arrived at your opinion so that the reader can keep up. Try to emphasize facts with examples and substantiate your arguments.
Conclusion and Recommendations.
The conclusion is the last part of your paper. It is not merely a summary: briefly name and connect the main points so that a conclusion can be formed. Do not repeat parts of your paper literally; aim for a shorter formulation. You will also elaborate on your recommendations in this section of your report.
The final assignment will be assessed based on the following assessment criteria. For each criterion, the assessor awards a number of points. Both the actual realization of the learning outcome and the student’s demonstrated understanding of the related theory are considered.
|# Partial assignment||Learning outcomes and assessment criteria||Weighting||Score in figures of 1 to 10||Weighting times score|
|1. Research Design||This sub-task covers aspects of the courses Introduction to data science (LU1), Creating value with data analytics (LU2), Dealing with data (LU3) and Data analysis and forecasting (LU4).||30%|
|Criterion 1.1||The student describes a clear problem statement and argues the assumed value creation by means of correctly worked out V2F and V2C. (LU1, LU2)||10%|
|Criterion 1.2||The student formulates appropriate, high-quality research questions (main question + at least 2 sub-questions). (LU1)||10%|
|Criterion 1.3||The student will clearly and correctly describe the data(sets), in terms of data type, size, source reference and the 5Vs. (LU3)||5%|
|Criterion 1.4||The student describes and argues relevant legal, privacy, security and ethical aspects and, in doing so, identifies relevant risks and challenges. (LU3, LU4)||5%|
|Feedback by the teacher|
|2. Data Analysis and visualization||This sub-task covers aspects of the courses Dealing with data (LU3), Data analysis and forecasting (LU4) and Storytelling and data visualization (LU6).||50%|
|Criterion 2.1||The student describes and argues the conditions imposed on the data and cleans up the data correctly (if necessary). (LU3, LU4)||10%|
|Criterion 2.2||The student selects appropriate statistical calculations and argues the choices. (LU4)||10%|
|Criterion 2.3||The student performs the calculations correctly and completely and presents the results in clear tabular form, including associated algorithms. (LU4)||10%|
|Criterion 2.4||The student selects appropriate types of graphs and argues the choices made. (LU4, LU6)||10%|
|Criterion 2.5||The student builds accurate graphs (at least 2 different types and at least 4 in total), which show the results found in a clear and logical way. (LU4, LU6)||10%|
|Feedback by the teacher|
|3. Predictions and recommendations||This sub-task covers aspects of the courses Data analysis and forecasting (LU4), Data within the organization (LU5) and Storytelling and data visualization (LU6).||20%|
|Criterion 3.1||The student will draw appropriate conclusions based on and substantiated by the calculations performed (mean, standard deviation, correlation and regression) and results of any additional calculations. (LU4)||5%|
|Criterion 3.2||The student will describe logical and realistic recommendations, based on the research findings. (LU4, LU5)||5%|
|Criterion 3.3||The student describes the extent of data-driven action and the data ecosystem, and provides reasoned, appropriate and realistic advice on how these can be taken to the next level. (LU5)||5%|
|Criterion 3.4||The student prepares a clear presentation, using appropriate visualizations and texts that match the target audience. (LU6)||5%|
|Feedback by the teacher|
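The "Weighting times score" column can be read as a weighted sum over the criteria, as in the sketch below. The weights are taken from the table above. The scores are hypothetical, and it is an assumption that the overall result is simply the sum of that column: this document does not specify rounding or a pass mark.

```python
# Weights in percent, taken from the assessment table above.
WEIGHTS = {
    "1.1": 10, "1.2": 10, "1.3": 5, "1.4": 5,
    "2.1": 10, "2.2": 10, "2.3": 10, "2.4": 10, "2.5": 10,
    "3.1": 5, "3.2": 5, "3.3": 5, "3.4": 5,
}

def weighted_total(scores):
    """Sum of weighting times score over all criteria, on the 1 to 10 scale.
    Assumes every criterion in WEIGHTS has a score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"no score for criteria: {sorted(missing)}")
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / 100

def subtask_weight(prefix):
    """Total weight in percent of all criteria belonging to one sub-task, e.g. '2'."""
    return sum(w for c, w in WEIGHTS.items() if c.split(".")[0] == prefix)

# Hypothetical scores, for illustration only.
example_scores = {criterion: 7 for criterion in WEIGHTS}
example_scores["2.3"] = 9

total = weighted_total(example_scores)
print(f"sub-task weights: {subtask_weight('1')}%, {subtask_weight('2')}%, {subtask_weight('3')}%")
print(f"weighted total: {total:.1f}")
```

Because sub-task 2 carries half of the total weight, a higher score on one of its criteria moves the overall result twice as much as the same improvement on a 5% criterion.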