Much has been said and written about data science roles and the growth of the data science group of occupations [1]. The work of data science starts with the task of handling data in some way [2]. One place to start, therefore, is to look in detail at the range of data tasks that exist at work. The O*NET occupational database [3] lists 19,695 work tasks, of which 788 (4% of all tasks) involve handling data in some way. It is this group of 788 tasks that is used here.
These 788 tasks are undertaken across 343 different occupations (O*NET SOC Codes), of which 87 (25%) undertake three or more of the tasks. The occupations undertaking the greatest number of data-related tasks fall into three main categories: health, geospatial, and general (two database roles plus statistician).
Table 1: Occupations undertaking the greatest number of data-related tasks [4]
|Number of Tasks||Occupation|
|15||Clinical Data Managers; Geographic Information Systems Technicians; Data Warehousing Specialists|
|12||Remote Sensing Scientists and Technologists; Remote Sensing Technicians; Geophysical Data Technicians|
|9||Geospatial Information Scientists and Technologists|
Source: O*NET v24
The 87 occupations undertaking three or more data-related tasks account for 450 data tasks, i.e. 57% of the data tasks in the O*NET dataset. In addition, most of the data tasks are core to the role (655, or 83%), while the remaining 133 (17%) are supplemental.
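The counting behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how the 788 data tasks were identified, so the keyword filter below (the word "data" in the task statement) is an assumption, and the sample rows and occupation codes are hypothetical rather than real O*NET records. The final lines simply re-check the percentages reported above.

```python
import re
from collections import Counter

# Hypothetical sample rows shaped like task statements:
# (occupation code, task statement, task type). Not real O*NET records.
TASKS = [
    ("00-0001.00", "Analyze data to identify trends.", "Core"),
    ("00-0001.00", "Enter data into databases.", "Core"),
    ("00-0001.00", "Verify accuracy of collected data.", "Supplemental"),
    ("00-0002.00", "Compile data for reports.", "Core"),
    ("00-0002.00", "Repair mechanical equipment.", "Core"),
    ("00-0003.00", "Schedule staff meetings.", "Supplemental"),
]

# Assumption: a task counts as a "data task" if the statement contains
# the standalone word "data". The article's actual rule is not stated.
DATA_WORD = re.compile(r"\bdata\b", re.IGNORECASE)


def summarise(rows, threshold=3):
    """Count data tasks per occupation and the share that are core."""
    selected = [row for row in rows if DATA_WORD.search(row[1])]
    per_occupation = Counter(code for code, _, _ in selected)
    heavy = {c: n for c, n in per_occupation.items() if n >= threshold}
    core = sum(1 for _, _, task_type in selected if task_type == "Core")
    return {
        "data_tasks": len(selected),
        "occupations": len(per_occupation),
        "heavy_occupations": len(heavy),
        "tasks_in_heavy": sum(heavy.values()),
        "core_share": core / len(selected) if selected else 0.0,
    }


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


summary = summarise(TASKS)

# Re-check the figures reported in the text.
reported = {
    "data tasks / all tasks": pct(788, 19695),
    "occupations with 3+ data tasks": pct(87, 343),
    "data tasks held by those occupations": pct(450, 788),
    "core data tasks": pct(655, 788),
    "supplemental data tasks": pct(133, 788),
}
```

On the toy rows, `summary` reports four data tasks across two occupations, one of which meets the three-task threshold. The `reported` dictionary reproduces the 4%, 25%, 57%, 83% and 17% quoted above.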
Looking at the actual tasks, 107 different action verbs are used to describe them, and these can be analysed using Bloom’s Taxonomy [5] to show the levels of the tasks. Although the level of direct matches is relatively low (34%), there is a reasonably equal distribution across all six levels of the taxonomy (Level 6: 10 matches; Level 5: 6; Level 4: 5; Level 3: 7; Level 2: 6; and Level 1: 2), which suggests that data tasks can be undertaken at multiple levels of complexity and capability.
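The verb matching described above could be approached along the following lines. This is a minimal sketch, not the method used for the analysis: the verb lists are short illustrative samples (assumptions, not the lists used here), the sample task statements are hypothetical, and taking the first word of a statement as its action verb is a simplification. The level names follow the revised taxonomy. The last lines re-check the 34% match rate from the counts quoted in the text.

```python
# Illustrative verb samples per level of the revised Bloom's Taxonomy.
# These short lists are assumptions made for the sketch only.
BLOOM_VERBS = {
    1: {"list", "record"},         # Remember
    2: {"classify", "summarize"},  # Understand
    3: {"apply", "calculate"},     # Apply
    4: {"analyze", "compare"},     # Analyze
    5: {"evaluate", "verify"},     # Evaluate
    6: {"design", "develop"},      # Create
}


def leading_verb(task):
    """Treat the first word of a task statement as its action verb."""
    return task.split()[0].strip(".,;:").lower()


def bloom_level(verb):
    """Return the taxonomy level for a verb, or None if no direct match."""
    for level, verbs in BLOOM_VERBS.items():
        if verb in verbs:
            return level
    return None


# Hypothetical task statements, for illustration only.
sample_tasks = [
    "Analyze data to identify trends.",
    "Design databases to support research projects.",
    "Enter data into databases.",
]
levels = [bloom_level(leading_verb(task)) for task in sample_tasks]

# Matches per level as reported in the text, out of 107 action verbs.
REPORTED_MATCHES = {6: 10, 5: 6, 4: 5, 3: 7, 2: 6, 1: 2}
total_matches = sum(REPORTED_MATCHES.values())
match_rate = round(100 * total_matches / 107)
```

Here `levels` shows two direct matches and one verb ("enter") with no match, which mirrors the point that many verbs fall outside the taxonomy's lists. The reported counts sum to 36 matches, which is 34% of 107.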
Turning to the equivalent level of detail around data science roles in the UK, the Institute for Apprenticeships and Technical Education is seeking to offer standards across 4 different levels, and these are detailed below.
|Occupation Standard Title||Level||Status|
|Data Scientist (integrated degree)||6||Approved|
|Data Technician||3||In development|
|Artificial Intelligence Data Specialist||7||In development|
|Geospatial Mapping and Science Specialist (degree)||6||Approved|
|Geospatial Survey Technician||3||Approved|
What emerges from these various listings of occupations is a series of groupings of key data roles. The number of roles in each grouping ranges between 4 [7] and 8 [8] (6 were identified as well [10]), and the roles possess the following types of skills, found in the changing world of research information scientists and librarians (the data gatherers, custodians and providers of the pre-digital age) [9].
|Groups of Data Science Roles|
|O’Reilly Strata (2013) [7]||The Royal Society (2019) [10]||Tech Partnership (2014) [8]|
|Data business people; Data creatives; Data developers; Data researchers||Data Scientist and Advanced Analysts; Data Systems Developers; Data-Driven Decision Makers||Big data developer; Big data architect; Big data analyst; Big data administrator; Big data consultant; Big data Project Manager; Big data designer; Big data scientist|
|Information Scientist and Librarians|
|Skills||Essential Now||Essential in 2-5 years|
|Ability to advise on preserving research output||10%||49%|
|Knowledge to advise on data management and curation, including ingest, discovery, access, dissemination, preservation, and portability||16%||48%|
|Knowledge to support researchers in complying with the various mandates of funders, including open access requirements||16%||40%|
|Knowledge to advise on potential data manipulation tools used in the discipline/subject||7%||34%|
|Knowledge to advise on data mining||3%||33%|
|Knowledge to advocate, and advise on, the use of metadata||10%||29%|
|Ability to advise on the preservation of project records||3%||24%|
|Knowledge of sources of research funding to assist researchers to identify potential funders||8%||21%|
|Skills to develop metadata schema, and advise on discipline/subject standards and practices, for individual research projects||2%||16%|
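The shift between the two columns is easier to see as a percentage-point gap. The sketch below takes the figures from the table above and ranks the skills by how much their "essential" share is expected to grow. The shortened skill labels are paraphrases introduced for the example, not wording from the original survey.

```python
# (skill label, % essential now, % essential in 2-5 years).
# Figures are from the table above; labels are shortened paraphrases.
SKILLS = [
    ("Preserving research output", 10, 49),
    ("Data management and curation", 16, 48),
    ("Funder mandates and open access", 16, 40),
    ("Data manipulation tools", 7, 34),
    ("Data mining", 3, 33),
    ("Use of metadata", 10, 29),
    ("Preservation of project records", 3, 24),
    ("Sources of research funding", 8, 21),
    ("Metadata schema development", 2, 16),
]


def growth_ranking(rows):
    """Rank skills by percentage-point increase, largest first."""
    gaps = [(label, later - now) for label, now, later in rows]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


ranking = growth_ranking(SKILLS)
for label, gap in ranking:
    print(f"{label}: +{gap} percentage points")
```

On these figures the largest expected increase is for advising on preserving research output (39 percentage points), followed by data management and curation (32) and data mining (30).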
One conclusion to draw is that the handling, use and management of data is becoming increasingly widespread across occupations, and that the standards (definitions) being developed for specific data-dense occupations are, at the element level, useful for many others. While the emphasis in data science is very much driven by digital developments, there are still a significant number of data-driven roles from the pre-digital era.
- The Royal Society (2019) Dynamics of Data Science: how can all sectors benefit from data science talent? The Royal Society, London. 104 pages; Ismail, N. A. and Abidin, W. Z. (2016) “Data scientist skills”, IOSR Journal of Mobile Computing and Application, 3 (4), 52-61
- Edison (2016) Edison Data Science Framework: Part 1. Data Science Competence Framework Release 1. Initial output from the project, Education for Data Intensive Science to Open New Science frontiers. Grant Agreement Number: 675419
- O*NET see https://www.onetonline.org
- As you move down the table and the number of data tasks undertaken by an occupation holder falls, the following distribution is found:
- Anderson, L.W. and Krathwohl, D.R. (2001) A taxonomy for learning, teaching, and assessing: a revision of Bloom’s taxonomy of educational objectives. Abridged Edition. Allyn and Bacon, Boston.
- Four are identified in the 2013 O’Reilly Strata Survey: Harris, H.H.; Murphy, S.P. and Vaisman, M. (2013) Analyzing the Analyzers. An introspective survey of data scientists and their work. O’Reilly Strata. See: https://cdn.oreillystatic.com/oreilly/radarreport/0636920029014/Analyzing_the_Analyzers.pdf The four roles are: data business people, data creatives, data developers, and data researchers.
- Eight are identified in the 2014 study, Big Data Analytics: Assessment of demand for labour and skills 2013-2020. Tech Partnership Publications. See: https://www.e-skills.com/Documents/Research/General/BigData_report_Nov14.pdf The eight roles are: big data developer, big data architect, big data analyst, big data administrator, big data consultant, big data project manager, big data designer, and data scientist.
- Auckland, M. (2012) Re-skilling for research. London: RLUK. See: https://www.rluk.ac.uk/files/RLUK%20Re-skilling.pdf
- See: The Royal Society (2019) op. cit.
|Number of Data Tasks||Number of Occupations|