Much has been said and written about data science roles and the growth of the data science group of occupations [1]. The work of data science starts with the task of handling data in some way [2], so one place to start is to look in detail at the range of data tasks that exist at work. The O*NET occupational database [3] lists 19,695 work tasks, of which 788 (4% of all tasks) involve handling data in some way. It is this group of 788 tasks that is used here.

These 788 tasks are undertaken across 343 different occupations (O*NET SOC codes), of which 87 (25%) undertake three or more of the tasks. The occupations undertaking the greatest number of data-related tasks fall into three main categories: health, geospatial, and general (two database roles plus statistician).

Table 1: Occupations undertaking the greatest number of data-related tasks [4]

| Number of Tasks | Occupation |
| --- | --- |
| 15 | Clinical Data Managers; Bioinformatics Technicians |
| 14 | Database Administrators; Geographic Information Systems Technicians; Data Warehousing Specialists |
| 13 | Database Architects |
| 12 | Remote Sensing Scientists and Technologists; Remote Sensing Technicians |
| 11 | None |
| 10 | Statisticians; Biostatisticians; Geophysical Data Technicians |
| 9 | Geospatial Information Scientists and Technologists |

Source: O*NET v24

The 87 occupations undertaking three or more data-related tasks account for 450 data tasks, i.e. 57% of the data tasks in the O*NET dataset. In addition, most of the data tasks are core to the role (655, or 83%), while 133 (17%) are supplemental.
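The figures above can be cross-checked against Table 1 and the distribution in note 4. The following is a minimal sketch, assuming that the 450 figure counts each occupation's tasks separately (so a task shared by two occupations is counted twice); the variable names are illustrative and not part of O*NET.

```python
# Number of data tasks -> number of occupations undertaking that many tasks.
# Counts 9-15 are read from Table 1, counts 3-8 from the distribution in note 4.
occupations_by_task_count = {
    15: 2, 14: 3, 13: 1, 12: 2, 11: 0, 10: 3, 9: 1,
    8: 2, 7: 4, 6: 2, 5: 13, 4: 19, 3: 35,
}

TOTAL_DATA_TASKS = 788   # data-handling tasks in O*NET, as stated in the text
CORE_TASKS = 655         # tasks core to the role
SUPPLEMENTAL_TASKS = 133 # supplemental tasks

# Occupations with three or more data tasks.
occupations = sum(occupations_by_task_count.values())

# Assumption: tasks are summed per occupation, without de-duplication.
task_assignments = sum(
    tasks * count for tasks, count in occupations_by_task_count.items()
)

share_of_data_tasks = task_assignments / TOTAL_DATA_TASKS
core_share = CORE_TASKS / TOTAL_DATA_TASKS
supplemental_share = SUPPLEMENTAL_TASKS / TOTAL_DATA_TASKS

print(occupations, task_assignments, f"{share_of_data_tasks:.0%}")
print(f"core {core_share:.0%}, supplemental {supplemental_share:.0%}")
```

Under that assumption the totals reproduce the 87 occupations, 450 tasks and 57% quoted in the text, and the core and supplemental counts sum to 788.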

Looking at the tasks themselves, 107 different action verbs are used to describe them, and these can be analysed using Bloom’s Taxonomy [5] to show the levels of the tasks. Although the level of direct matches is relatively low (34%), the matches are spread reasonably evenly across the six levels of the taxonomy (Level 6: 10 matches; Level 5: 6; Level 4: 5; Level 3: 7; Level 2: 6; and Level 1: 2), which suggests that data tasks can be undertaken at multiple levels of complexity and capability.
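The 34% match rate follows from the per-level counts quoted above. A short sketch of the arithmetic, using only the figures in the text (the variable names are illustrative):

```python
# Direct matches between task action verbs and Bloom's Taxonomy, by level.
matches_per_level = {6: 10, 5: 6, 4: 5, 3: 7, 2: 6, 1: 2}

ACTION_VERBS = 107  # distinct action verbs used across the 788 data tasks

matched_verbs = sum(matches_per_level.values())
match_rate = matched_verbs / ACTION_VERBS

print(matched_verbs, f"{match_rate:.0%}")
```

The counts sum to 36 matched verbs out of 107, which rounds to the 34% stated.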

When you look at the equivalent level of detail around data science roles in the UK, the Institute for Apprenticeships and Technical Education are seeking to offer standards across 4 different levels, and these are detailed below.

| Occupation Standard Title | Level | Status |
| --- | --- | --- |
| Data Scientist (integrated degree) | 6 | Approved |
| Data Analyst | 4 | Approved |
| Data Technician | 3 | In development |
| Artificial Intelligence Data Specialist | 7 | In development |
| Geospatial Mapping and Science Specialist (degree) | 6 | Approved |
| Geospatial Survey Technician | 3 | Approved |
| Bioinformation Scientist | 7 | Approved |
| Intelligence Analyst | 4 | Approved |

What emerges from these various listings of occupations is a series of groupings of key data roles. The groupings range between four [7] and eight [8] roles (a grouping of six has also been identified [10]), and the roles call on the types of skills found in the changing world of research information scientists and librarians (the data gatherers, custodians and providers of the pre-digital age) [9].

Groups of Data Science Roles

| O’Reilly Strata (2013) [7] | The Royal Society (2019) [10] | Tech Partnership (2014) [8] |
| --- | --- | --- |
| Data business people | Data Scientist and Advanced Analysts | Big data developer |
| Data creatives | Data Analysts | Big data architect |
| Data developers | Data Systems Developers | Big data analyst |
| Data researchers | Analytics Managers | Big data administrator |
| | Functional Analysts | Big data consultant |
| | Data-Driven Decision Makers | Big data Project Manager |
| | | Big data designer |
| | | Big data scientist |
Information Scientist and Librarians

| Skills | Essential Now | Essential in 2-5 years |
| --- | --- | --- |
| Ability to advise on preserving research output | 10% | 49% |
| Knowledge to advise on data management and curation, including ingest, discovery, access, dissemination, preservation, and portability | 16% | 48% |
| Knowledge to support researchers in complying with the various mandates of funders, including open access requirements | 16% | 40% |
| Knowledge to advise on potential data manipulation tools used in the discipline/subject | 7% | 34% |
| Knowledge to advise on data mining | 3% | 33% |
| Knowledge to advocate, and advise on, the use of metadata | 10% | 29% |
| Ability to advise on the preservation of project records | 3% | 24% |
| Knowledge of sources of research funding to assist researchers to identify potential funders | 8% | 21% |
| Skills to develop metadata schema, and advise on discipline/subject standards and practices, for individual research projects | 2% | 16% |

One conclusion to draw is that the handling, use and management of data is becoming increasingly widespread across occupations, and that the standards (definitions) being developed for specific data-dense occupations are, at the element level, useful for many others. While the emphasis in data science is very much driven by digital developments, there are still a significant number of data-driven roles from the pre-digital era.

Notes:

  1. The Royal Society (2019) Dynamics of Data Science: how can all sectors benefit from data science talent? The Royal Society, London. 104 pages; Ismail, N. A. and Abidin, W. Z. (2016) “Data scientist skills”, IOSR Journal of Mobile Computing and Application, 3 (4), 52-61
  2. Edison (2016) Edison Data Science Framework: Part 1. Data Science Competence Framework Release 1. Initial output from the project, Education for Data Intensive Science to Open New Science frontiers. Grant Agreement Number: 675419
  3. O*NET see https://www.onetonline.org
  4. Moving further down the table, the occupations undertaking fewer data tasks are distributed as follows:

     | Number of Data Tasks | Number of Occupations |
     | --- | --- |
     | 8 | 2 |
     | 7 | 4 |
     | 6 | 2 |
     | 5 | 13 |
     | 4 | 19 |
     | 3 | 35 |

  5. Anderson, L.W. and Krathwohl, D.R. (2001) A taxonomy for learning, teaching, and assessing. Abridged Edition. Allyn and Bacon, Boston.
  6. https://www.instituteforapprenticeships.org/apprenticeship-standards/data-analyst/
  7. Four are identified in the 2013 O’Reilly Strata Survey: Harris, H.H.; Murphy, S.P. and Vaisman, M. (2013) Analyzing the Analyzers: An introspective survey of data scientists and their work. O’Reilly Strata. See: https://cdn.oreillystatic.com/oreilly/radarreport/0636920029014/Analyzing_the_Analyzers.pdf The four roles are: data business people, data creatives, data developers, and data researchers.
  8. Eight are identified in the 2014 study, Big Data Analytics: Assessment of demand for labour and skills 2013-2020. Tech Partnership Publications. See: https://www.e-skills.com/Documents/Research/General/BigData_report_Nov14.pdf The eight roles are: big data developer, big data architect, big data analyst, big data administrator, big data consultant, big data project manager, big data designer, and data scientist.
  9. Auckland, M. (2012) Re-skilling for research. London: RLUK. See: https://www.rluk.ac.uk/files/RLUK%20Re-skilling.pdf
  10. See: The Royal Society (2019) op. cit.