This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Performing a literature review is a critical first step in research for understanding the state of the art and identifying gaps and challenges in the field. A systematic literature review is a method that sets out a series of steps to organize the review methodically. In this paper, we present a guide designed for researchers, in particular early-stage researchers, in the computer-science field. The contributions of the article are the following:
Clearly defined strategies to follow for a systematic literature review in computer science research, and
Algorithmic method to tackle a systematic literature review.
Keywords: Systematic literature reviews, literature reviews, research methodology, computer science, doctoral studies
A Systematic Literature Review (SLR) is a research methodology to collect, identify, and critically analyze the available research studies (e.g., articles, conference proceedings, books, dissertations) through a systematic procedure [12]. An SLR updates the reader with current literature about a subject [6]. The goal is to review critical points of current knowledge on a topic in light of the research questions and to suggest areas for further examination [5]. Defining an “Initial Idea” or interest in a subject to be studied is the first step before starting the SLR. An early search of the relevant literature can help determine whether the topic is too broad to cover adequately in the available time frame and whether it is necessary to narrow the focus. Reading some articles can assist in setting the direction for a formal review, and formulating a potential research question (e.g., how is semantics involved in Industry 4.0?) can further facilitate this process. Once the focus has been established, an SLR can be undertaken to find more specific studies related to the variables in this question. Although there are multiple approaches for performing an SLR ([5], [26], [27]), this work aims to provide a step-by-step and practical guide while citing useful examples for computer-science research. The methodology presented in this paper comprises two main phases: “Planning”, described in section 2, and “Conducting”, described in section 3, following the depiction of the graphical abstract.
Defining the protocol is the first step of an SLR since it describes the procedures involved in the review and acts as a log of the activities to be performed. Obtaining opinions from peers while developing the protocol is encouraged, as it helps ensure the review's consistency and validity and helps identify when modifications are necessary [20]. One final goal of the protocol is to ensure the replicability of the review.
The PICOC (Population, Intervention, Comparison, Outcome, and Context) criteria break down the SLR's objectives into searchable keywords and help formulate research questions [27]. PICOC is widely used in the medical and social sciences fields to encourage researchers to consider the components of the research questions [14]. Kitchenham & Charters [6] compiled the list of PICOC elements and their corresponding terms in computer science, as presented in Table 1, which includes keywords derived from the PICOC elements. From that point on, it is essential to think of synonyms or “alike” terms that can later be used for building queries in the selected digital libraries. For instance, the keyword “context awareness” can also be linked to “context-aware”.
Planning Step 1 “Defining PICOC keywords and synonyms”.
PICOC element | Description | Example (PICOC) | Example (Synonyms) |
---|---|---|---|
Population | Can be a specific role, an application area, or an industry domain. | Smart Manufacturing | • Digital Factory • Digital Manufacturing • Smart Factory |
Intervention | The methodology, tool, or technology that addresses a specific issue. | Semantic Web | • Ontology • Semantic Reasoning |
Comparison | The methodology, tool, or technology with which the Intervention is being compared (if appropriate). | Machine Learning | • Supervised Learning • Unsupervised Learning |
Outcome | Factors of importance to practitioners and/or the results that Intervention could produce. | Context-Awareness | • Context-Aware • Context-Reasoning |
Context | The context in which the comparison takes place. Some systematic reviews might choose to exclude this element. | Business Process Management | • BPM • Business Process Modeling |
Clearly defined research question(s) are the key elements which set the focus for study identification and data extraction [21]. These questions are formulated based on the PICOC criteria as presented in the example in Table 2 (PICOC keywords are underlined).
Research questions examples.
Research Questions examples |
---|
• RQ1: What are the current challenges of context-aware systems that support the decision-making of business processes in smart manufacturing? • RQ2: Which technique is most appropriate to support decision-making for business process management in smart factories? • RQ3: In which scenarios are semantic web and machine learning used to provide context-awareness in business process management for smart manufacturing? |
The validity of a study will depend on the proper selection of a database since it must adequately cover the area under investigation [19]. The Web of Science (WoS) is an international and multidisciplinary tool for accessing literature in science, technology, biomedicine, and other disciplines. Scopus is a database that today indexes 40,562 peer-reviewed journals, compared to 24,831 for WoS. Thus, Scopus is currently the largest existing multidisciplinary database. However, it may also be necessary to include sources specific to computer science, such as EI Compendex, IEEE Xplore, and the ACM Digital Library. Table 3 compares the area of expertise of a selection of databases.
Planning Step 3 “Select digital libraries”. Description of digital libraries in computer science and software engineering.
Database | Description | URL | Area | Advanced Search Y/N |
---|---|---|---|---|
Scopus | From Elsevier. One of the largest databases. Very user-friendly interface. | http://www.scopus.com | Interdisciplinary | Y |
Web of Science | From Clarivate. Multidisciplinary database with wide ranging content. | https://www.webofscience.com/ | Interdisciplinary | Y |
EI Compendex | From Elsevier. Focused on engineering literature. | http://www.engineeringvillage.com | Engineering | Y (Query view not available) |
IEEE Digital Library | Contains scientific and technical articles published by IEEE and its publishing partners. | http://ieeexplore.ieee.org | Engineering and Technology | Y |
ACM Digital Library | Complete collection of ACM publications. | https://dl.acm.org/ | Computing and information technology | Y |
Authors should define the inclusion and exclusion criteria before conducting the review to prevent bias, although these can be adjusted later, if necessary. The selection of primary studies will depend on these criteria. Articles are included or excluded in this first selection based on abstract and primary bibliographic data. When unsure, the article is skimmed to further decide the relevance for the review. Table 4 sets out some criteria types with descriptions and examples.
Planning Step 4 “Define inclusion and exclusion criteria”. Examples of criteria type.
Criteria Type | Description | Example |
---|---|---|
Period | Articles can be selected based on the time period to review, e.g., reviewing the technology under study from the year it emerged, or reviewing progress in the field since the publication of a prior literature review. | Inclusion: From 2015 to 2021 Exclusion: Articles prior to 2015 |
Language | Articles can be excluded based on language. | Exclusion: Articles not in English |
Type of Literature | Articles can be excluded if they fall into the category of grey literature. | Exclusion: Reports, policy literature, working papers, newsletters, government documents, speeches |
Type of source | Articles can be included or excluded by the type of origin, i.e., conference or journal articles or books. | Inclusion: Articles from Conferences or Journals Exclusion: Articles from books |
Impact Source | Articles can be included or excluded based on thresholds the authors set for the impact factor or quartile of the source. | Inclusion: Articles from Q1 and Q2 sources Exclusion: Articles with a Journal Impact Score (JIS) lower than x |
Accessibility | Not accessible in specific databases. | Exclusion: Not accessible |
Relevance to research questions | Articles can be excluded if they are not relevant to a particular question or to “n” number of research questions. | Exclusion: Not relevant to at least 2 research questions |
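The criteria in the table above can be applied mechanically once the bibliographic data is in a structured form. The following is a minimal Python sketch under stated assumptions: the `Article` fields, the constant names, and the sample articles are hypothetical, and the thresholds simply reuse the example values from the table. It is an illustration, not a prescribed screening procedure.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record holding the bibliographic fields used for screening."""
    title: str
    year: int
    language: str
    source_type: str   # e.g. "journal", "conference", "book", "report"
    relevant_rqs: int  # number of research questions the article addresses


# Assumed thresholds, reusing the example values from the criteria table.
ALLOWED_YEARS = range(2015, 2022)  # 2015 to 2021 inclusive
ALLOWED_LANGUAGE = "English"
ALLOWED_SOURCES = {"journal", "conference"}
MIN_RELEVANT_RQS = 2


def exclusion_reasons(article):
    """Return the criteria an article fails; an empty list means it is included."""
    reasons = []
    if article.year not in ALLOWED_YEARS:
        reasons.append("period")
    if article.language != ALLOWED_LANGUAGE:
        reasons.append("language")
    if article.source_type not in ALLOWED_SOURCES:
        reasons.append("type of source")
    if article.relevant_rqs < MIN_RELEVANT_RQS:
        reasons.append("relevance")
    return reasons


# Made-up candidates, for illustration only.
candidates = [
    Article("Paper A", 2018, "English", "journal", 3),
    Article("Paper B", 2012, "English", "conference", 2),
    Article("Paper C", 2020, "Spanish", "book", 1),
]
for article in candidates:
    reasons = exclusion_reasons(article)
    print(article.title, "included" if not reasons else f"excluded: {reasons}")
```

Recording the reasons rather than a plain yes/no keeps a log of each decision, which supports the replicability goal of the protocol.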
Assessing the quality of an article requires an artifact that describes how to perform a detailed assessment. A typical quality assessment (QA) is a checklist that contains multiple factors to evaluate. A numerical scale is used to assess the criteria and quantify the QA [22]. Zhou et al. [25] presented a detailed description of assessment criteria in software engineering, classified into four main aspects of study quality: Reporting, Rigor, Credibility, and Relevance. Each of these criteria can be evaluated using, for instance, a Likert-type scale [17], as shown in Table 5. It is essential to select the same scale for all criteria established in the quality assessment.
Planning Step 5 “Define QA assessment checklist”. Examples of QA scales and questions.
Example question | Scale and scores |
---|---|
Example 1: Do the researchers discuss any problems (limitations, threats) with the validity of their results (reliability)? | Level of Participation 1 – No, and not considered (Score: 0) 2 – Partially (Score: 0.5) 3 – Yes (Score: 1) |
Example 2: Is there a clear definition/ description/ statement of the aims/ goals/ purposes/ motivations/ objectives/ questions of the research? | Level of agreement 1 – Disagree (Score: 1) 2 – Somewhat disagree (Score: 2) 3 – Neither agree nor disagree (Score: 3) 4 – Somewhat agree (Score: 4) 5 – Agree (Score: 5) |
The data extraction form represents the information necessary to answer the research questions established for the review. Synthesizing the articles is a crucial step when conducting research. Ramesh et al. [15] presented a classification scheme for computer science research, based on topics, research methods, and levels of analysis that can be used to categorize the articles selected. Classification methods and fields to consider when conducting a review are presented in Table 6.
Planning Step 6 “Define data extraction form”. Examples of fields.
Classification and fields to consider for data extraction | Description and examples |
---|---|
Research type | • Theoretical research focuses on abstract ideas, concepts, and theories built on literature reviews [9]. • Empirical research uses scientific data or case studies for explorative, descriptive, explanatory, or measurable findings [9]. Example: [1] presented an SLR on context-awareness for S-PSS and categorized the articles into theoretical and empirical research. |
By process phases, stages | When analyzing a process or series of processes, an effective way to structure the data is to find a well-established framework of reference or architecture. Examples: • [8] an SLR on self-adaptive systems uses the MAPE-K model to understand how the authors tackle each module stage. • [13] presented a context-awareness survey using the stages of context-aware lifecycle to review different methods. |
By technology, framework, or platform | When analyzing a computer science topic, it is important to know the technology currently employed to understand trends, benefits, or limitations. Example: • [3] an SLR on the big data ecosystem in the manufacturing field that includes frameworks, tools, and platforms for each stage of the big data ecosystem. |
By application field and/or industry domain | If the review is not limited to a specific “Context” or “Population” (industry domain), it can be useful to identify the field of application. Example: • [23] an SLR on adaptive training using virtual reality (VR). The review presents an extensive description of multiple application domains and examines related work. |
Gaps and challenges | Identifying gaps and challenges is important in reviews to determine the research needs and further establish research directions that can help scholars act on the topic. |
Findings in research | Research in computer science can deliver multiple types of findings, e.g.: Framework, algorithm, methodology, data model, development approach. |
Evaluation method | Case studies, experiments, surveys, mathematical demonstrations, and performance indicators. |
The data extraction must be relevant to the research questions, and the relationship to each of the questions should be included in the form. Kitchenham & Charters [6] presented more pertinent data that can be captured, such as conclusions, recommendations, strengths, and weaknesses. Although the data extraction form can be updated if more information is needed, this should be treated with caution since it can be time-consuming. It can therefore be helpful to first have a general background in the research topic to determine better data extraction criteria.
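One way to keep the extraction form consistent across reviewers is to encode it as a typed record. The sketch below is a hypothetical Python example: the class name, the field names, and the sample values are assumptions loosely based on the classification fields listed in the table above, and a real form should be adapted to the review's own research questions.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ExtractionRecord:
    """One row of a hypothetical data extraction form."""
    article_id: str
    research_type: str              # "theoretical" or "empirical"
    technology: str = ""            # technology, framework, or platform
    application_field: str = ""     # application field or industry domain
    findings: str = ""              # e.g. framework, algorithm, data model
    evaluation_method: str = ""     # e.g. case study, experiment, survey
    gaps_and_challenges: str = ""
    # Research questions this article helps answer, e.g. ["RQ1", "RQ3"].
    related_rqs: list = field(default_factory=list)


# Made-up sample entry, for illustration only.
record = ExtractionRecord(
    article_id="A001",
    research_type="empirical",
    technology="ontology",
    evaluation_method="case study",
    related_rqs=["RQ1", "RQ3"],
)
row = asdict(record)  # plain dict, convenient for writing to a CSV file or spreadsheet
print(row)
```

Keeping `related_rqs` as an explicit field makes the link between each article and the research questions part of the form itself.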
After defining the protocol, conducting the review requires following each of the steps previously described. Using tools can help simplify this task. Standard tools such as Excel or Google Sheets allow multiple researchers to work collaboratively. Another online tool, specifically designed for performing SLRs, is Parsif.al. This tool allows researchers, especially in the context of software engineering, to define goals and objectives, import articles using BibTeX files, eliminate duplicates, define selection criteria, and generate reports.
Search strings are built from the PICOC elements and their synonyms and are then executed in each digital library. Within a search string, the synonyms of one PICOC element are separated by the Boolean operator OR and enclosed in parentheses, while the parenthesized PICOC elements are joined with the Boolean operator AND. An example is presented next:
(“Smart Manufacturing” OR “Digital Manufacturing” OR “Smart Factory”) AND (“Business Process Management” OR “BPEL” OR “BPM” OR “BPMN”) AND (“Semantic Web” OR “Ontology” OR “Semantic” OR “Semantic Web Service”) AND (“Framework” OR “Extension” OR “Plugin” OR “Tool”)
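The rule described above (OR between synonyms, AND between parenthesized PICOC groups) is simple enough to automate, which helps keep the strings consistent across databases. The following Python sketch is an illustration only: the function name and the dictionary keys are hypothetical labels, and the assignment of each keyword group to a PICOC element is an assumption, since the example string does not label its groups.

```python
# Keyword groups copied from the example search string. The dictionary keys
# are hypothetical labels assigned here for readability only.
picoc_terms = {
    "population": ["Smart Manufacturing", "Digital Manufacturing", "Smart Factory"],
    "context": ["Business Process Management", "BPEL", "BPM", "BPMN"],
    "intervention": ["Semantic Web", "Ontology", "Semantic", "Semantic Web Service"],
    "outcome": ["Framework", "Extension", "Plugin", "Tool"],
}


def build_search_string(groups):
    """Join synonyms with OR inside parentheses, then join the groups with AND."""
    clauses = []
    for synonyms in groups.values():
        if not synonyms:
            continue  # skip PICOC elements that were left empty
        quoted = " OR ".join(f'"{term}"' for term in synonyms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


search_string = build_search_string(picoc_terms)
print(search_string)
```

Individual databases may require small syntax adjustments (field codes, quoting, wildcards), so the generated string is best treated as a starting point to adapt per library.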
Databases that feature advanced searches enable researchers to run queries against titles, abstracts, and keywords, and to restrict results by year or research area. Fig. 1 presents the example of an advanced search in Scopus, using titles, abstracts, and keywords (TITLE-ABS-KEY). Most of the databases allow the use of logical operators (i.e., AND, OR). In the example, the search is for “BIG DATA” and “USER EXPERIENCE” or “UX” as a synonym.
Example of Advanced search on Scopus.
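As a rough textual counterpart to the figure, the sketch below assembles a Scopus-style advanced query in Python. The helper name and its parameters are hypothetical, and the exact query shown in the figure is not reproduced here. The sketch only assumes the `TITLE-ABS-KEY(...)` field code mentioned above and the `PUBYEAR` field of the Scopus advanced search syntax.

```python
def scopus_title_abs_key(expression, year_from=None):
    """Wrap a Boolean expression in TITLE-ABS-KEY and optionally limit by year.

    Hypothetical helper: it only formats a string and does not contact Scopus.
    """
    query = f"TITLE-ABS-KEY({expression})"
    if year_from is not None:
        # "PUBYEAR > N" keeps documents published after year N.
        query += f" AND PUBYEAR > {year_from - 1}"
    return query


# Terms taken from the example in the text; the year limit is an assumption.
expression = '"big data" AND ("user experience" OR "UX")'
query = scopus_title_abs_key(expression, year_from=2015)
print(query)
```

The resulting string can be pasted into the advanced search box. Other libraries use different field codes, so the wrapper would need one variant per database.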
In general, bibliometric data of articles can be exported from the databases as a comma-separated values (CSV) file or a BibTeX file, which is helpful for data extraction and for quantitative and qualitative analysis. In addition, researchers should take advantage of reference-management software such as Zotero, Mendeley, EndNote, or JabRef, which can easily import bibliographic information.
The first step in this stage is to identify any duplicates that appear across the searches in the selected databases. Automatic procedures, such as Excel formulas or scripts in a programming language (e.g., Python), can be convenient here.
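A duplicate check of this kind can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions: each exported record is represented as a dictionary with hypothetical `title` and `doi` keys, the DOIs shown are made-up placeholders, and two records count as duplicates when they share a DOI or a normalized title.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI or normalized title."""
    seen = set()
    unique = []
    for record in records:
        keys = {("title", normalize_title(record["title"]))}
        doi = (record.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        is_duplicate = bool(keys & seen)
        seen |= keys  # remember both keys so later variants also match
        if not is_duplicate:
            unique.append(record)
    return unique


# Made-up records with placeholder DOIs, for illustration only.
records = [
    {"title": "A Survey of Context-Aware Systems", "doi": "10.0000/placeholder-1", "source": "Scopus"},
    {"title": "A survey of context-aware systems.", "doi": "10.0000/PLACEHOLDER-1", "source": "WoS"},
    {"title": "Ontologies in Smart Manufacturing", "doi": "", "source": "IEEE"},
    {"title": "Ontologies in smart manufacturing", "doi": "10.0000/placeholder-2", "source": "ACM"},
]
unique_records = deduplicate(records)
print([r["source"] for r in unique_records])
```

Matching on normalized titles can occasionally merge distinct articles that share a title, so a quick manual check of the removed records is advisable.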
In the second step, articles are included or excluded according to the selection criteria, mainly by reading titles and abstracts. Finally, the quality is assessed using the predefined scale. Fig. 2 shows an example of an article QA evaluation in Parsif.al, using a simple scale. In this scenario, the scoring procedure is the following: YES = 1, PARTIALLY = 0.5, and NO or UNKNOWN = 0. A cut-off score should be defined to filter out those articles that do not pass the QA. The QA will require a light review of the full text of the article.
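The scoring and cut-off step can be expressed compactly in code. The following Python sketch uses the scale stated above (YES = 1, PARTIALLY = 0.5, NO or UNKNOWN = 0). The four-question checklist, the article identifiers, and the cut-off value of 2.5 are assumptions chosen purely for illustration.

```python
# Scores follow the scale described in the text.
ANSWER_SCORES = {"YES": 1.0, "PARTIALLY": 0.5, "NO": 0.0, "UNKNOWN": 0.0}


def qa_score(answers):
    """Sum the scores of all checklist answers for one article."""
    return sum(ANSWER_SCORES[answer.upper()] for answer in answers)


def passes_qa(answers, cutoff):
    """An article passes when its total score reaches the cut-off."""
    return qa_score(answers) >= cutoff


# Hypothetical answers to a four-question checklist and an assumed cut-off.
CUTOFF = 2.5
assessments = {
    "A001": ["YES", "YES", "PARTIALLY", "NO"],
    "A002": ["PARTIALLY", "NO", "UNKNOWN", "YES"],
    "A003": ["YES", "YES", "YES", "PARTIALLY"],
}
accepted = [
    article_id
    for article_id, answers in assessments.items()
    if passes_qa(answers, CUTOFF)
]
print(accepted)
```

Whether the cut-off is inclusive, as assumed here, should be stated in the protocol so that borderline articles are treated consistently.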