Understanding the Literature Search Process

First Published: Dec 2012

At Rx Communications we have recently been developing proposals for a number of literature review projects, including one where an original review and analysis required updating before publication, and others that were designed to find answers to fairly obscure or unusual questions. We find that many of our clients look for vendors for these types of projects at the start of the year, so we thought it might be useful to share our experiences. Rx has been performing literature searches and reviews since the company began; we are one of the few boutique medical communication agencies to have a dedicated information manager who specialises in this work. With all the literature benchmarking, systematic and comprehensive reviews we have undertaken over the years, we have developed a useful 4-step thought process that helps ensure our clients get the results they want. Here’s hoping it gets you the results you want too.

Step One – Decide what the question REALLY is.

Although this seems fairly obvious, it is often the make or break point of success, and we find that clients often haven’t thought through what it is they really want. This can make a tremendous difference to the size and complexity of the project. For example, is this research going to inform future drug or device developments or in-licensing opportunities? With a large future investment riding on the results of the search, you would do well to spend a little more on the literature search and analysis to ensure that your future investments will be well spent. Is this search (for example, of study methodologies) going to inform the way future clinical trials will be conducted, or perhaps establish the basis for an HTA submission? In this case inclusion and exclusion criteria need to be very clearly defined so that only the most robust data are collected. Are you looking to publish the results as a foundation or background to your own development research? In that case it must be very clear that data have not been ‘cherry picked’, and that the search has not been skewed to omit unfavourable results or a competitor’s pivotal study.

On the other hand, you don’t want to end up analysing every single citation identified in a comprehensive search if all you want to know is the most common study approach taken to determine the parameter you are interested in. Budget is part of deciding what the question is: do you need to know absolutely everything about your chosen question, or is a general indication sufficient?

Step Two – Choose the right literature resources

Once you have a better idea of what you need from the literature review, the next step is to determine which databases give the best results for the field of interest. In general, the four most important databases in the medical scientific field are MEDLINE, EMBASE, SCISEARCH and BIOSIS, but this is by no means guaranteed for any particular search requirement. For example, for health economics topics it may be better to use the HEED bibliographic database; for safety data TOXFILE might be useful; in another setting a good database to use might be the Cochrane literature reviews. Rx uses ProQuest – a search engine that accesses 64 relevant databases and that recently superseded Datastar. The search engine used will have some effect on the results, but in general it is best to use one that allows the removal of duplicates; otherwise costs increase considerably. However, if other databases or sources (e.g. abstract books, grey literature) are deemed to be important but are not available through the main search engine, these can be added to the results and de-duplicated manually.
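When results from extra sources have to be merged by hand, a small script can take some of the pain out of removing duplicates. The sketch below is only an illustration under our own assumptions: the record fields (source, title, year, doi) and the sample records are invented, and matching on a DOI or on a normalised title plus year is one possible rule, not the method of any particular search engine.

```python
import re
import unicodedata


def normalise_title(title):
    """Lower-case, strip accents and punctuation so near-identical titles match."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def dedupe(records):
    """Keep the first record seen; drop later ones sharing a DOI or a title+year."""
    seen = set()
    unique = []
    for rec in records:
        keys = {("title", normalise_title(rec["title"]), rec.get("year"))}
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        if keys & seen:          # any key already seen -> treat as duplicate
            continue
        seen |= keys
        unique.append(rec)
    return unique


# Invented records for illustration only.
records = [
    {"source": "MEDLINE", "title": "Treatment response in adult psoriasis: a cohort study",
     "year": 2010, "doi": "10.1000/example.1"},
    {"source": "EMBASE", "title": "Treatment Response in Adult Psoriasis - A Cohort Study.",
     "year": 2010, "doi": None},
    {"source": "BIOSIS", "title": "Biologic therapy outcomes in psoriasis",
     "year": 2011, "doi": "10.1000/example.2"},
    {"source": "abstract book", "title": "Biologic therapy outcomes in psoriasis",
     "year": 2012, "doi": "10.1000/EXAMPLE.2"},
]

unique = dedupe(records)
print([rec["source"] for rec in unique])
```

A rule like this will still miss duplicates whose titles differ in wording, so a manual check of the merged list remains worthwhile.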

Normally, for a comprehensive search strategy, Rx would use the top 4 databases that retrieve the most citations; particularly if the number of duplicates is low. PubMed is a logical start because it doesn’t cost anything, but researchers should be aware that PubMed and MEDLINE are not exactly the same entities, and the search engine change from PubMed to MEDLINE can also make a slight difference to the results. Depending on the therapeutic area in question, PubMed will usually pick up between 60 and 80% of the available published full papers; hence the need to use more than one literature source if the results must be definitive. We always include some form of benchmarking for databases and search engines to get a feel for the margin of error.
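One simple way to benchmark sources is to take a set of papers already known to be relevant (for example, the reference list of an existing review) and check what share of them each source retrieves. The sketch below assumes this approach; the source names and paper identifiers are placeholders, not real search results.

```python
def coverage(reference_ids, retrieved_by_source):
    """Share of a known reference set retrieved by each source, and by all combined."""
    reference = set(reference_ids)
    report = {}
    combined = set()
    for source, ids in retrieved_by_source.items():
        found = reference & set(ids)
        combined |= found
        report[source] = len(found) / len(reference)
    report["combined"] = len(combined) / len(reference)
    return report


# Placeholder identifiers: ten known relevant papers, two hypothetical sources.
reference = [f"p{i}" for i in range(1, 11)]
retrieved = {
    "source_a": ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "x1"],
    "source_b": ["p5", "p6", "p7", "p8", "p9", "x2"],
}

report = coverage(reference, retrieved)
for name, share in report.items():
    print(f"{name}: {share:.0%}")
```

The gap between the combined figure and 100% gives a rough feel for what every source in the set is missing.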

Step Three – Choose the right search strings and limits

Once the right databases have been selected, the search string needs to be constructed. Although in most databases the order of the terms won’t make a difference, this must be thoroughly checked because the optimal order may differ between databases. In addition, limiting the search to just titles, abstract and title, or using a full text search will yield considerable differences.

Testing of search strings, the terms used and their order, and whether a truncated term and asterisk gives better results, is an iterative process. The construction of search strings will differ considerably between search engines, so it may not be possible to exactly duplicate the results using the same search string if the search engine is not known. Experienced personnel who understand the search engine requirements and the way each bibliographic database is indexed should undertake the construction of search strings, together with researchers who understand the subject area and can advise on alternative terms to ensure information is not missed because different terminology has been used.

Deciding the fields in which to search is crucial. Searching on only the “title” field may not pick up relevant content; nor will searching on the “abstract” field, particularly if the search is for some secondary endpoint or minor element of study results. Although searching on the full text will pick up a great deal of literature that is not useful and must be discarded, we prefer this approach because in our experience we have picked up many pivotal papers that would have otherwise been missed.
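To keep iterative testing organised, it can help to generate every combination of term set and search field from one place, so that each variant is recorded and can be rerun. The sketch below is a generic illustration: the terms are examples, and the field prefixes (TI, AB) and the asterisk truncation are assumptions that must be checked against the syntax of the search engine actually being used.

```python
def or_group(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts, field=None):
    """AND together one OR-group per concept; optionally wrap in a field prefix."""
    query = " AND ".join(or_group(terms) for terms in concepts)
    return f"{field}({query})" if field else query


# Example concepts; the field codes are assumed and engine-specific.
base = [["psoriasis"], ["adult*"]]
treatment = ["treat*", "therap*", "treatment response"]
rows = {"without treatment terms": base, "with treatment terms": base + [treatment]}
columns = {"title": "TI", "abstract": "AB", "full text": None}

grid = {
    (row, col): build_query(concepts, code)
    for row, concepts in rows.items()
    for col, code in columns.items()
}

for (row, col), query in grid.items():
    print(f"{row} / {col}: {query}")
```

Running each generated string and logging the number of citations retrieved, together with the date and the search engine, makes it much easier to compare variants later.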

The table below shows the differences between adding terms involved in treatment (rows) or in searching on titles, abstracts or full text (columns). In this particular instance we wanted to determine how treatment responses differed in adult patients with psoriasis. This table is a sample for demonstration purposes only, so for brevity’s sake (and for reasons of confidentiality) we have not included the full brief, methodology or study design.

Using the example above, when the first few pages of titles from the results of rows 1 and 2 were compared, to determine if any useful literature had been missed by not including the “treatment” related terms, it was evident that approximately half of the relevant literature had been omitted.

In addition, an even larger amount of literature is retrieved if the abstract or the full text of the article is searched for the relevant terms. When the first few pages of the results from columns 1 and 2 were examined, 15 relevant papers out of the first 100 had been missed by searching on titles only. As can be seen above, it is possible that this particular search performed on titles only (which incidentally is what the NICE guidelines for literature reviews recommend!) picked up less than a tenth of the potential literature. This may not be an issue if one is looking for a simple answer to the question “IS there a difference in treatment response in adults with psoriasis?”, but if one wants to understand HOW a difference manifests and in WHAT PROPORTION of patients, then the smaller number of citations may be insufficient for an accurate answer.

Please note, we are not saying here that as a rule of thumb you will miss 50% of relevant literature by leaving out key words, or that you will miss 15% of the relevant literature by searching on titles only. These are our results for one search series in one therapy area on a particular day. Our point is that your search strategy should clearly establish how much data you may be missing with your chosen method, and that you should make a judgement call on whether you can afford to miss it or not.
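A spot check like the one above (15 missed papers in the first 100 examined) is a small sample, so the true proportion missed could be some way either side of 15%. As a rough illustration, the sketch below puts a 95% Wilson score interval around that figure. It assumes the papers checked behave like a random sample, which the first few pages of results may well not, so treat the output as indicative only.

```python
import math


def wilson_interval(missed, sampled, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = missed / sampled
    denom = 1 + z**2 / sampled
    centre = (p + z**2 / (2 * sampled)) / denom
    half = z * math.sqrt(p * (1 - p) / sampled + z**2 / (4 * sampled**2)) / denom
    return centre - half, centre + half


# Figures from the spot check described above: 15 missed out of 100 examined.
low, high = wilson_interval(15, 100)
print(f"Estimated proportion missed: 15% (plausible range {low:.1%} to {high:.1%})")
```

Checking a larger sample narrows the range, which is one way to decide how much spot-checking a given project justifies.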

Step Four – Use the right inclusion/exclusion criteria – a pilot phase?

Naturally, the right inclusion and exclusion criteria must be set a priori and stated in the methodology before the data extraction begins, if the analysis is to be systematic and consistent. However, a pilot phase can be a useful means of determining if the searches have in fact yielded the relevant results, and have not excluded anything important. This helps prevent a change in plan after the analysis begins, when it is discovered that the criteria are too stringent or are so broad as to waste the researcher’s time. A pilot phase has the advantage of ensuring that a single search sequence used on a particular day can then be replicated if need be by other researchers, giving you confidence that your final search methods will be transparent and repeatable. A rapidly conducted pilot study would help determine if more stringent exclusion criteria could limit the number of papers to be analysed, without losing important information. This may reduce the number of papers to a more manageable amount.
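A pilot phase can be as simple as applying the draft criteria to a small sample of citations and counting how many fall out for each reason. The sketch below shows one way to do this; the criteria, record fields and sample records are all hypothetical and would be replaced by those stated in the project methodology.

```python
from collections import Counter


def screen(record, criteria):
    """Return the first exclusion reason that applies, or None if the record is kept."""
    for reason, applies in criteria:
        if applies(record):
            return reason
    return None


def pilot_summary(sample, criteria):
    """Count included records and exclusions by reason across a pilot sample."""
    return Counter(screen(rec, criteria) or "included" for rec in sample)


# Hypothetical draft criteria, applied in order.
criteria = [
    ("not an adult population", lambda r: r["population"] != "adult"),
    ("no treatment response reported", lambda r: not r["reports_response"]),
]

# Invented pilot sample for illustration only.
sample = [
    {"id": 1, "population": "adult", "reports_response": True},
    {"id": 2, "population": "paediatric", "reports_response": True},
    {"id": 3, "population": "adult", "reports_response": False},
    {"id": 4, "population": "adult", "reports_response": True},
]

summary = pilot_summary(sample, criteria)
for reason, count in summary.items():
    print(f"{reason}: {count}")
```

If one reason accounts for most exclusions, or almost nothing is excluded at all, that is a signal to revisit the criteria before the full data extraction begins.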

In summary then, these are the first 4 steps to get right in the lengthy process of performing a good literature review.  Good luck with your searches, and if you want any more information about how to put together a good search there are several people to talk to at Rx who would be happy to give you more insights.  For getting the question right, contact Ruth or Caroline; for deciding on databases and perfecting search strings, William is your man.  And we can all help with inclusion and exclusion criteria.

Ruth Whittington
CEO of Rx Values Group Ltd
MSc(hons), NZSRN
Copyright Rx Communications Ltd