Understanding Data Retrieval

In this article we continue our discussion of the best way to conduct literature searches, review and analysis.

This is the second phase in the Literature Review process: actually getting hold of the right data. If you have been following our series, our first article covered the search process in 4 steps:

  • Firstly, decide what your question really is, and take the appropriate path to answer it. In a nutshell, don’t go overboard on highly systematic, comprehensive reviews if you want a yes/no answer, but be aware of the pitfalls of the search process if your question is more complex.
  • Secondly, choose the right literature sources. This choice can be influenced by data retrieval, which is discussed in this article.
  • Thirdly, choose the right search strings and limits (we advise getting a professional to do this if possible).
  • Finally, select the right inclusion and exclusion criteria. It can be helpful to run a pilot phase to confirm that the limits you have set actually work for your particular research project.

Now read on…

At this point, we assume that you have run your searches, decided your criteria and are now getting set to retrieve your data. Here’s where the expenses of the study start to skyrocket, so it is useful to know how some judicious rethinking might reduce the costs. You have 3 key factors to consider: Search vs. Retrieval Costs, Project Timelines vs. Cost Savings, and Copyright Compliance. Let’s get Copyright Compliance out of the way first.

 

Copyright Compliance

In the current environment, it’s essential to cover the thorny problem of copyright, which most academics have been ignoring for years. However, ignorance of copyright constraints is no excuse – for institutions that have been caught breaching copyright law, the penalties can be financially very painful. Copyright is treated differently from country to country, and in the UK complying with it is a very convoluted and complex undertaking. So the first thing to do is to familiarise yourself with the particular laws in your own country.

Copyright Conundrums in the UK

When you write a manuscript and have it reviewed by a colleague, the original writing is owned by the writer but the reviewer’s changes are owned by the reviewer. This is one reason why, in Rx, our writers are bound by copyright deeds that transfer everything to us, so that we can in turn transfer it to our clients in the knowledge that it cannot be challenged in the future. Most journals require authors to transfer copyright to the journal for this reason, although many authors are now challenging this stricture on the reuse of their own material.

In the UK we have a non-profit organisation known as the CLA – the Copyright Licensing Agency (http://www.cla.co.uk/). This organisation covers printed material from 29 countries and digital material from 6 countries. Most UK medical communications companies and all research organisations should have a CLA licence, the minimum cost of which is around £130 annually, or about £25 per employee for a business licence. Costs have been drifting upwards for years; our licence costs have more than quadrupled since we started business. And as costs have risen, the interpretation of what is and isn’t covered by copyright has become more stringent and yet more confusing, particularly for medical communications agencies such as ourselves. Cutting through the legalese in our CLA Licence Agreement, it appears that the Grant of Licence allows us to:

  • Make paper copies and distribute them to authorised people within the UK
  • Scan Material Licensed for Scanning to produce Digital copies unless we already subscribe to a digital version
  • Make digital copies solely within the licensee’s intranet
  • Store licensed digital copies on the intranet for a maximum of 30 days

 

In the most stringent interpretation of this licence it appears that you could be breaking the rules every time you:

  • Send digital or paper copies to reviewers or writers outside the UK
  • Keep digital copies for longer than 30 days (but let’s face it – most lit reviews take far longer than 30 days to complete!)
  • Share papers with your client without paying another copyright fee
  • Send digital or paper copies to a consultant/author who is not within your client company.

Another highly unenforceable part of this copyright law is the potential prohibition of any annotations on the paper – i.e. highlighting sections, making notes or comments on it.  I personally find this untenable – what researcher doesn’t scribble their comments in the margins of their source material?  And many pharma clients demand annotated papers as part of their quality review and approval processes, so if this component of copyright law is upheld, the implications could be staggering. It seems that the only acceptable annotations are those for teaching purposes.  However, the legalities are evolving continuously, and the CLA are amending licences for different organisation types e.g. for public relations agencies.  I believe some of these more ambitious strictures will be lifted soon, but at present be afraid, be very afraid.  Only individual researchers using one copy of the material for personal use are exempt.

 

So, in summary:

  • Find out what copyright law exists in your country and follow it
  • If in doubt, follow at least the minimal CLA requirements
  • Don’t copy and distribute your retrieved articles to all and sundry
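
The 30-day storage limit described above is easy to overrun on a long review, so it may help to check your shared folder from time to time. The following is only a minimal sketch in Python: the function name `stale_copies` and the choice of file modification time as the "stored since" date are our own assumptions for illustration, and your licence may define the period differently.

```python
import time
from pathlib import Path

# Assumption: the 30-day limit as we read it from the licence terms above.
MAX_AGE_DAYS = 30


def stale_copies(folder, max_age_days=MAX_AGE_DAYS, now=None):
    """Return PDF files under `folder` last modified more than `max_age_days` ago.

    Modification time is used as a stand-in for the date the copy was stored,
    which is an assumption; adjust if your system records this differently.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file()
        and path.suffix.lower() == ".pdf"  # matches .pdf and .PDF alike
        and path.stat().st_mtime < cutoff
    )
```

Calling `stale_copies("/path/to/shared/papers")` (a hypothetical folder) returns the list of files to review; the sketch deliberately does not delete anything, as that decision is better left to a person.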

 

Search vs. Retrieval Costs

Anyone who has performed searches on large database engines knows that the costs of the searches alone used to be significant, even without any substantive data retrieval; however, charges for just searching seem to be moving out of fashion at last. Rather than a Pay As You Go system, commercial search engine companies are now asking for monthly standing charges, independent of the amount of time spent searching. For example, when ProQuest replaced Datastar, charges altered – even searches on previously expensive bibliographic databases now cost nothing until you download documents.

Fortunate academics who have unlimited access to library systems, and employees of big pharmaceutical companies with library and information managers at their service, are probably shielded from search and retrieval costs to a certain extent. Believe me, they can be horrific. So use PubMed (www.ncbi.nlm.nih.gov/pubmed/) first, as it’s free. (We should all offer up a vote of thanks to the US National Institutes of Health, who have provided us poor but keen scientists with such a magnificent resource.) However, as we mentioned in our earlier article, if you are performing searches on more than one bibliographic database, you will need a search engine that removes duplicate references from your search list, and these are the ones that cost money. If you are looking for free alternatives, Wikipedia gives a good list of bibliographic databases/search engines that cover scientific fields.

 

So, in summary:

  • Start with PubMed (http://www.ncbi.nlm.nih.gov/pubmed/)
  • If your budget is very tight, use the Wikipedia list to find free databases that cover your topic
  • Use a service that will remove duplicates when you search more than one database
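
To give a feel for what duplicate removal involves, here is a rough sketch in Python. It is not how any commercial service works; it simply assumes that each reference has been exported as a record with `title`, `year` and (optionally) `doi` fields, which are names we have chosen for illustration. Records are matched on DOI where one is present, and otherwise on a normalised title plus year.

```python
import re


def _key(record):
    """Build a matching key: the DOI if present, else normalised title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lower-case the title and collapse punctuation and spacing differences.
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return ("title", title, str(record.get("year") or ""))


def deduplicate(records):
    """Keep the first record seen for each key, preserving the input order."""
    seen = set()
    unique = []
    for record in records:
        key = _key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records; the DOI is a placeholder, not a real reference.
hits = [
    {"title": "Aspirin in Primary Prevention", "year": 2010, "doi": "10.1000/ABC", "source": "PubMed"},
    {"title": "Aspirin in primary prevention.", "year": 2010, "doi": "10.1000/abc", "source": "OtherDB"},
    {"title": "A Different Paper", "year": 2011, "source": "PubMed"},
    {"title": "a different paper", "year": 2011, "source": "OtherDB"},
]
print(len(deduplicate(hits)))  # prints 2
```

One limitation worth knowing: if the same paper has a DOI in one database export but not in the other, this sketch will treat the two records as different, so a manual check of the final list is still sensible.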

 

Project Timelines vs. Cost Savings

The first thing to note is that document suppliers will charge to supply documents that are freely (and legally) available, so it’s important to check whether the reference is classed as ‘open access’. The easiest way to check this is to search for the document on PubMed; if it is in their database, a link to the publisher’s site is usually shown. This link will often say whether the document is free – look for terms such as “open access” or “free full text”. Many of the free papers are available on PubMed Central (http://www.ncbi.nlm.nih.gov/pmc/), which is a valuable resource.

Now here’s the bit you may not know: not all the free papers are marked as being free on the publisher’s link. The only way to be sure is to proceed to the publisher’s site and attempt to acquire the paper. At this stage you will probably be confronted with a request for payment, but in some cases the reference is available for download without any mention of it being free. It’s also worth noting that following a search in PubMed, it’s often possible to filter the results to display only free articles. Another useful resource is the Free Medical Journals site (http://www.freemedicaljournals.com/). This is not a document supplier but simply provides information on journals that offer access to free papers. Several mainstream medical journals offer free access to papers that are over a year old – this time limit can vary, but you will rarely find that a newly published article is free.

So where does this leave you as regards cost savings? In our agency, we tend to base our approach on the amount of time we have. If a project is very urgent, we have to retrieve our articles by the fastest route, which generally means Infotrieve (http://www.infotrieve.com/document-delivery-service) or the British Library (http://www.bl.uk/reshelp/index.html). One thing to be aware of is that the British Library will supply encrypted documents with a time limit (i.e. you cannot open the .PDF file after 30 days, so you have to print it out), which adds to the inconvenience. And naturally, the faster the retrieval, the greater the cost. As the costs can be over £100 per article, this is very important – is the project REALLY so urgent? It is worth checking before spending large amounts of your project budget.

We tend to try to find as many articles as possible at minimal or no cost first, before buying the newest articles from a document provider. Another way to save money is to pay to download only the abstracts and titles via the search engine, and then order the full papers only for the most promising references. Some analyses (often the ones where the research question is the yes/no kind) can be performed solely or mainly on abstracts, so you can largely avoid the cost of article retrieval. Even for these particular analyses we tend to go to PubMed first, and only use document providers for the abstracts of papers that are not available from anywhere else. However, all these approaches add time to the project, so it is always a compromise: time or cost.

The table below summarises the factors you need to weigh up in determining how to go about your data retrieval, and which choices are likely to lower or raise your costs:

 

Factor to consider | Cost-increasing | Cost-saving
What is the nature of the research question? | Needs in-depth analysis | Needs a yes/no answer
Can we depend on abstract analysis? | No | Yes
What are the timelines? | Short | Not so urgent
What percentage of the published research has occurred in the last 12 months? | Most of it | Not so much
Are most of the papers available in PubMed? | No | Yes
Are there other free databases that can be used? | No | Yes
Do the most common journals in the search have an open access policy? | No | Yes
Do other people require copies of the papers? | Yes | No
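
To see how much these choices can matter, it helps to put rough numbers on them. The sketch below, in Python, estimates the spend when you screen on abstracts first and only buy the promising papers that are not freely available. The function name and every input except the £100 per article mentioned earlier are hypothetical figures chosen for illustration, not data from a real project.

```python
def estimate_retrieval_cost(n_hits, promising_fraction, free_fraction, price_per_article):
    """Estimate spend when only promising, non-free papers are bought.

    n_hits:             references returned by the search
    promising_fraction: share judged worth reading in full after abstract screening
    free_fraction:      share of those available at no cost (e.g. open access)
    price_per_article:  assumed price charged by a document supplier, in pounds
    """
    n_promising = round(n_hits * promising_fraction)
    n_to_buy = round(n_promising * (1 - free_fraction))
    return n_to_buy, n_to_buy * price_per_article


# Hypothetical project: 200 hits, a quarter look promising, 40% of those are free.
papers, cost = estimate_retrieval_cost(200, 0.25, 0.4, 100)
print(papers, cost)  # prints 30 3000
```

Under these assumed figures, buying every one of the 200 hits at £100 each would cost £20,000, against £3,000 for the screened approach. The price paid for that saving is the extra time spent screening abstracts and hunting for free copies, which is exactly the compromise described above.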

 

And of course, if you need any help with deciding your strategy or managing your retrieval costs, don’t hesitate to give us a call for advice.