1473
Increasing Findability: Techniques for Improving Search Engine Rank on CDC’s Cancer Survivors’ Online Content
Background:
Nearly 3 out of every 4 U.S. adults search for health information online using a search engine (Pew Internet Research, 2013). Search engines are critical for people who are beginning to look for information on diseases, symptoms, and treatment and who need to find content relevant to their needs. Public health practitioners can use search query data to discover audience search intent and craft relevant, tailored, and actionable health information online. Several challenges, however, limit the effective use and analysis of search engine data. Google privacy standards and corporate policies, for example, limit the availability of search data. Moreover, search query data can accumulate into thousands of search terms, which are unwieldy to analyze. This project addressed these challenges by using R statistical software (R) and the Google Search Console API (Google) to support search engine optimization.
Program background:
By 2026, the number of cancer survivors is expected to increase by more than 30% to 20.3 million. The Centers for Disease Control and Prevention’s (CDC) Division of Cancer Prevention and Control (DCPC) identified its Cancer Survivors’ site as a divisional priority. CDC’s Office of the Associate Director of Communication (OADC) worked with DCPC, part of the National Center for Chronic Disease Prevention and Health Promotion (NCCDPHP), to evaluate the Cancer Survivors’ website (https://www.cdc.gov/cancer/survivors/). DCPC examined strategies to improve the site’s usefulness to cancer survivors, their caretakers, and health care providers, with the goal of improving cancer survivors’ overall health and quality of life.
Evaluation Methods and Results:
Baseline measures were established using web traffic data to understand existing visitor behavior. To analyze Google search query data, we identified “seed” words (or root words), such as “cancer,” “caregiver,” and “surviv-”, and used R with Google API data to plot the relationships among seed terms, Google rank, and clickthrough rate. We compiled the top 10 results for each search term and recorded their page titles and metadata to analyze similarities in word choice and target audience. We used Google Trends data to estimate search volume for related terms and performed reading level analyses to make content recommendations. Results showed a 51% increase in page traffic to the Cancer Survivors’ site compared with the same period in the previous year. Mobile traffic increased from 15% to 26%. Rank for key terms such as “cancer survivors” improved from position 8 to position 1 on Google, and the site ranked at position 3 for a new term, “cancer stories.”
Conclusions:
Understanding the relationships among search terms, content, search volume, and clickthrough data can improve the findability of health information, raising both search engine rank and page traffic.
Implications for research and/or practice:
A substantial body of literature describes optimizing web content for better visibility in search results, but it skews heavily toward high-cost analytical platforms and rarely addresses the needs of public health and government entities. This effort describes a low-cost approach to analyzing large amounts of search data with open-source technologies and using the results to improve the findability of content.
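Illustrative sketch:
The seed-term analysis described under Evaluation Methods and Results could look roughly like the following. This is not the project's code: the project used R, whereas this sketch uses Python with only the standard library. The row format, the function name `summarize_by_seed`, and the sample numbers are all hypothetical. The metric names (clicks, impressions, position) mirror those reported by the Google Search Console search analytics data, but no API call is made here.

```python
from collections import defaultdict

# Seed (root) terms named in the abstract; "surviv" matches
# "survivor", "survivors", "survivorship", and so on.
SEED_TERMS = ["cancer", "caregiver", "surviv"]


def summarize_by_seed(rows, seeds=SEED_TERMS):
    """Aggregate clicks, impressions, CTR, and average rank per seed term.

    Each row is a dict with the (assumed) keys: query, clicks,
    impressions, position. A query containing several seeds is
    counted under each of them. Average position is weighted by
    impressions so that rarely shown queries do not dominate.
    """
    totals = defaultdict(
        lambda: {"clicks": 0, "impressions": 0, "weighted_pos": 0.0}
    )
    for row in rows:
        query = row["query"].lower()
        for seed in seeds:
            if seed in query:
                t = totals[seed]
                t["clicks"] += row["clicks"]
                t["impressions"] += row["impressions"]
                t["weighted_pos"] += row["position"] * row["impressions"]

    summary = {}
    for seed, t in totals.items():
        if t["impressions"] == 0:
            continue  # avoid dividing by zero for seeds never shown
        summary[seed] = {
            "clicks": t["clicks"],
            "impressions": t["impressions"],
            "ctr": t["clicks"] / t["impressions"],
            "avg_position": t["weighted_pos"] / t["impressions"],
        }
    return summary


# Hypothetical sample rows, invented for illustration only.
SAMPLE_ROWS = [
    {"query": "cancer survivors", "clicks": 40, "impressions": 400, "position": 2.0},
    {"query": "cancer caregiver tips", "clicks": 10, "impressions": 100, "position": 5.0},
    {"query": "survivorship care plan", "clicks": 5, "impressions": 100, "position": 8.0},
]

if __name__ == "__main__":
    for seed, stats in sorted(summarize_by_seed(SAMPLE_ROWS).items()):
        print(
            f"{seed}: ctr={stats['ctr']:.3f}, "
            f"avg_position={stats['avg_position']:.2f}, "
            f"impressions={stats['impressions']}"
        )
```

Grouping thousands of queries under a handful of seed terms is one way to make the data manageable before plotting rank against clickthrough rate. Substring matching is the simplest possible grouping rule and is chosen here only for brevity.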